Tree TopologiesTopNotationCaveats

Caveats

In here we will post some comments or errata about the associated Chapter 15 from the book "Algebraic Statistics for Computational Biology".

Character table
In Section 2, the authors state that the character table of the group Z2×Z2 used to compute the Fourier transform is given by the following matrix.

  1  1  1  1
  1 -1  1 -1
  1  1 -1 -1
  1 -1 -1  1

The authors used this matrix because it is a standard 4 ×4 Hadamard matrix and it is also stable after simultaneously switching rows 2 and 3 and columns 2 and 3. Nevertheless, the Kimura 2 parameter model makes the assumption that the rate parameters for transitions (between purines or between pyrimidines) are equal to each other and different to the rate parameters for transversions (between a purine and a pyrimidine). The character table that agrees with this assumption is given by the matrix

  1  1  1  1
  1  1 -1 -1
  1 -1  1 -1
  1 -1 -1  1

ML degree
One may be tempted to compute the maximum likelihood degree by applying the specialized Fourier transform to the ideal of invariants. Nevertheless, for some models, the ML degree obtained by this process differs from from the ML degree reported previously. For example, it was reported in Chapter 15 that the "comb" model with 3 leaves, uniform root distribution and MC assumption has ML degree 15. Here we report an ML degree of 11 for this model. The difference resides in our interpretation of the indeterminates pi as sum of all probabilities in class i, as opposed to denoting just one single probability (even though Chapter 15 defines the indeterminates pi as sum of all probabilities in a class, some computations were done as if they denoted single probabilities). So for this model, multiplying each column of the specialized Fourier transform by the cardinality of the corresponding class results in a new linear transformation that gives the previously reported ML degree. This transform is given by the matrix

  4 24 12 24
  4 -8 12 -8
  4  8 -4 -8
  4 -8 -4  8

This change works for every model where the specialized Fourier transform is a square matrix, that is, where np=nq. In Table 15.2 of Chapter 15 the authors report ML degrees for the two trees on three taxa and MC assumption. These values were obtained using the Fourier transformation into single probabilities, that is, after multiplying the columns of the specialized Fourier transform by cardinalities of the corresponding classes. The ML degrees reported in this website for these (and every other) models use the specialized Fourier transform to be consistent with our interpretation of the indeterminates pi.

Furthermore, Table 15.1 in Chapter 15 gives the ML degrees of the unique tree on three leaves and no MC and the Quartet tree. These results were based on computations done in the paper "Solving the Likelihood Equations" by Hosten, Khetan, and Sturmfels. This computations were done assuming that the indeterminates pw were sums of probabilities and not single probabilities, even though the authors do not mention explicitly this fact. Otherwise, each linear polynomial in the Fourier transform given before Example 14 in that paper would have to be multiplied by 2 since the authors do not work directly with the 3-leaf tree with unknown root distribution but with the corresponding unrooted 4-leaf tree. Also note that the ML degree in the first row of Table 15.1 corresponds to the 3-taxa tree with no molecular clock and unknown root distribution. So it corresponds to the ML degree of the Degenerated Quartet model.

Nevertheless, the situation when np is not equal to nq is more complicated. In this case, one has to perform two steps, First, reduce the general Fourier transform into a square matrix of size np×np. Second, introduce several equalities among the Fourier parameters to reflect that some of the Fourier parameters associated to the rows of this new matrix are actually equal. This is a long process and has not been automated. It is for this reason that we have decided to report only the ML degree obtained by applying the given specialized Fourier transform.

For example, in "Solving the likelihood equations" in Example 15 and in "Maximum likelihood on four taxa phylogenetic trees" by Chor, Khetan and Snir it is reported an ML degree of 9 for the 4 Comb with MC assumption under the Binary Symmetric model. In this website we report an ML degree of 17. For this model np = 6 and nq = 5. Starting with the general 16×16 Fourier transform, after adding columns corresponding to probabilities in the same class (6 classes) and deleting the rows corresponding to Fourier probabilities equal to zero one obtains the following 8×6 matrix


  2     4     2     2     4     2
  2    -4     2     2    -4     2
  2     0    -2    -2     0     2
  2     0    -2    -2     0     2
  2     0    -2     2     0    -2
  2     0    -2     2     0    -2
  2     4     2    -2    -4    -2
  2    -4     2    -2     4    -2

Form this matrix one obtains the specialized Fourier transform by adding rows according to the Fourier classes and then dividing each entry by the cardinalities of the two corresponding classes. But if one only deleted the two repeated rows and divides each column by the cardinality of the corresponding class one obtains the 6×6 matrix

  1     1     1     1     1     1
  1    -1     1     1    -1     1
  1     0    -1    -1     0     1
  1     0    -1     1     0    -1
  1     1     1    -1    -1    -1
  1    -1     1    -1     1    -1

Applying this Fourier transform to the ideal of invariants generated by q4-q5 and q2q4-q1q6 one obtains the desired result of ML degree equal to 9.

Luis David Garcia, October 5, 2005

Tree TopologiesTopNotationCaveats