Trees and Alignments of Protein Phosphatases

This page archives the trees and the underlying alignments of protein phosphatases.

Human

We infer the trees of human protein phosphatases by fold and family.

    Fold   Family         Alignment    PhyML JTT   PhyML best model   RAxML JTT   RAxML best model
    CC1    PTP            Phy, Fasta   JTT         LG+I+G             JTT         LG+I+G
    CC1    DSP+PTEN       Phy, Fasta   JTT         LG+G               JTT         LG+G
    CC1    Myotubularin   Phy, Fasta   JTT         LG+G               JTT         LG+G
    CC1    Sac            Phy, Fasta   JTT         LG+G               JTT         LG+G
    CC1    Paladin        -            -           -                  -           -
    CC2                   -            -           -                  -           -
    CC3                   -            -           -                  -           -
    PPPL                  Phy, Fasta   JTT         LG+I+G             JTT         LG+I+G
    PPM                   Phy, Fasta   JTT         LG+I+G             JTT         LG+I+G
    HAD                   Phy, Fasta   JTT         LG+G               JTT         LG+G
    HP                    Phy, Fasta   JTT         LG+I+G             JTT         LG+I+G
    AP                    Phy, Fasta   JTT         WAG+G              JTT         WAG+G
    PHP                   -            -           -                  -           -
    RTR1                  -            -           -                  -           -

Notes:

Nine genomes

In addition to the human-only trees, we infer the trees of the protein phosphatases across nine organisms: human, sea urchin, fruit fly, Caenorhabditis elegans, the sea anemone Nematostella vectensis, the sponge Amphimedon queenslandica, the unicellular choanoflagellate Monosiga brevicollis, Saccharomyces cerevisiae and Dictyostelium discoideum.

    Fold   Family         Alignment    PhyML JTT   PhyML best model   RAxML JTT   RAxML best model
    CC1    PTP            Phy, Fasta   JTT         LG+G               JTT         LG+G
    CC1    DSP+PTEN       Phy, Fasta   JTT         LG+G+F             JTT         LG+G+F
    CC1    Myotubularin   Phy, Fasta   JTT         LG+F+I+G           JTT         LG+F+I+G
    CC1    Sac            Phy, Fasta   JTT         LG+I+G             JTT         LG+I+G
    CC1    Paladin        Phy, Fasta   JTT         LG+G               JTT         LG+G
    CC2                   Phy, Fasta   JTT         LG+I+G             JTT         LG+I+G
    CC3                   Phy, Fasta   JTT         LG+I+G             JTT         LG+I+G
    PPPL                  Phy, Fasta   JTT         LG+I+G             JTT         LG+I+G
    PPM                   Phy, Fasta   JTT         LG+G               JTT         LG+G
    HAD                   Phy, Fasta   JTT         LG+G               JTT         LG+G
    HP                    Phy, Fasta   JTT         VT+G               JTT         VT+G
    AP                    Phy, Fasta   JTT         LG+G               JTT         LG+G
    PHP                   Phy, Fasta   JTT         WAG+G              JTT         WAG+G
    RTR1                  Phy, Fasta   JTT         WAG+G              JTT         WAG+G

Methods

We aligned the phosphatase domains with PROMALS3D, followed by manual adjustment. We inferred the trees by two methods: neighbour joining (NJ) and maximum likelihood (ML). We used MEGA (version 6.0) to infer the NJ trees, and RAxML (version 8.2.9) and PhyML (version 3.0) to infer the ML trees. We used the JTT substitution model in both methods. We also inferred ML trees with the best model selected by ProtTest under the Bayesian information criterion (BIC); the best model for each fold and family is listed in the tables above.

For PhyML, we use the parameters:

-d aa -b 100 -m MATRIX -f e -v e -a e -s SPR --rand_start 5 --n_rand_starts 5 --r_seed 1234

where MATRIX is the name of the substitution matrix, e.g. LG, JTT, WAG; "-f e" turns on +F, "-v e" turns on +I, and "-a e" turns on +G.
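As an illustration of the mapping just described, the following Python sketch turns a model string such as LG+I+G into a PhyML argument list. The function names are hypothetical, and the flag values are copied from the parameter line above rather than re-checked against PhyML. The sketch simply leaves out a toggle when the component is absent; how a component was actually switched off is not stated here, so treat that part as an assumption.

```python
def parse_model(model):
    """Split a model string like 'LG+I+G' into ('LG', {'I', 'G'})."""
    matrix, *extras = model.split("+")
    unknown = set(extras) - {"I", "G", "F"}
    if not matrix or unknown:
        raise ValueError(f"unrecognised model string: {model!r}")
    return matrix, set(extras)


def phyml_args(model):
    """Build the PhyML options for one model (hypothetical helper)."""
    matrix, extras = parse_model(model)
    args = ["-d", "aa", "-b", "100", "-m", matrix]
    if "F" in extras:
        args += ["-f", "e"]  # +F, as described in the text
    if "I" in extras:
        args += ["-v", "e"]  # +I
    if "G" in extras:
        args += ["-a", "e"]  # +G
    # Assumption: omitted toggles fall back to whatever PhyML does by default.
    args += ["-s", "SPR", "--rand_start", "5", "--n_rand_starts", "5",
             "--r_seed", "1234"]
    return args


print(" ".join(phyml_args("LG+I+G")))
# -d aa -b 100 -m LG -v e -a e -s SPR --rand_start 5 --n_rand_starts 5 --r_seed 1234
```

The input alignment option is deliberately left out, since the file names are not given on this page.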

For RAxML, we use the parameters:

-f a -m MODEL -x 1234 -p 5678 -# 100

where MODEL is the RAxML model name, which encodes the substitution matrix together with the +I, +G and +F components (see RAxML's manual).
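For readers who want to reproduce the MODEL value, here is a minimal Python sketch. It assumes the RAxML 8 naming pattern for gamma-distributed protein models, PROTGAMMA[I]MATRIX[F]; this pattern is not spelled out on this page and should be checked against the manual. The helper names are hypothetical. Models without +G (for example the plain JTT runs) are rejected, because this page does not say which rate model was used for them.

```python
def raxml_model(model):
    """Translate e.g. 'LG+I+G' into an assumed RAxML name 'PROTGAMMAILG'."""
    matrix, *extras = model.split("+")
    extras = set(extras)
    if "G" not in extras:
        raise ValueError("this sketch only covers +G (PROTGAMMA) models")
    name = "PROTGAMMA"
    if "I" in extras:
        name += "I"          # proportion of invariable sites
    name += matrix.upper()   # substitution matrix, e.g. LG, WAG, VT
    if "F" in extras:
        name += "F"          # empirical amino-acid frequencies
    return name


def raxml_args(model):
    """Options from the parameter line above with MODEL filled in."""
    return ["-f", "a", "-m", raxml_model(model),
            "-x", "1234", "-p", "5678", "-#", "100"]


print(" ".join(raxml_args("LG+F+I+G")))
# -f a -m PROTGAMMAILGF -x 1234 -p 5678 -# 100
```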

The best models chosen by different measurements (AIC, AICc, BIC, DT) are broadly similar, but the rankings differ. For example, for human PTP (each entry is weight(rank)):

    model          AIC         AICc        BIC         DT
    LG+I+G+F       0.98(1)     0.00(3)     0.00(3)     0.01(4)     
    LG+I+G         0.02(2)     1.00(1)     1.00(1)     0.01(1)     
    LG+G+F         0.00(3)     0.00(4)     0.00(4)     0.01(3)     
    LG+G           0.00(4)     0.00(2)     0.00(2)     0.01(2) 
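The weight(rank) entries can also be read programmatically. The Python sketch below parses the human PTP table above and reports the top-ranked model under each measurement; the function name and the whitespace-separated layout are assumptions made for illustration.

```python
import re

# Human PTP table, copied from above.
TABLE = """\
model          AIC         AICc        BIC         DT
LG+I+G+F       0.98(1)     0.00(3)     0.00(3)     0.01(4)
LG+I+G         0.02(2)     1.00(1)     1.00(1)     0.01(1)
LG+G+F         0.00(3)     0.00(4)     0.00(4)     0.01(3)
LG+G           0.00(4)     0.00(2)     0.00(2)     0.01(2)
"""

CELL = re.compile(r"([\d.]+)\((\d+)\)")  # weight(rank)


def best_by_criterion(table):
    """Return {criterion: model ranked first} for a weight(rank) table."""
    header, *rows = [line.split() for line in table.strip().splitlines()]
    criteria = header[1:]
    best = {}
    for model, *cells in rows:
        for criterion, cell in zip(criteria, cells):
            _weight, rank = CELL.fullmatch(cell).groups()
            if int(rank) == 1:
                best[criterion] = model
    return best


print(best_by_criterion(TABLE))
# {'AIC': 'LG+I+G+F', 'AICc': 'LG+I+G', 'BIC': 'LG+I+G', 'DT': 'LG+I+G'}
```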

However, sometimes the best models chosen by different measurements are quite different. For example, human myotubularins:

    model          AIC         AICc        BIC         DT
    LG+G+F         0.74(1)     0.76(1)     0.00(5)     0.03(4)     
    LG+I+G+F       0.26(2)     0.24(2)     0.00(7)     0.03(3)     
    JTT+G+F        0.00(3)     0.00(3)     0.00(10)    0.01(26)    
    JTT+I+G+F      0.00(4)     0.00(4)     0.00(13)    0.01(25)    
    VT+G+F         0.00(5)     0.00(5)     0.00(12)    0.00(62)    
    VT+I+G+F       0.00(6)     0.00(6)     0.00(15)    0.00(59)    
    WAG+G+F        0.00(7)     0.00(7)     0.00(14)    0.01(36)    
    WAG+I+G+F      0.00(8)     0.00(9)     0.00(16)    0.01(35)    
    LG+G           0.00(9)     0.00(8)     0.97(1)     0.03(1)     
    LG+I+G         0.00(10)    0.00(10)    0.03(2)     0.03(2) 

And human DSP+PTEN:

    model          AIC         AICc        BIC         DT
    LG+G+F         0.73(1)     0.00(61)    0.00(3)     0.02(4)     
    LG+I+G+F       0.27(2)     0.00(62)    0.00(4)     0.02(3)     
    LG+G           0.00(3)     0.00(7)     0.93(1)     0.02(1)     
    LG+I+G         0.00(4)     0.00(22)    0.07(2)     0.02(2)
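
The split between AIC and BIC in the tables above is easier to read next to the standard textbook definitions of the criteria, given here for orientation rather than taken from this page. L is the maximised likelihood, k the number of free parameters, and n the sample size (commonly the alignment length, which is an assumption about how the analysis was configured).

```latex
\begin{aligned}
\mathrm{AIC}  &= -2\ln L + 2k \\
\mathrm{AICc} &= \mathrm{AIC} + \frac{2k(k+1)}{n-k-1} \\
\mathrm{BIC}  &= -2\ln L + k\ln n
\end{aligned}
```

A +F model estimates 19 additional free parameters (20 amino-acid frequencies constrained to sum to 1). BIC charges ln n per parameter, which exceeds the AIC charge of 2 whenever n is 8 or more, so BIC penalises +F more heavily. This is consistent with LG+G+F leading under AIC while LG+G leads under BIC. The AICc correction grows quickly as k approaches n, which may be why the +F models rank so low under AICc for human DSP+PTEN.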