HMM

From PhosphataseWiki
Revision as of 17:56, 11 September 2015 by Mark (Talk | contribs)


The page is under construction.

List of HMMs of phosphatase domains

SAC

List of HMMs of accessory domains

PAP_NTD (Purple Acid Phosphatase, N-Terminal Domain)
CDC25_NTD (CDC25, N-Terminal Domain)
IQ
PPIP5K_RimK
STS_UBA
MTMR5_C1
PP2C_C

The Pfam PP2C_C profile matches only the PPP1C subfamily and not the other PPPc subfamilies. It overlaps with our in-house PPPc HMM profile.

General Protocol for HMM building

We usually build HMMs from PSI-BLAST hits.

To find the domain sequences for building an HMM, we run PSI-BLAST with either the domain sequence or the full sequence as the query. It sometimes matters whether you query only the region that is supposed to contain the domain (based on structure or other evidence) or the full sequence. The full sequence is often more sensitive for finding weak hits to the domain. We recommend downloading the Alignment, Search Strategies, and PssmWithParameters files from the PSI-BLAST result for reproducibility.

After several rounds of PSI-BLAST, we download the sequences of the aligned regions (not the complete sequences) from the PSI-BLAST result. Because redundant sequences are not useful for building the HMM profile, we create a non-redundant sequence set with the program CD-HIT (usually with a sequence identity threshold of 70%, i.e. the parameter -c set to 0.7).
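The redundancy-removal step can be sketched as a small Python wrapper around the CD-HIT command line. This is a minimal sketch, not the exact command used for this wiki. The file names are hypothetical, the executable is assumed to be installed as cd-hit on the PATH, and the word size -n 5 is an assumption based on the CD-HIT guide's recommendation for identity thresholds between 0.7 and 1.0.

```python
import shutil
import subprocess


def cd_hit_command(in_fasta, out_fasta, identity=0.7, word_size=5):
    """Return the cd-hit argument list for clustering at the given identity.

    identity maps to -c (0.7 = 70% sequence identity, as in the protocol).
    word_size maps to -n; 5 is an assumed value suited to thresholds >= 0.7.
    """
    return [
        "cd-hit",
        "-i", in_fasta,    # input FASTA of aligned regions from PSI-BLAST
        "-o", out_fasta,   # output FASTA of cluster representatives
        "-c", str(identity),
        "-n", str(word_size),
    ]


def run_cd_hit(in_fasta, out_fasta, identity=0.7):
    """Run cd-hit if it is installed; raise a clear error otherwise."""
    if shutil.which("cd-hit") is None:
        raise RuntimeError("cd-hit not found on PATH")
    subprocess.run(cd_hit_command(in_fasta, out_fasta, identity), check=True)
```

For example, run_cd_hit("psiblast_hits.fasta", "psiblast_hits_nr70.fasta") would write the representative sequences to the second file (both names are placeholders).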

We then carry out multiple sequence alignment (MSA) with a program such as MUSCLE and manually adjust the alignment, usually by removing low-quality regions in an MSA editor such as Jalview. We further inspect the distribution of sequence lengths in the MSA and remove the sequences that are shorter than most sequences in the MSA. How we remove the short sequences depends on the distribution and on the MSA itself, so it varies case by case.
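As an illustration of the length inspection, the sketch below parses an aligned FASTA file, counts the non-gap residues in each sequence, and drops sequences shorter than a fraction of the median length. The cutoff rule (0.8 times the median) and the function names are assumptions made for this example only; in practice the cutoff is chosen case by case after looking at the distribution.

```python
from statistics import median


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def ungapped_length(seq):
    """Count residues, ignoring the gap characters '-' and '.'."""
    return sum(1 for c in seq if c not in "-.")


def drop_short(records, min_fraction=0.8):
    """Keep sequences whose ungapped length is >= min_fraction * median.

    min_fraction is a hypothetical default, not a value from the protocol.
    """
    lengths = [ungapped_length(seq) for _, seq in records]
    cutoff = min_fraction * median(lengths)
    return [rec for rec, n in zip(records, lengths) if n >= cutoff]
```

Printing sorted(ungapped_length(s) for _, s in records) before filtering is a quick way to look at the length distribution and choose a sensible cutoff.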

After removing the short sequences, we rerun the MSA program and manually adjust the resulting MSA again. Then we build the HMM with the program HMMBUILD. Depending on the format you use, you may need to convert the MSA into Stockholm format before running HMMBUILD.
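The format conversion and the final build step can be sketched as follows. This is a minimal sketch under stated assumptions: the alignment is already available as (name, aligned sequence) pairs, only the first whitespace-delimited token of each name is kept as the sequence identifier, and HMMER's hmmbuild is called in its basic form (output HMM file first, then the alignment file). The file names in the usage note are placeholders.

```python
def to_stockholm(records):
    """Format aligned (name, sequence) pairs as a one-block Stockholm file."""
    if not records:
        raise ValueError("no sequences given")
    if len({len(seq) for _, seq in records}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # Stockholm names cannot contain whitespace, so keep the first token only.
    names = [name.split()[0] for name, _ in records]
    if len(set(names)) != len(names):
        raise ValueError("sequence names must be unique")
    width = max(len(name) for name in names)
    lines = ["# STOCKHOLM 1.0", ""]
    for name, (_, seq) in zip(names, records):
        lines.append(f"{name.ljust(width)}  {seq}")
    lines.append("//")
    return "\n".join(lines) + "\n"


def hmmbuild_command(hmm_out, msa_file):
    """Return the argument list for: hmmbuild <hmmfile_out> <msafile>."""
    return ["hmmbuild", hmm_out, msa_file]
```

A typical use would be to write to_stockholm(records) to a file such as domain.sto and then pass hmmbuild_command("domain.hmm", "domain.sto") to subprocess.run with check=True.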