Difference between revisions of "HMM PD00127"
Revision as of 05:54, 10 September 2015
Description
This HMM extends the Pfam profile Pur_ac_phosph_N, which fails to detect the domain in some phosphatases, such as C. elegans ACP5 (Phosphatase_Sequence_CeleP014_AA) and one of the Monosiga ACP5s (Phosphatase_Sequence_MbreP082_AA).
How the HMM is built
We searched for Pfam domains in C. elegans ACP5 (Phosphatase_Sequence_CeleP014_AA) and human ACP5 using the Pfam server. Based on the position of the Metallophos domain and the length of the linker region between the Metallophos and Pur_ac_phosph_N domains, we inferred that residues 1-100 of C. elegans ACP5 (Phosphatase_Sequence_CeleP014_AA) may contain the Pur_ac_phosph_N domain. We then ran PSI-BLAST with this region against the NR database on the NCBI BLAST server.
We downloaded the resulting sequences and then: i) removed redundant sequences above 70% identity using CD-HIT (parameter -c 0.7, other parameters default); ii) aligned the sequences with MUSCLE (default parameters); iii) manually curated the alignment by removing low-quality columns; iv) removed sequences shorter than the length cutoff set by the shortest 25% of sequences; v) realigned the sequences with MUSCLE; vi) manually curated the alignment again; vii) built an HMM profile using HMMBUILD (default parameters); and viii) validated the HMM profile by checking whether it could detect the domain in protein phosphatases such as C. elegans ACP5 (Phosphatase_Sequence_CeleP014_AA).
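The automated steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the script that was actually used: the function names and file names are hypothetical, the length cutoff in step iv is interpreted as the sequence length at the 25% position of the sorted lengths, and the MUSCLE command uses the version 3 flags (`-in`, `-out`), since the page does not say which version was run. The manual curation steps (iii and vi) are not represented.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def drop_shortest_quartile(records, fraction=0.25):
    """Step iv (assumed interpretation): remove sequences shorter than the
    length found at the `fraction` position of the sorted length list.
    Gap characters are ignored so that aligned input can be used."""
    if not records:
        return []
    lengths = sorted(len(seq.replace("-", "")) for _, seq in records)
    cutoff = lengths[int(len(lengths) * fraction)]
    return [(h, s) for h, s in records if len(s.replace("-", "")) >= cutoff]


def pipeline_commands(prefix):
    """Return the command lines for steps i, ii and vii as argument lists.
    File names derived from `prefix` are hypothetical placeholders."""
    return [
        # i) cluster at 70% identity, other CD-HIT parameters left at default
        ["cd-hit", "-i", f"{prefix}.fasta", "-o", f"{prefix}.nr70.fasta", "-c", "0.7"],
        # ii) align with MUSCLE (version 3 syntax assumed)
        ["muscle", "-in", f"{prefix}.nr70.fasta", "-out", f"{prefix}.aln.fasta"],
        # vii) build the profile HMM from the curated alignment
        ["hmmbuild", f"{prefix}.hmm", f"{prefix}.aln.fasta"],
    ]
```

Each argument list returned by `pipeline_commands` could be passed to `subprocess.run(cmd, check=True)` once the corresponding program is installed; the curated alignment would need to be saved under the expected file name before the `hmmbuild` step.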