Difference between revisions of "HMM PD00128"

From PhosphataseWiki

Revision as of 17:38, 10 September 2015

Symbol: CDC25_NTD

Name: CDC25, N-Terminal Domain

Description

The CDC25_NTD HMM profile is related to the Pfam profile M-inducer_phosp (PF06617). The Pfam profile detects CDC25_NTD domains in deuterostome CDC25s and in the basal eumetazoan Nematostella vectensis, but it finds no CDC25_NTD domain in ecdysozoan CDC25s. Using our in-house profile, we are able to detect the CDC25_NTD domain in insect CDC25s (e.g. string and twine in Drosophila melanogaster) and in CDC25s from nematodes of the order Trichocephalida. We did not find a CDC25_NTD domain in Caenorhabditis species, either by comparing the full sequence and the domain sequence of Trichinella spiralis CDC25 against the NR database via the NCBI BLAST server or by comparing Caenorhabditis sequences with our in-house CDC25_NTD profile.

How the HMM is built

We searched for Pfam domains in Drosophila melanogaster string (one of the two fly CDC25s), human CDC25A and Nematostella vectensis CDC25 via the Pfam server. Based on the position of the CDC25 phosphatase domain and the length of the linker region between the phosphatase domain and the CDC25_NTD/M-inducer_phosp domain, we hypothesized that residues 1-300 of Drosophila melanogaster string may contain the CDC25_NTD/M-inducer_phosp domain. We then PSI-BLASTed this region against the NR database via the NCBI BLAST server.
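
The 1-300 region is given in 1-based, inclusive residue coordinates, whereas Python slices are 0-based and half-open. As a minimal sketch of how such a region could be cut out of a FASTA record before submitting it to PSI-BLAST, the helpers below (parse_fasta and extract_region, both hypothetical names that are not part of the original workflow) handle the coordinate conversion:

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def extract_region(sequence, start, end):
    """Return residues start..end (1-based, inclusive).

    Sequences shorter than `end` are returned up to their last residue.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return sequence[start - 1:end]
```

Calling extract_region(sequence, 1, 300) on the full-length protein would yield the query region described above.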

We downloaded the sequences. Then, we i) removed redundant sequences above 70% identity using the CD-HIT program (parameter -c 0.7, other parameters default), ii) aligned the sequences with the MUSCLE program (default parameters), iii) manually curated the alignment by removing low-quality columns, iv) removed sequences shorter than the length cutoff set by the shortest 25% of sequences, v) realigned the sequences with the MUSCLE program, vi) manually curated the alignment again, vii) built the HMM profile using the HMMBUILD program (default parameters), and viii) validated the HMM profile by checking whether it detected the domain in protein phosphatases such as Drosophila melanogaster string and C. elegans CDC25s. We found a hit in Amphimedon queenslandica CDC25 (conditional E-value 0.00018). The best hit in C. elegans has a conditional E-value of 0.0049.
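
The automated parts of this procedure (steps i, ii, iv and vii) can be sketched as follows. This is an illustration under stated assumptions, not the script used to build the profile: the file names are hypothetical, the MUSCLE call uses the version 3 command-line syntax (-in/-out), and the length filter implements one possible reading of step iv, in which the cutoff is the length of the longest sequence within the shortest 25% and only strictly shorter sequences are dropped. The manual curation steps (iii and vi) are not represented.

```python
import math


def length_cutoff(lengths, fraction=0.25):
    """Length of the longest sequence within the shortest `fraction` of sequences.

    Assumption: this is one reading of step iv, not a confirmed definition.
    """
    if not lengths:
        raise ValueError("no sequences given")
    ordered = sorted(lengths)
    k = max(1, math.floor(len(ordered) * fraction))
    return ordered[k - 1]


def drop_short(records, fraction=0.25):
    """Keep (header, sequence) records at least as long as the cutoff."""
    cutoff = length_cutoff([len(seq) for _, seq in records], fraction)
    return [(h, s) for h, s in records if len(s) >= cutoff]


def pipeline_commands(raw_fasta, prefix):
    """Build argument lists for the external tools (file names are hypothetical)."""
    nr = f"{prefix}.nr70.fasta"
    aln = f"{prefix}.aln.fasta"
    hmm = f"{prefix}.hmm"
    return [
        # Step i: cluster at 70% identity, other parameters default.
        ["cd-hit", "-i", raw_fasta, "-o", nr, "-c", "0.7"],
        # Steps ii and v: MUSCLE with default parameters (v3 syntax assumed).
        ["muscle", "-in", nr, "-out", aln],
        # Step vii: build the profile from the curated alignment.
        ["hmmbuild", hmm, aln],
    ]
```

Each argument list could be passed to subprocess.run(cmd, check=True) once the tools are installed; the manual curation would take place between the MUSCLE and hmmbuild calls.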

We also carried out PSI-BLAST using the CDC25_NTD/M-inducer_phosp domain of Nematostella vectensis CDC25. We found a putative CDC25_NTD/M-inducer_phosp domain in the order Trichocephalida. We confirmed the domain by comparing it with our in-house CDC25_NTD HMM profile. We further PSI-BLASTed the CDC25_NTD domain of Trichinella spiralis CDC25, but did not find a hit in any Caenorhabditis species.
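
Confirming a candidate domain against an HMM profile usually comes down to reading the per-domain conditional E-value. The sketch below assumes the comparison was run with HMMER 3's hmmsearch and its --domtblout option, which the text does not state; in that whitespace-delimited table the conditional E-value is the 12th column. The function names are hypothetical.

```python
def parse_domtblout(text):
    """Parse HMMER 3 --domtblout text into a list of per-domain hit dicts."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        # 22 fixed columns followed by a free-text description.
        fields = line.split(None, 22)
        if len(fields) < 22:
            continue  # not a complete hit line
        hits.append({
            "target": fields[0],
            "query": fields[3],
            "c_evalue": float(fields[11]),  # conditional E-value
            "i_evalue": float(fields[12]),  # independent E-value
            "ali_from": int(fields[17]),
            "ali_to": int(fields[18]),
        })
    return hits


def best_hit(hits):
    """Return the hit with the lowest conditional E-value, or None if empty."""
    return min(hits, key=lambda h: h["c_evalue"]) if hits else None
```

Applied to the table from a search against a set of CDC25 sequences, best_hit would return the strongest domain match for inspection.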