Difference between revisions of "HMM PD0133"

From PhosphataseWiki
Revision as of 19:25, 11 September 2015


Symbol: MTMR5_C1

Name: MTMR5, C1 domain

Description

The MTMR5 subfamily has a C1 domain that can be detected by the Pfam C1_1 profile in the C1 clan. We aim to build an HMM that detects the C1 domain of MTMR5.

How the HMM is built

We ran PSI-BLAST with the sequence containing the C1 domain of C. elegans mtm-5 (residues 1540-1590, determined by searching the complete sequence against the CDD and Pfam databases). However, the search against the protein NR dataset converged right after the first round.

We ran PSI-BLAST with the sequence containing the C1 domain of Drosophila melanogaster sf (residues 1790-1840, determined by searching the complete sequence against the CDD and Pfam databases). When searched against the protein NR dataset, the second round returned more than 20,000 hits, which exceeds the maximum of the NCBI BLAST server. We therefore searched against UniProt instead.

Below are the two full sequences used:

C. elegans sequence:

MRDPDKVKSGPICDTVAVIVLEESDDENALPDVLHEVQSPHTSDNIPTSSIKKFARPRGWYNQSVSSPSEFFYQILTTERGTRRIAYVLSTWEEDEKTLNFKAVSIVLISQNFHPKAFKEILLEISNDLRTPEFSSSSELIRFLTYELVEEGSTIEIRTKTLHVELGFELIPISPVTGKDVAMLFKMLGFQNVIKIIHALLSDCRIVLASSSLMRLSRCQNAILSLLYPFEYVHSCVTILPDSLAEVLESPTPFLIGVLSEFVTSFGDENIVVYLDNGEVHVPDHAEIYKSDDYYYNSLHQRLRDVMFTTTSQEDLSIPNEERIEVDDFILDKKLRACFIYYFAELLYGYQYYILYTRIKGNFEKKLTTSLTFHVGAFRGFRKLTDMMSSSLLKSVYFQTFILTRALPRRKHDLFDEISCFKELDQLIFKQNSTSSESKKIIEHISCELIQKERYMEKCSARKQEIFTKIHWISGKELAQNNNSIIHTVKPKMRSNVILQAMLPVVNTHAEYHANQFEAYAHRIEALRNCLAAIFEGKVAFASKSLDAVKSSMRFAPLRIELCRLLNQKCSHDKLTDKQFEDIALLMNAALQAECEEDKDGVVRSLMYLSNVYSRKVAQGMQQYMYTAVQEHKVWKNQRFWTSCFYYEVHEMLFSEMLQKDRKITESLWCHTLRPCAMEMINTDDTDQEELVKQENEMIQAQAKHFANILISLQIPLSEEFFEHEDAHRSVLNEKCKWIVNTLDSILGVTGRINGLSLSRIQTYVEAHVESLRDVYVEMSTGEHLKKGNFDPVLAHGEFLISDPIDCYLLTSIEESEMSLNRLENLLPADGSLFLTNYRVIFKGKSVDINATNGTIVQTIPLYSMESFKKLTNKKLIPTQLIEKGVKIEHIISIRSSCASSIIIAFDEDEINNMAIEKFLEVIETNSHNSFAFYNTRKDMKVVENGSHKFGTLNSAIRGFTKKKTDTRRIRSHSSHRGSIQLSFDKMEELDYLKKNAHIRYAVIDYPRIGLNSKIVKLRMSHSNLDYTICPSYPGNFIVPSETNESELAKVAKGFVEHRLPVVVWMNENGALLVRASAFTSIDMVKKLKKVVNYRRNASKLTGSMTGSQQTLHSKASSNEESSSNIVAGAEIKSAEVQMNYIAKLSNSSQRAVSYALPTQYADKFSTFNDGCTLTQNNANGFPTTRIHRKALYVLLEKGHGVKIPIDSNAEAIMVRSVKESELRRSLQRARQICSSEFQVENRTSFLESWNASNWPQCVSRMIELSNSIVALMNLYNSSVAICLEAGRSITTILSSLSQLLSDPYYRTCDGFQVLVEKEWLAFGHYFHKDTETSSPSFICFLDCVYQISQQYPTAFEFSYFYISFLAYHSTAGYFRTFIDDCEEKRLQSDANEFYLPDNLATINVWEFIKLRNRVSAAFYNELYEQIGDIVIPSSSIPQIHMWPFLAETHLKYGSPYDIEPASHEQQLVDPDYEEEEDWSKLNNTDIDERHLNRRVRSPERDPANMDMIRLLQKSYLTELFDASDRKTTTNGESNGKETIHELTPFTVGARPVQCCYCTNILTRWSKAVHCKKCRIHVHEGCVNRNITIGNITHTWDAKPFEDIKMPSGAIQIGTPQAEKMLHSPNNTLTRESMSPPTANTIPPLCTGYLSKRGAKLKLWVPRFFVLYPDSPKVYYYEDFENWKTAEKPSGCIDLVDFKSFNLEQTGRRGLIELHMKNKTHRLLSENINEAIRWKECIEQVIRD

Drosophila melanogaster sequence:

MSRLADYFVIVGYDSDKEKTASNVGGQPTCGKIVQRFPEKDWPDTPFIEGIEWFCQPLGWSLSYEKQEPKFFVSVLTDIDANKHYCACLSFHETVAITQTRSVDDEDETIGSSRLLGATPSSMDGITTTSTPASITHHSVMYAPKCLVLISRLDCAETFKNCLGTIYTVYIENLAYGLETLIGNILGCIQVPPAGGPQVRFSIGAGDKQSLQPPQSSSLPTTGSGVHFLFKQLGIKNVLILLCSVMTENKILFLSKCYWHLTDSCRALVALMYPFRYTHVYIPILPAPLTEVLSTPTPFIMGIHSSLQTEITDLLDVIVVDLDGGLVTIPESLTPPVPILPSPLWEQTQDLLSMILFPNLAQADLAFPTLERPSAIAKTDAQIDKELRAIFMRLFAQLLQGYRSCLTIIRIHPKPVITFHKAGFLGARDLIESEFLFRVLDSMFFTTFVNERGPPWRSSDAWDELYSSMNELLKSEAQNRNLVGRTQRFKGYFNFTFPSYFQILTHIQELGRVLYENEGTLAHISYAQKVLRPPEGAFQRIHQPAFPRISSEKVELIIQEGIRKNGVPQRFHVTRNQHRIIPMGPRLPEALDVRPNVQNSARRLEVLRICVSYIFENRITDARKLLPAVMRTLMHRDARLILCREFFGYVHGNKAVLDHQQFELVVRFMNKALQKSSGIDEYTVAAALLPMSTIFCRKLSTGVVQFAYTEIQDHAIWKNLQFWESTFFQDVQGQIKALYLLHRRQNEHQKEANCVLDEVPLEEPTALEITAEQLRKSPNIEEEKKAELAKSEESTLYSQAIHFANRMVSLLIPLDVNVDAASKPKPAFRLEENQSVSNSIMGSHSLSEHSDEGFEENNALEIGVTVGKTISRFIDCVCTEGGVTSEHIRNLHDMVPGVVHMHIESLEPVYLEAKRHPHVQKPKIQTPCLLPGEDLVTDHLRCFLMPDGREDETQCLIPAEGALFLTNYRVIFKGSPCDPLFCEQVIVRTFPIASLLKEKKISVLYLAHLDQTLTEGLQLRSSSFQLIKVAFDPEVTPEQIESFRKILSKARHPFDEFEYFAFQSYGTMLQGVAPLKTKEKYSTLKGFAKKTLLRGAKKAGFKQKQQTKRKLVSDYDYGSADAQETQSIDDELEDGDEFETQNNAMPRLLTTKDVERMRERSYVQDWKRLGFDAESQRGFRISNANTSYATCRSYPAIIVAPVQCSDAAIMHLGRCFKGQRIPLPTWRHANGALLIRGGQPNSKSVIGMLKNTTGSTTNAHHDVTHYPEQDKYFLALINTMPKLTPLALNQYSGMNLSMSSLMGHSSSDDRQPLTPELSRKHKNNLDISDGNKSSQGGKGGTMKGNPKNSLAHPFRKMRLYALGEKSQAKSNMNVDFCADFIPVDYPDIRQSRPAFKKLIRACMPSHNTNEADGQSFAKMVEQSDWLQQISSLMQLSGAVVDLIDLQESSVMLSLEDGSDVTAQLSSIAQLCLDPYYRSLDGFRVLVEKEWLAFGHRFAHRSNLKPSHANTNIAFAPTFLQFLDVVHQLQRQFPMAFEFNDFYLRFLAYHSVSCRFRTFLFDCELERSDSGIAAMEDKRGSLNAKHMFGAGGMATNGSDDECSVYPLDIRSQRAPAPLNRIGHSIFDYIERQHNKTPIFYNFLYSGDKSVTLRPQNNVAALDLWCYYTNEELAQGAPYDLEVTTVDDEIDLSETKGKRMVITAGYDNMEKCNPSAYVCLLSEVKQAETERGHLPQKWLQVWNSLEVPQLEPVARNTSLGNIFVQTHQHKRSTLEIIMKGRLAGYQDKYFHPHRFEKHPYTTPTNCNHCTKLLWGPVGYRCMDCGNSYHEKCTEHSMKNCTKYKAIDGAVGPPNVNMSQGDTASIASSAATTARTSSHHFYNQFSSNVAENRTHEGHLYKRGALLKGWKQRWFVLDSIKHQLRYYDTSEDTAPKGIIELAEVQSVTAAQPAQIGAKGVDEKGFFDLKTSKRIYNFYAINANLAQEWIEKLQACLQ
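
The domain ranges quoted above (1540-1590 in the C. elegans sequence, 1790-1840 in the Drosophila sequence) can be cut out of the full sequences before submitting them as PSI-BLAST queries. The following is a minimal sketch of that step, not the procedure actually used on this page. It assumes the ranges are 1-based and inclusive, which is the usual convention for CDD and Pfam coordinates but is not stated here. The function names `extract_region` and `to_fasta` are hypothetical helpers introduced for illustration.

```python
def extract_region(sequence: str, start: int, end: int) -> str:
    """Return residues start..end of a protein sequence.

    Assumption: coordinates are 1-based and inclusive.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    if end > len(sequence):
        raise ValueError("end lies beyond the end of the sequence")
    # Convert the 1-based inclusive range to a 0-based half-open slice.
    return sequence[start - 1:end]


def to_fasta(header: str, sequence: str, width: int = 60) -> str:
    """Format a sequence as a FASTA record with fixed-width lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy input: the first 20 residues of the C. elegans sequence above.
    toy = "MRDPDKVKSGPICDTVAVIV"
    print(to_fasta("toy_5-10", extract_region(toy, 5, 10)))
```

With the full sequences pasted in place of `toy`, `extract_region(seq, 1540, 1590)` and `extract_region(seq, 1790, 1840)` would each return a 51-residue query under the stated coordinate assumption.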