I have the following file.
chr11_pilon3.g3568.t1 transcript:OIT01734 transcript:OIT01734 1.1e-107 389.8 1000 218 992 1 216 130 345 MDALTRHIQGDVPWCMLFADDIILIDETRAGVSERLEIWRQTLESKGFKISRSKTEYLECKFGDEPSGVGREVMLGSQAIAKRDSVRYLGSVIQGDGEIDGDVTHRIGAGWSKWRLASGVLCDKKIPHKLKGKFFRAMVRPAMFYEAECWPVKNSHIQRMKVAEMRMLRWMCGHTRLDKIKNEVIRQKVGVAPVDKKMGEARLRWFGHVRRRGPDA MDALTRHIQGDVPWCMLFADDIVLIDETRVGVNERLEVWRQTLESKGFKLSRSKTEYLECKFSAESSEVGRDVKLGSQVIAKRDSFRYLGSVIQGEGEIDGDVTHRIGAGWSKWRLASGVLCDKKVPQKLKGKFYRAVVRPAMLYGAECWPVKNSHVQRMKVAEMRMLRWMRGLTRLDRIRNEVIREKVGVALVDEKMREARLRWYGHVRRRRPDA MDALTRHIQGDVPWCMLFADDIILIDETRAGVSERLEIWRQTLESKGFKISRSKTEYLECKFGDEPSGVGREVMLGSQAIAKRDSVRYLGSVIQGDGEIDGDVTHRIGAGWSKWRLASGVLCDKKIPHKLKGKFFRAMVRPAMFYEAECWPVKNSHIQRMKVAEMRMLRWMCGHTRLDKIKNEVIRQKVGVAPVDKKMGEARLRWFGHVRRRGPDAR* MKVWERVVEARVREMTSISVNQFGFMPGRSTTEAIHLVRRLVEHFRDKKKDLHMVFIDLENAYDKVPREVLWRCLEAKSVPEAYIRVIKDMYDGAKTRVRTVGGDSDHFPVVMGLHQGSALSPLLFALVMDALTRHIQGDVPWCMLFADDIVLIDETRVGVNERLEVWRQTLESKGFKLSRSKTEYLECKFSAESSEVGRDVKLGSQVIAKRDSFRYLGSVIQGEGEIDGDVTHRIGAGWSKWRLASGVLCDKKVPQKLKGKFYRAVVRPAMLYGAECWPVKNSHVQRMKVAEMRMLRWMRGLTRLDRIRNEVIREKVGVALVDEKMREARLRWYGHVRRRRPDAPVRIYKSAILGHLNSHGSQNALAGPVEAEENRQKTKKEVMEEIIQKSKFFKAQKAKDREENDELTEQLDKDFTSLVESKALLSLTQPDKINALKALVNKNISVGNVKKDEVADVPRKASIGKEKPDTYEMLVSEMALDMRARPSDRTKTPEEIAQEEKERLELLEQEXXXXXXXXXXXXXXDGNASDDNSKLVKDPRTVSGDDLGDDLEEVPRTKLGWIGEILRRKENELESEDAASSGDSDDGEDEGXXXXXXXXXXXXXXXXXXXXDEEQGKTQTIKDWEQSDDDIIDTELEDDDEGFGDDAKKVVKIKDHKEENLSITVAAENKKKMQVFYGVLLQYFAVLANKKPLNSKLLNLLVKPLMEMSAVSPYFAAICARQRLQRTRAQFCEDLKNTGKSSWPSLKTIFLLRLWSMIFPCSDFRHCVMTPAILLMCEYLMRCTIISGRDIAIASFLCSLLLSVIKQSQKFCPEAIVFIQTLLMAALDRKQRSNSQLDNLMEIKELGPLLCIRSSKVEMDSLDFLTLMDLPEDSQYFHSDNYRTSMLVTVLETLQGFVNVYKELISFPEIFMLISKLLCKMAGENHIPDALREKIKDVSQLIDTKAQEHHMLRQPLKMRKKKPVPIRMLNPKFEENFVKGRDYDPDRERA 389.8 1000 216 85.6 185 31 200 0 0 92.6 0 22IV6AV2SN4IV11IL12GSDA1PS1GE3ED1MK4AV6VF9DE29IV1HQ6FY2MV5FL1EG10IV14CR1HL4KR1KR5QE5PL2KE2GR6FY6GR3 85.6 1.1e-107 99.1
gene.9403.0.4.p1 transcript:OIT35479 transcript:OIT35479 8.5e-191 667.5 1721 690 406 1 378 1 378 MLSAPRVSPPAVAVAAPARFKFPNVCVNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIIWGGTEDDDSSIPSKEVLSWKPLASTPXXXXXXXXXXXXXDEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHNKHNIADASSRSSFSSYNEPDQLKEQQTLSLPRGRAKIQQLDDKKNFQKLIRVEDEDRGIAIENVSKHFAGYSIDSHAQSARVVHPGSKASASPLRGWGGGSSHYSLKRDEIFRERQNLGDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQV MLSAPRAPPPAVAVAAPARFKFQNVCGNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIVWGGTEDDDSSIPSKEVLSWKPLASTSPDNNHPPPTQSSSNEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHDKHNTTDASSRSSFSSYNEPGQLKEQQTLSLPRGRAKIQQLEDRKNSQKLIRVEDEDRDIAIENVSKHFAGYSSDSHAHSARVVHPGSKASASPLRGWGGGSSHYSLKREEIFRQRRNLDDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQV MLSAPRVSPPAVAVAAPARFKFPNVCVNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIIWGGTEDDDSSIPSKEVLSWKPLASTPXXXXXXXXXXXXXDEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHNKHNIADASSRSSFSSYNEPDQLKEQQTLSLPRGRAKIQQLDDKKNFQKLIRVEDEDRGIAIENVSKHFAGYSIDSHAQSARVVHPGSKASASPLRGWGGGSSHYSLKRDEIFRERQNLGDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQVLSTCRSFSKSGVPFHSMVVTGGFCQRTQLENLRQELDILIATPGRFMFLIKEGYLQLTNLKCAVLDEVDILFSDEDFETAFQCLINSSPITTQYLFVTATLPMDIYNKLVESFPDCELVSGPGMHRTSPGLEEFLVDCSGDETAEKSPDTAFINKKNALLHLVEDSPVPKTIVFCNKIDSCRKVENALKRFDRKGFSIKILPFHAALDQRRRLANMEEFRRSKMENVSLFLVCTDRASRGIDFEGVDHVVLFDYPRDPSEYVRRVGRTARGAGGKGKAFIFAVGKQVSLARRIMERNKKGHPVHDVPSILT* MLSAPRAPPPAVAVAAPARFKFQNVCGNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIVWGGTEDDDSSIPSKEVLSWKPLASTSPDNNHPPPTQSSSNEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHDKHNTTDASSRSSFSSYNEPGQLKEQQTLSLPRGRAKIQQLEDRKNSQKLIRVEDEDRDIAIENVSKHFAGYSSDSHAHSARVVHPGSKASASPLRGWGGGSSHYSLKREEIFRQRRNLDDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQVCQISSSIKGTFATYSPYCSATTHTKRKK 667.5 1721 378 91.0 344 34 352 0 0 93.1 0 
6VASP14PQ3VG50IV25PSXPXDXNXNXHXPXPXPXTXQXSXSXSDN38ND3ITAT14DG20DE1KR2FS11GD14IS4QH30DE4EQ1QR2GD102 91.0 8.5e-191 54.8
gene.9403.0.5.p1 transcript:OIT35479 transcript:OIT35479 8.5e-191 667.5 1721 690 406 1 378 1 378 MLSAPRVSPPAVAVAAPARFKFPNVCVNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIIWGGTEDDDSSIPSKEVLSWKPLASTPXXXXXXXXXXXXXDEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHNKHNIADASSRSSFSSYNEPDQLKEQQTLSLPRGRAKIQQLDDKKNFQKLIRVEDEDRGIAIENVSKHFAGYSIDSHAQSARVVHPGSKASASPLRGWGGGSSHYSLKRDEIFRERQNLGDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQV MLSAPRAPPPAVAVAAPARFKFQNVCGNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIVWGGTEDDDSSIPSKEVLSWKPLASTSPDNNHPPPTQSSSNEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHDKHNTTDASSRSSFSSYNEPGQLKEQQTLSLPRGRAKIQQLEDRKNSQKLIRVEDEDRDIAIENVSKHFAGYSSDSHAHSARVVHPGSKASASPLRGWGGGSSHYSLKREEIFRQRRNLDDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQV MLSAPRVSPPAVAVAAPARFKFPNVCVNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIIWGGTEDDDSSIPSKEVLSWKPLASTPXXXXXXXXXXXXXDEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHNKHNIADASSRSSFSSYNEPDQLKEQQTLSLPRGRAKIQQLDDKKNFQKLIRVEDEDRGIAIENVSKHFAGYSIDSHAQSARVVHPGSKASASPLRGWGGGSSHYSLKRDEIFRERQNLGDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQVLSTCRSFSKSGVPFHSMVVTGGFCQRTQLENLRQELDILIATPGRFMFLIKEGYLQLTNLKCAVLDEVDILFSDEDFETAFQCLINSSPITTQYLFVTATLPMDIYNKLVESFPDCELVSGPGMHRTSPGLEEFLVDCSGDETAEKSPDTAFINKKNALLHLVEDSPVPKTIVFCNKIDSCRKVENALKRFDRKGFSIKILPFHAALDQRRRLANMEEFRRSKMENVSLFLVCTDRASRGIDFEGVDHVVLFDYPRDPSEYVRRVGRTARGAGGKGKAFIFAVGKQVSLARRIMERNKKGHPVHDVPSILT* MLSAPRAPPPAVAVAAPARFKFQNVCGNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIVWGGTEDDDSSIPSKEVLSWKPLASTSPDNNHPPPTQSSSNEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHDKHNTTDASSRSSFSSYNEPGQLKEQQTLSLPRGRAKIQQLEDRKNSQKLIRVEDEDRDIAIENVSKHFAGYSSDSHAHSARVVHPGSKASASPLRGWGGGSSHYSLKREEIFRQRRNLDDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQVCQISSSIKGTFATYSPYCSATTHTKRKK 667.5 1721 378 91.0 344 34 352 0 0 93.1 0 
6VASP14PQ3VG50IV25PSXPXDXNXNXHXPXPXPXTXQXSXSXSDN38ND3ITAT14DG20DE1KR2FS11GD14IS4QH30DE4EQ1QR2GD102 91.0 8.5e-191 54.8
gene.69001.9.9.p1 NisylKD955766g0010.1 NisylKD955766g0010.1 1.4e-294 1011.9 2615 531 530 1 530 1 530 MKEMCLAVAPLPFRLGNNLIFHNPLSIGSSSHMDVTRLNSMGGTTTSLYAESAEKDLSDTVSSSRSEGVPLLHMISENESNNWISGDAVVRESEDDEILSLDGDQMSCSLSVVSDSSSLCGDDFIGFEVASEIFGQNFVDAEKSICSVELIAKPGDLVESGVEDDNVSKPFAVKIEEQITDGSSSKSSQVVVQLPLNKGLSAAVSRSVFEVDYIPLWGFTSVCGRRPEMEDALATVPRFLRIPLQMLVGHRVPDGVSRCLSHLTAHFFGVYDGHGGSQVANYCRDRVHAVLAEELEKFMANLNDESIRQNCQEQWKKAFTNCFLMVDDEVGGTGNHEAVAAETVGSTAVVAIVCSSHIIVANCGDSRAVLCRGKEPTALSVDHKPNREDEYARIEAAGGKVIQWNGHRVFGVLAMSRSIGDRYLKPWIIPDPEVMFIPRTKDDECLILASDGLWDVMSNEEACELARKRILLWHKKNGVTLTLERGQGIDPAAQAAAECLSNRAIQKGSKDNITVIVVDLKAQRKFKSKT MKEMCLAVAPLPFRLGNNLIFRNPPSIGSSSHMDATRLNSMGDTTTSLYAESAEKDLSDTVSSSRSEGVPLLPMISENDRNNWIAGDAVVRESEDDEILSLDGDQVSCSLSVVSDSSSLCGDDFIGFEVASDIYGQNFVDAEKSICSVELIAKPGDLVESGVEDDNVSKPFAVKLEEQITDGSSSKSSQVVVQLPLNKGLSAAVSRSVFEVDYIPLWGFTSVCGRRPEMEDALATVPRFLRIPLQMLVGDRVPDGVSRCLSHLTAHFFGVYDGHGGSQVANYCRDRVHAVLAEELEKFMANLNDESIRQNCQDQWKKAFTNCFLKVDDEVGGTGNREAVAAETVGSTAVVAIVCSSHIIVANCGDSRAVLCRGKEPMALSVDHKPNREDEYARIEAAGGKVIQWNGHRVFGVLAMSRSIGDRYLKPWIIPDPEVMFIPRTKDDECLILASDGLWDVMSNEEACELARKRILLWHKKNGVTLTLERGQGIDPAAQAAAECLSNRATQKGSKDNITVIVVDLKAQRKFKSKT MKEMCLAVAPLPFRLGNNLIFHNPLSIGSSSHMDVTRLNSMGGTTTSLYAESAEKDLSDTVSSSRSEGVPLLHMISENESNNWISGDAVVRESEDDEILSLDGDQMSCSLSVVSDSSSLCGDDFIGFEVASEIFGQNFVDAEKSICSVELIAKPGDLVESGVEDDNVSKPFAVKIEEQITDGSSSKSSQVVVQLPLNKGLSAAVSRSVFEVDYIPLWGFTSVCGRRPEMEDALATVPRFLRIPLQMLVGHRVPDGVSRCLSHLTAHFFGVYDGHGGSQVANYCRDRVHAVLAEELEKFMANLNDESIRQNCQEQWKKAFTNCFLMVDDEVGGTGNHEAVAAETVGSTAVVAIVCSSHIIVANCGDSRAVLCRGKEPTALSVDHKPNREDEYARIEAAGGKVIQWNGHRVFGVLAMSRSIGDRYLKPWIIPDPEVMFIPRTKDDECLILASDGLWDVMSNEEACELARKRILLWHKKNGVTLTLERGQGIDPAAQAAAECLSNRAIQKGSKDNITVIVVDLKAQRKFKSKT* 
MKEMCLAVAPLPFRLGNNLIFRNPPSIGSSSHMDATRLNSMGDTTTSLYAESAEKDLSDTVSSSRSEGVPLLPMISENDRNNWIAGDAVVRESEDDEILSLDGDQVSCSLSVVSDSSSLCGDDFIGFEVASDIYGQNFVDAEKSICSVELIAKPGDLVESGVEDDNVSKPFAVKLEEQITDGSSSKSSQVVVQLPLNKGLSAAVSRSVFEVDYIPLWGFTSVCGRRPEMEDALATVPRFLRIPLQMLVGDRVPDGVSRCLSHLTAHFFGVYDGHGGSQVANYCRDRVHAVLAEELEKFMANLNDESIRQNCQDQWKKAFTNCFLKVDDEVGGTGNREAVAAETVGSTAVVAIVCSSHIIVANCGDSRAVLCRGKEPMALSVDHKPNREDEYARIEAAGGKVIQWNGHRVFGVLAMSRSIGDRYLKPWIIPDPEVMFIPRTKDDECLILASDGLWDVMSNEEACELARKRILLWHKKNGVTLTLERGQGIDPAAQAAAECLSNRATQKGSKDNITVIVVDLKAQRKFKSKT 1011.9 2615 530 96.6 512 18 519 0 0 97.9 0 21HR2LP9VA7GD29HP5EDSR4SA20MV25ED1FY40IL74HD62ED11MK10HR40TM127IT25 96.6 1.4e-294 99.8
gene.9403.9.5.p1 transcript:OIT35479 transcript:OIT35479 8.5e-191 667.5 1721 690 406 1 378 1 378 MLSAPRVSPPAVAVAAPARFKFPNVCVNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIIWGGTEDDDSSIPSKEVLSWKPLASTPXXXXXXXXXXXXXDEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHNKHNIADASSRSSFSSYNEPDQLKEQQTLSLPRGRAKIQQLDDKKNFQKLIRVEDEDRGIAIENVSKHFAGYSIDSHAQSARVVHPGSKASASPLRGWGGGSSHYSLKRDEIFRERQNLGDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQV MLSAPRAPPPAVAVAAPARFKFQNVCGNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIVWGGTEDDDSSIPSKEVLSWKPLASTSPDNNHPPPTQSSSNEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHDKHNTTDASSRSSFSSYNEPGQLKEQQTLSLPRGRAKIQQLEDRKNSQKLIRVEDEDRDIAIENVSKHFAGYSSDSHAHSARVVHPGSKASASPLRGWGGGSSHYSLKREEIFRQRRNLDDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQV MLSAPRVSPPAVAVAAPARFKFPNVCVNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIIWGGTEDDDSSIPSKEVLSWKPLASTPXXXXXXXXXXXXXDEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHNKHNIADASSRSSFSSYNEPDQLKEQQTLSLPRGRAKIQQLDDKKNFQKLIRVEDEDRGIAIENVSKHFAGYSIDSHAQSARVVHPGSKASASPLRGWGGGSSHYSLKRDEIFRERQNLGDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQVLSTCRSFSKSGVPFHSMVVTGGFCQRTQLENLRQELDILIATPGRFMFLIKEGYLQLTNLKCAVLDEVDILFSDEDFETAFQCLINSSPITTQYLFVTATLPMDIYNKLVESFPDCELVSGPGMHRTSPGLEEFLVDCSGDETAEKSPDTAFINKKNALLHLVEDSPVPKTIVFCNKIDSCRKVENALKRFDRKGFSIKILPFHAALDQRRRLANMEEFRRSKMENVSLFLVCTDRASRGIDFEGVDHVVLFDYPRDPSEYVRRVGRTARGAGGKGKAFIFAVGKQVSLARRIMERNKKGHPVHDVPSILT* MLSAPRAPPPAVAVAAPARFKFQNVCGNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIVWGGTEDDDSSIPSKEVLSWKPLASTSPDNNHPPPTQSSSNEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHDKHNTTDASSRSSFSSYNEPGQLKEQQTLSLPRGRAKIQQLEDRKNSQKLIRVEDEDRDIAIENVSKHFAGYSSDSHAHSARVVHPGSKASASPLRGWGGGSSHYSLKREEIFRQRRNLDDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQVCQISSSIKGTFATYSPYCSATTHTKRKK 667.5 1721 378 91.0 344 34 352 0 0 93.1 0 
6VASP14PQ3VG50IV25PSXPXDXNXNXHXPXPXPXTXQXSXSXSDN38ND3ITAT14DG20DE1KR2FS11GD14IS4QH30DE4EQ1QR2GD102 91.0 8.5e-191 54.8
The file above contains similar IDs:
gene.9403.0.4.p1
gene.9403.0.5.p1
gene.9403.9.5.p1
If I keep only gene.9403, these IDs become identical. The remaining columns for gene.9403 are also identical, so I want to remove the duplicate entries.
I used this:
awk -F"\t" '!seen[$2, $3, $4, $5, $6, $7,$8, $9,$10,$11,$12, $13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31]++' select-results2.txt
This gave the correct result for the example above:
chr11_pilon3.g3568.t1 transcript:OIT01734 transcript:OIT01734 1.1e-107 389.8 1000 218 992 1 216 130 345 MDALTRHIQGDVPWCMLFADDIILIDETRAGVSERLEIWRQTLESKGFKISRSKTEYLECKFGDEPSGVGREVMLGSQAIAKRDSVRYLGSVIQGDGEIDGDVTHRIGAGWSKWRLASGVLCDKKIPHKLKGKFFRAMVRPAMFYEAECWPVKNSHIQRMKVAEMRMLRWMCGHTRLDKIKNEVIRQKVGVAPVDKKMGEARLRWFGHVRRRGPDA MDALTRHIQGDVPWCMLFADDIVLIDETRVGVNERLEVWRQTLESKGFKLSRSKTEYLECKFSAESSEVGRDVKLGSQVIAKRDSFRYLGSVIQGEGEIDGDVTHRIGAGWSKWRLASGVLCDKKVPQKLKGKFYRAVVRPAMLYGAECWPVKNSHVQRMKVAEMRMLRWMRGLTRLDRIRNEVIREKVGVALVDEKMREARLRWYGHVRRRRPDA MDALTRHIQGDVPWCMLFADDIILIDETRAGVSERLEIWRQTLESKGFKISRSKTEYLECKFGDEPSGVGREVMLGSQAIAKRDSVRYLGSVIQGDGEIDGDVTHRIGAGWSKWRLASGVLCDKKIPHKLKGKFFRAMVRPAMFYEAECWPVKNSHIQRMKVAEMRMLRWMCGHTRLDKIKNEVIRQKVGVAPVDKKMGEARLRWFGHVRRRGPDAR* MKVWERVVEARVREMTSISVNQFGFMPGRSTTEAIHLVRRLVEHFRDKKKDLHMVFIDLENAYDKVPREVLWRCLEAKSVPEAYIRVIKDMYDGAKTRVRTVGGDSDHFPVVMGLHQGSALSPLLFALVMDALTRHIQGDVPWCMLFADDIVLIDETRVGVNERLEVWRQTLESKGFKLSRSKTEYLECKFSAESSEVGRDVKLGSQVIAKRDSFRYLGSVIQGEGEIDGDVTHRIGAGWSKWRLASGVLCDKKVPQKLKGKFYRAVVRPAMLYGAECWPVKNSHVQRMKVAEMRMLRWMRGLTRLDRIRNEVIREKVGVALVDEKMREARLRWYGHVRRRRPDAPVRIYKSAILGHLNSHGSQNALAGPVEAEENRQKTKKEVMEEIIQKSKFFKAQKAKDREENDELTEQLDKDFTSLVESKALLSLTQPDKINALKALVNKNISVGNVKKDEVADVPRKASIGKEKPDTYEMLVSEMALDMRARPSDRTKTPEEIAQEEKERLELLEQEXXXXXXXXXXXXXXDGNASDDNSKLVKDPRTVSGDDLGDDLEEVPRTKLGWIGEILRRKENELESEDAASSGDSDDGEDEGXXXXXXXXXXXXXXXXXXXXDEEQGKTQTIKDWEQSDDDIIDTELEDDDEGFGDDAKKVVKIKDHKEENLSITVAAENKKKMQVFYGVLLQYFAVLANKKPLNSKLLNLLVKPLMEMSAVSPYFAAICARQRLQRTRAQFCEDLKNTGKSSWPSLKTIFLLRLWSMIFPCSDFRHCVMTPAILLMCEYLMRCTIISGRDIAIASFLCSLLLSVIKQSQKFCPEAIVFIQTLLMAALDRKQRSNSQLDNLMEIKELGPLLCIRSSKVEMDSLDFLTLMDLPEDSQYFHSDNYRTSMLVTVLETLQGFVNVYKELISFPEIFMLISKLLCKMAGENHIPDALREKIKDVSQLIDTKAQEHHMLRQPLKMRKKKPVPIRMLNPKFEENFVKGRDYDPDRERA 389.8 1000 216 85.6 185 31 200 0 0 92.6 0 22IV6AV2SN4IV11IL12GSDA1PS1GE3ED1MK4AV6VF9DE29IV1HQ6FY2MV5FL1EG10IV14CR1HL4KR1KR5QE5PL2KE2GR6FY6GR3 85.6 1.1e-107 99.1
gene.9403.0.4.p1 transcript:OIT35479 transcript:OIT35479 8.5e-191 667.5 1721 690 406 1 378 1 378 MLSAPRVSPPAVAVAAPARFKFPNVCVNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIIWGGTEDDDSSIPSKEVLSWKPLASTPXXXXXXXXXXXXXDEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHNKHNIADASSRSSFSSYNEPDQLKEQQTLSLPRGRAKIQQLDDKKNFQKLIRVEDEDRGIAIENVSKHFAGYSIDSHAQSARVVHPGSKASASPLRGWGGGSSHYSLKRDEIFRERQNLGDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQV MLSAPRAPPPAVAVAAPARFKFQNVCGNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIVWGGTEDDDSSIPSKEVLSWKPLASTSPDNNHPPPTQSSSNEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHDKHNTTDASSRSSFSSYNEPGQLKEQQTLSLPRGRAKIQQLEDRKNSQKLIRVEDEDRDIAIENVSKHFAGYSSDSHAHSARVVHPGSKASASPLRGWGGGSSHYSLKREEIFRQRRNLDDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQV MLSAPRVSPPAVAVAAPARFKFPNVCVNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIIWGGTEDDDSSIPSKEVLSWKPLASTPXXXXXXXXXXXXXDEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHNKHNIADASSRSSFSSYNEPDQLKEQQTLSLPRGRAKIQQLDDKKNFQKLIRVEDEDRGIAIENVSKHFAGYSIDSHAQSARVVHPGSKASASPLRGWGGGSSHYSLKRDEIFRERQNLGDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQVLSTCRSFSKSGVPFHSMVVTGGFCQRTQLENLRQELDILIATPGRFMFLIKEGYLQLTNLKCAVLDEVDILFSDEDFETAFQCLINSSPITTQYLFVTATLPMDIYNKLVESFPDCELVSGPGMHRTSPGLEEFLVDCSGDETAEKSPDTAFINKKNALLHLVEDSPVPKTIVFCNKIDSCRKVENALKRFDRKGFSIKILPFHAALDQRRRLANMEEFRRSKMENVSLFLVCTDRASRGIDFEGVDHVVLFDYPRDPSEYVRRVGRTARGAGGKGKAFIFAVGKQVSLARRIMERNKKGHPVHDVPSILT* MLSAPRAPPPAVAVAAPARFKFQNVCGNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIVWGGTEDDDSSIPSKEVLSWKPLASTSPDNNHPPPTQSSSNEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHDKHNTTDASSRSSFSSYNEPGQLKEQQTLSLPRGRAKIQQLEDRKNSQKLIRVEDEDRDIAIENVSKHFAGYSSDSHAHSARVVHPGSKASASPLRGWGGGSSHYSLKREEIFRQRRNLDDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQVCQISSSIKGTFATYSPYCSATTHTKRKK 667.5 1721 378 91.0 344 34 352 0 0 93.1 0 
6VASP14PQ3VG50IV25PSXPXDXNXNXHXPXPXPXTXQXSXSXSDN38ND3ITAT14DG20DE1KR2FS11GD14IS4QH30DE4EQ1QR2GD102 91.0 8.5e-191 54.8
gene.69001.9.9.p1 NisylKD955766g0010.1 NisylKD955766g0010.1 1.4e-294 1011.9 2615 531 530 1 530 1 530 MKEMCLAVAPLPFRLGNNLIFHNPLSIGSSSHMDVTRLNSMGGTTTSLYAESAEKDLSDTVSSSRSEGVPLLHMISENESNNWISGDAVVRESEDDEILSLDGDQMSCSLSVVSDSSSLCGDDFIGFEVASEIFGQNFVDAEKSICSVELIAKPGDLVESGVEDDNVSKPFAVKIEEQITDGSSSKSSQVVVQLPLNKGLSAAVSRSVFEVDYIPLWGFTSVCGRRPEMEDALATVPRFLRIPLQMLVGHRVPDGVSRCLSHLTAHFFGVYDGHGGSQVANYCRDRVHAVLAEELEKFMANLNDESIRQNCQEQWKKAFTNCFLMVDDEVGGTGNHEAVAAETVGSTAVVAIVCSSHIIVANCGDSRAVLCRGKEPTALSVDHKPNREDEYARIEAAGGKVIQWNGHRVFGVLAMSRSIGDRYLKPWIIPDPEVMFIPRTKDDECLILASDGLWDVMSNEEACELARKRILLWHKKNGVTLTLERGQGIDPAAQAAAECLSNRAIQKGSKDNITVIVVDLKAQRKFKSKT MKEMCLAVAPLPFRLGNNLIFRNPPSIGSSSHMDATRLNSMGDTTTSLYAESAEKDLSDTVSSSRSEGVPLLPMISENDRNNWIAGDAVVRESEDDEILSLDGDQVSCSLSVVSDSSSLCGDDFIGFEVASDIYGQNFVDAEKSICSVELIAKPGDLVESGVEDDNVSKPFAVKLEEQITDGSSSKSSQVVVQLPLNKGLSAAVSRSVFEVDYIPLWGFTSVCGRRPEMEDALATVPRFLRIPLQMLVGDRVPDGVSRCLSHLTAHFFGVYDGHGGSQVANYCRDRVHAVLAEELEKFMANLNDESIRQNCQDQWKKAFTNCFLKVDDEVGGTGNREAVAAETVGSTAVVAIVCSSHIIVANCGDSRAVLCRGKEPMALSVDHKPNREDEYARIEAAGGKVIQWNGHRVFGVLAMSRSIGDRYLKPWIIPDPEVMFIPRTKDDECLILASDGLWDVMSNEEACELARKRILLWHKKNGVTLTLERGQGIDPAAQAAAECLSNRATQKGSKDNITVIVVDLKAQRKFKSKT MKEMCLAVAPLPFRLGNNLIFHNPLSIGSSSHMDVTRLNSMGGTTTSLYAESAEKDLSDTVSSSRSEGVPLLHMISENESNNWISGDAVVRESEDDEILSLDGDQMSCSLSVVSDSSSLCGDDFIGFEVASEIFGQNFVDAEKSICSVELIAKPGDLVESGVEDDNVSKPFAVKIEEQITDGSSSKSSQVVVQLPLNKGLSAAVSRSVFEVDYIPLWGFTSVCGRRPEMEDALATVPRFLRIPLQMLVGHRVPDGVSRCLSHLTAHFFGVYDGHGGSQVANYCRDRVHAVLAEELEKFMANLNDESIRQNCQEQWKKAFTNCFLMVDDEVGGTGNHEAVAAETVGSTAVVAIVCSSHIIVANCGDSRAVLCRGKEPTALSVDHKPNREDEYARIEAAGGKVIQWNGHRVFGVLAMSRSIGDRYLKPWIIPDPEVMFIPRTKDDECLILASDGLWDVMSNEEACELARKRILLWHKKNGVTLTLERGQGIDPAAQAAAECLSNRAIQKGSKDNITVIVVDLKAQRKFKSKT* 
MKEMCLAVAPLPFRLGNNLIFRNPPSIGSSSHMDATRLNSMGDTTTSLYAESAEKDLSDTVSSSRSEGVPLLPMISENDRNNWIAGDAVVRESEDDEILSLDGDQVSCSLSVVSDSSSLCGDDFIGFEVASDIYGQNFVDAEKSICSVELIAKPGDLVESGVEDDNVSKPFAVKLEEQITDGSSSKSSQVVVQLPLNKGLSAAVSRSVFEVDYIPLWGFTSVCGRRPEMEDALATVPRFLRIPLQMLVGDRVPDGVSRCLSHLTAHFFGVYDGHGGSQVANYCRDRVHAVLAEELEKFMANLNDESIRQNCQDQWKKAFTNCFLKVDDEVGGTGNREAVAAETVGSTAVVAIVCSSHIIVANCGDSRAVLCRGKEPMALSVDHKPNREDEYARIEAAGGKVIQWNGHRVFGVLAMSRSIGDRYLKPWIIPDPEVMFIPRTKDDECLILASDGLWDVMSNEEACELARKRILLWHKKNGVTLTLERGQGIDPAAQAAAECLSNRATQKGSKDNITVIVVDLKAQRKFKSKT 1011.9 2615 530 96.6 512 18 519 0 0 97.9 0 21HR2LP9VA7GD29HP5EDSR4SA20MV25ED1FY40IL74HD62ED11MK10HR40TM127IT25 96.6 1.4e-294 99.8
However, I am worried that if I do not take gene.9403 into account, I might delete the wrong entries. Is there a way to also take the first column into account?
Thanks in advance.
Answer 1
Try this:
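awk '
# Reduce the ID in column 1 to its first two dot-separated parts (e.g. gene.9403);
# gensub() is a GNU awk extension.
{line = gensub(/^([^.]+\.[^.]+)[^[:blank:]]*/, "\\1", 1, $0)}
# Print a line only the first time its reduced form is seen.
!seen[line]++
' file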
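For awk implementations that lack gensub(), the same idea can be sketched with only split(), match() and substr(). This is a minimal sketch, not a drop-in replacement: the function name dedup_by_gene_prefix and the three-column sample rows are made up for illustration, and it assumes the ID is the first whitespace- or tab-delimited token on each line.

```shell
# Hypothetical helper: keep the first line for each (ID prefix, remaining columns) pair.
dedup_by_gene_prefix() {
  awk '
  {
    # Take the ID as the leading run of non-blank characters.
    id = match($0, /^[^ \t]+/) ? substr($0, 1, RLENGTH) : ""
    rest = substr($0, length(id) + 1)
    # Keep only the first two dot-separated parts, e.g. gene.9403.
    n = split(id, part, /\./)
    prefix = (n >= 2) ? part[1] "." part[2] : id
  }
  # Print a line only the first time its prefix plus remaining columns are seen.
  !seen[prefix rest]++
  ' "$@"
}

# Made-up miniature input with the same ID pattern as the question.
printf 'gene.9403.0.4.p1\tOIT35479\t667.5\ngene.9403.0.5.p1\tOIT35479\t667.5\ngene.69001.9.9.p1\tKD955766\t1011.9\ngene.9403.9.5.p1\tOIT35479\t667.5\n' |
  dedup_by_gene_prefix
```

Because the key combines the shortened ID with every other column, two rows that share gene.9403 but differ anywhere else are both kept, which addresses the worry about deleting the wrong entries. On the real data it would be called as dedup_by_gene_prefix select-results2.txt.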