Use the ID to extract the matching header and add it as the second column of file B

I have file A, which contains an accession number, a family name, and a title, and file B, which contains an ID and a sequence.

Using the accession number at the start of each ID in B, I want to look up the family name and virus name in A and add them as the second column of B.

File A

NC_001348 PEPS Herpesviridae Human herpesvirus 3, complete genome.txt
NC_001350 PEPS Herpesviridae Saimiriine herpesvirus 2 complete genome.txt
NC_001491 PEPS Herpesviridae Equid herpesvirus 1, complete genome.txt
NC_001798 PEPS Herpesviridae Human herpesvirus 2 strain HG52, complete genome.txt
NC_001806 PEPS Herpesviridae Human herpesvirus 1 strain 17, complete genome.txt
NC_001826 PEPS Herpesviridae Murine herpesvirus 68 strain WUMS, complete genome.txt
NC_001844 PEPS Herpesviridae Equid herpesvirus 4, complete genome.txt
NC_001847 PEPS Herpesviridae Bovine herpesvirus 1, complete genome.txt
NC_001987 PEPS Herpesviridae Ateline herpesvirus 3 complete genome.txt
NC_002229 PEPS Herpesviridae Gallid herpesvirus 2, complete genome.txt

File B

NC_001348_71671_71760_KY215944.1    GCGCGGCTGGTGATGCAATGCGTGACCAGCTACTGGCGCAACTCGCGCTGCGCCGCCTTTGTGAACAGCTTCCCCATGGTGATGTACATC
NC_001350_89668_89757_HQ221963.1    CTTTCAGGATTTTCTGGCAGTTTTGCTGTCAAGAATGACATGATCTGGTGATGCCATATCTCAATATACAGCGCAGTGCTCACTGGTCTG
NC_001491_126502_126591_AF480884.1  AACGTGTCGGTGCGCACGGCCGTCAGGGCGAAGCCCGGGTGGATGTGGGCCTTGGTCTGCAGCACCAGCGACACCGGCGAGATCTTGTAC
NC_001798_97563_97652_AY714813.1    CGCAGGTGCCCGAAGACGTCGCAGACGGCCGCCCGCAGGGCCATGCACTGCATGGAGCCCGTGGTGCCGCCCGGCCCCCGGTCCAGGTGC
NC_001806_196955_197044_FJ483970.2  TCATCGATCTCAGTCTGTCGGCCGCTCCACGGCTCTGACTGGACTTTCCAAAGTACATACTGCAGTCAGAGCTGTCGAGCGGTTAACAGA

Expected output

NC_001348_71671_71760_KY215944.1    Herpesviridae Human herpesvirus 3, complete genome  GCGCGGCTGGTGATGCAATGCGTGACCAGCTACTGGCGCAACTCGCGCTGCGCCGCCTTTGTGAACAGCTTCCCCATGGTGATGTACATC
NC_001350_89668_89757_HQ221963.1    Herpesviridae Saimiriine herpesvirus 2 complete genome  CTTTCAGGATTTTCTGGCAGTTTTGCTGTCAAGAATGACATGATCTGGTGATGCCATATCTCAATATACAGCGCAGTGCTCACTGGTCTG
NC_001491_126502_126591_AF480884.1  Herpesviridae Equid herpesvirus 1, complete genome  AACGTGTCGGTGCGCACGGCCGTCAGGGCGAAGCCCGGGTGGATGTGGGCCTTGGTCTGCAGCACCAGCGACACCGGCGAGATCTTGTAC
NC_001798_97563_97652_AY714813.1    Herpesviridae Human herpesvirus 2 strain HG52, complete genome  CGCAGGTGCCCGAAGACGTCGCAGACGGCCGCCCGCAGGGCCATGCACTGCATGGAGCCCGTGGTGCCGCCCGGCCCCCGGTCCAGGTGC
NC_001806_196955_197044_FJ483970.2  Herpesviridae Human herpesvirus 1 strain 17, complete genome    TCATCGATCTCAGTCTGTCGGCCGCTCCACGGCTCTGACTGGACTTTCCAAAGTACATACTGCAGTCAGAGCTGTCGAGCGGTTAACAGA

Answer 1

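Command (file1 is file A and file2 is file B; it pairs the two files line by line, so both must list their records in the same order):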

c=`awk '{print NR}' file2| sort -nr | sed -n '1p'`;for ((i=1;i<=$c;i++)); do j=`awk -v i="$i" 'NR==i{$1=$2="";print $0}' file1`; awk -v i="$i" -v j="$j" 'NR == i{$3=$2;$2=j;print $0}' file2; done| sed "s/complete genome.txt/complete genome/g"

Output

NC_001348_71671_71760_KY215944.1   Herpesviridae Human herpesvirus 3, complete genome GCGCGGCTGGTGATGCAATGCGTGACCAGCTACTGGCGCAACTCGCGCTGCGCCGCCTTTGTGAACAGCTTCCCCATGGTGATGTACATC
NC_001350_89668_89757_HQ221963.1   Herpesviridae Saimiriine herpesvirus 2 complete genome CTTTCAGGATTTTCTGGCAGTTTTGCTGTCAAGAATGACATGATCTGGTGATGCCATATCTCAATATACAGCGCAGTGCTCACTGGTCTG
NC_001491_126502_126591_AF480884.1   Herpesviridae Equid herpesvirus 1, complete genome AACGTGTCGGTGCGCACGGCCGTCAGGGCGAAGCCCGGGTGGATGTGGGCCTTGGTCTGCAGCACCAGCGACACCGGCGAGATCTTGTAC
NC_001798_97563_97652_AY714813.1   Herpesviridae Human herpesvirus 2 strain HG52, complete genome CGCAGGTGCCCGAAGACGTCGCAGACGGCCGCCCGCAGGGCCATGCACTGCATGGAGCCCGTGGTGCCGCCCGGCCCCCGGTCCAGGTGC
NC_001806_196955_197044_FJ483970.2   Herpesviridae Human herpesvirus 1 strain 17, complete genome TCATCGATCTCAGTCTGTCGGCCGCTCCACGGCTCTGACTGGACTTTCCAAAGTACATACTGCAGTCAGAGCTGTCGAGCGGTTAACAGA
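
If the two files are not guaranteed to be in the same order, the lookup can be keyed on the accession instead of the line number. The following is only a minimal sketch under a few assumptions: the accession is taken to be the first two underscore-separated parts of B's ID (for example NC_001348), the second field of A (PEPS) is treated as a tag to drop, and tab-separated output is assumed to be acceptable. The function name annotate is hypothetical.

```shell
# Hypothetical helper: annotate FILE_B ($2) with descriptions looked up in FILE_A ($1).
annotate() {
  awk '
    # First file (A): remember the description for each accession.
    NR == FNR {
      key = $1
      sub(/^[^ \t]+[ \t]+[^ \t]+[ \t]+/, "")  # drop the accession and the tag field
      sub(/\.txt$/, "")                       # drop the trailing ".txt"
      desc[key] = $0
      next
    }
    # Second file (B): rebuild the accession from the ID and look it up.
    {
      split($1, p, "_")
      key = p[1] "_" p[2]                     # assumed accession form, e.g. NC_001348
      print $1 "\t" ((key in desc) ? desc[key] : "NA") "\t" $2
    }
  ' "$1" "$2"
}
```

It could be called as annotate fileA fileB > fileB.annotated. IDs with no match in A get NA in the second column, and A is read only once rather than once per line of B.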
