awk: problem splitting on spaces

I can't get awk to split each line only at the first space and keep the rest of the line intact.

$ grep ">" Supplemental_dataset_07_NbE_CDS.fasta | awk 'BEGIN { FS = "\t" } {print $1}' | head
>NbD053290.1 Partial, glutelin type-B 2-like  (XP_016462855.1)
>NbD053289.2 GDSL esterase/lipase At2g38180-like  (XP_016505556.1)
>NbD053288.1 SUMO-conjugating enzyme SCE1  (XP_019223445.1)
>NbD053287.1 bifunctional epoxide hydrolase 2-like  (XP_016470817.1)
>NbD053286.1 uncharacterized protein LOC109221334 isoform X1  (XP_019241352.1)
>NbD053285.2 uncharacterized protein LOC107817905  (XP_016499316.1)
>NbD053284.3 cell division cycle protein 123 homolog  (XP_019248046.1)
>NbD053283.1 Partial, probable rhamnogalacturonate lyase B  (XP_009789094.1)
>NbD053282.1 aluminum-activated malate transporter 2-like  (XP_009760052.1)
>NbD053281.1 Partial, uncharacterized protein LOC107803999  (XP_016483291.1)

Unfortunately, the following command drops part of the description:

grep ">" Supplemental_dataset_07_NbE_CDS.fasta | awk 'BEGIN { FS = " " } {print $1","$2}' | head

>NbD053290.1,Partial,
>NbD053289.1,GDSL
>NbD053288.1,SUMO-conjugating
>NbD053287.1,bifunctional
>NbD053286.1,uncharacterized
>NbD053285.1,uncharacterized
>NbD053284.1,cell
>NbD053283.1,Partial,
>NbD053282.1,aluminum-activated
>NbD053281.1,Partial,

How can I modify the command above to produce this output?

>NbD053290.1,Partial, glutelin type-B 2-like  (XP_016462855.1)
>NbD053289.2,GDSL esterase/lipase At2g38180-like  (XP_016505556.1)

Thanks in advance.

Answer 1

This replaces the entire grep | awk | head pipeline:

awk '/>/{sub(/ /,","); print; if (++c == 10) exit}' Supplemental_dataset_07_NbE_CDS.fasta
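
This works because sub() changes only the first match on each line, so every later space in the description is left alone (gsub() would replace them all). A minimal sketch on made-up headers rather than the real FASTA file; the `sample` and `result` variable names are only illustrative:

```shell
# Made-up stand-in for the FASTA file: two header lines and one sequence line.
sample='>id1 Partial, some protein  (XP_1)
ATGCATGC
>id2 another protein  (XP_2)'

# Same awk program as above, but stopping after 2 headers instead of 10.
result=$(printf '%s\n' "$sample" | awk '/>/{sub(/ /,","); print; if (++c == 2) exit}')
printf '%s\n' "$result"
```

Only the first space of each header becomes a comma. The sequence line is skipped because it contains no ">".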

Answer 2

$ awk -F" " '{ sub(" ",","); print; }' input
>NbD053290.1,Partial, glutelin type-B 2-like  (XP_016462855.1)
>NbD053289.2,GDSL esterase/lipase At2g38180-like  (XP_016505556.1)
>NbD053288.1,SUMO-conjugating enzyme SCE1  (XP_019223445.1)
>NbD053287.1,bifunctional epoxide hydrolase 2-like  (XP_016470817.1)
>NbD053286.1,uncharacterized protein LOC109221334 isoform X1  (XP_019241352.1)
>NbD053285.2,uncharacterized protein LOC107817905  (XP_016499316.1)
>NbD053284.3,cell division cycle protein 123 homolog  (XP_019248046.1)
>NbD053283.1,Partial, probable rhamnogalacturonate lyase B  (XP_009789094.1)
>NbD053282.1,aluminum-activated malate transporter 2-like  (XP_009760052.1)
>NbD053281.1,Partial, uncharacterized protein LOC107803999  (XP_016483291.1)
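
If you would rather not use a regex at all, the same result can probably be had with index() and substr(), both standard awk string functions. This is only a sketch under a couple of assumptions: `first_space_to_comma` is a hypothetical helper name, and it prints header lines only, like the grep in the question.

```shell
# Hypothetical helper: turn the first space of each FASTA header into a comma.
# Reads the files given as arguments, or standard input if there are none.
first_space_to_comma() {
    awk '/^>/ {
        i = index($0, " ")   # position of the first space, 0 if there is none
        if (i) $0 = substr($0, 1, i - 1) "," substr($0, i + 1)
        print
    }' "$@"
}

# Example on a made-up record; the sequence line is not printed.
printf '%s\n' '>id1 Partial, some protein  (XP_1)' 'ATGCATGC' | first_space_to_comma
```

Because the whole record is rebuilt from two substrings, the double space before the accession in parentheses is preserved, just as with sub().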

Answer 3

Using Raku (formerly known as Perl_6):

$ raku -pe 's/\s/,/;'
>NbD053290.1,Partial, glutelin type-B 2-like  (XP_016462855.1)
>NbD053289.2,GDSL esterase/lipase At2g38180-like  (XP_016505556.1)

Or:

$ raku -pe 's:5th/<.ws>/,/;' glutelin.txt
>NbD053290.1,Partial, glutelin type-B 2-like  (XP_016462855.1)
>NbD053289.2,GDSL esterase/lipase At2g38180-like  (XP_016505556.1)

https://raku.org/
