Grep all patterns and print them on one line

Answer 1

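This solution uses awk. It takes the two file names as arguments and uses the if (FNR == NR) idiom to do different things depending on whether it is reading the first file or the second. It uses associative arrays to store the keys and the output lines.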
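To see the idiom in isolation, here is a minimal sketch. The file names first.txt and second.txt and their contents are made up for illustration. NR counts records across all input files, while FNR restarts at 1 for each file, so the two are equal only while awk is still reading the first file.

```shell
# Hypothetical demo files; names and contents are invented for this sketch.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/first.txt"
printf 'c\nd\ne\n' > "$tmpdir/second.txt"

# Tag every line with the file it came from, using only NR and FNR.
idiom_demo=$(awk '{
    if (FNR == NR) print "first:  NR=" NR " FNR=" FNR " " $0
    else           print "second: NR=" NR " FNR=" FNR " " $0
}' "$tmpdir/first.txt" "$tmpdir/second.txt")

printf '%s\n' "$idiom_demo"
rm -rf "$tmpdir"
```

One caveat: if the first file is empty, FNR == NR is also true for every line of the second file, so the idiom quietly treats the data file as the key file.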

Here is the file a.awk:

# usage: awk -f a.awk keyfile1 datafile2
BEGIN {
    FS = "\t"                               # set field separator to TAB
}
{
    if (FNR == NR) {                        # if looking at first (key) file
        k[$1]=$1                            # just save each key
    } else {                                # if looking at second file
        if ($1 in k) {                      # if first col is one that we want
            output=$1 "_" $2                # prepare output line
            if (out[$1]=="")                # if first time we've seen this key
                out[$1]=output              # store output as is
            else                            # and when we find more matches for this key
                out[$1]=out[$1] ";" output  # we append ";" and the output
        }
    }
}
END {                                       # at the end
    for (i in out)                          # print all the output lines
        print out[i]
}

Here is how to use it:

$ awk -f a.awk file1 file2
K00001_ko00010_Glycolysis__Gluconeogenesis;K00001_ko00020_Citrate_cycle_(TCA_cycle)
K00003_ko00010_Glycolysis__Gluconeogenesis;K00003_ko00020_Citrate_cycle_(TCA_cycle)
K00005_ko00010_Glycolysis__Gluconeogenesis;K00005_ko00020_Citrate_cycle_(TCA_cycle)
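Note that awk does not define the order in which for (i in out) visits the keys, so the lines may not come out sorted as they do above. If the order matters, one option is to remember the keys in the order they first match, as in the sketch below. The input files here are hypothetical, reconstructed from the sample output (the real file1 and file2 are not shown), and the order array and counter n are names I introduced.

```shell
# Hypothetical inputs: file1 holds the wanted keys, file2 holds TAB-separated data.
tmpdir=$(mktemp -d)
printf 'K00003\nK00001\n' > "$tmpdir/file1"
printf 'K00001\tko00010_Glycolysis__Gluconeogenesis\n'  > "$tmpdir/file2"
printf 'K00002\tko00010_Glycolysis__Gluconeogenesis\n' >> "$tmpdir/file2"
printf 'K00003\tko00010_Glycolysis__Gluconeogenesis\n' >> "$tmpdir/file2"
printf 'K00001\tko00020_Citrate_cycle_(TCA_cycle)\n'   >> "$tmpdir/file2"
printf 'K00003\tko00020_Citrate_cycle_(TCA_cycle)\n'   >> "$tmpdir/file2"

ordered=$(awk '
BEGIN { FS = "\t" }
FNR == NR { k[$1]; next }             # first file: remember each key
$1 in k {                             # second file: only rows with a wanted key
    if ($1 in out) {
        out[$1] = out[$1] ";" $1 "_" $2   # later match: append with ";"
    } else {
        order[++n] = $1                   # first match: record the key order
        out[$1] = $1 "_" $2
    }
}
END {                                 # print in order of first match in file2
    for (i = 1; i <= n; i++) print out[order[i]]
}' "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$ordered"
rm -rf "$tmpdir"
```

With these inputs the K00001 line is printed before the K00003 line, because that is the order in which the keys first appear in file2, regardless of their order in file1. K00002 is skipped since it is not in the key file.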
