How to add missing rows by comparing only 2 columns from 2 different files

Answer 1

Maybe something like this?

cat file2 | awk '!(1 in f) {if ((getline l < "-") == 1) split(l, f)} $3!=f[3] {print;next} {print l; delete f}' file1 | column -t

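The script expects file1 as an argument to awk and file2 on standard input. I used a "useless use of cat" to make this more obvious, but you can of course supply file2 with a redirection (< file2) instead. You could even embed the file name in the script itself, using "file2" instead of "-" in the getline, but this way is a bit more flexible.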
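As a rough illustration of those three ways of feeding file2 to the script, here is a minimal sketch. The sample data is hypothetical (field 3 is a made-up time key), and `column -t` is left out so the raw output is easy to compare.

```shell
# Hypothetical sample data, created in a scratch directory.
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'x 0 10:00' 'x 0 10:05' 'x 0 10:10' 'x 0 10:15' > file1
printf '%s\n' 'y 7 10:05' 'y 9 10:15' > file2

# 1. file2 piped in through cat, as in the answer
out_cat=$(cat file2 | awk '!(1 in f) {if ((getline l < "-") == 1) split(l, f)} $3!=f[3] {print;next} {print l; delete f}' file1)

# 2. file2 supplied with a redirection
out_redirect=$(awk '!(1 in f) {if ((getline l < "-") == 1) split(l, f)} $3!=f[3] {print;next} {print l; delete f}' file1 < file2)

# 3. file name embedded in the script: "file2" replaces "-"
out_embedded=$(awk '!(1 in f) {if ((getline l < "file2") == 1) split(l, f)} $3!=f[3] {print;next} {print l; delete f}' file1)

printf '%s\n' "$out_cat"
```

With this sample data all three variants should print the file1 rows, with the 10:05 and 10:15 rows replaced by their file2 counterparts.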

Both files are also expected to be "in sync" with respect to their field 3 values, or to start with file2 "ahead" of file1, if that suits your use case.
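To see why the ordering matters, here is a small sketch with hypothetical data in which file2 contains a key (10:03) that never occurs in file1. The script keeps waiting for that key, so the later file2 row is never merged in.

```shell
# Hypothetical sample data: the first file2 key (10:03) has no match in file1.
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'x 0 10:00' 'x 0 10:05' 'x 0 10:10' 'x 0 10:15' > file1
printf '%s\n' 'y 7 10:03' 'y 9 10:15' > file2

# Same one-liner as in the answer, with file2 on standard input.
out=$(awk '!(1 in f) {if ((getline l < "-") == 1) split(l, f)} $3!=f[3] {print;next} {print l; delete f}' file1 < file2)

# The pending file2 line is only released when file1 reaches the same key,
# so here every file1 line is printed unchanged.
printf '%s\n' "$out"
```

In this case the output is simply file1, and the 10:15 row from file2 is silently lost, which is the situation the "in sync" requirement is meant to rule out.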

Here is the script broken out for readability, with detailed comments:

# Check whether our `real_fields` array is empty.
# NOTE: we use the `<index> in <array>` construct
# in order to force awk to treat the `real_fields` name as an
# array (instead of as a scalar, as it would by default)
# and to create it in an empty state
!(1 in real_fields) {
    # get the next line (if any) from the "real" file
    if ((getline real_line < "-") == 1)
        # split that line in separate fields populating
        # our `real_fields` array
        split(real_line, real_fields)
        # awk split function creates an array with numeric
        # indexes for each field found as per FS separator
}
# if field3 of the current line of the "reference"
# file does not match the current line of the "real" file..
$3!=real_fields[3] {
    # print current line of "reference" file
    print
    # go reading next line of "reference" file thus
    # skipping the final awk pattern
    next
}
# final awk pattern, we get here only if the pattern
# above did not match, i.e. if field3 values from both
# files match
{
    # print current line of "real" file
    print real_line
    # delete our real_fields array, thus triggering
    # the fetching of the next line of "real" file as
    # performed by the first awk pattern
    delete real_fields
}
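
The `<index> in <array>` test is what drives the fetch-on-demand behaviour above. A minimal sketch of that idiom in isolation, using a throwaway array name `f`:

```shell
states=$(awk 'BEGIN {
    # The "in" test does not create element 1; f starts out as an empty array.
    if (!(1 in f)) print "empty"
    # split() fills f[1], f[2], f[3].
    split("a b c", f)
    if (1 in f) print "filled"
    # Deleting the whole array makes the first test succeed again.
    delete f
    if (!(1 in f)) print "empty again"
}')
printf '%s\n' "$states"
```

This is why the script deletes the array after printing a matched line: on the next record the first pattern sees an empty array and reads one more line from the "real" file.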

Answer 2

You need to set the array traversal order; otherwise awk will reorder the rows.

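#!/usr/bin/awk -f
# NOTE: PROCINFO["sorted_in"] is a GNU awk extension, so awk must be gawk.

BEGIN {
    # traverse array indices in ascending string order
    PROCINFO["sorted_in"] = "@ind_str_asc"
}
# first file: store each line under (line counter, field 3);
# the counter is zero-padded so that string order equals line
# order (unpadded, "10" would sort before "2")
NR==FNR {
    a[sprintf("%09d", i++),$3]=$0
    next
}
# second file: walk the stored lines of the first file in order
{
    for (c in a) {
        split(c, s, SUBSEP)
        if (s[2] == $3) {
            # field 3 matches: print the file2 line, then advance file2
            print $0
            getline
        } else {
            # no match: print the stored file1 line
            print a[c]
        }
    }
}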

./script.awk file1 file2
