Compare an old file and a new file, but ignore lines that exist only in the new file?

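Answer 1

Use join to merge matching lines from the two files. Assuming the file names come after the checksums (as in md5sum output) and contain no whitespace, this prints every file name present in both lists, together with its old checksum and its new checksum.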

join -1 2 -2 2 <(sort -k 2 oldlist) <(sort -k 2 newlist)

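To also see new files, pass the -a option to join (-a 2 prints the unpaired lines from the second file as well). A little post-processing of the output then removes the file names whose checksum has not changed.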

join -a 2 -1 2 -2 2 <(sort -k 2 oldlist) <(sort -k 2 newlist) |
awk '$2 != $3'
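
As a rough illustration of the commands above, here is a minimal sketch run on made-up data. The list contents, the shortened "checksums" and the temporary file names are all hypothetical. It writes the sorted lists to temporary files instead of using <( ), so it should also run in a plain POSIX sh. The second command uses -a 1 (unpaired lines from the old list) rather than -a 2, on the assumption that files existing only in the new list should be skipped.

```shell
#!/bin/sh
# Hypothetical sample data; the checksums are shortened placeholders.
export LC_ALL=C                      # sort and join must agree on collation
tmp=$(mktemp -d)
printf '%s\n' 'aaa a.txt' 'bbb b.txt' 'ccc c.txt' > "$tmp/oldlist"
printf '%s\n' 'aaa a.txt' 'bbX b.txt' 'ddd d.txt' > "$tmp/newlist"

# join needs both inputs sorted on the join field (field 2, the file name)
sort -k 2 "$tmp/oldlist" > "$tmp/old.sorted"
sort -k 2 "$tmp/newlist" > "$tmp/new.sorted"

# Names present in both lists whose checksum differs; d.txt never shows up
changed=$(join -1 2 -2 2 "$tmp/old.sorted" "$tmp/new.sorted" | awk '$2 != $3')
printf '%s\n' "$changed"             # prints: b.txt bbb bbX

# -a 1 also keeps unpaired lines from the old list (names that disappeared)
changed_or_gone=$(join -a 1 -1 2 -2 2 "$tmp/old.sorted" "$tmp/new.sorted" |
  awk '$2 != $3')
printf '%s\n' "$changed_or_gone"     # prints: b.txt bbb bbX, then c.txt ccc

rm -rf "$tmp"
```

Each output line is the file name followed by the old checksum and, when the name is in both lists, the new checksum.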

Answer 2

This can be done with awk alone.

awk 'FNR==NR   { o[$2]=$1; next }
     !o[$2]    { print $0, "NEW"; next }
     $1!=o[$2] { print $0, "CHANGED" }' newlist oldlist

(The assumed file format is the md5sum output format: "md5 filename".)

The same awk code rewritten with comments, explaining step by step how the one-liner works:

awk 'FNR==NR { # per-file record number equals overall record number: still reading the first file (newlist)
  o[$2]=$1     # store the record in array o: the key is the file name, the value is the md5
  next         # go to the next record (do not execute the rest of the code)
}
# reaching this point means we are processing the second input file (oldlist)
!o[$2] {       # array o has no entry for the file name of the current record
  print $0, "NEW" # print the current record tagged NEW
  next         # go to the next record (do not execute the rest of the code)
}
# reaching this point means array o contains an entry for the current file name
$1!=o[$2] {    # the current md5 differs from the md5 saved for this file name
  print $0, "CHANGED" # print the current record tagged CHANGED
}
# note: with newlist read first, NEW marks names found only in oldlist;
# swap the two file arguments to tag names found only in newlist instead
' newlist oldlist
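
To see what this kind of program prints, here is a minimal sketch on made-up lists (hypothetical contents and shortened checksums). It is a variant, not the exact code above: it tests membership with ($2 in new), so the lookup does not create empty array entries, and it uses an ONLY_IN_OLD tag, which is my own label for names that the list read first (newlist) does not contain.

```shell
#!/bin/sh
# Hypothetical sample data; the checksums are shortened placeholders.
tmp=$(mktemp -d)
printf '%s\n' 'aaa a.txt' 'bbb b.txt' 'ccc c.txt' > "$tmp/oldlist"
printf '%s\n' 'aaa a.txt' 'bbX b.txt' 'ddd d.txt' > "$tmp/newlist"

report=$(awk '
  # first file, newlist: remember each checksum under its file name
  FNR == NR      { new[$2] = $1; next }
  # second file, oldlist: name is absent from the new list
  !($2 in new)   { print $0, "ONLY_IN_OLD"; next }
  # same name in both lists, different checksum
  $1 != new[$2]  { print $0, "CHANGED" }
' "$tmp/newlist" "$tmp/oldlist")

printf '%s\n' "$report"
# prints:
#   bbb b.txt CHANGED
#   ccc c.txt ONLY_IN_OLD

rm -rf "$tmp"
```

d.txt, which exists only in the new list, is never printed, because the program only prints records while reading the old list.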


Answer 3

If I understand your question correctly, comm can do exactly what you want. I suggest looking at comm --help.

Specifically:

  -1              suppress column 1 (lines unique to FILE1)
  -2              suppress column 2 (lines unique to FILE2)
  -3              suppress column 3 (lines that appear in both files)

So comm -1 -3 newFile oldFile will do what you want, provided both files are sorted, since comm expects sorted input.
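
A minimal sketch of that on made-up lists (hypothetical contents and shortened checksums). Whole lines are compared, so a changed checksum makes the old line count as unique to the old file. The sort step and the temporary file names are my additions.

```shell
#!/bin/sh
# Hypothetical sample data; the checksums are shortened placeholders.
export LC_ALL=C                      # sort and comm must agree on collation
tmp=$(mktemp -d)
printf '%s\n' 'aaa a.txt' 'bbb b.txt' 'ccc c.txt' > "$tmp/oldFile"
printf '%s\n' 'aaa a.txt' 'bbX b.txt' 'ddd d.txt' > "$tmp/newFile"

# comm expects sorted input
sort "$tmp/oldFile" > "$tmp/old.sorted"
sort "$tmp/newFile" > "$tmp/new.sorted"

# -1 hides lines unique to the first file (new), -3 hides common lines,
# so only lines unique to the second file (old) remain
only_old=$(comm -13 "$tmp/new.sorted" "$tmp/old.sorted")
printf '%s\n' "$only_old"
# prints:
#   bbb b.txt
#   ccc c.txt

rm -rf "$tmp"
```

The result lists old entries that were either changed or removed; it does not say which of the two happened.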


Answer 4

As an alternative, I always use "sdiff -s" to compare file lists or md5sums.

Assuming the files contain ordinary md5sum output in the form "md5hash filename", then:

sdiff -s oldfile newfile | grep -v ">"
# sorting on the md5hash should help align and pick up renamed files.
sdiff -s <(sort oldfile) <(sort newfile)

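To break this down:
sdiff -s: suppresses common lines, so exact matches are ignored; the remaining differences are marked with |, < and >.
<(sort oldfile): sorts the file before it is passed to sdiff.
grep -v ">": ignores entries that exist only in the new file. This only works if no file name contains ">", which is unlikely anyway.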

The output width can be changed to show longer lines, for example with sdiff -w 100.
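
A minimal sketch of the first pipeline on made-up lists (hypothetical contents and shortened checksums; the temporary file names are my own). Here b.txt has a changed checksum and d.txt exists only in the new list.

```shell
#!/bin/sh
# Hypothetical sample data; the checksums are shortened placeholders.
tmp=$(mktemp -d)
printf '%s\n' 'aaa a.txt' 'bbb b.txt' 'ccc c.txt' > "$tmp/oldfile"
printf '%s\n' 'aaa a.txt' 'bbX b.txt' 'ccc c.txt' 'ddd d.txt' > "$tmp/newfile"

# sdiff exits non-zero when the files differ, which is expected here
all_diffs=$(sdiff -s "$tmp/oldfile" "$tmp/newfile" || true)

# Drop the lines marked ">" (present only in the new file)
kept=$(printf '%s\n' "$all_diffs" | grep -v '>')
printf '%s\n' "$kept"   # only the b.txt pair remains, marked with |

rm -rf "$tmp"
```

Because sdiff pairs lines by position, the result is easiest to read when both lists are in the same order, which is what the sorted variant above is for.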
