How can I compare two files line by line?

Answer 1

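I can't say how portable this is, but I tried to cover all the bases. I did my best to replicate both files in my testing, based on the information you gave. If you run into problems with special characters in sed, you can escape them in the second line of the cleanLine function.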

#!/bin/bash

# compare two files and return lines in
# first file that are missing in second file

ProgName=${0##*/}
Pid=$$
CHK_FILE="$1"
REF_FILE="$2"
D_BUG="$3"
TMP_FILE="/tmp/REF_${Pid}.tmp"
declare -a MISSING=()
m=0

scriptUsage() {
cat <<ENDUSE

    $ProgName  <file_to_check> <reference_file> [-d|--debug]

    Lines in 'file_to_check' not present in 'reference_file'
      are printed to standard output.

    file_to_check:     File being checked
    reference_file:    File to be checked against
    -d|--debug:        Run script in debug mode (Optional)
    -h|--help:         Print this help message

ENDUSE
}

# delete temp file on any exit
trap 'rm -f "$TMP_FILE"' EXIT


#-- check args
  [[ $CHK_FILE == "-h" || $CHK_FILE == "--help" ]] && { scriptUsage; exit 0; }
  [[ -n $CHK_FILE && -n $REF_FILE ]] || { >&2 echo "Not enough arguments!"; scriptUsage; exit 1; }
  [[ $D_BUG == "-d" || $D_BUG == "--debug" ]] && set -x
  [[ -s $CHK_FILE ]] || { >&2 echo "File $CHK_FILE not found or empty"; exit 1; }
  [[ -s $REF_FILE ]] || { >&2 echo "File $REF_FILE not found or empty"; exit 1; }
#--


#== edit temp file to match the 3 comparison rules
  # copy ref file to temp for editing
  cp "$REF_FILE" "$TMP_FILE" || { >&2 echo "Unable to create temporary file"; exit 1; }
  # rule 3 - ignore empty lines
  sed -i '/^[[:space:]]*$/d' "$TMP_FILE"
  # rule 1 - ignore begin/end of line spaces
  sed -i 's/^[[:space:]][[:space:]]*//;s/[[:space:]][[:space:]]*$//' "$TMP_FILE"
  # rule 2 - multi space/tab as single space
  sed -i 's/[[:space:]][[:space:]]*/ /g' "$TMP_FILE"
#==


# function to clean LINE to match 3 rules
# & escape '/' and regex metacharacters for later sed command
cleanLine() {
  var=$(printf '%s\n' "$1" | sed 's/^[[:space:]][[:space:]]*//;s/[[:space:]][[:space:]]*$//;s/[[:space:]][[:space:]]*/ /g')
  printf '%s\n' "$var" | sed 's|[[\\/.^$*]|\\&|g'
}


### parse check file
while IFS='' read -r LINE || [[ -n $LINE ]]
  do
    CLN_LINE=$(cleanLine "$LINE")
    # rule 3 - skip empty and whitespace-only lines
    [[ -z $CLN_LINE ]] && continue
    # anchor the pattern so only whole-line matches count
    FOUND=$(sed -n "/^${CLN_LINE}\$/{p;q;}" "$TMP_FILE")
    [[ -z $FOUND ]] && MISSING[$m]="$LINE" && ((m++))
done < "$CHK_FILE"
###


#++ print missing line(s) (if any)
  if (( $m > 0 ))
    then
      printf "\n  Missing line(s) found:\n"
      #*SEE BELOW ON THIS
      for (( p=0; $p<$m; p++ ))
        do
          printf "    %s\n" "${MISSING[$p]}"
      done
      echo
    else
      printf "\n  **No missing lines found**\n\n"
  fi
#* using an unquoted 'for p in ${MISSING[@]}' causes:
#* "SPACED LINES" to become:
#* "SPACED"
#* "LINES" when printed to stdout!
#* (quoting the expansion, "${MISSING[@]}", keeps each line intact)
#++
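
For comparison, the same three rules can be sketched more compactly with awk. This is only a sketch under assumptions: the function name missing_lines is made up for illustration, only spaces and tabs are treated as whitespace, and the reference file is assumed to be non-empty (as the script above also requires).

```shell
# missing_lines CHECK_FILE REFERENCE_FILE   (hypothetical helper name)
# Prints the lines of CHECK_FILE that, after normalization, do not occur
# in REFERENCE_FILE.
missing_lines() {
  awk '
    # Trim leading/trailing blanks and squeeze runs of spaces/tabs to one space.
    function norm(s) {
      gsub(/^[ \t]+|[ \t]+$/, "", s)
      gsub(/[ \t]+/, " ", s)
      return s
    }
    # First file on the command line: the reference. Remember each normalized line.
    FNR == NR { seen[norm($0)] = 1; next }
    # Second file: the file being checked. Skip blank lines, print unmatched ones.
    {
      n = norm($0)
      if (n != "" && !(n in seen)) print
    }
  ' "$2" "$1"
}
```

Calling missing_lines file_to_check reference_file prints each unmatched line of file_to_check exactly as it appears in that file. Lookups go through an in-memory table, so the reference file is read only once.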

Answer 2

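A simple solution:

diff -bB fileA fileB | grep -v '^>'

-b (or --ignore-space-change) means "ignore changes in the amount of white space". -B (or --ignore-blank-lines) means "ignore changes whose lines are all blank". grep -v '^>' removes the reports about lines in fileB that are not present in fileA.

This does not ignore leading whitespace, but otherwise it is close to what you want.

If the lines that are present in B but not in A are also interesting, why not simply run diff -bB fileA fileB once instead of doing half a comparison twice?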
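
To also cover leading whitespace, both files can be stripped before diff sees them. The following is a rough sketch, not part of the command above: the function name only_in_first is made up for illustration, mktemp is assumed to be available, and the output is reduced to the lines themselves rather than the raw diff report.

```shell
# only_in_first FILE_A FILE_B  (hypothetical helper name)
# Prints lines of FILE_A that diff -bB reports as missing from FILE_B,
# after stripping leading whitespace from both files.
only_in_first() (
  tmp_a=$(mktemp) || exit 1
  tmp_b=$(mktemp) || exit 1
  # Strip leading whitespace, which -b alone does not ignore.
  sed 's/^[[:space:]]*//' "$1" > "$tmp_a"
  sed 's/^[[:space:]]*//' "$2" > "$tmp_b"
  # Keep only the "< " lines (present in FILE_A only) and drop the marker.
  diff -bB "$tmp_a" "$tmp_b" | sed -n 's/^< //p'
  rm -f "$tmp_a" "$tmp_b"
)
```

Note that -B is not a POSIX diff option; GNU diff provides it, so check the manual of your own diff first.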

Answer 3

diff -w file1 file2

Give diff the -w flag to make it ignore whitespace (this is an extension, but one that most diff implementations support).

Given the following input:

file1:

hello world

abc
123

this is line 2 (the last line)

file2:

        hello   world

abc
123

this is line 3 (the last line)

the command outputs:

6c6
< this is line 2 (the last line)
---
> this is line 3 (the last line)

To make it ignore blank lines as well, preprocess the input files by removing the blank lines. With a shell that understands process substitution (such as bash or ksh93):

diff -w <( sed '/^[[:space:]]*$/d' file1 ) <( sed '/^[[:space:]]*$/d' file2 )

If your diff has an option for ignoring blank lines (check its manual; GNU diff has -B), use that instead. Mine has no such option.
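
In a shell without process substitution (plain sh, for example), roughly the same thing can be done with temporary files. A minimal sketch, assuming mktemp is available; the function name diff_ws is made up for illustration:

```shell
# diff_ws FILE1 FILE2  (hypothetical helper name)
# Compares two files while ignoring whitespace and blank lines, using
# temporary files instead of process substitution.
diff_ws() (
  tmp1=$(mktemp) || exit 2
  tmp2=$(mktemp) || exit 2
  # Remove empty and whitespace-only lines from both inputs.
  sed '/^[[:space:]]*$/d' "$1" > "$tmp1"
  sed '/^[[:space:]]*$/d' "$2" > "$tmp2"
  # -w ignores all whitespace within the remaining lines.
  diff -w "$tmp1" "$tmp2"
  status=$?
  rm -f "$tmp1" "$tmp2"
  # Pass on the exit status of diff (0 same, 1 different, 2 trouble).
  exit "$status"
)
```

With the file1 and file2 shown above, diff_ws file1 file2 reports the change as 4c4 instead of 6c6, because the blank lines are gone before diff sees the input.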
