bash: how to check whether every line has the same number of words

Answer 1

Using awk:

awk 'BEGIN { r = "true" } NR == 1 { n = NF; next } NF != n { r = "false"; n = "N/A"; exit } END { printf("status=%s count=%s\n", r, n) }' somefilename

Or, as an awk script:

#!/usr/bin/awk -f

BEGIN { r = "true" }

NR == 1 { n = NF; next }
NF != n { r = "false"; n = "N/A"; exit }

END { printf("status=%s count=%s\n", r, n) }

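The script starts by setting r (the "result") to true, assuming a true rather than a false outcome. It then initializes n (the "number") to the number of fields on the first line.

If any other line of the input has a different number of fields, r is set to false, n is set to N/A, and the script exits (via the END block).

Finally, it prints the current values of r and n.

The output of this script looks something like this: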

status=true count=5

or

status=false count=N/A

This can be used with eval, or with declare or export in bash:

declare $( awk '...' somefilename )

This creates the shell variables count and status, which can then be used in the calling shell.
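
As a minimal sketch of that usage, the following writes a small sample file, runs the awk one-liner from above, and branches on the resulting variables. The helper name check_fields and the sample data are made up for illustration, and the sketch assumes bash, since declare is a bash builtin.

```shell
#!/bin/bash
# check_fields is a hypothetical helper name; it wraps the awk one-liner above.
check_fields() {
    awk 'BEGIN { r = "true" }
         NR == 1 { n = NF; next }
         NF != n { r = "false"; n = "N/A"; exit }
         END { printf("status=%s count=%s\n", r, n) }' "$1"
}

# Sample data, made up for illustration: three lines of three words each.
sample=$(mktemp)
printf '%s\n' 'a b c' 'd e f' 'g h i' > "$sample"

# Unquoted on purpose: word splitting hands declare two arguments,
# status=... and count=...
declare $(check_fields "$sample")
rm -f "$sample"

if [[ $status == true ]]; then
    echo "every line has $count words"
else
    echo "word counts differ"
fi
```

Note that declare is run at the top level here; inside a function it would create local variables instead.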

Answer 2

You can use an associative array to keep a tally of how many lines have each word count.

#!/bin/bash
declare -A seen
while read -a line ; do
    (( seen[${#line[@]}]++ ))
done

if [[ ${#seen[@]} == 1 ]] ; then
    echo "count=${!seen[@]}"   # the only key is the common word count
    exit
else
    echo count=NA
    exit 1
fi
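
To try the tally on inline data, the same loop can be wrapped in a function. This is only a sketch: the function name tally_counts and the sample input are invented here, and it needs bash 4 or later for associative arrays.

```shell
#!/bin/bash
# tally_counts is a hypothetical name; it reads lines from standard input.
tally_counts() {
    local -A seen=()   # key: words on a line, value: lines with that many words
    local -a line
    local n
    while read -ra line; do
        n=${#line[@]}
        seen[$n]=$(( ${seen[$n]:-0} + 1 ))
    done
    if (( ${#seen[@]} == 1 )); then
        echo "count=${!seen[@]}"   # the only key is the common word count
    else
        echo "count=NA"
        return 1
    fi
}

# Sample input, made up for illustration: two lines of three words each.
tally_counts <<'EOF'
one two three
four five six
EOF
```

With an empty input the array has no keys at all, so this sketch reports count=NA for that case as well.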

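Alternatively, you can use external tools to do the work. For example, the following script uses Perl with its -a autosplit option to count the words on each line, sort -u to get the unique counts, and wc -l to check whether there is more than one count.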

#!/bin/bash
out=$(perl -lane 'print scalar @F' | sort -u)
if ((1 == $(wc -l <<<"$out") )) ; then
    echo count=$out
    exit
else
    echo count=NA
    exit 1
fi
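
To see why the number of lines left after sort -u is the deciding value, this sketch runs the counting stage on two made-up inputs. It substitutes awk '{ print NF }' for the Perl command, on the assumption that awk's default field splitting behaves like Perl's -a for plain whitespace-separated words; distinct_counts is a hypothetical name.

```shell
#!/bin/bash
# distinct_counts is a hypothetical helper: it prints each distinct
# per-line word count found on standard input, one per line.
distinct_counts() {
    awk '{ print NF }' | sort -u
}

uniform=$(printf '%s\n' 'a b c' 'd e f' | distinct_counts)
mixed=$(printf '%s\n' 'a b c' 'd e' | distinct_counts)

# $(( ... )) strips the padding that some wc implementations add.
echo "uniform input: $(( $(wc -l <<<"$uniform") )) distinct count(s)"
echo "mixed input: $(( $(wc -l <<<"$mixed") )) distinct count(s)"
```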

Answer 3

if
  count=$(
    awk 'NR == 1 {print count = NF}
         NF != count {exit 1}' < file
  )
then
  if [ -z "$count" ]; then
    echo "OK? Not OK? file is empty"
  else
    echo "OK all lines have $count words"
  fi
else
  echo >&2 "Not all lines have the same number of words or the file can't be read"
fi

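In that last branch, you can distinguish between differing counts and an unreadable file by testing [ -z "$count" ] again: if the file could not be opened, awk never ran and count is empty.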
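
A minimal sketch of that extra check, using the same awk command as above. The function name report, the labels it prints, and the sample file names are made up for illustration.

```shell
#!/bin/bash
# report is a hypothetical wrapper around the test above; $1 is the file to check.
report() {
    local count
    if count=$(awk 'NR == 1 {print count = NF}
                    NF != count {exit 1}' < "$1")
    then
        if [ -z "$count" ]; then
            echo "empty"
        else
            echo "same: $count"
        fi
    elif [ -z "$count" ]; then
        # Nothing was printed, so awk never saw a first line.
        echo "unreadable"
    else
        echo "different"
    fi
}

# Sample files, made up for illustration; missing.txt is deliberately not created.
dir=$(mktemp -d)
printf '%s\n' 'a b c' 'd e f' > "$dir/same.txt"
printf '%s\n' 'a b c' 'd e'   > "$dir/diff.txt"
: > "$dir/empty.txt"

for f in same.txt diff.txt empty.txt missing.txt; do
    printf '%s: ' "$f"
    report "$dir/$f" 2>/dev/null   # hide the shell's "No such file" message
done
rm -rf "$dir"
```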

Answer 4

#!/usr/bin/perl

use strict; # get perl to warn us if we try to use an undeclared variable.

# get all words on first line, and store them in a hash
#
# note: it doesn't matter which line we get the word list from because
# we only want to know if all lines have the same number of identical
# words.
my %words = map { $_ => 1 } split ' ', <>;

while(<>) {
  # now do the same for each subsequent line
  my %thisline = map { $_ => 1 } split ;

  # and compare them.  exit with a non-zero exit code if they differ.
  if (scalar(keys %words) != scalar(keys %thisline)) {
    # optionally print a warning message to STDERR here.
    exit 1;
  }
};

# print the number of words we saw on the first line
print scalar keys %words, "\n";
exit 0

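(The exit 0 on the last line is not required; it is the default anyway. It is only "useful" for documenting that the return code is a significant part of the program's output.)

Note: words repeated on a line are not counted. For example, sda sdb sdc sdc sdc counts as three words, not 5, because the last three words are identical. If that matters, the hash also needs to count the occurrences of each word. Something like this: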

#!/usr/bin/perl

use strict;   # get perl to warn us if we try to use an undeclared variable.

# get all words on first line, and store them in a hash
#
# note: it doesn't matter which line we get the word list from because
# we only want to know if all lines have the same number of identical
# words.
my %words=();
$words{$_}++ for split ' ', <>;

while(<>) {
  # now do the same for each subsequent line
  my %thisline=();
  $thisline{$_}++ for split;

  # and compare the totals.  exit with a non-zero exit code if they differ.
  my ($want, $got) = (0, 0);
  $want += $_ for values %words;
  $got  += $_ for values %thisline;
  if ($want != $got) {
    # optionally print a warning message to STDERR here
    exit 1;
  }
};

# add up the number of times each word was seen  on the first line  
my $count=0;
foreach (keys %words) {
  $count += $words{$_};
};

# print the total
print "$count\n";
exit 0;

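The significant difference is in how the hash is populated. The first version just sets the value of each key (each "word") to 1; the second version counts the occurrences of each key.

The second version also has to add up the values of all the keys; it cannot simply print the number of keys seen.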
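
The difference between the two tallies can be checked directly in the shell. This sketch uses bash rather than Perl, with variable names invented for illustration, and counts the example line from the note both ways.

```shell
#!/bin/bash
line='sda sdb sdc sdc sdc'
read -ra words <<<"$line"

# Count how often each word occurs, as the second Perl version does.
declare -A occurrences=()
for w in "${words[@]}"; do
    occurrences[$w]=$(( ${occurrences[$w]:-0} + 1 ))
done

distinct=${#occurrences[@]}   # number of keys: what the first version prints
total=0
for w in "${!occurrences[@]}"; do
    total=$(( total + ${occurrences[$w]} ))   # sum of values: the second version
done

echo "distinct=$distinct total=$total"
```

For this line the two numbers are 3 and 5, which is exactly the gap the note describes.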
