Compare UNIX file line count against record counts of multiple types

My file has multiple headers and multiple record types (e.g. 0001, 0002, 0003, 0004). The trailer row gives the count for each record type along with the overall detail record count.

Sample file:

XYZH001
YZXH002
0001Rec1
0001Rec2
YZXH002
0002Rec1
0002Rec2
YZXH002
0003Rec1
0003Rec2
0003Rec3
YZXH002
0004Rec1
T999008002002004001

File details:

Detail records are the lines whose positions 1 to 4 are one of (0001, 0002, 0003, 0004)

Trailer:
Trailer identifier(position 1 to 4)             = T999
total data count (position 5 to 7)              = 008
count of record type 0001 (position 8 to 10)    = 002
count of record type 0002 (position 11 to 13)   = 002
count of record type 0003 (position 14 to 16)   = 004
count of record type 0004 (position 17 to 19)   = 001
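
The fixed-width layout above can be read off with `substr`. The following is only a minimal sketch of that slicing, assuming every count is exactly three characters wide; `parse_trailer` is a hypothetical helper name, not something from the question.

```shell
#!/bin/sh
# Sketch: split a fixed-width trailer line into its counts.
# parse_trailer is a hypothetical helper name used only for illustration.
parse_trailer() {
    printf '%s\n' "$1" | awk '{
        # positions 5-7 hold the total detail count; +0 strips leading zeros
        printf "total=%d\n", substr($0, 5, 3) + 0
        # per-type counts start at position 8, three characters each
        for (i = 1; 8 + (i - 1) * 3 + 2 <= length($0); i++)
            printf "%04d=%d\n", i, substr($0, 8 + (i - 1) * 3, 3) + 0
    }'
}

parse_trailer T999008002002004001
```

For the sample trailer this prints `total=8` followed by `0001=2`, `0002=2`, `0003=4` and `0004=1`, one per line.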

Requirement:

-- Compare the overall detail row count (lines whose positions 1 to 4 are in (0001, 0002, 0003, 0004)) with the trailer record count (position 5 to 7)
-- Compare each record type's row count with the matching trailer record count
     e.g. compare the row count where positions 1 to 4 = 0001 with the trailer record count for 0001 (position 8 to 10)
      .....
-- Stop execution if a detail record row count and its trailer count do not match

Expected output:

Overall detail row count 8 matches with trailer record count 8.
Row count for 0001 record type 2 matches with trailer record count 2.
Row count for 0002 record type 2 matches with trailer record count 2.
Row count for 0003 record type 3 does not match with trailer record count 4.
Stopping execution.

Answer 1

Use:

awk '/^000[1-4]/{ rec[substr($0, 1, 4)+0]++; totalRec++ }   # count detail rows per type
END{ trailTotalRec=substr($0, 5, 3)+0   # in most awks $0 still holds the last line (the trailer) here
     for(i=1; i<=4; i++) { trailRec[i]=substr($0, pos+8, 3)+0; pos+=3 }
    if(trailTotalRec!=totalRec) {
        print "Overall detail row count", totalRec, "does not match with trailer record count", trailTotalRec"."
        print "Stopping execution."; exit 7
    }
    print "Overall detail row count", totalRec, "matches with trailer record count", trailTotalRec"."
    for(i=1; i<=4; i++) {
        print "Row count for 000" i, "record type", rec[i]+0, (trailRec[i]==rec[i]+0?"matches":"does not match"), "with trailer record count", trailRec[i]"."
        if(trailRec[i]!=rec[i]+0) { print "Stopping execution."; exit 7 }
    }
}' infile

Answer 2

#!/usr/bin/awk -f

/^000[0-9]/ {
    i=substr($1,1,4);
    sub(/^0+/,"",i);
    bc[i]++;   # bc = body count array
    rtotal++;
}

/^T999/ {
  dc = substr($0,5,3) + 0;  # +0 forces a number; the string "008" would not compare equal to 8
  for (r in bc) {
    rc[r] = substr($0,(r-1)*3+8,3); # rc = row count array
    sub(/^0+/,"",rc[r]);
    ttotal+=rc[r]
  };
}

END {
  if (dc == rtotal) {
    printf "Overall detail row count %i matches with trailer record count %i.\n", rtotal, dc;
  } else {
     printf "Row count %i does not match trailer record count %i\n", rtotal, dc;
     exit 1;
  }
  for (r = 1; r <= 9; r++) {   # fixed order; the order of "for (r in rc)" is unspecified
    if (!(r in rc)) continue;
    if (rc[r] == bc[r]) {
      printf "Row count for %04i record type %i matches with trailer record count %i.\n", r, bc[r], rc[r];
    } else {
      printf "Row count for %04i record type %i does not match with trailer record count %i.\n", r, bc[r], rc[r];
      print "Stopping execution"
      exit 2;
    };
  };
}

This assumes that the T999 trailer line will have 3 characters for the data count (dc) and 3 characters for each element of the bc array.
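
Since the script relies on that layout, it may be worth checking the trailer before trusting its counts. The following is a rough sketch only: `check_trailer` is a hypothetical helper, and the pattern assumes the trailer is `T999`, a 3-digit total, then one or more 3-digit per-type counts.

```shell
#!/bin/sh
# check_trailer is a hypothetical helper, not part of the answer script.
# It accepts T999, a 3-digit total, then one or more 3-digit per-type counts.
check_trailer() {
    printf '%s\n' "$1" | grep -Eq '^T999[0-9]{3}([0-9]{3})+$'
}

if check_trailer T999008002002004001; then
    echo "trailer layout looks valid"
else
    echo "trailer layout is malformed" >&2
fi
```

With a real file, the argument would typically be the last line, e.g. `check_trailer "$(tail -n 1 file.txt)"`.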

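If there is an error in the data, it exits with 1 (the overall row count doesn't match the trailer's total data count) or 2 (the row count for a record type doesn't match the trailer's count for that type). You can test for and use this in, e.g., a shell case statement with the shell built-in variable $? (the exit code of the last command).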

 #!/bin/sh

 ./summarise.awk file.txt
 ec="$?"
 case "$ec" in
    0) probably_do_nothing ;; # success! 0 == no error
    1) do_something ;;
    2) do_another_thing ;;
 esac

Save it as, e.g., summarise.awk and make it executable with chmod +x summarise.awk.

$ ./summarise.awk file.txt 
Overall detail row count 8 matches with trailer record count 8.
Row count for 0001 record type 2 matches with trailer record count 2.
Row count for 0002 record type 2 matches with trailer record count 2.
Row count for 0003 record type 3 does not match with trailer record count 4.
Stopping execution
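
The exit-status handling described above can be tried end to end without any files on disk. This is a self-contained sketch rather than the script itself: `validate_counts` is a hypothetical, silent stand-in that only reproduces the exit codes 1 and 2, and the sample data is fed in through a here-document.

```shell
#!/bin/sh
# validate_counts is a hypothetical stand-in used only for this sketch.
# It prints nothing: exit 0 = all counts match, 1 = overall mismatch,
# 2 = per-type mismatch. It reads the named file, or stdin if none is given.
validate_counts() {
    awk '
        /^000[1-9]/ { n[substr($0, 1, 4) + 0]++; total++ }
        /^T999/     { trailer = $0 }
        END {
            if (total != substr(trailer, 5, 3) + 0) exit 1
            for (i = 1; 8 + (i - 1) * 3 + 2 <= length(trailer); i++)
                if (n[i] + 0 != substr(trailer, 8 + (i - 1) * 3, 3) + 0) exit 2
        }
    ' "$@"
}

status=0
validate_counts <<'EOF' || status=$?
XYZH001
YZXH002
0001Rec1
0001Rec2
YZXH002
0002Rec1
0002Rec2
YZXH002
0003Rec1
0003Rec2
0003Rec3
YZXH002
0004Rec1
T999008002002004001
EOF

case "$status" in
    0) echo "counts match" ;;
    1) echo "overall count mismatch" ;;
    2) echo "per-type count mismatch" ;;
esac
```

For the sample data the 0003 count is off (3 rows against a trailer count of 4), so the status is 2 and the last branch runs.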

Answer 3

$ cat tst.awk
/^[0-9]/ {
    currType = substr($0,1,4)
    if ( currType != prevType ) {
        recNr2type[++actNumRecs] = currType
        prevType = currType
    }
    recNrs2rowCnts[actNumRecs]++
    actTotRows++
    next
}
{ trailer = $0 }
END {
    expTotRows = substr(trailer,5,3)+0

    result = (actTotRows == expTotRows ? "matches" : "does not match")
    printf "Overall detail row count %d %s with trailer record count %d.\n", \
        actTotRows, result, expTotRows

    trailer = substr(trailer,8)
    while ( (trailer != "") && (result == "matches") ) {
        expRowCnt = substr(trailer,1,3)+0
        actRowCnt = recNrs2rowCnts[++expNumRecs]
        type      = recNr2type[expNumRecs]

        result = (actRowCnt == expRowCnt ? "matches" : "does not match")
        printf "Row count for %s record type %d %s with trailer record count %d.\n", \
            type, actRowCnt, result, expRowCnt

        trailer = substr(trailer,4)
    }

    if ( result != "matches" ) {
        print "Stopping execution."
        exit 1
    }
}

$ awk -f tst.awk file
Overall detail row count 8 matches with trailer record count 8.
Row count for 0001 record type 2 matches with trailer record count 2.
Row count for 0002 record type 2 matches with trailer record count 2.
Row count for 0003 record type 3 does not match with trailer record count 4.
Stopping execution.
