Extracting, trimming, and processing strings in a Bash script

Question 1

You could also try the following simple awk approach.

awk -F"[_.:]" '                                 # set field separator to "_", ".", or ":"
        {SUM[$3] += $NF                         # sum all trailing fields in array indexed by the date
        }
END     {for (s in SUM) print s, SUM[s]         # print the date and the respective sum
        }
' OFS=":" file[123]                             # set output field separator; have shell expand file names 1 - 3
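To see how the separator class splits a line, the following sketch prints the two fields the script relies on. The sample line is an assumption about the input format (an archive name, a colon, and a count), inferred from the script itself rather than taken from the original files.

```shell
# Hypothetical sample line; the real files are assumed to look like this.
line='localhost_access_2018-06-19.tar.gz:5'

# Fields after splitting on "_", ".", or ":":
#   $1=localhost $2=access $3=2018-06-19 $4=tar $5=gz $6=5
fields=$(printf '%s\n' "$line" | awk -F '[_.:]' -v OFS=':' '{ print $3, $NF }')

printf '%s\n' "$fields"    # prints 2018-06-19:5
```

With this layout the date is always field 3 and the count is always the last field, which is why the script can index the array by $3 and add $NF.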

Question 2

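awk -F 'localhost_access_' '
    {
        n = substr($2, 1 + index($2, ":"))   # count: everything after the first ":"
        sub(/\.tar\.gz.*/, "", $2)           # strip ".tar.gz" and the rest, leaving the date
        str[$2] += n                         # accumulate the count per date
    }
    END {
        for (i in str) {
            print i ":" str[i]
        }
    }' node1.txt node2.txt node3.txt node4.txt | sort > output.txt   # ISO dates sort correctly as plain text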

    output_not_sorted=$(cat output.txt);

    # sort output by date

    exit

Please let me know if this can be improved.
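The parsing step of the script above can be checked on a single line. This is only a sketch: the sample line is a hypothetical example of the assumed input format, and the variable names rest, count, and parsed are introduced here for illustration.

```shell
# Hypothetical sample line in the assumed input format.
line='localhost_access_2018-06-19.tar.gz:5'

parsed=$(printf '%s\n' "$line" | awk -F 'localhost_access_' '{
    rest  = $2                                   # "2018-06-19.tar.gz:5"
    count = substr(rest, index(rest, ":") + 1)   # text after the first ":"
    sub(/\.tar\.gz.*/, "", rest)                 # drop ".tar.gz" and everything after it
    print rest ":" count
}')

printf '%s\n' "$parsed"    # prints 2018-06-19:5
```

Escaping the dots in the pattern keeps it from matching unrelated text, since an unescaped dot in a regular expression matches any character.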

Question 3

Considering the two files shown at the start of the question, and assuming that they do not actually contain blank lines:

$ awk -F ':' -v OFS=':' '
    { sum[$1]+=$2 }
    END { for (key in sum) {
        split(key,f,"[_.]")
        print f[3],sum[key] } }' file* | sort
2018-06-19:0
2018-06-20:0
2018-06-21:1
2018-06-22:0
2018-06-23:0
2018-06-24:0
2018-06-25:0
2018-06-26:1
2018-06-27:0
2018-07-04:2
2018-07-05:3
2018-07-06:6
2018-07-07:0
2018-07-19:28
2018-07-20:17
2018-07-21:12
2018-07-22:4
2018-07-23:2
2018-07-24:2905
2018-07-25:10440
2018-07-26:2644
2018-07-27:1896
2018-07-28:1238
2018-07-29:932

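This uses the file name as the key into the associative array sum and collects the total for each file name there. At the end, it iterates over the keys of sum and prints the date part of each key together with its total. The date part of a key is the third field obtained by splitting the key on dots and underscores.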

The result is piped through sort.
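The key-splitting step described above can be tried in isolation. In this sketch the key is a hypothetical archive name in the assumed format, and date_part is a name introduced only for the example.

```shell
# Hypothetical key, in the form the first field takes when lines are split on ":".
key='localhost_access_2018-06-19.tar.gz'

date_part=$(awk -v key="$key" 'BEGIN {
    # Split on "_" or "."; f[1]=localhost f[2]=access f[3]=2018-06-19 f[4]=tar f[5]=gz
    split(key, f, "[_.]")
    print f[3]
}')

printf '%s\n' "$date_part"    # prints 2018-06-19
```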

Shorter, but essentially the same as the above (except that only the date is used as the key in the sum array):

awk -F '[_.:]' -v OFS=':' '
    { sum[$3]+=$6 }
    END { for (d in sum) print d, sum[d] }' file*
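A small end-to-end run of the shorter variant might look like the sketch below. The two input files and their contents are hypothetical and are created in a temporary directory; the trailing sort is added here because awk does not define the order in which "for (d in sum)" visits the keys.

```shell
# Build two small hypothetical input files in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'localhost_access_2018-06-19.tar.gz:1' \
              'localhost_access_2018-06-20.tar.gz:4' > "$tmpdir/file1"
printf '%s\n' 'localhost_access_2018-06-19.tar.gz:2' \
              'localhost_access_2018-06-20.tar.gz:0' > "$tmpdir/file2"

# Sum the counts per date across both files, then sort by date.
totals=$(awk -F '[_.:]' -v OFS=':' '
    { sum[$3] += $6 }
    END { for (d in sum) print d, sum[d] }' "$tmpdir"/file* | sort)

printf '%s\n' "$totals"
rm -r "$tmpdir"
```

For these inputs the totals are 2018-06-19:3 and 2018-06-20:4, one per line.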

Answer