Extracting specific information from logs

Answer 1

You can try the following sed expression.

sed -e 's/^\(.* .* \).* .*== \([^ ]* \).*MAIL FROM:<\([^ ]*\)> [^ ]* \([0-9 .]*\)\[.*Messages from \([^ ]*\).*$/\1\t\2\t\3\t\5\t\4/'

It worked for me on your example.

Explanation

The sed expression contains only one command, s/.../.../.

The first part of s///:

'^\(.* .* \)'      -- Timestamp, two first space-separated blocks of text, \1.
'.* .*== '         -- Uninteresting text after timestamp.
'\([^ ]* \)'       -- Block of text between spaces, first email address, \2.
'.*MAIL FROM:<'    -- Position before second email.
'\([^ ]*\)>'       -- Second email addr, non-space characters, ended by '>', \3.
' [^ ]* '          -- SIZE=...:
'\([0-9 .]*\)\['   -- Error codes: digits, spaces and dots ended by '[', \4.
'.*Messages from ' -- Position before IP.
'\([^ ]*\)'        -- Non-space characters, ended by space, IP. \5.
'.*$'              -- Text before end of string, not interesting.

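As you can see, this is just a direct description of the original log line, nothing interesting.

The second part of s/// puts the \N groups in the right order, using \t (a tab character) as the separator.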
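To make the group reordering concrete, here is a small sketch that feeds one log line through the same command. The sample line is an assumption, made up to fit the pattern rather than taken from a real log, and the `\t` in the replacement relies on GNU sed; other sed implementations may need a literal tab instead.

```shell
# Hypothetical sample line, invented to fit the pattern above (not from a real log).
line='2013-05-01 10:00:00 1UXabc-0001 == user@example.com R=dnslookup defer: SMTP error after MAIL FROM:<sender@example.org> SIZE=1234: 421 4.7.0 [TS01] Messages from 192.0.2.10 temporarily deferred'

# Same expression as above; \t in the replacement is a GNU sed feature.
result=$(printf '%s\n' "$line" |
  sed -e 's/^\(.* .* \).* .*== \([^ ]* \).*MAIL FROM:<\([^ ]*\)> [^ ]* \([0-9 .]*\)\[.*Messages from \([^ ]*\).*$/\1\t\2\t\3\t\5\t\4/')

# Show the tab-separated fields one per line: timestamp, recipient,
# sender, IP address, error codes.
printf '%s\n' "$result" | tr '\t' '\n'
```

Note that \1, \2 and \4 keep the trailing space that their groups capture, so the fields may need trimming if they are compared exactly later on.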

Answer 2

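I don't have much experience with awk, but I wanted to give it a try. I suspect this is quite brittle, since I don't know how many different log lines you want to handle.

Anyway, this uses a BEGIN block to set up the variables to be picked out and a format string to print with, before displaying the headings. Time and EmailTo are predictable, so numbered fields ($1, $2 and $5) can be used for them, ahead of a set of three regexes. It's very rough and ready; any suggestions for improvement would be greatly appreciated!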

awk 'BEGIN {
        # Column layout shared by the header and every data row.
        fstr="%-24s%-24s%-40s%-16s%s\n";
        printf(fstr, "Timestamp:", "EmailTo:", "EmailFrom:", "IPAddress:", "ErrorCodes:");
    }
{   # Reset per line so values from an earlier line do not carry over.
    from=""; ip=""; error="";
    # Scan through the last field (i<=NF) so a match at the end is not missed.
    for (i=6; i<=NF; i++)
    {
    # From address: strip the leading "FROM:<" and the trailing ">".
    if ($i ~ /FROM:<[^ ]*>/)
        from=substr($i, 7, length($i)-7);
    # Errors found in two adjacent fields.
    if ($(i-1) ~ /[[:digit:]]{3}/ && $i ~ /[[:digit:]]\.[[:digit:]]\.[[:digit:]]/)
        error=$(i-1) " " $i;
    # IP address after predictable string.
    if ($(i-2) " " $(i-1) == "Messages from" && $i ~ /[[:digit:].]{7,15}/)
        ip=$i;
    }
    printf(fstr, $1" "$2, $5, from, ip, error);
}' logs
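The substr arithmetic in the script is easy to get wrong by one, so here is a minimal sketch of just that step. The field value is a made-up example of what awk would see after splitting a line on spaces.

```shell
# Hypothetical field value (assumed for illustration, not from a real log).
field='FROM:<sender@example.org>'

# "FROM:<" is 6 characters, so the address starts at position 7;
# length-7 drops those 6 characters plus the closing ">".
from=$(printf '%s\n' "$field" | awk '{ print substr($1, 7, length($1)-7) }')

printf '%s\n' "$from"
```

This only holds while the field starts with exactly "FROM:<"; if the prefix differs, the offsets have to change with it.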
