Match a pattern, add newlines, and append a word to the end of each line

Can anyone help me solve this? I have an extracted file whose contents look like this:

(11213068, 2020-11-16) deleted
(1075227404, 2021-06-14) added
(11213177, 2020-11-16) deleted
(1075227413, 2021-06-14) added
(11213070, 2020-11-16) deleted
(1075193958, 2021-05-28) added
(1075194668, 2022-11-29) added
(1073757334, 2021-01-20) (1073757337, 2021-01-20) (1073757349, 2021-01-20) (1073757331, 2021-01-20) (1073757346, 2021-01-20) added
(1073757237, 2020-11-20) (1073757263, 2020-11-20) (1073757233, 2020-11-20) (1073757241, 2020-11-20) (1073757247, 2020-11-20) deleted

The output I want is:

(11213068, 2020-11-16) delete
(1075227404, 2021-06-14) add
(11213177, 2020-11-16) delete
(1075227413, 2021-06-14) add
(11213070, 2020-11-16) delete
(1075193958, 2021-05-28) add
(1075194668, 2022-11-29) add
(1073757334, 2021-01-20) add
(1073757337, 2021-01-20) add
(1073757349, 2021-01-20) add
(1073757331, 2021-01-20) add
(1073757346, 2021-01-20) add
(1073757237, 2020-11-20) delete
(1073757263, 2020-11-20) delete
(1073757233, 2020-11-20) delete
(1073757241, 2020-11-20) delete
(1073757247, 2020-11-20) delete

I can't find a solution for the last two lines. This is the command I used, followed by its output:

awk '$3!="added"' | awk '$3!="deleted"' | sed 's/) (/\n/g' file.txt

(11213068, 2020-11-16) deleted
(1075227404, 2021-06-14) added
(11213177, 2020-11-16) deleted
(1075227413, 2021-06-14) added
(11213070, 2020-11-16) deleted
(1075193958, 2021-05-28) added
(1075194668, 2022-11-29) added
(1073757334, 2021-01-20
1073757337, 2021-01-20
1073757349, 2021-01-20
1073757331, 2021-01-20
1073757346, 2021-01-20) added
(1073757237, 2020-11-20
1073757263, 2020-11-20
1073757233, 2020-11-20
1073757241, 2020-11-20
1073757247, 2020-11-20) deleted

Thank you for your time.

Answer 1

Use the right field separator for this:

awk -F') ' '{for (i=1;i<NF;i++) print $i FS $NF}' file
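
A minimal sketch of what that separator does to one multi-tuple line. The helper name show_fields and the sample line are made up for illustration, and the separator is written here as [)] followed by a space, which should match the same text as ') ' without depending on how a particular awk treats a bare closing parenthesis in a regex.

```shell
# show_fields is a hypothetical helper, used only for this illustration.
# It prints every field awk sees when a ")" followed by a space is the separator.
show_fields() {
    awk -F'[)] ' '{ for (i = 1; i <= NF; i++) printf "%d:[%s]\n", i, $i }'
}

# Sample line (made up, same shape as the data in the question).
printf '%s\n' '(1, 2020-01-01) (2, 2020-01-02) added' | show_fields
```

Every field except the last is a tuple that has lost its closing parenthesis and the following space, which is why the command above prints FS back between $i and $NF.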

If you also need to replace the last field, there are several ways to do it, such as calling sub() at the start of processing each line:

awk -F') ' '{sub(/added$/,"add"); sub(/deleted$/,"delete"); for (i=1;i<NF;i++) print $i FS $NF}' file

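Answer 2

Using GNU sed with extended regular expressions (-E):

Each boundary between ) and ( is marked with a newline, so every tuple after the first starts at a newline marker. The last field (after its past-tense ending has been trimmed) is copied in front of the first newline, the text up to that newline is printed, and that part is then deleted from the pattern space. This repeats until the pattern space is exhausted.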

$ sed -Ee '/\n/ba
    /e?d$/s/ (add|delete)e?d$/ \1/
    s/[)] [(]/) \n(/g;:a
    s/(\n.*)?\n.* (\S+)$/\2&/
    /\n.*\n/{P;D;}
' file
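
To look at the marking step in isolation, the sketch below runs only the substitution that inserts the newline markers and then uses sed's l command to print the pattern space unambiguously. It assumes GNU sed (for -E and for \n in the replacement); mark_boundaries is a hypothetical helper name and the sample line is made up.

```shell
# mark_boundaries is a hypothetical helper, used only for this illustration.
# It inserts a newline marker between ")" and "(" and then prints the
# pattern space with "l", which shows an embedded newline as \n and the
# end of the pattern space as $. Assumes GNU sed.
mark_boundaries() {
    sed -E -n 's/[)] [(]/) \n(/g; l'
}

printf '%s\n' '(1, a) (2, b) added' | mark_boundaries
```

The markers are what the later commands in the script key on: P prints up to the first one and D deletes up to it.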

Or with Perl:

$ perl -F'\)\s' -lane '$, = ") ";
    my $l = pop(@F) =~
     s/^(add)ed$/$1/r =~
      s/^(delete)d$/$1/r;
    print $_, $l for @F;
' file

Answer 3

Perhaps a two-step solution?

<infile sed 's/deleted/delete/; s/added/add/' | 
awk 'NF==3; NF>3 { for (i=1; i<NF; i+=2) print $i, $(i+1), $NF }'
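
The awk stage appears to rely on default whitespace splitting, under which each tuple occupies two fields (the number with its comma, then the date with its closing parenthesis) and the trailing word is the last field. A small sketch of that, with a hypothetical helper name count_fields and made-up sample lines:

```shell
# count_fields is a hypothetical helper, used only for this illustration.
# It prints how many whitespace-separated fields awk sees on each line.
count_fields() {
    awk '{ print NF }'
}

# One tuple plus a word gives 3 fields; two tuples plus a word give 5.
printf '%s\n' '(1, 2020-01-01) added' '(1, 2020-01-01) (2, 2020-01-02) added' | count_fields
```

That is why single-tuple lines are passed through by NF==3 and longer lines are walked in steps of two fields.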

Answer 4

Using GNU awk for FPAT:

$ awk -v FPAT='[(][^)]+)|\\S+' '{for (i=1; i<NF; i++) print $i, $NF}' file
(11213068, 2020-11-16) deleted
(1075227404, 2021-06-14) added
(11213177, 2020-11-16) deleted
(1075227413, 2021-06-14) added
(11213070, 2020-11-16) deleted
(1075193958, 2021-05-28) added
(1075194668, 2022-11-29) added
(1073757334, 2021-01-20) added
(1073757337, 2021-01-20) added
(1073757349, 2021-01-20) added
(1073757331, 2021-01-20) added
(1073757346, 2021-01-20) added
(1073757237, 2020-11-20) deleted
(1073757263, 2020-11-20) deleted
(1073757233, 2020-11-20) deleted
(1073757241, 2020-11-20) deleted
(1073757247, 2020-11-20) deleted

Or, if you really do want to change that last word:

$ awk -v FPAT='[(][^)]+)|\\S+' '
    BEGIN { map["deleted"]="delete"; map["added"]="add" }
    { for (i=1; i<NF; i++) print $i, map[$NF] }
' file
(11213068, 2020-11-16) delete
(1075227404, 2021-06-14) add
(11213177, 2020-11-16) delete
(1075227413, 2021-06-14) add
(11213070, 2020-11-16) delete
(1075193958, 2021-05-28) add
(1075194668, 2022-11-29) add
(1073757334, 2021-01-20) add
(1073757337, 2021-01-20) add
(1073757349, 2021-01-20) add
(1073757331, 2021-01-20) add
(1073757346, 2021-01-20) add
(1073757237, 2020-11-20) delete
(1073757263, 2020-11-20) delete
(1073757233, 2020-11-20) delete
(1073757241, 2020-11-20) delete
(1073757247, 2020-11-20) delete
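
FPAT is specific to GNU awk. Where gawk is not available, a rough equivalent can be sketched with the POSIX match() function, pulling one parenthesized tuple at a time off the front of the line. This is a substitute technique, not the one the answer uses, and split_tuples is a hypothetical helper name.

```shell
# split_tuples is a hypothetical helper, used only for this illustration.
# For each input line it prints every "(...)" group followed by the
# line's last whitespace-separated word, using only POSIX awk features.
split_tuples() {
    awk '{
        last = $NF                                   # trailing word, e.g. "added"
        rest = $0
        while (match(rest, /[(][^)]*[)]/)) {         # next "(...)" group
            print substr(rest, RSTART, RLENGTH), last
            rest = substr(rest, RSTART + RLENGTH)    # drop what was consumed
        }
    }'
}

printf '%s\n' '(11213068, 2020-11-16) deleted' \
    '(1073757334, 2021-01-20) (1073757337, 2021-01-20) added' | split_tuples
```

As written it keeps the last word unchanged; the map[] lookup from the command above could be applied to last in the same way.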
