Sample input:
0bef-82-46-8a-9a0b.xml "Fruits/Mango Apple /Plum cherry date">1446815.ABC
0bef-82-46-8a-9a0b 5da-0-ba-c1-1a9 "Fruits/Pear Banana/Plum orange mango"
0bef-82-46-8a-9a0b ac-94-4ab-91-23 "Fruits/Pear Banana/Plum orange mango"
0bef-82-46-8a-9a0b 5z-94-ab-92-2f3 "Fruits/Pear Banana/Plum orange mango"
952f-82-46-8a-9a0b.xml "Fruits/Mango"1244115.ABC
3cff-82-46-8a-9a0b.xml "Fruits/Big Mango/Not Sweet ">905499.ABC
6m0k-82-46-8a-9a0b.xml "Fruits/Big Pear/Very Sweet">855499.ABC
17a-42-df-c24.xml "Fruits Market/Big Apple/Sweet "1483415.ABC
17a-42-df-c24 54-ba-4411-9-3d8 "Veg/Radish /Radish Carrot Celery Onion"
17a-42-df-c24 2da5-0-4a-b1-e89 "Veg/Radish /Radish Carrot Celery Onion"
17a-42-df-c24 b7-94-4db-92-2f3 "Veg/Radish /Radish Carrot Celery Onion"
17a-42-df-c24 4d-67c-446-b5-ac "Veg/Radish /Radish Carrot Celery Onion"
17a-42-df-c24 2-8b-4det-87-769 "Veg/Radish /Radish Carrot Celery Onion"
Expected output:
0bef-82-46-8a-9a0b.xml,"Fruits/Mango Apple /Plum cherry date",0bef-82-46-8a-9a0b,5da-0-ba-c1-1a9,"Fruits/Pear Banana/Plum orange mango"
0bef-82-46-8a-9a0b.xml,"Fruits/Mango Apple /Plum cherry date",0bef-82-46-8a-9a0b,ac-94-4ab-91-23,"Fruits/Pear Banana/Plum orange mango"
0bef-82-46-8a-9a0b.xml,"Fruits/Mango Apple /Plum cherry date",0bef-82-46-8a-9a0b,5z-94-ab-92-2f3,"Fruits/Pear Banana/Plum orange mango"
952f-82-46-8a-9a0b.xml,"Fruits/Mango",,
3cff-82-46-8a-9a0b.xml,"Fruits/Big Mango/Not Sweet ",,
6m0k-82-46-8a-9a0b.xml,"Fruits/Big Pear/Very Sweet",,
17a-42-df-c24.xml,"Fruits Market/Big Apple/Sweet ",17a-42-df-c24,54-ba-4411-9-3d8,"Veg/Radish /Radish Carrot Celery Onion"
17a-42-df-c24.xml,"Fruits Market/Big Apple/Sweet ",17a-42-df-c24,2da5-0-4a-b1-e89,"Veg/Radish /Radish Carrot Celery Onion"
17a-42-df-c24.xml,"Fruits Market/Big Apple/Sweet ",17a-42-df-c24,b7-94-4db-92-2f3,"Veg/Radish /Radish Carrot Celery Onion"
17a-42-df-c24.xml,"Fruits Market/Big Apple/Sweet ",17a-42-df-c24,4d-67c-446-b5-ac,"Veg/Radish /Radish Carrot Celery Onion"
17a-42-df-c24.xml,"Fruits Market/Big Apple/Sweet ",17a-42-df-c24,2-8b-4det-87-769,"Veg/Radish /Radish Carrot Celery Onion"
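About the raw input data:
- No line has leading or trailing whitespace.
- There are no blank lines between the lines. Any spacing shown is only there for readability. The final output does not need blank lines either.
- The ">" character is missing on several lines. This is not a typo.
Could you show me how to reformat this with a bash/shell script (sed, awk, etc.)? I am lost.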
Answer 1
Using awk:
awk '{
    if (sub(/\.xml /, ".xml,")) {           # replace `.xml ` with `.xml,`
        if (NR>1 && is_processed != 1) {    # xml line was not printed?
            print xml","                    # print previous xml line + `,`
        }
        sub(/>?[0-9]+\.ABC$/, ",")          # replace strings `>1446815.ABC` or `1244115.ABC` with `,`
        xml=$0                              # save line in variable `xml`
        is_processed=0                      # clear flag
    }
    else {
        if (!NF) next                       # skip empty line
        sub(/ /, ",")                       # replace 1st ` ` with `,`
        sub(/ /, ",")                       # replace 2nd ` ` with `,`
        print xml$0                         # print xml line + current line
        is_processed=1                      # set flag
    }
}
END {
    # print possible remaining line
    if (is_processed != 1) print xml","
}' filein > fileout
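The if block handles the lines that contain ".xml " and saves them in the variable xml. The else block handles the "child" lines that follow an xml line: it prints the saved xml line followed by the current line, with the first two space characters of the current line replaced by commas. Empty lines are skipped.
If an xml line has no "children", it is printed with an additional comma, either in the if block when the next xml line is read (if the line number is greater than 1) or in the END block.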
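The script relies on sub() returning the number of substitutions it made (0 or 1), so the same call both rewrites the line and tells parent lines from child lines. The following is only a minimal sketch of that idiom on shortened, made-up sample lines; the function name classify and the "parent:"/"child:" labels are hypothetical and not part of the answer's script.

```shell
# Hypothetical helper: label each line by whether sub() matched ".xml ".
classify() {
  awk '{
    if (sub(/\.xml /, ".xml,")) {      # sub() returns 1 on a parent line
      sub(/>?[0-9]+\.ABC$/, ",")       # replace the trailing number + .ABC (with or without ">") by ","
      print "parent: " $0
    } else {
      print "child:  " $0              # no ".xml " found, line left unchanged
    }
  }'
}

printf '%s\n' \
  'a1.xml "Fruits/Mango"1244115.ABC' \
  'a2.xml "Fruits/Pear">855499.ABC' \
  'a2 b7 "Veg/Radish /Radish Carrot"' | classify
```

Both parent lines should come out ending in a single comma regardless of whether the ">" was present, which is why the full script can append the child fields directly.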
Output (fileout):
0bef-82-46-8a-9a0b.xml,"Fruits/Mango Apple /Plum cherry date",0bef-82-46-8a-9a0b,5da-0-ba-c1-1a9,"Fruits/Pear Banana/Plum orange mango"
0bef-82-46-8a-9a0b.xml,"Fruits/Mango Apple /Plum cherry date",0bef-82-46-8a-9a0b,ac-94-4ab-91-23,"Fruits/Pear Banana/Plum orange mango"
0bef-82-46-8a-9a0b.xml,"Fruits/Mango Apple /Plum cherry date",0bef-82-46-8a-9a0b,5z-94-ab-92-2f3,"Fruits/Pear Banana/Plum orange mango"
952f-82-46-8a-9a0b.xml,"Fruits/Mango",,
3cff-82-46-8a-9a0b.xml,"Fruits/Big Mango/Not Sweet ",,
6m0k-82-46-8a-9a0b.xml,"Fruits/Big Pear/Very Sweet",,
17a-42-df-c24.xml,"Fruits Market/Big Apple/Sweet ",17a-42-df-c24,54-ba-4411-9-3d8,"Veg/Radish /Radish Carrot Celery Onion"
17a-42-df-c24.xml,"Fruits Market/Big Apple/Sweet ",17a-42-df-c24,2da5-0-4a-b1-e89,"Veg/Radish /Radish Carrot Celery Onion"
17a-42-df-c24.xml,"Fruits Market/Big Apple/Sweet ",17a-42-df-c24,b7-94-4db-92-2f3,"Veg/Radish /Radish Carrot Celery Onion"
17a-42-df-c24.xml,"Fruits Market/Big Apple/Sweet ",17a-42-df-c24,4d-67c-446-b5-ac,"Veg/Radish /Radish Carrot Celery Onion"
17a-42-df-c24.xml,"Fruits Market/Big Apple/Sweet ",17a-42-df-c24,2-8b-4det-87-769,"Veg/Radish /Radish Carrot Celery Onion"
Answer 2
Using Miller (https://github.com/johnkerl/miller) and sed:
<input.csv sed -r 's|^(.+")(.?[0-9]+.+)$|\1 "\2"|g' | \
mlr --csv -N --ifs " " put 'if($1=~"xml") {$4=$1;$5=$2}' \
then unsparsify \
then fill-down -f 4,5 \
then count-similar -g 4 \
then filter '($1=~"xml" && $count==1) || ($1!=~"xml" && $count>1)' \
then reorder -f 4,5,1,2,3 \
then put 'if($2=~"Fru"){$1="";$2="";$3=""}' \
then cut -x -f count
You will get:
+------------------------+--------------------------------------+--------------------+------------------+----------------------------------------+
| 0bef-82-46-8a-9a0b.xml | Fruits/Mango Apple /Plum cherry date | 0bef-82-46-8a-9a0b | 5da-0-ba-c1-1a9 | Fruits/Pear Banana/Plum orange mango |
| 0bef-82-46-8a-9a0b.xml | Fruits/Mango Apple /Plum cherry date | 0bef-82-46-8a-9a0b | ac-94-4ab-91-23 | Fruits/Pear Banana/Plum orange mango |
| 0bef-82-46-8a-9a0b.xml | Fruits/Mango Apple /Plum cherry date | 0bef-82-46-8a-9a0b | 5z-94-ab-92-2f3 | Fruits/Pear Banana/Plum orange mango |
| 952f-82-46-8a-9a0b.xml | Fruits/Mango | - | - | - |
| 3cff-82-46-8a-9a0b.xml | Fruits/Big Mango/Not Sweet | - | - | - |
| 6m0k-82-46-8a-9a0b.xml | Fruits/Big Pear/Very Sweet | - | - | - |
| 17a-42-df-c24.xml | Fruits Market/Big Apple/Sweet | 17a-42-df-c24 | 54-ba-4411-9-3d8 | Veg/Radish /Radish Carrot Celery Onion |
| 17a-42-df-c24.xml | Fruits Market/Big Apple/Sweet | 17a-42-df-c24 | 2da5-0-4a-b1-e89 | Veg/Radish /Radish Carrot Celery Onion |
| 17a-42-df-c24.xml | Fruits Market/Big Apple/Sweet | 17a-42-df-c24 | b7-94-4db-92-2f3 | Veg/Radish /Radish Carrot Celery Onion |
| 17a-42-df-c24.xml | Fruits Market/Big Apple/Sweet | 17a-42-df-c24 | 4d-67c-446-b5-ac | Veg/Radish /Radish Carrot Celery Onion |
| 17a-42-df-c24.xml | Fruits Market/Big Apple/Sweet | 17a-42-df-c24 | 2-8b-4det-87-769 | Veg/Radish /Radish Carrot Celery Onion |
+------------------------+--------------------------------------+--------------------+------------------+----------------------------------------+
Note: as input I used the CSV without the empty lines.
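To see what the sed step at the start of the Answer 2 pipeline does on its own, it can be run against one parent line and one child line from the sample. This is a small sketch: the wrapper name quote_suffix is hypothetical, and -E is used in place of -r (both select extended regular expressions in GNU sed). The trailing token of a parent line is wrapped in double quotes and separated by a space, presumably so that Miller reads it as a third space-separated field; child lines do not match the pattern and pass through unchanged.

```shell
# Hypothetical wrapper around the sed expression used in the pipeline.
quote_suffix() {
  sed -E 's|^(.+")(.?[0-9]+.+)$|\1 "\2"|'
}

printf '%s\n' \
  '952f-82-46-8a-9a0b.xml "Fruits/Mango"1244115.ABC' \
  '0bef-82-46-8a-9a0b.xml "Fruits/Mango Apple /Plum cherry date">1446815.ABC' \
  '0bef-82-46-8a-9a0b 5da-0-ba-c1-1a9 "Fruits/Pear Banana/Plum orange mango"' | quote_suffix
```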