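I'm having a bit of trouble hacking together a solution.
One column of an SQL table exported to CSV contains comma-separated strings of the following kind. Some lines hold just a plain string with no delimiters, but others look like the ones below.
I need to match the whole string with a regex and wrap it in double quotes.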
SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L2-TDD,AntennaNearUnit=1,RetSubUnit=1
SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L3-TDD,AntennaNearUnit=1,RetSubUnit=1
SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L1-TDD,AntennaNearUnit=1,RetSubUnit=1
SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L2-TDD,AntennaNearUnit=1,RetSubUnit=1
SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L3-TDD,AntennaNearUnit=1,RetSubUnit=1
This is where I'm stuck.
The sample file looks like this:
export_pl,XYSS04S,58.38,800,39544,0,6,NULL,NULL,0,0,SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L2-TDD,AntennaNearUnit=1,RetSubUnit=1,4,0,28169.8,0,NULL,0.0604592,NULL,NULL,NULL,NULL,0,NULL,0.666667,0,0,NULL,NULL,NULL,NULL,1703.12,RT,2023-10-08 00:00:00,2023-10-10 00:00:00,20:00:00;21:00:00;22:00:00;14:00:00,1,OK
export_pl,XYSS04S,58.38,250,39544,0,6,NULL,NULL,0,0,SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L3-TDD,AntennaNearUnit=1,RetSubUnit=1,4,0,28169.8,0,0,NULL,NULL,99.9459,3.88318,0,NULL,25.7521,0.666667,888,50848,43,1966,NULL,NULL,NULL,RT,2023-10-08 00:00:00,2023-10-10 00:00:00,20:00:00;21:00:00;22:00:00;14:00:00,1,OK
export_pl,XYSS04S,58.38,340,39544,0,6,NULL,NULL,0,0,SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L1-TDD,AntennaNearUnit=1,RetSubUnit=1,6,0,28169.8,0,0,NULL,NULL,100,100,0.000163336,NULL,0.0388318,1,43,1966,888,50848,NULL,NULL,NULL,RT,2023-10-08 00:00:00,2023-10-10 00:00:00,20:00:00;21:00:00;22:00:00;14:00:00,1,OK
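The end goal is the format below, that is, the whole string enclosed in double quotes: match it with a regular expression and add a double quote at each end. As a first attempt, I piped the regex match through sed: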
grep -Eo "\w{10}\=\w{12},\w{10}=\w{3}.*,\w{10}=[[:alnum:]]" export.csv | sed 's/^/"/;s/$/"/'
"SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L2-TDD,AntennaNearUnit=1,RetSubUnit=1"
"SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L3-TDD,AntennaNearUnit=1,RetSubUnit=1"
"SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L1-TDD,AntennaNearUnit=1,RetSubUnit=1"
However, when I combine grep with sed -i, the text matched by the regex is not modified in the file.
For example:
tail export.csv | sed -e 's/"\w{10}\=\w{12},\w{10}=\w{3}.*,\w{10}=\d$"/"/;s/$/"/'
export_pl,XYSS04S,58.38,800,39544,0,6,NULL,NULL,0,0,SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L2-TDD,AntennaNearUnit=1,RetSubUnit=1,4,0,28169.8,0,NULL,0.0604592,NULL,NULL,NULL,NULL,0,NULL,0.666667,0,0,NULL,NULL,NULL,NULL,1703.12,RT,2023-10-08 00:00:00,2023-10-10 00:00:00,20:00:00;21:00:00;22:00:00;14:00:00,1,OK"
export_pl,XYSS04S,58.38,250,39544,0,6,NULL,NULL,0,0,SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L3-TDD,AntennaNearUnit=1,RetSubUnit=1,4,0,28169.8,0,0,NULL,NULL,99.9459,3.88318,0,NULL,25.7521,0.666667,888,50848,43,1966,NULL,NULL,NULL,RT,2023-10-08 00:00:00,2023-10-10 00:00:00,20:00:00;21:00:00;22:00:00;14:00:00,1,OK"
export_pl,XYSS04S,58.38,340,39544,0,6,NULL,NULL,0,0,SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L1-TDD,AntennaNearUnit=1,RetSubUnit=1,6,0,28169.8,0,0,NULL,NULL,100,100,0.000163336,NULL,0.0388318,1,43,1966,888,50848,NULL,NULL,NULL,RT,2023-10-08 00:00:00,2023-10-10 00:00:00,20:00:00;21:00:00;22:00:00;14:00:00,1,OK"
It only adds a " at the end of the last column.
Answer 1
With perl, the idea is to put quotes around sequences of ,-delimited fields that each match \w+=\w+:
$ perl -lpe 's/(?<![^,])(\w+=\w+)(,(?1))*(?![^,])/"$&"/g' your-file
export_pl,XYSS04S,58.38,800,39544,0,6,NULL,NULL,0,0,"SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1",AntennaUnitGroup=L2-TDD,"AntennaNearUnit=1,RetSubUnit=1",4,0,28169.8,0,NULL,0.0604592,NULL,NULL,NULL,NULL,0,NULL,0.666667,0,0,NULL,NULL,NULL,NULL,1703.12,RT,2023-10-08 00:00:00,2023-10-10 00:00:00,20:00:00;21:00:00;22:00:00;14:00:00,1,OK
export_pl,XYSS04S,58.38,250,39544,0,6,NULL,NULL,0,0,"SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1",AntennaUnitGroup=L3-TDD,"AntennaNearUnit=1,RetSubUnit=1",4,0,28169.8,0,0,NULL,NULL,99.9459,3.88318,0,NULL,25.7521,0.666667,888,50848,43,1966,NULL,NULL,NULL,RT,2023-10-08 00:00:00,2023-10-10 00:00:00,20:00:00;21:00:00;22:00:00;14:00:00,1,OK
export_pl,XYSS04S,58.38,340,39544,0,6,NULL,NULL,0,0,"SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1",AntennaUnitGroup=L1-TDD,"AntennaNearUnit=1,RetSubUnit=1",6,0,28169.8,0,0,NULL,NULL,100,100,0.000163336,NULL,0.0388318,1,43,1966,888,50848,NULL,NULL,NULL,RT,2023-10-08 00:00:00,2023-10-10 00:00:00,20:00:00;21:00:00;22:00:00;14:00:00,1,OK
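Note that AntennaUnitGroup=L2-TDD is left outside the quotes because - is not a \w word character. Replace \w with [\w-] to also match -, or with [^,"=] to match any character other than , " and = (which therefore includes -).
Add the -i option to edit the file in place.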
Answer 2
Using awk:
$ awk 'BEGIN{FS=OFS=","} {$12="\""$12; $19=$19"\""} 1' file
export_pl,XYSS04S,58.38,800,39544,0,6,NULL,NULL,0,0,"SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L2-TDD,AntennaNearUnit=1,RetSubUnit=1",4,0,28169.8,0,NULL,0.0604592,NULL,NULL,NULL,NULL,0,NULL,0.666667,0,0,NULL,NULL,NULL,NULL,1703.12,RT,2023-10-08 00:00:00,2023-10-10 00:00:00,20:00:00;21:00:00;22:00:00;14:00:00,1,OK
export_pl,XYSS04S,58.38,250,39544,0,6,NULL,NULL,0,0,"SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L3-TDD,AntennaNearUnit=1,RetSubUnit=1",4,0,28169.8,0,0,NULL,NULL,99.9459,3.88318,0,NULL,25.7521,0.666667,888,50848,43,1966,NULL,NULL,NULL,RT,2023-10-08 00:00:00,2023-10-10 00:00:00,20:00:00;21:00:00;22:00:00;14:00:00,1,OK
export_pl,XYSS04S,58.38,340,39544,0,6,NULL,NULL,0,0,"SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L1-TDD,AntennaNearUnit=1,RetSubUnit=1",6,0,28169.8,0,0,NULL,NULL,100,100,0.000163336,NULL,0.0388318,1,43,1966,888,50848,NULL,NULL,NULL,RT,2023-10-08 00:00:00,2023-10-10 00:00:00,20:00:00;21:00:00;22:00:00;14:00:00,1,OK
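The awk command above relies on the key=value group always occupying fields 12 to 19. If that position can vary, one possible variation is to locate the first and last fields containing = instead. This is only a sketch: it assumes every field containing = belongs to a single contiguous group, and the function name quote_kv_span is hypothetical.

```shell
# Hypothetical helper: quote the span from the first to the last
# comma-separated field that contains "=".
# Assumption: all such fields form one contiguous group on each line.
quote_kv_span() {
  awk 'BEGIN { FS = OFS = "," }
  {
    lo = 0; hi = 0
    for (i = 1; i <= NF; i++)
      if ($i ~ /=/) { if (!lo) lo = i; hi = i }
    if (lo) { $lo = "\"" $lo; $hi = $hi "\"" }
    print
  }'
}

# Shortened sample line (not the full export), for illustration only.
printf '%s\n' 'export_pl,0,SubNetwork=XYZ,AntennaUnitGroup=L2-TDD,RetSubUnit=1,4,OK' | quote_kv_span
# prints: export_pl,0,"SubNetwork=XYZ,AntennaUnitGroup=L2-TDD,RetSubUnit=1",4,OK
```

Lines without any = are printed as they are, so rows holding only a plain string are left alone.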
Or, with a sed that supports EREs through -E, such as GNU or BSD sed:
$ sed -E 's/(([^,]*,){11})(([^,]*,){7}[^,]*)/\1"\3"/' file
export_pl,XYSS04S,58.38,800,39544,0,6,NULL,NULL,0,0,"SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L2-TDD,AntennaNearUnit=1,RetSubUnit=1",4,0,28169.8,0,NULL,0.0604592,NULL,NULL,NULL,NULL,0,NULL,0.666667,0,0,NULL,NULL,NULL,NULL,1703.12,RT,2023-10-08 00:00:00,2023-10-10 00:00:00,20:00:00;21:00:00;22:00:00;14:00:00,1,OK
export_pl,XYSS04S,58.38,250,39544,0,6,NULL,NULL,0,0,"SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L3-TDD,AntennaNearUnit=1,RetSubUnit=1",4,0,28169.8,0,0,NULL,NULL,99.9459,3.88318,0,NULL,25.7521,0.666667,888,50848,43,1966,NULL,NULL,NULL,RT,2023-10-08 00:00:00,2023-10-10 00:00:00,20:00:00;21:00:00;22:00:00;14:00:00,1,OK
export_pl,XYSS04S,58.38,340,39544,0,6,NULL,NULL,0,0,"SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L1-TDD,AntennaNearUnit=1,RetSubUnit=1",6,0,28169.8,0,0,NULL,NULL,100,100,0.000163336,NULL,0.0388318,1,43,1966,888,50848,NULL,NULL,NULL,RT,2023-10-08 00:00:00,2023-10-10 00:00:00,20:00:00;21:00:00;22:00:00;14:00:00,1,OK
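Since the question asks for the file itself to be modified, here is a minimal sketch of applying the same substitution in place. It assumes a sed that accepts both -E and -i with an attached backup suffix, as GNU and BSD sed do. The function name quote_cols_in_place and the scratch file are hypothetical.

```shell
# Hypothetical helper: quote fields 12 to 19 of the given file in place.
# -i.bak keeps the original as FILE.bak; attaching the suffix to -i is the
# form that both GNU and BSD sed accept.
quote_cols_in_place() {
  sed -E -i.bak 's/(([^,]*,){11})(([^,]*,){7}[^,]*)/\1"\3"/' "$1"
}

# Scratch file standing in for export.csv, with a shortened tail.
tmp=$(mktemp)
printf '%s\n' 'export_pl,XYSS04S,58.38,800,39544,0,6,NULL,NULL,0,0,SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L2-TDD,AntennaNearUnit=1,RetSubUnit=1,4,0,OK' > "$tmp"

quote_cols_in_place "$tmp"
cat "$tmp"
# prints: export_pl,XYSS04S,58.38,800,39544,0,6,NULL,NULL,0,0,"SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,ManagedElement=XYSS04S,Equipment=1,AntennaUnitGroup=L2-TDD,AntennaNearUnit=1,RetSubUnit=1",4,0,OK

rm -f "$tmp" "$tmp.bak"
```

Keeping the .bak copy is a deliberate choice here: the substitution is positional, so a row with a different column count would be quoted in the wrong place and the backup makes that easy to undo.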
Answer 3
Using awk:
$ awk '{ sub(/[_[:alpha:]]{10}=[_[:alpha:]]{12},[_[:alpha:]]{10}=[_[:alpha:]]{3}.*,[_[:alpha:]]{10}=[[:alnum:]]+/, "\"&\"")}1'
# Using GNU awk
$ awk '{sub(/\w{10}=\w{12},\w{10}=\w{3}.*,\w{10}=[[:alnum:]]{1,}/, "\"&\"" ) }1'
Or try GNU awk's gensub() function:
$ awk '{print gensub(/(^.*)(\w{10}=\w{12},\w{10}=\w{3}.*,\w{10}=[[:alnum:]]{1,})(.*$)/, "\\1\"\\2\"\\3", "g"); }'
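For comparison with the sed attempt in the question, the same regex can probably be used with sed -E, where & in the replacement stands for the matched text. The original attempt likely failed because, without -E, {10} is not treated as a repetition count, \d is not a digit class in sed, and the double quotes inside the pattern were matched literally. The function name quote_by_regex and the shortened sample line are assumptions for illustration.

```shell
# Hypothetical helper: same pattern as the portable awk command above,
# applied with sed -E. POSIX classes replace \w and \d; & is the match.
quote_by_regex() {
  sed -E 's/[_[:alpha:]]{10}=[_[:alpha:]]{12},[_[:alpha:]]{10}=[_[:alpha:]]{3}.*,[_[:alpha:]]{10}=[[:alnum:]]+/"&"/'
}

# Shortened sample line (not the full export), for illustration only.
printf '%s\n' 'export_pl,0,0,SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,AntennaUnitGroup=L2-TDD,RetSubUnit=1,4,0,OK' | quote_by_regex
# prints: export_pl,0,0,"SubNetwork=ONRM_ROOT_AB,SubNetwork=XYZ,MeContext=XYSS04S,AntennaUnitGroup=L2-TDD,RetSubUnit=1",4,0,OK
```

Like the awk versions, this depends on the exact name lengths in the sample data (10, 12 and 3 characters), so it is tied to this particular export.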