awk: loop over one or more matches in a column

Answer 1

Using GNU awk for FPAT (and, since gawk is already required, also gensub() and the \s shorthand for [[:space:]]):

$ cat tst.awk
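BEGIN {
    # A field is either a run of non-commas or a double-quoted string
    FPAT = "([^,]*)|(\"[^\"]+\")"
    OFS=","
}
{
    # Strip the surrounding double quotes, if any, from the name
    name = gensub(/^"|"$/,"","g",$1)
    # Strip the quotes from the second field, then split it on ; , | or :
    n = split(gensub(/^"|"$/,"","g",$2),emails,/\s*[;,|:]\s*/)
    for (i=1; i<=n; i++) {
        print name, emails[i]
    }
}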
$
$ awk -f tst.awk file
agrippa,[email protected]
elvirka,[email protected]
Inofs,[email protected]
Inofs,[email protected]
bekbz,[email protected]
bekbz,[email protected]
njkzif,[email protected]
njkzif,[email protected]
njycz,[email protected]
njycz,[email protected]
DanielEdict,[email protected]
JosEmbesy,[email protected]
JosEmbesy,[email protected]
Walterdon,[email protected]
Walterdon,[email protected]
Kennethlob,[email protected]
Ninosh,[email protected]
Patrickbam,[email protected]

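FWIW I generally use this *sub(/^"|"$/,"",...) approach to strip possible leading/trailing double quotes from CSV fields, since it has the advantage over a substr() approach of not breaking fields that have no double quotes.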
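
To illustrate the difference, here is a minimal sketch. The function name strip_quotes_demo and the sample input are made up for this example, and it uses plain gsub() rather than gensub() so that it runs in any POSIX awk. Each input line is stripped both ways and the two results are printed side by side:

```shell
# Hypothetical demo: strip surrounding double quotes from each input line
# in two ways and print "regex result|substr result".
strip_quotes_demo() {
    awk '
    {
        a = $0
        gsub(/^"|"$/, "", a)                 # removes the quotes only if they are present
        b = substr($0, 2, length($0) - 2)    # blindly drops the first and last character
        print a "|" b
    }'
}

printf '%s\n' '"foo bar"' 'foo' | strip_quotes_demo
```

For the quoted line both approaches give the same result, but for the unquoted line the substr() version chops real characters off both ends while the regexp version leaves the field alone.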

You might also want to add some error detection in case an email address is corrupted or there is something you forgot to handle (e.g. a separator that is not in [;,|:]):

$ cat tst.awk
BEGIN {
    FPAT = "([^,]*)|(\"[^\"]+\")"
    OFS=","
}
{
    name = gensub(/^"|"$/,"","g",$1)
    n = split(gensub(/^"|"$/,"","g",$2),emails,/\s*[;,|:]\s*/)
    for (i=1; i<=n; i++) {
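        email = emails[i]
        # gsub() returns the number of matches; replacing with "&" leaves
        # email unchanged, so this just counts the "@" characters
        if ( gsub(/@/,"&",email) != 1 ) {
            printf "ERROR: too few or too many email addresses in \"%s\"\n", email | "cat>&2"
            exit 1
        }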
        print name, email
    }
}
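
The check above works because gsub() returns the number of substitutions it made, and a replacement of "&" puts the matched text straight back, so the string is unchanged and the call acts purely as a counter. A small sketch of just that idiom (the function name count_at and the sample strings are hypothetical):

```shell
# Hypothetical demo: print the number of "@" characters on each input line,
# followed by the (unchanged) line itself.
count_at() {
    awk '
    {
        s = $0
        n = gsub(/@/, "&", s)    # "&" re-inserts each match, so s is not modified
        print n, s
    }'
}

printf '%s\n' 'user@host' 'userhost' 'user@host@extra' | count_at
```

Anything other than a count of exactly 1 is what the script above treats as an error.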

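If you really want to validate the email addresses then, FWIW, I know that a modified version of the regexp from http://www.regular-expressions.info/email.html has been in use for the past 5 years without problems. (I specifically used [a-zA-Z] rather than [:alpha:] because I only want to accept the letters that range covers in my locale; which characters are right for your application is for you to decide.)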

    (email ~ /^[0-9a-zA-Z._%+-]+@[0-9a-zA-Z.-]+\.[a-zA-Z]{2,}$/)
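
As a rough sketch of how that test might be applied, the following labels each input line as ok or bad. The function name check_emails and the sample addresses are made up for illustration. One assumption to flag: the {2,} interval is written out here as [a-zA-Z][a-zA-Z]+, which matches the same strings, because some older awk implementations do not support interval expressions.

```shell
# Hypothetical demo: label each input line "ok" or "bad" using the regexp above,
# with {2,} spelled out as [a-zA-Z][a-zA-Z]+ for portability.
check_emails() {
    awk '
    {
        ok = ($0 ~ /^[0-9a-zA-Z._%+-]+@[0-9a-zA-Z.-]+\.[a-zA-Z][a-zA-Z]+$/)
        status = ok ? "ok" : "bad"
        print status ": " $0
    }'
}

printf '%s\n' 'jane@example.com' 'jane@example' 'jane.example.com' | check_emails
```

In the script above, the same condition could replace or supplement the "@" count before printing each address.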

Answer 2

I don't quite follow the parenthetical remarks about 15+ columns and 7 columns, but for the given example, try:

awk -F, '
        {gsub (/[" ]/,_)                        # remove double quotes and space all over
         D1 = $1                                # save field 1 and
         sub ($1 FS, _)                         # remove it from line
         n  = split ($0, T, /[,;:\|]/)          # split the residual line into array T
         for (i=1; i<=n; i++) print D1, T[i]    # print former $1, and each T element
        }
' OFS=, file
agrippa,[email protected]
elvirka,[email protected]
Inofs,[email protected]
Inofs,[email protected]
.
.
.
Patrickbam,[email protected]
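
One thing to be aware of: sub ($1 FS, _) uses the contents of $1 as a regular expression, so a name containing regexp metacharacters such as + may not be removed as intended. A possible variation, offered only as a sketch (the function name split_emails, the variable names, and the sample line are made up), skips past the name by length instead:

```shell
# Hypothetical variation: same idea as above, but the name is skipped with
# substr() so it is never interpreted as a regular expression.
split_emails() {
    awk -F, -v OFS=, '
    {
        gsub(/[" ]/, "")                       # remove double quotes and spaces everywhere
        name = $1                              # save field 1
        rest = substr($0, length(name) + 2)    # everything after the name and its comma
        n = split(rest, T, /[,;:|]/)           # split the remainder into array T
        for (i = 1; i <= n; i++) print name, T[i]
    }'
}

printf '%s\n' '"a+b","one@example.com; two@example.org"' | split_emails
```

With the sample line, the name a+b is printed once per address, with no extra line for the name itself.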

