Creating variables from a CSV with a varying number of fields

Answer 1

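This script simply reads each line into the default variable $REPLY. It then replaces the commas with spaces, ${REPLY//,/ }, and puts the result into an array, declare -a COL=(). The section part is then handled in a loop, where the column index is computed as $((idx+2)).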

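#!/bin/bash
# Read each CSV line from standard input into the default variable REPLY.
while IFS= read -r; do
    # Replace commas with spaces and let word splitting fill the array.
    # This assumes fields contain no whitespace or glob characters.
    declare -a COL=( ${REPLY//,/ } )
    echo -e "container=${COL[0]}\nrow=${COL[1]}\nshelf=${COL[2]}"
    idx=1
    while [ "$idx" -lt 10 ]; do
        # Section N lives in column N+2.
        echo "section$idx=${COL[$((idx+2))]}"
        idx=$((idx+1))
    done
done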
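
Because the number of section fields differs from row to row, a fixed loop bound prints empty section variables for the shorter rows. The following is a minimal sketch of one alternative, assuming simple comma-separated input with no quoted fields: split the line with IFS=, read -ra and loop only as far as the row actually goes. The function name print_record is hypothetical; the three leading names follow the script above.

```bash
#!/bin/bash
# print_record is a hypothetical helper: it takes one CSV line and prints
# name=value pairs, emitting only as many sectionN entries as the line has.
print_record() {
    local -a col
    local idx
    # Split on commas only; -r keeps backslashes literal.
    IFS=, read -ra col <<< "$1"
    printf 'container=%s\nrow=%s\nshelf=%s\n' "${col[0]}" "${col[1]}" "${col[2]}"
    # Fields from index 3 onward are the sections.
    for (( idx = 3; idx < ${#col[@]}; idx++ )); do
        printf 'section%d=%s\n' "$((idx - 2))" "${col[idx]}"
    done
}

# Usage: feed every line of a file (or stdin) through the helper.
# while IFS= read -r line; do print_record "$line"; done < input.csv
```

Splitting with read -a rather than unquoted expansion also avoids accidental glob expansion when a field contains * or ?.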

Answer 2

I would use an associative array for each CSV record. Assume the data is in a file called input.csv.

#!/usr/bin/env bash

counter=1          # provides index for each csv record
# keys assigned, in order, to the fields of each row
keys=( row shelf section1 section2 section3 section4 section5 section6 )

while IFS=, read -r -a a               # numeric array containing current row
do
    declare -A "row$counter"           # declare an assoc. array representing
                                       # this row
    declare -n rec="row$counter"       # nameref instead of eval (bash 4.3+)

    for i in "${!keys[@]}"; do
        rec[${keys[i]}]=${a[i]}        # missing fields become empty strings
    done

    declare -p "row$counter"

    unset -n rec                       # drop the nameref, keep the array
    (( counter = counter + 1 ))
done < input.csv

# access arbitrary element
printf '\n---------\n%s\n' "${row3[section4]}"

This gives me the following output:

declare -A row1='([section6]="6" [section5]="5" [section4]="4" [section3]="4" [section2]="2" [section1]="1" [shelf]="12" [row]="PL3" )'
declare -A row2='([section6]="" [section5]="" [section4]="" [section3]="2" [section2]="1" [section1]="4" [shelf]="13" [row]="PL4" )'
declare -A row3='([section6]="" [section5]="" [section4]="3" [section3]="2" [section2]="1" [section1]="5" [shelf]="14" [row]="PL5" )'
declare -A row4='([section6]="5" [section5]="4" [section4]="3" [section3]="2" [section2]="1" [section1]="6" [shelf]="15" [row]="PL6" )'
declare -A row5='([section6]="5" [section5]="4" [section4]="3" [section3]="2" [section2]="1" [section1]="7" [shelf]="16" [row]="PL7" )'
declare -A row6='([section6]="5" [section5]="4" [section4]="3" [section3]="2" [section2]="1" [section1]="8" [shelf]="15" [row]="PL8" )'
declare -A row7='([section6]="5" [section5]="4" [section4]="3" [section3]="2" [section2]="1" [section1]="7" [shelf]="16" [row]="PL9" )'

---------
3
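
Since the arrays are named row1, row2, and so on, looking one up by a computed record number needs some form of indirection. The following is a sketch using a nameref (declare -n / local -n, which requires bash 4.3 or later). The helper name get_cell and the sample values are made up for illustration and are not taken from the data above.

```bash
#!/usr/bin/env bash
# Sample data standing in for the arrays built by the loop above
# (the values here are illustrative only).
declare -A row1=( [row]=PL3 [shelf]=12 [section1]=1 )
declare -A row2=( [row]=PL4 [shelf]=13 [section1]=4 )

# get_cell is a hypothetical helper: get_cell <record number> <key>
get_cell() {
    local -n rec="row$1"    # nameref to row<N>; needs bash >= 4.3
    printf '%s\n' "${rec[$2]}"
}

get_cell 2 shelf
```

A nameref keeps the lookup free of eval, so field contents are never re-parsed as shell code.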

Answer 3

I would start with something like this:

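while IFS=, read -ra fields; do
    # drop trailing empty fields, working back from the last element
    for (( i = ${#fields[@]} - 1; i >= 0; i-- )); do
        if [[ -z ${fields[i]} ]]; then
            unset 'fields[i]'      # quoted so the shell cannot glob it
        else
            break
        fi
    done
    declare -p fields
done < file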

declare -a fields='([0]="PL3" [1]="12" [2]="3" [3]="1" [4]="2" [5]="3" [6]="4" [7]="5" [8]="6")'
declare -a fields='([0]="PL4" [1]="13" [2]="4" [3]="1" [4]="2")'
declare -a fields='([0]="PL5" [1]="14" [2]="5" [3]="1" [4]="2" [5]="3")'
declare -a fields='([0]="PL6" [1]="15" [2]="6" [3]="1" [4]="2" [5]="3" [6]="4" [7]="5" [8]="6" [9]="7" [10]="8")'
declare -a fields='([0]="PL7" [1]="16" [2]="7" [3]="1" [4]="2" [5]="3" [6]="4" [7]="5" [8]="6" [9]="7" [10]="8" [11]="9")'
declare -a fields='([0]="PL8" [1]="15" [2]="8" [3]="1" [4]="2" [5]="3" [6]="4" [7]="5" [8]="6" [9]="7" [10]="8")'
declare -a fields='([0]="PL9" [1]="16" [2]="7" [3]="1" [4]="2" [5]="3" [6]="4" [7]="5" [8]="6" [9]="7" [10]="8" [11]="9")'

Make sure the file has no trailing whitespace.

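I question whether you really need numerically incrementing variable names. It sounds like you need a 2D array, which is a data structure bash does not have. Are you sure bash is the right tool for the job?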
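
If staying in bash is a requirement, one common workaround is to emulate a 2D array with a single associative array whose keys are "row,column" strings. The following is a rough sketch under that assumption; the names grid, width and load_grid are hypothetical, and the two input lines are sample data.

```bash
#!/usr/bin/env bash
# Emulate a 2D array with one associative array keyed by "row,column".
declare -A grid      # cell values
declare -a width     # number of fields in each row

# load_grid is a hypothetical helper that reads CSV lines from stdin.
load_grid() {
    local -a fields
    local r=0 c
    while IFS=, read -ra fields; do
        width[r]=${#fields[@]}
        for c in "${!fields[@]}"; do
            grid[$r,$c]=${fields[c]}
        done
        (( r += 1 ))
    done
}

load_grid <<'EOF'
PL4,13,4,1,2
PL5,14,5,1,2,3
EOF

echo "${grid[1,5]}"   # row 1, column 5
echo "${width[0]}"    # field count of row 0
```

Storing the per-row field count separately makes it possible to iterate over a row without guessing how many sections it has.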

Answer 4

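Assuming the data is in "simple" CSV format (no fields that need special CSV quoting), headerless CSV data can be converted into a structured JSON file fairly easily. The following code creates a set of separate JSON objects, one for each input line of the CSV file.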

$ jq -R 'split(",") | {container:.[0], typeA:.[1], typeB:.[2], typeC:.[3:]}' file.csv
{
  "container": "PL3",
  "typeA": "12.1.4.5-77",
  "typeB": "13.6.4.5-20",
  "typeC": [
    "17.3.577.9-29",
    "17.3.779.12-33",
    "17.3.802.12-60",
    "17.3.917.12-45",
    "17.3.956.12-63",
    "17.3.993.12-42"
  ]
}
{
  "container": "PL4",
  "typeA": "12.1.4.5-78",
  "typeB": "13.6.4.5-21",
  "typeC": [
    "17.3.577.9-30",
    "17.3.779.12-34"
  ]
}
[...] # output truncated for brevity

Assuming this JSON data is stored in file.json, it can be queried in various ways:

$ jq -r --arg container PL7 --arg type typeA 'select(.container==$container)[$type]' file.json
12.1.4.5-81

$ jq -r --arg container PL8 --arg type typeB 'select(.container==$container)[$type]' file.json
13.6.4.5-25

$ jq -r --arg container PL6 --arg type typeC 'select(.container==$container)[$type][]' file.json
17.3.577.9-32
17.3.779.12-36
17.3.802.12-63
17.3.917.12-48
17.3.956.12-66

(I added [] to the end of the expression above to expand the array into separate elements.)

$ jq -r --arg container PL3 --arg type typeC --arg sub 60 'select(.container==$container)[$type][] | select(endswith("-"+$sub))' file.json
17.3.802.12-60
