Extracting up to k elements from each line of a file

Question 1

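Higher-level languages let you use arrays of arrays, but bash has no such feature. Problems like this one, which involve multi-level data structures, are often very tedious to solve in the shell.

But since the goal is to learn Unix text processing, not Python, let's solve it in the shell.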
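As an aside, one common workaround for the missing arrays of arrays is a single associative array with a composite "row,col" key. The sketch below is only an illustration of that idea, not the technique used in this answer; it assumes bash 4 or later (for declare -A), and the names table, count, and add_cell are hypothetical.

```shell
#!/bin/bash
# Emulate table[row][col] with one associative array keyed by "row,col".
# Assumption: bash 4 or later, since declare -A is not available before that.
declare -A table    # cell values, keyed by "row,col"
declare -A count    # number of elements stored so far in each row

# Hypothetical helper: append a value to the end of a row.
add_cell() {
    local row=$1 value=$2
    count[$row]=$(( ${count[$row]:-0} + 1 ))
    table[$row,${count[$row]}]=$value
}

add_cell A 1
add_cell A 5
add_cell B 2

echo "${table[A,2]}"    # second element of row A
echo "${count[A]}"      # number of elements in row A
```

The composite key is just a string, so nothing stops two different rows from colliding if a row name itself contains a comma; pick a separator that cannot occur in the data.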

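This solution reads the file once to get the row headers, then rereads it as many times as needed to collect the required number of elements. We maintain two arrays: outrow, an array of output lines to which each row's elements are appended as we go, and cursor, an array of integers that stores our position in each row.

If there are not enough elements to satisfy the request, this script will loop forever. Fixing that is left as an exercise for the reader.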

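#!/bin/bash
# Usage: script k   (k = total number of elements to extract)
k=$1
input=input.txt
declare -a outrow   # output line for each row
declare -a cursor   # index of the next unread element in each row
K=0                 # number of elements collected so far
n=0

# First pass: start each output line with its row header.
while read -r line
do
    outrow[$n]=${line%% *}
    cursor[$n]=1
    (( n++ ))
done < "$input"

# Reread the file, taking one more element from each row per pass,
# until k elements have been collected.
while [[ $K -lt $k ]]
do
    n=0
    while read -r line
    do
        read -r -a col <<< "$line"   # split the line into words
        if [[ ${#col[@]} -gt ${cursor[$n]} ]]
        then
            outrow[$n]+=" ${col[ ${cursor[$n]} ]}"
            (( cursor[$n]++ ))
            (( K++ ))
            [[ $K -lt $k ]] || break
        fi
        (( n++ ))
    done < "$input"
done

for row in "${outrow[@]}"
do
    echo "$row"
done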
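One possible way to approach the exercise of avoiding the infinite loop is to track whether a full pass over the file added anything, and to stop when it did not. The following is a minimal sketch under that assumption, not the author's solution; the function name take_k and the progress flag are hypothetical, and the file is passed as an argument instead of being hard-coded.

```shell
#!/bin/bash
# Hypothetical helper: print up to k elements, taken round-robin across rows.
# Usage: take_k FILE K
take_k() {
    local input=$1 k=$2
    local -a outrow cursor col
    local K=0 n=0 progress line row

    # First pass: remember each row header.
    while read -r line
    do
        outrow[n]=${line%% *}
        cursor[n]=1
        (( n++ ))
    done < "$input"

    while (( K < k ))
    do
        n=0
        progress=0      # set to 1 as soon as this pass adds an element
        while read -r line
        do
            read -r -a col <<< "$line"
            if (( ${#col[@]} > cursor[n] ))
            then
                outrow[n]+=" ${col[cursor[n]]}"
                (( cursor[n]++ ))
                (( K++ ))
                progress=1
                (( K < k )) || break
            fi
            (( n++ ))
        done < "$input"
        # A full pass that added nothing means every row is exhausted.
        (( progress )) || break
    done

    for row in "${outrow[@]}"
    do
        echo "$row"
    done
}
```

Called as take_k input.txt 5, it behaves like the script above; called with a k larger than the number of available elements, it prints every element and then stops instead of looping.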

Question 2

Note: you can adjust the number of elements by changing the num variable.

gawk -v num=5 '
BEGIN {
    PROCINFO["sorted_in"] = "@ind_str_asc"
}
{
    ###
    # Traverse input.txt from the first line to the last
    # and store all elements in the two-dimensional array "table".
    # Along the way, maintain an array of counters for each letter.
    ###

    # The array of counters for each unique element of the first column.
    # In our case the array indexes are capital letters (A, B, C, D)
    # and the values are the number of occurrences of each letter.
    cnt_arr[$1]++

    # Two-dimensional array "table".
    # It looks like a chessboard: rows are named by letters (A, B, C, D)
    # and columns are named by numbers (1, 2, 3, 4, 5, etc.).
    # Its cells contain the numbers from the second column.
    # For example, if the letter A occurs 5 times in input.txt,
    # then the table will have an A row with 5 columns.
    table[$1][cnt_arr[$1]] = $2
}
# At this point, all lines from input.txt are processed
# and stored in the table
END {
    # Run the needed number of iterations, specified by the num variable.
    # Columns are numbered from 1, so start at 1.
    for(i = 1; i <= num; i++) {

        # On each iteration run the inner loop,
        # which iterates through all rows in the table.
        for(row_name in table) {

            # Check each cell: if it exists,
            # append its value to result_arr[row_name], separated by OFS.
            # OFS is the output field separator, a space by default.
            if(i in table[row_name]) {
                result_arr[row_name] = result_arr[row_name] OFS table[row_name][i]
                # and count the number of successful occurrences
                cnt++
            }

            # If the count of collected cells equals the num variable
            # or reaches NR (the number of records, i.e. lines),
            # print result_arr and exit.
            if(cnt == num || cnt >= NR) {
                for(i in result_arr) {
                    print i result_arr[i]
                }
                exit
            }
        }
    }
}' input.txt

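For information about the PROCINFO["sorted_in"] = "@ind_str_asc" line, see the gawk manual; it makes for (... in ...) loops visit array indices in ascending string order.

Input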
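The effect of that setting can be seen in isolation with a small sketch like the one below. It assumes gawk is installed (PROCINFO["sorted_in"] is a gawk extension); the function name sorted_demo and the array name seen are made up for illustration.

```shell
#!/bin/bash
# Hypothetical demo function: read "key value" lines on stdin and
# print them back ordered by key. Assumes gawk is available.
sorted_demo() {
    gawk '
    BEGIN { PROCINFO["sorted_in"] = "@ind_str_asc" }
    { seen[$1] = $2 }
    END { for (key in seen) print key, seen[key] }'
}

# The keys arrive as C, A, B but are printed as A, B, C.
printf 'C 9\nA 1\nB 2\n' | sorted_demo
```

Without the BEGIN block, the order in which for (key in seen) visits the keys is unspecified, which is why the script above sets it before printing rows by letter.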

A 1
B 2
C 9
D 1
A 5
B 3
C 9
A 6
C 7
A 5
C 1

Output

A 1 5
B 2
C 9
D 1
