Unix - Sed command question

Question

If you want to work with columns within a line rather than with whole lines, awk or perl is a better tool than sed.

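And because you need to handle quoted fields (which contain commas), it is better to use perl with the Text::CSV module, which parses CSV files like this one. You could do this with awk, but you would have to write your own parser to handle quotes and commas inside fields.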
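To see why splitting on commas is not enough, here is a minimal sketch that counts fields the naive way with awk -F,. The helper name naive_field_count is made up for illustration, and the record is an invented four-field sample whose address field contains commas.

```shell
#!/bin/sh
# naive_field_count is a hypothetical helper used only for this illustration.
# It splits on every comma and ignores quoting, which is what a plain
# sed/awk/cut approach would do.
naive_field_count() {
  printf '%s\n' "$1" | awk -F, '{ print NF }'
}

# An invented record with 4 real fields; the third contains two commas.
line='"BCD Co.","sells some items","123 Somewhere Street, Somewhere, V1234",2021'

naive_field_count "$line"   # prints 6, not 4
```

The quoted address is broken into three pieces, so any column arithmetic based on the comma count ends up pointing at the wrong field. A real CSV parser avoids this.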

If you are running Debian or something similar, run apt install libtext-csv-perl. Other distributions probably have it packaged too. Otherwise, install it with cpan.
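One quick way to check whether the module is already available is to ask perl to load it and do nothing else. This is only a sketch: have_perl_module is a hypothetical helper name, and it assumes perl itself is on your PATH.

```shell
#!/bin/sh
# have_perl_module is a hypothetical helper, not part of any package.
# perl -MName -e 1 loads the module and runs an empty program;
# exit status 0 means the module could be loaded.
have_perl_module() {
  perl -M"$1" -e 1 >/dev/null 2>&1
}

if have_perl_module Text::CSV; then
  echo "Text::CSV is installed"
else
  echo "Text::CSV is missing; install libtext-csv-perl or use cpan"
fi
```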

Here is a very simple example of what you can do with Text::CSV. Run man Text::CSV for details.

#!/usr/bin/perl

use strict;
use warnings;
use Text::CSV qw(csv);

my ($filename, $search, $year) = @ARGV;

my $csv = Text::CSV->new({allow_whitespace => 1,
                          allow_loose_quotes => 1,
                          quote_space => 0,
                         });

open(my $in, "<", $filename) or die "couldn't open $filename: $!";

my @headers = $csv->header($in);
pop @headers;                   # discard last field from @headers
$csv->say(*STDOUT, \@headers);  # print the headers

while (my $row = $csv->getline($in)) {

  # note: perl arrays start from zero, not one. So $row->[0] is
  # the first field.  $row->[3] is the fourth.

  if ($row->[0] =~ m/$search/i && $row->[3] == $year) {
    pop @{ $row };  # discard last field (year)
    $csv->say(*STDOUT, $row);
  }

}
close($in);

Save it as, for example, extract.pl and make it executable with chmod +x extract.pl, just as you would a shell script.

You did not provide any example input or output in your question, so I had to make up some nonsense data.

Given the following input file, input.csv:

business,description,address,year
"ABC","sells some items","123 Somewhere Street, Somewhere, V1234",2020
"BCD Co.","sells some items","123 Somewhere Street, Somewhere, V1234",2021
"BBB Pty Ltd","sells some items","123 Somewhere Street, Somewhere, V1234",2020
"BXYZ","sells some items","123 Somewhere Street, Somewhere, V1234",2021
"CDE","sells some items","123 Somewhere Street, Somewhere, V1234",2020
"DEF","sells some items","123 Somewhere Street, Somewhere, V1234",2020

it produces output like this:

$ ./extract.pl input.csv '^b' 2021
business,description,address
BCD Co.,sells some items,"123 Somewhere Street, Somewhere, V1234"
BXYZ,sells some items,"123 Somewhere Street, Somewhere, V1234"

That is, all businesses from 2021 whose names start with "B" or "b" (the regex match is case-insensitive). Only the first three fields are printed.

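Note how the output quotes fields only when it is necessary (that is, when a field contains a comma). If you also want fields containing spaces to be quoted, change quote_space => 0 to quote_space => 1 in the script (or remove that line, since quoting fields that contain spaces is the Text::CSV default).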

Answer 1