Is there a way to perform multiple substitutions with sed without chaining the substitutions?

Question 1

For this kind of problem you need a loop, so that you can search for both patterns at the same time.

awk '
    BEGIN {
        regex = "A|B"
        map["A"] = "BB"
        map["B"] = "AA"
    }
    {
        str = $0
        result = ""
        while (match(str, regex)) {
            found = substr(str, RSTART, RLENGTH)
            result = result substr(str, 1, RSTART-1) map[found]
            str = substr(str, RSTART+RLENGTH)
        }
        print result str
    }
'
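
To see why a single pass matters, here is a minimal sketch that wraps the awk program above in a shell function and compares it with naively chained sed substitutions. The function name swap_ab is hypothetical and only used for illustration.

```shell
# Hypothetical wrapper around the awk loop: every A becomes BB and every B
# becomes AA in one left-to-right pass, so replacement text is never rescanned.
swap_ab() {
    awk '
        BEGIN {
            regex = "A|B"
            map["A"] = "BB"
            map["B"] = "AA"
        }
        {
            str = $0
            result = ""
            while (match(str, regex)) {
                found = substr(str, RSTART, RLENGTH)
                result = result substr(str, 1, RSTART-1) map[found]
                str = substr(str, RSTART+RLENGTH)
            }
            print result str
        }
    '
}

printf 'AYB\n' | swap_ab                      # prints BBYAA
printf 'AYB\n' | sed 's/A/BB/g; s/B/AA/g'     # prints AAAAYAA
```

The chained sed version rewrites the BB it has just produced, which is exactly the problem the loop avoids.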

Of course, if you have Perl available, there is an equivalent one-liner:

perl -pe '
    BEGIN { %map = ("A" => "BB", "B" => "AA"); }
    s/(A|B)/$map{$1}/g;
'

If the patterns contain no special characters, you can also build the regex dynamically:

perl -pe '
    BEGIN {
        %map = ("A" => "BB", "B" => "AA");
        $regex = join "|", keys %map;
    }
    s/($regex)/$map{$1}/g;
'

Incidentally, Tcl has a built-in command for this called string map, but it is not easy to write Tcl one-liners.

Demonstrating the effect of sorting the keys by length:

Without sorting

$ echo ABBA | perl -pe '
    BEGIN {
        %map = (A => "X", BB => "Y", AB => "Z");
        $regex = join "|", map {quotemeta} keys %map;
        print $regex, "\n";
    }
    s/($regex)/$map{$1}/g
'

A|AB|BB
XYX

With sorting

$ echo ABBA | perl -pe '
      BEGIN {
          %map = (A => "X", BB => "Y", AB => "Z");
          $regex = join "|", map {quotemeta $_->[1]}
                             reverse sort {$a->[0] <=> $b->[0]}
                             map {[length, $_]}
                             keys %map;
          print $regex, "\n";
      }
      s/($regex)/$map{$1}/g
  '

BB|AB|A
ZBX
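
For comparison, a sketch of the same mapping in awk. It relies on POSIX extended regular expressions choosing the longest of the leftmost matches, so the alternation can be left unsorted here. The function name abba_map is hypothetical, and an awk whose match() is not leftmost-longest would behave differently.

```shell
# Hypothetical helper: apply A -> X, BB -> Y, AB -> Z in one pass with awk.
# The alternation is deliberately left unsorted.
abba_map() {
    awk '
        BEGIN {
            regex = "A|AB|BB"
            map["A"] = "X"; map["BB"] = "Y"; map["AB"] = "Z"
        }
        {
            str = $0; result = ""
            while (match(str, regex)) {
                found = substr(str, RSTART, RLENGTH)
                result = result substr(str, 1, RSTART-1) map[found]
                str = substr(str, RSTART+RLENGTH)
            }
            print result str
        }
    '
}

printf 'ABBA\n' | abba_map    # prints ZBX
```

Perl's alternation instead tries the alternatives from left to right and takes the first one that matches, which is why the Perl version needs the longest keys first.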

Benchmarking a "plain" sort against the Schwartzian transform in Perl. The code in the subroutines is lifted directly from the sort documentation.

#!perl
use Benchmark   qw/ timethese cmpthese /;

# make up some key=value data
my $key='a';
for $x (1..10000) {
    push @unsorted,   $key++ . "=" . int(rand(32767));
}

# plain sorting: first by value then by key
sub nonSchwartzian {
    my @sorted = 
        sort { ($b =~ /=(\d+)/)[0] <=> ($a =~ /=(\d+)/)[0] || uc($a) cmp uc($b) } 
        @unsorted
}

# using the Schwartzian transform
sub schwartzian {
    my @sorted =
        map  { $_->[0] }
        sort { $b->[1] <=> $a->[1] || $a->[2] cmp $b->[2] }
        map  { [$_, /=(\d+)/, uc($_)] } 
        @unsorted
}

# ensure the subs sort the same way
die "different" unless join(",", nonSchwartzian()) eq join(",", schwartzian());

# benchmark
cmpthese(
    timethese(-10, {
        nonSchwartzian => 'nonSchwartzian()',
        schwartzian    => 'schwartzian()',
    })
);

Running it:

$ perl benchmark.pl
Benchmark: running nonSchwartzian, schwartzian for at least 10 CPU seconds...
nonSchwartzian: 11 wallclock secs (10.43 usr +  0.05 sys = 10.48 CPU) @  9.73/s (n=102)
schwartzian: 11 wallclock secs (10.13 usr +  0.03 sys = 10.16 CPU) @ 49.11/s (n=499)
                 Rate nonSchwartzian    schwartzian
nonSchwartzian 9.73/s             --           -80%
schwartzian    49.1/s           405%             --

The code using the Schwartzian transform is roughly five times as fast (49.1/s versus 9.73/s, or 405% faster).

What about when the comparison function is only the length of each element?

Benchmark: running nonSchwartzian, schwartzian for at least 10 CPU seconds...
nonSchwartzian: 11 wallclock secs (10.06 usr +  0.03 sys = 10.09 CPU) @ 542.52/s (n=5474)
schwartzian: 10 wallclock secs (10.21 usr +  0.02 sys = 10.23 CPU) @ 191.50/s (n=1959)
                Rate    schwartzian nonSchwartzian
schwartzian    191/s             --           -65%
nonSchwartzian 543/s           183%             --

With this cheap sort function, the Schwartzian transform is much slower.

Now, can we move past the hostile comments?

Answer

Question 2

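You can't do the whole operation with a single substitution in sed, but you can do it correctly in different ways depending on whether the two substrings A and B are single characters or longer strings.

Assuming the two substrings A and B are single characters...

You want to transform AYB into BBYAA.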

1. Change each A to B and each B to A using y/AB/BA/.
2. Replace each A in the new string with AA using s/A/AA/g.
3. Replace each B in the new string with BB using s/B/BB/g.

$ echo AYB | sed 'y/AB/BA/; s/B/BB/g; s/A/AA/g'
BBYAA

Combining the last two steps, we get

$ echo AYB | sed 'y/AB/BA/; s/[AB]/&&/g'
BBYAA

In fact, the order of the operations does not matter here:

$ echo AYB | sed 's/[AB]/&&/g; y/AB/BA/'
BBYAA

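Edit: The sed command y/// translates the characters in its first argument into the corresponding characters in its second argument, much like the tr utility does. This is done as a single operation, so y/AB/BA/ swaps A and B without needing an intermediate character. Generally speaking, y/// is much faster at translating individual characters than, for example, s///g, because no regular expressions are involved. It can also insert newlines into strings using \n, which the standard s/// command cannot do (s/// in GNU sed can, as a non-portable convenience extension).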

The & character in the replacement part of the s/// command is replaced by the text matched by the first argument, so s/[AB]/&&/g doubles every A and B in the input data.
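
As a small sketch, the two building blocks can be run separately to see what each contributes. The function names swap_chars and double_chars are hypothetical and exist only for this illustration.

```shell
# Hypothetical helpers isolating the two sed operations discussed above.
swap_chars()   { sed 'y/AB/BA/'; }       # transliterate: A <-> B in one step
double_chars() { sed 's/[AB]/&&/g'; }    # & repeats the matched character

printf 'AYB\n' | swap_chars                   # prints BYA
printf 'AYB\n' | double_chars                 # prints AAYBB
printf 'AYB\n' | swap_chars | double_chars    # prints BBYAA
printf 'AYB\n' | double_chars | swap_chars    # prints BBYAA
```

Both pipelines give the same result because doubling a character and swapping it for its counterpart do not interfere with each other.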

For multi-character substrings, assuming the substrings are distinct (that is, neither substring occurs inside the other, unlike, say, oo and foo), use something like the following:

$ echo fooxbar | sed 's/foo/@/g; s/bar/foofoo/g; s/@/barbar/g'
barbarxfoofoo

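In other words, swap the two strings by way of an intermediate string that does not occur in the data. The intermediate string does not have to be a single character; any string that is not found in the data will do.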
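
Because this approach depends on the intermediate string being absent from the data, one cautious variant is to check for it first. The following is only a sketch under that assumption: the function name swap_foo_bar and the choice of @ as the placeholder are hypothetical, and the whole input is read into memory before processing.

```shell
# Hypothetical helper: replace foo with barbar and bar with foofoo on stdin,
# refusing to run if the placeholder already occurs in the input.
swap_foo_bar() {
    placeholder='@'             # assumed absent from the data; verified below
    input=$(cat)                # read all of stdin
    if printf '%s\n' "$input" | grep -qF -- "$placeholder"; then
        echo "placeholder '$placeholder' found in input" >&2
        return 1
    fi
    printf '%s\n' "$input" |
        sed "s/foo/$placeholder/g; s/bar/foofoo/g; s/$placeholder/barbar/g"
}

printf 'fooxbar\n' | swap_foo_bar    # prints barbarxfoofoo
```

If the check fails, pick a different placeholder rather than proceeding, since the final substitution would otherwise also rewrite the original occurrences.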

Answer