Remove lines from a large file if a string from a list is found within the first 12 characters of the line?

Answer 1

With grep:

grep -vwf file matrix > matrix.new
mv matrix.new matrix

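-f FILE: use FILE as the pattern input file
-w: select only lines containing matches that form whole words
-v: select non-matching lines

Empty lines are not allowed in file.

Alternatively, if you create the identifier file by hand, use the ^ anchor to match at the start of the line and add a space character after each identifier to mark the end of the pattern: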

printf '^%s \n' denovo{1,100,1000,100000} > file
grep -vf file matrix > matrix.new
mv matrix.new matrix
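
If the identifiers already exist as a plain list, the anchored pattern file can probably be derived from it instead of being typed out. The following is only a sketch: the sample rows and the list name ids are made up for illustration, and the sed step assumes the identifiers contain no regex metacharacters.

```shell
# Hypothetical sample data: denovo1 and denovo100 are to be removed.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'denovo1 aaa' 'denovo10 bbb' 'denovo100 ccc' 'denovo2 ddd' > matrix
printf '%s\n' denovo1 denovo100 > ids

# Turn each identifier into an anchored pattern of the form "^<id> ".
# Assumption: identifiers contain no regex metacharacters.
sed 's/.*/^& /' ids > file

# Keep only the lines that match none of the anchored patterns.
grep -vf file matrix > matrix.new
cat matrix.new
```

Using the raw list directly (grep -vf ids matrix) would also drop the denovo10 line, because denovo1 matches it as a substring. The anchor and the trailing space prevent that.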

Answer 2

Try:

$ awk 'FNR==NR{ids[$1]; next} !($1 in ids)' ids file
denovo10 someverylaaargenumbers and lotandlotsoftextuntil 5400........
denovo10000 someverylaaargenumbers and lotandlotsoftextuntil 5400.....
denovo184117 someverylaaargenumbers and lotandlotsoftextuntil 5400......

How it works:

FNR==NR{ids[$1]; next}

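While the first file, ids, is being read, each id is added as a key to the associative array ids. The remaining commands are then skipped and awk moves on to the next line.

!($1 in ids)

While the second file is being read, a line is printed if its first field is not a key in the associative array ids.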
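
To see both rules at work, here is a small self-contained run. It is a sketch on made-up sample data (the rows and the temporary directory are assumptions, not the asker's files); the awk program itself is the one from this answer, only spread over several lines with comments.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' denovo1 denovo100 > "$workdir/ids"
printf '%s\n' 'denovo1 aaa' 'denovo10 bbb' 'denovo100 ccc' > "$workdir/file"

result=$(awk '
    FNR == NR { ids[$1]; next }   # first file: store each id as an array key
    !($1 in ids)                  # second file: print if field 1 is not a key
' "$workdir/ids" "$workdir/file")

printf '%s\n' "$result"
```

Because the lookup compares the whole first field, denovo1 in the list does not remove the denovo10 line. One caveat: if the ids file is empty, FNR==NR also holds for every line of the second file, so nothing is printed.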

Updating the original file

Once you are satisfied that the code works correctly, you can change the file in place:

awk 'FNR==NR{ids[$1]; next} !($1 in ids)' ids file >tmp && mv tmp file
