File_1
##chr pos rc allele_count allele_states deletion_sum snp_type most_variable_allele diff:1-2 diff:1-3 diff:1-4 diff:1-5 diff:1-6 diff:1-7 diff:1-8 diff:1-9 diff:1-10 diff:1-11 diff:1-12 diff:2-3
MT 227 C 2 C/A 0 pop C 0 0 0 0 0 0 0.024 0 0.022 0 0 0
MT 233 G 2 G/T 0 pop G 0 0.009 0 0.012 0 0 0 0 0 0 0 0.009
MT 245 G 2 G/A 0 pop A 0 0 0 0 0 0.055 0.224 0.072 0.026 0 0 0
MT 251 C 2 C/T 0 pop C 0.276 0.034 0.231 0.005 0.027 0.036 0.025 0.002 0.107 0.034 0.034 0.309
MT 264 G 2 G/C 0 pop G 0 0 0 0.008 0 0.003 0 0 0 0 0 0
MT 286 G 2 G/T 0 pop T 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0 0.002 0
MT 292 A 2 A/T 0 pop T 0 0 0 0 0.003 0 0 0.002 0 0 0 0
MT 293 G 2 G/T 0 pop G 0 0 0 0 0.003 0.002 0 0 0 0 0 0
MT 295 G 2 G/T 0 pop G 0 0.002 0.002 0 0.001 0.002 0.002 0.002 0.002 0.002 0.002 0.003
File_2
MT 251
MT 292
MT 295
Desired output
##chr pos rc allele_count allele_states deletion_sum snp_type most_variable_allele diff:1-2 diff:1-3 diff:1-4 diff:1-5 diff:1-6 diff:1-7 diff:1-8 diff:1-9 diff:1-10 diff:1-11 diff:1-12 diff:2-3
MT 251 C 2 C/T 0 pop C 0.276 0.034 0.231 0.005 0.027 0.036 0.025 0.002 0.107 0.034 0.034 0.309
MT 292 A 2 A/T 0 pop T 0 0 0 0 0.003 0 0 0.002 0 0 0 0
MT 295 G 2 G/T 0 pop G 0 0.002 0.002 0 0.001 0.002 0.002 0.002 0.002 0.002 0.002 0.003
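This is similar to what was asked in this post: "Compare two files based on the first column; keep the row if it matches."
I was able to keep the overlap on the first column with awk 'NR==FNR{a[$0]=$0;next}a[$0]',
but I need the entire row from File_1 whenever its first two columns (chr and pos) match a line in File_2.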
Answer 1
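Instead of testing the whole row ($0), you need to test whether the first two columns together form a key in the array. Read File_2 first to build the keys, then print each File_1 row whose "chr pos" pair is among them.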
awk 'NR==FNR {a[$1" "$2] = 1; next}     # File_2: store each "chr pos" pair as a key
FNR == 1 && FNR != NR {print}           # File_1: print header
($1" "$2) in a' File_2 File_1
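The same idea can also be written with awk's comma-subscript form, which joins the two fields with SUBSEP rather than a literal space. This is only a sketch of an equivalent variant, not part of the original answer: the function name filter_by_key and the array name keys are made up for illustration, and it assumes the key file is non-empty and the data file has exactly one header line.

```shell
# filter_by_key KEYS DATA
# Hypothetical helper (name chosen for illustration only).
# Prints the header of DATA plus every row whose first two
# whitespace-separated fields appear together on a line of KEYS.
filter_by_key() {
    awk '
        NR == FNR { keys[$1, $2] = 1; next }  # first file: remember each chr,pos pair
        FNR == 1  { print; next }             # second file: pass the header line through
        ($1, $2) in keys                      # print rows whose chr,pos pair was stored
    ' "$1" "$2"
}
```

With the files from the question it would be called as filter_by_key File_2 File_1 > output. With the default field separator the result should match the space-joined key above; the comma form mainly helps if the field separator is later changed to something that lets fields contain spaces.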