I have an input file (sorted on column 2 using -t):
TOP,25424242,T0137,0.08,0.06,0.02,24
TOP,25424242,T0138,0.07,0.06,0.01,24
TOP,17236110,T0138,9.65,9.37,0.28,89
TOP,23525255,T0137,0.40,0.30,0.11,24
TOP,23525255,T0138,0.08,0.07,0.01,24
TOP,21627012,T0138,0.41,0.33,0.08,24
TOP,75856354,T0137,0.18,0.17,0.01,36
TOP,75856354,T0138,0.18,0.17,0.01,26
TOP,42401990,T0137,0.06,0.05,0.01,24
I want to delete both rows whenever two rows share the same value in column 2, so that only rows with a unique value in field 2 remain. For the example above, the result would be:
TOP,17236110,T0138,9.65,9.37,0.28,89
TOP,21627012,T0138,0.41,0.33,0.08,24
TOP,42401990,T0137,0.06,0.05,0.01,24
Answer 1
This works:
$ awk -F, '{a[$2]=$0; b[$2]++;} END{for(i in a){if(b[i]==1){print a[i]}}}' file
TOP,17236110,T0138,9.65,9.37,0.28,89
TOP,21627012,T0138,0.41,0.33,0.08,24
TOP,42401990,T0137,0.06,0.05,0.01,24
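One caveat: awk's for (i in a) loop visits keys in an unspecified order, so the surviving lines are not guaranteed to appear in input order. A minimal sketch of an order-preserving variant follows, assuming the input is a regular file that can be read twice (not a pipe). The temporary file and the variable names are only for illustration.

```shell
# Sample data copied from the question, written to a temporary file
# (the temp file is hypothetical, used only to make the sketch runnable).
file=$(mktemp)
cat > "$file" <<'EOF'
TOP,25424242,T0137,0.08,0.06,0.02,24
TOP,25424242,T0138,0.07,0.06,0.01,24
TOP,17236110,T0138,9.65,9.37,0.28,89
TOP,23525255,T0137,0.40,0.30,0.11,24
TOP,23525255,T0138,0.08,0.07,0.01,24
TOP,21627012,T0138,0.41,0.33,0.08,24
TOP,75856354,T0137,0.18,0.17,0.01,36
TOP,75856354,T0138,0.18,0.17,0.01,26
TOP,42401990,T0137,0.06,0.05,0.01,24
EOF

# Pass 1 (NR==FNR): count how often each value of field 2 occurs.
# Pass 2: print, in input order, the lines whose field-2 value occurred once.
result=$(awk -F, 'NR==FNR { count[$2]++; next } count[$2] == 1' "$file" "$file")
printf '%s\n' "$result"

rm -f "$file"
```

Because the file is passed twice, this variant does not need the input to be sorted or grouped on column 2.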
Answer 2
A short uniq trick for your current input structure (the first two fields have fixed lengths):
uniq -s4 -w8 -u file
-s4 - skip the first 4 characters (i.e. TOP,)
-w8 - compare no more than 8 characters per line
-u - print only unique lines
Output:
TOP,17236110,T0138,9.65,9.37,0.28,89
TOP,21627012,T0138,0.41,0.33,0.08,24
TOP,42401990,T0137,0.06,0.05,0.01,24
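To see which part of each line uniq compares under -s4 -w8, the same 8 characters can be cut out and inspected on their own. The following is a rough sketch, assuming the fixed-width layout from the question (a 4-character prefix followed by an 8-character second field). The temporary file and variable names are hypothetical. As far as I know, -w is a GNU extension, so the uniq command above may not work on other systems.

```shell
# Sample data copied from the question, written to a temporary file
# (the temp file is only for illustration).
file=$(mktemp)
cat > "$file" <<'EOF'
TOP,25424242,T0137,0.08,0.06,0.02,24
TOP,25424242,T0138,0.07,0.06,0.01,24
TOP,17236110,T0138,9.65,9.37,0.28,89
TOP,23525255,T0137,0.40,0.30,0.11,24
TOP,23525255,T0138,0.08,0.07,0.01,24
TOP,21627012,T0138,0.41,0.33,0.08,24
TOP,75856354,T0137,0.18,0.17,0.01,36
TOP,75856354,T0138,0.18,0.17,0.01,26
TOP,42401990,T0137,0.06,0.05,0.01,24
EOF

# Characters 5-12 are the 8-character key that -s4 -w8 compares.
keys=$(cut -c5-12 "$file")

# uniq -u on the keys alone lists the field-2 values that occur only once.
# Like the original command, this relies on equal keys being adjacent.
unique_keys=$(printf '%s\n' "$keys" | uniq -u)
printf '%s\n' "$unique_keys"

rm -f "$file"
```

If the second field did not start at character 5 or were not exactly 8 characters wide, the -s and -w values would have to change accordingly.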
Answer 3
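You can achieve this with awk, uniq, and sed. Note that this edits file.txt in place:
# uniq -d lists each repeated field-2 value once (input must be grouped on field 2)
for k in $(awk -F, '{print $2}' file.txt | uniq -d); do
    # anchor the match to field 2 so the value cannot match elsewhere in the line
    sed -i "/^[^,]*,$k,/d" file.txt
done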
Output:
TOP,17236110,T0138,9.65,9.37,0.28,89
TOP,21627012,T0138,0.41,0.33,0.08,24
TOP,42401990,T0137,0.06,0.05,0.01,24