Delete rows that have the same value in a specific column

I have an input file (sorted on column 2 using -t):

TOP,25424242,T0137,0.08,0.06,0.02,24
TOP,25424242,T0138,0.07,0.06,0.01,24
TOP,17236110,T0138,9.65,9.37,0.28,89
TOP,23525255,T0137,0.40,0.30,0.11,24
TOP,23525255,T0138,0.08,0.07,0.01,24
TOP,21627012,T0138,0.41,0.33,0.08,24
TOP,75856354,T0137,0.18,0.17,0.01,36
TOP,75856354,T0138,0.18,0.17,0.01,26
TOP,42401990,T0137,0.06,0.05,0.01,24

I want to delete both rows whenever two rows share the same value in column 2, so that only rows with a unique value in field 2 remain. For the example above, the result would be:

TOP,17236110,T0138,9.65,9.37,0.28,89
TOP,21627012,T0138,0.41,0.33,0.08,24
TOP,42401990,T0137,0.06,0.05,0.01,24

Answer 1

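This works, although awk does not guarantee the order of a for (i in a) loop, so the output lines may not appear in input order: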

 $ awk -F, '{a[$2]=$0; b[$2]++;} END{for(i in a){if(b[i]==1){print a[i]}}}' file
TOP,17236110,T0138,9.65,9.37,0.28,89
TOP,21627012,T0138,0.41,0.33,0.08,24
TOP,42401990,T0137,0.06,0.05,0.01,24
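
If the original line order matters, one possible variant is to read the file twice: the first pass counts each column-2 value, and the second pass prints the lines whose value occurred exactly once. This is only a sketch and not part of the answer above; the temporary file and the variable names (input, result, count) are assumptions made for illustration.

```shell
# Sample input copied from the question, written to a hypothetical temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
TOP,25424242,T0137,0.08,0.06,0.02,24
TOP,25424242,T0138,0.07,0.06,0.01,24
TOP,17236110,T0138,9.65,9.37,0.28,89
TOP,23525255,T0137,0.40,0.30,0.11,24
TOP,23525255,T0138,0.08,0.07,0.01,24
TOP,21627012,T0138,0.41,0.33,0.08,24
TOP,75856354,T0137,0.18,0.17,0.01,36
TOP,75856354,T0138,0.18,0.17,0.01,26
TOP,42401990,T0137,0.06,0.05,0.01,24
EOF

# Pass 1 (NR==FNR is true only while reading the first copy): count each column-2 value.
# Pass 2: print the lines whose column-2 value was counted exactly once, in input order.
result=$(awk -F, 'NR==FNR { count[$2]++; next } count[$2] == 1' "$input" "$input")
printf '%s\n' "$result"
rm -f "$input"
```

Because the second pass walks the file sequentially, this version neither depends on array traversal order nor requires the input to be grouped by column 2.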

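Answer 2

A short uniq trick for the current input structure (the first two fields have fixed lengths). uniq only compares adjacent lines, which is fine here because rows with the same column-2 value are already next to each other: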

uniq -s4 -w8 -u file
  • -s4 - skip the first 4 characters (i.e. TOP,).
  • -w8 - compare no more than 8 characters per line.
  • -u - print only unique lines.

Output:

TOP,17236110,T0138,9.65,9.37,0.28,89
TOP,21627012,T0138,0.41,0.33,0.08,24
TOP,42401990,T0137,0.06,0.05,0.01,24
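
If the rows with equal column-2 values were not adjacent, the same uniq call would miss them. A minimal sketch of one workaround, assuming GNU coreutils (the -w option is a GNU extension) and the same fixed-width layout, is to sort on field 2 first. The sample below reuses rows from the question in a deliberately shuffled order, and the variable names are hypothetical.

```shell
# Hypothetical sample: rows taken from the question, reordered so duplicates are not adjacent.
input=$(mktemp)
cat > "$input" <<'EOF'
TOP,25424242,T0137,0.08,0.06,0.02,24
TOP,17236110,T0138,9.65,9.37,0.28,89
TOP,25424242,T0138,0.07,0.06,0.01,24
TOP,42401990,T0137,0.06,0.05,0.01,24
EOF

# Sort on field 2 only so that equal keys become adjacent,
# then keep the lines whose 8-character key (after skipping "TOP,") occurs once.
result=$(sort -t, -k2,2 "$input" | uniq -s4 -w8 -u)
printf '%s\n' "$result"
rm -f "$input"
```

The trade-off is that the output follows the sorted key order rather than the original line order.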

Answer 3

You can achieve this with awk, uniq, and sed. Note that sed -i modifies file.txt in place:

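# uniq -d prints each repeated key once (rows must be grouped by column 2)
for k in $(awk -F, '{print $2}' file.txt | uniq -d); do
  # anchor the key to field 2 so it cannot match elsewhere in the line
  sed -i "/^[^,]*,$k,/d" file.txt
done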

Output:

TOP,17236110,T0138,9.65,9.37,0.28,89
TOP,21627012,T0138,0.41,0.33,0.08,24
TOP,42401990,T0137,0.06,0.05,0.01,24
