linux + reorder the lines of a file by machine number


I have the following file:

    more /home/list.in

    master01.fsdns.com AMBARI_METRICS STARTED
    master02.fsdns.com AMBARI_METRICS STARTED
    master03.fsdns.com AMBARI_METRICS STARTED
    worker01.fsdns.com AMBARI_METRICS STARTED
    worker02.fsdns.com AMBARI_METRICS STARTED
    worker03.fsdns.com AMBARI_METRICS STARTED
    worker05.fsdns.com AMBARI_METRICS STARTED
    worker06.fsdns.com AMBARI_METRICS STARTED
    worker07.fsdns.com AMBARI_METRICS STARTED
    worker08.fsdns.com AMBARI_METRICS STARTED
    worker09.fsdns.com AMBARI_METRICS STARTED

    master01.fsdns.com YARN STARTED
    master02.fsdns.com YARN STARTED
    master03.fsdns.com YARN STARTED
    worker01.fsdns.com YARN STARTED
    worker02.fsdns.com YARN STARTED
    worker03.fsdns.com YARN STARTED
    worker05.fsdns.com YARN STARTED
    worker06.fsdns.com YARN STARTED
    worker07.fsdns.com YARN STARTED
    worker08.fsdns.com YARN STARTED
    worker09.fsdns.com YARN STARTED

    master01.fsdns.com HDFS STARTED
    master02.fsdns.com HDFS STARTED
    master03.fsdns.com HDFS STARTED
    worker01.fsdns.com HDFS STARTED
    worker02.fsdns.com HDFS STARTED
    worker03.fsdns.com HDFS STARTED
    worker05.fsdns.com HDFS STARTED
    worker06.fsdns.com HDFS STARTED
    worker07.fsdns.com HDFS STARTED
    worker08.fsdns.com HDFS STARTED
    worker09.fsdns.com HDFS STARTED

I want to reorder list.in into the structure below, so that all lines for the same machine number end up in one group.

Expected output:

    master01.fsdns.com AMBARI_METRICS STARTED
    master01.fsdns.com YARN STARTED
    master01.fsdns.com HDFS STARTED

    master02.fsdns.com AMBARI_METRICS STARTED
    master02.fsdns.com YARN STARTED
    master02.fsdns.com HDFS STARTED

    master03.fsdns.com AMBARI_METRICS STARTED
    master03.fsdns.com YARN STARTED
    master03.fsdns.com HDFS STARTED
    .
    .
    .
    .
    . 
    worker09.fsdns.com AMBARI_METRICS STARTED
    worker09.fsdns.com YARN STARTED
    worker09.fsdns.com HDFS STARTED

What I have tried so far:

    for i in 01 02 03 04 05 06 07
    do
      grep worker$i /tmp/list.in
    done

    worker01.fsdns.com AMBARI_METRICS STARTED
    worker01.fsdns.com YARN STARTED
    worker01.fsdns.com HDFS STARTED
    worker02.fsdns.com AMBARI_METRICS STARTED
    worker02.fsdns.com YARN STARTED
    worker02.fsdns.com HDFS STARTED
    worker03.fsdns.com AMBARI_METRICS STARTED
    worker03.fsdns.com YARN STARTED
    worker03.fsdns.com HDFS STARTED
    worker05.fsdns.com AMBARI_METRICS STARTED
    worker05.fsdns.com YARN STARTED
    worker05.fsdns.com HDFS STARTED
    worker06.fsdns.com AMBARI_METRICS STARTED
    worker06.fsdns.com YARN STARTED
    worker06.fsdns.com HDFS STARTED
    worker07.fsdns.com AMBARI_METRICS STARTED
    worker07.fsdns.com YARN STARTED
    worker07.fsdns.com HDFS STARTED

Answer 1

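If the blank lines don't matter, a simple sort command is enough. Note that -k1 without an end field makes the key run from the first field to the end of the line, so the services within each host are sorted alphabetically as well: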

sort -t. -k1 /home/list.in

Result (preceded by the blank lines from the input, which sort first):

master01.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com HDFS STARTED
master01.fsdns.com YARN STARTED
master02.fsdns.com AMBARI_METRICS STARTED
master02.fsdns.com HDFS STARTED
master02.fsdns.com YARN STARTED
master03.fsdns.com AMBARI_METRICS STARTED
master03.fsdns.com HDFS STARTED
master03.fsdns.com YARN STARTED
worker01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com HDFS STARTED
worker01.fsdns.com YARN STARTED
worker02.fsdns.com AMBARI_METRICS STARTED
worker02.fsdns.com HDFS STARTED
worker02.fsdns.com YARN STARTED
worker03.fsdns.com AMBARI_METRICS STARTED
worker03.fsdns.com HDFS STARTED
worker03.fsdns.com YARN STARTED
worker05.fsdns.com AMBARI_METRICS STARTED
worker05.fsdns.com HDFS STARTED
worker05.fsdns.com YARN STARTED
worker06.fsdns.com AMBARI_METRICS STARTED
worker06.fsdns.com HDFS STARTED
worker06.fsdns.com YARN STARTED
worker07.fsdns.com AMBARI_METRICS STARTED
worker07.fsdns.com HDFS STARTED
worker07.fsdns.com YARN STARTED
worker08.fsdns.com AMBARI_METRICS STARTED
worker08.fsdns.com HDFS STARTED
worker08.fsdns.com YARN STARTED
worker09.fsdns.com AMBARI_METRICS STARTED
worker09.fsdns.com HDFS STARTED
worker09.fsdns.com YARN STARTED
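
If the original order of the services within each host matters (AMBARI_METRICS, YARN, HDFS in the question), a stable sort restricted to the first field may be closer to the expected output. This is only a sketch: it assumes a sort that supports -s (GNU and BSD sort do, but it is not part of POSIX), and it uses a shortened, made-up sample in place of list.in.

```shell
# Hypothetical sample: a shortened list.in with two hosts and three services.
input=$(mktemp)
cat > "$input" <<'EOF'
master01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com AMBARI_METRICS STARTED

master01.fsdns.com YARN STARTED
worker01.fsdns.com YARN STARTED

master01.fsdns.com HDFS STARTED
worker01.fsdns.com HDFS STARTED
EOF

# Drop blank lines, then sort on the first whitespace-separated field only.
# -k1,1 ends the key at field 1; -s keeps lines with equal keys in input order.
sorted=$(grep -v '^[[:space:]]*$' "$input" | sort -s -k1,1)
printf '%s\n' "$sorted"
rm -f "$input"
```

Because the key stops at the host name and ties are left in input order, each host's lines should come out as AMBARI_METRICS, YARN, HDFS rather than alphabetically.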

Answer 2

$ sort -k1,1 list.in  | 
    awk '
      /^[[:space:]]*$/ { next };
      lasthost == "" { lasthost = $1 };
      $1 == lasthost { print $0; next };
      {print "\n" $0 ; lasthost=$1 }' 
master01.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com HDFS STARTED
master01.fsdns.com YARN STARTED

master02.fsdns.com AMBARI_METRICS STARTED
master02.fsdns.com HDFS STARTED
master02.fsdns.com YARN STARTED

master03.fsdns.com AMBARI_METRICS STARTED
master03.fsdns.com HDFS STARTED
master03.fsdns.com YARN STARTED

worker01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com HDFS STARTED
worker01.fsdns.com YARN STARTED

worker02.fsdns.com AMBARI_METRICS STARTED
worker02.fsdns.com HDFS STARTED
worker02.fsdns.com YARN STARTED

worker03.fsdns.com AMBARI_METRICS STARTED
worker03.fsdns.com HDFS STARTED
worker03.fsdns.com YARN STARTED

worker05.fsdns.com AMBARI_METRICS STARTED
worker05.fsdns.com HDFS STARTED
worker05.fsdns.com YARN STARTED

worker06.fsdns.com AMBARI_METRICS STARTED
worker06.fsdns.com HDFS STARTED
worker06.fsdns.com YARN STARTED

worker07.fsdns.com AMBARI_METRICS STARTED
worker07.fsdns.com HDFS STARTED
worker07.fsdns.com YARN STARTED

worker08.fsdns.com AMBARI_METRICS STARTED
worker08.fsdns.com HDFS STARTED
worker08.fsdns.com YARN STARTED

worker09.fsdns.com AMBARI_METRICS STARTED
worker09.fsdns.com HDFS STARTED
worker09.fsdns.com YARN STARTED

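The awk script keeps track of the last host name seen in field $1 and, whenever it changes, prints a newline before the current input line. It also skips lines that are completely empty or contain only whitespace.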

To avoid printing a blank line before the first record, it checks whether the variable lasthost is empty (i.e. not yet set) and, if so, initializes it to the current host.
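
The same idea can be written a little more compactly. The following is a rough sketch rather than a replacement for the script above; the variable names (input, grouped, last) and the shortened sample are made up for illustration.

```shell
# Hypothetical sample: a shortened list.in with two hosts and two services.
input=$(mktemp)
cat > "$input" <<'EOF'
master01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com AMBARI_METRICS STARTED

master01.fsdns.com YARN STARTED
worker01.fsdns.com YARN STARTED
EOF

grouped=$(sort -k1,1 "$input" | awk '
  NF == 0 { next }                       # skip empty or whitespace-only lines
  last != "" && $1 != last { print "" }  # host changed: emit a separator line
  { print; last = $1 }                   # print the record, remember its host
')
printf '%s\n' "$grouped"
rm -f "$input"
```

Since last starts out empty, the separator rule cannot fire on the first record, which plays the same role as the lasthost == "" check.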

Answer 3

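This works with GNU awk (asorti is a gawk extension):

awk '$1 { a[$1]; b[$2] }
END { n = asorti(a); for (i = 1; i <= n; i++) { for (j in b) printf("%s %s\n", a[i], j); printf("\n") } }' file

How it works:

$1 { a[$1]; b[$2] } runs for every line whose first field is non-empty and records the host name as an index of array a and the service name as an index of array b.
END { ... } runs once all input has been read.
n = asorti(a) sorts the indices of a, stores them as the values a[1] .. a[n], and returns their count.
The outer loop walks the hosts in sorted order, the inner loop prints one "host service" line per entry in b, and printf("\n") ends each host's group with a blank line.

Caveats: only the first two columns are printed, so the STARTED status is dropped; every host is paired with every service, whether or not that pair occurs in the input; and the order in which for (j in b) visits the services is unspecified.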
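
For comparison, here is a sketch of an array-based variant that keeps the complete input lines and does not need asorti. It assumes the hosts should appear in the order they are first seen in the input (which is already sorted in the question's file); the names order and block and the shortened sample are hypothetical.

```shell
# Hypothetical sample: a shortened list.in with two hosts and three services.
input=$(mktemp)
cat > "$input" <<'EOF'
master01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com AMBARI_METRICS STARTED

master01.fsdns.com YARN STARTED
worker01.fsdns.com YARN STARTED

master01.fsdns.com HDFS STARTED
worker01.fsdns.com HDFS STARTED
EOF

grouped=$(awk '
  NF == 0 { next }                     # skip empty or whitespace-only lines
  !($1 in block) { order[++n] = $1 }   # remember each host in first-seen order
  { block[$1] = block[$1] $0 "\n" }    # append the whole line to its host block
  END {
    # Print the blocks in first-seen order, separated by one blank line.
    for (i = 1; i <= n; i++)
      printf "%s%s", (i > 1 ? "\n" : ""), block[order[i]]
  }
' "$input")
printf '%s\n' "$grouped"
rm -f "$input"
```

Storing whole lines means the status column survives and only host/service pairs that really occur in the input are printed.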

Answer 4

You can achieve the same thing with awk and sed. It worked well when I tested it:

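# Collect the unique short host names (the text before the first dot), skipping blank lines.
hosts=$(awk -F. '/[^[:space:]]/ { print $1 }' l.txt | sort -u)
# For each host, print its lines in their original order; the match is anchored to the start of the line.
for h in $hosts; do sed -n "/^$h\./p" l.txt; done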

Output:

master01.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com YARN STARTED
master01.fsdns.com HDFS STARTED
master02.fsdns.com AMBARI_METRICS STARTED
master02.fsdns.com YARN STARTED
master02.fsdns.com HDFS STARTED
master03.fsdns.com AMBARI_METRICS STARTED
master03.fsdns.com YARN STARTED
master03.fsdns.com HDFS STARTED
worker01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com YARN STARTED
worker01.fsdns.com HDFS STARTED
worker02.fsdns.com AMBARI_METRICS STARTED
worker02.fsdns.com YARN STARTED
worker02.fsdns.com HDFS STARTED
worker03.fsdns.com AMBARI_METRICS STARTED
worker03.fsdns.com YARN STARTED
worker03.fsdns.com HDFS STARTED
worker05.fsdns.com AMBARI_METRICS STARTED
worker05.fsdns.com YARN STARTED
worker05.fsdns.com HDFS STARTED
worker06.fsdns.com AMBARI_METRICS STARTED
worker06.fsdns.com YARN STARTED
worker06.fsdns.com HDFS STARTED
worker07.fsdns.com AMBARI_METRICS STARTED
worker07.fsdns.com YARN STARTED
worker07.fsdns.com HDFS STARTED
worker08.fsdns.com AMBARI_METRICS STARTED
worker08.fsdns.com YARN STARTED
worker08.fsdns.com HDFS STARTED
worker09.fsdns.com AMBARI_METRICS STARTED
worker09.fsdns.com YARN STARTED
worker09.fsdns.com HDFS STARTED
