Speeding up text processing

Answer 1

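First, extract the 36-line header from the file named input, then randomly pick 60000 lines from the rest of the file, allowing the same line to be picked more than once. All output goes to the file named output.

Using shuf from GNU coreutils: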

#!/bin/sh

# Fetch header (36 first lines)
head -n 36 <input >output

# Scramble the other lines and pick 60000 (allowing for repeated lines)
tail -n +37 <input | shuf -r -n 60000 >>output

Or:

( head -n 36 <input; tail -n +37 <input | shuf -r -n 60000 ) >output
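For reuse with other files or sizes, the same head/tail/shuf pipeline could be wrapped in a small function. This is only a sketch: the function name sample_with_header and its four positional parameters are made up for illustration, and it assumes GNU shuf with the -r option is installed.

```shell
#!/bin/sh

# Hypothetical helper (not part of the original answer):
#   sample_with_header INPUT OUTPUT HEADER_LINES SAMPLE_LINES
# Copies the first HEADER_LINES lines of INPUT to OUTPUT, then appends
# SAMPLE_LINES lines drawn at random, with repetition, from the rest.
sample_with_header() {
    in=$1
    out=$2
    hdr=$3
    n=$4
    {
        # Header: the first $hdr lines, unchanged
        head -n "$hdr" <"$in"
        # Body: start at line hdr+1 and sample with replacement
        # (shuf -r is a GNU coreutils option)
        tail -n "+$((hdr + 1))" <"$in" | shuf -r -n "$n"
    } >"$out"
}
```

With the values used here, the call would be: sample_with_header input output 36 60000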

With GNU head, which leaves the input file stream positioned just after the last line it outputs, shuf can continue reading where head finished (this may not work with some non-GNU head implementations):

( head -n 36; shuf -r -n 60000 ) <input >output
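Since the shared-stream variant depends on how head treats its input, it may be worth checking the local head before relying on it. The following is a rough sketch of such a check. The function name head_leaves_offset is hypothetical, and the test only covers a regular file redirected to standard input, which is the situation in the command above (a pipe cannot be rewound, so the result says nothing about piped input).

```shell
#!/bin/sh

# Hypothetical check (not part of the original answer): prints "yes" if
# head leaves the shared input positioned right after the last line it
# printed, so that the next command continues from there, else "no".
head_leaves_offset() {
    tmp=$(mktemp) || return 1
    printf '%s\n' 1 2 3 4 5 >"$tmp"

    # head consumes two lines; cat then reads whatever is left on the
    # same file descriptor.
    rest=$( { head -n 2 >/dev/null; cat; } <"$tmp" )
    rm -f "$tmp"

    if [ "$rest" = "$(printf '%s\n' 3 4 5)" ]; then
        echo yes
    else
        echo no
    fi
}

head_leaves_offset
```

If this prints "no", the variant that reads the file twice with head and tail is the safer choice.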
