Split a file into specific output file names by pattern matching

Question 1

I would still consider using csplit and then renaming the generated files.

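#!/bin/sh
# Split "$1" at each "# new file" line, then rename the pieces to the remaining arguments.
mkdir ".tmp.$$" || exit 2
# -z: drop empty pieces, -k: keep pieces on error, -n 4: four-digit suffixes (GNU csplit)
csplit -f ".tmp.$$/tmp_" -zk -n 4 "$1" '/# new file/' '{*}'
shift    # drop the source file so that "$1" is the first target name

for file in ".tmp.$$"/tmp_*
do
    [ "$#" -gt 0 ] || break    # more pieces than target names
    mv -f "$file" "$1"
    shift
done
# rmdir only succeeds if every piece was moved out of the temporary directory
if ! rmdir ".tmp.$$" 2>/dev/null
then
    echo "Warning: not all file parts were assigned" >&2
    rm -rf ".tmp.$$"
    exit 1
fi
exit 0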

Usage

mysplit <source_file> <target_names...>
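
To see what the csplit step produces before any renaming, here is a minimal sketch. It assumes GNU csplit (the -z option and the '{*}' repeat count are GNU extensions); demo_dir and sample are made-up names used only for illustration.

```shell
#!/bin/sh
# Illustrative only: demo_dir and sample are hypothetical names.
demo_dir=$(mktemp -d) || exit 2
printf '%s\n' '# new file' 'one' '# new file' 'two' 'three' > "$demo_dir/sample"

# -s: do not print byte counts, -z: drop empty pieces,
# -k: keep pieces already written if csplit fails, -n 4: four-digit suffixes
csplit -s -f "$demo_dir/tmp_" -zk -n 4 "$demo_dir/sample" '/# new file/' '{*}'

# The glob expands in sorted order, which matches the order of the pieces.
piece_count=$(ls "$demo_dir"/tmp_* | wc -l)
total_lines=$(cat "$demo_dir"/tmp_* | wc -l)
echo "pieces: $piece_count, lines kept: $total_lines"

rm -rf "$demo_dir"
```

Each piece starts with its own "# new file" line, so the renaming loop only has to pair the sorted pieces with the target names in order.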

Question 2

You can do this without temporary files, even when the text file and the file names contain spaces and non-ASCII characters.

infile:

# new file
text in file1

blabla
# new file
text in file2
# new file
text in file3

$//*+\

s
# new file
4!
aaaaaaaaa
i^
# new file

#¬}}{][|\~@

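In the following script, split.sh, the file names must be passed to the awk command as separate arguments, written in single quotes so that the shell does not expand them (as it would inside double quotes):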

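awk -v file="0" '
  BEGIN {
    # Save every argument, then blank out the output names so that
    # awk only reads ARGV[1] (the input file) as input.
    print "AWK arguments:"
    for (i = 0; i < ARGC; i++) {
      ARRAY[i] = ARGV[i]
      print "\047" ARRAY[i] "\047"
      if (i > 1) {
        ARGV[i] = ""
      }
    }
    print "Writing:"
  }
  # Lines after the first marker are appended to the current output file;
  # lines before it are skipped so nothing is appended to the input file.
  file > 0 && !/^# new file$/ {
    print "writing to: " "\047" ARRAY[file+1] "\047"
    print $0 >> ARRAY[file+1]
  }
  # A marker line closes the previous output file and starts the next one.
  /^# new file$/ {
    if (file > 0) close(ARRAY[file+1])
    ++file
    print "writing to: " "\047" ARRAY[file+1] "\047"
    print $0 > ARRAY[file+1]
  }
' 'infile' '1.txt' '2.txt' '3.txt' 'file $_%.txt' '&file  _.txt'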

The console output is:

AWK arguments:
'awk'
'infile'
'1.txt'
'2.txt'
'3.txt'
'file $_%.txt'
'&file  _.txt'
Writing:
writing to: '1.txt'
writing to: '1.txt'
writing to: '1.txt'
writing to: '1.txt'
writing to: '2.txt'
writing to: '2.txt'
writing to: '3.txt'
writing to: '3.txt'
writing to: '3.txt'
writing to: '3.txt'
writing to: '3.txt'
writing to: '3.txt'
writing to: 'file $_%.txt'
writing to: 'file $_%.txt'
writing to: 'file $_%.txt'
writing to: 'file $_%.txt'
writing to: '&file  _.txt'
writing to: '&file  _.txt'
writing to: '&file  _.txt'
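
The quoting rule for the file names can be checked in isolation. The sketch below uses a hypothetical helper, show_args, and an illustrative variable, HOME_DEMO, to show that a single-quoted name reaches the command unchanged while a double-quoted one has its variable expanded first.

```shell
#!/bin/sh
# Hypothetical helper: prints each argument it receives in brackets, one per line.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

HOME_DEMO=expanded    # illustrative variable, not part of the original script

single=$(show_args 'file $HOME_DEMO.txt')   # passed literally
double=$(show_args "file $HOME_DEMO.txt")   # $HOME_DEMO is expanded by the shell

echo "$single"    # [file $HOME_DEMO.txt]
echo "$double"    # [file expanded.txt]
```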

If the arguments are instead taken from the output of another command (the files must already exist in the file system), as in this last line:

' $(ls infile | tr '\n' ' ' ; ls *.txt)

the shell splits the arguments on whitespace:

AWK arguments:
'awk'
'infile'
'&file'
'_.txt'
'1.txt'
'2.txt'
'3.txt'
'_.txt'
'file'
'$_%.txt'
Writing:
writing to: '&file'
writing to: '&file'
writing to: '&file'
writing to: '&file'
writing to: '_.txt'
writing to: '_.txt'
writing to: '1.txt'
writing to: '1.txt'
writing to: '1.txt'
writing to: '1.txt'
writing to: '1.txt'
writing to: '1.txt'
writing to: '2.txt'
writing to: '2.txt'
writing to: '2.txt'
writing to: '2.txt'
writing to: '3.txt'
writing to: '3.txt'
writing to: '3.txt'
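
The splitting shown above can be reproduced with a short sketch. The temporary directory, the file names, and the count_args helper are all made up for illustration.

```shell
#!/bin/sh
# Illustrative only: the directory, file names and count_args are hypothetical.
demo_dir=$(mktemp -d) || exit 2
cd "$demo_dir" || exit 2
: > 'a b.txt'
: > 'c.txt'

# Prints the number of arguments it receives.
count_args() { echo "$#"; }

# Unquoted command substitution: the output of ls is split on whitespace,
# so "a b.txt" becomes two words.
split_count=$(count_args $(ls *.txt))
# Plain glob: every matching name stays a single argument.
glob_count=$(count_args *.txt)

echo "command substitution: $split_count, glob: $glob_count"
# prints: command substitution: 3, glob: 2
```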

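To fix this, collect the names in a shell array, so that each file name stays a single argument no matter how many spaces it contains, and pass the array to awk. This needs a shell with arrays, such as bash. Use the following split.sh script: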

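# bash array: each glob match becomes one element, spaces included
array=(infile *.txt)
awk -v file="0" '
  BEGIN {
    # Save every argument, then blank out the output names so that
    # awk only reads ARGV[1] (the input file) as input.
    print "AWK arguments:"
    for (i = 0; i < ARGC; i++) {
      ARRAY[i] = ARGV[i]
      print "\047" ARRAY[i] "\047"
      if (i > 1) {
        ARGV[i] = ""
      }
    }
    print "Writing:"
  }
  # Lines after the first marker are appended to the current output file;
  # lines before it are skipped so nothing is appended to the input file.
  file > 0 && !/^# new file$/ {
    print "writing to: " "\047" ARRAY[file+1] "\047"
    print $0 >> ARRAY[file+1]
  }
  # A marker line closes the previous output file and starts the next one.
  /^# new file$/ {
    if (file > 0) close(ARRAY[file+1])
    ++file
    print "writing to: " "\047" ARRAY[file+1] "\047"
    print $0 > ARRAY[file+1]
  }
' "${array[@]}"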

Now the result is:

AWK arguments:
'awk'
'infile'
'&file  _.txt'
'1.txt'
'2.txt'
'3.txt'
'file $_%.txt'
Writing:
writing to: '&file  _.txt'
writing to: '&file  _.txt'
writing to: '&file  _.txt'
writing to: '&file  _.txt'
writing to: '1.txt'
writing to: '1.txt'
writing to: '2.txt'
writing to: '2.txt'
writing to: '2.txt'
writing to: '2.txt'
writing to: '2.txt'
writing to: '2.txt'
writing to: '3.txt'
writing to: '3.txt'
writing to: '3.txt'
writing to: '3.txt'
writing to: 'file $_%.txt'
writing to: 'file $_%.txt'
writing to: 'file $_%.txt'
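
In a plain POSIX sh without arrays, the positional parameters can play the same role. This is only a sketch of that alternative, not part of the script above; the temporary directory and the empty files are created purely for the demonstration.

```shell
#!/bin/sh
# Illustrative only: the directory and files are created just for this demo.
demo_dir=$(mktemp -d) || exit 2
cd "$demo_dir" || exit 2
: > infile
: > 'file $_%.txt'
: > '&file  _.txt'

# Replace the positional parameters with the input file followed by the glob matches.
set -- infile *.txt
arg_count=$#

# "$@" expands to one word per parameter, so names with spaces stay intact.
listing=$(printf '[%s]\n' "$@")
echo "$arg_count arguments:"
echo "$listing"
```

Passing "$@" to awk in place of "${array[@]}" would then hand over the same argument list.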

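The number of output file names must be at least equal to the number of splits to perform. If there are more names than splits, the extra ones are ignored.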
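
That condition can be checked before running the split. The following is a minimal sketch built around a hypothetical helper, enough_names, which counts the marker lines with grep -c and compares the count with the number of names given; the sample file is made up.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if there are at least as many names as marker lines.
# Usage: enough_names <source_file> <target_names...>
enough_names() {
    src=$1
    shift
    # grep -c prints the number of matching lines (0 if there are none)
    needed=$(grep -c '^# new file$' "$src")
    [ "$#" -ge "${needed:-0}" ]
}

# Illustrative sample with two marker lines.
sample=$(mktemp) || exit 2
printf '%s\n' '# new file' 'a' '# new file' 'b' > "$sample"

if enough_names "$sample" one.txt two.txt; then
    echo "two names: enough"
fi
if ! enough_names "$sample" one.txt; then
    echo "one name: not enough"
fi
```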
