AWK: search a large file and write out the variable names

Answer 1

You basically end up writing a small state machine.

awk '
    BEGIN { 
        FS = ","
        OFS = " "    # this is the default
    }

    # create the output file name
    # on the first line of the input, the FILENAME variable will be populated
    FNR == 1 {
        f = FILENAME
        sub(/\.out$/, ".cm", f)    # anchor at the end so only the extension is replaced
    }

    # I assume this is the magic closing line.
    # All the backslashes and regular-expression metacharacters 
    # have to be backslash-escaped
    /1\\1\\GINC-C0959\\FOpt\\RB3LYP\\6-31G\(d\)\\C5H9Cl1O1\\SKYLERS\\10-Sep-2013\\0\\\\#/ {
        print "got end"
        exit
    }

    started && /Variables:/ {
        variables = 1
        FS = "="
        next
    }

    started && !variables {
        # do stuff with comma-separated lines
        # rewrite the file using space as separator
        # this looks weird, but it forces awk to re-write the line using OFS
        $1 = $1
        print > f
    }
    started && variables {
        # do stuff with "="-separated lines
        # the FS here is "=", so there should be 2 fields.
        printf "%-5s %15.8f\n", $1, $2 > f
    }

    !started && /Final structure in terms of initial Z-matrix/ {
        started = 1
    }
' abc.out

Given your input, this creates the file "abc.cm" with the following contents:

Cl
C 1 B1
C 2 B2 1 A2
H 2 B3 1 A3 3 D3 0
H 2 B4 1 A4 3 D4 0
C 3 B5 2 A5 1 D5 0
C 6 B6 3 A6 2 D6 0
C 7 B7 6 A7 3 D7 0
H 3 B8 2 A8 1 D8 0
H 3 B9 2 A9 1 D9 0
H 7 B10 6 A10 3 D10 0
H 7 B11 6 A11 3 D11 0
H 8 B12 7 A12 6 D12 0
H 8 B13 7 A13 6 D13 0
H 8 B14 7 A14 6 D14 0
O 6 B15 3 A15 2 D15 0
B1         1.81746475
B2         1.52136867
A2       110.80057513
B3         1.08989670
A3       106.92512231
D3      -121.94499481
B4         1.08989406
A4       106.92581701
D4       121.94497834
B5         1.52808963
A5       111.92359259
D5       179.99770382
B6         1.52319300
A6       116.49970868
D6       179.97424974
B7         1.52739317
A7       113.56269053
D7       179.98802896
B8         1.09816794
A8       110.50682514
D8        58.28546880
B9         1.09816384
A9       110.50888758
D9       -58.28349045
B10        1.10022643
A10      107.84652382
D10       56.40290615
B11        1.10022793
A11      107.84460667
D11      -56.42958848
B12        1.09398015
A12      110.97242167
D12      -59.62466169
B13        1.09473047
A13      110.53459142
D13      179.99742235
B14        1.09397826
A14      110.97204350
D14       59.61905862
B15        1.21736254
A15      121.22780588
D15       -0.02140167
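
The same three-state idea (before the start marker, inside the comma-separated Z-matrix, inside the "="-separated variables) can be sketched in Python to check the output format on a handful of lines. This is only an illustration: the function name `convert` and the sample lines are made up, and ending the variables block at the first line without "=" is an assumption rather than what the awk script does.

```python
def convert(lines):
    """Three-state filter: skip until the start marker, then rewrite
    comma-separated rows, then format name=value rows."""
    state = "before"
    out = []
    for line in lines:
        line = line.strip()
        if state == "before":
            if "Final structure in terms of initial Z-matrix" in line:
                state = "zmatrix"
        elif state == "zmatrix":
            if "Variables:" in line:
                state = "variables"
            else:
                # same effect as $1 = $1 with FS="," and OFS=" "
                out.append(" ".join(line.split(",")))
        elif state == "variables":
            if "=" not in line:
                break  # assumption: anything else ends the block
            name, value = line.split("=", 1)
            out.append("%-5s %15.8f" % (name, float(value)))
    return out


# Hypothetical sample input, not taken from a real Gaussian log
sample = [
    "some header text",
    " Final structure in terms of initial Z-matrix:",
    " Cl",
    " C,1,B1",
    " Variables:",
    " B1=1.81746475",
    " A2=110.80057513",
    " trailing text",
]
for row in convert(sample):
    print(row)
```

Each formatted variable row is 21 characters wide (5 for the name, one space, 15 for the number), which matches the alignment in the listing above.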

Answer 2

Here is a Python script.

#!/usr/bin/env python
from __future__ import print_function
import os
import sys

START_STR = 'Final structure in terms of initial Z-matrix'
STOP_STR = '1\\1\\GINC-C0959\\FOpt\\RB3LYP\\6-31G(d)\\C5H9Cl1O1\\SKYLERS\\10-Sep-2013\\0\\\\#'

def main(input_file, output_file):
    in_variables, started = False, False
    for line in input_file:
        line = line.strip()
        if START_STR in line:
            started = True
            continue
        if STOP_STR in line:
            break
        if not started:
            continue
        if in_variables:
            # "name=value" lines: separate the two parts with a tab
            print('\t'.join(line.split('=')), file=output_file)
        elif 'Variables' in line:
            # switch to the variables section and emit a blank separator line
            in_variables = True
            print(file=output_file)
        else:
            # comma-separated Z-matrix lines: rewrite with spaces
            print(' '.join(line.split(',')), file=output_file)

if __name__ == '__main__':
    if len(sys.argv) != 2:
        print("\nUsage:\t", sys.argv[0], '<InputFile>\n', file=sys.stderr)
        sys.exit(1)
    # create the output file name: replace the extension with ".cm"
    out_name = os.path.splitext(sys.argv[1])[0] + '.cm'
    try:
        with open(sys.argv[1], 'r') as infile, open(out_name, 'w') as outfile:
            main(infile, outfile)
    except IOError:
        print("Problem with opening file", sys.argv[1], file=sys.stderr)
        sys.exit(1)
    print("Your final output is saved in:", out_name)
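
One detail worth checking in a script like this is how the output name is derived from the input name. The sketch below compares splitting at the first dot with `os.path.splitext`; the two helper names and the sample paths are hypothetical and exist only to show the difference.

```python
import os.path


def name_by_first_dot(path):
    # Keeps only the text before the FIRST dot anywhere in the path
    return path.split('.')[0] + ".cm"


def name_by_splitext(path):
    # Replaces only the final extension of the last path component
    return os.path.splitext(path)[0] + ".cm"


for p in ["abc.out", "run.v2.out", "./data/abc.out"]:
    print(p, "->", name_by_first_dot(p), "|", name_by_splitext(p))
```

For a plain name such as "abc.out" both give "abc.cm", but for a path that starts with "./" or contains more than one dot, only the `splitext` variant keeps the rest of the name intact.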

Answer 3

This can also be done in Perl with the flip-flop (range) operator. Enter the following in the shell:

INFILE="abc.out"           #Quotes only necessary ...
OUTFILE="${INFILE%.*}".cm  # ... if you have spaces in the file names
perl -nle '
    if(m{\QFinal structure in terms of initial Z-matrix:\E} ..
       m{\Q1\1\GINC-C0959\FOpt\RB3LYP\6-31G(d)\C5H9Cl1O1\SKYLERS\10-Sep-2013\0\\#\E}){
        (s/,/ /g or !/=|:/) and print;
        /([^=]+)=([^=]+)/ and printf "%-4s %13.8f\n", $1,$2
     }
     ' "$INFILE" > "$OUTFILE"
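
For readers who have not met the flip-flop operator, its behavior can be approximated in Python as a small generator. This is a rough sketch of the two-dot form only, with a made-up function name `flip_flop` and made-up sample data; it is not how Perl implements it.

```python
import re


def flip_flop(lines, start_re, stop_re):
    """Yield lines from the first one matching start_re through the next
    one matching stop_re, inclusive; then wait for start_re again."""
    active = False
    for line in lines:
        if not active and re.search(start_re, line):
            active = True
        if active:
            yield line
            # the stop test also runs on the line that switched the range on
            if re.search(stop_re, line):
                active = False


# Hypothetical sample data
sample = ["a", "BEGIN", "x", "y", "END", "z"]
print(list(flip_flop(sample, r"BEGIN", r"END")))
```

Both marker lines fall inside the range, so the body of the loop has to filter them out if they should not be printed.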

Answer 4

Here is another awk version.

awk -f- <<\EOF data
    FNR==1 { f = FILENAME".new" }
    /Final structure in terms of initial Z-matrix:/ {
        FS=","
        while ( getline > 0 ) {
            if ( $0 ~ /Variables:/ ) break
            $1=$1
            print $0 > f
        }
        FS="="
        while ( getline > 0 ) {
            if( NF == 2 ) {
                printf "%-5s%15.8f\n", $1, $2 > f
            } else {
                break
            }
        }
    }
EOF
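
The structure here differs from a flag-based state machine: once the start marker is seen, the action itself keeps pulling records with `getline` in two consecutive loops. A loose Python analogue, assuming a made-up function name `extract` and made-up sample lines, shares one iterator between an outer loop and two inner loops:

```python
def extract(lines):
    it = iter(lines)  # one shared iterator, like awk's single input stream
    out = []
    for line in it:
        if "Final structure in terms of initial Z-matrix:" not in line:
            continue
        # first inner loop: comma-separated rows until "Variables:"
        for line in it:
            if "Variables:" in line:
                break
            out.append(" ".join(line.strip().split(",")))
        # second inner loop: name=value rows until a line without two fields
        for line in it:
            parts = line.strip().split("=")
            if len(parts) != 2:
                break
            out.append("%-5s%15.8f" % (parts[0], float(parts[1])))
    return out


# Hypothetical sample input
sample = [
    "header",
    " Final structure in terms of initial Z-matrix:",
    " Cl",
    " C,1,B1",
    " Variables:",
    " B1=1.81746475",
    " end of block",
]
for row in extract(sample):
    print(row)
```

As in the awk script, the line that ends the second loop is consumed and not examined again by the outer loop.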
