Parse a .txt file to generate a .csv

2024-5-26

I have a text file with the following content:

Torrent file  : Linux.Format.-.October.2016.-.True.Pdf.-.Set.1001.[ECLiPSE].torrent
Metadata info : 9968 bytes, 412 pieces, 65536 bytes per piece
Torrent name  : Linux Format - October 2016 - True Pdf - Set 1001 [ECLiPSE]
Content info  : 3 files, 26965176 bytes
Announce URL  : http://explodie.org:6969/announce

F#  Bytes       File name
--- ----------- ---------------------------------------------------------------
  1    26944026 linfor1016.pdf
  2       19963 ECLiPSE.txt
  3        1187 Read Me.txt

Torrent file  : linuxmint-13-cinnamon-dvd-64bit.iso.torrent
Metadata info : 32303 bytes, 1602 pieces, 524288 bytes per piece
Torrent name  : linuxmint-13-cinnamon-dvd-64bit.iso
Content info  : single file, 839909376 bytes
Announce URL  : http://torrents.linuxmint.com/announce.php
Torrent file  : linuxmint-13-kde-dvd-64bit.iso.torrent
Metadata info : 35938 bytes, 1784 pieces, 524288 bytes per piece
Torrent name  : linuxmint-13-kde-dvd-64bit.iso
Content info  : single file, 935329792 bytes
Announce URL  : http://torrents.linuxmint.com/announce.php

The file is generated with:

for i in *.torrent; do torrentcheck -t "$i" >> info.txt; done

Now I would like to convert this txt file into a csv file with two columns, namely Torrent file and Content info (as headers), with one row for each torrent file parsed by the bash command above, for example:

Torrent file,Content info 
Linux.Format.-.October.2016.-.True.Pdf.-.Set.1001.[ECLiPSE].torrent,3 files, 26965176 bytes
linuxmint-13-cinnamon-dvd-64bit.iso.torrent,single file, 839909376 bytes
linuxmint-13-kde-dvd-64bit.iso.torrent,single file, 935329792 bytes

These columns can then be processed further in a spreadsheet application, so that the torrents can be sorted by size or by number of files.

I can search for the relevant lines with:

grep 'Torrent file' info.txt or grep 'Content' info.txt

But how do I use the returned text to extract just the required information? For example, from the line Torrent file : linuxmint-13-cinnamon-dvd-64bit.iso.torrent I could use the spreadsheet MID and LEN functions to cut the string down to linuxmint-13-cinnamon-dvd-64bit.iso.torrent

Answer 1

A simple awk script can parse the data like this:

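awk -F': ' 'BEGIN { print "Torrent file,Content info,Size" }
$0~/^Torrent file/ { save = $2 }
$0~/^Content info/ { printf "%s,%s\n",save,$2 }' <info.txt

This splits each line on ": ", saves the second field of every "Torrent file" line, and prints it together with the second field of the next "Content info" line. Because the Content info value itself contains a comma (for example "single file, 839909376 bytes"), each output row ends up with three comma-separated fields, which is why the header names a third column, Size.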

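If exactly two columns are wanted, as in the question, one option is to quote each field so that the comma inside the Content info value is not treated as a separator. The following is only a sketch under that assumption, not part of the original answer: the file names sample.txt and out.csv and the helper functions value() and q() are made up for illustration, and the sample input is a trimmed copy of the data shown above.

```shell
#!/bin/sh
# Trimmed sample in the same layout as info.txt (hypothetical file name).
cat > sample.txt <<'EOF'
Torrent file  : Linux.Format.-.October.2016.-.True.Pdf.-.Set.1001.[ECLiPSE].torrent
Content info  : 3 files, 26965176 bytes
Torrent file  : linuxmint-13-cinnamon-dvd-64bit.iso.torrent
Content info  : single file, 839909376 bytes
EOF

awk '
# Text after the first ": " on the line, so a value containing ": " stays whole.
function value(line) { return substr(line, index(line, ": ") + 2) }
# Wrap a field in double quotes and double any embedded quote characters.
function q(s) { gsub(/"/, "\"\"", s); return "\"" s "\"" }
BEGIN { print "Torrent file,Content info" }
/^Torrent file/ { name = value($0) }                 # remember the torrent file name
/^Content info/ { print q(name) "," q(value($0)) }   # emit one row per torrent
' sample.txt > out.csv

cat out.csv
```

Run against the real data by replacing sample.txt with info.txt. Most spreadsheet applications should then import the result as two columns; sorting by size would still require splitting the byte count out of the second column.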