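I don't have access to a pure Linux box. I have many XML files that are missing a filename reference line. I need to insert that line at a specific position in each XML file, generating it from the XML file's own name with a small transformation.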
Example: 24ToLife_AFamilyDivided_191045_DANY.xml currently contains:
<description>Entrepreneur James overcame unconscionable childhood abuse before the sins of his past came back to haunt him.</description>
<media:rating>TV-14</media:rating>
It needs to read:
<description>Entrepreneur James overcame unconscionable childhood abuse before the sins of his past came back to haunt him.</description>
<media:content url="24ToLife_AFamilyDivided_191045.mpg" type="video/mpg" expression="full" />
<media:rating>TV-14</media:rating>
Answer 1
I just wrote and tested this on macOS High Sierra.
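#!/bin/sh
# Add a media:content line after the first line of every XML file.
for fl in *.xml
do
    # Drop the extension and the trailing _DANY to get the media file name
    filename=$(echo "$fl" | cut -f 1 -d '.' | sed 's/_DANY$//')
    # BSD sed: -i .orig keeps a backup, 1a appends the text after line 1
    sed -i .orig '1a\
<media:content url="'"$filename"'.mpg" type="video/mpg" expression="full" /> \
' "$fl"
done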
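for fl in *.xml: loop over the XML files in the current directory
-i .orig: edit in place, keeping a backup of each original file with the .orig suffix
'1a ...': append the text after line 1, so it becomes the second line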
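For readers who would rather not depend on a particular sed flavour, the same two steps (keep a backup, then add text after line 1) can be sketched in Python. This is only an illustration under assumptions: the function name append_after_first_line and its backup_suffix parameter are made up here and are not part of the script above.

```python
import shutil
from pathlib import Path


def append_after_first_line(path, text, backup_suffix=".orig"):
    """Insert `text` as the second line of `path`, keeping a backup copy."""
    path = Path(path)
    # Like `sed -i .orig`: keep the untouched original next to the file.
    shutil.copy2(path, path.with_name(path.name + backup_suffix))
    lines = path.read_text().splitlines(keepends=True)
    if not lines:
        # An empty file has no line 1, so there is nothing to append after.
        return
    # Make sure line 1 ends with a newline so the new text starts on its own line.
    if not lines[0].endswith("\n"):
        lines[0] += "\n"
    # Like `1a`: the new text goes right after line 1.
    lines.insert(1, text + "\n")
    path.write_text("".join(lines))
```

Calling append_after_first_line("tt.xml", '<media:content url="tt.mpg" type="video/mpg" expression="full" />') would leave the original in tt.xml.orig and put the new line second in tt.xml.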
BSD sed on macOS differs slightly from GNU sed, so the text to append has to go on its own line after the backslash:
'1a \ # backslash and newline
some text'
The \n newline escape is not recognized, so you have to write:
'1a \
some text # newline here
'
instead of:
'1a \
some text\n'
Usage:
yurijs-MacBook-Pro:sed yurij$ cat *.xml
<description>Entrepreneur James overcame unconscionable childhood abuse before the sins of his past came back to haunt him.</description>
<media:rating>TV-14</media:rating>
<description>Entrepreneur James overcame unconscionable childhood abuse before the sins of his past came back to haunt him.</description>
<media:rating>TV-14</media:rating>
yurijs-MacBook-Pro:sed yurij$ ./cli
yurijs-MacBook-Pro:sed yurij$ cat *.xml
<description>Entrepreneur James overcame unconscionable childhood abuse before the sins of his past came back to haunt him.</description>
<media:content url="24ToLife_AFamilyDivided_191045.mpg" type="video/mpg" expression="full" />
<media:rating>TV-14</media:rating>
<description>Entrepreneur James overcame unconscionable childhood abuse before the sins of his past came back to haunt him.</description>
<media:content url="tt.mpg" type="video/mpg" expression="full" />
<media:rating>TV-14</media:rating>
Answer 2
Here is a Python script that does what you want:
#!/usr/bin/env python
# -*- encoding: ascii -*-
"""insert_xml.py"""
import sys
from bs4 import BeautifulSoup as Soup

# Get the filename from the command-line
filename = sys.argv[1]

with open(filename, 'r') as xmlfile:
    # Parse the file
    soup = Soup(xmlfile.read(), "html.parser")

# Search for "description" tags
for element in soup.findAll("description"):
    # Check to see if the "media:content" element is missing
    if element and not element.find_next_sibling("media:content"):
        # If so, construct a new "media:content" tag
        new_tag = soup.new_tag('media:content')
        new_tag["url"] = filename
        new_tag["type"] = "video/mpg"
        new_tag["expression"] = "full"
        # Insert the "media:content" tag after the "description" tag
        element.insert_after(new_tag)

# Print the modified XML document - one element per line
for element in soup.findAll():
    print(element)
Here it is in action:
$ python insert_xml.py in.xml
<description>Entrepreneur James overcame unconscionable childhood abuse before the sins of his past came back to haunt him.</description>
<media:content expression="full" type="video/mpg" url="in.xml"></media:content>
<media:rating>TV-14</media:rating>
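Note that the run above writes the XML file's own name (url="in.xml") into the new tag, while the question asks for the name with _DANY removed and an .mpg extension. A minimal sketch of that transformation follows, assuming the suffix to strip is always _DANY; the helper name media_content_line is hypothetical and not part of the script above.

```python
from pathlib import Path


def media_content_line(xml_name, suffix="_DANY"):
    """Build the media:content line for an XML file name."""
    # Drop any directory part and the .xml extension.
    stem = Path(xml_name).stem
    # Remove the trailing suffix (assumed to be _DANY) if it is present.
    if suffix and stem.endswith(suffix):
        stem = stem[: -len(suffix)]
    return f'<media:content url="{stem}.mpg" type="video/mpg" expression="full" />'


print(media_content_line("24ToLife_AFamilyDivided_191045_DANY.xml"))
# <media:content url="24ToLife_AFamilyDivided_191045.mpg" type="video/mpg" expression="full" />
```

In insert_xml.py, the same stem could be assigned to new_tag["url"] in place of filename.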