Script to remove certain span elements from an HTML file

Question 1

Perl can do this, even when the element spans a line break.

Save the following to a file (call it example.html).

<p>Here is some <span>foo bar</span> example text.</p>
<p>Some text even <span>foo
bar</span> spans across line breaks.</p>

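Then try the following. The -0777 switch makes Perl read the whole file as a single string, and the /s modifier lets . match newlines, so the pattern can cross line breaks.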

$ perl -0777 -pe 's/<span.*?<\/span>//gs' example.html
<p>Here is some  example text.</p>
<p>Some text even  spans across line breaks.</p>

Question 2

If the HTML is well-formed XML, you can use an XML processing tool such as xmlstarlet. Assuming the file is named original.html:

xmlstarlet ed -O -d '/html//span[@class = "foo"]' original.html

Output

<html>
  <head>
    <title>hello world</title>
  </head>
  <body>
lorem ipsum

alpha beta
  </body>
</html>
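
The input file is not shown in the answer, so the following is only a sketch under stated assumptions: the contents of original.html below are hypothetical, chosen so that one span carries class "foo" and another does not. It runs the same xmlstarlet command and saves the result to cleaned.html (also a name picked for illustration), skipping the step if xmlstarlet is not installed.

```shell
#!/bin/sh
# Hypothetical input file; the real original.html is not shown in the answer.
cat > original.html <<'EOF'
<html>
  <head>
    <title>hello world</title>
  </head>
  <body>
lorem <span class="foo">remove me</span>ipsum
<span class="bar">alpha</span> beta
  </body>
</html>
EOF

if command -v xmlstarlet >/dev/null 2>&1; then
  # ed  edit mode
  # -O  omit the XML declaration from the output
  # -d  delete every node matched by the XPath expression
  xmlstarlet ed -O -d '/html//span[@class = "foo"]' original.html > cleaned.html
  cat cleaned.html
else
  echo "xmlstarlet is not installed; skipping" >&2
fi
```

Only the span with class "foo" is deleted; the span with class "bar" is left alone, and original.html itself is not modified. If the document declares a default namespace, as XHTML does, the unprefixed names in the XPath expression will not match, and a prefix has to be bound with xmlstarlet's -N option.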
