Input:
<tr><td>FOOBAAR</td><td>FOOO</td><td>BAAR</td><td><font style=BACKGROUND-COLOR:red>2014-02-14 13:34</font></td><td><font style=BACKGROUND-COLOR:red>2014-02-17 13:34</font></td><td><font style=BACKGROUND-COLOR:red>2014-03-07 13:34</font></td></tr>
Output:
<tr><td>FOOBAAR</td><td>FOOO</td><td>BAAR</td><td>2014-02-14 13:34</td><td><font style=BACKGROUND-COLOR:red>2014-02-17 13:34</font></td><td><font style=BACKGROUND-COLOR:red>2014-03-07 13:34</font></td></tr>
The difference:
<font style=BACKGROUND-COLOR:red>
and
</font>
are removed from the fourth column only.
My question: how can I delete a given string from a given column only?
</td><td>
is the delimiter.
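Answer 1
It is better to use an HTML parsing tool than regular expressions. (A well-known answer explains why, here.)
Here is an example using an XML parser (note: the input must be well-formed XML, which the example HTML is not, so the attribute values are quoted and the row is wrapped in a table below).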
# change the value of the style attribute of the font tag of the 4th td tag
# to the empty string
xmlstarlet ed -O -u '//table/tr/td[4]/font[@style]/@style' -v "" <<END
<html><head></head><body><table>
<tr><td>FOOBAAR</td><td>FOOO</td><td>BAAR</td><td><font style="BACKGROUND-COLOR:red">2014-02-14 13:34</font></td><td><font style="BACKGROUND-COLOR:red">2014-02-17 13:34</font></td><td><font style="BACKGROUND-COLOR:red">2014-03-07 13:34</font></td></tr>
</table></body></html>
END
<html>
<head/>
<body>
<table>
<tr>
<td>FOOBAAR</td>
<td>FOOO</td>
<td>BAAR</td>
<td>
<font style="">2014-02-14 13:34</font>
</td>
<td>
<font style="BACKGROUND-COLOR:red">2014-02-17 13:34</font>
</td>
<td>
<font style="BACKGROUND-COLOR:red">2014-03-07 13:34</font>
</td>
</tr>
</table>
</body>
</html>
Answer 2
This might work:
#!/bin/sh
# remove specific strings from the fourth column only
INSTRING="<tr><td>FOOBAAR</td><td>FOOO</td><td>BAAR</td><td><font style=BACKGROUND-COLOR:red>2014-02-14 13:34</font></td><td><font style=BACKGROUND-COLOR:red>2014-02-17 13:34</font></td><td><font style=BACKGROUND-COLOR:red>2014-03-07 13:34</font></td></tr>"
DEL_STRING1="<font style=BACKGROUND-COLOR:red>"
DEL_STRING2="</font>"
DELIM="</td><td>"
# columns 1-4, rejoined with the delimiter
OUT_FIRST=$(printf '%s\n' "$INSTRING" | awk -F "$DELIM" '{print $1,$2,$3,$4}' OFS="$DELIM")
# drop the opening tag, then everything from the closing tag on (column 4 is the last field here)
OUT_FIRST=$(printf '%s\n' "$OUT_FIRST" | awk -F "$DEL_STRING1" '{print $1,$2}' OFS="")
OUT_FIRST=$(printf '%s\n' "$OUT_FIRST" | awk -F "$DEL_STRING2" '{print $1}')
# columns 5 onward, left untouched
OUT_LAST=$(printf '%s\n' "$INSTRING" | awk -F "$DELIM" '{print substr($0, index($0,$5))}' OFS="$DELIM")
echo "$OUT_FIRST$DELIM$OUT_LAST"
Answer 3
An awk one-liner:
$ awk -F '<\/td><td>' 'BEGIN{OFS=FS;} {gsub (/<font style=BACKGROUND-COLOR:red>/,"",$4); gsub (/<\/font>/,"",$4);}1' file 2>/dev/null
<tr><td>FOOBAAR</td><td>FOOO</td><td>BAAR</td><td>2014-02-14 13:34</td><td><font style=BACKGROUND-COLOR:red>2014-02-17 13:34</font></td><td><font style=BACKGROUND-COLOR:red>2014-03-07 13:34</font></td></tr>
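To answer the general form of the question (any column, any strings), the same awk idea can be parameterized. The following is only a sketch: the function name strip_in_column is made up for illustration, it assumes one <tr> row per line with </td><td> as the delimiter, and it deletes the strings literally via index() rather than as regular expressions, so the strings need no escaping.

```shell
#!/bin/sh
# strip_in_column COL STRING... (hypothetical helper, not part of any tool)
# Reads rows on stdin and deletes each literal STRING from column COL only.
strip_in_column() {
    col=$1
    shift
    awk -v col="$col" -F '</td><td>' '
        BEGIN {
            OFS = FS
            # the remaining arguments are the literal strings to delete
            n = ARGC - 1
            for (i = 1; i <= n; i++) del[i] = ARGV[i]
            ARGC = 1    # keep awk from treating them as file names
        }
        {
            for (i = 1; i <= n; i++) {
                len = length(del[i])
                # index() matches literally; loop until no occurrence is left
                while (len > 0 && (p = index($col, del[i])) > 0)
                    $col = substr($col, 1, p - 1) substr($col, p + len)
            }
            print
        }
    ' "$@"
}

# Example: strip_in_column 4 "<font style=BACKGROUND-COLOR:red>" "</font>" < file
```

The strings are handed over through ARGV instead of -v so that awk does not interpret backslash escapes in them.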
Answer 4
# mark the 3rd delimiter, i.e. the one right before the 4th column
sed 's|</td><td>|</td>\nTGT_LINE_MARKER<td>|3' |
sed '\|TGT_LINE_MARKER|{function applied to target field}'
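A concrete variant of this marker idea can be written as a single sed call. This is a sketch under a few assumptions: GNU sed (for \n in the replacement), one <tr> row per line, and cell text that contains no < character. It marks the third delimiter, since that is the one immediately before the fourth column, and uses a newline as the marker because a newline cannot otherwise occur in the pattern space. The function name strip_font_col4 is made up for illustration.

```shell
#!/bin/sh
# strip_font_col4 (hypothetical name): remove the red <font> wrapper from
# column 4 only. Assumes GNU sed and one table row per input line.
strip_font_col4() {
    # 1) put a newline marker right after the 3rd </td><td>
    # 2) unwrap the <font>...</font> that directly follows the marker
    # 3) remove the marker if step 2 did not match
    sed -e 's|</td><td>|&\n|3' \
        -e 's|\n<font style=BACKGROUND-COLOR:red>\([^<]*\)</font>|\1|' \
        -e 's|\n||'
}

# Example: strip_font_col4 < file
```

Rows with fewer than four columns never receive a marker and pass through unchanged.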