Find lines longer than X in JSON and delete the entire object

Answer 1

Here is a Python script that does what you want:

#!/usr/bin/env python
# -*- coding: ascii -*-
"""filter.py"""

import sys

# Get the file and the maximum line length as command-line arguments
filepath = sys.argv[1]
maxlen = int(sys.argv[2])

# Initialize a list to store the lines that are kept
lines = []

# Read the data file line by line
with open(filepath, 'r') as jsonfile:
    for line in jsonfile:

        # Only consider non-blank lines
        if line.strip():

            # For "text" lines that are too long, remove the previous line
            # (the opening brace) and skip the next two lines (the "author"
            # line and the closing brace). Note that len(line) counts the
            # indentation and the trailing newline.
            if '"text"' in line and len(line) > maxlen:
                lines.pop()
                next(jsonfile, None)
                next(jsonfile, None)
            # Add all other lines to the list
            else:
                lines.append(line)

# Strip the trailing comma from the last remaining object, if any
if len(lines) >= 2:
    lines[-2] = lines[-2].rstrip().rstrip(',') + '\n'

# Output the lines from the list
for line in lines:
    sys.stdout.write(line)

You can run it like this:

python filter.py data.json 34

Suppose you have the following data file:

[
    {
    "text": "blah blah blah one",
    "author": "John Doe"
    },
    {
    "text": "blah blah blah two",
    "author": "John Doe"
    },
    {
    "text": "blah blah blah three",
    "author": "John Doe"
    }
]

Running the script as described then produces the following output:

[
    {
    "text": "blah blah blah one",
    "author": "John Doe"
    },
    {
    "text": "blah blah blah two",
    "author": "John Doe"
    }
]
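
The script above depends on the exact line layout of the file (one key per line, "text" first, three lines per object after the opening brace). If the file is small enough to load into memory, a parser-based variant may be more robust. The following is only a minimal sketch under assumptions that are not part of the original answer: the helper name drop_long_text_objects is hypothetical, the top-level value is assumed to be a list of objects, and the limit applies to the length of the "text" value itself rather than to the raw line, so the threshold differs from the 34 used above.

```python
import json


def drop_long_text_objects(json_text, max_text_len, key="text"):
    """Return JSON text without the objects whose `key` value is too long.

    Hypothetical helper: assumes the top-level JSON value is a list of
    objects and measures the string value, not the raw line in the file.
    """
    records = json.loads(json_text)
    kept = [
        rec for rec in records
        if len(str(rec.get(key, ""))) <= max_text_len
    ]
    # Re-serializing handles commas and brackets, so no manual cleanup is needed
    return json.dumps(kept, indent=4)


sample = """[
    {"text": "blah blah blah one", "author": "John Doe"},
    {"text": "blah blah blah two", "author": "John Doe"},
    {"text": "blah blah blah three", "author": "John Doe"}
]"""

# "blah blah blah three" is 20 characters long, so a limit of 18 drops it
print(drop_long_text_objects(sample, 18))
```

Because the result is re-serialized by the json module, the output is valid JSON, but the original indentation and key layout of the file are not preserved.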
