How to read a null-terminated string from a binary file

Answer 1

Option 1: Direct variable assignment

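If null bytes are your only concern, you should be able to read the file's data directly into a variable using whichever standard method you prefer. In other words, you can read the data and simply ignore the null bytes. Command substitution in shells such as bash and BusyBox ash drops null bytes from the captured output (bash also prints a warning when it does so). Here is an example using the `cat` command and command substitution: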

$ data="$(cat eeprom)"
$ echo "${data}"
MAC_ADDRESS=12:34:56:78:90,PCB_MAIN_ID=m/SF-1V/MAIN/0.0,PCB_PIGGY1_ID=n/SF-1V/PS/0.0,CSL_HW_VARIANT=D

This worked for me in a BusyBox Docker container.
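
Because shells differ in how command substitution treats null bytes, it can be more predictable to strip them explicitly before the data reaches the variable. The following is a minimal sketch of that idea, not a tested recipe for any particular device; the file name `sample.bin` and its contents are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample file: three null bytes of padding, a string,
# then two trailing null bytes.
printf '\000\000\000KEY=value\000\000' > sample.bin

# Delete every null byte with tr, so the result does not depend on
# how the shell itself handles nulls in command substitution.
data="$(tr -d '\000' < sample.bin)"
printf '%s\n' "${data}"
```

Note that this removes all null bytes, so several adjacent null-terminated strings would be joined together; it fits the case where the file holds one string surrounded by padding.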

Option 2: Using `xxd` and a `for` loop

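If you want more control, you can use `xxd` to convert the bytes to hexadecimal strings and iterate over those strings, applying whatever logic you want as you go. For example, you could explicitly skip the initial null values and then print the rest of the data until some stop condition is reached.

The following script specifies an "allow list" of valid characters (ASCII 32 to 127), treats any subsequence of other characters as a separator, and extracts all of the valid substrings.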

#!/bin/sh
# get_hex_substrings.sh

# Get the path to the data-file as a command-line argument
datafile="$1"

# Keep track of state using shell variables
inside_padding_block="true"
inside_bad_block="false"

# NOTE: The '-p' flag is for "plain" output (no additional formatting)
# and the '-c 1' option specifies that the representation of each byte
# will be printed on a separate line
for h in $(xxd -p -c 1 "${datafile}"); do

    # Convert the hex byte to standard decimal
    d="$((0x${h}))"

    # Bytes are printed via an octal escape below, because '\x' escapes
    # in printf and '==' in [ ] are not portable across /bin/sh shells

    # Case where we're still inside the initial padding block
    if [ "${inside_padding_block}" = "true" ]; then
        if [ "${d}" -ge 32 ] && [ "${d}" -le 127 ]; then
            inside_padding_block="false"
            printf "\\$(printf '%03o' "${d}")"
        fi

    # Case where we're past the initial padding, but inside another
    # block of non-printable characters
    elif [ "${inside_bad_block}" = "true" ]; then
        if [ "${d}" -ge 32 ] && [ "${d}" -le 127 ]; then
            inside_bad_block="false"
            printf "\\$(printf '%03o' "${d}")"
        fi

    # Case where we're inside of a substring that we want to extract
    else
        if [ "${d}" -ge 32 ] && [ "${d}" -le 127 ]; then
            printf "\\$(printf '%03o' "${d}")"
        else
            inside_bad_block="true"
            echo
        fi
    fi
done

# Terminate the last substring if the file did not end with a separator
if [ "${inside_bad_block}" = "false" ]; then
    echo
fi

Now we can test this by generating a sample file in which subsequences of `\x00` and `\xff` bytes separate the substrings.

printf '\x00\x00\x00string1\xff\xff\xffstring2\x00\x00\x00string3\x00\x00\x00' > data.hex

Here is the output we get when we run the script:

$ sh get_hex_substrings.sh data.hex
string1
string2
string3
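
On systems where `xxd` is not installed, the same byte-by-byte idea can be sketched with `od` instead. This is a variant under stated assumptions rather than the script above: it assumes an `od` that supports the POSIX `-A`, `-t`, and `-v` options, the function name `extract_strings` and the file name `data.bin` are hypothetical, and the state tracking is collapsed into a single flag.

```shell
#!/bin/sh
# extract_strings is a hypothetical helper name used only for this sketch.
extract_strings() {
    in_string="false"
    # od -An -v -tu1 prints every byte as an unsigned decimal number,
    # with no offset column and no collapsing of repeated lines.
    for d in $(od -An -v -tu1 "$1"); do
        if [ "${d}" -ge 32 ] && [ "${d}" -le 127 ]; then
            in_string="true"
            # Emit the byte through a three-digit octal escape.
            printf "\\$(printf '%03o' "${d}")"
        elif [ "${in_string}" = "true" ]; then
            # First separator byte after a substring: end the line.
            in_string="false"
            echo
        fi
    done
    # Terminate the last substring if the file ended without a separator.
    if [ "${in_string}" = "true" ]; then
        echo
    fi
}

# Sample data written with octal escapes (\377 is the byte 0xff).
printf '\000\000\000string1\377\377\377string2\000\000\000string3\000\000\000' > data.bin
extract_strings data.bin
```

Using a single flag means the leading padding and later separator blocks are handled by the same branch, at the cost of no longer distinguishing between them.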

Option 3: Using the `tr` and `cut` commands

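You could also try using the `tr` and `cut` commands to handle the null bytes. Here is an example that extracts the first null-terminated string from a list of null-terminated strings by squeezing adjacent null characters and converting them to newlines. Because the file starts with null bytes, the squeezed output begins with an empty field, which is why the first string is selected as field 2.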

$ printf '\000\000\000string1\000\000\000string2\000\000\000string3\000\000\000' > file.dat
$ tr -s '\000' '\n' < file.dat | cut -d$'\n' -f2
string1
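
To process every string rather than only the first, the same `tr` pipeline can feed a `while read` loop. This is a minimal sketch, assuming the strings themselves contain no newlines; the function name `list_strings` and the `found:` prefix are made up for illustration.

```shell
#!/bin/sh
# list_strings is a hypothetical helper name used only for this sketch.
list_strings() {
    # Squeeze each run of null bytes into a single newline, then read the
    # result line by line, skipping the empty line left by leading padding.
    tr -s '\000' '\n' < "$1" | while IFS= read -r s; do
        if [ -n "${s}" ]; then
            printf 'found: %s\n' "${s}"
        fi
    done
}

printf '\000\000\000string1\000\000\000string2\000\000\000string3\000\000\000' > file.dat
list_strings file.dat
```

Setting `IFS=` and passing `-r` keeps leading whitespace and backslashes in each string intact.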
