What's the fastest way to generate a 1 GB text file containing random digits?

Answer 1

This:

 LC_ALL=C tr '\0-\377' \
             '[0*25][1*25][2*25][3*25][4*25][5*25][6*25][7*25][8*25][9*25][x*]' \
    < /dev/urandom |
    tr -d x |
    fold -w 1 |
    paste -sd "$(printf '%99s\\n')" - |
    head -c1G

(assuming a head implementation that supports -c) appears to be reasonably fast on my system.

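tr translates the whole byte range (0 to 255, 0 to 0377 in octal): the first 25 bytes become 0, the next 25 become 1, and so on up to 25 bytes that become 9; the rest (250 to 255) become "x", which we then discard with tr -d x. We want a uniform distribution (assuming /dev/urandom has a uniform distribution itself), so the six leftover byte values are dropped rather than allowed to bias some digits.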
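
The mapping can be checked by feeding each of the 256 byte values through the same two tr stages exactly once and counting what comes out. This is only a sketch: the helper names all_bytes and digit_histogram are made up for the illustration and are not part of the original pipeline.

```shell
# Emit every byte value 0..255 exactly once, using printf octal escapes.
all_bytes() {
    i=0
    while [ "$i" -lt 256 ]; do
        printf "\\$(printf '%03o' "$i")"
        i=$((i + 1))
    done
}

# Run those 256 bytes through the same two tr stages as the pipeline,
# then count how many times each digit appears.
digit_histogram() {
    all_bytes |
        LC_ALL=C tr '\0-\377' \
            '[0*25][1*25][2*25][3*25][4*25][5*25][6*25][7*25][8*25][9*25][x*]' |
        tr -d x |
        fold -w 1 |
        sort |
        uniq -c
}

digit_histogram
```

Each of the ten digits should be reported 25 times, which accounts for 250 of the 256 byte values; the remaining six are the ones turned into "x" and deleted.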

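That produces one digit for 250 of every 256 bytes (about 97%) read from /dev/urandom. fold -w 1 makes it one digit per line. paste -s is called with a list of separators that consists of 99 space characters and one newline character, so as to have 100 space-separated digits on each line.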
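
The way paste cycles through its delimiter list is easier to see at a smaller scale. The sketch below uses 5 digits per line instead of 100, so the list is 4 spaces followed by a newline; join_digits is a hypothetical helper name.

```shell
# Same fold + paste technique, scaled down to 5 digits per line.
# printf '%4s\\n' yields four spaces followed by the two characters \n,
# which paste -d interprets as a newline delimiter.
join_digits() {
    fold -w 1 | paste -sd "$(printf '%4s\\n')" -
}

printf '1234567890' | join_digits
```

With the ten-digit input above this should print two lines, "1 2 3 4 5" and "6 7 8 9 0". Replacing %4s with %99s gives the 100-digit lines used in the real pipeline.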

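head -c1G takes the first GiB (2³⁰ bytes) of that. Note that the last line will be truncated and undelimited. You could truncate to 2³⁰-1 bytes and add the missing newline by hand, or truncate to 10⁹ bytes instead, which is 5 million of those 200-byte lines (head -n 5000000 would also make it a standard/portable command).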
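
The sizes involved can be worked out with plain shell arithmetic. This is a small sketch; the variable names are chosen only for the illustration.

```shell
digits_per_line=100
# 100 digits + 99 separating spaces + 1 newline
bytes_per_line=$((digits_per_line + (digits_per_line - 1) + 1))

gib=$((1 << 30))                              # what head -c1G takes
lines_in_1e9=$((1000000000 / bytes_per_line)) # whole lines in 10^9 bytes
leftover_in_gib=$((gib % bytes_per_line))     # bytes of the cut-off last line

echo "bytes per line: $bytes_per_line"
echo "full lines in 10^9 bytes: $lines_in_1e9"
echo "partial-line bytes left by head -c1G: $leftover_in_gib"
```

A line is 200 bytes, so 10⁹ bytes is a whole number of lines (5,000,000), whereas 2³⁰ is not a multiple of 200 and leaves a 24-byte fragment at the end.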

These timings (obtained by zsh on a quad-core system) give an indication of where the CPU time is spent:

LC_ALL=C tr '\0-\377'  < /dev/urandom  0.61s user 31.28s system 99% cpu 31.904 total
tr -d x  1.00s user 0.27s system 3% cpu 31.903 total
fold -w 1  14.93s user 0.48s system 48% cpu 31.902 total
paste -sd "$(printf '%99s\\n')" -  7.23s user 0.08s system 22% cpu 31.899 total
head -c1G > /dev/null  0.49s user 1.21s system 5% cpu 31.898 total

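The first tr is the bottleneck, with most of the time spent in the kernel (I suppose for the random number generation). The timing is roughly in line with the rate at which I can get bytes from /dev/urandom (about 19MiB/s; here we produce 2 bytes for each 0.97 byte of /dev/urandom, at a rate of 32MiB/s). fold seems to be spending an unreasonable amount of CPU time (15s) just to insert a newline character after every byte, but that doesn't affect the overall time as it runs on a different CPU in my case (adding the -b option makes it very slightly more efficient; dd cbs=1 conv=unblock seems like a better alternative).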

You could do away with head -c1G and shave off a few seconds by instead setting a limit on the file size in a subshell (limit filesize 1024m with zsh, or ulimit -f "$((1024*1024))" with most other shells, zsh included).

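Things could be improved if we extracted 2 digits from each byte, but that needs a different approach. The above is very efficient because tr just looks up each byte in a 256-byte array. It can't do that for 2 bytes at a time, and using something like hexdump -e '1/1 "%02u"', which computes the text representation of a byte with a more complex algorithm, would be more expensive than the random number generation itself. Still, if, as in my case, you have CPU cores with time to spare, it may manage to shave off a few seconds: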

With:

< /dev/urandom LC_ALL=C tr '\0-\377' '\0-\143\0-\143[x*]' |
  tr -d x |
  hexdump -n250000000 -ve '500/1 "%02u" "\n"' |
  fold -w1 |
  paste -sd "$(printf '%99s\\n')" - > /dev/null

I get (note however that here it's 1,000,000,000 bytes as opposed to 1,073,741,824):

LC_ALL=C tr '\0-\377' '\0-\143\0-\143[x*]' < /dev/urandom  0.32s user 18.83s system 70% cpu 27.001 total
tr -d x  2.17s user 0.09s system 8% cpu 27.000 total
hexdump -n250000000 -ve '500/1 "%02u" "\n"'  26.79s user 0.17s system 99% cpu 27.000 total
fold -w1  14.42s user 0.67s system 55% cpu 27.000 total
paste -sd "$(printf '%99s\\n')" - > /dev/null  8.00s user 0.23s system 30% cpu 26.998 total

More CPU time overall, but better distributed between my 4 CPU cores, so it ends up taking less wall-clock time. The bottleneck is now hexdump.
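
The two-digit mapping '\0-\143\0-\143[x*]' can be checked the same way as the one-digit table: byte values 0 to 99 and 100 to 199 both map onto 0 to 99, and the remaining 56 values become "x" and are deleted. In this sketch od -tu1 stands in for hexdump purely to list the surviving byte values as decimal numbers; that substitution, and the helper names, are assumptions of the illustration.

```shell
# Emit every byte value 0..255 exactly once.
all_bytes() {
    i=0
    while [ "$i" -lt 256 ]; do
        printf "\\$(printf '%03o' "$i")"
        i=$((i + 1))
    done
}

# Map the 256 bytes as the pipeline does, then print each surviving byte
# as a decimal number, one per line.
two_digit_values() {
    all_bytes |
        LC_ALL=C tr '\0-\377' '\0-\143\0-\143[x*]' |
        tr -d x |
        od -An -v -tu1 |
        tr -s ' ' '\n' |
        grep -v '^$'
}

# Count how often each value occurs.
two_digit_values | sort -n | uniq -c
```

Every value from 0 to 99 should be listed with a count of 2, so 200 of the 256 input values (about 78%) yield two digits each, with no value favoured over another.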

If we use dd instead of the line-based fold, we can actually reduce the amount of work hexdump needs to do and improve the balance of work between CPUs:

< /dev/urandom LC_ALL=C tr '\0-\377' '\0-\143\0-\143[x*]' |
  tr -d x |
  hexdump -ve '"%02u"' |
  dd bs=50000 count=10000 iflag=fullblock status=none cbs=1 conv=unblock |
  paste -sd "$(printf '%99s\\n')" -

(here assuming GNU dd for its iflag=fullblock and status=none) which gives:

LC_ALL=C tr '\0-\377' '\0-\143\0-\143[x*]' < /dev/urandom  0.32s user 15.58s system 99% cpu 15.915 total
tr -d x  1.62s user 0.16s system 11% cpu 15.914 total
hexdump -ve '"%02u"'  10.90s user 0.32s system 70% cpu 15.911 total
dd bs=50000 count=10000 iflag=fullblock status=none cbs=1 conv=unblock  5.44s user 0.19s system 35% cpu 15.909 total
paste -sd "$(printf '%99s\\n')" - > /dev/null  5.50s user 0.30s system 36% cpu 15.905 total

Back to the random-number generation being the bottleneck.

Now, as pointed out by @OleTange, if you have the openssl utility, you can use it to get a faster (especially on processors that have AES instructions) pseudo-random generator of bytes:

</dev/zero openssl enc -aes-128-ctr -nosalt -pass file:/dev/urandom

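On my system, this spews considerably more bytes per second than /dev/urandom. (I can't comment on how it compares as a cryptographically secure source of randomness, if that matters for your use case.)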

</dev/zero openssl enc -aes-128-ctr -nosalt -pass file:/dev/urandom 2> /dev/null | 
  LC_ALL=C tr '\0-\377' '\0-\143\0-\143[x*]' |
  tr -d x |
  hexdump -ve '"%02u"' |
  dd bs=50000 count=10000 iflag=fullblock status=none cbs=1 conv=unblock |
  paste -sd "$(printf '%99s\\n')" -

Now gives:

openssl enc -aes-128-ctr -nosalt -pass file:/dev/urandom < /dev/zero 2>   1.13s user 0.16s system 12% cpu 10.174 total
LC_ALL=C tr '\0-\377' '\0-\143\0-\143[x*]'  0.56s user 0.20s system 7% cpu 10.173 total
tr -d x  2.50s user 0.10s system 25% cpu 10.172 total
hexdump -ve '"%02u"'  9.96s user 0.19s system 99% cpu 10.172 total
dd bs=50000 count=10000 iflag=fullblock status=none cbs=1 conv=unblock  4.38s user 0.20s system 45% cpu 10.171 total
paste -sd "$(printf '%99s\\n')" - > /dev/null

Back to hexdump being the bottleneck.

As I still have CPUs to spare, I can run 3 of those hexdump processes in parallel:

</dev/zero openssl enc -aes-128-ctr -nosalt -pass file:/dev/urandom 2> /dev/null | 
  LC_ALL=C tr '\0-\377' '\0-\143\0-\143[x*]' |
  tr -d x |
  (hexdump -ve '"%02u"' <&3 & hexdump -ve '"%02u"' <&3 & hexdump -ve '"%02u"') 3<&0 |
  dd bs=50000 count=10000 iflag=fullblock status=none cbs=1 conv=unblock |
  paste -sd "$(printf '%99s\\n')" -

(The <&3 is needed for shells other than zsh, which redirect a command's stdin to /dev/null when it is run in the background.)

Now down to 6.2 seconds, with my CPUs almost fully utilised.
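
The fd 3 trick can be tried in isolation. In this sketch tr a-z A-Z is only a stand-in for the hexdump workers, and fan_out is a hypothetical helper name. The three readers share one pipe, so each chunk goes to whichever reader asks first and the output order is not deterministic. That should be harmless for independent random bytes, but it would scramble ordered data.

```shell
# Run three copies of a filter on the same input pipe. The background
# copies read through fd 3, a duplicate of the subshell's stdin, because
# most shells would otherwise give them /dev/null as stdin.
fan_out() {
    ( tr a-z A-Z <&3 & tr a-z A-Z <&3 & tr a-z A-Z ) 3<&0
}

# A million "a" bytes go in; count the bytes that come out.
head -c 1000000 /dev/zero | tr '\0' a | fan_out | wc -c
```

The byte count at the end should equal the count fed in, since every byte is consumed by exactly one of the three readers.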

Answer 2

This is a partially tongue-in-cheek answer, because of the title of the question.

When you look for "the fastest way to ...", the answer is almost always some specialized tool. This "answer" shows one such tool, just so you can experiment.

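This is not a serious answer, because you should not look into specialized tools for jobs you only do once, or very rarely. You'll end up spending more time looking for tools and learning about them than actually doing stuff. Shells and utilities like bash and awk are not the fastest, but you can usually write a one-liner to achieve the job, spending only seconds. Better scripting languages like perl can also be used, although the learning curve for perl is steep, and I hesitate to recommend it for such purposes because I've been traumatized by awful perl projects. python, on the other hand, is slightly handicapped by its rather slow I/O; that is only an issue when you filter or generate gigabytes of data, though.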

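In any case, the following C89 example program (which uses POSIX.1 for a higher-accuracy clock only if available) should achieve about 100 MB/s generation rate (tested in Linux on a laptop with an Intel i5-4200U processor, piping the output to /dev/null), using a pretty good pseudo-random number generator. (The output should pass all the BigCrush tests except the MatrixRank test, as the code uses xorshift64* and the exclusion method to avoid biasing the digits.)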

decimal-digits.c:

#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <locale.h>
#include <ctype.h>
#include <stdio.h>
#include <errno.h>
#include <time.h>

/* This program is licensed under the CC0 license,
       https://creativecommons.org/publicdomain/zero/1.0/
   In other words, this is dedicated to the public domain.
   There are no warranties either, so if something breaks,
   you only have yourself to blame.
*/

#if _POSIX_C_SOURCE-199309 >= 0
static uint64_t time_seed(void)
{
    struct timespec  ts;

    if (clock_gettime(CLOCK_REALTIME, &ts))
        return (uint64_t)time(NULL);

    return (uint64_t)ts.tv_sec
         ^ (((uint64_t)ts.tv_nsec) << 32);
}
#else
static uint64_t time_seed(void)
{
    return (uint64_t)time(NULL);
}
#endif

/* Preferred output I/O block size.
 * Currently, about 128k blocks yield
 * maximum I/O throughput on most devices.
 * Note that this is a heuristic value,
 * and may be increased in the future.
*/
#ifndef  IO_BLOCK_SIZE
#define  IO_BLOCK_SIZE  262144
#endif

/* This is the Xorshift* pseudo-random number generator.
 * See https://en.wikipedia.org/wiki/Xorshift#xorshift.2A
 * for details. This is an incredibly fast generator that
 * passes all but the MatrixRank test of the BigCrush
 * randomness test suite, with a period of 2^64-1.
 * Note that neither xorshift_state, nor the result of
 * this function, will ever be zero.
*/
static uint64_t xorshift_state;

static uint64_t xorshift_u64(void)
{
    xorshift_state ^= xorshift_state >> 12;
    xorshift_state ^= xorshift_state << 25;
    xorshift_state ^= xorshift_state >> 27;
    return xorshift_state * UINT64_C(2685821657736338717);
}

/* This function returns a number between (inclusive)
 * 0 and 999,999,999,999,999,999 using xorshift_u64()
 * above, using the exclusion method. Thus, there is
 * no bias in the results, and each digit should be
 * uniformly distributed in 0-9.
*/
static uint64_t quintillion(void)
{
    uint64_t result;

    do {
        result = xorshift_u64() & UINT64_C(1152921504606846975);
    } while (!result || result > UINT64_C(1000000000000000000));

    return result - UINT64_C(1);
}

/* This function returns a single uniformly random digit.
*/
static unsigned char digit(void)
{
    static uint64_t       digits_cache = 0;
    static unsigned char  digits_cached = 0;
    unsigned char         retval;

    if (!digits_cached) {
        digits_cache = quintillion();
        digits_cached = 17; /* We steal the first one! */
    } else
        digits_cached--;
    
    retval = digits_cache % (uint64_t)(10);
    digits_cache /= (uint64_t)(10);

    return retval;
}

static int parse_ulong(const char *src, unsigned long *to)
{
    const char   *end = src;
    unsigned long value;

    if (!src)
        return errno = EINVAL;

    errno = 0;
    value = strtoul(src, (char **)&end, 0);
    if (errno)
        return errno;

    if (end == src)
        return errno = EINVAL;
    while (*end)
        if (isspace((unsigned char)*end))
            end++;
        else
            return errno = EINVAL;

    if (to)
        *to = value;
    return 0;
}

int main(int argc, char *argv[])
{
    unsigned long lines, cols, line, col, seed;
    
    /* When parsing the command-line parameters,
     * use locale conventions. */
    setlocale(LC_ALL, "");

    /* Standard output should be fully buffered, if possible.
     * This only affects output speed, so we're not too worried
     * if this happens to fail. */
    (void)setvbuf(stdout, NULL, _IOFBF, (size_t)IO_BLOCK_SIZE);

    if (argc < 3 || argc > 4 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
        fprintf(stderr, "\n");
        fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv[0]);
        fprintf(stderr, "       %s COLS LINES [ SEED ]\n", argv[0]);
        fprintf(stderr, "\n");
        fprintf(stderr, "This program generates random decimal digits\n");
        fprintf(stderr, "0 - 9, separated by spaces, COLS per line,\n");
        fprintf(stderr, "LINES lines.  In total, COLS*LINES*2 bytes\n");
        fprintf(stderr, "will be used.\n");
        fprintf(stderr, "\n");
        fprintf(stderr, "SEED is the optional seed for the Xorshift64*\n");
        fprintf(stderr, "pseudo-random number generator used in this program.\n");
        fprintf(stderr, "If omitted, current time is used as the seed.\n");
        fprintf(stderr, "\n");
        return EXIT_SUCCESS;
    }

    if (parse_ulong(argv[1], &cols) || cols < 1UL) {
        fprintf(stderr, "%s: Invalid number of digits per line.\n", argv[1]);
        return EXIT_FAILURE;
    }
    if (parse_ulong(argv[2], &lines) || lines < 1UL) {
        fprintf(stderr, "%s: Invalid number of lines.\n", argv[2]);
        return EXIT_FAILURE;
    }

    if (argc > 3) {
        if (parse_ulong(argv[3], &seed)) {
            fprintf(stderr, "%s: Invalid Xorshift64* seed.\n", argv[3]);
            return EXIT_FAILURE;
        }
    } else
        seed = time_seed();

    /* Since zero seed is invalid, we map it to ~0. */
    xorshift_state = seed;
    if (!xorshift_state)
        xorshift_state = ~(uint64_t)0;

    /* Discard first 1000 values to make the initial values unpredictable. */
    for (col = 0; col < 1000; col++)
        xorshift_u64();

    for (line = 0UL; line < lines; line++) {
        fputc('0' + digit(), stdout);
        for (col = 1UL; col < cols; col++) {
            fputc(' ', stdout);
            fputc('0' + digit(), stdout);
        }
        fputc('\n', stdout);

        /* Check for write errors. */
        if (ferror(stdout))
            return EXIT_FAILURE;
    }

    return EXIT_SUCCESS;
}

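We can make it a lot faster if we switch to a line buffer and fwrite() it once, instead of outputting each digit one at a time. Note that we still keep the stream fully buffered, to avoid partial (non-power-of-two) writes if the output is a block device.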

#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <locale.h>
#include <ctype.h>
#include <stdio.h>
#include <errno.h>
#include <time.h>

#if _POSIX_C_SOURCE-199309 >= 0
static uint64_t time_seed(void)
{
    struct timespec  ts;

    if (clock_gettime(CLOCK_REALTIME, &ts))
        return (uint64_t)time(NULL);

    return (uint64_t)ts.tv_sec
         ^ (((uint64_t)ts.tv_nsec) << 32);
}
#else
static uint64_t time_seed(void)
{
    return (uint64_t)time(NULL);
}
#endif

/* Preferred output I/O block size.
 * Currently, about 128k blocks yield
 * maximum I/O throughput on most devices.
 * Note that this is a heuristic value,
 * and may be increased in the future.
*/
#ifndef  IO_BLOCK_SIZE
#define  IO_BLOCK_SIZE  262144
#endif

/* This is the Xorshift* pseudo-random number generator.
 * See https://en.wikipedia.org/wiki/Xorshift#xorshift.2A
 * for details. This is an incredibly fast generator that
 * passes all but the MatrixRank test of the BigCrush
 * randomness test suite, with a period of 2^64-1.
 * Note that neither xorshift_state, nor the result of
 * this function, will ever be zero.
*/
static uint64_t xorshift_state;

static uint64_t xorshift_u64(void)
{
    xorshift_state ^= xorshift_state >> 12;
    xorshift_state ^= xorshift_state << 25;
    xorshift_state ^= xorshift_state >> 27;
    return xorshift_state * UINT64_C(2685821657736338717);
}

/* This function returns a number between (inclusive)
 * 0 and 999,999,999,999,999,999 using xorshift_u64()
 * above, using the exclusion method. Thus, there is
 * no bias in the results, and each digit should be
 * uniformly distributed in 0-9.
*/
static uint64_t quintillion(void)
{
    uint64_t result;

    do {
        result = xorshift_u64() & UINT64_C(1152921504606846975);
    } while (!result || result > UINT64_C(1000000000000000000));

    return result - UINT64_C(1);
}

/* This function returns a single uniformly random digit.
*/
static unsigned char digit(void)
{
    static uint64_t       digits_cache = 0;
    static unsigned char  digits_cached = 0;
    unsigned char         retval;

    if (!digits_cached) {
        digits_cache = quintillion();
        digits_cached = 17; /* We steal the first one! */
    } else
        digits_cached--;
    
    retval = digits_cache % (uint64_t)(10);
    digits_cache /= (uint64_t)(10);

    return retval;
}

static int parse_ulong(const char *src, unsigned long *to)
{
    const char   *end = src;
    unsigned long value;

    if (!src)
        return errno = EINVAL;

    errno = 0;
    value = strtoul(src, (char **)&end, 0);
    if (errno)
        return errno;

    if (end == src)
        return errno = EINVAL;
    while (*end)
        if (isspace((unsigned char)*end))
            end++;
        else
            return errno = EINVAL;

    if (to)
        *to = value;
    return 0;
}

int main(int argc, char *argv[])
{
    unsigned long lines, cols, line, col, seed;
    char         *oneline;
    
    /* When parsing the command-line parameters,
     * use locale conventions. */
    setlocale(LC_ALL, "");

    /* Standard output should be fully buffered, if possible.
     * This only affects output speed, so we're not too worried
     * if this happens to fail. */
    (void)setvbuf(stdout, NULL, _IOFBF, (size_t)IO_BLOCK_SIZE);

    if (argc < 3 || argc > 4 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
        fprintf(stderr, "\n");
        fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv[0]);
        fprintf(stderr, "       %s COLS LINES [ SEED ]\n", argv[0]);
        fprintf(stderr, "\n");
        fprintf(stderr, "This program generates random decimal digits\n");
        fprintf(stderr, "0 - 9, separated by spaces, COLS per line,\n");
        fprintf(stderr, "LINES lines.  In total, COLS*LINES*2 bytes\n");
        fprintf(stderr, "will be used.\n");
        fprintf(stderr, "\n");
        fprintf(stderr, "SEED is the optional seed for the Xorshift64*\n");
        fprintf(stderr, "pseudo-random number generator used in this program.\n");
        fprintf(stderr, "If omitted, current time is used as the seed.\n");
        fprintf(stderr, "\n");
        return EXIT_SUCCESS;
    }

    if (parse_ulong(argv[1], &cols) || cols < 1UL) {
        fprintf(stderr, "%s: Invalid number of digits per line.\n", argv[1]);
        return EXIT_FAILURE;
    }
    if (parse_ulong(argv[2], &lines) || lines < 1UL) {
        fprintf(stderr, "%s: Invalid number of lines.\n", argv[2]);
        return EXIT_FAILURE;
    }

    if (argc > 3) {
        if (parse_ulong(argv[3], &seed)) {
            fprintf(stderr, "%s: Invalid Xorshift64* seed.\n", argv[3]);
            return EXIT_FAILURE;
        }
    } else
        seed = time_seed();

    /* Since zero seed is invalid, we map it to ~0. */
    xorshift_state = seed;
    if (!xorshift_state)
        xorshift_state = ~(uint64_t)0;

    /* Discard first 1000 values to make the initial values unpredictable. */
    for (col = 0; col < 1000; col++)
        xorshift_u64();

    /* Allocate memory for a full line. */
    oneline = malloc((size_t)(2 * cols + 1));
    if (!oneline) {
        fprintf(stderr, "Not enough memory for %lu column buffer.\n", cols);
        return EXIT_FAILURE;
    }

    /* Set spaces and terminating newline. */
    for (col = 0; col < cols; col++)
        oneline[2*col + 1] = ' ';
    oneline[2*cols-1] = '\n';

    /* Not needed, but in case a code modification treats it as a string. */
    oneline[2*cols] = '\0';

    for (line = 0UL; line < lines; line++) {
        for (col = 0UL; col < cols; col++)
            oneline[2*col] = '0' + digit(); /* ASCII digit, not the raw value 0-9 */

        if (fwrite(oneline, 2*cols, 1, stdout) != 1)
            return EXIT_FAILURE; 
    }

    /* Check for write errors. */
    if (ferror(stdout))
        return EXIT_FAILURE;

    return EXIT_SUCCESS;
}

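Note: both examples were edited on 2016-11-18 to ensure a uniform distribution of digits (zero is excluded; the original post links to a comparison of, and details on, various pseudo-random number generators).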

Compile using, for example,

gcc -Wall -O2 decimal-digits.c -o decimal-digits

and optionally install it system-wide to /usr/bin using

sudo install -o root -g root -m 0755 decimal-digits /usr/bin

It takes the number of digits per line and the number of lines. Because 1000000000 / 100 / 2 = 5000000 (five million; total bytes divided by columns, divided by 2), you can use

./decimal-digits 100 5000000 > digits.txt

to generate the gigabyte-sized digits.txt as desired by the OP.

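Note that the program itself is written more with readability than efficiency in mind. My intent here is not to showcase the efficiency of the code (I'd use POSIX.1 and low-level I/O anyway, rather than the generic C interfaces), but to let you easily see what kind of balance there is between the effort spent developing dedicated tools and their performance, compared to one-liners or short shell or awk scriptlets.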

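With the GNU C library, calling fputc() incurs a small overhead for every character output (an indirect function call, or conditionals; the FILE interface is actually pretty complex and versatile, you see), and that adds up. On this particular Intel Core i5-4200U laptop, with the output redirected to /dev/null, the first (fputc) version takes about 11 seconds, whereas the line-at-a-time version takes just 1.3 seconds.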

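I happen to write such programs and generators often, only because I like to play with huge datasets. I'm weird that way. For example, I once wrote a program to print all finite positive IEEE-754 floating-point values into a text file, with sufficient precision to yield the exact same value when parsed. The file was a few gigabytes in size (perhaps 4G or so); there are not as many finite positive floats as one might think. I used this to compare implementations that read and parse such data.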

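For normal use cases, like the one the OP has, shell scripts, scriptlets and one-liners are the better approach: less time is spent accomplishing the overall task. (The exception is if they need a different file every day or so, or if many people each need a different file; in that rare case, a dedicated tool like the one above might warrant the effort spent.)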

        fprintf(stderr, "0 - 9, separated by spaces, COLS per line,\n");
        fprintf(stderr, "LINES lines.  In total, COLS*LINES*2 bytes\n");
        fprintf(stderr, "will be used.\n");
        fprintf(stderr, "\n");
        fprintf(stderr, "SEED is the optional seed for the Xorshift64*\n");
        fprintf(stderr, "pseudo-random number generator used in this program.\n");
        fprintf(stderr, "If omitted, current time is used as the seed.\n");
        fprintf(stderr, "\n");
        return EXIT_SUCCESS;
    }

    if (parse_ulong(argv[1], &cols) || cols < 1UL) {
        fprintf(stderr, "%s: Invalid number of digits per line.\n", argv[1]);
        return EXIT_FAILURE;
    }
    if (parse_ulong(argv[2], &lines) || lines < 1UL) {
        fprintf(stderr, "%s: Invalid number of lines.\n", argv[2]);
        return EXIT_FAILURE;
    }

    if (argc > 3) {
        if (parse_ulong(argv[3], &seed)) {
            fprintf(stderr, "%s: Invalid Xorshift64* seed.\n", argv[3]);
            return EXIT_FAILURE;
        }
    } else
        seed = time_seed();

    /* Since zero seed is invalid, we map it to ~0. */
    xorshift_state = seed;
    if (!xorshift_state)
        xorshift_state = ~(uint64_t)0;

    /* Discard first 1000 values to make the initial values unpredictable. */
    for (col = 0; col < 1000; col++)
        xorshift_u64();

    /* Allocate memory for a full line. */
    oneline = malloc((size_t)(2 * cols + 1));
    if (!oneline) {
        fprintf(stderr, "Not enough memory for %lu column buffer.\n", cols);
        return EXIT_FAILURE;
    }

    /* Set spaces and terminating newline. */
    for (col = 0; col < cols; col++)
        oneline[2*col + 1] = ' ';
    oneline[2*cols-1] = '\n';

    /* Not needed, but in case a code modification treats it as a string. */
    oneline[2*cols] = '\0';

    for (line = 0UL; line < lines; line++) {
        for (col = 0UL; col < cols; col++)
            oneline[2*col] = '0' + digit();  /* digit() returns 0..9; convert to ASCII */

        if (fwrite(oneline, 2*cols, 1, stdout) != 1)
            return EXIT_FAILURE; 
    }

    /* Check for write errors. */
    if (ferror(stdout))
        return EXIT_FAILURE;

    return EXIT_SUCCESS;
}

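Note: both examples were edited on 2016-11-18 to ensure a uniform distribution of digits (zero is excluded; see, for example, published comparisons of pseudo-random number generators for details).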
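To illustrate why the exclusion method matters, here is a minimal sketch that is not part of the programs above. The helper name count_digit_sources is hypothetical. It counts how many of the 256 possible byte values map to each decimal digit under `% 10`, either with no exclusion (limit 256) or with values 250 and above rejected (limit 250).

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * For every possible byte value 0..255 that is below `limit`,
 * count which decimal digit it maps to via value % 10.
 * limit = 256 excludes nothing; limit = 250 rejects 250..255. */
static void count_digit_sources(unsigned int limit, unsigned int counts[10])
{
    unsigned int value;
    size_t       i;

    for (i = 0; i < 10; i++)
        counts[i] = 0;

    for (value = 0; value < 256; value++)
        if (value < limit)
            counts[value % 10]++;
}
```

With limit 256, digits 0 to 5 each have 26 source values while 6 to 9 have only 25, so the low digits would be slightly favored. With limit 250, every digit has exactly 25 source values.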

Compile using, for example,

gcc -Wall -O2 decimal-digits.c -o decimal-digits

and optionally install it system-wide to /usr/bin using

sudo install -o root -g root -m 0755 decimal-digits /usr/bin

It takes the number of digits per line and the number of lines. Since 1000000000 / 100 / 2 = 5000000 (five million; total bytes divided by the number of columns, divided by 2), you can use

./decimal-digits 100 5000000 > digits.txt

to generate the gigabyte-sized digits.txt as requested by the OP.
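The arithmetic above can be sketched as a small helper. This is only an illustration; the names bytes_per_line and lines_for_size are hypothetical and do not appear in decimal-digits.c.

```c
#include <stdint.h>

/* Each output line holds `cols` digits, and every digit is followed by
 * one separator byte (a space, or the newline after the last digit),
 * so a line occupies 2*cols bytes. */
static uint64_t bytes_per_line(uint64_t cols)
{
    return 2 * cols;
}

/* Number of complete lines that fit into total_bytes.
 * Returns 0 if cols is 0. */
static uint64_t lines_for_size(uint64_t total_bytes, uint64_t cols)
{
    if (cols == 0)
        return 0;
    return total_bytes / bytes_per_line(cols);
}
```

For 10^9 bytes and 100 columns this gives 5000000 lines, matching the command above. For a full GiB (2^30 bytes) it gives 5368709 complete lines, with 24 bytes left over.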

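Note that the program itself is written more with readability than efficiency in mind. My intent here is not to showcase the efficiency of the code (I would use POSIX.1 and low-level I/O anyway, rather than the generic C interfaces), but to let you easily see what kind of balance there is between the effort spent developing a dedicated tool and its performance, compared to one-liners or short shell or awk scriptlets.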

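With the GNU C library, calling fputc() for every character output incurs only a small per-call overhead (an indirect function call, or conditionals; the FILE interface is actually quite complex and versatile, you see), but it adds up. On this particular Intel Core i5-4200U laptop, with the output redirected to /dev/null, the first (fputc) version takes about 11 seconds, whereas the line-at-a-time version takes just 1.3 seconds.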

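I happen to write such programs and generators often, simply because I like to play with huge datasets. I'm weird that way. For example, I once wrote a program that prints all finite positive IEEE-754 floating-point values into a text file, with sufficient precision to yield exactly the same value when parsed. The file was a few gigabytes in size (perhaps 4G or so); there are not as many finite positive floats as one might think. I used it to compare implementations that read and parse such data.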
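The round-trip property mentioned above can be sketched as follows. This is not the author's program, only a minimal illustration that assumes float is IEEE-754 binary32; the helper names float_from_bits and roundtrips are hypothetical. Nine significant decimal digits are enough to reproduce any binary32 value exactly.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float.
 * Assumption: float is IEEE-754 binary32 (4 bytes). */
static float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Print the value with 9 significant digits, parse the text back,
 * and report whether the parsed value equals the original. */
static int roundtrips(float value)
{
    char buffer[32];

    snprintf(buffer, sizeof buffer, "%.9g", value);
    return strtof(buffer, NULL) == value;
}
```

Under the same assumption, the finite positive floats correspond to the bit patterns 0x00000001 through 0x7F7FFFFF, so a loop over that range passing each float_from_bits(bits) to printf("%.9g\n", ...) would visit every such value once.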

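For normal use cases, like the OP's, shell scripts, scriptlets, and one-liners are the better approach: less time is spent accomplishing the overall task. (The exception is when a different file is needed every day or so, or when many people each need a different file; in that rare case, a dedicated tool like the one above might warrant the effort.)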

Question 3

If you have shuf available (recent GNU coreutils does), you can do this:

time shuf -r -n $((512*1024*1024)) -i 0-9 | paste -sd "$(printf '%99s\\n')" -

On my VM, this is now a bit slower than Stéphane's answer, by about a 3:4 factor.

Question 4

Here is a solution that I hope is simple to understand:

od -An -x /dev/urandom | tr -dc 0-9 | fold -w100 | awk NF=NF FS= | head -c1G

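od creates a uniform stream of hexadecimal digits from /dev/urandom.
tr gets rid of the letters, keeping only the digits 0-9.
fold ensures there are 100 digits per line.
awk inserts spaces between the digits on each line.
head truncates the input to 1 gigabyte.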
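The filtering done by the od and tr stages can be sketched in C as below. The helper name digits_from_bytes is hypothetical, and the sketch does not reproduce the exact digit order of od -x (which prints two-byte words); for random input the order does not matter. Each 4-bit half of a uniformly random byte is uniform over 0 to 15, so keeping only the values below 10 yields uniform decimal digits, at the cost of discarding 6 out of every 16 values.

```c
#include <stddef.h>

/* Hypothetical helper mirroring `od -x | tr -dc 0-9`:
 * split each input byte into two 4-bit values and keep only those
 * below 10, written as ASCII digits.
 * `out` must have room for 2*len characters.
 * Returns the number of digits written. */
static size_t digits_from_bytes(const unsigned char *in, size_t len, char *out)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++) {
        unsigned int hi = in[i] >> 4;
        unsigned int lo = in[i] & 15u;

        if (hi < 10)
            out[n++] = (char)('0' + hi);
        if (lo < 10)
            out[n++] = (char)('0' + lo);
    }

    return n;
}
```

For example, the four bytes 0x12, 0xAB, 0x9F, 0xF0 yield the four digits 1, 2, 9, 0.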
