How to succinctly, portably, and thoroughly seed the mt19937 PRNG?

I seem to see many answers in which someone suggests using <random> to generate random numbers, usually along with code like this:

std::random_device rd;  
std::mt19937 gen(rd());
std::uniform_int_distribution<> dis(0, 5);
dis(gen);

Usually this replaces some kind of "unholy abomination" such as:

srand(time(NULL));
rand()%6;

We might criticize the old way by arguing that time(NULL) provides low entropy, time(NULL) is predictable, and the end result is non-uniform.
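The non-uniformity claim can be made concrete with a little arithmetic. The sketch below assumes RAND_MAX is 32767 (the minimum the C standard allows); the helper name residue_counts is hypothetical and only counts how many raw values land on each result of a modulo 6 reduction.

```cpp
#include <array>
#include <cstdint>

// Counts how many of the raw values 0..rand_max map to each residue mod 6.
// Purely arithmetic: it does not call rand() at all.
std::array<std::uint64_t, 6> residue_counts(std::uint64_t rand_max)
{
    std::array<std::uint64_t, 6> counts{};
    const std::uint64_t total = rand_max + 1;  // number of distinct raw values
    for (std::uint64_t r = 0; r < 6; ++r) {
        // Every residue gets total / 6 values; the first (total % 6)
        // residues get one extra.
        counts[r] = total / 6 + (r < total % 6 ? 1 : 0);
    }
    return counts;
}
```

With rand_max = 32767 there are 32768 raw values, so residues 0 and 1 each receive 5462 of them while 2 through 5 receive 5461: a small but real bias.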

But all of that is true of the new way: it just has a shinier veneer.

  • rd() returns a single unsigned int. This has at least 16 bits and probably 32. That's not enough to seed MT's 19937 bits of state.

  • Using std::mt19937 gen(rd());gen() (seeding with 32 bits and looking at the first output) doesn't give a good output distribution. 7 and 13 can never be the first output. Two seeds produce 0. Twelve seeds produce 1226181350. (Link)

  • std::random_device can be, and sometimes is, implemented as a simple PRNG with a fixed seed. It might therefore produce the same sequence on every run. (Link) This is even worse than time(NULL).

  • Worse yet, it is very easy to copy and paste the foregoing code snippets, despite the problems they contain. Some solutions to this require acquiring largish libraries, which may not be suitable for everyone.
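To put the size mismatch from the first bullet in numbers, here is a small sketch using the constants the standard library exposes on std::mt19937. The three variable names are hypothetical, and single_seed_bits assumes a 32-bit unsigned int.

```cpp
#include <cstddef>
#include <random>

// Number of words in the generator's state (624 for std::mt19937).
constexpr std::size_t mt_state_words = std::mt19937::state_size;

// Bits of storage in that state: 624 words * 32 bits = 19968,
// of which 19937 are significant.
constexpr std::size_t mt_state_bits = mt_state_words * std::mt19937::word_size;

// Bits supplied by a single rd() call, assuming a 32-bit unsigned int.
constexpr std::size_t single_seed_bits = 32;
```

Under that assumption a single rd() call covers 32 of 19968 bits, roughly 0.16% of the state; the remaining words are derived deterministically from those 32 bits.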

    In light of this, my question is: how can one succinctly, portably, and thoroughly seed the mt19937 PRNG in C++?

    Given the issues above, a good answer:

  • Must fully seed the mt19937/mt19937_64.
  • Cannot rely solely on std::random_device or time(NULL) as a source of entropy.
  • Should not rely on Boost or other libraries.
  • Should fit in a small number of lines such that it would look nice copy-pasted into an answer.
Thoughts

  • My current thought is that outputs from std::random_device can be mashed up (perhaps via XOR) with time(NULL), values derived from address space randomization, and a hard-coded constant (which could be set during distribution) to get a best-effort shot at entropy.

  • std::random_device::entropy() does not give a good indication of what std::random_device might or might not do.
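As a rough illustration of the mash-up idea in the first bullet, the sketch below collects words from each source and leaves the mixing to std::seed_seq rather than XOR-ing by hand (a swapped-in technique, not the one proposed above). The function name best_effort_seed_words, the count of eight rd() calls, and the constant are all hypothetical, and the result is only as unpredictable as its best source.

```cpp
#include <cstdint>
#include <ctime>
#include <random>
#include <vector>

// Gathers 32-bit words from several weak-to-decent sources.
// Assumption: std::uintptr_t exists (true on mainstream platforms).
std::vector<std::uint32_t> best_effort_seed_words()
{
    std::vector<std::uint32_t> words;

    // 1. Whatever std::random_device gives us (may be deterministic).
    std::random_device rd;
    for (int i = 0; i < 8; ++i) {
        words.push_back(static_cast<std::uint32_t>(rd()));
    }

    // 2. The wall-clock time, split into two 32-bit halves.
    const auto t = static_cast<std::uint64_t>(std::time(nullptr));
    words.push_back(static_cast<std::uint32_t>(t));
    words.push_back(static_cast<std::uint32_t>(t >> 32));

    // 3. A stack address, which varies between runs when ASLR is enabled.
    int stack_var = 0;
    const auto addr = static_cast<std::uint64_t>(
        reinterpret_cast<std::uintptr_t>(&stack_var));
    words.push_back(static_cast<std::uint32_t>(addr));
    words.push_back(static_cast<std::uint32_t>(addr >> 32));

    // 4. A hard-coded constant that could be changed per distribution.
    words.push_back(0x9E3779B9u);

    return words;
}
```

A caller would then write something like `auto w = best_effort_seed_words(); std::seed_seq seq(w.begin(), w.end()); std::mt19937 gen(seq);`. Note that std::seed_seq stretches these 13 words over the whole state; it does not add entropy.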


  • I would argue the greatest flaw with std::random_device is that it is allowed a deterministic fallback if no CSPRNG is available. This alone is a good reason not to seed a PRNG using std::random_device, since the bytes produced may be deterministic. It unfortunately doesn't provide an API to find out when this happens, or to request failure instead of low-quality random numbers.

    That is, there is no completely portable solution: however, there is a decent, minimal approach. You can use a minimal wrapper around a CSPRNG (defined as sysrandom below) to seed the PRNG.

    Windows


    You can rely on CryptGenRandom, a CSPRNG. For example, you may use the following code:

    bool acquire_context(HCRYPTPROV *ctx)
    {
        if (!CryptAcquireContext(ctx, nullptr, nullptr, PROV_RSA_FULL, 0)) {
            return CryptAcquireContext(ctx, nullptr, nullptr, PROV_RSA_FULL, CRYPT_NEWKEYSET);
        }
        return true;
    }
    
    
    size_t sysrandom(void* dst, size_t dstlen)
    {
        HCRYPTPROV ctx;
        if (!acquire_context(&ctx)) {
            throw std::runtime_error("Unable to initialize Win32 crypt library.");
        }
    
        BYTE* buffer = reinterpret_cast<BYTE*>(dst);
        if(!CryptGenRandom(ctx, dstlen, buffer)) {
            throw std::runtime_error("Unable to generate random bytes.");
        }
    
        if (!CryptReleaseContext(ctx, 0)) {
            throw std::runtime_error("Unable to release Win32 crypt library.");
        }
    
        return dstlen;
    }
    

    Unix-Like


    On many Unix-like systems, you should use /dev/urandom when possible (although this is not guaranteed to exist on POSIX-compliant systems).

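    size_t sysrandom(void* dst, size_t dstlen)
    {
        char* buffer = reinterpret_cast<char*>(dst);
        std::ifstream stream("/dev/urandom", std::ios_base::binary | std::ios_base::in);
        // Fail loudly instead of silently leaving dst unfilled.
        if (!stream.read(buffer, static_cast<std::streamsize>(dstlen))) {
            throw std::runtime_error("Unable to read N bytes from /dev/urandom.");
        }
    
        return dstlen;
    }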
    

    Other


    If no CSPRNG is available, you might choose to rely on std::random_device. However, I would avoid this if possible, since various compilers (most notably, MinGW) implement it as a PRNG (in fact, producing the same sequence every time to alert humans that it's not properly random).
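For completeness, a last-resort fallback with the same shape as sysrandom might look like the sketch below. The name sysrandom_fallback is hypothetical, and the bytes it returns are only as good as the platform's std::random_device, which may be fully deterministic as described above.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <random>

// Fills dst with dstlen bytes taken from std::random_device.
// Use only when no OS-level CSPRNG interface is available.
std::size_t sysrandom_fallback(void* dst, std::size_t dstlen)
{
    std::random_device rd;
    unsigned char* out = static_cast<unsigned char*>(dst);

    std::size_t written = 0;
    while (written < dstlen) {
        const unsigned int word = rd();  // one result_type value per call
        // Copy a whole word, or just the tail if fewer bytes remain.
        const std::size_t n = std::min(sizeof word, dstlen - written);
        std::memcpy(out + written, &word, n);
        written += n;
    }
    return dstlen;
}
```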

    Seeding


    Now that we have our pieces with minimal overhead, we can generate the desired bits of entropy to seed our PRNG. The example uses an (obviously insufficient) 32 bits to seed the PRNG, and you should increase this value (which is dependent on your CSPRNG).

    std::uint_least32_t seed;    
    sysrandom(&seed, sizeof(seed));
    std::mt19937 gen(seed);
    

    Comparison To Boost


    We can see parallels to boost::random_device (a true CSPRNG) after a quick look at the source code. Boost uses MS_DEF_PROV on Windows, which is the default provider for the PROV_RSA_FULL provider type. The only thing missing would be verifying the cryptographic context, which can be done with CRYPT_VERIFYCONTEXT. On *Nix, Boost uses /dev/urandom. That is, this solution is portable, well-tested, and easy to use.

    Linux Specialization


    If you're willing to sacrifice succinctness for security, getrandom is an excellent choice on Linux 3.17 and above, and on recent Solaris. getrandom behaves identically to /dev/urandom, except it blocks if the kernel hasn't initialized its CSPRNG yet after booting. The following snippet detects if Linux getrandom is available, and if not falls back to /dev/urandom.

    #if defined(__linux__) || defined(linux) || defined(__linux)
    #   // Check the kernel version. `getrandom` is only Linux 3.17 and above.
    #   include <linux/version.h>
    #   if LINUX_VERSION_CODE >= KERNEL_VERSION(3,17,0)
    #       define HAVE_GETRANDOM
    #   endif
    #endif
    
    // Uses the raw syscall, so no libc wrapper is needed (glibc only added one in 2.25).
    #if defined(HAVE_GETRANDOM)
    #   include <sys/syscall.h>
    #   include <unistd.h>
    #   include <linux/random.h>
    
    size_t sysrandom(void* dst, size_t dstlen)
    {
        // syscall returns long: the byte count on success, -1 on failure.
        long bytes = syscall(SYS_getrandom, dst, dstlen, 0);
        if (bytes < 0 || static_cast<size_t>(bytes) != dstlen) {
            throw std::runtime_error("Unable to read N bytes from CSPRNG.");
        }
    
        return dstlen;
    }
    
    #elif defined(_WIN32)
    
    // Windows sysrandom here.
    
    #else
    
    // POSIX sysrandom here.
    
    #endif
    

    OpenBSD


    There is one final caveat: modern OpenBSD does not have /dev/urandom. You should use getentropy instead.

    #if defined(__OpenBSD__)
    #   define HAVE_GETENTROPY
    #endif
    
    #if defined(HAVE_GETENTROPY)
    #   include <unistd.h>
    
    size_t sysrandom(void* dst, size_t dstlen)
    {
        // getentropy returns 0 on success and -1 on failure (not a byte count),
        // and rejects requests larger than 256 bytes.
        if (getentropy(dst, dstlen) != 0) {
            throw std::runtime_error("Unable to read N bytes from CSPRNG.");
        }
    
        return dstlen;
    }
    
    #endif
    

    Other Thoughts


    If you need cryptographically secure random bytes, you should probably replace the fstream with POSIX's unbuffered open/read/close. This is because both basic_filebuf and FILE contain an internal buffer, which will be allocated via a standard allocator (and therefore not wiped from memory).

    This could easily be done by changing sysrandom to:

    size_t sysrandom(void* dst, size_t dstlen)
    {
        int fd = open("/dev/urandom", O_RDONLY);
        if (fd == -1) {
            throw std::runtime_error("Unable to open /dev/urandom.");
        }
        if (read(fd, dst, dstlen) != dstlen) {
            close(fd);
            throw std::runtime_error("Unable to read N bytes from CSPRNG.");
        }
    
        close(fd);
        return dstlen;
    }
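One detail the version above glosses over: POSIX allows read to return fewer bytes than requested, or to fail with EINTR when a signal arrives, and neither case means the device is broken. A more defensive variant could loop until the buffer is full. This is only a sketch; the name sysrandom_retry is hypothetical.

```cpp
#include <cerrno>
#include <cstddef>
#include <stdexcept>

#include <fcntl.h>
#include <unistd.h>

// Reads exactly dstlen bytes from /dev/urandom, retrying on short reads
// and on EINTR. Throws if the device cannot be opened or read.
std::size_t sysrandom_retry(void* dst, std::size_t dstlen)
{
    int fd = open("/dev/urandom", O_RDONLY);
    if (fd == -1) {
        throw std::runtime_error("Unable to open /dev/urandom.");
    }

    unsigned char* out = static_cast<unsigned char*>(dst);
    std::size_t done = 0;
    while (done < dstlen) {
        ssize_t n = read(fd, out + done, dstlen - done);
        if (n > 0) {
            done += static_cast<std::size_t>(n);  // partial progress is fine
        } else if (n == -1 && errno == EINTR) {
            continue;                             // interrupted, try again
        } else {
            close(fd);                            // real error or unexpected EOF
            throw std::runtime_error("Unable to read N bytes from CSPRNG.");
        }
    }

    close(fd);
    return dstlen;
}
```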
    

    Thanks


    Special thanks to Ben Voigt for pointing out FILE uses buffered reads, and therefore should not be used.

    I would also like to thank Peter Cordes for mentioning getrandom, and OpenBSD's lack of /dev/urandom.


    In a sense, this can't be done portably. That is, one can conceive a valid fully-deterministic platform running C++ (say, a simulator which steps the machine clock deterministically, and with "determinized" I/O) in which there is no source of randomness to seed a PRNG.


    You can use a std::seed_seq and fill it to at least the required state size for the generator using Alexander Huszagh's method of getting the entropy:

    size_t sysrandom(void* dst, size_t dstlen); //from Alexander Huszagh answer above
    
    void foo(){
    
        std::uint_least32_t state[std::mt19937::state_size];  // one 32-bit word per state element
        sysrandom(state, sizeof(state));
        std::seed_seq s(std::begin(state), std::end(state));
    
        std::mt19937 g;
        g.seed(s);
    }
    

    If the standard library offered a proper way to fill or create a SeedSequence from a UniformRandomBitGenerator, seeding properly with std::random_device would be much simpler.
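In the absence of such a facility, one workaround is a small adapter that exposes just enough of the SeedSequence interface for an engine to accept it, and forwards every request to a generator. The sketch below is an assumption-laden illustration: urbg_seed_adapter and seeded_from are hypothetical names, and the adapter does not meet the full formal SeedSequence requirements (no size() or param()). It relies on the engine's seeding constructor only calling generate(), which is what the mainstream standard library implementations do in practice.

```cpp
#include <cstdint>
#include <random>

// Hypothetical adapter: provides result_type and generate(), the parts of
// SeedSequence that engines actually use when seeding.
template <class URBG>
class urbg_seed_adapter {
public:
    using result_type = std::uint_least32_t;

    explicit urbg_seed_adapter(URBG& g) : gen_(g) {}

    // Fills [first, last) with 32-bit values drawn from the wrapped generator.
    template <class It>
    void generate(It first, It last)
    {
        std::uniform_int_distribution<std::uint_least32_t> word(0, 0xFFFFFFFFu);
        for (; first != last; ++first) {
            *first = word(gen_);
        }
    }

private:
    URBG& gen_;
};

// Builds an engine whose whole state is requested from `source`.
template <class Engine = std::mt19937, class URBG>
Engine seeded_from(URBG& source)
{
    urbg_seed_adapter<URBG> adapter(source);
    return Engine(adapter);
}
```

Typical use would be `std::random_device rd; auto gen = seeded_from(rd);`, with the usual caveat that the result is no better than the std::random_device behind it.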
