Will Python SystemRandom / os.urandom Always Have Enough Entropy for Good Crypto

There's a subtle difference between the output of /dev/random and /dev/urandom. As has been pointed out, /dev/urandom doesn't block. That's because it gets its output from a pseudo-random number generator, seeded from the 'real' random numbers in /dev/random.

The output of /dev/urandom will almost always be sufficiently random -- it's a high-quality PRNG with a random seed. If you really need a better source of random data, you could consider getting a system with a hardware random number generator -- my netbook has a VIA C7 in it, which can generate quite a lot of properly random data (I get a consistent 99.9kb/s out of /dev/random, 545kb/s out of /dev/urandom).
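
For concreteness, here is a minimal sketch of the two interfaces Python exposes over this OS pool (assuming a Unix-like system, where os.urandom reads from the kernel CSPRNG, e.g. /dev/urandom or the getrandom() syscall):

    import os
    import random

    # 16 bytes straight from the OS pool
    raw = os.urandom(16)

    # random.SystemRandom is a thin wrapper around os.urandom that exposes
    # the familiar random-module API on top of the OS source
    sysrand = random.SystemRandom()
    token = sysrand.getrandbits(128)     # 128-bit integer from the OS pool
    pick = sysrand.choice("abcdefgh")    # CSPRNG-backed choice

    print(raw, token, pick)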

As an aside, if you're generating passwords then you might want to look at pwgen -- it makes nice pronounceable passwords for you :).

Is Python 3's random.random good enough for security if seeded with high-entropy seeds before every use?

OK, so after asking elsewhere: if the random generator produces a uniform distribution, the whole operation is useless but does not introduce security risks.

If, as in the question, the random generator is seeded with a new seed for every generation, it is like applying a simple, useless transformation x => f(x) to the seed.

The response to this question is very simple: if you use high entropy and pass it through Python's random.random or random.choice, the result is only as secure as the seed quality. I would not treat changing this in a running system as a high-priority fix.

The other part of the response is: don't do that, it is useless. Use a cryptographically secure way to pick instead of random.choice.
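
To make that concrete, here is a minimal sketch (Python 3.6+ for the secrets module; the alphabet is just an example) of the discouraged re-seeding pattern next to the direct approach:

    import os
    import random
    import secrets

    alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"

    # Pattern discussed above: re-seed the Mersenne Twister with OS entropy
    # before every pick. At best this is as strong as the seed; passing the
    # seed through random.choice() adds nothing.
    random.seed(os.urandom(32))
    char_via_random = random.choice(alphabet)

    # The straightforward alternative: a CSPRNG-backed picker.
    char_via_secrets = secrets.choice(alphabet)

    print(char_via_random, char_via_secrets)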

Is the default entropy of the secrets module still good enough in 2018?

The answer to the entropy question is, yes, 256-bit secrets can't be brute forced and won't become practical to brute force any time soon.

However you will need to use larger keys with asymmetric cryptographic algorithms. It's also harder to predict how strong those types of keys are. Key strength and key length are not always the same. (256-bit RSA does not have 256-bit strength.) The world of asymmetric algorithms has more of an arms-race quality than symmetric algorithms.
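
As a rough reference point (these figures follow the commonly cited NIST SP 800-57 comparable-strength estimates and are not from the original answer), RSA moduli have to be far longer than symmetric keys of the same strength:

    # Approximate NIST SP 800-57 comparable-strength estimates:
    # symmetric security level (bits) -> RSA modulus size (bits)
    rsa_bits_for_symmetric_strength = {
        112: 2048,
        128: 3072,
        192: 7680,
        256: 15360,
    }
    print(rsa_bits_for_symmetric_strength[128])  # 3072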

I frequently see brute-force effort compared to Bitcoin hash rates or DES-cracking hardware. You can't directly translate performance metrics for those two workloads into brute-force metrics for other algorithms, but we can use them for estimation.

It looks like the peak estimated hash rate for Bitcoin in 2018 is about 60 million terahashes per second. Let's round 60 * 10^(6+12) hashes per second up to the next power of two, 2^66, for ease of calculation.

Let's now assume

  • We improve efficiency of our hardware. Say some new technology does one million times as much work as modern day hardware while using the same amount of energy.
  • Humanity manages to generate a million times as much electricity somehow. (Maybe using fusion or magic.)
  • Say we get new technology and our computer clock speed increases by a factor of one million.
  • Our GPUs have one million times as many cores or we can shrink computer sizes by the same factor.
  • Humanity populates other star systems and we increase our population by a factor of one million.
  • Every human can afford one million times as many computers as they can now and they all want to guess your 256-bit secret.

Let's pretend that each improvement is orthogonal and that our brute-force power scales linearly with all of those improvements. Let's round each one-million factor to 2^20. Our new performance measure is 2^(66 + 6*20) = 2^186 guesses per second. How long would it take to test every possible 256-bit value?

It takes 2^256 / 2^186 = 2^(256-186) = 2^70 seconds. That's over 37 trillion years, thousands of times longer than the time elapsed since the Big Bang. So using 256 bits of entropy is fairly conservative.
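
The arithmetic is easy to check in a few lines of Python:

    # Back-of-the-envelope check of the figures above
    guesses_per_second = 2 ** 186   # 2^66 hash rate times six factors of 2^20
    keyspace = 2 ** 256             # every possible 256-bit value

    seconds = keyspace // guesses_per_second      # 2^70 seconds
    years = seconds / (60 * 60 * 24 * 365.25)
    print(seconds == 2 ** 70, years)              # True, roughly 3.7e13 years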

(And with what technology and resources we currently have in the real world we can't even brute force 128-bit secrets.)

Quantum computers aren't a huge concern regarding symmetric algorithms. If we use 256-bit symmetric keys then it will still take 2^128 function evaluations using Grover's algorithm. However it would be reasonable to assume that the cost of n evaluations on a quantum computer is at least the cost of n evaluations on a classical computer.

If you generate many random values and expect each one to be unique you need to use twice as many bits as you might think you would need due to the birthday problem. Generic collision attacks on a k-bit hash function cost about the equivalent of 2^(k/2) hash function evaluations. For quantum computers it could be 2^(k/3). (So don't mix up key length and hash function output lengths.)
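
As a quick illustration of those generic bounds for a 256-bit output (integer division is used for the quantum case):

    k = 256  # hash output length in bits
    classical_collision_cost = 2 ** (k // 2)  # birthday bound, about 2^128
    quantum_collision_cost = 2 ** (k // 3)    # the 2^(k/3) estimate, about 2^85
    print(classical_collision_cost.bit_length() - 1,
          quantum_collision_cost.bit_length() - 1)  # 128 85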

These generic attacks assume an ideal function. Specific algorithms may be "cracked", meaning that an attack better than brute force is discovered.

It's important that whatever secrets you generate are derived using unpredictable inputs. If you want an n-bit security level you need to have n bits of entropy. (So you can't use Mersenne Twister or PCG and you can't initialize your RNG using system time or a password.)
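
In practice (Python 3.6+), that means letting the secrets module, which draws from the OS CSPRNG, supply the entropy. A minimal sketch:

    import secrets

    key = secrets.token_bytes(32)       # 32 bytes = 256 bits of entropy
    hex_key = secrets.token_hex(32)     # same amount, hex-encoded
    url_token = secrets.token_urlsafe(32)

    print(len(key) * 8, hex_key, url_token)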

256 bits of entropy is good in 2018 and, barring inconceivable sci-fi-like technological advances or magic, will still be secure in the year 3018.

Also, see Landauer's principle, which limits how efficiently we can make calculations. I'm not even going to try to translate this time-based argument to a dollar amount for obvious reasons.

Python's pycrypto library for random number generation vs os.urandom

In the link you gave, the only reason given to prefer urandom() is that it pulls in less code (the OS implements "most of it", and os.urandom() is built into Python).

If you're going to distribute a Python package, you can simplify users' lives by minimizing external dependencies. That's the entire point of the link you found.

In terms of quality, either way should work fine. I prefer urandom() because I understand what it does; I never dug into the guts of PyCrypto. But urandom() has been criticized for use in some environments; see the discussion that starts

Gutterman, Pinkas, & Reinman in March 2006 published a detailed
cryptographic analysis of the Linux random number generator ...
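
For comparison, a minimal sketch of the two approaches (this assumes PyCrypto, or its drop-in successor PyCryptodome, is installed; both calls return 16 random bytes):

    import os
    from Crypto.Random import get_random_bytes  # PyCrypto / PyCryptodome

    key_from_os = os.urandom(16)              # OS pool, no extra dependency
    key_from_library = get_random_bytes(16)   # the library's own CSPRNG front end

    print(key_from_os.hex(), key_from_library.hex())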

Is `/dev/urandom` suitable for simulation purposes?

The underlying implementation of /dev/urandom is a CSPRNG whose output pool has a maximal period of less than 2^(26*32) - 1; its output is fed through SHA-1 to produce the bytes /dev/urandom returns. As such, urandom can obviously produce as many random numbers as you want, but it cannot give you reproducible results - you would have to cache the sequence you get yourself.

You do not have to worry about what happens when the entropy pool is estimated to be depleted; /dev/urandom will output whatever you request of it. The "theoretical attacks" the urandom(4) man page speaks of are nonexistent. (The "issue" is a huge misunderstanding of what "entropy estimation" is.)

Many other PRNGs with large periods support reproducible seeding: the Mersenne Twister (e.g. in C++'s <random> or Python's random module), xorshift PRNGs, etc. You should be able to adapt any PRNG to whatever distribution is suitable for your purposes.
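
For example, a minimal Python sketch of reproducible seeding with the built-in Mersenne Twister (the seed value is arbitrary):

    import random

    # The same seed always yields the same sequence, which /dev/urandom
    # cannot give you.
    rng = random.Random(12345)
    sample_a = [rng.random() for _ in range(5)]

    rng = random.Random(12345)
    sample_b = [rng.random() for _ in range(5)]

    assert sample_a == sample_b   # identical runs, by construction
    print(sample_a)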

What's more random, hashlib or urandom?

This solution:

os.urandom(16).encode('hex')

is the best, since it uses the OS to generate randomness, which should be usable for cryptographic purposes (depending on the OS implementation).

random.random() generates pseudo-random values.

Hashing a random value does not add any new randomness.
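
One caveat: .encode('hex') only works on Python 2 byte strings. A minimal Python 3 equivalent (binascii works on both versions):

    import binascii
    import os

    token = os.urandom(16).hex()                       # Python 3: bytes.hex()
    token_portable = binascii.hexlify(os.urandom(16))  # Python 2 and 3
    print(token, token_portable)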

Reading /dev/urandom as early as possible

urandom is provided via a device driver, and the first thing the kernel does with the driver is to call its init function.

If you take a look here: http://lxr.free-electrons.com/source/drivers/char/random.c#L1401

    /*
     * Note that setup_arch() may call add_device_randomness()
     * long before we get here. This allows seeding of the pools
     * with some platform dependent data very early in the boot
     * process. But it limits our options here. We must use
     * statically allocated structures that already have all
     * initializations complete at compile time. We should also
     * take care not to overwrite the precious per platform data
     * we were given.
     */
    static int rand_initialize(void)
    {
            init_std_data(&input_pool);
            init_std_data(&blocking_pool);
            init_std_data(&nonblocking_pool);
            return 0;
    }
    early_initcall(rand_initialize);

So, the init function for this driver is rand_initialize(). Note, however, that the comment says setup_arch() may call add_device_randomness() before this device is even initialized. Calling that function does not add any actual entropy, though (it feeds the pool with data such as MAC addresses, so if you have two otherwise identical VMs, you're still good there). From the comment:

    /*
     * add_device_randomness() is for adding data to the random pool that
     * is likely to differ between two devices (or possibly even per boot).
     * This would be things like MAC addresses or serial numbers, or the
     * read-out of the RTC. This does *not* add any actual entropy to the
     * pool, but it initializes the pool to different values for devices
     * that might otherwise be identical and have very little entropy
     * available to them (particularly common in the embedded world).
     */

Also, note that entropy pools are stored on shutdown and restored at boot time via an init script (on my Ubuntu 14.04 it's /etc/init.d/urandom), so you might want to call your script from that script before

    (
            date +%s.%N

            # Load and then save $POOLBYTES bytes,
            # which is the size of the entropy pool
            if [ -f "$SAVEDFILE" ]
            then
                    cat "$SAVEDFILE"
            fi
            # Redirect output of subshell (not individual commands)
            # to cope with a misfeature in the FreeBSD (not Linux)
            # /dev/random, where every superuser write/close causes
            # an explicit reseed of the yarrow.
    ) >/dev/urandom

or a similar call is made.


