[Pkg-sysvinit-devel] init.d/urandom : saving random-seed

John Denker jsd at av8n.com
Sat Jul 31 20:36:53 UTC 2010


On 07/31/2010 08:49 AM, Henrique de Moraes Holschuh wrote:

> .... the best way of fixing a Debian
> system to be more secure as far as the quality of the randomness used by a
> random user application will be, AFAIK, is to simply get a medium or high
> bandwidth TRNG,

Yes indeed!

> I don't have a proper TRNG at the moment, but if I manage to get one 

You can get a high-quality TRNG for free from
  http://www.av8n.com/turbid/

>> Now, to answer the question:  A random-seed file should never be reused.
>> Never ever.
> 
> Ok.  Let's differentiate writing to /dev/urandom WITHOUT credit (write to
> that device) and with entropy credit (what rng-tools does using IOCTLs).

OK.  

Similarly let's differentiate the behavior of /dev/random from the
behavior of /dev/urandom.  My previous note was almost entirely about
/dev/urandom, as the Subject: line suggests.  The concept of "credit"
does not apply to /dev/urandom.
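
For concreteness, here is a minimal sketch of the uncredited path, i.e. a
plain write(2) to the device (using the conventional Debian seed-file path
by way of example).  It stirs the pool but leaves the kernel's entropy
estimate essentially unchanged; the credited path requires the
RNDADDENTROPY ioctl, which is what rng-tools uses.

    # A plain write mixes data into the pool but credits nothing,
    # so the kernel's entropy estimate is essentially unchanged.
    cat /proc/sys/kernel/random/entropy_avail
    cat /var/lib/urandom/random-seed > /dev/urandom
    cat /proc/sys/kernel/random/entropy_avail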

Furthermore we must differentiate the various kinds of "randomness".
Even among experts there is some divergence of opinion about what the
word "random" should mean.  When I use the word, the meaning is 
intentionally somewhat vague, namely "random enough for the application" 
... whatever that means.  In contrast, the word "entropy" has a very 
specific quantitative meaning rooted in mathematics and physics.

This allows us to understand that the "random" bits coming from /dev/urandom
may or may not contain any entropy.  If your application requires real 
entropy, you should not be using /dev/urandom.  Whether or not the "random"
bits coming from /dev/urandom are good for your application depends greatly
on the application.

> Writing without credits is supposed to be always safe, even if you write
> twice the same data. 

That is strictly true, but perhaps misleading in some abnormal situations.
If you reboot and then reseed the PRNG using the random-seed file, then:
  a) No damage is ever caused by the random-seed file, strictly speaking.
  b) The damage was caused by the _reboot_, which caused the PRNG to lose 
   state.
  c) The question is whether the random-seed file is sufficient to undo the
   damage caused by the reboot.  Using the file a second time does *not* 
   undo the damage, and leaves the system in a much worse state than it 
   was before the reboot.  A random-seed file that has been used must be
   considered severely compromised and unsuitable for re-use.

> If the kernel got any entropy from anywhere else, you
> won't dilute the pool and you're not worse off, correct?

There is never a problem with "dilution".  Shuffling (even lame shuffling)
can never undo the effect of previous shuffling.

> So, if you can never be worse off, why shouldn't we allow for a
> *failure-mode* where we could possibly reuse the random-seed file?

As discussed above, the real damage is caused by the reboot, not by 
the random-seed file.  The random-seed file has the power to repair
this damage ... once ... but not more than once.  If the system is rebooted
and the seed is then reused, the PRNG will replay the same outputs, which
is very bad.  The motto is:

          A  _used_  seed 
      is a  _used-up_  seed.

> After all, userspace CANNOT completely seed the Linux kernel RNG, we can
> never reset its state using writes (it just shuffles the pool without any
> entropy credit), and even a very large entropy credit using IOCTLs will
> cause just a catastrophic reseed, which STILL uses some of the current
> state of the pool anyway AFAIK.

That's irrelevant.  The reboot CAN and DOES completely reset the state
of the PRNG.  This doesn't come from userspace, but it does happen.

The notion of "credit" does not apply to /dev/urandom.

> If we cannot tolerate this failure mode, we will never be able to use
> anything but a TRNG to do early pool shuffling, and must retain what we have
> right now (we shuffle after we have write access, so that we can immediately
> remove the seed file).

OK, you have highlighted a point that was not clear in my previous 
note.  There are several possibilities on the table:
  a) a PRNG that has not been seeded at all, 
  b) a PRNG that has re-used a previous seed,
  c) a PRNG that has properly used a seed for the first time, and
  d) a PRNG that has been initialized from a TRNG.

Obviously we would prefer (d).  The next-best choice is (c).  If in 
some emergency situation (c) is not possible, then (b) is bad and (a)
is even worse.  In some impractical sense (b) is not as bad as (a),
but still (b) is so bad that I cannot recommend it.

There is a tricky tradeoff that needs to be made here.  The question is,
do you want the system to reboot into a state where /dev/urandom is
seemingly functional but insecure, or a state where it is not functional
at all?  This is a nontrivial question.

For more discussion of this point, see below.

We face this tradeoff directly when we consider whether it is good to
reseed early, the earlier the better.  For best security, we
should wait until the random-seed file is available read/write, so that 
we can avoid bad situation (b).  

  Note that if the filesystem is not writeable, there is no way for 
  initscripts to tell whether the random-seed file has been previously
  used, so there is no way of distinguishing the bad situation (b) from
  the good situation (c).

For typical "live CD" systems and other _attended_ systems where the 
random-seed file is not readable, it is probably best to wait until the 
kernel has collected some real entropy (from keyboard events etc.) and 
then use that to seed the PRNG.  And of course if the system has a good 
hardware TRNG then we should rely on that.

If you want a decent level of security, it seems advisable to require
that any unattended system should have a hardware TRNG.  This is what I
have specified whenever I have faced this question in the past.

We must avoid situation (a).  One way to do this would be to make sure
/dev/urandom blocks or throws an error if it is used before it is seeded.
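
The kernel does not do that today.  As a rough userspace stopgap -- not a
substitute for the kernel doing the right thing -- an init script could at
least wait until the kernel's own entropy estimate crosses some threshold
before treating /dev/urandom as seeded.  A sketch, where the 128-bit
threshold and the 60-second timeout are arbitrary illustrative choices:

    # Wait (up to a limit) for the kernel to report some gathered
    # entropy before treating /dev/urandom as seeded.
    avail() { cat /proc/sys/kernel/random/entropy_avail 2>/dev/null ; }
    tries=0
    while [ "$(avail)" -lt 128 ] 2>/dev/null ; do
      tries=$((tries + 1))
      [ "$tries" -ge 60 ] && break    # give up after ~60 seconds
      sleep 1
    done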
 
>> Reusing the random-seed file makes the PRNG very much worse than it would
>> otherwise be.  By way of illustration, suppose you are using the computer
> 
> See above.  Note that in this thread we are concerned only with writes to
> /dev/urandom, not PRNGs in general.  /dev/urandom, as far as we've been told
> (and as far as I could understand from the Linux kernel code) CANNOT be
> truly reseeded, it can only be shuffled.

I do not understand the distinction between "reseeded" and "shuffled".

The essential requirement is to put the PRNG into a state where the
attackers cannot possibly know the state.  If you are using a deck 
of cards as your PRNG, then shuffling is how this is accomplished.  
I don't care whether you call this reseeding or shuffling or whatever.

If you are contemplating putting the /dev/urandom PRNG into some kind
of _known_ state, that is not what we mean by reseeding.  We agree that 
there is no way that userspace software -- by itself -- can do this.  
But a reboot can do it.

For present purposes, shuffling is a metaphor for reseeding, or an 
example of reseeding.  There is no important distinction.

The fundamental problem we are trying to solve is this:  A reboot puts
the PRNG into a known state, a state vulnerable to attack.  The purpose 
of reseeding aka shuffling is to get it out of that state into a state 
that is unknown to the attackers.

>> On the other hand, it is harmless to make the random-seed file larger than
>> it needs to be.
> 
> Let's just use a static size of 512 bytes at late init time, and of 2048
> bytes at shutdown time, then.  Makes for easier scripts, that only need dd
> and nothing else.  

It is easy to calculate the size:

    # poolsize is reported in bits (4096 on current 2.6 kernels);
    # round up to whole bytes, falling back to 512 if /proc is missing.
    if POOLSIZE=$(cat /proc/sys/kernel/random/poolsize 2>/dev/null) &&
       [ -n "$POOLSIZE" ]
    then
      DD_BYTES=$(( (POOLSIZE + 7) >> 3 ))
    else
      DD_BYTES=512
    fi

    dd if=/dev/urandom of=/var/lib/urandom/random-seed \
      bs=1 count=$DD_BYTES 2>/dev/null

This requires dd and some standard sh features, nothing more.

Failing that, using a hard-coded size of 512 is acceptable.  This value
is specified in the comments in drivers/char/random.c so we can at least 
say we are following the instructions.

I see no advantage to preferring 2048 bytes to the calculated and/or
documented size, at shutdown time or otherwise.  This would just be extra
complexity.  It would cause people to ask why it was done, and we would not
have a good answer for them.

> .... without draining more entropy than the strictly needed
> during boot.

The fact that reading from /dev/urandom depletes the supply of entropy used
by /dev/random is a longstanding weakness of /dev/urandom.  It allows a
nonprivileged user to create a denial-of-service -- intentionally or even 
accidentally -- against /dev/random.  This should have been fixed years ago.
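
The weakness is easy to demonstrate.  Something along the following lines,
run from an ordinary unprivileged account, pulls the kernel's entropy
estimate down toward zero and leaves readers of /dev/random blocking
(this reflects current kernel behavior; a fixed kernel would not do this):

    # Heavy reads from /dev/urandom drain the same input pool that
    # /dev/random depends on, so the estimate drops toward zero.
    dd if=/dev/urandom of=/dev/null bs=1M count=100 2>/dev/null
    cat /proc/sys/kernel/random/entropy_avail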

Note that Turbid provides a yarrow-like "stretched RNG" that does not have
this problem.

> Alternatively, we could just use 512 bytes for everything by default, and
> have that parameter configurable in /etc/default/random.

Again, I would prefer to see it calculated, but if for any reason it
cannot be calculated, a parameterized "512" should be good enough for
all practical purposes.
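
If the configurable route is taken, the script fragment stays small.  One
possible shape, where /etc/default/random is the file suggested above and
SEED_BYTES is just a placeholder variable name:

    # Default seed size, overridable by the administrator.
    SEED_BYTES=512
    [ -r /etc/default/random ] && . /etc/default/random

    dd if=/dev/urandom of=/var/lib/urandom/random-seed \
      bs=1 count="$SEED_BYTES" 2>/dev/null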

>> Sometimes /dev/urandom is advertised as a "cryptographically secure PRNG".
>> If that's what it is supposed to be, it should block or throw an error if
>> it is used before it is seeded.  To say the same thing the other way: if 
> 
> It is considered to be always seeded, 

"Considering" it so does not make it actually so.

> Its seed (the primary pool)
> gets a few bits of entropy right away, from disk activity and other sources.

For a diskless client, there may not be any entropy from disk activity or
any other sources.  Also note that solid-state disks are becoming more
popular, and they do not produce entropy the same way rotating disks do.

> It doesn't matter for this discussion how disputable the randomness of these
> bits are, because unless you're going to remove such entropy contributions
> from the kernel, the pool will *always* already have something inside when
> control is handed to userspace.

Alas, that is not true.  Certainly not "always".  It is a notorious 
weakness of the Linux RNG system that in the worst case there is no 
positive lower bound on the amount of entropy available to the kernel.
Also note that "a few bits" of entropy are not much better than nothing, 
because the attacker can just check all the possibilities.

Exponentials are tricky.  Exp(large number) is very large indeed, but
exp(small number) is not large at all.  For example, 8 bits of entropy
means only 2^8 = 256 possibilities for the attacker to search, whereas
128 bits means 2^128, which is far beyond any practical search.

> So, unless one is going to do kernel surgery, urandom will be already
> "initialized" when control is handled over to userspace.  

But not initialized in a securely usable way.  It will be usable only
for the lowest of low-grade applications.  It will not be safe to use
for any security-related applications.

>   But I am certainly NOT going to
> advocate increasing pool size over the default 512 bytes (which would help
> the kernel handle sudden spikes of entropy requests),

Nobody is recommending any such increase.

>  because a smaller size
> probably helps protects you from overestimation of the entropy entering the
> pool (as we keep adding data to it even after it is full, thus eventually we
> will really have that much entropy in the pool, and the larger the pool, the
> more time it will take for that to happen).

That's irrelevant.  First of all, the concept of overestimation does not
apply to /dev/urandom.  (It might sometimes apply to /dev/random, but that
is not the topic of today's conversation.  And messing with the poolsize
is not a valid way of preventing overestimation.)

> One could always have a init pool (small), and swap it with a large pool
> later, but I doubt people would accept such a change in the kernel unless it
> is backed up by some serious research that proves it is well worth the
> complexity on scenarios for which "get a real TRNG" isn't an answer.

We agree that people would not accept such a change in the kernel.
Usually "get a real TRNG" is the only sensible answer.  If that is
not the answer, then pool-swapping is not the answer either.  I cannot 
imagine any scenario in which pool-swapping is the answer.

>> I recommend not messing with the built-in poolsize.
> 
> In that, we agree.

:-)

>> Seeding should happen
>>  -- after the random-seed file becomes readable, i.e. after the
>>   relevant filesystem is mounted.
>>  -- as soon thereafter as possible
> 
> Agreed, but that DOES mean we have a failure mode where the seed file can be
> used twice to stir the pool, should we crash/reboot before filesystems are
> mounted read-write.

Good point.  I stand corrected.  I am no longer confident that "earlier
is better".  I am no longer confident that readable (but not writeable)
is acceptable.

> And that failure mode is not rare.  It will happen when fsck finds anything
> wrong in the / partition, and that is not nearly as uncommon as one would
> hope it to be.

There are nasty tradeoffs involved here.  
 -- Reseeding as early as possible is clearly better in the normal case,
  but I cannot in good conscience recommend an "earlier is better" policy
  since it causes security failures in abnormal cases.  We have to take
  this seriously because these cases are not rare.  We have to take this
  doubly seriously because an attacker could actively _cause_ such cases
  by forcing repeated reboots.
 -- It could be asked whether reseeding badly is better than not reseeding
  at all, but this is a question from hell.  The one case is like leaving
  the front door of your house standing open, and the other case is like
  leaving it seemingly closed but unlocked.  It is a security problem either
  way.  I cannot endorse either option.

Therefore, as a starting point for further discussion, I would propose
 *) /dev/urandom should block or throw an error if it is used before
  it is properly seeded.
 *) if a random-seed file is needed at all, reseeding should wait until 
  the file is readable and writable, since this is the only way AFAICT 
  that we can ensure that it will not be reused.
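
To make the second point concrete, here is a rough sketch of the ordering
I have in mind.  The key property is that the old seed is destroyed
(overwritten with fresh output) before anything else is allowed to consume
/dev/urandom, and that we refuse to seed at all unless we can guarantee
that destruction:

    SEED=/var/lib/urandom/random-seed

    # Seed only if the file is both readable and writable; on a
    # read-only filesystem the -w test fails, so we never use a seed
    # that we cannot immediately invalidate.
    if [ -r "$SEED" ] && [ -w "$SEED" ] ; then
      cat "$SEED" > /dev/urandom        # stir the pool (no credit)
      dd if=/dev/urandom of="$SEED" \
        bs=1 count=512 2>/dev/null      # immediately replace the old seed
    fi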

I leave this as a topic for discussion.  We foresee the possibility
that under some abnormal (but not rare) conditions, the system will 
come up in a state where /dev/urandom is unusable, because the 
filesystem holding the random-seed is not writable.  This is not ideal
but AFAICT it is preferable to the alternatives.  If anybody can think
of cases where this is not acceptable, please speak up.

Note that in the most common case, namely a single-user "maintenance"
shell provoked by fsck problems, a great many subsystems are offline,
and adding /dev/urandom to the list of unusable subsystems would
not be completely out of character.

> And we should move the seed file to somewhere inside /etc or /lib.  It is as
> simple as that.  /var cannot be used for any data you need at early
> userspace.

There are strong arguments for _not_ putting the random-seed in /etc
or /lib.  There are lots of systems out there which for security 
reasons and/or performance reasons have /etc and /lib on permanently
readonly partitions.

I think /var is as good a place as any.  More generally, if a random-seed
file is needed at all, it needs to be on a partition with the following
properties:
 -- local
 -- persistent
 -- readable and writable
 -- mounted read/write early enough that /dev/urandom can be reseeded
  before any of its outputs are needed.

On a system with a good TRNG, the random-seed file is not needed at all.
 
>> Updating should happen again during shutdown, if possible.
> 
> Agreed.

:-)

> Any embedded system worth its price has a TRNG inside somewhere.  Servers
> and workstations can add TRNGs to their internal USB ports,

Or use Turbid.  Most mainboards these days include enough built-in
audio to provide a source of real industrial-strength entropy, so
you don't need to spend even one cent on additional hardware, USB
or otherwise.

If by chance the board lacks an audio system, you can get USB 
audio devices for $5.00 or less, and connect Turbid to that.

Failing that, you could plug in a USB memory stick, and use that 
as a convenient place to store the random-seed file.  This is way 
better than nothing, but obviously not as good as a TRNG.

> Well, nowadays datacenters would boot an hypervisor, do all the proper RNG
> initialization for the hypervisor 

Yes, that is a reasonable tactic.  I've done similar things in the
past. 

> (over the network from an entropy source
> box, if needed),

Network entropy boxes give rise to an interesting (but manageable) 
chicken-and-egg problem:  If the client has enough stored randomness to
get started, it can establish a secure network connection and use that
to acquire more randomness from an entropy source box.  In contrast,
if the client starts out with no randomness, it cannot obtain any,
because it has no way of setting up a secure connection.  This applies
equally to a hypervisor and/or to its hosted VMs.  The design considerations
are the same either way.


