[Pkg-sysvinit-devel] init.d/urandom : random-seed [patches]

John Denker jsd at av8n.com
Sun Sep 12 18:15:48 UTC 2010


On 09/12/2010 08:45 AM, Henrique de Moraes Holschuh wrote:

> 1. We are NOT dealing with the initial seeding of a regular PRNG, we're
> dealing with an entropy pool (and we're not seeding it either).
> 
> This pool has two dimensions (while typical FIPS-style PRNGs have only
> one): the pool state vector, AND the pool entropy tracking counter.

I agree that it is useful to distinguish these two ideas
 -- the state vector, versus
 -- the "entropy tracking counter"
 
> These two dimensions are orthogonal in the Linux kernel random pool.
> The kernel assumes a certain amount of entropy is lost from the pool
> each time a byte is read from it (I believe it is a 1:1 tracking, but I
> didn't check) and tracks that, not allowing any more entropy withdrawals
> when the count reaches zero.  It depends on external information to
> increase the pool entropy tracking.

We must also distinguish the behavior of
 -- /dev/urandom, versus
 -- /dev/random

a) The underlying driver links these two in a peculiar limited way.  
b) Opinions differ as to whether the details of this linkage are 
 done in the optimal way.
c) Neither (a) nor (b) is in any way relevant to the patches for
 init.d/urandom that we are considering here.

The tracking between /dev/random and /dev/urandom is 1:1 *except* 
when the entropy counter is zero or near-zero.  When the entropy 
counter is zero, there is no tracking whatsoever.  In this case, 
the behavior of /dev/urandom is for all practical purposes the
same as a "FIPS-style PRNG" ... and meanwhile /dev/random has no 
interesting behavior at all, i.e. it just blocks.

This case is precisely the use-case that has not been properly
dealt with heretofore, and is precisely the case that makes the
patches in question important.
 -- the entropy counter is zero
 -- /dev/urandom is decoupled from /dev/random
 -- /dev/urandom is just a PRNG
 -- the PRNG needs to be properly seeded.

The behavior of /dev/random under other conditions is interesting
but not relevant.
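
For anyone who wants to watch the counter on a running Linux box, here
is a minimal sketch.  It assumes the kernel exposes its estimate (in
bits) at /proc/sys/kernel/random/entropy_avail; the helper name
entropy_estimate is made up for illustration and is not part of
init.d/urandom.

```shell
#!/bin/sh
# Assumed location of the kernel's entropy estimate (in bits) on Linux.
ENTROPY_FILE=/proc/sys/kernel/random/entropy_avail

# Hypothetical helper: print the estimate stored in the file named by $1,
# or the word "unknown" if that file cannot be read on this system.
entropy_estimate() {
    if [ -r "$1" ]; then
        cat "$1"
    else
        echo unknown
    fi
}

entropy_estimate "$ENTROPY_FILE"
```

A value at or near zero early in boot corresponds to the case described
above, where /dev/urandom is behaving as a plain PRNG.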

> 2. The Linux kernel DOES NOT offer any way to initialize the state
> vector.  There is no seeding of the entropy pool, just shuffling.

Two remarks:
 -- The important case we are considering is the case of a reboot
  followed by initialization via init.d/urandom.  In the worst
  case, which is also the typical case, this *does* have the effect 
  of putting the state vector into a reproducible state.
 -- The init.d/urandom script uses a file called "random.seed" 
  and IMHO it is within bounds to refer to it as the seed.  This
  is consistent with long-established terminology in the crypto / 
  RNG community.  I will continue to use this terminology.  For
  example, we can say that the whole purpose of init.d/urandom 
  is to seed the PRNG.

> In fact, the pool state vector is *NOT* initialized at all (it starts
> with whatever residual state was left there by the system platform after
> a cold or warm boot -- which often is all zeros or all 0xFF or a mix of
> the two, so it may well offer zero variance across boots on many
> systems).

Agreed.  Zero variance is the worst case and also the typical case.

> The entropy pool counter starts at zero when the kernel boots.

Agreed.  Also note that init.d/urandom does not attempt to set this 
counter.

> 3. The Linux entropy pool shuffle is a *reversible* transformation, and
> only four operations are offered to the rest of the kernel and to
> userspace to interact with the pool:

That's true but misses the point.  The important point is the
*irreversible* loss of state due to the reboot.

The raison d'être of the random.seed file, and the init.d/urandom
script, is to minimize the harm caused by this irreversible loss.
Overemphasizing the reversible operations to the neglect of the
centrally-important irreversible event is not helpful.

> 3.1. entropy credit (a shuffle operation that can add to the entropy
> tracking counter), which is privileged.

This is true but irrelevant to init.d/urandom.
 
> 3.2. shuffle operation WITHOUT entropy credit, which is non-privileged.
> 
> 3.3. entropy drain, locked to entropy tracking counter (/dev/random)
> 
> 3.4. entropy drain, unlocked to entropy tracking counter (/dev/urandom)
> 
> Both 3.1 and 3.2 operations take any variable amount of data which is
> used to shuffle the pool state vector.  As such, you can never reduce
> the entropy stored in the state vector by using operation (3.2).

This is true but irrelevant to init.d/urandom.  The entropy counter 
starts at zero and init.d/urandom does not change it.  When the
entropy counter is zero, the question of "reducing" it really
doesn't need to be discussed.
 
> 3.3 and 3.4 are not very relevant to our problem, other than the fact 3.3
> will always drain entropy from the pool and 3.4 can drain the entropy
> from the pool up to a certain limit (but not empty it).  If entropy is
> drained from the pool, the entropy tracking counter is updated to
> account for the withdrawal of entropy.
> 
> The kernel protects the state vector from direct observation using a
> hash function, but that's not important for this analysis.

We agree that these two points are not relevant to the present
discussion.

> 4. The operation we are dealing with (3.2) credits NO entropy to the
> pool entropy counter.  In fact, it is a non-privileged operation that
> can be carried out by *any user* in the system, including non-trusted
> ones.  It is not restricted in any way whatsoever, not even through
> filesystem permissions.
> 
> I have no idea if that was clear to the people on the crypto list.  But
> those are all relevant points.

These points are open to misinterpretation.  Seeding the PRNG
by writing stuff to /dev/urandom is unprivileged because it is
harmless.  However, if this is done in an unprotected way, it
is useless as well as harmless.  The init.d/urandom script goes
to some trouble -- as it should -- to protect the random.seed
file using file permissions et cetera.  A properly _protected_
write to /dev/urandom is useful in ways that an unprotected
write is not.

This is a subtle point, but central to the present discussion.
Harmless is different from useless, and vice versa.
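
To make "protected" concrete, here is a rough sketch of saving a seed
so that only its owner can read it.  This is an illustration under
stated assumptions, not the code in init.d/urandom: the helper name
save_seed, the 512-byte size, and the example path are all made up.

```shell
#!/bin/sh
# Hypothetical helper: save_seed SOURCE DEST
# Copies 512 bytes from SOURCE into DEST.  DEST is created (or emptied)
# and restricted to mode 0600 before any seed material is written to it.
save_seed() {
    (
        umask 077                     # files created here are owner-only
        : > "$2" &&                   # create or empty the destination
        chmod 600 "$2" &&             # tighten a pre-existing file too
        dd if="$1" of="$2" bs=512 count=1 2>/dev/null
    )
}

# Example with an illustrative path (not the one used by the real script):
# save_seed /dev/urandom /tmp/example-random-seed
```

The subshell keeps the umask change from leaking into the caller.  If
the file were world-readable, anyone could learn the seed, and writing
it to /dev/urandom at the next boot would be harmless but useless.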

>> There was 100% consensus on the cryptography list that using
>> the date/time was a good idea.  Using the entire kernel log
>> was not discussed, and I guarantee you that it would not
>> receive consensus.  I for one would object that it is not
>> useful, let alone necessary.
> 
> Well, AFAIK, you have three situations you want to "fix":
> 
> S1. low variance (or no variance) across cold or warm boots on the same
> equipment.

Yes, that is part of the problem that needs to be fixed.

> S2. low variance (or no variance) across cold or warm boots across
> different units of the same equipment model, especially when booted at
> the same time.

Agreed. To that I might add that various things you might
think of as distinguishing features (such as MAC addresses 
and mobo serial numbers) are all-too-easy for attackers to
figure out, so they are of little value.

> S3. low variance across synchronized or unsynchronized cold or warm boots
> of unrelated equipment.
> 
> With the constraints:
> 
> C1. All equipment are booting exactly the same software (live cd).

Yes and no.  This is an issue, but not necessarily an inflexible
constraint.  Some specific constructive suggestions have been
made for alleviating this problem.  They are tangentially
related to the present discussion.
 
> C2. Entropy from previous runs is not available
> 
> C3. There is very little real entropy being gathered by the kernel from
> the network and storage devices

We agree that C2 and C3 define the use-case where the patches
in question are important.

> Now, the RTC date and time helps you when your RTC is not broken, but it
> is not of much help when a number of devices are powered up/rebooted at
> the same time should their RTCs be in sync.  It also has not that much
> entropy.

That's true, and that's why we need to address issue C1.

> The boot kernel log CAN supply you with some variance in situations
> where the RTC will not.  You should use both in case the kernel is not
> logging the RTC timestamp, but it is interesting to note that it usually
> does log that.

Variance that is known (or guessable) by the attacker is not
useful variance.  Increasing the volume of known "stuff" that 
gets written to /dev/urandom is not useful.

The following combination *is* useful:
 Part 1: something that is provably different each time the
  machine is rebooted, i.e. each time there has been an
  irreversible loss of state, plus
 Part 2: a goodly amount (e.g. 4k bits) of seed that is 
  unknown to the attacker, and is unique to this machine.

This two-part solution is precisely what is being proposed.
The patches being discussed here are documented to be part
1 of this two-part solution.
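
As a rough sketch of how the two ingredients combine (the date/time
plus the saved seed), something along the following lines would do.
The helper name seed_pool and its optional second argument are made up
for illustration, and the sketch assumes a date(1) that understands %N,
as GNU date does.  It is not the text of the patch.

```shell
#!/bin/sh
# Hypothetical helper: seed_pool SEEDFILE [POOL]
# Writes a high-resolution timestamp, followed by the saved seed, to POOL.
# POOL defaults to /dev/urandom; a regular file can be given for testing.
seed_pool() {
    {
        date +%s.%N                   # differs from one boot to the next
        if [ -r "$1" ]; then
            cat "$1"                  # per-machine secret, if available
        fi
    } > "${2:-/dev/urandom}"
}

# Example with an illustrative seed path:
# seed_pool /tmp/example-random-seed
```

The timestamp goes in first and the seed second.  Neither write credits
any entropy to the kernel's counter.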

> And neither is sufficient for full security, as both are known to the
> rest of the system.

We agree that neither part 1 /by itself/ nor part 2 /by itself/
is anything to write home about.  However, the whole is much much
greater than the sum of the parts.

Adopting the "part 1" patches we are discussing is harmless in the
absence of part 2, and highly valuable if/when part 2 is implemented.

Among the experts on the cryptography list, there was 100% consensus
that this two-part solution made sense.

Therefore it seems that these patches should be adopted.

If anyone wants to *also* dump the logs into /dev/urandom, I 
suggest opening a separate ticket, rather than piggybacking 
it onto the tickets that have already been opened.  This is
consistent with the policy of having only one issue per ticket.

>> For present purposes, the clock serves as a counter,
>> with the advantage that it is present on almost all
>> platforms.
> 
> So is the kernel log.

No, the kernel log is not a counter, except insofar as it
contains timestamps derived from the clock.  Using the clock
directly is simpler and better.  It is significantly better,
since date +%s.%N has finer resolution than the usual
timestamps.
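
The difference in resolution is easy to see at a shell prompt.  This
small sketch assumes GNU date, since %N is not supported by every
date(1) implementation; the helper name stamp is made up for
illustration.

```shell
#!/bin/sh
# Hypothetical helper: print seconds since the epoch with nanoseconds.
stamp() {
    date +%s.%N
}

date +%s      # whole seconds only
stamp         # seconds plus a nine-digit nanosecond field
stamp         # a second call lands on a later nanosecond value
```

Two calls made within the same second still produce different strings,
which is the property that matters when the clock is used as a counter.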

>> The clock-time is guaranteed to be different on each
>> reboot.  The log is not guaranteed to be different,
> 
> The clock-time is not different across a series of devices powered up at
> the same time, 

That's why we need the unique, unshared random.seed file,
i.e. part 2 of the two-part solution.

> nor across boots of a device with a defective RTC.  

If the hardware is broken, don't expect software to solve
the problem.  The fact that the patch submitted here fails
to work on broken hardware cannot be considered a valid
criticism of the patch.

> We _need_ to know the properties of operation 3.2 on the other kernels.
> If it is a "seed" instead of "shuffle", you have to send all the data in
> a single write.  If it is a shuffle that can weaken the pool, you cannot
> use anything that is available later.

There is no known Linux /dev/random that has the weakness you 
describe.  Furthermore, since the patch in question deposits the 
date/time before depositing the random.seed, it would be harmless 
even in this fanciful scenario.

Again, this cannot be considered a valid criticism of the patch.


