[Pkg-sysvinit-devel] init.d/urandom : random-seed [patches]

Henrique de Moraes Holschuh hmh at debian.org
Sun Sep 12 15:45:37 UTC 2010


On Sat, 11 Sep 2010, John Denker wrote:
> On 09/11/2010 11:33 AM, Henrique de Moraes Holschuh wrote:
> > BTW, don't just cat the date into /dev/random.  Cat the entire contents of
> > the kernel log buffer as well.
> 
> Can you explain why you think that would be worthwhile?

Yes.  Just to make a few points clear:

1. We are NOT dealing with the initial seeding of a regular PRNG, we're
dealing with an entropy pool (and we're not seeding it either).

This pool has two dimensions (while typical FIPS-style PRNGs have only
one): the pool state vector, AND the pool entropy tracking counter.

These two dimensions are orthogonal in the Linux kernel random pool.
The kernel assumes a certain amount of entropy is lost from the pool
each time a byte is read from it (I believe it is a 1:1 tracking, but I
didn't check) and tracks that, not allowing any more entropy withdrawals
when the count reaches zero.  It depends on external information to
increase the pool entropy tracking.
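For illustration, the entropy tracking counter can be observed from
userspace.  This is a minimal sketch, assuming the kernel exposes its
estimate (in bits) at /proc/sys/kernel/random/entropy_avail; the helper
name read_entropy_estimate is hypothetical.

```python
from pathlib import Path

# Assumption: procfs is mounted and the kernel publishes its entropy
# estimate (in bits) at this path.
ENTROPY_AVAIL = "/proc/sys/kernel/random/entropy_avail"


def read_entropy_estimate(path=ENTROPY_AVAIL):
    """Return the kernel's entropy estimate in bits, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing, unreadable, or not a plain integer.
        return None


if __name__ == "__main__":
    print("entropy estimate (bits):", read_entropy_estimate())
```

This only reads the counter; it says nothing about the contents of the
pool state vector.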


2. The Linux kernel DOES NOT offer any way to initialize the state
vector.  There is no seeding of the entropy pool, just shuffling.

In fact, the pool state vector is *NOT* initialized at all (it starts
with whatever residual state was left there by the system platform after
a cold or warm boot -- which often is all zeros or all 0xFF or a mix of
the two, so it may well offer zero variance across boots on many
systems).

The entropy pool counter starts at zero when the kernel boots.

3. The Linux entropy pool shuffle is a *reversible* transformation, and
only four operations are offered to the rest of the kernel and to
userspace to interact with the pool:

3.1. entropy credit (a shuffle operation that can add to the entropy
tracking counter), which is privileged.

3.2. shuffle operation WITHOUT entropy credit, which is non-privileged.

3.3. entropy drain, locked to entropy tracking counter (/dev/random)

3.4. entropy drain, unlocked to entropy tracking counter (/dev/urandom)

Both 3.1 and 3.2 operations take any variable amount of data which is
used to shuffle the pool state vector.  As such, you can never reduce
the entropy stored in the state vector by using operation (3.2).
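The reasoning here can be checked on a toy model.  The sketch below is
NOT the kernel's mixing function; it is a hypothetical 8-bit "pool"
mixed with XOR and rotate, chosen only because each step is reversible.
For any fixed input, the mix maps the 256 possible states onto 256
distinct states, so feeding known or chosen data cannot collapse the
set of possible states.

```python
def rotl8(x, n):
    """Rotate an 8-bit value left by n bits."""
    n %= 8
    return ((x << n) | (x >> (8 - n))) & 0xFF


def toy_mix(state, data):
    """Mix bytes into a toy 8-bit state (hypothetical, NOT the kernel mixer)."""
    for b in data:
        # XOR with a known byte and a rotation are both reversible steps.
        state = rotl8(state ^ b, 3)
    return state


def is_permutation(data):
    """True if, for this fixed input, all 256 start states stay distinct."""
    return len({toy_mix(s, data) for s in range(256)}) == 256


if __name__ == "__main__":
    for sample in (b"", b"\x00" * 16, b"attacker-chosen bytes"):
        print(sample, is_permutation(sample))
```

If the mix were not reversible, two different start states could end up
in the same final state, and that is exactly how uncertainty about the
state would be lost.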

3.3 and 3.4 are not very relevant to our problem, other than the fact 3.3
will always drain entropy from the pool and 3.4 can drain the entropy
from the pool up to a certain limit (but not empty it).  If entropy is
drained from the pool, the entropy tracking counter is updated to
account for the withdrawal of entropy.

The kernel protects the state vector from direct observation using a
hash function, but that's not important for this analysis.


4. The operation we are dealing with (3.2) credits NO entropy to the
pool entropy counter.  In fact, it is a non-privileged operation that
can be carried out by *any user* in the system, including non-trusted
ones.  It is not restricted in any way whatsoever, not even through
filesystem permissions.
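As a minimal sketch of operation 3.2 from userspace: a plain write to
the random device, assuming the device node is /dev/random and is
writable by the caller.  The function name mix_without_credit is
hypothetical.

```python
def mix_without_credit(data, device="/dev/random"):
    """Write bytes to the random device and return the byte count written.

    Assumption: on Linux, a plain write to this device mixes the data
    into the pool without crediting any entropy.
    """
    # buffering=0 gives a raw file object, so write() maps to one write call
    # and returns the number of bytes the device accepted.
    with open(device, "wb", buffering=0) as dev:
        return dev.write(data)


if __name__ == "__main__":
    import sys
    # Example: python3 mix.py "some text"
    payload = " ".join(sys.argv[1:]).encode() or b"example"
    print("bytes written:", mix_without_credit(payload))
```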

I have no idea if that was clear to the people on the crypto list.  But
those are all relevant points.

> There was 100% consensus on the cryptography list that using
> the date/time was a good idea.  Using the entire kernel log
> was not discussed, and I guarantee you that it would not
> receive consensus.  I for one would object that it is not
> useful, let alone necessary.

Well, AFAIK, you have three situations you want to "fix":

S1. low variance (or no variance) across cold or warm boots on the same
equipment.

S2. low variance (or no variance) across cold or warm boots across
different units of the same equipment model, especially when booted at
the same time.

S3. low variance across synchronized or unsynchronized cold or warm boots
of unrelated equipment.

With the constraints:

C1. All equipment is booting exactly the same software (live CD).

C2. Entropy from previous runs is not available.

C3. There is very little real entropy being gathered by the kernel from
the network and storage devices.

Now, the RTC date and time helps you when your RTC is not broken, but it
is not of much help when a number of devices are powered up/rebooted at
the same time and their RTCs are in sync.  It also does not carry much
entropy.

The boot kernel log CAN supply you with some variance in situations
where the RTC will not.  You should use both in case the kernel is not
logging the RTC timestamp, but it is interesting to note that it usually
does log that.
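A minimal sketch of collecting both sources, assuming a dmesg binary is
on PATH and the caller is allowed to read the kernel log.  The function
name gather_boot_data is hypothetical, and the result is low-grade,
non-secret data meant only for an uncredited shuffle.

```python
import subprocess
import time


def gather_boot_data():
    """Collect wall-clock time plus the kernel log, if it can be read."""
    parts = [str(time.time_ns()).encode()]
    try:
        # Assumption: `dmesg` exists and the kernel log is readable here.
        result = subprocess.run(["dmesg"], capture_output=True, timeout=10)
        parts.append(result.stdout)
    except (OSError, subprocess.SubprocessError):
        # No kernel log available; fall back to the timestamp alone.
        pass
    return b"\n".join(parts)


if __name__ == "__main__":
    data = gather_boot_data()
    print("collected", len(data), "bytes")
```

The collected bytes could then be written to the random device as an
operation 3.2 shuffle.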

And neither is sufficient for full security, as both are known to the
rest of the system.

There are other sources of device-specific information, but they're
harder to get to during early system init or have less variance and are
restricted to subsets of the devices (lspci, dmidecode...).  They could
be used as well, of course.

The use of any such sources DOES *require* that operation 3.2 is
entirely safe at all times, even if it is supplied known or specially
crafted data.

Now, the entropy density of the kernel log is low, so feeding it to
shuffle the pool should be "inefficient".  But the whole deal is not about
speed.  I did not take timings on a slow device to know how much time it
takes to feed an 8KiB text file to /dev/random.  On my desktop it is
negligible (it accepted 1MiB at 64MiB/s), so I can't get a relevant
datapoint.

> The cryptographic purpose would be fully accomplished by
> a humble counter, so long as each time it was used it
> differed by even _one bit_ from all previous values.

Which is a pretty strong requirement.

> For present purposes, the clock serves as a counter,
> with the advantage that it is present on almost all
> platforms.

So is the kernel log.

> The clock-time is guaranteed to be different on each
> reboot.  The log is not guaranteed to be different,

The clock-time is not different across a series of devices powered up at
the same time, nor across boots of a device with a defective RTC.  The
kernel log is sensitive to device initialization timing (even with
timestamping off) and *may* provide a bit more variance (this is not
guaranteed, as usual).  Unlike the RTC, it *is* sensitive to platform
setup, and very often to unique device IDs (such as MAC addresses).

> except insofar as it includes timestamps that depend
> on the clock.

Indeed.  It is not *guaranteed* to be different across reboots of the
same device (although it WILL be on any x86/x86-64 using the standard
Debian kernel if the RTC is not broken, as the RTC timestamp is logged).
But it has other relevant features, as explained above.

> > HOWEVER one should contact the porters for the arches with other kernels and
> > get the relevant data from them, nobody around here claimed any knowledge of
> > how /dev/random in FreeBSD (or The Hurd for that matter) behaves.  Heck, I
> > don't even KNOW if the initscript runs there or not... :(
> 
> That is IMHO a good enough reason to not bother.  Since

You misunderstand.  The above is needed for ALL of your proposed
changes.

We _need_ to know the properties of operation 3.2 on the other kernels.
If it is a "seed" instead of "shuffle", you have to send all the data in
a single write.  If it is a shuffle that can weaken the pool, you cannot
use anything that is available later.
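If the other kernel does treat the write as a "seed", the single-write
approach could look like the sketch below.  It assumes the device node
is /dev/random and accepts the whole payload in one write() call; the
function name seed_in_one_write is hypothetical, and the semantics on
non-Linux kernels are exactly what still needs to be confirmed.

```python
import os


def seed_in_one_write(chunks, device="/dev/random"):
    """Join all chunks and pass them to the device in a single write()."""
    payload = b"".join(chunks)
    fd = os.open(device, os.O_WRONLY)
    try:
        # One system call.  A short write is possible on some devices, so
        # the caller should compare the return value with len(payload).
        return os.write(fd, payload)
    finally:
        os.close(fd)


if __name__ == "__main__":
    written = seed_in_one_write([b"date: 2010-09-12\n", b"kernel log...\n"])
    print("bytes written:", written)
```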

Also, a different kernel might allow a different pool size, or resizing.

And they might not be using (or be compatible with) the initscript at
all in the first place.

> it is not worth doing at all, it is not worth bothering
> the architecture folks about it.

See above.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


