dieharder
by
Robert G. Brown
Duke University Physics Department
Durham, NC 27708-0305
Copyright Robert G. Brown, 2008
Abstract
Dieharder: A Random Number Test Suite
Version 2.28.1
Robert G. Brown (rgb)
Dirk Eddelbuettel
Welcome to the dieharder distribution website.
Version 2.28.1 is the current snapshot. Some of the documentation
below may not quite be caught up to it, but it should be close.
Dieharder is a random number generator (rng) testing suite.
It is intended to test generators, not files of possibly
random numbers as the latter is a fallacious view of what it means
to be random. Is the number 7 random? If it is generated by a random
process, it might be. If it is made up to serve the purpose of some
argument (like this one) it is not. Perfect random number generators
produce "unlikely" sequences of random numbers -- at exactly the right
average rate. Testing a rng is therefore quite subtle.
dieharder is a tool designed to permit one to push a weak generator
to unambiguous failure (at the e.g. 0.0001% level), not leave one in the
"limbo" of 1% or 5% maybe-failure. It also contains many tests and is
extensible so that eventually it will contain many more tests than it
already does.
If you are using dieharder for testing rngs either in one of its
prebuilt versions (rpm or apt) or built from source (which gives you the
ability to e.g. add more tests or integrate your rng directly with
dieharder for ease of use) you may want to join either or both of the
dieharder-announce
or the
dieharder-devel
mailing lists here. The former should be very low traffic -- basically
announcing when a snapshot makes it through development to where I'm
proud of it. The latter will be a bit more active, and is a good place
to post bug reports, patches, suggestions, fixes, complaints and
generally participate in the development process.
About Dieharder
At the suggestion of Linas Vepstas on the Gnu Scientific Library
(GSL) list this GPL'd suite of random number tests will be named
"Dieharder". Using a movie sequel pun for the name is a double tribute
to George Marsaglia, whose "Diehard battery of
tests" of random number generators has enjoyed years of enduring
usefulness as a test suite.
The dieharder suite is more than just the diehard tests cleaned up
and given a pretty GPL'd source face in native C. Tests from the Statistical Test Suite (STS)
developed by the National Institute for Standards and Technology (NIST)
are being incorporated, as are new tests developed by rgb. Where
possible or appropriate, all tests that can be parameterized
("cranked up") to where failure, at least, is unambiguous are so
parameterized and controllable from the command line.
A further design goal is to provide some indication of why a
generator fails a test, where such information can be extracted during
the test process and placed in usable form. For example, the
bit-distribution tests should (eventually) be able to display the actual
histogram for the different bit ntuplets.
Dieharder is by design extensible. It is intended to be the "Swiss
army knife of random number test suites", or if you prefer, "the last
suite you'll ever ware" for testing random numbers.
Dieharder can be freely downloaded from the Dieharder
download site. On this page there should be a long list of previous
versions of dieharder, and it should tell you what is the current
snapshot. The version numbers have the following specific
meaning which is a bit different than usual:
- First number (major). Bumped only when major goals in the design
roadmap are reached (for example, finishing all the diehard tests).
Version 1.x.x, for example, means that ALL of diehard (and more) is now
incorporated in the program. Version 2.x.x means that the tests
themselves have been split off into the libdieharder library, so that
they can be linked into scripting languages such as R, new UIs, or user
code. 3.x.x would be expected to indicate that the entire STS suite is
incorporated, and so on.
- Second number (first minor). This number indicates the number of
tests currently supported. When it bumps, it means new tests have been
added from e.g. STS, Knuth, Marsaglia and Tsang, rgb, or elsewhere.
- Third number (second minor). This number is bumped when
significant features are added or altered. Bug fixes bump this number,
usually after a few bumps of the release number for testing snapshots.
This number and the release are reset to 0 when the major is bumped or a
new test is added to maintain the strictly increasing numerical value on
which e.g. yum upgrades rely.
The single-tree dieharder sources (.tgz and .src.rpm) files can be
downloaded from this directory. In addition, binary rpm's built on top
of Fedora Core whatever (for either i386 or both of x86_64) may be
present. Be warned: the GSL is a build requirement. The current
packaging builds both the library and the dieharder UI from a single
source rpm, or from running "make" in the toplevel directory of the
source tarball. With a bit of effort (making a private rpm building
tree), "make rpm" should work for you as well in this toplevel
directory.
This project is under very active development. Considerable effort
is being expended so that the suite will "run out of the box" to produce
a reasonably understandable report for any given random number generator
it supports via the "-a" flag, in addition to the ability to
considerably vary most specific tests as applied to the generator. A
brief synopsis of command options to get you started is presented below.
In general, though, documentation (including this page, the man page,
and built-in documentation) may lag the bleeding edge snapshot by a few
days or more.
An rpm installation note from Court Shrock:
I was reading about your work on dieharder. First, some info
about getting dieharder working in Gentoo:
cd ~
emerge rpm gsl
wget
http://www.phy.duke.edu/~rgb/General/dieharder/dieharder-0.6.11-1.i386.rpm
rpm -i --nodeps dieharder-0.6.11-1.i386.rpm
Rebuilding from tarball source should always work as well, and if you
are planning to play a lot with the tool may be a desireable way to
proceed as there are some documentation goodies in the ./doc
subdirectory and the ./manual subdirectory of the source tarball (such
as the original diehard test descriptions and the STS white paper).
George Marsaglia retired from FSU in 1996. For a brief time diehard
appeared to have finally disappeared from FSU webspace, but what had
really happened is google's favorite path to it had disappeared when his
personal home directory was removed. Diehard is still there, at the URL
http://www.stat.fsu.edu/pub/diehard
as well as at a Hong Kong website. The source code of diehard itself is
(of course) Copyright George Marsaglia but Marsaglia did not incorporate
an explicit license into his code which muddles the issue of how
and when it can be distributed, freely or otherwise. Existing diehard
sources are not directly incorporated into dieharder in source
form for that reason, to keep authorship and GPL licensing issues
clear.
Note that the same is not true about data. Several of the diehard
tests require that one use precomputed numbers as e.g. target mean,
sigma for some test statistic. Obviously in these cases we use the same
numbers as diehard so we get the same, or comparable, results. These
numbers were all developed with support from Federal grants and have all
been published in the literature, though, and should therefore be in the
public domain as far as reuse in a program is concerned.
Note also that most of the diehard tests are modified in
dieharder, usually in a way that should improve them. There are three
improvements that were basically always made if possible.
- The number of test sample p-value that contribute to the final
Kolmogorov-Smirnov test for the uniformity of the distribution of
p-values of the test statistic is a variable with default 100, which is
much larger than most diehard default values. This change alone
causes many generators that are asserted to "pass diehard" to in fact
fail -- any given test run generates a p-value that is acceptable, but
the distribution of p-values is not uniform.
- The number of actual samples within a test that contribute
to the single-run test statistic was made a variable when possible.
This was generally possible when the target was an easily computable
function of the number of samples, but a number of the tests have
pre-computed targets for specific numbers of samples and that number
cannot be varied because no general function is known relating the
target value to the number of samples.
- Many of diehard's tests investigated overlapping bit sequences as
being "independent identically distributed (iid) samples. This was
generally done because it used file-based input of random numbers and
the size of files that could reasonably be generated and tested by in
the mid-90's contained on the order of a million random deviates. The
restriction of testing to small, overlapping samples is neither
necessary nor desireable in modern testing -- numerical simulation can
easily consume ten to the eighteenth or more uniform deviates and the
integration of the test with the built-in generators in the GSL permits
"most" generators to be tested with essentially no limits on the number
that can be generated for testing purposes. Indeed, some tests are
likely to reveal the short period or limited number of returns by some
generators because they can consume far more numbers than are
available within the period, which was not so easy with diehard.
dieharder therefore permits overlapping sequences to be a
non-default option selected by the user wherever possible.
In a few cases other variations are possible for specific tests.
This should be noted in the built-in test documentation for that test
where appropriate.
Aside from these major differences, note that the algorithms were
independently written more or less from the test descriptions alone
(sometimes illuminated by a look at the code implementations, but only
to clear up just what was meant by the description). They may well do
things in a different (but equally valid) order or using different (but
ultimately equivalent) algorithms altogether and hence produce slightly
different (but equally valid) results even when run on the same data
with the same basic parameters. Then, there may be bugs in the
code, which might have the same general effect. Finally, it is always
possible that diehard implementations have bugs and can be in
error. Your Mileage May Vary. Be Warned.
About Dieharder
The primary point of dieharder (like diehard before it) is to make it
easy to time and test (pseudo)random number generators, both software
and hardware, for a variety of purposes in research and cryptography.
The tool is built entirely on top of the GSL's random number generator
interface and uses a variety of other GSL tools (e.g. sort, erfc,
incomplete gamma, distribution generators) in its operation.
Dieharder differs significantly from diehard in many ways. For
example, diehard uses file based sources of random numbers exclusively
and by default works with only roughly ten million random numbers in
such a file. However, modern random number generators in a typical
simulation application can easily need to generate 10^18 or more random
numbers, generated from hundreds, thousands, millions of different seeds
in independent (parallelized) simulation threads, as the application
runs over a period of months to years. Those applications can easily be
sensitive to rng weaknesses that might not be revealed by sequences as
short as 10^7 uints in length even with excellent and sensitive
tests. One of dieharder's primary design goals was to permit tests to
be run on very long sequences.
To facilitate this, dieharder prefers to test generators that
have been wrapped up in a GSL-compatible interface so that they can
return an unbounded stream of random numbers -- as many as any
single test or the entire suite of tests might require. Numerous
examples are provided of how one can wrap one's own random number
generator so that it is can be called via the GSL interface.
Dieharder also supports file-based input three distinct ways. The
simplest is to use the (raw binary) stdin interface to pipe a bit stream
from any rng, hardware or software, through dieharder for
testing. In addition, one can use "direct" file input of either raw
binary or ascii formatted (usually uint) random numbers. The man page
contains examples of how to do all three of these things, and dieharder
itself can generate sample files to use as templates for the appropriate
formatting.
Note Well! Dieharder can consume a lot of random
numbers in the course of running all the tests! To facilitate this,
dieharder should (as of 2.27.11 and beyond) support large file (> 2GB)
input, although this is still experimental. Large files are clunky and
relatively slow, and the LFS (large file system) in linux/gcc is still
relatively new and may have portability issues if dieharder is built
with a non-gcc compiler. It is therefore strongly recommended
that both hardware and software generators be tested by being wrapped
within the GSL interface by emulating the source code examples so that
they can return an essentially unbounded rng stream.
Dieharder also goes beyond diehard in that it is deliberately
extensible. In addition to implementing all of the diehard tests it is
expected that dieharder will eventually contain all of the NIST STS and
a variety of tests contributed by users, invented by the dieharder
authors, or implemented from descriptions in the literature. As a true
open source project, dieharder can eventually contain all rng
tests that prove useful in one place with a consistent interface that
permits one to apply those tests to many generators for purposes of
comparison and validation of the tests themselves as much as the
generators. In other words, it is intended to be a vehicle for the
computer science of random number generation testing as well as a
practical test harness for random number generators.
To expand on this, the development of dieharder was motivated by the
following, in rough order of importance:
- To provide a readily available, rpm- or apt- installable
toolset so that "consumers" of random numbers (who typically use
large numbers of random numbers in e.g. simulation or other
research) can test the generator(s) they are using to verify their
quality or lack thereof.
- To provide a very simple user interface for that toolset for
random number consumers. At the moment, this means a command line
interface (CLI) that can easily be embedded in scripts or run repeatedly
with different parameters. A graphical user interface (GUI) is on the
list of things to do, although it adds little to the practical utility
of the tool.
- To provide lots of lots of knobs and dials and low level
control for statistical researchers that want to study particular
generators with particular tests in more detail. This includes full
access to test sources -- no parameter or aspect of the test algorithms
is "hidden" and needs to be taken on faith.
- To have the entire test code and documentation be fully Gnu
Public Licensed and hence openly available for adaptation, testing,
comment, and modification so that the testing suite itself becomes (over
time) reliable.
- To be extensible. Dieharder provides a fairly simple
API for adding new tests with a common set of low-level testing
tools and a common test structure that leads (one hopes) to an
unambiguous decision to accept or reject any given random number
generator on the basis of any given test for a suitable choice of
controllable test parameters.
- To allow all researchers to be able to directly test, in
particular, the random number generators interfaced with the GSL.
This is a deliberate design decision justified by the extremely large
and growing number of random number generators prebuilt into the GSL and
the ease of adding new ones (either contributing them to the project or
for the sole purpose of local testing).
- To allow researchers that use e.g. distributions directly
generated by GSL routines (which can in principle fail two ways, due to
the failure of the underlying random number generator or due to a
failure of the generating algorithm) to be able to directly validate
their particular generator/distribution combination at the cost of
implementing a suitable test in dieharder (using the code of existing
tests as a template).
- To allow dieharder to be directly interfaced with other tools and
interfaces. For example, dieharder can be directly called within
the R interface, permitting its rngs to be tested and R-based graphics
and tools to be used to analyze test results. Note well, however, that
because it uses the GSL (which is GPL viral) dieharder itself is GPL
viral and cannot be embedded directly into a non-GPL tool such as
matlab.
Although this tool is being developed on Linux/GCC-based platforms,
it should port with no particular difficulty to other Unix-like
environments (at least ones that support the GSL), with the further
warning that certain features (in particular large file support) may
require tweaking and that the dieharder authors may not be able to help
you perform that tweaking.
Essential Usage Synopsis
If you compile the test or install the provided binary rpm's and run
it as:
dieharder -a
it should run -a(ll) tests on the default GSL generator.
dieharder -a -T 0
should provide a look at the new "table" output mode. This is
a concise and informative report that can be selected -- and customized
via the flag (argument) -- to "just test" any given rng.
Choose alternative tests with -g number where:
dieharder -g -1
will list all possible numbers known to the current snapshot of the
dieharder.
dieharder -l
should list all the tests implemented in the current snapshop of
DieHarder. Finally, the venerable and time tested:
dieharder -h
provides a Usage synopsis (which can quite long) and
man dieharder
is the (installed) man page, which may or many not be completely up
to date as the suite is under active development. For developers,
additional documentation is available in the toplevel directory or doc
subdirectory of the source tree. Eventually, a complete DieHard manual
in printable PDF form will be available both on this website and in
/usr/share/doc/dieharder-*/.
List of Random Number Generators and Tests
Available
List of GSL and user-defined random number generators that can be
tested by dieharder:
==========================================================================|
dieharder version 2.27.11 Copyright 2003 Robert G. Brown |
==========================================================================|
Id Test Name | Id Test Name | Id Test Name |
==========================================================================|
0 borosh13 | 1 cmrg | 2 coveyou |
3 fishman18 | 4 fishman20 | 5 fishman2x |
6 gfsr4 | 7 knuthran | 8 knuthran2 |
9 knuthran2002 | 10 lecuyer21 | 11 minstd |
12 mrg | 13 mt19937 | 14 mt19937_1999 |
15 mt19937_1998 | 16 r250 | 17 ran0 |
18 ran1 | 19 ran2 | 20 ran3 |
21 rand | 22 rand48 | 23 random128-bsd |
24 random128-glibc2 | 25 random128-libc5 | 26 random256-bsd |
27 random256-glibc2 | 28 random256-libc5 | 29 random32-bsd |
30 random32-glibc2 | 31 random32-libc5 | 32 random64-bsd |
33 random64-glibc2 | 34 random64-libc5 | 35 random8-bsd |
36 random8-glibc2 | 37 random8-libc5 | 38 random-bsd |
39 random-glibc2 | 40 random-libc5 | 41 randu |
42 ranf | 43 ranlux | 44 ranlux389 |
45 ranlxd1 | 46 ranlxd2 | 47 ranlxs0 |
48 ranlxs1 | 49 ranlxs2 | 50 ranmar |
51 slatec | 52 taus | 53 taus2 |
54 taus113 | 55 transputer | 56 tt800 |
57 uni | 58 uni32 | 59 vax |
60 waterman14 | 61 zuf | |
==========================================================================|
200 stdin_input_raw |201 file_input_raw |202 file_input |
203 ca |204 uvag | |
==========================================================================|
400 R_wichmann_hill |401 R_marsaglia_multic. |402 R_super_duper |
403 R_mersenne_twister |404 R_knuth_taocp |405 R_knuth_taocp2 |
==========================================================================|
500 /dev/random |501 /dev/urandom | |
==========================================================================|
600 empty | | |
==========================================================================|
Note that the stdin_input_raw interface (-g 200) is a "universal"
interface. Any generator that can produce a (continuous) stream of
presumably random bits can be tested with dieharder. The easiest way to
demonstrate this is by running:
dieharder -S 1 -B -o -t 1000000000 | dieharder -g 75 -r 3 -n 2
where the first invocation of dieharder generates a stream of binary
bits drawn from the default generator with seed 1 and the second reads
those bits from stdin and tests them with the rgb bitdist test on two
bit sequences. Compare the output to:
dieharder -S 1 -r 3 -n 2
which runs the same test on the same generator with the same seed
internally. They should be the same.
Similarly the file_input generator requires a file of "cooked" (ascii
readable) random numbers, one per line, with a header that describes the
format to dieharder. Note Well! File or stream input rands (with any
of the three methods for input) are delivered to the tests on demand,
but if the test needs more than are available dieharder either fails (in
the case of a stdin stream) or rewinds the file and cycles through it
again, and again, and again as needed. Obviously this significantly
reduces the sample space and can lead to completely incorrect results
for the p-value histograms unless there are enough rands to run EACH
test without repetition (it is harmless to reuse the sequence for
different tests). Let the user beware!
List of the CURRENT fully implemented tests (as of the 08/18/08
snapshot):
========================================================================
Dieharder version 2.27.11 Copyright 2003 Robert G. Brown
========================================================================
Bug/Development Key (use at your own risk):
rft = ready for testing (the test may - or may not - work correctly)
sus = suspect test (consistently fails "good" generators)
dev = development only (may crash, isn't finished)
Only fully reliable tests will be run when diehard -a is invoked,
with "reasonable" default parameters.
PLEASE READ THE MAN PAGE to learn about p-values and the null hypothesis
in testing BEFORE using this testing tool!
Diehard Tests
-d 1 Diehard Birthdays test
[sus: -d 2 Diehard Overlapping Permutations test]
-d 3 Diehard 32x32 Binary Rank test
-d 4 Diehard 6x8 Binary Rank test
-d 5 Diehard Bitstream test
-d 6 Diehard OPSO test
-d 7 Diehard OQSO test
-d 8 Diehard DNA test
-d 9 Diehard Count the 1s (stream) test
-d 10 Diehard Count the 1s (byte) test
-d 11 Diehard Parking Lot test
-d 12 Diehard Minimum Distance (2D Spheres) test
-d 13 Diehard 3D Spheres (minimum distance) test
-d 14 Diehard Squeeze test
[sus: -d 15 Diehard Sums test]
-d 16 Diehard Runs test
-d 17 Diehard Craps test
-d 18 Marsaglia and Tsang GCD test
[dev: -d 19 Marsaglia and Tsang Gorilla test]
RGB Tests
-r 1 RGB Timing test (times the rng)
-r 2 RGB Bit Persistence test
-r 3 RGB Ntuple Bit Distribution test suite (-n ntuple)
-r 4 RGB Generalized Minimum Distance test
-r 5 RGB Permutations test
[rft: -r 6 RGB Lagged Sums test (runs over all lags out to 32)]
(do not use the following as tests yet)
[dev: -r 7 RGB L-M-Ntuple Distribution test suite (quite long)]
[dev: -r 8 RGB Overlapping Permutations test]
Statistical Test Suite (STS)
-s 1 STS Monobit test
-s 2 STS Runs test
-s 3 STS Serial test [for all bit combinations out to 16 bits]
User Tests
-u 1 User Template [Lagged Sum Test, one bit pattern]
Full descriptions of the tests are available (as you can see) from
within the tool and source documentation. All tests have been
independently rewritten from their description, and may be functionally
modified or extended relative to the original source code published in
the originating suite(s). The author (rgb) bears complete
responsibility for these changes, subject to the standard GPL code
disclaimer that the code has no warranty (in essence, yes it's my
fault if they don't work but using the tool is at your own risk and you
can fix it if it bothers you and/or I don't fix it first).
Development Notes
Be warned that very few rngs come through dieharder
flawlessly. The best ones currently available appear to be a bit too
uniform (not quite "random enough") for at least certain sampling
displacements. This appears as having a p-value that is too
high, not too low. Also, as is documented in the list of tests,
there are both development tests and tests that are "suspect" in the
full suite of available tests.
Don't use the development tests -- they will often crash dieharder
and don't yet work anyway (and are there to facilitate development so I
can release snapshots with future tests "in place" but not necessarily
yet functional). The suspect tests are tests that consistently fail
all rngs tested, including hardware tests. Theoretically this
could just mean that these are really good tests and very sensitive to
non-randomness. Practically it could equally well mean that the
software implementation of the test contains a subtle bug or flawed
initialization data. Use the suspect tests with a grain of salt. For
sure don't assume that a generator that fails those tests is "bad" in
any way. Be sure to let me know if you encounter a generator that
consistently passes the suspect tests (and does optimally well
on the rest of them) because I am eagerly searching for a "gold
standard" rng to aid me in testing the tests.
Note that all tests are encapsulated to be as standard as possible in
the way they compute p-values from single statistics or from vectors of
statistics, and in the way they implement the underlying KS and chisq
tests. Diehard is now complete in dieharder, and attention will turn
towards implementing more selected tests from the STS. I also have my
eye on the as-yet unimplemented tests from Knuth's The Art of
Programming, lagged correlation, and more bitwise tests that have
occurred to me as I ported diehard (which does some things somewhat
backwards or indirectly, IMO).
Thoughts for the Future/Wish List/To Do
- Tests of GSL random distribution (as opposed to number) generators,
as indirect tests of the generators that feed them.
- Anderson-Darling KS test. Kuiper works, but AD is more common. It
therefore should be a user choice, or should even do both. Why not?
The computation for either is trivial compared to the effort required to
run the tests in the first place.
- New tests, compressions of existing ones that are "different" but
really the same. Hyperplane tests. Spectral tests. Especially the bit
distribution test with user defineable lag or lag pattern (to look for
subtle, long period correlations in the bit patterns produced).
- Collaborators. Co-developers welcome, as are contributions or
suggestions from users. Note well that users have already provided
critical help debugging the early code! Part of the point of a GPL
project is that you are NOT at the mercy of a black box piece of code.
If you are using dieharder and are moderately expert at statistics and
random numbers and observe something odd, please help out!
Conclusions
I hope that even during its development, you find dieharder useful.
Remember, it is fully open source, so you can freely modify and
redistribute the code according to the rules laid out in the Gnu Public
License (version 2b), which might cost you as much as a beer one day.
In particular, you can easily add random number generators using the
provided examples as templates, or you can add tests of your own by
copying the general layout of the existing tests (working toward a
p-value per run, cumulating (say) 100 runs, and turning the resulting KS
test into an overall p-value). Best of all, you can look inside the
code and see how the tests work, which may inspire you to create a new
test -- or a new generator that can pass a test.
To conclude, if you have any interest in participating in the
development of dieharder, be sure to let me know, especially if you have
decent C coding skills (including familiarity with Subversion and the
GSL) and a basic knowledge of statistics. I even have documents to help
with the latter, if you have the programming skills and want to LEARN
statistics. Bug reports or suggestions are also welcome.
Submit bug reports, etc. to
rgb at phy dot duke dot edu
|