In the documentation of generators with weak scramblers such as
xoshiro256+
it is mentioned
that the lowest bits will fail linearity tests, even if the whole
generator does not fail BigCrush. This page explains in detail what this means to
non-experts.
First of all, note these failures are not specific features of xoroshiro128+
or any
other generator I designed: all generators with a linear engine and a weak scrambler
have the same problem: for example, Ranq1
from the third edition of “Numerical Recipes”,
XSadd, and so on.
The output of a linear generator satisfies some linear relations between the bits. Moreover, the linear complexity of its bits (i.e., the lowest degree of a linear-feedback shift register representing a bit) will be quite low. The result is that all linear generators (Mersenne Twister, WELL, etc.) fail two basic linearity tests which were designed to “catch” them: the binary-rank test and the linear-complexity test (these are called MatrixRank and LinearComp in TestU01; the second one is actually a combination of three tests—see the user guide). Researchers working on linear generators consider these failures irrelevant to applications, and this explains why, for example, the SFMT is the stock PRNG of Python, C, C++, etc. in spite of failing such tests. We consider them of some importance, and that's why we design scramblers. If you want to understand the influence of linearity in applications, I provide an insightful example.
When you scramble a linear generator using a sum or a multiplication (which we call “weak scramblers”), its lowest bit (two bits at least, in case of a multiplication) has the same linear complexity of the base generator, and then complexity grows as you move towards the high bits (we computed a precise estimate of the linear complexity of the lowest bits). The complexity grows so quickly that often the influence of the lowest bits is not sufficient to trigger failure in linearity tests: this is what happens, for instance, with our 64-bit generators, for which one submits the two 32-bit halves of the output in sequence to TestU01.
However, as noted in the documentation bits of low degree will fail linearity tests when tested in isolation (or almost isolation). There are a few ways to obtain this result.
- If you are proficient in TestU01, you can easily run those tests tuning the parameters (usually, r and s) so they examine the bit of interest in isolation.
- The linear-complexity test (LinearComp) in BigCrush is applied in isolation to the highest bit and to the third lowest bit (you can see this from the parameters of the test suite contained in the user guide). Thus, to test in isolation, say, the lowest bit, it is sufficient to pass it to TestU01 continuously as the highest bit. There are many ways to do this: you can rotate right by one position the 64-bit output, and test just the upper 32 bits; or reverse the output (inverse the bit direction) and test just the upper 32 bits; or take the lower 32 bits and reverse them. All these transformations will result in failures of the LinearComp test (for weak scramblers). Rotating and taking the upper bits is possibly the easiest approach, as you can test easily also the second, third lowest bit, etc.
- The binary-rank test (MatrixRank) in BigCrush has several instances, but three of them (out of six) are applied to the 4, 5, and 30 highest bits, so you can use the same trick as before. The various methods however are technically no longer equivalent, as reversing will put in the highest position the bits of lower complexity.
- For the binary-rank test you have also a lazy man's option: PractRand will apply directly
the binary-rank test for you on the lowest bit only (run it with options
-tf 2 -te 1
). It will fail immediately on all generators with low-complexity lowest bits. If you wait enough, you will see failures also on the whole output, as PractRand will run the test with progressively larger matrices, and at some point the influence of the lowest bits will be sufficient to bias the rank of the overall output. This is why, for example, the Tiny Mersenne Twister or our weakly scrambled generators pass BigCrush but fail the binary-rank test in PractRand.
Of course you can do the opposite: extract a subset of 32 bits not containing the lowest bits, and test it (possibly reversed). It will of course pass the linearity tests of BigCrush, PractRand, etc.
The 32-bit generators with weak scramblers (e.g.,
xoshiro128+
) do not have
enough high-complexity bits to “hide” the low linear
complexity of the lowest bits (as there is no interleaving); hence the
failures in MatrixRank and LinearComp when you just reverse the output.
As in the 64-bit case, a subset of bits not containing the
lowest ones will pass all linearity tests.
Remember, however, that simple linear generators with sparse transition matrices suffer also from Hamming-weight dependencies.