Issue 7080005: Fix base::RandGenerator bug (it had non-uniform random distribution).

Jói

Ilya: Main reviewer. Brett: base/OWNERS approval. Cheers, Jói

9 years, 7 months ago (2011-05-27 18:03:16 UTC) #1

brettw

http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc File base/rand_util_unittest.cc (right): http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc#newcode65 base/rand_util_unittest.cc:65: TEST(RandUtilTest, RandGeneratorIsUniform) { I'm not super excited about this ...

9 years, 7 months ago (2011-05-27 18:13:21 UTC) #2

Jói

It takes 5 seconds to run if it never converges (on a not-so-fast Linux box). ...

9 years, 7 months ago (2011-05-27 18:17:13 UTC) #3

brettw

I'm interested in Pawel's opinion on whether this test is useful. http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc File base/rand_util_unittest.cc (right): ...

9 years, 7 months ago (2011-05-27 18:22:20 UTC) #4

Jói

Well, the test is useful in that it shows the previous implementation did not generate ...

9 years, 7 months ago (2011-05-27 18:24:11 UTC) #5

Jói

Just as a further data point, running the test with --gtest_repeat=10000 took 2m23s wall-clock time, ...

9 years, 7 months ago (2011-05-27 18:42:00 UTC) #6

Just as a further data point, running the test with --gtest_repeat=10000
took 2m23s wall-clock time, for an average run-time of 14 ms per cycle of
the test, including GUnit overhead.  There were no failures.

I suspect given the current parameters for the maximum number of tries and
the size of the expected interval (+/- 1% from the expected value), the test
is already less likely to fail (unless the implementation gets broken) than
any random test you take, because of gamma rays or memory glitches on your
machine.  I'd be happy to increase the timeout or widen the expected
interval if you think it's important, but the average of a bunch of samples
of a uniform random distribution converges very, very quickly, see e.g.
http://en.wikipedia.org/wiki/Expected_value

I'd hate to remove the test, since it verifies one of the most important
pieces of behavior that a "number from a random interval" type of generator
can have, i.e. that it is uniform.  If it is ever flaky, I will of course
widen the interval or do something else about it, but until it's failed at
least once I don't think we should do anything.

Cheers,
Jói

2011/5/27 Jói Sigurðsson <joi@chromium.org>

> Well, the test is useful in that it shows the previous implementation
> did not generate a uniform random distribution.  It generally passes
> in less than 40 ms.  You shouldn't expect it to take multiple seconds,
> unless the implementation that it's testing is actually broken.
>
> Cheers,
> Joi
>
>
> On Fri, May 27, 2011 at 2:22 PM,  <brettw@chromium.org> wrote:
> > I'm interested in Pawel's opinion on whether this test is useful.
> >
> >
> > http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc
> > File base/rand_util_unittest.cc (right):
> >
> >
>
http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc#newc...
> > base/rand_util_unittest.cc:96: if (count >= kReportEveryNAttempts &&
> > count % kReportEveryNAttempts == 0) {
> > I don't think tests should be printing stuff out. This typically just
> > confuses people when they see it in the logs.
> >
> > http://codereview.chromium.org/7080005/
> >
>

Ilya Sherman

+ Jim for comment on probabilistic tests, as I think he's contributed a few to ...

9 years, 7 months ago (2011-05-27 19:23:26 UTC) #7

Paweł Hajdan Jr.

Some comments, no need to wait for me. One note though: if more and more ...

9 years, 7 months ago (2011-05-27 19:28:11 UTC) #8

Some comments, no need to wait for me.

One note though: if more and more things start to depend on rand_util and
require various probabilistic/statistical properties, it may be worth it to do
some more in-depth evaluation of that code and fix all problems we can detect at
once.

http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc
File base/rand_util_unittest.cc (right):

http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc#newc...
base/rand_util_unittest.cc:65: TEST(RandUtilTest, RandGeneratorIsUniform) {
On 2011/05/27 18:13:21, brettw wrote:
> I'm not super excited about this test. It seems by definition flaky. You've
> tried to make it less flaky by running it a million times. How long does this
> take?

Right, everything based on randomness is quite hard to test. I think it's
important to keep this test fast, ideally no longer than few seconds, or even
less.

I wonder if it would be useful to do slightly different - rather than calling
RandGenerator many times, feed it different values of |max| and
|base::RandUint64()|, and see how it behaves. That should be deterministic, and
still test the right thing...

http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc#newc...
base/rand_util_unittest.cc:78: const uint64 kAllowedVariance = kExpectedAverage
/ 100L;  // 1% either way.
nit: 1% seems quite strict, how about something more like 5% ?

Initially rand_util was not intended for anything other than test code iirc, and
it has no guarantees about strength or quality anyway.

http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc#newc...
base/rand_util_unittest.cc:84: while (count < kMaxAttempts) {
nit: Why not a for loop then?

http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc#newc...
base/rand_util_unittest.cc:96: if (count >= kReportEveryNAttempts && count %
kReportEveryNAttempts == 0) {
On 2011/05/27 18:22:20, brettw wrote:
> I don't think tests should be printing stuff out. This typically just confuses
> people when they see it in the logs.

Yeah, and also it's better to use LOG or gtest macros instead of printf.

Jói

Thanks for the reviews guys. It seems several pieces of production code depend on RandGenerator ...

9 years, 7 months ago (2011-05-27 19:41:48 UTC) #9

Thanks for the reviews guys.

It seems several pieces of production code depend on RandGenerator and the
derived function RandInt:

http://codesearch.google.com/codesearch?as_q=base%3A%3ARandGenerator&btnG=Sea...

http://codesearch.google.com/codesearch?as_q=base%3A%3ARandInt&btnG=Search+Co...

One of them actually uses RandGenerator and duplicates the bug I am fixing; I
will send a separate review to that code's owner.

Cheers,
Jói

http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc
File base/rand_util_unittest.cc (right):

http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc#newc...
base/rand_util_unittest.cc:65: TEST(RandUtilTest, RandGeneratorIsUniform) {
On 2011/05/27 19:28:11, Paweł Hajdan Jr. wrote:
> On 2011/05/27 18:13:21, brettw wrote:
> > I'm not super excited about this test. It seems by definition flaky. You've
> > tried to make it less flaky by running it a million times. How long does
this
> > take?
> 
> Right, everything based on randomness is quite hard to test. I think it's
> important to keep this test fast, ideally no longer than few seconds, or even
> less.
> 
> I wonder if it would be useful to do slightly different - rather than calling
> RandGenerator many times, feed it different values of |max| and
> |base::RandUint64()|, and see how it behaves. That should be deterministic,
and
> still test the right thing...

That wouldn't verify that the random distribution is uniform. Asserting that it
converges to close to the expected average value is the only good way to do that
AFAIK.

http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc#newc...
base/rand_util_unittest.cc:78: const uint64 kAllowedVariance = kExpectedAverage
/ 100L;  // 1% either way.
On 2011/05/27 19:28:11, Paweł Hajdan Jr. wrote:
> nit: 1% seems quite strict, how about something more like 5% ?
> 
> Initially rand_util was not intended for anything other than test code iirc,
and
> it has no guarantees about strength or quality anyway.

The test already passes in an average of 14 ms wall-clock time on a somewhat
slow machine, and in about 12 thousand test runs I've never seen it need more
than 50K iterations (which takes a couple of hundred milliseconds or so on a
reasonable machine) to converge.  I've changed it to +/- 2% though, as the
failure mode that existed before was much more skewed than that, but I'd prefer
not to make the interval even wider in case future failure modes are more
subtle.

http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc#newc...
base/rand_util_unittest.cc:84: while (count < kMaxAttempts) {
On 2011/05/27 19:28:11, Paweł Hajdan Jr. wrote:
> nit: Why not a for loop then?

Because I want to test the value of count after the loop.

http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc#newc...
base/rand_util_unittest.cc:89: if (count > 1000) {
On 2011/05/27 19:23:26, Ilya Sherman wrote:
> nit: As long as you're making everything else a named constant, let's make
1000
> a named constant too.

Done.

http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc#newc...
base/rand_util_unittest.cc:96: if (count >= kReportEveryNAttempts && count %
kReportEveryNAttempts == 0) {
On 2011/05/27 18:22:20, brettw wrote:
> I don't think tests should be printing stuff out. This typically just confuses
> people when they see it in the logs.

Done.

http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc#newc...
base/rand_util_unittest.cc:96: if (count >= kReportEveryNAttempts && count %
kReportEveryNAttempts == 0) {
On 2011/05/27 19:23:26, Ilya Sherman wrote:
> On 2011/05/27 18:22:20, brettw wrote:
> > I don't think tests should be printing stuff out. This typically just
confuses
> > people when they see it in the logs.
> 
> Yes, please remove these printf()'s, or find a way to log them only on
failure.

Done.

http://codereview.chromium.org/7080005/diff/1/base/rand_util_unittest.cc#newc...
base/rand_util_unittest.cc:96: if (count >= kReportEveryNAttempts && count %
kReportEveryNAttempts == 0) {
On 2011/05/27 19:28:11, Paweł Hajdan Jr. wrote:
> On 2011/05/27 18:22:20, brettw wrote:
> > I don't think tests should be printing stuff out. This typically just
confuses
> > people when they see it in the logs.
> 
> Yeah, and also it's better to use LOG or gtest macros instead of printf.

Done.

jar (doing other things)

http://codereview.chromium.org/7080005/diff/4003/base/rand_util.cc File base/rand_util.cc (right): http://codereview.chromium.org/7080005/diff/4003/base/rand_util.cc#newcode59 base/rand_util.cc:59: } while (value > max_acceptable_value); Very cute.... but I ...

9 years, 7 months ago (2011-05-27 20:25:14 UTC) #11

Jói

http://codereview.chromium.org/7080005/diff/4003/base/rand_util.cc File base/rand_util.cc (right): http://codereview.chromium.org/7080005/diff/4003/base/rand_util.cc#newcode59 base/rand_util.cc:59: } while (value > max_acceptable_value); On 2011/05/27 20:25:14, jar ...

9 years, 7 months ago (2011-05-27 23:18:01 UTC) #12

jar (doing other things)

Note: I don't think I wrote the original code... and I'm not blocking the change ...

9 years, 7 months ago (2011-05-28 17:26:40 UTC) #13

Note: I don't think I wrote the original code... and I'm not blocking the change
or being defensive... but I am curious.

LGTM with the additional change in the test below. If you really want this class
of test of rand, please choose the optimal ratio (or tell me why your ratio is
optimal.. see comment in test).

If the bias of randomness is critical to our application, we may perchance want
to add some tests for the quality of the rand on the various platforms (e.g.,
check that at least each bit in a rand assumes both values sometimes... and
maybe that looking at only one bit at a time, that it is reasonably random).  

I wouldn't want to make this a doctoral study, or waste a ton of test cycles
doing this, but it might be worth having some tests of at least some simple
things. (example: with maximal modulus, for each single bit, it is possible to
get a zero and to get a one).

If you're going to add tests to watch out for this you may as well write a few
"fast" tests (since you suggest it will impact your use) and be done with it
while you're hacking in this area. (...and I can believe on some platforms this
will be broken big time, such as by building a wimpy rand from a clock),

http://codereview.chromium.org/7080005/diff/4003/base/rand_util.cc
File base/rand_util.cc (right):

http://codereview.chromium.org/7080005/diff/4003/base/rand_util.cc#newcode59
base/rand_util.cc:59: } while (value > max_acceptable_value);
OK... With that change I'm convinced it is "right," but when I've seen this code
historically in many codebases, (despite being aware of the bias), I never
thought it was significant (with uint64 max being a whopping 2^64).  The fact
that your first patch introduced this class of bias (due to a typo) and was
virtually undetectable partially substantiates my wonder.

For my edification, can you cite how this bias was burning you in your code? 

For example, when max is 3, the bias in randomness amounts to roughly 1 in 2^62.
 I strongly suspect the underlying rand to have biases of that order.   I'd
expect you'd need to perform a LOT of experiments before you could detect that
bias.  

You tested the other extreme, where max is very close to 2^64.  (see interesting
comment in the test).

When max is that large, each sample comes from a GIANT range.  As a result, any
small number of samples will come with a large bias because we don't come any
where near to getting enough samples to prove the range of answers are all even
possible :-/.

What is the example where this was harming you in a measurable way?

Again, the change is IMO "right."  rand is a real slow function usually, so I
don't like to see too many extra calls... but the likelihood of such extras are
small for most all users (who stay away from a giant max, that is not nearly a
power of 2).

On 2011/05/27 23:18:01, Jói wrote:
> On 2011/05/27 20:25:14, jar wrote:
> > Very cute.... but I think there is a typo, and this should read:
> > 
> > while(value >= max_acceptable_value)
> > 
> > Am I confused??
> > 
> > I think this currently gives a higher probability of returning 0 than any
> other
> > number in the range.
> 
> You're right, thanks for catching that. Fixed and uploaded a new version.

http://codereview.chromium.org/7080005/diff/4006/base/rand_util_unittest.cc
File base/rand_util_unittest.cc (right):

http://codereview.chromium.org/7080005/diff/4006/base/rand_util_unittest.cc#n...
base/rand_util_unittest.cc:73: // that is 2/3rds of the way to MAX_UINT64, in
which case the bottom
You got me to wondering about this ratio, and it wasn't clear that 2/3 was
optimally problematic.  I applied calculus (which I've only done about 4 times
in my life, after gettin' out'a school), and concluded that the maximum delta
between the broken-code-mean and the correct (range over 2 mean) is actually
achieved with a ratio R of:

3/4

(If I'm wrong, please correct me!  I'm notorious for clever ideas... and typos
in my arithmetic!).

As always, after grinding through the calculus, there is an easy way to see this
result (that the delta in the means is maximized with R=3/4)

We only consider R in the range of 1/2 to 1, so that we never have to consider
cases where the rand() values mapped more than twice into any return value.

The difference function is then a quadratic in R, where R is the ratio.  You can
pretty much (in retrospect) guess that since the calculation of the means is
done by summing all numbers in a range, and dividing by the count of numbers. 
We just have two different ranges that we need to sum (one after the modular
re-mapping).  Everything in that equation is based on R and (1-R), and the sum
of numbers from 1 to K is quadratic.

So you know it is *some* quadratic (i.e., a parabola)

If the ratio was 1/2, then there would be no difference in the mean (no bias
would be introduced).  Similarly, if the ratio were 1, there would be no bias. 
Hence we know that this parabola has a zero value at R=1/2, and R = 1.

Since parabolas are nice and symetric, the the zero-derivative (max) point is
half-way between those two equal values.

The precise equation (if I didn't screw up) for the difference in the means is:

MaxInt  * (R^2  -3R/2 + 1/2)

Again, feel free to correct my math.

Jói

Hi Jim, First off, this bug wasn't affecting me in any practical way (Ilya asked ...

9 years, 6 months ago (2011-05-30 15:31:42 UTC) #14

Hi Jim,

First off, this bug wasn't affecting me in any practical way (Ilya
asked the same question in the bug report).  It's just something I
noticed when reading the code and figured I'd fix since (a) the fix is
simple and (b) making it right avoids any possibility of extremely
subtle future bugs caused by the non-uniformity of
base::RandGenerator.

I agree that for small intervals, the non-uniformity would be hard to
detect, and for intervals that are powers of 2 there is no bias.  For
small intervals, the fix adds very close to zero overhead as in these
cases it's very unlikely that we call the underlying base::RandUint64
more than once (and if we do, it's because otherwise there could have
been bias).

I didn't really mean to spend much time in this area, thought it would
be a quick, end-of-Friday-afternoon kind of fix :)  But since you got
me interested, I wrote another quick test that verifies that our
random generator (RandUint64) generates randomness where we at least
see every bit take both possible values after looking at some number
of random values (in practice, on my Linux box, this only takes 10-20
iterations or less).  Please take a quick look at the new version I
uploaded, so I can get this checked in.

Your math, and the argument that the expected value would be a
hyperbola, looks right to me.  However, if I do the math another way,
just calculating the expected value, then for max = 2/3 * MAX_UINT64,
I get an expected value of max*(2*1/4 + 3/4)/3 = max*0.41667, and if I
use your max = 3/4 * MAX_UINT64, I get the same value, max*(2*1/6 +
3/6 + 5/6) / 4 = max*0.41667 (other values yield different results,
e.g. max=4/5*MAX_UINT64 gives an expected value of max*0.34).  Not
sure why the discrepancy since I agree with your argument.  Anyway, I
don't want to get too deep into it so I just changed the comment in
the file to not state that 2/3 is absolutely the worst case, just that
it is a degenerate case:

  // A degenerate case for such an implementation is e.g. a top of range
  // that is 2/3rds of the way to MAX_UINT64, in which case the bottom
  // half of the range would be twice as likely to occur as the top
  // half.

Hope that covers it.

Cheers,
Jói



On Sat, May 28, 2011 at 1:26 PM, <jar@chromium.org> wrote:
>
> Note: I don't think I wrote the original code... and I'm not blocking the
change
> or being defensive... but I am curious.
>
> LGTM with the additional change in the test below. If you really want this
class
> of test of rand, please choose the optimal ratio (or tell me why your ratio is
> optimal.. see comment in test).
>
> If the bias of randomness is critical to our application, we may perchance
want
> to add some tests for the quality of the rand on the various platforms (e.g.,
> check that at least each bit in a rand assumes both values sometimes... and
> maybe that looking at only one bit at a time, that it is reasonably random).
>
> I wouldn't want to make this a doctoral study, or waste a ton of test cycles
> doing this, but it might be worth having some tests of at least some simple
> things. (example: with maximal modulus, for each single bit, it is possible to
> get a zero and to get a one).
>
> If you're going to add tests to watch out for this you may as well write a few
> "fast" tests (since you suggest it will impact your use) and be done with it
> while you're hacking in this area. (...and I can believe on some platforms
this
> will be broken big time, such as by building a wimpy rand from a clock),
>
>
>
>
>
> http://codereview.chromium.org/7080005/diff/4003/base/rand_util.cc
> File base/rand_util.cc (right):
>
> http://codereview.chromium.org/7080005/diff/4003/base/rand_util.cc#newcode59
> base/rand_util.cc:59: } while (value > max_acceptable_value);
> OK... With that change I'm convinced it is "right," but when I've seen
> this code historically in many codebases, (despite being aware of the
> bias), I never thought it was significant (with uint64 max being a
> whopping 2^64).  The fact that your first patch introduced this class of
> bias (due to a typo) and was virtually undetectable partially
> substantiates my wonder.
>
> For my edification, can you cite how this bias was burning you in your
> code?
>
> For example, when max is 3, the bias in randomness amounts to roughly 1
> in 2^62.  I strongly suspect the underlying rand to have biases of that
> order.   I'd expect you'd need to perform a LOT of experiments before
> you could detect that bias.
>
> You tested the other extreme, where max is very close to 2^64.  (see
> interesting comment in the test).
>
> When max is that large, each sample comes from a GIANT range.  As a
> result, any small number of samples will come with a large bias because
> we don't come any where near to getting enough samples to prove the
> range of answers are all even possible :-/.
>
> What is the example where this was harming you in a measurable way?
>
> Again, the change is IMO "right."  rand is a real slow function usually,
> so I don't like to see too many extra calls... but the likelihood of
> such extras are small for most all users (who stay away from a giant
> max, that is not nearly a power of 2).
>
>
> On 2011/05/27 23:18:01, Jói wrote:
>>
>> On 2011/05/27 20:25:14, jar wrote:
>> > Very cute.... but I think there is a typo, and this should read:
>> >
>> > while(value >= max_acceptable_value)
>> >
>> > Am I confused??
>> >
>> > I think this currently gives a higher probability of returning 0
>
> than any
>>
>> other
>> > number in the range.
>
>> You're right, thanks for catching that. Fixed and uploaded a new
>
> version.
>
> http://codereview.chromium.org/7080005/diff/4006/base/rand_util_unittest.cc
> File base/rand_util_unittest.cc (right):
>
>
http://codereview.chromium.org/7080005/diff/4006/base/rand_util_unittest.cc#n...
> base/rand_util_unittest.cc:73: // that is 2/3rds of the way to
> MAX_UINT64, in which case the bottom
> You got me to wondering about this ratio, and it wasn't clear that 2/3
> was optimally problematic.  I applied calculus (which I've only done
> about 4 times in my life, after gettin' out'a school), and concluded
> that the maximum delta between the broken-code-mean and the correct
> (range over 2 mean) is actually achieved with a ratio R of:
>
> 3/4
>
> (If I'm wrong, please correct me!  I'm notorious for clever ideas... and
> typos in my arithmetic!).
>
> As always, after grinding through the calculus, there is an easy way to
> see this result (that the delta in the means is maximized with R=3/4)
>
> We only consider R in the range of 1/2 to 1, so that we never have to
> consider cases where the rand() values mapped more than twice into any
> return value.
>
> The difference function is then a quadratic in R, where R is the ratio.
> You can pretty much (in retrospect) guess that since the calculation of
> the means is done by summing all numbers in a range, and dividing by the
> count of numbers.  We just have two different ranges that we need to sum
> (one after the modular re-mapping).  Everything in that equation is
> based on R and (1-R), and the sum of numbers from 1 to K is quadratic.
>
> So you know it is *some* quadratic (i.e., a parabola)
>
> If the ratio was 1/2, then there would be no difference in the mean (no
> bias would be introduced).  Similarly, if the ratio were 1, there would
> be no bias.  Hence we know that this parabola has a zero value at R=1/2,
> and R = 1.
>
> Since parabolas are nice and symetric, the the zero-derivative (max)
> point is half-way between those two equal values.
>
> The precise equation (if I didn't screw up) for the difference in the
> means is:
>
> MaxInt  * (R^2  -3R/2 + 1/2)
>
> Again, feel free to correct my math.
>
> http://codereview.chromium.org/7080005/

jar (doing other things)

On Mon, May 30, 2011 at 8:31 AM, Jói Sigurðsson <joi@chromium.org> wrote: > Hi Jim, ...

9 years, 6 months ago (2011-05-30 16:40:41 UTC) #15

On Mon, May 30, 2011 at 8:31 AM, Jói Sigurðsson <joi@chromium.org> wrote:

> Hi Jim,
>
> First off, this bug wasn't affecting me in any practical way (Ilya
> asked the same question in the bug report).  It's just something I
> noticed when reading the code and figured I'd fix since (a) the fix is
> simple and (b) making it right avoids any possibility of extremely
> subtle future bugs caused by the non-uniformity of
> base::RandGenerator.
>

Agreed.  I think I use similar "biased" code in my field trials.  I can
probably switch over to using your central (fixed) code.


> I agree that for small intervals, the non-uniformity would be hard to
> detect, and for intervals that are powers of 2 there is no bias.  For
> small intervals, the fix adds very close to zero overhead as in these
> cases it's very unlikely that we call the underlying base::RandUint64
> more than once (and if we do, it's because otherwise there could have
> been bias).
>

Agreed.

>
> I didn't really mean to spend much time in this area, thought it would
> be a quick, end-of-Friday-afternoon kind of fix :)  But since you got
> me interested, I wrote another quick test that verifies that our
> random generator (RandUint64) generates randomness where we at least
> see every bit take both possible values after looking at some number
> of random values (in practice, on my Linux box, this only takes 10-20
> iterations or less).  Please take a quick look at the new version I
> uploaded, so I can get this checked in.
>
>
Agreed.


> Your math, and the argument that the expected value would be a
> hyperbola, looks right to me.  However, if I do the math another way,
> just calculating the expected value, then for max = 2/3 * MAX_UINT64,
> I get an expected value of max*(2*1/4 + 3/4)/3 = max*0.41667,


The key is that max is different in the two cases.  As a result,the actual
biased means are distinct in the two cases, and the unbiased means are
similarly distinct (both scaled by the fraction in each case).  That in turn
causes the measurable delta twixt the biased and unbiased means to be vary
depending on the fraction.  In fact, the measurable delta is also scaled by
the factor 3/4 or by 2/3, depending on the case.  Hence the measurable delta
is larger in the 3/4 case.


>  Anyway, I
> don't want to get too deep into it so I just changed the comment in
> the file to not state that 2/3 is absolutely the worst case, just that
> it is a degenerate case:
>
>  // A degenerate case for such an implementation is e.g. a top of range
>   // that is 2/3rds of the way to MAX_UINT64, in which case the bottom
>   // half of the range would be twice as likely to occur as the top
>   // half.
>

OK.  I too spent too much time... but it was a fun toy problem to see what
the optimal was.


>
> Hope that covers it.
>
> Cheers,
> Jói
>
>
SGTM.

Jim


>
>
> On Sat, May 28, 2011 at 1:26 PM, <jar@chromium.org> wrote:
> >
> > Note: I don't think I wrote the original code... and I'm not blocking the
> change
> > or being defensive... but I am curious.
> >
> > LGTM with the additional change in the test below. If you really want
> this class
> > of test of rand, please choose the optimal ratio (or tell me why your
> ratio is
> > optimal.. see comment in test).
> >
> > If the bias of randomness is critical to our application, we may
> perchance want
> > to add some tests for the quality of the rand on the various platforms
> (e.g.,
> > check that at least each bit in a rand assumes both values sometimes...
> and
> > maybe that looking at only one bit at a time, that it is reasonably
> random).
> >
> > I wouldn't want to make this a doctoral study, or waste a ton of test
> cycles
> > doing this, but it might be worth having some tests of at least some
> simple
> > things. (example: with maximal modulus, for each single bit, it is
> possible to
> > get a zero and to get a one).
> >
> > If you're going to add tests to watch out for this you may as well write
> a few
> > "fast" tests (since you suggest it will impact your use) and be done with
> it
> > while you're hacking in this area. (...and I can believe on some
> platforms this
> > will be broken big time, such as by building a wimpy rand from a clock),
> >
> >
> >
> >
> >
> > http://codereview.chromium.org/7080005/diff/4003/base/rand_util.cc
> > File base/rand_util.cc (right):
> >
> >
> http://codereview.chromium.org/7080005/diff/4003/base/rand_util.cc#newcode59
> > base/rand_util.cc:59: } while (value > max_acceptable_value);
> > OK... With that change I'm convinced it is "right," but when I've seen
> > this code historically in many codebases, (despite being aware of the
> > bias), I never thought it was significant (with uint64 max being a
> > whopping 2^64).  The fact that your first patch introduced this class of
> > bias (due to a typo) and was virtually undetectable partially
> > substantiates my wonder.
> >
> > For my edification, can you cite how this bias was burning you in your
> > code?
> >
> > For example, when max is 3, the bias in randomness amounts to roughly 1
> > in 2^62.  I strongly suspect the underlying rand to have biases of that
> > order.   I'd expect you'd need to perform a LOT of experiments before
> > you could detect that bias.
> >
> > You tested the other extreme, where max is very close to 2^64.  (see
> > interesting comment in the test).
> >
> > When max is that large, each sample comes from a GIANT range.  As a
> > result, any small number of samples will come with a large bias because
> > we don't come any where near to getting enough samples to prove the
> > range of answers are all even possible :-/.
> >
> > What is the example where this was harming you in a measurable way?
> >
> > Again, the change is IMO "right."  rand is a real slow function usually,
> > so I don't like to see too many extra calls... but the likelihood of
> > such extras are small for most all users (who stay away from a giant
> > max, that is not nearly a power of 2).
> >
> >
> > On 2011/05/27 23:18:01, Jói wrote:
> >>
> >> On 2011/05/27 20:25:14, jar wrote:
> >> > Very cute.... but I think there is a typo, and this should read:
> >> >
> >> > while(value >= max_acceptable_value)
> >> >
> >> > Am I confused??
> >> >
> >> > I think this currently gives a higher probability of returning 0
> >
> > than any
> >>
> >> other
> >> > number in the range.
> >
> >> You're right, thanks for catching that. Fixed and uploaded a new
> >
> > version.
> >
> >
> http://codereview.chromium.org/7080005/diff/4006/base/rand_util_unittest.cc
> > File base/rand_util_unittest.cc (right):
> >
> >
>
http://codereview.chromium.org/7080005/diff/4006/base/rand_util_unittest.cc#n...
> > base/rand_util_unittest.cc:73: // that is 2/3rds of the way to
> > MAX_UINT64, in which case the bottom
> > You got me to wondering about this ratio, and it wasn't clear that 2/3
> > was optimally problematic.  I applied calculus (which I've only done
> > about 4 times in my life, after gettin' out'a school), and concluded
> > that the maximum delta between the broken-code-mean and the correct
> > (range over 2 mean) is actually achieved with a ratio R of:
> >
> > 3/4
> >
> > (If I'm wrong, please correct me!  I'm notorious for clever ideas... and
> > typos in my arithmetic!).
> >
> > As always, after grinding through the calculus, there is an easy way to
> > see this result (that the delta in the means is maximized with R=3/4)
> >
> > We only consider R in the range of 1/2 to 1, so that we never have to
> > consider cases where the rand() values mapped more than twice into any
> > return value.
> >
> > The difference function is then a quadratic in R, where R is the ratio.
> > You can pretty much (in retrospect) guess that since the calculation of
> > the means is done by summing all numbers in a range, and dividing by the
> > count of numbers.  We just have two different ranges that we need to sum
> > (one after the modular re-mapping).  Everything in that equation is
> > based on R and (1-R), and the sum of numbers from 1 to K is quadratic.
> >
> > So you know it is *some* quadratic (i.e., a parabola)
> >
> > If the ratio was 1/2, then there would be no difference in the mean (no
> > bias would be introduced).  Similarly, if the ratio were 1, there would
> > be no bias.  Hence we know that this parabola has a zero value at R=1/2,
> > and R = 1.
> >
> > Since parabolas are nice and symetric, the the zero-derivative (max)
> > point is half-way between those two equal values.
> >
> > The precise equation (if I didn't screw up) for the difference in the
> > means is:
> >
> > MaxInt  * (R^2  -3R/2 + 1/2)
> >
> > Again, feel free to correct my math.
> >
> > http://codereview.chromium.org/7080005/
>

Jói

Thanks Jim. Let me know when I have your LGTM for the added test. > ...

9 years, 6 months ago (2011-05-30 16:47:03 UTC) #16

Thanks Jim.  Let me know when I have your LGTM for the added test.

> [...]  In fact, the measurable delta is also scaled by
> the factor 3/4 or by 2/3, depending on the case.  Hence the measurable delta
> is larger in the 3/4 case.

Right, you're absolutely right.  I've switched to 3/4ths in the test.

> OK.  I too spent too much time... but it was a fun toy problem to see what
> the optimal was.

:-)  It's a fun problem to think about.

Cheers,
Jói



On Mon, May 30, 2011 at 12:40 PM, Jim Roskind <jar@chromium.org> wrote:
> On Mon, May 30, 2011 at 8:31 AM, Jói Sigurðsson <joi@chromium.org> wrote:
>>
>> Hi Jim,
>>
>> First off, this bug wasn't affecting me in any practical way (Ilya
>> asked the same question in the bug report).  It's just something I
>> noticed when reading the code and figured I'd fix since (a) the fix is
>> simple and (b) making it right avoids any possibility of extremely
>> subtle future bugs caused by the non-uniformity of
>> base::RandGenerator.
>
> Agreed.  I think I use similar "biased" code in my field trials.  I can
> probably switch over to using your central (fixed) code.
>
>>
>> I agree that for small intervals, the non-uniformity would be hard to
>> detect, and for intervals that are powers of 2 there is no bias.  For
>> small intervals, the fix adds very close to zero overhead as in these
>> cases it's very unlikely that we call the underlying base::RandUint64
>> more than once (and if we do, it's because otherwise there could have
>> been bias).
>
> Agreed.
>>
>> I didn't really mean to spend much time in this area, thought it would
>> be a quick, end-of-Friday-afternoon kind of fix :)  But since you got
>> me interested, I wrote another quick test that verifies that our
>> random generator (RandUint64) generates randomness where we at least
>> see every bit take both possible values after looking at some number
>> of random values (in practice, on my Linux box, this only takes 10-20
>> iterations or less).  Please take a quick look at the new version I
>> uploaded, so I can get this checked in.
>>
>
> Agreed.
>
>>
>> Your math, and the argument that the expected value would be a
>> hyperbola, looks right to me.  However, if I do the math another way,
>> just calculating the expected value, then for max = 2/3 * MAX_UINT64,
>> I get an expected value of max*(2*1/4 + 3/4)/3 = max*0.41667,
>
> The key is that max is different in the two cases.  As a result,the actual
> biased means are distinct in the two cases, and the unbiased means are
> similarly distinct (both scaled by the fraction in each case).  That in turn
> causes the measurable delta twixt the biased and unbiased means to be vary
> depending on the fraction.  In fact, the measurable delta is also scaled by
> the factor 3/4 or by 2/3, depending on the case.  Hence the measurable delta
> is larger in the 3/4 case.
>
>>
>>  Anyway, I
>> don't want to get too deep into it so I just changed the comment in
>> the file to not state that 2/3 is absolutely the worst case, just that
>> it is a degenerate case:
>>
>>  // A degenerate case for such an implementation is e.g. a top of range
>>  // that is 2/3rds of the way to MAX_UINT64, in which case the bottom
>>  // half of the range would be twice as likely to occur as the top
>>  // half.
>
> OK.  I too spent too much time... but it was a fun toy problem to see what
> the optimal was.
>
>>
>> Hope that covers it.
>>
>> Cheers,
>> Jói
>>
>
> SGTM.
> Jim
>
>>
>> On Sat, May 28, 2011 at 1:26 PM, <jar@chromium.org> wrote:
>> >
>> > Note: I don't think I wrote the original code... and I'm not blocking
>> > the change
>> > or being defensive... but I am curious.
>> >
>> > LGTM with the additional change in the test below. If you really want
>> > this class
>> > of test of rand, please choose the optimal ratio (or tell me why your
>> > ratio is
>> > optimal.. see comment in test).
>> >
>> > If the bias of randomness is critical to our application, we may
>> > perchance want
>> > to add some tests for the quality of the rand on the various platforms
>> > (e.g.,
>> > check that at least each bit in a rand assumes both values sometimes...
>> > and
>> > maybe that looking at only one bit at a time, that it is reasonably
>> > random).
>> >
>> > I wouldn't want to make this a doctoral study, or waste a ton of test
>> > cycles
>> > doing this, but it might be worth having some tests of at least some
>> > simple
>> > things. (example: with maximal modulus, for each single bit, it is
>> > possible to
>> > get a zero and to get a one).
>> >
>> > If you're going to add tests to watch out for this you may as well write
>> > a few
>> > "fast" tests (since you suggest it will impact your use) and be done
>> > with it
>> > while you're hacking in this area. (...and I can believe on some
>> > platforms this
>> > will be broken big time, such as by building a wimpy rand from a clock),
>> >
>> >
>> >
>> >
>> >
>> > http://codereview.chromium.org/7080005/diff/4003/base/rand_util.cc
>> > File base/rand_util.cc (right):
>> >
>> >
>> >
http://codereview.chromium.org/7080005/diff/4003/base/rand_util.cc#newcode59
>> > base/rand_util.cc:59: } while (value > max_acceptable_value);
>> > OK... With that change I'm convinced it is "right," but when I've seen
>> > this code historically in many codebases, (despite being aware of the
>> > bias), I never thought it was significant (with uint64 max being a
>> > whopping 2^64).  The fact that your first patch introduced this class of
>> > bias (due to a typo) and was virtually undetectable partially
>> > substantiates my wonder.
>> >
>> > For my edification, can you cite how this bias was burning you in your
>> > code?
>> >
>> > For example, when max is 3, the bias in randomness amounts to roughly 1
>> > in 2^62.  I strongly suspect the underlying rand to have biases of that
>> > order.   I'd expect you'd need to perform a LOT of experiments before
>> > you could detect that bias.
>> >
>> > You tested the other extreme, where max is very close to 2^64.  (see
>> > interesting comment in the test).
>> >
>> > When max is that large, each sample comes from a GIANT range.  As a
>> > result, any small number of samples will come with a large bias because
>> > we don't come any where near to getting enough samples to prove the
>> > range of answers are all even possible :-/.
>> >
>> > What is the example where this was harming you in a measurable way?
>> >
>> > Again, the change is IMO "right."  rand is a real slow function usually,
>> > so I don't like to see too many extra calls... but the likelihood of
>> > such extras are small for most all users (who stay away from a giant
>> > max, that is not nearly a power of 2).
>> >
>> >
>> > On 2011/05/27 23:18:01, Jói wrote:
>> >>
>> >> On 2011/05/27 20:25:14, jar wrote:
>> >> > Very cute.... but I think there is a typo, and this should read:
>> >> >
>> >> > while(value >= max_acceptable_value)
>> >> >
>> >> > Am I confused??
>> >> >
>> >> > I think this currently gives a higher probability of returning 0
>> >
>> > than any
>> >>
>> >> other
>> >> > number in the range.
>> >
>> >> You're right, thanks for catching that. Fixed and uploaded a new
>> >
>> > version.
>> >
>> >
>> > http://codereview.chromium.org/7080005/diff/4006/base/rand_util_unittest.cc
>> > File base/rand_util_unittest.cc (right):
>> >
>> >
>> >
http://codereview.chromium.org/7080005/diff/4006/base/rand_util_unittest.cc#n...
>> > base/rand_util_unittest.cc:73: // that is 2/3rds of the way to
>> > MAX_UINT64, in which case the bottom
>> > You got me to wondering about this ratio, and it wasn't clear that 2/3
>> > was optimally problematic.  I applied calculus (which I've only done
>> > about 4 times in my life, after gettin' out'a school), and concluded
>> > that the maximum delta between the broken-code-mean and the correct
>> > (range over 2 mean) is actually achieved with a ratio R of:
>> >
>> > 3/4
>> >
>> > (If I'm wrong, please correct me!  I'm notorious for clever ideas... and
>> > typos in my arithmetic!).
>> >
>> > As always, after grinding through the calculus, there is an easy way to
>> > see this result (that the delta in the means is maximized with R=3/4)
>> >
>> > We only consider R in the range of 1/2 to 1, so that we never have to
>> > consider cases where the rand() values mapped more than twice into any
>> > return value.
>> >
>> > The difference function is then a quadratic in R, where R is the ratio.
>> > You can pretty much (in retrospect) guess that since the calculation of
>> > the means is done by summing all numbers in a range, and dividing by the
>> > count of numbers.  We just have two different ranges that we need to sum
>> > (one after the modular re-mapping).  Everything in that equation is
>> > based on R and (1-R), and the sum of numbers from 1 to K is quadratic.
>> >
>> > So you know it is *some* quadratic (i.e., a parabola)
>> >
>> > If the ratio was 1/2, then there would be no difference in the mean (no
>> > bias would be introduced).  Similarly, if the ratio were 1, there would
>> > be no bias.  Hence we know that this parabola has a zero value at R=1/2,
>> > and R = 1.
>> >
>> > Since parabolas are nice and symetric, the the zero-derivative (max)
>> > point is half-way between those two equal values.
>> >
>> > The precise equation (if I didn't screw up) for the difference in the
>> > means is:
>> >
>> > MaxInt  * (R^2  -3R/2 + 1/2)
>> >
>> > Again, feel free to correct my math.
>> >
>> > http://codereview.chromium.org/7080005/
>
>

jar (doing other things)

Thanks for adding this as well. See comment. If you really like the current form ...

9 years, 6 months ago (2011-05-30 17:02:07 UTC) #17

Jói

Heh, no, the bitwise approach is much nicer, I've updated to that. I blame not ...

9 years, 6 months ago (2011-05-30 17:16:38 UTC) #18

Heh, no, the bitwise approach is much nicer, I've updated to that.  I
blame not having had my second coffee of the day for writing it in
that weird way.

I also set the maximum iterations to 1000.  The test already completed
in several seconds with the other maximum, but as per your logic there
is only a 1 in ~10^301 chance that the test won't complete within 1000
iterations assuming the random generator gives each bit a close to
equal chance of either value.

Cheers,
Jói


On Mon, May 30, 2011 at 1:02 PM,  <jar@chromium.org> wrote:
> Thanks for adding this as well.  See comment.
>
> If you really like the current form of the test (rather than the bitwise
> suggestion), then LGTM with a smaller limit on the loop.
>
>
> http://codereview.chromium.org/7080005/diff/8/base/rand_util_unittest.cc
> File base/rand_util_unittest.cc (right):
>
>
http://codereview.chromium.org/7080005/diff/8/base/rand_util_unittest.cc#newc...
> base/rand_util_unittest.cc:114: for (size_t i = 0; i < 1000000; ++i) {
> Each bit *should* have a 50-50 probabilty.  Hence over 100 cycles
> without getting a bit to toggle would be ridiculously improbable, and
> probably 1000 would be wildly conservative.
>
> Then again... this test should never make it even near to that bound
> unless it is broken.  The advantage of ending the test on this boundary,
> is we avoid running so long that the test simply "doesn't finish" and we
> have a less clear failure.
>
>
http://codereview.chromium.org/7080005/diff/8/base/rand_util_unittest.cc#newc...
> base/rand_util_unittest.cc:115: uint64 value = base::RandUint64();
> FYI: You could have used bit-wise AND and bitwise OR to see how long it
> took to get a 0 in each position, and a 1 in each position, rather than
> the vector manipulation:
> uint64 found_ones = 0x0;  // bit of 1 shows we found a 1.
> uint64 found_zeros = ~found_ones;  // bit 0 means found a 0.
> while (0 != found_zeros && ~found_ones != 0) {
>  int64 value = base::RandInt64();
>  found_ones |= value;
>  found_zeros &= value;
> }
>
> http://codereview.chromium.org/7080005/
>

LGTM

Issue 7080005: Fix base::RandGenerator bug (it had non-uniform random distribution). (Closed)

Description

Patch Set 1 #

Patch Set 2 : Respond to review comments. #

Patch Set 3 : Fix issue pointed out by jar@ #

Patch Set 4 : Add "all bit values" test as suggested by jar@. #

Patch Set 5 : Switch to 3/4ths R for max. #

Patch Set 6 : Switch to bitwise operations. Max to 1000. #

Messages