Wikipedia gives the formula *n(p;H) â‰ˆ sqrt(2 H ln(1/(1-p))*. If you directly
transcribe that expression into Python:

def n_naive(p,H): return max(2,sqrt(2 * H * log(1/(1-p))))and try to reproduce the leftmost columns of that table:

def dotable(f): bb = (16, 32, 64, 128, 256, 384, 512) pp = (1e-18, 1e-15, 1e-12, 1e-9, 1e-6, .001) pt = "1e-18 1e-15 1e-12 1e-9 1e-6 .1%".split() print " " + " ".join("%7s" % p for p in pt) for b in bb: print "%3d" % b, for p in pp: print "%7.2g" % f(p,2**b), print dotable(n_naive)you'll be disappointed with the results: the left-hand column will be all "2"s, and the second column will have several values that don't even share one significant figure with the true value.

The reason is that the subexpression `log(1/(1-p))` is computing
a value very close to 1, then taking its log. For very small values of
`p`, `1-p` isn't even different from `p`, leading to
the computation `log(1) == 0`!

This is a perfect opportunity to use `log1p`. First, use the
identity *ln 1/x = -ln x* to obtain the expression
`-log(1-p)` but `-log(1-p) = -log1p(-p)`. This leads to
the second implementation of the function:

def n_clever(p,H): return max(2,sqrt(2 * H * -log1p(-p))) dotable(n_clever)Now, the values shown will match Wikipedia's table.

**Files currently attached to this page:**

nrc.py | 591 bytes |

Entry first conceived on 29 May 2012, 13:03 UTC, last modified on 16 October 2013, 14:35 UTC