pqRand

Quantile sampling

Random number generators are the backbone of Monte Carlo methods, and have become increasingly popular in the last few decades due to their ability to handle nonlinear systems. Of course, the accuracy of the Monte Carlo result depends on the quality of the random sample.

When I was playing around with particle physics simulations and floating point stability, I found a flaw in "inverse transform sampling", a common method for drawing random numbers. A rather formal definition of this technique can be found on Wikipedia, but Amit Sharma has created a more intuitive example on Quora. Since the quantile function Q (a.k.a. the inverse CDF) is the key function behind this technique, I will hereafter refer to inverse transform sampling as "quantile sampling".

My most recent paper discusses in great detail the problem with the standard implementation of quantile sampling (a problem found in both GNU's C++ std::random functions and NumPy's random package). To summarize quickly; standard quantile sampling will lose precision in the tails. If we think of the sample as an image, with the mode at the center and the tails near the edges, then the image becomes pixelated at the edges. This is a problem in two cases; (i) you are interested in low-probability events or (ii) you have a very large sample size. My pqRand package solves this problem, producing a random sample with the maximum precision allowed by the floating point data at a minimal computational cost. I have implemented pqRand in C++ and Python, which should allow anyone to improve the precision of their Monte Carlo simulation and, hopefully, their results. If you decide to use this package, please let me know! I am especially open to bug reports or feature suggestions; just shoot me an email.

Forke me on Github

Rejection sampling draws from the proposal distribution (red) and rejects all points above the blue line, to remove the excess are of the red curve.

Of course, you might be thinking, "Quantile sampling is great, but it only works for a few special distributions which have CDFs that are calculable and invertible. My Monte Carlo uses rejection sampling." This is an absolutely valid point, but rejection sampling relies on a drawing an original sample from a proposal distribution, then shaving that sample down until it fits a target distribution. Therefore, the precision of the target distribution cannot exceed that of the proposal distribution. If the proposal distribution uses quantile sampling (even if it's a uniform distribution), then the target distribution could very well be vulnerable to a loss of precision.