C.1 Method Based on a Change of Variable
Appendix C: Drawing Random Variables with Prescribed Distributions
can simply generate x by drawing a uniform random number u and computing

x = F^{-1}(u)    (C.4)

where F^{-1} is the inverse function of F. In practice, this method is useful only when an analytical expression of F^{-1} is available, which already covers a number of usual cases of interest, such as exponential or power-law distributions. For instance, an exponential distribution
p(x) = λ e^{-λx}    (x > 0)    (C.5)
with λ > 0 can be simulated using the change of variable
x = -(1/λ) ln(1 - u).    (C.6)
Since u and (1 − u) have the same uniform distribution, one can in principle replace
(1 − u) by u in the r.h.s. of Eq. (C.6). One however needs to pay attention to the
fact that the argument of the logarithm has to be non-zero, which guides the choice
between u and (1 − u), depending on whether 0 or 1 is excluded by the random
number generator. Similarly, a power-law distribution
p(x) = α x_0^α / x^{1+α}    (x > x_0)    (C.7)
with α > 0, can be simulated using
x = x_0 (1 - u)^{-1/α}.    (C.8)
Here again, the same comment about the choice of u or (1 − u) applies. Many other
examples where this method is applicable can be found.
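As a minimal sketch, the two inversion formulas (C.6) and (C.8) translate directly into code; the function names and parameter values below are our own choices, not part of the text:

```python
import math
import random

def draw_exponential(lam):
    """Draw x > 0 with density p(x) = lam * exp(-lam * x), using Eq. (C.6)."""
    u = random.random()  # uniform on [0, 1); hence 1 - u lies in (0, 1] and log(1 - u) is finite
    return -math.log(1.0 - u) / lam

def draw_power_law(x0, alpha):
    """Draw x > x0 with density p(x) = alpha * x0**alpha / x**(1 + alpha), using Eq. (C.8)."""
    u = random.random()  # again 1 - u > 0, so the negative power is well defined
    return x0 * (1.0 - u) ** (-1.0 / alpha)

# Quick sanity check: the mean of an exponential distribution is 1/lam.
samples = [draw_exponential(2.0) for _ in range(100_000)]
print(sum(samples) / len(samples))  # close to 0.5
```

Keeping (1 - u) rather than u here implements the remark above: since random.random() may return 0 but never 1, the argument of the logarithm (and of the negative power) never vanishes.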
When no analytical expression of the reciprocal function F −1 is available, one
could think of using a numerical estimate of this function. There are, however, more convenient methods that can be used in this case, such as the rejection method described below.
Before describing this generic method, let us mention a generalization of the
change of variable method, one important application of which is the simulation of a Gaussian distribution. Instead of making a change of variable on single variables, one can consider pairs of random variables: (x_1, x_2) = F(u_1, u_2), where
u 1 and u 2 are two independent uniform random numbers. It can be shown [1] that
the following choice
x_1 = √(-2 ln u_1) cos(2πu_2),    x_2 = √(-2 ln u_1) sin(2πu_2),    (C.9)
leads to a pair of independent Gaussian random variables x1 and x2 , each with
distribution
p(x) = (1/√(2π)) e^{-x²/2}.    (C.10)
In practice, one often needs a single Gaussian variable at a time, and uses only one
of the variables (x_1, x_2). A Gaussian variable y of mean m and variance σ² can be obtained by the simple rescaling y = m + σx, where x follows the distribution (C.10).
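A sketch of the Box-Muller formulas (C.9), together with the rescaling y = m + σx (the function names below are ours):

```python
import math
import random

def draw_gaussian_pair():
    """Return two independent standard Gaussian variables, via Eq. (C.9)."""
    u1 = 1.0 - random.random()  # shifted to (0, 1] so that log(u1) is finite
    u2 = random.random()
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

def draw_gaussian(m, sigma):
    """Gaussian variable of mean m and standard deviation sigma."""
    x1, _ = draw_gaussian_pair()  # as in the text, the second variable is simply discarded
    return m + sigma * x1
```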
C.2 Rejection Method
An alternative method, which is applicable to any distribution, is the rejection method
that we now describe. Starting from an arbitrary target distribution p(x) defined over
an interval (a, b) (where a and/or b may be infinite), one first needs to find an auxiliary
positive function G(x) satisfying the three following conditions: (i) for all x such that a < x < b, G(x) ≥ p(x); (ii) ∫_a^b G(x) dx is finite; (iii) one is able to generate numerically a random variable x with distribution

p̃(x) = G(x) / ∫_a^b G(x′) dx′    (a < x < b),    (C.11)
through another method, for instance using a change of variable. Then the rejection method consists of two steps. First, a random number x is generated according to the distribution p̃(x). Second, x is accepted with probability p(x)/G(x); this is
done by drawing a uniform random number u over the interval (0, 1), and accepting
x if u < p(x)/G(x). The geometrical interpretation of the rejection procedure is
illustrated in Fig. C.1.
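As a concrete illustration (our own toy example, not taken from the text), consider the target p(x) = 6x(1 - x) on (0, 1) with the constant bound G(x) = 3/2; since max p(x) = 3/2, condition (i) holds, and the auxiliary distribution p̃ of Eq. (C.11) is simply uniform on (0, 1):

```python
import random

def draw_rejection():
    """Rejection sampling of p(x) = 6 x (1 - x) on (0, 1), with G(x) = 1.5."""
    while True:
        x = random.random()  # step 1: draw x from p~, here the uniform distribution
        u = random.random()  # step 2: accept x with probability p(x) / G(x)
        if u < 6.0 * x * (1.0 - x) / 1.5:
            return x
```

The loop simply retries after each rejection, so the function always returns an accepted value.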
That the resulting variable x is distributed according to p(x) can be shown using
the following simple reasoning. Let us symbolically denote as A the event of drawing the variable x according to p̃(x), and as B the event that x is subsequently accepted.
We are interested in the conditional probability P(A|B), that is, the probability
distribution of the accepted variable. One has the standard relation
P(A|B) = P(A ∩ B) / P(B).    (C.12)
(C.12)
The joint probability P(A ∩ B) is simply the product of the probability p̃(x) and the acceptance probability p(x)/G(x), yielding from Eq. (C.11)

P(A ∩ B) = p(x) / ∫_a^b G(x′) dx′.    (C.13)
Fig. C.1 Illustration of the rejection method, aiming at drawing a random variable according to
the normalized probability distribution p(x) (full line). The function G(x) (dashed line) is a simple
upper bound of p(x) (here, simply a linear function). A point P is randomly drawn, with uniform
probability, in the area between the horizontal axis and the function G(x). If P is below the curve
defining the distribution p, its abscissa x is accepted (point P1 ); it is otherwise rejected (point P2 ).
The random variable x constructed in this way has probability density p(x)—see text
Then, P(B) is obtained by summing P(A ∩ B) over all events A, yielding

P(B) = ∫_a^b dx p(x) / ∫_a^b G(x′) dx′ = 1 / ∫_a^b G(x′) dx′.    (C.14)
Combining Eqs. (C.12)–(C.14) eventually leads to P(A|B) = p(x).
From a theoretical viewpoint, any function satisfying conditions (i), (ii) and (iii) is
appropriate. Considering the efficiency of the numerical computation, it is however
useful to minimize the rejection rate, equal from Eq. (C.14) to
r = 1 - 1 / ∫_a^b G(x) dx.    (C.15)

Hence the choice of the function G(x) should also try to minimize ∫_a^b G(x) dx, to make it relatively close to 1 if possible. Note that G(x) does not need to be a close upper approximation of p(x) everywhere; only the integral of G(x) matters.
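As a hypothetical numerical check of Eq. (C.15): for the target p(x) = 6x(1 - x) on (0, 1), a constant envelope G(x) = 3/2 (the tightest constant bound) gives ∫_0^1 G(x) dx = 3/2 and hence r = 1/3, while the looser choice G(x) = 2 gives r = 1/2:

```python
import random

def rejection_rate(g_const, n_accept=100_000):
    """Empirical rejection rate when sampling p(x) = 6 x (1 - x) on (0, 1)
    with a constant envelope G(x) = g_const (requires g_const >= 1.5)."""
    trials = accepted = 0
    while accepted < n_accept:
        x = random.random()
        trials += 1
        if random.random() < 6.0 * x * (1.0 - x) / g_const:
            accepted += 1
    return 1.0 - accepted / trials

print(rejection_rate(1.5))  # close to 1 - 2/3 = 1/3
print(rejection_rate(2.0))  # close to 1 - 1/2 = 1/2
```

The looser envelope wastes twice as many draws, illustrating why a tight integral of G(x) pays off.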
Reference
1. W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical Recipes: The Art of Scientific Computing, 3rd edn. (Cambridge University Press, Cambridge, 2007)