Smarter sampling methods
VEGAS IN GENERAL
When no prior information about the function to be integrated is available,
the preference clearly is for adaptive techniques, i.e. for algorithms that
learn about the function while integrating it. Vegas, which will be
discussed here, is a prime example of such an adaptive Monte Carlo
algorithm; it automatically concentrates the sampling in those regions
where the integrand is largest.
Vegas starts by dividing the n-dimensional hypercube [0,1]ⁿ into
smaller ones - usually of identical size - and performs a sampling in
each of them. The results are then used to refine the integration grid for
the next iteration of the sampling procedure. The refinement is done by
approximating the optimal p.d.f. p of the function f,

  p(x) = |f(x)| / ∫ dⁿx' |f(x')| ,

by a step function in each bin. The trick in Vegas is to factorise the p.d.f. p
in n dimensions into a product of n one-dimensional p.d.f.'s p_i, one for each dimension, i.e.

  p(x₁, x₂, ..., xₙ) = p₁(x₁) · p₂(x₂) · ... · pₙ(xₙ) ,

which clearly saves memory - the number of bins to keep track of grows only linearly
rather than exponentially with the dimension n - and obviously limits the usability of Vegas to those
integrands which have factorising structure. The latter requirement is often met
in particle physics, a fact that explains why Vegas is widely used in this field.
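To make the factorised ansatz concrete, the following sketch shows how such a grid can be stored and sampled: one list of bin boundaries per dimension, with points drawn bin by bin and dimension by dimension. It is purely illustrative - the function name and the toy grid are not taken from any Vegas implementation - but it realises a factorised step-function p.d.f. together with the associated weight 1/p(x).

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def sample_from_grid(edges):
    """Draw one point from a factorised step-function p.d.f.

    `edges` is a list of n arrays; edges[d] holds the N_bins+1 bin
    boundaries of dimension d.  In each dimension a bin is chosen with
    equal probability and the point is placed uniformly inside it, so
    the density in dimension d is p_d(x) = 1/(N_bins * bin_width).
    """
    point = np.empty(len(edges))
    weight = 1.0                      # accumulates 1/p(x)
    for d, e in enumerate(edges):
        nbins = len(e) - 1
        i = rng.integers(nbins)       # pick a bin uniformly
        width = e[i + 1] - e[i]
        point[d] = e[i] + width * rng.random()
        weight *= nbins * width       # 1/p_d(x) in the chosen bin
    return point, weight

# toy 2-dimensional grid with 4 bins per dimension: only 2*(4+1) boundary
# values have to be stored instead of 4**2 bin volumes
edges = [np.linspace(0.0, 1.0, 5), np.array([0.0, 0.1, 0.3, 0.6, 1.0])]
x, w = sample_from_grid(edges)
```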
A further nice feature of Vegas is that the evaluation of an integral can be
divided into two phases: first, the grid is initialised and optimised over many
iterations with comparatively few sampling points, and only then does the "true" sampling
leading to the final result start, with comparatively many sampling points distributed over
the now frozen grid. In each iteration j, the estimator and its variance are calculated
afresh, based on the N_j sampling points of this iteration only,

  E_j = (1/N_j) Σ_{k=1..N_j} f(x_k)/p(x_k) ,
  σ_j² = 1/(N_j − 1) · [ (1/N_j) Σ_{k=1..N_j} f²(x_k)/p²(x_k) − E_j² ] .
These individual results can be combined to yield a global estimator,

  E = [ Σ_j N_j E_j/σ_j² ] / [ Σ_j N_j/σ_j² ] ,

where each estimator contributes weighted by the number of sampling points N_j used for it
and by the variance σ_j² attached to it. The reliability of the total procedure
is then judged according to the χ² per degree of freedom,

  χ²/d.o.f. = 1/(J − 1) · Σ_{j=1..J} (E_j − E)²/σ_j² ,

with J the number of iterations; it checks whether the individual results are consistent
with each other. Consistency is signalled by values around 1 for the χ² per d.o.f..
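As a concrete illustration of this combination step, the short sketch below implements the weighted average and the χ² per degree of freedom exactly as written above; function and variable names are chosen for readability only and do not refer to any particular Vegas implementation.

```python
import numpy as np

def combine_iterations(estimates, variances, n_points):
    """Combine per-iteration results into a global estimate.

    Iteration j contributes with weight N_j / sigma_j^2, as in the
    formulas above; chi^2/d.o.f. near 1 signals mutually consistent
    iteration results.
    """
    E = np.asarray(estimates, dtype=float)      # E_j
    var = np.asarray(variances, dtype=float)    # sigma_j^2
    N = np.asarray(n_points, dtype=float)       # N_j
    w = N / var
    E_tot = np.sum(w * E) / np.sum(w)
    chi2_dof = np.sum((E - E_tot) ** 2 / var) / (len(E) - 1)
    return E_tot, chi2_dof

# three toy iterations: estimates, variances and numbers of points
E_tot, chi2 = combine_iterations([0.94, 1.02, 0.99],
                                 [0.04, 0.02, 0.01],
                                 [1000, 2000, 4000])
```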
WEIGHTS IN VEGAS: THE P.D.F.'s AND THEIR EVOLUTION
In fact, a closer look into the algorithm reveals that instead of reweighting the
bins in each iteration step, their boundaries are shifted. This is done in the following
way (for simplicity, sketched for one dimension only): The first iteration starts out
with a fixed number N_bins of bins, each of which has the same size
Δx_i = 1/N_bins.
After the first iteration, each bin i is subdivided into m_i sub-bins according to
its contribution <I_i> to the total estimator,

  m_i = K · <I_i> / Σ_k <I_k> ,

where K is a constant steering the granularity of the subdivision step. Of course, each
of the individual estimators is given by the accumulated contribution of the sampling
points falling into the respective bin,

  <I_i> = (1/N) Σ_{x_k ∈ bin i} |f(x_k)|/p(x_k) .
(Obviously the total number of sampling points N cancels out in the ratio for the m_i.) Assuming that each sub-bin
contributes in due proportion to its parent bin, the sub-bins are regrouped into
N_bins bins again, such that the anticipated contribution of each new bin to
the total estimator is the same. In practice this translates into few sub-bins
forming a new bin in those regions where the integrand is largest in magnitude and many
sub-bins forming a new bin in those regions where it is smallest. Of course, in
further iterations these newly found bins can be refined even more.
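The one-dimensional sketch below makes the subdivide-and-regroup step explicit. It is not taken from any existing Vegas code; in particular the function name, the rounding of the m_i and the linear interpolation inside the sub-bins are simplifying choices, the last one corresponding exactly to the assumption that each sub-bin contributes in due proportion to its parent bin.

```python
import numpy as np

def refine_grid(edges, contributions, K=1000):
    """One refinement step of a one-dimensional grid.

    `edges` are the current N_bins+1 bin boundaries, `contributions`
    the estimated contributions <I_i> of the bins.  Every bin i is cut
    into m_i = K * <I_i> / sum_k <I_k> equal sub-bins, and the sub-bins
    are regrouped into N_bins new bins of equal anticipated contribution.
    """
    edges = np.asarray(edges, dtype=float)
    contrib = np.asarray(contributions, dtype=float)
    nbins = len(edges) - 1
    m = np.maximum(1, np.rint(K * contrib / contrib.sum()).astype(int))

    # build the sub-bin boundaries: bin i is cut into m[i] equal pieces,
    # each assumed to carry a contribution <I_i>/m[i]
    sub_edges, sub_contrib = [edges[0]], []
    for i in range(nbins):
        width = (edges[i + 1] - edges[i]) / m[i]
        for k in range(m[i]):
            sub_edges.append(edges[i] + (k + 1) * width)
            sub_contrib.append(contrib[i] / m[i])
    sub_edges = np.array(sub_edges)

    # regroup: walk along the cumulative contribution and place a new
    # boundary whenever another 1/N_bins of the total has been collected
    cum = np.concatenate(([0.0], np.cumsum(sub_contrib)))
    targets = cum[-1] * np.arange(1, nbins) / nbins
    new_edges = np.interp(targets, cum, sub_edges)
    return np.concatenate(([edges[0]], new_edges, [edges[-1]]))

# toy example: a 4-bin grid on [0,1] whose last bin dominates the integral
new_edges = refine_grid([0.0, 0.25, 0.5, 0.75, 1.0], [0.1, 0.1, 0.3, 1.5])
```

With the toy numbers above the new boundaries move towards x = 1, i.e. the refined grid has one wide bin where the contribution is small and several narrow bins where it is large, as described in the text.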
In so doing, however, the repeated binning and rebinning may lead to oscillations of the grid
which destabilise the convergence. To avoid them, a damping term may be included
in the evaluation of the m_i, for example

  m_i = K · [ ( <I_i>/Σ_k <I_k> − 1 ) / ln( <I_i>/Σ_k <I_k> ) ]^α ;

this is in fact the rebinning distribution implemented in Vegas, with the convergence
parameter α typically chosen in the interval [0.2, 2].
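A compact sketch of this damped subdivision rule is given below; it could replace the undamped choice of the m_i in the refinement sketch above. Again, the function name and the default values are illustrative only.

```python
import numpy as np

def subdivision_counts(contributions, K=1000, alpha=1.5):
    """Damped subdivision counts m_i.

    The fractional contribution r_i = <I_i>/sum_k <I_k> is smoothed to
    (r_i - 1)/ln(r_i), raised to the power alpha and scaled by K, which
    tames oscillations of the grid between iterations.
    """
    r = np.asarray(contributions, dtype=float)
    r = np.clip(r / r.sum(), 1e-12, 1.0 - 1e-12)   # guard the logarithm
    damped = (r - 1.0) / np.log(r)
    return np.maximum(1, np.rint(K * damped ** alpha).astype(int))

m = subdivision_counts([0.1, 0.1, 0.3, 1.5], K=1000, alpha=1.5)
```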