Giunti and Laveder |
The Power of Confidence Intervals |
Bityukov and Krasnikov |
Uncertainties and discovery potential in planned experiments |
Cooper-Sarkar |
The ZEUS NLO QCD fit to determine parton distributions and $\alpha_s$ |
Kjaer |
Exact optimization of likelihood analyses in a unified
Bayesian-Frequentist Monte Carlo Framework: The "K2-filter" |
Kjaer |
Optimal statistical treatment of parameterized systematic errors
in a unified Bayesian-Frequentist Monte Carlo Framework. |
Kjaer |
Statistically regularized deconvolution of unbinned spectra
containing event-by-event resolution functions in a unified
Bayesian-Frequentist Monte Carlo Framework. |
Kahrimanis |
Objectively derived default prior depends on stopping rule;
Bayesian treatment of nuisance parameters is defended |
Sharafiddinov |
Mass of the Neutrino and Evolution of the Universe. |
Wolter |
Measurement of physical quantities in the Bayesian framework using neural networks to fit
the probability distributions |
Redin |
Advanced Statistical Techniques in muon g-2 experiment at BNL |
Aslan and Zech |
Comparison of different goodness-of-fit tests |
Roe and Woodroofe |
BooNE Neutrino Oscillations |
Wang |
Diagnose bad fit to multiple data sets |
Vaiciulis |
Support Vector Machines in Analysis of Top Quark Production |
Sina and Seo |
Low Count Poisson Statistics and Cosmic-ray Spectral Studies |
Yabsley |
Statistical practice at the Belle experiment, and some questions |
Angelini et al. |
Jet reconstruction through a general purpose clustering algorithm |
Rolke and Lopez |
Bias-Corrected Confidence Intervals For Rare Decays |
Blobel and Kleinwort |
A new method for the high-precision alignment of track detectors |
Blobel |
An unfolding method for high energy physics experiments |
Kinoshita |
Estimating Goodness-of-Fit for Unbinned Maximum Likelihood Fitting |
Signorelli et al. |
Strong Confidence limits and frequentist treatment of systematics. |
Bock and Wittek |
Multidimensional event classification in images from gamma ray air showers |
Parodi et al. |
How to include the information coming from $B^0_s$ oscillations in CKM fits |
Hakl and Richter-Was |
Application of Neural Networks optimized by Genetic Algorithms to Higgs boson search |
Punzi |
Limits setting in difficult cases and the Strong Confidence approach. |
Karlen |
Credibility of Confidence Intervals |
Demortier |
Bayesian treatments of systematic uncertainties |
Conrad et al. |
Coverage of Confidence Intervals calculated
in the Presence of Systematic Uncertainties. |
Hill and DeYoung |
Application of Bayesian statistics to muon track reconstruction in AMANDA |
Raja |
Confidence limits and their errors |
Towers |
Overview of Probability Density Estimation methods (PDE's) |
Towers |
How to optimise the signal/background discrimination of a MV analysis (hint: reduce the number of variables) |
Stump |
New Generation of Parton Distributions and Methods of Uncertainty Analysis |
Bluemlein & Boettcher |
Polarized Parton Distributions and their Errors |
Reisert |
The H1 NLO QCD fits to determine $\alpha_s$ and parton distribution functions. |
C. Giunti
|
INFN, Sez. di Torino, and Dip. di Fisica Teorica, Univ. di Torino, I-10125 Torino, Italy
|
M. Laveder
|
Dip. di Fisica "G. Galilei", Univ. di Padova,
and INFN, Sez. di Padova, I-35131 Padova, Italy
|
The Power of Confidence Intervals
|
We consider the power to reject false values of the parameter in Frequentist methods for the calculation of confidence intervals. We connect the power with the physical significance (reliability) of confidence intervals for a parameter bounded to be non-negative. We show that the confidence intervals (upper limits) obtained with a (biased) method that, near the boundary, has large power in testing the parameter against larger alternatives and small power in testing the parameter against smaller alternatives are physically more significant. Considering the recently proposed methods with correct coverage, we show that the physical significance of upper limits is smallest in the Unified Approach and highest in the Maximum Likelihood Estimator method. We illustrate our arguments in the specific cases of a bounded Gaussian distribution and a Poisson distribution with known background.
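For concreteness, a minimal numerical sketch of the notion of power used above, for a simple classical (non-unified) Gaussian upper limit rather than any of the specific methods compared in the paper; the 90% confidence level and unit sigma are illustrative assumptions.

from scipy.stats import norm

def power_to_exclude(mu1, mu_true, sigma=1.0, cl=0.90):
    # Upper limit U(x) = x + z*sigma excludes mu1 whenever U(x) < mu1,
    # so the power is P( X < mu1 - z*sigma | mu_true ).
    z = norm.ppf(cl)
    return norm.cdf(mu1 - z * sigma, loc=mu_true, scale=sigma)

# power to exclude a few false values when the true value sits at the boundary
for mu1 in (1.0, 2.0, 3.0):
    print(mu1, round(power_to_exclude(mu1, mu_true=0.0), 3))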
|
S.I. Bityukov
|
Institute for High Energy Physics, Protvino
|
N.V. Krasnikov
|
Institute for Nuclear Research, Moscow
|
Uncertainties and discovery potential in planned experiments
|
Several criteria used by physicists to quantify the ratio of signal to background are compared. An approach for taking into account the uncertainty in estimates of the signal and background is proposed.
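As an illustration of the kind of criteria being compared, the sketch below evaluates a few significance measures in common use (S/sqrt(B), S/sqrt(S+B), and a Poisson tail probability converted to Gaussian sigmas); the specific criteria and the treatment of uncertainties studied by the authors may differ.

from scipy.stats import norm, poisson

def significances(s, b):
    """s, b: expected signal and background counts."""
    z1 = s / b**0.5                    # S / sqrt(B)
    z2 = s / (s + b)**0.5              # S / sqrt(S+B)
    p = poisson.sf(s + b - 1, b)       # P(n >= s+b | background only)
    z3 = norm.isf(p)                   # tail probability as Gaussian sigmas
    return z1, z2, z3

print([round(z, 2) for z in significances(s=10, b=25)])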
|
A.M.Cooper-Sarkar (Oxford U.)
|
on behalf of the ZEUS Collaboration
|
The ZEUS NLO QCD fit to determine parton distributions and $\alpha_s$
|
A next-to-leading order QCD fit was performed on the $e^+p$ deep inelastic scattering data collected by the ZEUS experiment over the years 1996-1997, together with fixed target data. The gluon and quark densities of the proton have been extracted. The experimental systematic errors take into account point-to-point correlations. This allows a correct evaluation of the experimental uncertainties in the extracted parton densities. A combined fit for the strong coupling constant, $\alpha_s(M^2_Z)$, and the gluon and quark densities was performed, yielding $\alpha_s(M^2_Z) = 0.117\pm0.003\pm0.004$.
|
Niels Kjaer
|
CERN, CH - 1211 Geneva 23, Switzerland
|
Jorgen D'Hondt
|
IIHE (ULB-VUB), Pleinlaan 2, BE-1050 Brussels, Belgium
|
Exact optimization of likelihood analyses in a unified
Bayesian-Frequentist Monte Carlo Framework: The "K2-filter"
|
This paper presents a generalization of the Kalman filter
process control to a situation where the stochastic parts
are not analytically computable and only available as Monte
Carlo generated events.
In a Monte Carlo framework where the exact matrix elements, $P_i(x)$, are known on an event-by-event basis, the maximum likelihood technique provides the optimal analysis.
To $O(1/n)$, where $n$ is the number of events, the information content can be computed as $I = 1/\mathrm{error}^2 = \sum_i g_i^2$, where $g_i = -\mathrm{d}\ln P_i(x)/\mathrm{d}x|_{x_0}$;
$g_i$ is thus exactly computable from the generation information.
When the measurement result, $E(x)$, is close to the reference point, $x_0$,
$g_i$ contains all the information needed for the estimation of $E(x)$.
On the analysis level this paper shows that the error is given by $I = 1/\mathrm{error}^2 = \sum_i [\,g_i^2 - (a_i - g_i)^2\,]$,
where $a_i = -\mathrm{d}\ln L_i(x)/\mathrm{d}x|_{x_0}$ is the information on the analysis level.
Therefore, any likelihood analysis is optimized by maximizing I.
The optimization is explicitly linear in $a_i$, which leads to solutions
of simple linear equations as in other Kalman filter cases.
The optimization procedure assumes an initial analysis.
Any set of parameters used in the initial analysis can be optimized
and the analysis output can be optimized as a function of
any observable simultaneously extracting all information content of
the observables.
The analyses can be optimized for any value of $x_0$ through reweighting of the Monte Carlo events, leading to the final likelihood curves being integrals of $a_i(x_0)\,\mathrm{d}x_0$.
Generalizing the optimization procedure to simultaneous
measurements of several observables is shown to give complications
which rise only linearly as a function of the number of observables.
The Bayesian optimization procedure itself is controlled using a Frequentist approach, which simply amounts to applying the Jackknife in the case of uncorrelated events. This procedure is shown to give exact estimates of any potential overtraining in the optimization procedure.
Finally, several examples and recommendations for practical use are given.
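A minimal toy illustration of the information formulas quoted above, using Gaussian events of known width so that the per-event densities $P_i(x)$ are exact; the Gaussian model and the deliberately degraded analysis likelihood are assumptions made here for illustration only.

import numpy as np

rng = np.random.default_rng(1)
x0, sigma_true, sigma_ana, n = 0.0, 1.0, 1.3, 100_000
e = rng.normal(x0, sigma_true, size=n)        # generated events, P_i(x) = N(e_i; x, sigma_true)

g = -(e - x0) / sigma_true**2                 # g_i = -d ln P_i(x)/dx at x0 (exact)
a = -(e - x0) / sigma_ana**2                  # a_i from a deliberately degraded analysis likelihood

I_gen = np.sum(g**2)                          # generator-level information, sum_i g_i^2
I_ana = np.sum(g**2 - (a - g)**2)             # analysis-level information per the formula above

print("exact 1/error^2      :", n / sigma_true**2)
print("sum_i g_i^2          :", round(I_gen))
print("analysis-level I     :", round(I_ana))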
|
Niels Kjaer
|
CERN, CH - 1211 Geneva 23, Switzerland
|
Optimal statistical treatment of parameterized systematic errors
in a unified Bayesian-Frequentist Monte Carlo Framework.
|
This paper describes a method to explicitly incorporate systematic
errors in the statistical part of a likelihood analysis.
In Monte Carlo based likelihood analyses systematic errors can
be interpreted as uncertainties in the underlying Monte Carlo Model.
Often, these uncertainties can be reliably parameterized, giving parameters, s, in the simulation, which are either controlled
by the information from the data itself or limited by external
information from other data.
In this case it is shown that the optimal way to treat
systematic errors is to simultaneously estimate the observables
of interest and the systematic parameters themselves.
In the K2-filter approach this is done using an event-by-event
analysis with optimal observables for both the observables
of interest and s. Even though events contain no explicit
information on the systematic s, the measurement will often
depend on a combination of s and other observables.
The practical implementation of the proposed procedure is
simply to generate events according to the a priori known
distribution of s. If the measurement does not depend on s
this causes no harm, while if it does depend on s, the
Monte Carlo sample can be used to extract the correlations with s.
This is then transformed into a multi-dimensional likelihood curve for the observables and s.
The simultaneous measurements can hence be done using classical
techniques. An a priori knowledge of s can be folded into the
likelihood, and experiments can be combined directly taking
the systematic correlations into account.
|
Niels Kjaer
|
CERN, CH - 1211 Geneva 23, Switzerland
|
Statistically regularized deconvolution of unbinned spectra
containing event-by-event resolution functions in a unified
Bayesian-Frequentist Monte Carlo Framework.
|
A recursive Bayesian algorithm is applied to update a generator function in the case of a set of events, each with an individual Bayesian resolution function. The recursive process is locally regularized by computing the statistical significance
regularized by computing the statistical significance
of the proposed update from the data themselves.
The significance is evaluated in a Frequentist's framework using
the Jackknife estimator. The algorithm is stopped when all
proposed updates have significances which are below a regularization
parameter. The choice of this parameter to be one standard deviation
is shown to give a deconvoluted spectrum which is optimal
and information conserving at the level of one standard deviation.
The output of the algorithm is formally a Monte Carlo generator
function, with a full covariance matrix, which ensures full
usage of the spectral information after deconvolution.
Only localized information and features are retained by
applying the algorithm, which explains why it is possible
to obtain information conservation.
The algorithm is extendable to any number of dimensions, but
relies on the concept of locality, which is only uniquely defined
in 1-d. With an additional required definition of locality
the algorithm is shown also to be information conserving in 2-d
deconvolution problems at the level of one standard deviation.
This rapidly converging algorithm is unique since it correctly
deals with event-by-event resolution functions and does not produce
any artificial features which other unbinned algorithms are prone to do.
|
George Kahrimanis
|
Southern University at Baton Rouge
|
Objectively derived default prior depends on stopping rule;
Bayesian treatment of nuisance parameters is defended
|
Previous attempts (mine, too) for defining objective priors
have been ineffectual, on account of both unsure derivation
and implausible results. In a fresh approach, the existence
of a default prior probability density is established in a
special case: if the experimental error, as a random
variable, is known to be independent of the true value.
This finding is extended to the generic case. The resulting prior is
the same as the one proposed by Balasubramanian (1996).
Consequently, prejudice can be eliminated in
Bayesian analysis. This assessment calls for a review of the comparison between Bayesian and classical methods. Only the
Bayesian method is suitable for obtaining results from
atypical data sets. Still, the comparison between classical
and Bayesian results can point out for us that the
sensitivity of an experiment may need enhancement in a
certain range. (Alternatively, one can compute a classical
goodness of fit, in this way also testing the plausibility
of the model.) Although useful, the classical approach has
certain severe side effects, such as coupling of the
background with the measured signal even if no events are
recorded, and the counterintuitive lowering of upper limits
in the presence of systematic uncertainties. Remedies have
existed for years, though not yet endorsed by everybody:
these side effects have been suppressed by means of a mixed
approach, in which the background and/or the systematic
uncertainties are treated in a Bayesian fashion. Such
mixing is defended again here, with the rationale that
normally our beliefs regarding systematic variables and the
background are far from controversial (unlike beliefs concerning the estimated variables); therefore a Bayesian treatment is suitable. This view becomes more defensible in
view of the possibility of objective Bayesian procedures.
|
Rasulkhozha S. Sharafiddinov
|
Institute of Nuclear Physics, Uzbekistan Academy of Sciences
|
Mass of the neutrino and evolution of the universe.
|
Our study of the elastic scattering of unpolarized and longitudinally polarized electrons and their neutrinos by spinless nuclei [1,2] shows clearly that the neutrino electric charge [2], connected with its nonzero rest mass, arises as a consequence of the availability of compound structures of the physical mass and charge. Given that the force of the Newtonian attraction of two neutrinos must be less than the force of their Coulomb repulsion, we can establish new restrictions on the neutrino mass and charge. Some considerations and logical confirmations of the existence of an intraneutrino interrelation between forces of different nature are listed, which must explain the violation of chiral symmetry as well as the stability of the Universe in its dependence on the ultimate structures of elementary particle charges and masses.
[1] R.B. Begzhanov and R.S. Sharafiddinov, Mod. Phys. Lett. A15 (2000) 557.
[2] R.S. Sharafiddinov, Spacetime & Substance 4 (2000) 176.
|
Marcin Wolter
|
Tufts University, Boston, USA
Institute of Nuclear Physics, Krakow, Poland
|
Measurement of physical quantities in the Bayesian framework using neural networks to fit the probability
distributions
|
A Bayesian approach to the reconstruction of particle properties is presented. As an illustration, the
algorithm is applied to the measurement of the Higgs boson mass. Monte Carlo events generated with various
Higgs masses are used to determine the probability distribution for each Higgs boson mass. The probability is
a function of measured
quantities, in this example the energies of b-tagged jets as well as the angle between them. The probability
distributions are fitted using neural networks and the final Higgs mass probability distribution is obtained
as
a product of the single event distributions. The proposed method offers advantages compared with the
traditional approach based on the invariant mass distribution. The miscalibration of the measured quantities
is automatically corrected by the probability distributions. Also the Higgs mass resolution can be superior
due to the fact that well reconstructed events enter the final distribution with higher weight than the
events with worse mass resolution.
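The combination step described above can be sketched as follows, with toy Gaussian per-event densities standing in for the neural-network-fitted distributions; the mass range, resolutions and event counts are invented for illustration.

import numpy as np

rng = np.random.default_rng(7)
m_true = 115.0
reso = rng.uniform(5.0, 15.0, size=200)       # invented per-event mass resolutions (GeV)
m_rec = rng.normal(m_true, reso)              # reconstructed masses, one per event

m_grid = np.linspace(80.0, 150.0, 701)

def log_p_event(m, m_i, s_i):
    # toy per-event density in the hypothesized mass m (stand-in for the NN-fitted one)
    return -0.5 * ((m_i - m) / s_i) ** 2 - np.log(s_i)

log_L = sum(log_p_event(m_grid, m_i, s_i) for m_i, s_i in zip(m_rec, reso))
print("combined mass estimate:", m_grid[np.argmax(log_L)])

Well-measured events contribute more sharply peaked factors to the product and therefore carry more weight in the combined estimate, as claimed above.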
|
Sergei I. Redin
|
Budker Institute of Nuclear Physics, Novosibirsk, Russia
Yale University, USA.
|
Advanced Statistical Techniques in muon g-2 experiment
at BNL
|
In our muon g-2 experiment we intend to measure the muon g-2 value to about 0.3 ppm (parts per million) accuracy. There are two quantities in this experiment which have to be measured precisely: one is the magnetic field in the muon storage ring, measured by a system of NMR probes in terms of the proton spin resonance frequency (omega_p), and the other is the frequency of the g-2 oscillations in the time distribution of electrons emerging from muon decays (omega_a).
Technically, omega_a is to be found by chi-squared optimization of the parameters of a fit function approximating the time distribution (histogram) of decay electrons. This is a standard data analysis technique in high energy physics; however, the precision requirement in our experiment is exceptionally high. For this reason all statistical properties of the fitted distribution, such as statistical errors and correlations among parameters, must be well understood, all statistical tests must be well justified, all possible systematic effects must be extensively studied, etc.
A simple procedure which was developed in our g-2 experiment to calculate analytically the statistical errors and correlations of the parameters is given. In the course of these studies it was found that the correlation between the frequency and the phase of the g-2 oscillations can be used to improve the statistical error of omega_a by using the information on the phase, which we possess in our experiment. An appropriate equation was derived.
A similar procedure was developed for the analytical estimation of the systematic shift of omega_a due to the presence of a small unidentified background. This allowed us to study a number of possible sources of systematic errors, including such a fundamental systematic effect as the finite bin width, which might be important for any distribution in physics and elsewhere.
Some of the omega_a stability checks in the g-2 experiment are based on a set-subset comparison. These are, in particular, checks for stability with respect to changes of (1) the histogram fit start time and (2) the energy threshold of the decay electrons.
For the first case it was shown that the difference in omega_a should be within $\sqrt{\sigma^2(\mathrm{subset}) - \sigma^2(\mathrm{full\ set})}$ for one standard deviation. As a matter of fact, this set-subset formula is valid for any distribution, no matter how many parameters it has and whether or not they are correlated with each other. We have proved this theorem ourselves, though it might be found somewhere in the literature.
For the case of the energy threshold change this theorem does not apply, since some parameters of the g-2 distribution (the phase and amplitude of the g-2 oscillations) depend on the average energy and hence on the energy threshold. Nonetheless, the procedure developed for the first case allowed us to derive a formula for this situation too. The formula contains the sigma, phase and amplitude for both the subset and the full set.
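A one-line numerical illustration of the set-subset rule quoted above; the ppm values are invented for illustration.

import math

sigma_full = 1.00     # invented statistical error of omega_a from the full set (ppm)
sigma_subset = 1.30   # invented error from the later-start-time subset (ppm)

allowed_shift = math.sqrt(sigma_subset**2 - sigma_full**2)
print(f"one-standard-deviation allowed shift: {allowed_shift:.2f} ppm")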
|
B. Aslan and G. Zech
|
Universität Siegen, Germany
|
Comparison of different goodness-of-fit tests
|
Various distribution-free goodness-of-fit tests and tests of the uniform distribution, respectively, have been extracted from the literature. In addition, we have constructed several new tests. We have investigated the power of the selected tests with respect to different slowly varying distortions of experimental distributions which could occur in physics applications. None of the tests is optimal for all kinds of distortions. The application of most goodness-of-fit tests is restricted to one dimension. We discuss the possibility of extending binning-free tests to two or more dimensions.
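A sketch of a power study of this kind, using the Kolmogorov-Smirnov test of uniformity and a linear distortion of the uniform density as stand-ins for the tests and distortions actually studied by the authors; the power is estimated as the fraction of pseudo-experiments rejected at the chosen significance level.

import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(42)

def sample_distorted(n, eps):
    # accept-reject sampling from f(x) = 1 + eps*(2x - 1) on [0, 1], |eps| < 1
    out = []
    fmax = 1 + abs(eps)
    while len(out) < n:
        x = rng.uniform(0, 1, n)
        u = rng.uniform(0, fmax, n)
        out.extend(x[u < 1 + eps * (2 * x - 1)])
    return np.array(out[:n])

n_events, n_toys, alpha, eps = 200, 500, 0.05, 0.3
rejections = sum(kstest(sample_distorted(n_events, eps), "uniform").pvalue < alpha
                 for _ in range(n_toys))
print("estimated power of the KS test:", rejections / n_toys)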
|
Byron P. Roe and Michael B. Woodroofe |
University of Michigan, Ann Arbor, MI 48109, U.S.A. |
BooNE Neutrino Oscillations
|
For neutrino oscillations, the probability of oscillation is given by
$P_{\phi,\Delta m^2}(E) = \sin^2 2\phi \, \sin^2(1.27\,\Delta m^2 L/E)$,
where $\phi$ and $\Delta m^2$ are parameters to be determined, $E$ is the neutrino energy
and $L$ is the length of the neutrino path. The mini-BooNE experiment
has an incoming $\nu_\mu$ beam and searches for $\nu_\mu \to \nu_e$. If the
results of the LSND experiment are due to neutrino oscillations, the
number of events expected would be of the order of one hundred to a
few hundred. The process can be modelled as a marked Poisson process
with the energy of the events as "marks". The likelihood can be
written as
$L = \left( \prod_{k=1}^{n} N[\theta(e_k)g(e_k,x_k) + b\,h(e_k,x_k)] \right) \exp(-\Lambda)\, de_1 de_2 \cdots$,
where $\Lambda = \int_0^\infty \int_0^1 N[\theta(e)g(e,x) + b\,h(e,x)]\, dx\, de$
and $x$ represents parameters of the event shape.
Here $\theta g$ represents the signal and $bh$ represents the background; $g$
and $h$ are normalized to 1. $g$ is a function of one of the parameters,
$\Delta m^2$, and $\theta$ is a function of both parameters. Appropriate
methods for maximizing the likelihood and setting frequentist
confidence intervals or limits will be discussed.
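A sketch of the marked-Poisson likelihood written above, with toy stand-ins for the normalized densities $g(e,x)$ and $h(e,x)$; the real densities, and the dependence of $\theta$ and $g$ on the oscillation parameters, come from the mini-BooNE simulation and are not reproduced here.

import numpy as np

def neg_log_L(theta, b, events, g, h, N=1.0):
    """events: array of (e, x) pairs; g, h: normalized densities; theta, b: yields."""
    e, x = events[:, 0], events[:, 1]
    dens = N * (theta * g(e, x) + b * h(e, x))
    Lam = N * (theta + b)                # since g and h each integrate to 1
    return Lam - np.sum(np.log(dens))    # extended (marked Poisson) -log likelihood

# toy densities on e in [0, 3] GeV and x in [0, 1] (placeholders, not BooNE's)
g = lambda e, x: np.exp(-(e - 1.0)**2 / 0.08) / np.sqrt(0.08 * np.pi)   # Gaussian in e, flat in x
h = lambda e, x: np.full_like(e, 1.0 / 3.0)                             # flat in e and in x

rng = np.random.default_rng(0)
events = np.column_stack([rng.uniform(0, 3, 200), rng.uniform(0, 1, 200)])
print(neg_log_L(theta=50.0, b=150.0, events=events, g=g, h=h))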
|
Ming-Jer Wang |
Institute of Physics, Academia Sinica, Taipei, Taiwan |
Diagnose bad fit to multiple data sets
|
Determining parameter uncertainties on parton distributions using global fits to data sets is becoming an important research area due to the increase of the measurement precision in hadron collider experiments. Before extracting a set of reliable parameter uncertainties, one needs to make sure that a consistent data set has been established. For this reason, a new criterion for tests of goodness of fit to multiple data sets has been proposed to detect inconsistent data sets in a global fit of parton distribution functions. This method did indicate that the combined CTEQ5 data set is not internally consistent.
In the present study, we propose to examine the inconsistent experimental data set identified by the parameter-fitting criteria, using pull distributions and residual plots, in order to localize uncorrected systematic shifts in the data. Seven classes of pull distributions were postulated for seven possible distortions. Only five of them could be clearly identified. This provides a basis for diagnosis. If the residual uncertainty is not available, we can still examine the residual plot against a physical variable instead of the pull distribution. In addition, a detailed systematic pattern might also be revealed by this residual plot.
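The diagnostic quantities referred to above reduce, per data point, to residuals and pulls with respect to the global fit; a minimal sketch with invented numbers:

import numpy as np

data   = np.array([1.02, 0.97, 1.10, 1.05, 0.99])   # measurements of one data set
theory = np.full(5, 1.00)                            # global-fit prediction
sigma  = np.array([0.03, 0.03, 0.04, 0.04, 0.05])    # total point-by-point errors

residual = data - theory
pull = residual / sigma
print("pulls:", np.round(pull, 2), " mean pull:", round(pull.mean(), 2))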
|
Anthony Vaiciulis |
University of Rochester, Rochester, NY, USA |
Support Vector Machines in Analysis of Top Quark Production
|
Multivariate data analysis techniques have the potential
to improve physics analyses in many ways. The common
classification problem of signal/background discrimination
is one example. A comparison of a conventional method and
a Support Vector Machine algorithm is presented here for
the case of identifying top quark signal events in the
dilepton decay channel amidst a large number of background
events.
|
Ramin Sina and Eun-Suk Seo |
Institute for Physical Sciences and Technology,
University of Maryland |
Low Count Poisson Statistics and Cosmic-ray Spectral Studies
|
The cosmic-ray spectrum above 10 GeV can be described as a power law which steepens at about 1 PeV (commonly referred to as the "Knee" in the spectrum) and hardens again at about 1 EeV (the "Ankle" in the spectrum). At about 30 EeV, the spectrum is expected to die off due to interactions with the cosmic microwave background. However, several experiments have reported a significant flux above this maximum energy. Although the origin of the ultra
high energy part of the spectrum is still a mystery, cosmic-ray particles with
energies below 1 PeV are believed to be accelerated by supernova shocks, with
the maximum energy proportional to the charge of the cosmic-ray particle.
To test the supernova shock acceleration model, it is essential to use
detectors with good charge resolution to separate protons from heavier nuclei.
Such detectors must be placed above the atmosphere. Since the flux below the Knee falls off by a factor of approximately 500 with every decade of energy, these detectors will have very limited statistics near the maximum energies expected from supernova shocks, i.e. about 1 PeV.
The currently operating ground-based experiments also suffer from small
statistics in their highest energy range. In this paper we will discuss
the statistical techniques needed to measure spectral features with low
count Poisson statistics.
|
B.D. Yabsley |
Virginia Polytechnic Institute and State University |
Statistical practice at the Belle experiment, and some questions
|
The Belle collaboration operates a general-purpose detector at the KEKB
asymmetric-energy e+ e- collider, performing a wide range of measurements
in beauty, charm, tau and 2-photon physics. In this paper, the treatment
of statistical problems in past and present Belle measurements is
reviewed.
The adoption of a uniform method in the future requires the development of
standard tools. Early results from a tool to calculate frequentist
confidence intervals from multiple measurements in the Unified Approach,
under the simplifying assumption of Gaussian errors, are presented, and
compared with more exact calculations. A number of open questions,
including the preferred method of treating systematic errors, are also
discussed.
|
L. Angelini,
P. De Felice,
L. Nitti,
M. Pellicoro,
S. Stramaglia
|
Department of Physics and I.N.F.N. - Bari |
Jet reconstruction through a general purpose clustering algorithm.
|
A general purpose algorithm for data mining [1], based on the
synchronization property of extended spatio-temporal chaotic
systems, has been properly modified and tailored to
reconstruct jets in high energy particle physics. Comparison
with standard approaches will be presented.
[1] L. Angelini et al., Phys. Rev. Lett. 85 (2000) 554.
|
Wolfgang Rolke and Angel Lopez
|
University of Puerto Rico - Mayaguez |
Bias-Corrected Confidence Intervals For Rare Decays
|
When we find limits for rare decays we usually first search for a cut combination with a high signal to
noise ratio. Unfortunately this introduces a bias in the signal rate, making it appear higher than it really
is. We will discuss a method based on the bootstrap algorithm which corrects for this "cut selection" bias.
We will present the results of a mini Monte Carlo study that first shows the presence of this type of bias,
and then also shows that our method is quite effective in correcting for it, yielding confidence intervals
with the correct coverage.
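A generic bootstrap bias-correction recipe of the kind referred to above is sketched below; the authors' implementation for the cut-selection bias is more elaborate, and the toy estimator here is only an illustration.

import numpy as np

def bias_corrected(estimator, data, n_boot=1000, rng=None):
    """Bootstrap bias correction: return 2*theta_hat - mean(theta_bootstrap)."""
    rng = rng or np.random.default_rng()
    theta_hat = estimator(data)
    boot = [estimator(rng.choice(data, size=len(data), replace=True))
            for _ in range(n_boot)]
    return 2 * theta_hat - np.mean(boot)

# toy example: the sample maximum is a (downward) biased estimator of the true endpoint
rng = np.random.default_rng(3)
data = rng.uniform(0.0, 10.0, size=50)
print("raw:", round(data.max(), 3), " bias-corrected:", round(bias_corrected(np.max, data, rng=rng), 3))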
|
Volker Blobel and Claus Kleinwort
|
Inst. f. Experimentalphysik Hamburg and Deutsches Elektronensynchrotron DESY |
A new method for the high-precision alignment of
track detectors
|
Track detectors in high energy physics experiments
require an accurate determination of a large number of alignment
parameters. A method has been developed, which allows the determination
of up to several thousand alignment parameters in a simultaneous
linear least squares fit of an arbitrary number of tracks.
The method is general for problems where global parameters (e.g. from
alignment) and independent sets of local parameters (e.g track parameters
like angles and curvatures) appear simultaneously. Constraints
between global parameters can be incorporated.
The huge matrix system of the normal equations is reduced to the size
corresponding to the global parameters, which allows the exact
determination of the global parameters.
A program has been developed which allows a certain fraction of
outliers in the data.
The sensitivity of the method is demonstrated in an example of
the alignment of a 56-plane drift chamber and a 2-plane silicon tracker.
In this example about 1000 alignment parameters incl. local drift velocity
values are determined in a fit of 50 thousand tracks from ep-interactions
and field-on and field-off cosmics.
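The reduction of the normal equations described above can be sketched as a Schur complement over the local (track) parameter blocks; the sketch below assumes the per-track blocks have already been accumulated and is not the authors' program.

import numpy as np

def reduce_to_globals(C_gg, b_g, track_blocks):
    """Solve for the global parameters only.
    C_gg, b_g   : global-global normal matrix and right-hand side,
    track_blocks: list of (G_j, Gamma_j, beta_j) with
        G_j     global-local coupling matrix (Ng x Nl),
        Gamma_j local-local normal matrix    (Nl x Nl),
        beta_j  local right-hand side        (Nl,)."""
    C_red, b_red = C_gg.copy(), b_g.copy()
    for G, Gamma, beta in track_blocks:
        Gamma_inv = np.linalg.inv(Gamma)       # small Nl x Nl inversion per track
        C_red -= G @ Gamma_inv @ G.T
        b_red -= G @ Gamma_inv @ beta
    return np.linalg.solve(C_red, b_red)       # global (alignment) parameters

Only a matrix of the size of the global parameter vector remains to be solved, which is what makes several thousand alignment parameters tractable.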
|
Volker Blobel
|
Inst. f. Experimentalphysik Hamburg |
An unfolding method for high energy physics experiments
|
Finite detector resolution and limited acceptance require the application of unfolding methods to distributions measured in high energy physics experiments. Information on the detector resolution is usually given by a set of Monte Carlo events. Based on the experience with a widely used unfolding program (RUN), a modified method has been developed.
The first step of the method is a maximum likelihood fit of the Monte Carlo distributions to the measured distribution in one, two or three dimensions; the finite statistics of the Monte Carlo events are taken into account by the use of Barlow's method with a new method of solution. A clustering method is used beforehand to combine bins in sparsely populated areas.
In the second step a regularization is applied to the solution,
which introduces only a small bias. The regularization parameter is
determined from the data after a diagonalization and rotation
procedure.
|
Kay Kinoshita
|
University of Cincinnati |
Estimating Goodness-of-Fit for Unbinned Maximum Likelihood Fitting
|
An unbinned maximum likelihood fit has the advantage that it maximizes the use of available information to
obtain the shape of a distribution in the face of limited statistics. Measurements made recently by Belle
using this method include $\sin 2\phi_1$ and B and D meson lifetimes. However, this method has one difficulty
in that there has been no method for evaluating goodness-of-fit for the result. We derive a formal estimate
of goodness-of-fit for this method.
|
Giovanni Signorelli a,b)
Donato Nicolo' b)
Giovanni Punzi a,b)
|
a) Istituto Nazionale di Fisica Nucleare e Universita' PISA
b) Scuola Normale Superiore di PISA
|
Strong Confidence limits and frequentist treatment of systematics.
|
Calculation of Confidence Limits using the Strong Confidence paradigm is described. Applications to standard problems such as the Gaussian and the Poisson with background are discussed and compared with other methods.
Prescriptions to introduce systematic uncertainties in a pure frequentist
way are suggested. The computation of the limits for the CHOOZ neutrino
oscillation experiment in this framework is illustrated.
|
Rudy Bock and Wolfgang Wittek |
CERN and MPI, Munich |
Multidimensional event classification in images from gamma ray air showers.
|
Exploring signals from outer space has become a rapidly expanding science: astroparticle physics. Among earthbound observations, the technique of gamma ray Cherenkov telescopes using the atmosphere as a calorimeter is particularly recent. Events in such telescopes appear as 2-dimensional images (100 - 1000 pixels), and the image characteristics have to be used to discriminate between the interesting gammas and the dominating charged particles, mostly protons.
Present techniques of analysis express the images in terms of several parameters; the goal is to find some test statistic(s) which allow(s) the classification to be optimized. Among optimization techniques, the following have been used or are under investigation:
- cut sequences in the image parameters
- Classification and Regression Trees (CART, commercial products)
- Linear Discriminant Analysis (LDA)
- Composite Probabilities (under development)
- Kernel methods
- Artificial Neural Networks
The methods and some early tentative results will be briefly presented,
remaining problems will be discussed.
|
F. Parodi, P. Roudeau, A. Stocchi and A. Villa |
Genova University, LAL-Orsay, CERN/LAL-Orsay and Milano University |
How to include the information coming from $B^0_s$ oscillations in CKM fits
|
In this paper we discuss how to include the information
coming from the searches for $B^0_s$ oscillations in CKM
fits, starting from the standard output (amplitude
spectrum) of the LEP Oscillation Working Group.
The adopted method (Likelihood ratio) is compared with
other proposed methods.
|
Frantisek Hakl and Elzbieta Richter-Was |
Institute of Computer Science,
Czech Academy of Science,
Prague, Czech Republic
Institute of Nuclear Physics,
Jagiellonian University,
Krakow, Poland
|
Application of Neural Networks optimized by Genetic
Algorithms to Higgs boson search
|
Our contribution describes an application of a neural network approach to the SM (Standard Model) and MSSM (Minimal Supersymmetric Standard Model) Higgs search in the associated production $t\bar{t}H$ with $H \rightarrow b\bar{b}$. This decay channel is considered a discovery channel for Higgs scenarios with Higgs boson masses in the range 80 - 130 GeV.
A neural network model with a special type of data flow is used to separate the $t\bar{t}jj$ background from $H \rightarrow b\bar{b}$ events. The network combines a classical neural network approach with a linear decision tree separation process. The parameters of these neural networks are randomly generated, and a population of such networks of predefined size is trained to obtain the initial generation for the subsequent genetic algorithm optimization process. Genetic algorithm principles are used to tune the parameters of further neural network individuals derived from previous networks by the GA operations of crossover and mutation. The goal of this GA process is to optimize the performance of the final neural network.
Our results show that the NN approach is applicable to the problem of Higgs boson detection. Neural network filters can be used to emphasize the difference between the $M_{bb}$ distribution for events accepted by the filter (with an improved signal/background ratio) and the $M_{bb}$ distribution for the original events (with the original signal/background ratio), under the condition that there is no loss of significance. This improvement of the shape of the $M_{bb}$ distribution can be used as a criterion for the existence of the Higgs boson decay in the considered discovery channel.
|
Giovanni Punzi |
Scuola Normale Superiore and INFN-Pisa |
Limits setting in difficult cases and the Strong Confidence approach.
|
Examples of difficult situations in the practice of limits setting are examined, including measurements
affected by significant systematic uncertainties. Fully frequentist solutions to these problems are described,
based on the concept of Strong Confidence, a localized application of the standard Confidence Level concept
possessing many desirable properties from a physicist's viewpoint.
|
Dean Karlen |
Carleton University |
Credibility of Confidence Intervals
|
Classical confidence intervals are often misinterpreted by scientists and the general public alike. The
confusion arises from the two different definitions of probability in common use. Likewise, there is general
dissatisfaction when confidence intervals are empty or they exclude parameter values for which the experiment is
insensitive. In order to clarify these issues, the use of a Bayesian probability to evaluate the credibility of
a classical confidence interval is proposed.
|
Luc Demortier |
The Rockefeller University |
Bayesian treatments of systematic uncertainties
|
We discuss integration-based methods for incorporating
systematic uncertainties into upper limits, both in a
Bayesian and a hybrid frequentist-Bayesian framework.
For small systematic uncertainties, we show that the
relevant integral can be expressed as a convolution.
We derive the correct form of this convolution, examine
its properties, and present several examples.
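A minimal numerical sketch of such an integration for a Poisson counting experiment with a Gaussian uncertainty on the efficiency; the numbers, the flat prior in the signal and the 95% limit are illustrative assumptions, not the authors' examples.

import numpy as np
from scipy.stats import norm, poisson

n_obs, b, eps0, sig_eps = 3, 1.2, 1.0, 0.10    # invented counting-experiment inputs

eps = np.linspace(eps0 - 4 * sig_eps, eps0 + 4 * sig_eps, 201)
w = norm.pdf(eps, eps0, sig_eps)
w /= w.sum()                                    # discretized prior for the efficiency

def smeared_likelihood(s):
    # Poisson likelihood for signal s, averaged over the nuisance parameter
    return np.sum(w * poisson.pmf(n_obs, eps * s + b))

s_grid = np.linspace(0.0, 20.0, 2001)
post = np.array([smeared_likelihood(s) for s in s_grid])   # flat prior in s
post /= post.sum()
print("95% upper limit on s:", round(s_grid[np.searchsorted(np.cumsum(post), 0.95)], 2))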
|
Jan Conrad, Olga Botner, Allan Hallgren and Carlos P. de los Heros |
High Energy Physics Division, Uppsala University |
Coverage of Confidence Intervals calculated in the Presence of Systematic Uncertainties.
|
We present a Monte Carlo implementation of the Highland \& Cousins method to include systematic uncertainties in confidence interval calculations. Different ordering schemes and different types of parametrizations of the systematic uncertainties are considered. Using this implementation we perform measurements of the coverage for different assumptions on the size and shape of the systematic uncertainties. We illustrate the effect of including systematic uncertainties in a limit calculation with a real example taken from the field of high energy neutrino astrophysics.
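A sketch of a coverage measurement of this kind for a deliberately simple toy (a Gaussian measurement with a Gaussian systematic smearing and a naive upper limit); the ordering schemes and parametrizations studied by the authors are not reproduced.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(11)
z90, sig_sys, n_toys = norm.ppf(0.90), 0.5, 20_000

for mu_true in (0.5, 1.0, 2.0, 4.0):
    x = rng.normal(mu_true, 1.0, n_toys) + rng.normal(0.0, sig_sys, n_toys)
    upper = x + z90 * np.sqrt(1.0 + sig_sys**2)          # naive smeared upper limit
    print(f"mu_true = {mu_true}: coverage = {np.mean(upper >= mu_true):.3f}")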
|
Gary Hill and Tyce DeYoung |
University of Wisconsin and Santa Cruz Institute for Particle Physics |
Application of Bayesian statistics to muon track reconstruction in AMANDA
|
The AMANDA neutrino telescope detects neutrinos by observing
Cherenkov light from secondary leptons produced in charged
current neutrino interactions. At lower energies, a
background of penetrating muons approximately 10^6 times as
numerous as neutrino-induced muons is rejected by looking
for muons travelling upward through the earth. A comparison
of maximum likelihood and Bayesian approaches to track
reconstruction will be presented, and the implications of
each approach for background rejection will be discussed.
|
Sherry Towers |
State University of New York at Stony Brook |
Overview of Probability Density Estimation methods (PDE's)
|
Probability Density Estimation techniques are gaining popularity in particle physics. I will give an overview of these
powerful methods, and a comparison of their performance relative to that of neural networks.
|
Sherry Towers |
State University of New York at Stony Brook |
How to optimise the signal/background discrimination of a MV analysis (hint: reduce the number of variables)
|
As particle physics experiments grow more complicated
with each passing decade, so too do the analyses of data collected by these experiments. Multivariate analyses involving
dozens of variables are not uncommon in this field. I will show how the use of many
variables in a multivariate analysis can actually degrade the ability to distinguish signal from background, rather than
improve it. |
Daniel R. Stump |
Michigan State University |
New Generation of Parton Distributions and
Methods of Uncertainty Analysis
|
A new generation of parton distribution functions,
CTEQ6 in the sequence of CTEQ global analyses,
is presented. This analysis significantly extends previous
analyses because it includes a full treatment of
available correlated systematic errors for the data sets,
and provides a systematic treatment of uncertainties of the
resulting distribution functions. The properties of the
new parton distributions are shown. Methods for computing
uncertainties of physical predictions based on the
CTEQ6 analysis are described.
|
Johannes Bluemlein and Helmut Boettcher |
DESY Zeuthen |
Polarized Parton Distributions and their Errors
|
A QCD analysis of the world data on polarized deep inelastic scattering is presented in leading and next-to-leading order. New parametrizations are derived for the quark and gluon distributions, and the value of $\alpha_s(M_Z)$ is determined. Emphasis is put on the derivation of fully correlated $1\sigma$ error bands for these distributions, which are directly applicable to determine the experimental errors of other polarized observables. The error calculation based on Gaussian error propagation through the evolution equations is discussed in detail.
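The Gaussian error propagation referred to above amounts, for any observable depending on the fitted parameters, to $\sigma_f^2 = J C J^T$, with $J$ the vector of derivatives and $C$ the parameter covariance matrix; a minimal sketch with toy numbers:

import numpy as np

C = np.array([[0.04, 0.01],        # toy covariance matrix of two fitted parameters
              [0.01, 0.09]])
J = np.array([1.5, -0.7])          # derivatives of the observable w.r.t. the parameters

sigma_f = np.sqrt(J @ C @ J)
print("propagated 1-sigma error:", round(float(sigma_f), 3))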
|
Pekka Sinervo |
University of Toronto |
Estimating the Significance of Data
|
A review of what is commonly known as "signal significance" in the observation of new phenomena in experimental particle physics
data will be provided. The statistical concepts underlying definitions of signal significance will be
summarized and specific recent
examples of their uses will be discussed.
|
Rajendran Raja |
Fermi National Accelerator laboratory |
Confidence Limits and their errors
|
Confidence limits are commonplace in physics analysis. Great care must be taken
in their calculation and use especially in cases of limited statistics. We
introduce the concept of statistical errors of confidence limits and argue that
not only should limits be calculated but also their errors in order to represent
the results of the analysis to the fullest. We show that comparison of two
different limits from two different experiments becomes easier when their errors
are also quoted. Use of errors of confidence limits will lead to abatement of
the debate on which method is best suited to calculate confidence limits.
|
Fred James |
CERN |
Overview of Bayesian and Frequentist Principles
|
A summary of the principles forming the basis for Bayesian and Frequentist
methodologies, including:
1. How probability is defined and used.
2. Point estimation (how the "best estimate" is defined).
3. Interval estimation (how intervals are constructed to contain
a given confidence or credibility).
4. Hypothesis testing (how to compare two or more hypotheses).
5. Goodness-of-fit (how to measure whether data are compatible
with a single given hypothesis).
6. Decision making (how to make optimal decisions).
|
Fred James |
CERN |
The relation of goodness-of-fit to confidence intervals
|
Confidence intervals are by convention defined so that they contain a
given "confidence", which is either coverage probability for
frequentist intervals or Bayesian probability content for Bayesian
credibility intervals. Alternatively, one could imagine defining
intervals (or more generally regions in parameter space) which contain
all parameter values which give good fits to the data. This latter
definition may be closer to what physicists expect. Especially when
the complement of a confidence interval (an exclusion region) is
published, the reader may interpret that as the ensemble of parameter
values excluded because they don't fit to the data. Why are exclusion
regions not calculated that way? Should they be?
|
Fred James |
CERN |
Comment on a paper by Garzelli and Giunti
|
In a paper "Bayesian View of Solar Neutrino Oscillations"
[hep-ph/0108191 v3], Garzelli and Giunti list eight reasons for using
Bayesian inference (which they call "only a few facts"). I will show
that one can just as easily interpret those "facts" to arrive at the
opposite conclusion. The goal of this exercise is not to show that
Bayesian inference is right or wrong, but to show that this very
general way of reasoning does not lead to unambiguous conclusions. I
propose a different procedure, based on finding the method with the
properties appropriate to the way the results will be used or interpreted.
|
Burkard Reisert |
Max-Planck-Institut f\"ur Physik, Munich.
on behalf of the H1 Collaboration |
The H1 NLO QCD fits to determine $\alpha_s$ and
parton distribution functions.
|
A dedicated NLO QCD fit to the H1 $e^+p$ neutral current
cross sections (1994-97) and the BCDMS $\mu p$ data allowed
the strong coupling constant $\alpha_s$ and the gluon
distribution to be simultaneously determined. Correlated,
uncorrelated and statistical errors of the measurements are
treated by the fit to obtain an error band for the gluon
distribution. The variation of input parameters to the fit
gives rise to additional uncertainties for the gluon and
$\alpha_s$.
A fit to the neutral and charged current cross sections
including the latest $e^+p$ and $e^-p$ measurements is in preparation, aiming at the extraction of the full set of
parton density functions.
|