Outline of an Analysis of an Expected Signal (with blinding)
Analysis Outline
Define the Analysis
- Signal: what physics process is being studied (e.g., H -> ZZ with each Z -> mu mu)
- Characteristics of the signal
- Event topology and kinematics
- Detector elements
- Selection criteria
- Particle identification
- Kinematics
- Jet reconstruction
- Reconstruction algorithm
- Background processes that mimic the signal
- Discriminate between signal and background
Implement the Analysis
- write/beg/borrow/steal code to implement the selection and reconstruction procedure
- most analyses start with someone else's code
Inputs: source code
Outputs: source code
Tools: source browser (lxr), version control system (cvs), IDE, development tools
Test the analysis on Simulated Data
- Use simulated signal and background data samples to test the implementation
- Use the "generic" simulated data samples provided by the experiment, or signal- or analysis-specific simulated data generated for the analysis
- Usually test backgrounds individually
- Statistically combine weighted results from individual signal and background tests to simulate full result
Inputs: simulated data
Outputs: test procedures, test results
Tools: data discovery (DBS), software framework (CMSSW, FWLite), analysis environment (ROOT), simulated data production workflow management (ProdAgent), analysis workflow management (CRAB)
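The weighted combination of individual sample results can be sketched as follows. This is a minimal illustration, not the experiment's procedure: it assumes each simulated sample is scaled by cross section times integrated luminosity divided by the number of generated events, and the sample names, cross sections, and event counts are hypothetical.

```python
import math

def sample_weight(cross_section_pb, luminosity_inv_pb, n_generated):
    """Per-event weight that scales a simulated sample to the target luminosity."""
    return cross_section_pb * luminosity_inv_pb / n_generated

def combine_backgrounds(samples, luminosity_inv_pb):
    """Sum the weighted yields of individually tested background samples.

    samples maps a name to (cross section [pb], events generated, events selected).
    Returns the total expected yield and its uncertainty from finite simulation
    statistics (weighted Poisson errors added in quadrature).
    """
    total = 0.0
    variance = 0.0
    for cross_section_pb, n_generated, n_selected in samples.values():
        w = sample_weight(cross_section_pb, luminosity_inv_pb, n_generated)
        total += n_selected * w
        variance += n_selected * w * w
    return total, math.sqrt(variance)

# Hypothetical numbers, for illustration only.
luminosity = 1000.0  # assumed integrated luminosity in pb^-1
signal = (0.01, 10000, 2500)
backgrounds = {
    "background_a": (1.0, 100000, 50),
    "background_b": (50.0, 1000000, 20),
}

expected_signal = signal[2] * sample_weight(signal[0], luminosity, signal[1])
expected_background, background_error = combine_backgrounds(backgrounds, luminosity)
print(expected_signal, expected_background, background_error)
```

Testing the backgrounds individually and combining them afterwards keeps each sample's weight explicit, so a sample with few selected events and a large weight is easy to spot.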
Tune the analysis on Simulated Data
- classification/optimization problem
- systematically vary selection criteria and measure the results on the various signal and background samples
- may include neural net training, decision trees, Bayesian classifiers, etc.
Inputs: simulated data
Outputs: tuning procedure, tuning results
Tools: same as previous step
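A simple form of the tuning step is a scan over one selection threshold. The sketch below assumes a single discriminating variable, weighted simulated events, and s/sqrt(s+b) as the figure of merit; all three are illustrative choices rather than anything prescribed by this outline, and the event values are made up.

```python
import math

def significance(s, b):
    """Simple figure of merit s / sqrt(s + b); returns 0 when nothing is selected."""
    return s / math.sqrt(s + b) if s + b > 0 else 0.0

def scan_cut(signal, background, thresholds):
    """Try each threshold and return (best_threshold, best_significance).

    signal and background are lists of (value, weight) pairs; an event is
    kept when its value is above the threshold.
    """
    best = None
    for t in thresholds:
        s = sum(w for v, w in signal if v > t)
        b = sum(w for v, w in background if v > t)
        z = significance(s, b)
        if best is None or z > best[1]:
            best = (t, z)
    return best

# Hypothetical weighted events: (value of the selection variable, weight).
signal_events = [(30, 1.0), (40, 1.0), (50, 1.0), (60, 1.0)]
background_events = [(10, 1.0), (20, 1.0), (25, 1.0), (35, 1.0)]

best_threshold, best_z = scan_cut(signal_events, background_events, [0, 15, 28, 45])
print(best_threshold, best_z)
```

A real tuning would vary several criteria at once and evaluate the result on samples not used for the optimization, which is where the multivariate methods listed above come in.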
Validate the analysis
- Verify that the overall characteristics of the simulated analysis results correspond to expectations
- Probe for errors, anomalies, etc.
- Hypothesize potential problems with the analysis
- Develop data samples to test for the hypothesized problems
- Some tests will use real data with no signal
- samples where no signal is expected
- samples with signal analyzed outside the signal region
- analysis modified to look for "impossible" signal signature (e.g., impossible particle charge combinations)
- On failure, iterate back to analysis definition or implementation as appropriate
Inputs: simulated data, real data
Outputs: validation procedure, validation results
Tools: same as previous step
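As one example of the "impossible" signature test, a Z -> mu mu analysis could be rerun on same-sign muon pairs, where no Z signal is expected. The sketch below assumes each candidate is a dictionary with hypothetical keys q1, q2, and mass, and that the mass window is chosen by the analyst.

```python
def count_by_charge(candidates, mass_window):
    """Count opposite-sign and same-sign pairs inside the mass window.

    Opposite-sign pairs are the normal selection; same-sign pairs are the
    "impossible" combination and should show no signal peak.
    """
    low, high = mass_window
    opposite_sign = 0
    same_sign = 0
    for c in candidates:
        if not (low <= c["mass"] <= high):
            continue
        if c["q1"] * c["q2"] < 0:
            opposite_sign += 1
        else:
            same_sign += 1
    return opposite_sign, same_sign

# Hypothetical dimuon candidates (charges and invariant mass in GeV).
candidates = [
    {"q1": +1, "q2": -1, "mass": 90.8},
    {"q1": -1, "q2": +1, "mass": 91.5},
    {"q1": +1, "q2": +1, "mass": 88.0},
    {"q1": -1, "q2": +1, "mass": 45.0},
]

n_os, n_ss = count_by_charge(candidates, (80.0, 100.0))
print(n_os, n_ss)
```

A same-sign count in the window that is much larger than expected from background alone would point to a problem such as charge misidentification or a selection bug, and would send the analysis back to the definition or implementation step.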
Determine the Efficiencies needed to reconstruct the physics process
- Usually determined via a combination of simulated and real data
- real data is preferred to avoid systematic uncertainties/biases in simulated data
- Efficiencies for a difficult analysis may be very low
Inputs: simulated data, real data
Outputs: efficiency analysis procedure, reconstruction efficiency results
Tools: same as previous step
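An efficiency measurement reduces to counting how many events from a known sample pass the selection. The sketch below uses the simple binomial approximation for the uncertainty, which is an assumption made for brevity; it becomes unreliable when the efficiency is close to 0 or 1, which matters for the very low efficiencies mentioned above. The counts are hypothetical.

```python
import math

def efficiency(n_pass, n_total):
    """Return (efficiency, uncertainty) using the binomial approximation."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    eff = n_pass / n_total
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err

def corrected_yield(n_observed, eff):
    """Estimate the number of produced events from the number observed."""
    return n_observed / eff

# Hypothetical counts, for illustration only.
eff, err = efficiency(50, 200)
n_produced = corrected_yield(10, eff)
print(eff, err, n_produced)
```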
Run on real data with blinded analysis
- Blinding obscures the value of the final measurement so that the analysis is unbiased by tuning for a desired value
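One common way to obscure the final value is a hidden offset: the measured quantity is shifted by an amount the analysts do not look at until unblinding. The outline does not say which blinding technique is meant, so the sketch below is only one possibility; the function names, the secret string, and the offset scale are all hypothetical.

```python
import hashlib

def blinding_offset(secret, scale):
    """Derive a reproducible offset in [-scale, scale) from a secret string."""
    digest = hashlib.sha256(secret.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    u = int.from_bytes(digest[:8], "big") / 2**64
    return (2.0 * u - 1.0) * scale

def blind(value, secret, scale):
    """Shift the measured value by the hidden offset."""
    return value + blinding_offset(secret, scale)

def unblind(blinded_value, secret, scale):
    """Remove the hidden offset once the analysis is frozen."""
    return blinded_value - blinding_offset(secret, scale)

# Hypothetical usage: the fit reports only the blinded value during tuning.
blinded = blind(91.2, "analysis-secret", 5.0)
restored = unblind(blinded, "analysis-secret", 5.0)
print(blinded, restored)
```

Because the offset is derived from the secret rather than stored, the same blinded result is reproduced on every run, and differences between analysis versions remain visible even though the absolute value is hidden.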
Verify the characteristics of the result
- Verify that results other than the signal match the expected distributions (e.g., check that fit residuals or backgrounds outside the signal region are reasonable).
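A check of the background outside the signal region can be sketched by comparing observed and expected bin counts. The example assumes Poisson uncertainties on the expected counts and uses hypothetical sideband numbers; what counts as "reasonable" is left to the analysis.

```python
import math

def pulls(observed, expected):
    """Per-bin (observed - expected) / sqrt(expected), assuming Poisson errors."""
    return [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]

def chi_square(observed, expected):
    """Sum of squared pulls over all bins."""
    return sum(p * p for p in pulls(observed, expected))

# Hypothetical sideband bins (signal region excluded).
observed_counts = [100, 90, 110]
expected_counts = [100.0, 100.0, 100.0]

bin_pulls = pulls(observed_counts, expected_counts)
chi2 = chi_square(observed_counts, expected_counts)
print(bin_pulls, chi2)
```

Pulls scattered around zero with a chi-square close to the number of bins suggest that the background model describes the data outside the signal region; a large or systematic deviation is a reason to stop before unblinding.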
Unblind the signal
--
DanRiley - 21 Sep 2007