Multi-Bin Exclusion Limit with Combine
These notes were written while using Combine v9.2.1. Please refer to Combine's own documentation when replicating this work or extending it to more complicated situations. In addition, these notes are not a full introduction to the statistical procedures within Combine (again, refer to its documentation); they are only meant as an example guide to how one could use Combine.
Set Up
Fortunately, Combine distributes a container image distinct from CMSSW.
This enables us to choose a version of Combine and (almost) immediately begin.
In order to pin the version, avoid typing longer commands, and have the same command across
multiple runners, I use this image via denv.
# initialize with combine version, may change version tag at end if desired
denv init --clean-env gitlab-registry.cern.ch/cms-cloud/combine-standalone:v9.2.1
# --clean-env is necessary for denv currently due to sharing of the LD_LIBRARY_PATH variable
# https://github.com/tomeichlersmith/denv/issues/110
Usage
Roughly, using Combine can be partitioned into three steps.
- Preparing the inputs to Combine
- Running Combine
- Interpreting the results from Combine
I've done the first and last steps within Python in a Jupyter notebook; however, one could easily use Python scripts, ROOT macros, or some other form of text file creation and TTree reading and plotting.
Input Preparation
The main input to Combine is a text file called a "datacard" where we specify the observed event counts, the expected event counts for the different processes, and the uncertainties on those expected counts. Again, I refer you to the Combine documentation for tutorials on writing these datacards. Below, I include an example datacard for a 3-bin analysis with only two processes (signal and background).
Example 3-bin 2-process Data Card
# counting experiment with multiple channels
# combine datacard generated on 2024-05-16 17:21:40
imax 3 number of channels
jmax 1 number of backgrounds
kmax 5 number of nuisance parameters
------------
# number of observed events in each channel
bin low med high
observation 1 10 100
------------
# list the expectation for signal and background processes in each of the channels
bin low low med med high high
process ap bkg ap bkg ap bkg
process 0 1 0 1 0 1
rate 0.400000 1.000000 0.300000 10.000000 0.200000 100.000000
------------
# list the sources of systematic uncertainty
sigrate lnN 1.200000 - 1.200000 - 1.200000 -
lowstat lnN - 2.000000 - - - -
medstat lnN - - - 1.316228 - -
highstat lnN - - - - - 1.100000
bkgdsyst lnN - 1.050000 - 1.050000 - 1.050000
A few personal notes on the datacards.
- The datacard syntax is a custom space-separated syntax that is somewhat difficult to edit manually. As such, there are many tools that write datacards to be subsequently read by `combine`. (The above example was actually written by some Python I wrote to convert my histograms into this textual format.)
- When uncertainties appear on the same line, they are treated as correlated by `combine` (the `bkgdsyst` uncertainty in the example). Similarly, appearing on separate lines means they are uncorrelated (the `*stat` uncertainties in the example).
- The total signal process rate (the sum over the `ap` process across all bins) is normalized to one so that interpreting the result is easier.
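The example datacard above was generated by a small Python helper. A minimal sketch of such a writer is below; the function name, argument layout, and the generic `sig`/`bkg` process labels are my own choices for illustration, not the author's actual script.

```python
def format_datacard(obs, sig, bkg, sig_unc, bkg_syst, bkg_stat):
    """Build a multi-bin counting-experiment datacard string.

    obs, sig, bkg : dicts mapping bin name -> event count
    sig_unc       : lnN factor on the signal, correlated across bins
    bkg_syst      : lnN factor on the background, correlated across bins
    bkg_stat      : dict of per-bin lnN factors, uncorrelated across bins
    """
    bins = list(obs)
    lines = [
        "# counting experiment with multiple channels",
        f"imax {len(bins)} number of channels",
        "jmax 1 number of backgrounds",
        # nuisances: sigrate + bkgdsyst + one stat uncertainty per bin
        f"kmax {2 + len(bins)} number of nuisance parameters",
        "------------",
        "bin         " + " ".join(bins),
        "observation " + " ".join(str(obs[b]) for b in bins),
        "------------",
        "bin     " + " ".join(f"{b} {b}" for b in bins),
        "process " + " ".join("sig bkg" for _ in bins),
        "process " + " ".join("0 1" for _ in bins),
        "rate    " + " ".join(f"{sig[b]:f} {bkg[b]:f}" for b in bins),
        "------------",
        "sigrate  lnN " + " ".join(f"{sig_unc:f} -" for _ in bins),
    ]
    for i, b in enumerate(bins):
        # uncorrelated: each per-bin stat uncertainty gets its own line
        cols = ["- -"] * len(bins)
        cols[i] = f"- {bkg_stat[b]:f}"
        lines.append(f"{b}stat lnN " + " ".join(cols))
    lines.append("bkgdsyst lnN " + " ".join(f"- {bkg_syst:f}" for _ in bins))
    return "\n".join(lines) + "\n"
```

Calling this with the numbers from the example above reproduces a datacard equivalent to the one shown.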
Running
There are a plethora of command line options for `combine` that I will not go into here. Instead, I just have a few comments.
- In a low-background scenario, the asymptotic limits are not as trustworthy, so it is better to use `--method HybridNew` rather than the default. This method does the full CLs technique with toy experiment generation. We can use the suggested settings for this method with the `--LHCmode LHC-limits` argument. We can also speed this up by using a "hint" to help get it started on the right path: `--hintMethod AsymptoticLimits`.
- In a blinded (no data) scenario, we want to observe the median expected limit rather than the "observed" limit for some arbitrarily chosen observed events. We can ask for this with `--expectedFromGrid=0.5`.
- One can label different runs within the output TTree with `--mass` and `--keyword-value` such that the resulting ROOT files can be `hadd`ed together into a single file containing all of the limits for the different runs of `combine`.
For a single datacard example,
denv combine \
--datacard datacard.txt \
--method HybridNew \
--LHCmode LHC-limits \
--hintMethod AsymptoticLimits \
--expectedFromGrid=0.5
Often, we want to estimate an exclusion for our different mass points as well.
The brute-force way to do this is to simply write a datacard for each masspoint
(if the mass point changes the signal rate fraction in each bin for example)
and then run combine
for each of these datacards while changing the --mass
label.
for m in 1 10 100 1000; do
denv combine \
--datacard datacard-${m}.txt \
--method HybridNew \
--LHCmode LHC-limits \
--hintMethod AsymptoticLimits \
    --expectedFromGrid=0.5 \
--mass ${m} || break
done
denv hadd higgsCombineTest.HybridNew.root higgsCombineTest.HybridNew.mH*.root
Without the `--mass` argument (or another `--keyword-value` argument) that
allows separation of the resulting limits in the ROOT files, the merged
ROOT file will have several limit values but no way for you to determine
which corresponds to the different inputs.
With GNU parallel installed, it is pretty easy to parallelize this running of combine since none of the runs depend on each other.
parallel \
--joblog combine.log \
denv combine \
  --method HybridNew \
  --LHCmode LHC-limits \
  --mass {} \
  --datacard datacard-{}.txt \
  --hintMethod AsymptoticLimits \
  --expectedFromGrid=0.5 \
::: 1 10 100 1000
And then `hadd` the same way as above.
Other Helpful Options
A few other helpful arguments for `combine`.
- `--plot=path/to/plot.png` stores the CLs vs r plot with the HybridNew method. This is helpful for making sure that combine is actually finding a solution.
- `--rAbsAcc` allows you to set the accuracy on r that combine is shooting for.
- `--toysH` allows you to manually ask for more toys to be thrown. This is sometimes necessary if you wish to improve the CLs estimation.
Result Interpretation
The output from Combine is stored into a TTree within a ROOT file
with a simple and well-defined schema.
For the `HybridNew` method, the `limit` branch of the TTree is
the upper limit on \(r = N_\text{sig}/N_\text{sig}^\text{exp}\)
where \(N_\text{sig}^\text{exp}\) is the total number of expected
signal events across all bins (hence why we set this number to one
when creating the datacard above).
Again, here I used Python within a Jupyter notebook to access this
TTree (via `uproot`) and plot the results (via `matplotlib` and `mplhep`).
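As a small sketch of that reading step: the Combine output tree is named `limit` and the mass labels are stored in its `mh` branch, so after `hadd`ing, the per-mass limits can be pulled out and sorted. The function below only assumes a mapping of branch name to array, such as `uproot.open("higgsCombineTest.HybridNew.root")["limit"].arrays(library="np")` would return; the function name is my own.

```python
import numpy as np

def limits_by_mass(arrays):
    """Return (masses, limits) sorted by mass.

    `arrays` maps branch name -> numpy array, e.g. the result of
    uproot.open(...)["limit"].arrays(library="np"); the branch names
    "limit" and "mh" follow the Combine output schema.
    """
    order = np.argsort(arrays["mh"])
    return arrays["mh"][order], arrays["limit"][order]
```

The sorted pairs can then be handed directly to `matplotlib` for an exclusion curve.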
Since we normalized the expected signal to one, the resulting limit
can be interpreted as the maximum number of signal events allowed.
This can then be converted into a limit on the dark sector mixing
strength \(\epsilon\) with knowledge of the total signal production
rate \(R_{\epsilon=1}\) and the signal efficiency \(E\).
\[
N_\text{sig} = \epsilon^2 E R_{\epsilon=1}
\quad\Rightarrow\quad
\epsilon_\text{up}^2 = \frac{r}{E R_{\epsilon=1}}
\]
where \(r\) is a stand-in for the value of the `limit` branch from
the Combine output ROOT file. Further conversion of this mixing strength
into an effective interaction strength \(y\) is commonly done using
the dark sector interaction strength \(\alpha_D\) and the ratio of
the dark fermion mass to the dark photon mass \(m_\chi/m_{A'}\).
\[
y_\text{up}
= \alpha_D\left(m_\chi/m_{A'}\right)^4\epsilon_\text{up}^2
= \alpha_D\left(m_\chi/m_{A'}\right)^4\frac{r}{E R_{\epsilon=1}}
\]
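Solving \(N_\text{sig} = \epsilon^2 E R_{\epsilon=1}\) for \(\epsilon\) with \(N_\text{sig}\) set to the limit value \(r\), this conversion can be sketched as below. The function names are my own, and the defaults \(\alpha_D = 0.5\) and \(m_\chi/m_{A'} = 1/3\) are commonly used benchmark choices assumed here for illustration, not values from these notes.

```python
import math

def epsilon_up(r, efficiency, rate_eps1):
    """Upper limit on the mixing strength epsilon.

    r          : value of the `limit` branch (max allowed signal events,
                 since the expected signal was normalized to one)
    efficiency : signal efficiency E
    rate_eps1  : total signal production rate R at epsilon = 1
    """
    # From N_sig = eps^2 * E * R with N_sig <= r
    return math.sqrt(r / (efficiency * rate_eps1))

def y_up(r, efficiency, rate_eps1, alpha_dark=0.5, mass_ratio=1.0 / 3.0):
    """Effective interaction strength y = alpha_D * (m_chi/m_A')^4 * eps^2.

    alpha_dark=0.5 and mass_ratio=1/3 are assumed benchmark defaults.
    """
    return alpha_dark * mass_ratio**4 * epsilon_up(r, efficiency, rate_eps1) ** 2
```

Applying `epsilon_up` to each entry of the `limit` branch gives the exclusion curve in \(\epsilon\) directly.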