How to Make Signal Samples

This page describes how to produce dark brem signal samples using the G4DarkBreM package integrated into ldmx-sw. It concentrates on dark brem occurring within the target, although dark brem within the ECal is also supported in ldmx-sw.

As mentioned on the prior page, making signal samples takes two steps. First, a reference library is produced using MadGraph; second, the Geant4 simulation uses that reference library to model the dark brem interaction. The first step is done outside of ldmx-sw, while the second happens within ldmx-sw and its simulation framework.

Generating a Reference Library

The technical requirements a library must meet to be usable by G4DarkBreM are written in its parseLibrary documentation. Theoretically, any program could generate a reference library usable by G4DarkBreM; however, it has only been tested and validated with libraries generated by a specific configuration of MadGraph. That configuration is the one useful to LDMX, and it has been packaged into a container similar to how ldmx-sw and its dependencies are.

For details on the MadGraph configuration and how to use it, visit the GitHub project tomeichlersmith/dark-brem-lib-gen. Unless you plan to add new features or fix some sort of issue, you will not need this repository; you only need a script that wraps the container-running process for you.

The script only needs to be downloaded once and it only needs to be sourced once inside each terminal you wish to run in.

# download the dark-brem-lib-gen environment script
wget https://raw.githubusercontent.com/tomeichlersmith/dark-brem-lib-gen/main/env.sh
# initialize the environment
source env.sh

Now a new bash function dbgen is defined which you can use (similar to ldmx) to interact with the containerized MadGraph and generate dark brem events. Further configuration of your local dbgen setup is possible now. Below, I've written some dummy commands which are helpful for various reasons.

  • dbgen use v4.5 : it is helpful to pin the version you are using so that future analyzers of the data (including yourself) know exactly how it was generated
  • dbgen cache /big/cache/dir : on clusters where you are using singularity, you will probably need to move the directory where singularity caches downloaded layers because, by default, it uses your home directory, which probably doesn't have enough space. One option, if you are also using ldmx, is to put the cache in the same place as the ldmx cache: dbgen cache ${LDMX_BASE}/.singularity.
  • dbgen work /scratch/dir : the working directory where intermediate files will be written. It needs to be large enough to hold a copy of MadGraph (>1GB). On laptops, the default /tmp directory is probably fine but this will probably need to be changed on clusters. For example, at SLAC you will want dbgen work /scratch/$USER.
  • dbgen dest /path/to/destination : set where you would like the generated library to be put. By default, it is wherever you execute dbgen run, but you may want the output directory to be somewhere else with more space.

You can view all of the runtime options (and test that the environment is set up reasonably) by running the container and asking for the usage information.

dbgen run --help

The defaults for the runtime options align pretty well with the LDMX signal use case, so let's just run it with defaults and obtain a library to use later. This usually takes a few minutes but may be faster or slower depending on the computer you are using.

dbgen run

Now we have a new directory created in the current directory (or wherever you set dest to be) that has several LHE files within it. This directory of LHE files is the reference library we can give to the Geant4 simulation.
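
As a quick sanity check before handing the library to the simulation, its contents can be listed from Python. This is only a minimal sketch: list_lhe_files is a hypothetical helper, and it assumes the LHE files end in .lhe or .lhe.gz, which may not match the naming used by your version of dark-brem-lib-gen.

```python
from pathlib import Path


def list_lhe_files(library_dir):
    """Return the sorted LHE files found directly inside library_dir.

    Hypothetical helper for illustration only.
    """
    lib = Path(library_dir)
    if not lib.is_dir():
        raise NotADirectoryError(f"{library_dir} is not a directory")
    # assumption: LHE files carry a .lhe suffix, optionally gzip-compressed
    return sorted(
        p for p in lib.iterdir()
        if p.is_file() and p.name.endswith((".lhe", ".lhe.gz"))
    )
```

An empty list would suggest that the run did not finish or that dest points somewhere other than where you are looking.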

Simulating Dark Brem in ldmx-sw

We keep a few files within ldmx-sw/Biasing related to dark brem simulation which will be helpful places to reference as examples.

  • ldmx-sw/Biasing/test/target_db.py is a config which runs dark brem simulation in the target using the example reference library shipped with G4DarkBreM.
  • ldmx-sw/Biasing/python/target.py is a configuration module with the dark_brem function which configures a simulation to do dark brem within the target. This function is what we will unpack below to explain the various pieces configuring the simulation.

The dark_brem function linked above has four "parts" that do a few different tasks. We will walk through them in order.

Base Construction

The first few lines of the function merely construct the simulation with the desired detector, a standard electron beam generator and the beam-spot smearing we expect to see in the beam.

sim = simulator.simulator( "target_dark_brem_" + str(ap_mass) + "_MeV" )
sim.description = "One e- fired far upstream with Dark Brem turned on and biased up in target"
sim.setDetector( detector , True )
sim.generators.append( generators.single_4gev_e_upstream_tagger() )
sim.beamSpotSmear = [ 20., 80., 0. ] #mm

This stuff will remain pretty similar across many different types of samples used for physics analyses.

Model Configuration

The next section configures the dark brem model and activates it so the simulation will allow dark brem to happen.

from LDMX.SimCore import dark_brem
db_model = dark_brem.G4DarkBreMModel(lhe)
db_model.threshold = 2. #GeV - minimum energy electron needs to have to dark brem
db_model.epsilon   = 0.01 #decrease epsilon from one to help with Geant4 biasing calculations
sim.dark_brem.activate( ap_mass , db_model )

This is where we pass the reference library lhe and the dark photon mass ap_mass to the simulation so that it is capable of simulating dark brem in the way we desire. Even though dark brem is activated, it still may be overwhelmed by other processes that are more likely to occur so we need further configuration.

Biasing

First, we need to artificially increase the cross section of the dark brem by a biasing factor so that it is more likely to happen. We do this inside of the target because that is where we want these interactions to occur.

sim.biasing_operators = [
    bias_operators.DarkBrem.target(sim.dark_brem.ap_mass**2 / db_model.epsilon**2)
]
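
To get a feel for the size of this factor, the expression passed to the biasing operator can be evaluated on its own. The sketch below only reproduces the arithmetic ap_mass**2 / epsilon**2 from the snippet above; dark_brem_bias_factor is a hypothetical name, and the mass is assumed to be in MeV as suggested by the simulator name.

```python
def dark_brem_bias_factor(ap_mass_mev, epsilon):
    """Evaluate ap_mass**2 / epsilon**2, the factor used in the snippet above.

    Hypothetical helper for illustration; ap_mass_mev is assumed to be in MeV.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return ap_mass_mev**2 / epsilon**2


# print the factor for a few example mass points with epsilon = 0.01
for mass in (1.0, 10.0, 100.0, 1000.0):
    factor = dark_brem_bias_factor(mass, 0.01)
    print(f"ap_mass = {mass:6.0f} MeV -> bias factor = {factor:.1e}")
```

With epsilon = 0.01, a 10 MeV dark photon gives a factor of 10^6, and the factor grows with the square of the mass.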

Filtering

Biasing is not perfect however and we don't want to bias too much because then there wouldn't be a realistically flat position distribution of where the dark brem occurred within the target. For these reasons, we also need to attach filters to the simulation so that events are only kept if our specific requirements are met. Below, we have a few filtering requirements and we make sure that the products of the dark brem are kept no matter what.

sim.actions.extend([
    #make sure electron reaches target with 3.5GeV
    filters.TaggerVetoFilter(3500.),
    #make sure dark brem occurs in the target where A' has at least 2GeV
    filters.TargetDarkBremFilter(2000.),
    #keep all products of dark brem (A' and recoil electron)
    util.TrackProcessFilter.dark_brem()
])

That's it! After configuring the simulation in this way, the events produced will have a dark brem interaction occurring within the target region with a dark photon of at least 2GeV of energy. One can then add this simulation to the processing sequence and add other emulation and reconstruction processes after it to study the sample in more detail.
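
The combined effect of the two energy requirements can be written as a toy predicate. This is not how the ldmx-sw filters are implemented; keep_event and its arguments are hypothetical, and the sketch only restates the thresholds from the snippet above (in MeV) to show which events would survive.

```python
# Toy model only: the thresholds mirror the arguments in the snippet above,
# but this function is NOT the ldmx-sw filter implementation.
TAGGER_MIN_ELECTRON_ENERGY_MEV = 3500.0
TARGET_MIN_AP_ENERGY_MEV = 2000.0


def keep_event(electron_energy_at_target_mev, ap_energy_mev, ap_born_in_target):
    """Return True if a (hypothetical) event passes both energy requirements."""
    # the beam electron must still carry at least 3.5 GeV when reaching the target
    if electron_energy_at_target_mev < TAGGER_MIN_ELECTRON_ENERGY_MEV:
        return False
    # the dark brem must have happened inside the target
    if not ap_born_in_target:
        return False
    # the dark photon must carry at least 2 GeV
    return ap_energy_mev >= TARGET_MIN_AP_ENERGY_MEV
```

For example, an electron arriving with 3900 MeV that radiates a 2500 MeV dark photon in the target would be kept, while the same interaction occurring outside the target would not.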

Batch Running

Suppose you've gotten pretty familiar with the signal samples by generating single files for each of the different mass points you wish to study, but now you want to scale up this analysis to larger samples so you can more precisely study how your analysis affects the signal distributions. This is where batch running comes in! Below, I've copied a bash script I've used at UMN to generate large signal samples. It avoids the use of dark-brem-lib-gen's env.sh script as well as ldmx-sw's ldmx-env.sh script by writing container-running commands manually.

You may notice that there are actually three steps in this script and not only two. Since the LHE files generated for the reference library are often only used for the reference library, we can "extract" them into a single CSV file which is about ten times smaller than all of the LHE files, saving space by throwing away information that is not needed by G4DarkBreM. Doing this helps save the space required for large samples but retains the ability to re-simulate without having to re-generate the library.

At UMN, our batch scheduler HTCondor copies any files written to the working area to the output directory for us, so no copying is done in this script. Additionally, we instruct HTCondor to copy our desired config.py into the working area before this script is executed. Your batch system may work differently, in which case you would need to modify the script. This script expects two environment variables to be defined: DBGEN_IMAGE, which is the full path to a SIF holding a version of dark-brem-lib-gen, and FIRE_IMAGE, which is the full path to an ldmx-sw production SIF that can run the config.py script. The command line arguments are then the dark photon mass and the run number to use for random seeding.

#!/bin/bash
# bash script to run full pipeline of signal simulation

# error out if any of the commands we run return non-zero status
set -o errexit
# print every command that is deduced by bash so the batch logs can show us
# exactly what was run
set -o xtrace

__generate_library__() {
  mkdir -p dbgen-scratch
  apptainer run \
    --no-home \
    --cleanenv \
    --bind $(pwd):/output,$(realpath dbgen-scratch):/working \
    ${DBGEN_IMAGE} \
    --target tungsten silicon copper oxygen \
    --apmass ${1} \
    --run ${2}
  return $?
}

__extract_library__() {
  # deduce library path
  local lib=$(echo electron_*_run_*)
  [ -d "${lib}" ] || return 1
  apptainer run \
    --no-home \
    --cleanenv \
    --env LDMX_BASE=$(pwd) \
    --bind $(pwd) \
    ${FIRE_IMAGE} \
    . g4db-extract-library ${lib}
  lib=$(echo *.csv)
  [ -f "${lib}" ] || return 1
  gzip "${lib}"
  return $?
}

__detector_sim__() {
  local lib=$(echo *.csv.gz)
  [ -f "${lib}" ] || return 1
  apptainer run \
    --no-home \
    --cleanenv \
    --env LDMX_BASE=$(pwd) \
    --bind $(pwd) \
    ${FIRE_IMAGE} \
    . fire config.py ${lib}
  return $?
}

__main__() {
  local mass=${1}
  local run_number=${2}
  __generate_library__ ${mass} ${run_number} || return $?
  __extract_library__ || return $?
  __detector_sim__
  return $?
}

__main__ "$@"
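
The script above hands the compressed library path to config.py as its only argument. A minimal sketch of how config.py might read that argument is shown here, assuming the arguments after config.py appear in sys.argv as they would for an ordinary Python script. parse_config_args is a hypothetical name, and the actual simulation setup (for example, passing the library on to the dark_brem function) is left out.

```python
import argparse
from pathlib import Path


def parse_config_args(argv=None):
    """Parse the arguments given to config.py (hypothetical helper).

    argv=None makes argparse fall back to sys.argv[1:].
    """
    parser = argparse.ArgumentParser(
        description="signal simulation config (illustrative sketch)"
    )
    # the batch script passes the *.csv.gz library as the single argument
    parser.add_argument(
        "library",
        type=Path,
        help="path to the dark brem reference library",
    )
    return parser.parse_args(argv)
```

Inside config.py, the result of parse_config_args().library would then be the path to give to the simulation as the reference library.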