What the F.A.Q.?
Frequent issues and potholes that new collaborators run into while starting their work with LDMX software.
fire: command not found
This error can arise pretty easily, but luckily it has a pretty easy solution.
The current entrypoint script for the development environment (ldmx/dev images) assumes that ldmx-sw is installed at ${LDMX_BASE}/ldmx-sw/install, so if any of the following are true, you will see the fire: command not found error.
- LDMX_BASE is not the full path to the directory containing ldmx-sw.
- The directory with the ldmx-sw source is not named ldmx-sw.
- The user configured ldmx-sw to be installed somewhere else with -DCMAKE_INSTALL_PREFIX=....
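A quick way to rule out the first two cases is to check the expected path from the host. This is a minimal sketch; it assumes the default layout in which the fire executable lands in install/bin.
# run on the host after setting LDMX_BASE (e.g. via ldmx-env.sh or denv)
echo "LDMX_BASE = ${LDMX_BASE}"              # should be the full path to the directory containing ldmx-sw
ls "${LDMX_BASE}/ldmx-sw/install/bin/fire"   # the entrypoint expects the ldmx-sw install here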
The ldmx/pro images (short for "production") already have a copy of ldmx-sw compiled and installed into them, so they do not have this path resolution requirement in order to access the fire command.
I say "current" because there is some discussion about re-designing the container interaction to avoid such heavy reliance on the entrypoint in the image. Avoiding this reliance would make it easier for users to switch between images, but it would require us to learn a slightly new interaction workflow.
Compile error: ap_fixed.h: No such file or directory.
As with other errors of this type, this originates in a bad interaction between our repository and one of its submodules. Specifically, ap_fixed.h is within the Trigger/HLS_arbitrary_Precision_Types submodule, so if the files within that directory are not present, you need to make sure the submodules are up to date.
git submodule update --recursive --init && echo "success"
This command should succeed; the && echo "success" is added so that the last line printed is success if everything finished properly.
One issue that has been observed is that this command fails rather quietly after a lot of printouts. Specifically, on macOS, there was an issue where git lfs was not found, so the acts submodule OpenDataDetector failed to check out and issued a fatal error. This prevented all of the submodules from being updated and thus caused the missing file.
Make sure git lfs is installed by running that command and checking that it prints a help message instead of an error like 'lfs' is not a git command. If you see that error, you need to install git lfs.
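For example, the check can look like this:
# should print the git-lfs usage/help message
git lfs
# if it instead prints "'lfs' is not a git command", install git lfs
# (e.g. with your package manager) and re-run the submodule update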
CMake error: does not contain a CMakeLists.txt file.
The full text of this error looks like (for example)
CMake Error at Tracking/CMakeLists.txt:61 (add_subdirectory):
The source directory
/full/path/to/ldmx/ldmx-sw/Tracking/acts
does not contain a CMakeLists.txt file.
We make wide use of submodules in git so that our software can be separated into packages that evolve and are developed at different rates. The error summarized in this title most often appears when a submodule has not been properly initialized and is therefore missing its source code (its CMakeLists.txt file is the first file used during the cmake step). Resolving this can be done with
git submodule update --init --recursive
This command does the following:
- submodule update instructs git to switch the submodules to the commits stored in the parent repository.
- The --init flag tells git to do an initial clone of a submodule if it hasn't yet.
- The --recursive flag tells git to recursively apply these submodule updates in case submodule repos have their own submodules (which they do in our case).
You can avoid running this submodule update command if you use --recurse-submodules during your initial clone and any pulls you do.
# --recursive is the flag used with older git versions
git clone --recurse-submodules git@github.com:LDMX-Software/ldmx-sw.git
git pull --recurse-submodules
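If you find yourself forgetting these flags, git also has a configuration option that makes most submodule-aware commands (like pull and checkout, though not the initial clone) recurse automatically. This is just a convenience, not a requirement:
# opt in to --recurse-submodules by default for commands that support it
git config --global submodule.recurse true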
I highly recommend reading the linked page about git submodules. It is very helpful for understanding how to interact with them efficiently.
How can I find the version of the image I'm using?
Often it is helpful to determine an image's version. Sometimes this is as easy as looking at the tag provided by docker image ls or written into the SIF file name, but sometimes this information is lost. Since v4 of the container image, we've been more generous with adding labels to the image; a standard one that is included is org.opencontainers.image.version, which (for our purposes) stores the release that built the image.
We can use inspect
to find the labels stored within an image.
# docker inspect returns JSON with all the image manifest details
# jq just helps us parse this JSON for the specific thing we are looking for,
# but you could just scroll through the output of docker inspect
docker inspect ldmx/dev:latest | jq 'map(.Config.Labels["org.opencontainers.image.version"])[]'
# apptainer inspect (by default) returns just the list of labels
# so we can just use grep to select the line with the label we care about
apptainer inspect ldmx_dev_latest | grep org.opencontainers.image.version
If org.opencontainers.image.version is not an available label, then the image being used was built prior to v4 and a more complicated, investigative procedure is needed. In this case, you should find the "digest" of the image you are running and then look for that "digest" among the images available on DockerHub.
With docker (and podman), this is easy:
docker images --digests
It is more difficult (impossible in general?) with apptainer (and singularity) since some information is discarded while creating the SIF.
Running ROOT Macros from the Command Line
ROOT uses shell-reserved characters in order to mark command line arguments as specific inputs to the function it is parsing and running as a macro. This makes passing these arguments through our shell infrastructure difficult and one needs a hack to get around this. Outside of the container, one might execute a ROOT macro like
root 'my-macro.C("/path/to/input/file.root",72,42.0)'
Using the ldmx
command to run this in the container would look like
ldmx root <<EOF
.x my-macro.C("/path/to/input/file.root",72,42.0)
EOF
The context for this can be found on GitHub Issue #1169.
While this answer was originally written with ldmx
and its suite of bash functions,
the same solution should work with denv
as well.
denv root <<EOF
.x my-macro.C("/path/to/input/file.root",72,42.0)
EOF
Installing on SDF: No space left on the device
On SLAC SDF (and S3DF, I use the same name for both here) we will be using singularity, which is already installed.
This is a frustrating interaction between SDF's filesystem design and singularity. Long story short, singularity needs O(5GB) of disk space for intermediate files while it is downloading and building container images, and by default it uses /tmp/ to do that. /tmp/ on SDF is highly restricted. The solution is to set a few environment variables to tell singularity where to write its intermediate files.
# on SDF
export TMPDIR=/scratch/$USER
# on S3DF
export TMPDIR=${SCRATCH}
The ldmx-env.sh script is able to interface with both docker and singularity, so you should be able to use it after resolving this disk space issue.
How do I update the version of the container I am using?
The legacy answer below assumes you are using the ldmx program defined within scripts/ldmx-env.sh. If you are using denv, you can set the container image to use with denv config image. For example
denv config image ldmx/pro:v4.0.1 # use pre-compiled ldmx-sw v4.0.1
denv config image ldmx/dev:4.2.2 # use version 4.2.2 of image with ldmx-sw dependencies
denv config image pull # explicitly pull whatever image is configured (e.g. if latest)
Legacy Answer
We've added this command to the ldmx
command line program.
In general, it is safe to just update to the latest version.
ldmx pull dev latest
This explicitly pulls the latest version no matter what, so it will download the image even if it is the same as one already on your computer.
My display isn't working when using Windoze and WSL!
You probably are seeing an error that looks like:
$ ldmx rootbrowse
Error in <TGClient::TGClient>: can't open display ":0", switching to batch mode...
In case you run from a remote ssh session, reconnect with ssh -Y
Press enter to exit.
The solution to this is two-fold.
- Install an X11-server on Windoze (outside WSL) so that the display programs inside WSL have something to send their information to. X410 is a common one, but there are many options.
- Make sure your environment script (scripts/ldmx-env.sh) has updates to the environment allowing you to connect directly to the server. These updates are on PR #997 and could be manually copied if you aren't yet comfortable with git. A rough sketch of the idea is shown below.
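For illustration only, the kind of update the script needs usually amounts to pointing DISPLAY at the Windows host where the X11 server is listening. This is a sketch of one common WSL2 approach and is not necessarily identical to what the PR does:
# inside WSL2: the Windows host IP is the nameserver listed in /etc/resolv.conf
export DISPLAY=$(grep nameserver /etc/resolv.conf | awk '{print $2}'):0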
Is it okay for me to use tools outside the container? I have been informed to only use the container when working on LDMX.
The container is focused on being a well-defined and well-tested environment for ldmx-sw, but it may not be able to accommodate every aspect of your workflow. For example, my specific workflow can be outlined with the following commands (not a tutorial, just an outline for explanatory purposes):
ldmx cmake .. # configure ldmx-sw build
ldmx make install # build and install ldmx-sw
ldmx fire sim.py # run a simulation
rootbrowse sim_output.root # look at simulation file with local ROOT build **outside the container**
ldmx fire analysis.py sim_output.root # analyze the simulation file
python pretty_plotter.py histogram.root # make PDFs from ROOT histograms generated by analysis.py
As you can see, I switch between inside the container and outside the container, and I can do this because I have an install of ROOT outside the container. Key point: for what I am doing, which is simply looking at ROOT files and plotting ROOT histograms, my ROOT install does not need to match the ROOT inside the container. I think using both the container and a local installation of ROOT is a very powerful combination. It allows you to use ROOT's TBrowser or TTreeViewer to debug files while using the container to actually build and run ldmx-sw. Another workflow that follows a similar protocol is a python-based analysis.
ldmx cmake .. # configure ldmx-sw build
ldmx make install # build and install ldmx-sw
ldmx fire sim.py # run a simulation
rootbrowse sim_output.root # look at simulation file with local ROOT build **outside the container**
ldmx python analysis.py sim_output.root # analyze the simulation file and produce PDF/PNG images of histograms
Notice that the analysis.py
script, even though it is not being run through fire, is still run inside of the container. This is because the python-based analysis needs access to the ROOT dictionary for ldmx-sw which was generated by the container. This is the only reason for analysis.py to be run inside the container. If the sim_output.root
file only has (for example) ntuples in it, then you could run your analysis outside the container because you don't need access to the ldmx-sw dictionary.
python analysis.py sim_output.root # this analysis uses the local ROOT install and does not need the ldmx-sw ROOT dictionary
Conclusion: the most general workflow still uses an installation of ROOT outside of the container. You do not need installations of the other ldmx-sw dependencies, and your installation of ROOT does not need to match the one inside of the container.
On macOS, I get an 'invalid developer path' error when trying to use git.
On recent versions of the mac operating system, you need to re-install the command line tools. This stackexchange question outlines a simple solution.
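In most cases the fix amounts to re-running the command line tools installer:
# re-install the Xcode command line tools (opens a graphical prompt)
xcode-select --install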
On macOS, I get a failure when I try to run the docker containers.
The default shell on macOS is not bash (older versions used tcsh, recent versions use zsh), but the environment setup script assumes that you are using bash. You should look at this article for how to change your default shell to bash so that the ldmx environment script works.
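For reference, changing the login shell is a one-line command; the linked article covers the details:
# set bash as your login shell (you will be prompted for your password)
chsh -s /bin/bash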
On macOS, I get an error when trying to use tab completion.
This error looks like
ldmx <tab>
-bash: compopt: command not found
where <tab> stands for you pressing the tab key.
This error arises because the default version of bash shipped with macOS is pretty old; for example, macOS Monterey ships with bash version 3.2 (the current version is 5+!). You can get around this issue by simply updating the version of bash you are using. Since the shell is pretty low-level, you will unfortunately need to also manually override the default bash.
- Install a newer version of bash. This command uses homebrew to install bash.
brew install bash
- Add the newer bash to the list of shells you can use.
# pipe through sudo tee so the append to /etc/shells happens with root permissions
echo /usr/local/bin/bash | sudo tee -a /etc/shells
- Change your default shell to the newer version
chsh -s /usr/local/bin/bash $USER
You can double check your current shell's bash version by running echo $BASH_VERSION.
This answer was based on a helpful stackexchange answer.
I can't compile and I get a 'No such file or directory' error.
LDMX software is broken up into several different git
repositories that are then pulled into a central repository ldmx-sw as "submodules".
In order for you to get the code in the submodules, you need to tell git
that you want it. There are two simple ways to do this.
- You can re-clone the repository, but this time using the --recursive flag.
git clone --recursive https://github.com/LDMX-Software/Framework.git
- You can download the submodules after cloning ldmx-sw.
git submodule update --init --recursive
Software complains about not being able to create a processor. (UnableToCreate)
This is almost always due to the library that contains that processor not being loaded. We are trying to have the libraries automatically included by the python templates, but if you have found a python template that doesn't include the library you can do the following in the meantime.
# in your python configuration script
# p is a ldmxcfg.Process object you created earlier
p.libraries.append( 'libMODULE.so' )
# MODULE is the module that your processor is in (for example: Ecal, EventProc, or Analysis)
Software complains about linking a library. (LibraryLoadFailure)
Most of the time, this is caused by the library not being found in the common search paths. In order to make sure that the library can be found, you should give the full path to it.
# in your python configuration script
# p is a ldmxcfg.Process object you created earlier
p.libraries.append( '<full-path-to-ldmx-sw-install>/lib/libMODULE.so' )
# MODULE is the module that your processor is in (for example: Ecal, EventProc, or Analysis)
ROOT version mismatch error.
This error looks something like this:
Error in <TUnixSystem::Load>: version mismatch, <full-path>/ldmx-sw/install/lib/libEvent.so = 62000, ROOT = 62004
This is caused by using one version of ROOT to compile ldmx-sw and another one to use ldmx-sw, and most people run into this issue when compiling ldmx-sw inside of the container and then using ldmx-sw outside of the container. The fix to this is simple: always compile and use ldmx-sw inside of the container. This makes sure that the dependencies never change.
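If you want to confirm which ROOT you are picking up in each environment, you can compare the versions directly. This is a sketch that assumes your ldmx wrapper passes simple arguments straight through to the container (plain flags like this do not need the heredoc trick described earlier):
ldmx root-config --version   # ROOT inside the container (the one that compiled ldmx-sw)
root-config --version        # your local ROOT outside the container, if you have one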
If you need some dependency that isn't in your version of the container, please file an issue with the docker repository to request what's called a "derivative" container.
Parameter 'className' does not exist in list of parameters.
This error usually occurs when you pass an object that isn't an EventProcessor as an EventProcessor to the Process sequence.
A feature that is helpful for debugging your configuration file is the Process.pause
function:
# at the end of your config.py
p.pause() #prints your Process configuration and waits for you to press enter before continuing
ldmx-sw cannot find the contents of a file when I know that the file exists
Remember that you are running inside of the container, so the file you want to give to the software must be accessible by the container.
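As a quick sanity check, you can try listing the file from inside the container; the path below is just a placeholder, and this assumes your ldmx (or denv) wrapper runs arbitrary commands in the container as it does for cmake and root above:
# if this fails, the container cannot see the file even though it exists on the host
ldmx ls /full/path/to/your/file.root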
Some python features that are helpful for making sure the container can find your file are os.path.abspath to get the full path to a file and os.path.isfile to check that the file exists.