diff --git a/_config.yml b/_config.yml
index e8d125f..a1ed92e 100644
--- a/_config.yml
+++ b/_config.yml
@@ -11,7 +11,7 @@
carpentry: "dune"
# Overall title for pages.
-title: "Computing Basics for DUNE - Late 2024 edition"
+title: "Computing Basics for DUNE - Revised 2025 edition"
# Life cycle stage of the lesson
# See this page for more details: https://cdh.carpentries.org/the-lesson-life-cycle.html
diff --git a/_episodes/04-Spack.md b/_episodes/04-Spack.md
new file mode 100644
index 0000000..a83bb4c
--- /dev/null
+++ b/_episodes/04-Spack.md
@@ -0,0 +1,117 @@
+---
+title: The new Spack code management system
+teaching: 15
+exercises: 5
+questions:
+- How are different software versions handled?
+objectives:
+- Understand the role of Spack
+keypoints:
+- Spack is a tool to deliver well-defined software configurations
+- CVMFS (the CernVM File System) distributes software and related files over the network, without installing them on the target computer.
+---
+## What is Spack and why do we need it?
+
+An important requirement for producing valid physics results is computational reproducibility: you need to be able to repeat the same calculations on the data and MC and get the same answers every time. You may be asked to produce a slightly different version of a plot, for example, and the data that goes into it has to be the same every time you run the program.
+
+This requirement is in tension with a rapidly-developing software environment, where many collaborators are constantly improving software and adding new features. We therefore require strict version control; the workflows must be stable and not constantly changing due to updates.
+
+DUNE must provide installed binaries and associated files for every version of the software that anyone could be using, and users must specify which version they want before they run it. All software dependencies must be set up with consistent versions so that the whole stack runs, and runs reproducibly.
+
+Spack is the tool that handles this software setup and version management.
+
+## Minimal Spack for ROOT analysis and file access
+
+You can get Spack going with our minimal implementation:
+
+~~~
+. /cvmfs/dune.opensciencegrid.org/dune-spack/spack-develop-fermi/setup-env.sh
+spack env activate dune-tutorial
+~~~
+{: .language-bash}
+
+This sets up the file and job management packages (MetaCat, Rucio, justIN) and a version of ROOT that can do streaming transfers. It is useful for end-stage ntuple analysis.
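+
+To check that the tutorial environment is actually active, you can ask Spack directly (`spack env status` is a standard Spack command that prints the currently activated environment, if any):
+
+~~~
+spack env status
+~~~
+{: .language-bash}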
+
+A full version that includes LArSoft is in the works.
+
+You can list what is available in that environment via:
+
+~~~
+spack find
+~~~
+{: .language-bash}
+
+The output looks something like this:
+
+~~~
+-- linux-almalinux9-x86_64_v2 / %c,cxx=gcc@12.5.0 ---------------
+cmake@3.31.8 libffi@3.4.8 openssl@3.3.3 root@6.28.12
+davix@0.8.10 libjpeg-turbo@3.0.4 patchelf@0.17.2 rust@1.85.0
+ftgl@2.4.0 libpng@1.6.47 pcre@8.45 unuran@1.8.1
+ifdhc@2.8.0 lz4@1.10.0 postgresql@15.8 vdt@0.4.6
+ifdhc-config@2.8.0 ninja@1.13.0 python@3.9.15 xrootd@5.6.9
+intel-tbb-oneapi@2021.9.0 nlohmann-json@3.11.3 re2c@3.1 xxhash@0.8.3
+~~~
+{: .output}
+
+This particular environment loads pinned, well-defined versions of the packages.
+
+Activating the environment also sets `PYTHONPATH`, which determines where Python modules will be loaded from.
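+
+You can inspect that search path with ordinary shell commands (nothing Spack-specific here; `tr` just prints one directory per line):
+
+~~~
+echo "$PYTHONPATH" | tr ':' '\n'
+~~~
+{: .language-bash}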
+
+Try
+
+~~~
+which root
+~~~
+{: .language-bash}
+
+to see which version of ROOT Spack sets up.
+
+
+### Spack basic commands
+
+| Command | Action |
+|------------------------------------------------|------------------------------------------------------------------|
+| `spack list` | List all packages Spack knows how to build |
+| `spack find` | List the packages installed in the active environment |
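+
+A few variations may also be handy (these are standard Spack options; run `spack find --help` for the full list):
+
+~~~
+spack find -p root     # also show the installation path
+spack find -l          # include the short hash identifying each build
+spack list 'py-*'      # list the python packages spack knows how to build
+~~~
+{: .language-bash}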
+
+
+
+
+
diff --git a/_episodes/04-intro-art-larsoft.md b/_episodes/04-intro-art-larsoft.md
deleted file mode 100644
index 3e21257..0000000
--- a/_episodes/04-intro-art-larsoft.md
+++ /dev/null
@@ -1,698 +0,0 @@
----
-title: Introduction to art and LArSoft (2024 - Apptainer version)
-teaching: 50
-exercises: 0
-questions:
-- Why do we need a complicated software framework? Can't I just write standalone code?
-objectives:
-- Learn what services the *art* framework provides.
-- Learn how the LArSoft tookit is organized and how to use it.
-keypoints:
-- Art provides the tools physicists in a large collaboration need in order to contribute software to a large, shared effort without getting in each others' way.
-- Art helps us keep track of our data and job configuration, reducing the chances of producing mystery data that no one knows where it came from.
-- LArSoft is a set of simulation and reconstruction tools shared among the liquid-argon TPC collaborations.
----
-
-#### Session Video
-
-The session video on December 10, 2025 was captured for your asynchronous review.
-
-
-
-
-
-
-
-
-
-
-
-## Advertisement -- February 2025 LArSoft workshop at CERN
-
-[https://indico.cern.ch/event/1461779/overview](https://indico.cern.ch/event/1461779/overview)
-
-This page is protected by a password. Dom Brailsford sent this password in an e-mail to the DUNE Collaboration on November 6, 2024.
-
-## Introduction to *art*
-
-*Art* is the framework used for the offline software used to process LArTPC data from the far detector and the ProtoDUNEs. It was chosen not only because of the features it provides, but also because it allows DUNE to use and share algorithms developed for other LArTPC experiments, such as ArgoNeuT, LArIAT, MicroBooNE and ICARUS. The section below describes LArSoft, a shared software toolkit. Art is also used by the NOvA and mu2e experiments. The primary language for *art* and experiment-specific plug-ins is C++.
-
-The *art* wiki page is here: [https://cdcvs.fnal.gov/redmine/projects/art/wiki][art-wiki]. It contains important information on command-line utilities, how to configure an *art* job, how to define, read in and write out data products, how and when to use *art* modules, services, and tools.
-
-*Art* features:
-
-1. Defines the event loop
-2. Manages event data storage memory and prevents unintended overwrites
-3. Input file interface -- allows ganging together input files
-4. Schedules module execution
-5. Defines a standard way to store data products in *art*-formatted ROOT files
-6. Defines a format for associations between data products (for example, tracks have hits, and associations between tracks and hits can be made via art's association mechanism.
-7. Provides a uniform job configuration interface
-8. Stores job configuration information in *art*-formatted root files.
-9. Output file control -- lets you define output filenames based on parts of the input filename.
-10. Message handling
-11. Random number control
-12. Exception handling
-
-The configuration storage is particularly useful if you receive a data file from a colleague, or find one in a data repository and you want to know more about how it was produced, with what settings.
-
-### Getting set up to try the tools
-
-Log in to a `dunegpvm*.fnal.gov` machine and set up your environment (This script is defined in Exercise 5 of https://dune.github.io/computing-training-basics/setup.html)
-
-> ## Note
-> For now do this in the Apptainer. Due to the need to set up the container separately on the build nodes and the gpvms due to /pnfs mounts being different, and the need to keep your environment clean for use on other experiments, it is best to define aliases in your .profile or .bashrc or other login script you use to define aliases. A set of convenient aliases is
-{: .challenge}
-
-~~~
-alias dunesl7="/cvmfs/oasis.opensciencegrid.org/mis/apptainer/current/bin/apptainer shell --shell=/bin/bash -B /cvmfs,/exp,/nashome,/pnfs/dune,/opt,/run/user,/etc/hostname,/etc/hosts,/etc/krb5.conf --ipc --pid /cvmfs/singularity.opensciencegrid.org/fermilab/fnal-dev-sl7:latest"
-
-alias dunesl7build="/cvmfs/oasis.opensciencegrid.org/mis/apptainer/current/bin/apptainer shell --shell=/bin/bash -B /cvmfs,/exp,/build,/nashome,/opt,/run/user,/etc/hostname,/etc/hosts,/etc/krb5.conf --ipc --pid /cvmfs/singularity.opensciencegrid.org/fermilab/fnal-dev-sl7:latest"
-
-alias dunesetups="source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh"
-~~~
-{: .language-bash}
-
-Then you can use the appropriate alias to start the SL7 container on either the build node or the gpvms. Starting a container gives you a very bare environment -- it does not source your .profile for you; you have to do that yourself. The examples below assume you put the aliases above in your .profile or in a script sourced by your .profile. I always set the prompt variable PS1 in my profile so I can tell that I've sourced it.
-
-~~~
-PS1="<`hostname`> "; export PS1
-~~~
-{: .language-bash}
-
-Then when you log in, you can type these commands to set up your environment in a container:
-~~~
-dunesl7
-source .profile
-dunesetups
-
-export DUNELAR_VERSION=v10_00_04d00
-export DUNELAR_QUALIFIER=e26:prof
-setup dunesw $DUNELAR_VERSION -q $DUNELAR_QUALIFIER
-
-setup_fnal_security
-~~~
-{: .language-bash}
-
-~~~
-# define a sample file
-export SAMPLE_FILE=root://fndcadoor.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/physics/full-reconstructed/2023/mc/out1/MC_Winter2023_RITM1592444_reReco/54/05/35/65/NNBarAtm_hA_BR_dune10kt_1x2x6_54053565_607_20220331T192335Z_gen_g4_detsim_reco_65751406_0_20230125T150414Z_reReco.root
-~~~
-{: .language-bash}
-
-The examples below will refer to files in `dCache` at Fermilab which can best be accessed via `xrootd`.
-
-**For those with no access to Fermilab computing resources but with a CERN account:**
-Copies are stored in `/afs/cern.ch/work/t/tjunk/public/jan2023tutorialfiles/`.
-
-The follow-up of this tutorial provides help on how to find data and MC files in storage.
-
-You can list available versions of `dunesw` installed in `CVMFS` with this command:
-
-~~~
-ups list -aK+ dunesw
-~~~
-{: .language-bash}
-
-The output is not sorted, although portions of it may look sorted. Do not depend on it being sorted. The string indicating the version is called the version tag (v09_72_01d00 here). The qualifiers are e26 and prof. Qualifiers can be entered in any order and are separated by colons. "e26" corresponds to a specific version of the GNU compiler -- v9.3.0. We also compile with `clang` -- the compiler qualifier for that is "c7".
-
-"prof" means "compiled with optimizations turned on." "debug" means "compiled with optimizations turned off". More information on qualifiers is [here][about-qualifiers].
-
-In addition to the version and qualifiers, `UPS` products have "flavors". This refers to the operating system type and version. Older versions of DUNE software supported `SL6` and some versions of macOS. Currently only SL7 and the compatible CentOS 7 are supported. The flavor of a product is automatically selected to match your current operating system when you set up a product. If a product does not have a compatible flavor, you will get an error message. "Unflavored" products are ones that do not depend on the operating-system libraries. They are listed with a flavor of "NULL".
-
-There is a setup command provided by the operating system -- you usually don't want to use it (at least not when developing DUNE software). If you haven't yet sourced the `setup_dune.sh` script in `CVMFS` above but type `setup xyz` anyway, you will get the system setup command, which will ask you for the root password. Just `control-C` out of it, source the `setup_dune.sh` script, and try again. On AL9 and the SL7 container, there is no system setup command so you will get "command not found" if you haven't yet set up UPS.
-
-UPS's setup command (find out where it lives with this command):
-
-~~~
-type setup
-~~~
-{: .language-bash}
-
-will not only set up the product you specify (in the instructions above, dunesw), but also all dependent products with corresponding versions so that you get a consistent software environment. You can get a list of everything that's set up with this command
-
-~~~
- ups active
-~~~
-{: .language-bash}
-
-It is often useful to pipe the output through grep to find a particular product.
-
-~~~
- ups active | grep geant4
-~~~
-{: .language-bash}
-
-for example, to see what version of geant4 you have set up.
-
-### *Art* command-line tools
-
-All of these command-line tools have online help. Invoke the help feature with the `--help` command-line option. Example:
-
-~~~
-config_dumper --help
-~~~
-{: .language-bash}
-
-Docmentation on art command-line tools is available on the [art wiki page][art-wiki].
-
-#### config_dumper
-
-Configuration information for a file can be printed with config_dumper.
-
-~~~
-config_dumper -P
-~~~
-{: .language-bash}
-
-Try it out:
-~~~
-config_dumper -P $SAMPLE_FILE
-~~~
-{: .language-bash}
-
-The output is an executable `fcl` file, sent to stdout. We recommend redirecting the output to a file that you can look at in a text editor:
-
-Try it out:
-~~~
-config_dumper -P $SAMPLE_FILE > tmp.fcl
-~~~
-{: .language-bash}
-
-Your shell may be configured with `noclobber`, meaning that if you already have a file called `tmp.fcl`, the shell will refuse to overwrite it. Just `rm tmp.fcl` and try again.
-
-The `-P` option to `config_dumper` is needed to tell `config_dumper` to print out all processing configuration `fcl` parameters. The default behavior of `config_dumper` prints out only a subset of the configuration parameters, and is most notably missing art services configuration.
-
-
-> ## Quiz
->
-> Quiz questions from the output of the above run of `config_dumper`:
->
-> 1. What generators were used? What physics processes are simulated in this file?
-> 2. What geometry is used? (hint: look for "GDML" or "gdml")
-> 3. What electron lifetime was assumed?
-> 4. What is the readout window size?
->
-{: .solution}
-
-
-#### fhicl-dump
-
-You can parse a `FCL` file with `fhicl-dump`.
-
-Try it out:
-~~~
-fhicl-dump protoDUNE_refactored_g4_stage2.fcl
-~~~
-{: .language-bash}
-
-See the section below on `FCL` files for more information on what you're looking at.
-
-#### count_events
-
-Try it out:
-~~~
-count_events $SAMPLE_FILE
-~~~
-{: .language-bash}
-
-
-#### product_sizes_dumper
-
-You can get a peek at what's inside an *art*ROOT file with `product_sizes_dumper`.
-
-Try it out:
-~~~
-product_sizes_dumper -f 0 $SAMPLE_FILE
-~~~
-{: .language-bash}
-
-It is also useful to redirect the output of this command to a file so you can look at it with a text editor and search for items of interest. This command lists the sizes of the `TBranches` in the `Events TTree` in the *art*ROOT file. There is one `TBranch` per data product, and the name of the `TBranch` is the data product name, an "s" is appended (even if the plural of the data product name doesn't make sense with just an "s" on the end), an underscore, then the module label that made the data product, an underscore, the instance name, an underscore, and the process name and a period.
-
-
-Quiz questions, looking at the output from above.
-
-> ## Quiz
-> Questions:
-> 1. What is the name of the data product that takes up the most space in the file?
-> 2. What the module label for this data product?
-> 3. What is the module instance name for this data product? (This question is tricky. You have to count underscores here).
-> 4. How many different modules produced simb::MCTruth data products? What are their module labels?
-> 5. How many different modules produced recob::Hit data products? What are their module labels?
-{: .solution}
-
-You can open up an *art*ROOT file with `ROOT` and browse the `TTrees` in it with a `TBrowser`. Not all `TBranches` and leaves can be inspected easily this way, but enough can that it can save a lot of time programming if you just want to know something simple about a file such as whether it contains a particular data product and how many there are.
-
-Try it out
-~~~
-root $SAMPLE_FILE
-~~~
-{: .language-bash}
-
-then at the `root` prompt, type:
-~~~
-new TBrowser
-~~~
-{: .language-bash}
-
-This will be faster with `VNC`. Navigate to the `Events TTree` in the file that is automatically opened, navigate to the `TBranch` with the Argon 39 MCTruths (it's near the bottom), click on the branch icon `simb::MCTruths_ar39__SinglesGen.obj`, and click on the `NParticles()` leaf (It's near the bottom. Yes, it has a red exclamation point on it, but go ahead and click on it). How many events are there? How many 39Ar decays are there per event on average?
-
-Header files for many data products are in [lardataobj](https://github.com/larsoft/lardataobj) and some are in [nusimdata](https://github.com/NuSoftHEP/nusimdata).
-
-*Art* is not constrained to using `ROOT` files -- we use HDF5-formatted files for some purposes. ROOT has nice browsing features for inspecting ROOT-formatted files; Some HDF5 data visualiztion tools exist, but they assume that data are in particular formats. ROOT has the ability to display more general kinds of data (C++ classes), but it needs dictionaries for some of the more complicated ones.
-
-The *art* main executable program is a very short stub that interprets command-line options, reads in the configuration document (a `FHiCL` file which usually includes other `FHiCL` files), and loads shared libraries, initializes software components, and schedules execution of modules. Most code we are interested in is in the form of *art* plug-ins -- modules, services, and tools. The generic executable for invoking *art* is called `art`, but a LArSoft-customized one is called `lar`. No additional customization has yet been applied so in fact, the `lar` executable has identical functionality to the `art` executable.
-
-There is online help:
-
-~~~
- lar --help
-~~~
-{: .language-bash}
-
-All programs in the art suite have a `--help` command-line option.
-
-Most *art* job invocations take the form
-
-~~~
-lar -n -c fclfile.fcl artrootfile.root
-~~~
-{: .language-bash}
-
-where the input file specification is just on the command line without a command-line option. Explicit examples follow below. The `-n ` is optional -- it specifies the number of events to process. If omitted, or if `` is bigger than the number of events in the input file, the job processes all of the events in the input file. `-n ` is important for the generator stage. There's also a handy `--nskip ` argument if you'd like the job to start processing partway through the input file. You can steer the output with
-
-~~~
-lar -c fclfile.fcl artrootfile.root -o outputartrootfile.root -T outputhistofile.root
-~~~
-{: .language-bash}
-
-
-The `outputhistofile.root` file contains `ROOT` objects that have been declared with the `TFileService` service in user-supplied art plug-in code (i.e. your code).
-
-### Job configuration with FHiCL
-
-The Fermilab Hierarchical Configuration Language, FHiCL is described here [https://cdcvs.fnal.gov/redmine/documents/327][fhicl-described].
-
-FHiCL is **not** a Turing-complete language: you cannot write an executable program in it. It is meant to declare values for named parameters to steer job execution and adjust algorithm parameters (such as the electron lifetime in the simulation and reconstruction). Look at `.fcl` files in installed job directories, like `$DUNESW_DIR/fcl` for examples. `Fcl` files are sought in the directory seach path `FHICL_FILE_PATH` when art starts up and when `#include` statements are processed. A fully-expanded `fcl` file with all the #include statements executed is referred to as a fhicl "document".
-
-Parameters may be defined more than once. The last instance of a parameter definition wins out over previous ones. This makes for a common idiom in changing one or two parameters in a fhicl document. The generic pattern for making a short fcl file that modifies a parameter is:
-
-~~~
-#include "fcl_file_that_does_almost_what_I_want.fcl"
-block.subblock.parameter: new_value
-~~~
-{: .source}
-
-To see what block and subblock a parameter is in, use `fhcl-dump` on the parent fcl file and look for the curly brackets. You can also use
-
-~~~
-lar -c fclfile.fcl --debug-config tmp.txt --annotate
-~~~
-{: .language-bash}
-
-which is equivalent to `fhicl-dump` with the --annotate option and piping the output to tmp.txt.
-
-Entire blocks of parameters can be substituted in using `@local` and `@table` idioms. See the examples and documentation for guidance on how to use these. Generally they are defined in the PROLOG sections of fcl files. PROLOGs must precede all non-PROLOG definitions and if their symbols are not subsequently used they do not get put in the final job configuration document (that gets stored with the data and thus may bloat it). This is useful if there are many alternate configurations for some module and only one is chosen at a time.
-
-
-Try it out:
-~~~
-fhicl-dump protoDUNE_refactored_g4_stage2.fcl > tmp.txt
-~~~
-{: .language-bash}
-
-Look for the parameter `ModBoxA`. It is one of the Modified Box Model ionization parameters. See what block it is in. Here are the contents of a modified g4 stage 2 fcl file that modifies just that parameter:
-
-~~~
-#include "protoDUNE_refactored_g4_stage2.fcl"
-services.LArG4Parameters.ModBoxA: 7.7E-1
-~~~
-{: .source}
-
-> ## Exercise
-> Do a similar thing -- modify the stage 2 g4 fcl configuration to change the drift field from 486.7 V/cm to 500 V/cm. Hint -- you will find the drift field in an array of fields which also has the fields between wire planes listed.
-{: .challenge}
-
-
-### Types of Plug-Ins
-
-Plug-ins each have their own .so library which gets dynamically loaded by art when referenced by name in the fcl configuration.
-
-**Producer Modules**
-A producer module is a software component that writes data products to the event memory. It is characterized by produces<> and consumes<> statements in the class constructor, and `art::Event::put()` calls in the `produces()` method. A producer must produce the data product collection it says it produces, even if it is empty, or *art* will throw an exception at runtime. `art::Event::put()` transfers ownership of memory (use std::move so as not to copy the data) from the module to the *art* event memory. Data in the *art* event memory will be written to the output file unless output commands in the fcl file tell art not to do that. Documentation on output commands can be found in the LArSoft wiki [here][larsoft-rerun-part-job]. Producer modules have methods that are called on begin job, begin run, begin subrun, and on each event, as well as at the end of processing, so you can initialize counters or histograms, and finish up summaries at the end. Source code must be in files of the form: `modulename_module.cc`, where `modulename` does not have any underscores in it.
-
-**Analyzer Modules**
-Analyzer modules read data products from the event memory and produce histograms or TTrees, or other output. They are typically scheduled after the producer modules have been run. Producer modules have methods that are called on begin job, begin run, begin subrun, and on each event, as well as at the end of processing, so you can initialize counters or histograms, and finish up summaries at the end. Source code must be in files of the form: `modulename_module.cc`, where `modulename` does not have any underscores in it.
-
-**Source Modules**
-Source modules read data from input files and reformat it as need be, in order to put the data in *art* event data store. Most jobs use the art-provided RootInput source module which reads in art-formatted ROOT files. RootInput interacts well with the rest of the framework in that it provides lazy reading of TTree branches. When using the RootInput source, data are not actually fetched from the file into memory when the source executes, but only when GetHandle or GetValidHandle or other product get methods are called. This is useful for *art* jobs that only read a subset of the TBranches in an input file. Code for sources must be in files of the form: `modulename_source.cc`, where `modulename` does not have any underscores in it.
-Monte Carlo generator jobs use the input source called EmptyEvent.
-
-**Services**
-These are singleton classes that are globally visible within an *art* job. They can be FHiCL configured like modules, and they can schedule methods to be called on begin job, begin run, begin event, etc. They are meant to help supply configuration parameters like the drift velocity, or more complicated things like geometry functions, to modules that need them. Please do not use services as a back door for storing event data outside of the *art* event store. Source code must be in files of the form: `servicename_service.cc`, where servicename does not have any underscores in it.
-
-**Tools**
-Tools are FHiCL-configurable software components that are not singletons, like services. They are meant to be swappable by FHiCL parameters which tell art which .so libraries to load up, configure, and call from user code. See the [Art Wiki Page][art-wiki-redmine] for more information on tools and other plug-ins.
-
-You can use cetskelgen to make empty skeletons of *art* plug-ins. See the art wiki for documentation, or use
-
-~~~
-cetskelgen --help
-~~~
-{: .language-bash}
-
-for instructions on how to invoke it.
-
-### Ordering of Plug-in Execution
-
-The constructors for each plug-in are called at job-start time, after the shared object libraries are loaded by the image activater after their names have been discovered from the fcl configuration. Producer, analyzer and service plug-ins have BeginJob, BeginRun, BeginSubRun, EndSubRun, EndRun, EndJob methods where they can do things like book histograms, write out summary information, or clean up memory.
-
-When processing data, the input source always gets executed first, and it defines the run, subrun and event number of the trigger record being processed.
-The producers and filters in trigger_paths then get executed for each event. The analyzers and filters in end_paths then get executed. Analyzers cannot be added to trigger_paths, and producers cannot be added to end_paths. This ordering ensures that data products are all produced by the time they are needed to be analyzed. But it also forces high memory usage for the same reason.
-
-Services and tools are visible to other plug-ins at any stage of processing. They are loaded dynamically from names in the fcl configurations, so a common error is to use in code a service that hasn't been mentioned in the job configuration. You will get an error asking you to configure the service, even if it is just an empty configuration with the service name and no parameters set.
-
-
-
-### Non-Plug-In Code
-
-You are welcome to write standard C++ code -- classes and C-style functions are no problem. In fact, to enhance the portability of code, the *art* team encourages the separation of algorithm code into non-framework-specific source files, and to call these functions or class methods from the *art* plug-ins. Typically, source files for standalone algorithm code have the extension .cxx while art plug-ins have .cc extensions. Most directories have a CMakeLists.txt file which has instructions for building the plug-ins, each of which is built into a .so library, and all other code gets built and put in a separate .so library.
-
-### Retrieving Data Products
-
-In a producer or analyzer module, data products can be retrieved from the art event store with `getHandle()` or `getValidHandle()` calls, or more rarely `getManyByType` or other calls. The arguments to these calls specify the module label and the instance of the data product. A typical `TBranch` name in the Events tree in an *art*ROOT file is
-
-~~~
-simb::MCParticles_largeant__G4Stage1.
-~~~
-{: .source}
-
-here, `simb::MCParticle` is the name of the class that defines the data product. The "s" after the data product name is added by *art* -- you have no choice in this even if the plural of your noun ought not to just add an "s". The underscore separates the data product name from the module name, "largeant". Another underscore separates the module name and the instance name, which in this example is the empty string -- there are two underscores together there. The last string is the process name and usually is not needed to be specified in data product retrieval. You can find the `TBranch` names by browsing an artroot file with `ROOT` and using a `TBrowser`, or by using `product_sizes_dumper -f 0`.
-
-### *Art* documentation
-
-There is a mailing list -- `art-users@fnal.gov` where users can ask questions and get help.
-
-There is a workbook for art available at [https://art.fnal.gov/art-workbook/][art-workbook] Look for the "versions" link in the menu on the left for the actual document. It is a few years old and is missing some pieces like how to write a producer module, but it does answer some questions. I recommend keeping a copy of it on your computer and using it to search for answers.
-
-There was an [art/LArSoft course in 2015][art-LArSoft-2015]. While it, too is a few years old, the examples are quite good and it serves as a useful reference.
-
-## Gallery
-
-Gallery is a lightweight tool that lets users read art-formatted root files and make plots without having to write and build art modules. It works well with interpreted and compiled ROOT macros, and is thus ideally suited for data exploration and fast turnaround of making plots. It lacks the ability to use art services, however, though some LArSoft services have been split into services and service providers. The service provider code is intended to be able to run outside of the art framework and linked into separate programs.
-
-Gallery also lacks the ability to write data products to an output file. You are of course free to open and write files of your own devising in your gallery programs. There are example gallery ROOT scripts in duneexamples/duneexamples/GalleryScripts. They are only in the git repository but do not get installed in the UPS product.
-
-More documentation: [https://art.fnal.gov/gallery/][art-more-documentation]
-
-## LArSoft
-
-### Introductory Documentation
-
-LArSoft's home page: [larsoft.org](https://larsoft.org)
-
-The LArSoft wiki is here: [larsoft-wiki](https://larsoft.github.io/LArSoftWiki/).
-
-### Software structure
-
-The LArSoft toolkit is a set of software components that simulate and reconstruct LArTPC data, and also it provides tools for accessing raw data from the experiments. LArSoft contains an interface to GEANT4 (art does not list GEANT4 as a dependency) and the GENIE generator. It contains geometry tools that are adapted for wire-based LArTPC detectors.
-
-LArSoft provides a collection of shared simulation, reconstruction, and analysis tools, with art interfaces. Often, a useful algorithm will be developed by an experimental collaboration, and desire to share it with other LArTPC collaborations, which is how much of the software in LArSoft came to be. Interfaces and services have to be standardized for shared use. Things like the detector geometry and the dead channel list, for example, are detector-specific, but shared simulation and reconstruction algorithms need to be able to access information from these services, which are not defined until an experiment's software stack is set up and the lar program is invoked. LArSoft therefore uses plug-ins and class inheritance extensively to deal with these situations.
-
-A recent graph of the UPS products in a full stack starting with dunesw is available [here](https://wiki.dunescience.org/w/img_auth.php/0/07/Dunesw_v10_00_04d00_e26-prof_graph.pdf) (dunesw). You can see the LArSoft pieces under dunesw, as well as GEANT4, GENIE, ROOT, and a few others.
-
-### LArSoft Data Products
-
-A very good introduction to data products such as raw digits, calibrated waveforms, hits and tracks, that are created and used by LArSoft modules and usable by analyzers was given by Tingjun Yang at the [2019 ProtoDUNE analysis workshop](https://indico.fnal.gov/event/19133/contributions/50492/attachments/31462/38611/dataproducts.pdf) (larsoft-data-products).
-
-There are a number of data product dumper fcl files. A non-exhaustive list of useful examples is given below:
-
-~~~
- dump_mctruth.fcl
- dump_mcparticles.fcl
- dump_simenergydeposits.fcl
- dump_simchannels.fcl
- dump_simphotons.fcl
- dump_rawdigits.fcl
- dump_wires.fcl
- dump_hits.fcl
- dump_clusters.fcl
- dump_tracks.fcl
- dump_pfparticles.fcl
- eventdump.fcl
- dump_lartpcdetector_channelmap.fcl
- dump_lartpcdetector_geometry.fcl
-~~~
-{: .language-bash}
-
-Some of these may require some configuration of input module labels so they can find the data products of interest. Try one of these yourself:
-
-~~~
-lar -n 1 -c dump_mctruth.fcl $SAMPLE_FILE
-~~~
-{: .language-bash}
-
-This command will make a file called `DumpMCTruth.log` which you can open in a text editor. Reminder: `MCTruth` particles are those made by the generator(s), while `MCParticle`s are those made by GEANT4, except for the ones owned by the `MCTruth` data products. Due to the showering nature of interactions in LArTPCs, there are usually many more `MCParticle`s than `MCTruth`s.
-
-## Examples and current workflows
-
-The page with instructions on how to find and look at ProtoDUNE data has links to standard fcl configurations for simulating and reconstructing ProtoDUNE data: [https://wiki.dunescience.org/wiki/Look_at_ProtoDUNE_SP_data][look-at-protodune].
-
-Try it yourself! The workflow for ProtoDUNE-SP MC is given in the [Simulation Task Force web page](https://wiki.dunescience.org/wiki/ProtoDUNE-SP_Simulation_Task_Force).
-
-
-### Running on a dunegpvm machine at Fermilab
-
-~~~
- export USER=`whoami`
- mkdir -p /exp/dune/data/users/$USER/tutorialtest
- cd /exp/dune/data/users/$USER/tutorialtest
- source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
-
- export DUNELAR_VERSION=v10_00_04d00
- export DUNELAR_QUALIFIER=e26:prof
- setup dunesw $DUNELAR_VERSION -q $DUNELAR_QUALIFIER
-
- TMPDIR=/tmp lar -n 1 -c mcc12_gen_protoDune_beam_cosmics_p1GeV.fcl -o gen.root
- lar -n 1 -c protoDUNE_refactored_g4_stage1.fcl gen.root -o g4_stage1.root
- lar -n 1 -c protoDUNE_refactored_g4_stage2_sce_datadriven.fcl g4_stage1.root -o g4_stage2.root
- lar -n 1 -c protoDUNE_refactored_detsim_stage1.fcl g4_stage2.root -o detsim_stage1.root
- lar -n 1 -c protoDUNE_refactored_detsim_stage2.fcl detsim_stage1.root -o detsim_stage2.root
- lar -n 1 -c protoDUNE_refactored_reco_35ms_sce_datadriven_stage1.fcl detsim_stage2.root -o reco_stage1.root
- lar -c eventdump.fcl reco_stage1.root >& eventdump_output.txt
- config_dumper -P reco_stage1.root >& config_output.txt
- product_sizes_dumper -f 0 reco_stage1.root >& productsizes.txt
-~~~
-{: .language-bash}
-
-Note added November 22, 2023: The construct `TMPDIR=/tmp lar ...` defines the environment variable TMPDIR only for the duration of the subsequent command on the line. This is needed for the tutorial example because the mcc12 gen stage copies a 2.9 GB file (see below -- it's the one we had to copy over to CERN) to /var/tmp, ifdh's default temporary location. But the dunegpvm machines as of November 2023 rarely have 2.9 GB of free space in /var/tmp, and you get a "no space left on device" error. The newer Prod4 versions of the fcls point to a newer version of the beam particle generator that can stream this file using XRootD instead of copying it with ifdh, but the streaming flag is turned off by default in the Prod4 fcl for the version of dunesw used in this tutorial, so this is the minimal solution. Note for the next iteration: the Prod4 fcls are described at [https://wiki.dunescience.org/wiki/ProtoDUNE-SP_Production_IV](https://wiki.dunescience.org/wiki/ProtoDUNE-SP_Production_IV)
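The `VAR=value command` construct is ordinary shell behavior and can be verified with a quick sketch that needs no DUNE setup (the `inside:`/`after:` labels are just for illustration):

```shell
# A variable assigned on the same line as a command is exported only into
# that one command's environment; the calling shell is left unchanged.
unset TMPDIR
TMPDIR=/tmp bash -c 'echo "inside: $TMPDIR"'   # the child process sees /tmp
echo "after: ${TMPDIR:-unset}"                 # the calling shell does not
```

This is why `TMPDIR=/tmp lar ...` affects only that one `lar` invocation.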
-
-### Run the event display on your new Monte Carlo event
-~~~
- lar -c evd_protoDUNE_data.fcl reco_stage1.root
-~~~
-{: .language-bash}
-and push the "Reconstructed" radio button at the bottom of the display.
-
-### Display decoded raw digits
-
-To look at some raw digits in the event display, you need to decode a DAQ file or find one that's already been decoded. The decoder fcl for ProtoDUNE-HD data taken in 2024 is `run_pdhd_wibeth3_tpc_decoder.fcl`. An event display of an example decoded file (taken in August 2024) is
-~~~
- lar -c evd_protoDUNE_data.fcl /exp/dune/data/users/trj/nov2024tutorial/np04hd_raw_run028707_0075_dataflow5_datawriter_0_20240815T154544_decode.root
-~~~
-{: .language-bash}
-
-### Running at CERN
-
-This example puts all files in a subdirectory of your home directory. The ProtoDUNE-SP beamline simulation needs an input file, and the generation job must be pointed at it. The sequence of commands above will work at CERN if you have a Fermilab grid proxy, but not everyone signed up for the tutorial can get one yet, so we copied the necessary file over to CERN and adjusted a fcl file to point at it. The job also runs faster with a local copy of the input file than with the workflow above, which copies it each time.
-
-The apptainer command is slightly different as the mounts are different. Here we assume you are logged into an lxplus node running Alma9.
-
-> ## Note
-> This is the CERN Apptainer variant.
-{: .callout}
-
-~~~
-/cvmfs/oasis.opensciencegrid.org/mis/apptainer/current/bin/apptainer shell --shell=/bin/bash \
--B /cvmfs,/afs,/opt,/run/user,/etc/hostname,/etc/krb5.conf --ipc --pid \
-/cvmfs/singularity.opensciencegrid.org/fermilab/fnal-dev-sl7:latest
-~~~
-{: .language-bash}
-
-Make a fcl file called `tmpgen.fcl` with these contents:
-
-~~~
-#include "mcc12_gen_protoDune_beam_cosmics_p1GeV.fcl"
-physics.producers.generator.FileName: "/afs/cern.ch/work/t/tjunk/public/may2023tutorialfiles/H4_v34b_1GeV_-27.7_10M_1.root"
-~~~
-{: .source}
-
-~~~
- cd ~
- mkdir 2024Tutorial
- cd 2024Tutorial
- source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
-
- export DUNELAR_VERSION=v10_00_04
- export LARSOFT_VERSION=${DUNELAR_VERSION}
- export DUNELAR_QUALIFIER=e26:prof
- setup dunesw $DUNELAR_VERSION -q $DUNELAR_QUALIFIER
-
- lar -n 1 -c tmpgen.fcl -o gen.root
- lar -n 1 -c protoDUNE_refactored_g4_stage1.fcl gen.root -o g4_stage1.root
- lar -n 1 -c protoDUNE_refactored_g4_stage2_sce_datadriven.fcl g4_stage1.root -o g4_stage2.root
- lar -n 1 -c protoDUNE_refactored_detsim_stage1.fcl g4_stage2.root -o detsim_stage1.root
- lar -n 1 -c protoDUNE_refactored_detsim_stage2.fcl detsim_stage1.root -o detsim_stage2.root
- lar -n 1 -c protoDUNE_refactored_reco_35ms_sce_datadriven_stage1.fcl detsim_stage2.root -o reco_stage1.root
- lar -c eventdump.fcl reco_stage1.root >& eventdump_output.txt
- config_dumper -P reco_stage1.root >& config_output.txt
- product_sizes_dumper -f 0 reco_stage1.root >& productsizes.txt
-~~~
-{: .language-bash}
-
-You can also browse the root files with a TBrowser or run other dumper fcl files on them. The dump example commands above redirect their outputs to text files which you can edit with a text editor or run grep on to look for things.
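The `>&` redirection used in those dump commands captures both stdout and stderr in the file, which matters because some diagnostics go to stderr. A quick illustration with an ordinary command (no DUNE setup needed; the path and file name are invented):

```shell
# ">& file" (a csh-ism that bash also accepts) sends both stdout and stderr
# to the file; the portable spelling used here is "> file 2>&1".
# A plain "> file" would let the error message through to the terminal.
ls /definitely/not/a/path > out.txt 2>&1 || true
grep "No such file" out.txt   # the error text went into the file
```

The same applies to the `lar`, `config_dumper`, and `product_sizes_dumper` outputs above: everything ends up in the text file for later grepping.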
-
-You can run the event display with
-
-~~~
-lar -c evd_protoDUNE.fcl reco_stage1.root
-~~~
-{: .language-bash}
-
-but it will run very slowly over a tunneled X connection. A VNC session will be much faster. Tips: select the "Reconstructed" radio button at the bottom and click on "Unzoom Interest" on the left to see the reconstructed objects in the three views.
-
-
-## DUNE software documentation and how-to's
-
-The following legacy wiki page provides information on how to check out, build, and contribute to dune-specific larsoft plug-in code.
-
-[https://cdcvs.fnal.gov/redmine/projects/dunetpc/wiki][dunetpc-wiki]
-
-The follow-up part of this tutorial gives hands-on exercises for doing these things.
-
-### Contributing to LArSoft
-
-The LArSoft git repositories are hosted on GitHub and use a pull-request model. LArSoft's GitHub link is [https://github.com/larsoft][github-link]. DUNE repositories, such as the dunesw stack, protoduneana, and garsoft, are also on GitHub, but at the moment (not for much longer, however) they still allow users to push code directly.
-
-To work with pull requests, see the documentation at this link: [https://larsoft.github.io/LArSoftWiki/Developing_With_LArSoft][developing-with-larsoft]
-
-There are bi-weekly LArSoft coordination meetings [https://indico.fnal.gov/category/405/][larsoft-meetings] at which stakeholders, managers, and users discuss upcoming releases, plans, and new features to be added to LArSoft.
-
-## Useful tip: check out an inspection copy of larsoft
-
-A good old-fashioned `grep -r` or a `find` command can be effective if you are looking for an example of how to call something but do not know where such an example might live. The copies of the LArSoft source in CVMFS lack the CMakeLists.txt files, so if that is what you want to look through for examples, it is good to have a copy checked out. Here's a script that checks out all the LArSoft source and DUNE LArSoft code but does not compile it. Warning: it deletes a directory called "inspect" in your app area. Make sure your directory under `/exp/dune/app/users/` exists first:
-
-
-> ## Note
-> Remember the Apptainer! You can use your dunesl7 alias defined at the top of this page.
-{: .callout}
-
-~~~
- #!/bin/bash
- USERNAME=`whoami`
- source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
- cd /exp/dune/app/users/${USERNAME}
- rm -rf inspect
- mkdir inspect
- cd inspect
- mrb newDev
- source /exp/dune/app/users/${USERNAME}/inspect/localProducts*/setup
- cd srcs
- mrb g larsoft_suite
- mrb g larsoftobj_suite
- mrb g larutils
- mrb g larbatch
- mrb g dune_suite
- mrb g -d dune_raw_data dune-raw-data
-~~~
-{: .language-bash}
-
-Putting it to use: A very common workflow in developing software is to look for an example of how to do something similar to what you want to do. Let's say you want to find some examples of how to use `FindManyP` -- it's an *art* method for retrieving associations between data products, and the art documentation isn't as good as the examples for learning how to use it. You can use a recursive grep through your checked-out version, or you can even look through the installed source in CVMFS. This example looks through the `duneprototypes` product's source files for `FindManyP`:
-
-~~~
- cd $DUNEPROTOTYPES_DIR/source/duneprototypes
- grep -r -i findmanyp *
-~~~
-{: .language-bash}
-
-It is good to use the `-i` option to grep, which tells it to ignore the difference between uppercase and lowercase string matches, in case you misremembered the case of what you are looking for. The list of matches is quite long -- you may want to pipe the output of that grep into another grep:
-
-~~~
- grep -r -i findmanyp * | grep recob::Hit
-~~~
-{: .language-bash}
-
-The checked-out versions of the software have the advantage of providing some files that don't get installed in CVMFS, notably CMakeLists.txt files and the UPS product_deps files, which you may want to examine when looking for examples of how to do things.
-
-## GArSoft
-
-GArSoft is another art-based software package, designed to simulate and reconstruct events in the ND-GAr near detector. Many components were copied from LArSoft and modified for the pixel-based TPC with an ECAL. You can find installed versions in CVMFS with the following command:
-
-~~~
-ups list -aK+ garsoft
-~~~
-{: .language-bash}
-
-and you can check out the source and build it by following the instructions on the [GArSoft wiki](https://cdcvs.fnal.gov/redmine/projects/garsoft/wiki).
-
-
-## Quiz
-
-> ## Question 01
->
-> Enter Question here
->
-> .
->
-> .
->
-> .
->
-> .
->
-> None of the Above
->
-> > ## Answer
-> > The correct answer is .
-> > {: .output}
-> > Comment here
-> {: .solution}
-{: .challenge}
-
-
-{%include links.md%}
-
-[about-qualifiers]: https://cdcvs.fnal.gov/redmine/projects/cet-is-public/wiki/AboutQualifiers
-[art-wiki]: https://cdcvs.fnal.gov/redmine/projects/art/wiki
-[larsoft-rerun-part-job]: https://larsoft.github.io/LArSoftWiki/Rerun_part_of_all_a_job_on_an_output_file_of_that_job
-[github-link]: https://github.com/larsoft
-[protodune-sim-task-force]: https://wiki.dunescience.org/wiki/ProtoDUNE-SP_Simulation_Task_Force
-[larsoft-meetings]: https://indico.fnal.gov/category/405/
-[developing-with-larsoft]: https://larsoft.github.io/LArSoftWiki/Developing_With_LArSoft
-[fhicl-described]: https://cdcvs.fnal.gov/redmine/documents/327
-[garsoft-wiki]: https://cdcvs.fnal.gov/redmine/projects/garsoft/wiki
-[art-wiki-redmine]: https://cdcvs.fnal.gov/redmine/projects/art/wiki#How-to-use-the-modularity-of-art
-[art-more-documentation]: https://art.fnal.gov/gallery/
-[using-larsoft]: https://cdcvs.fnal.gov/redmine/projects/larsoft/wiki/Using_LArSoft
-[larsoft-data-products]: https://indico.fnal.gov/event/19133/contributions/50492/attachments/31462/38611/dataproducts.pdf
-[dunetpc-wiki]: https://cdcvs.fnal.gov/redmine/projects/dunetpc/wiki
-[look-at-protodune]: https://wiki.dunescience.org/wiki/Look_at_ProtoDUNE_SP_data
-[art-LArSoft-2015]: https://indico.fnal.gov/event/9928/timetable/?view=standard
-[art-workbook]: https://art.fnal.gov/art-workbook/
diff --git a/_episodes/05.5-mrb.md b/_episodes/05.5-mrb.md
deleted file mode 100644
index ff86bc7..0000000
--- a/_episodes/05.5-mrb.md
+++ /dev/null
@@ -1,38 +0,0 @@
----
-title: Multi Repository Build (mrb) system (2024)
-teaching: 10
-exercises: 0
-questions:
-- How are different software versions handled?
-objectives:
-- Understand the roles of the tool mrb
-keypoints:
-- The multi-repository build (mrb) tool allows code modification in multiple repositories, which is relevant for a large project like LArSoft with different cases (end user and developers) demanding consistency between the builds.
----
-
-## mrb
-**What is mrb and why do we need it?**
-Early on, the LArSoft team chose git and cmake as the software version manager and the build language, respectively, to keep up with industry standards and to take advantage of their new features. When we clone a git repository to a local copy and check out the code, we end up building it all. We would like LArSoft and DUNE code to be more modular, or at least the builds should reflect some of the inherent modularity of the code.
-
-Ideally, we would like to only have to recompile a fraction of the software stack when we make a change. The granularity of the build in LArSoft and other art-based projects is the repository. So LArSoft and DUNE have divided code up into multiple repositories (DUNE ought to divide more than it has, but there are a few repositories already with different purposes). Sometimes one needs to modify code in multiple repositories at the same time for a particular project. This is where mrb comes in.
-
-**mrb** stands for "multi-repository build". mrb has features for cloning git repositories, setting up build and local-products environments, building code, and checking for consistency (e.g. that no two modules and no two fcl files have the same name). mrb builds UPS products -- when it installs the built code into the localProducts directory, it also makes the necessary UPS table files and .version directories. mrb also has a tool for making a tarball of a build product for distribution to the grid. The software build example later in this tutorial exercises some of the features of mrb.
-
-| Command | Action |
-|--------------------------|-----------------------------------------------------|
-| `mrb --help` | prints list of all commands with brief descriptions |
-| `mrb <command> --help`  | displays help for that command                      |
-| `mrb gitCheckout` | clone a repository into working area |
-| `mrbsetenv` | set up build environment |
-| `mrb build -jN` | builds local code with N cores |
-| `mrb b -jN` | same as above |
-| `mrb install -jN` | installs local code with N cores |
-| `mrb i -jN` | same as above (this will do a build also) |
-| `mrbslp` | set up all products in localProducts... |
-| `mrb z` | get rid of everything in build area |
-
-Link to the [mrb reference guide](https://cdcvs.fnal.gov/redmine/projects/mrb/wiki/MrbRefereceGuide)
-
-> ## Exercise 1
-> There is no exercise 5. mrb example exercises will be covered in a later session as any useful exercise with mrb takes more than 30 minutes on its own. Everyone gets 100% credit for this exercise!
-{: .challenge}
diff --git a/_episodes/06-larsoft-modify-module.md b/_episodes/06-larsoft-modify-module.md
deleted file mode 100644
index 45b5ba5..0000000
--- a/_episodes/06-larsoft-modify-module.md
+++ /dev/null
@@ -1,699 +0,0 @@
----
-title: Expert in the Room - LArSoft How to modify a module - in progress
-teaching: 15
-exercises: 0
-questions:
-- How do I check out, modify, and build DUNE code?
-objectives:
-- How to use mrb.
-- Set up your environment.
-- Download source code from DUNE's git repository.
-- Build it.
-- Run an example program.
-- Modify the job configuration for the program.
-- Modify the example module to make a custom histogram.
-- Test the modified module.
-- Stretch goal -- run the debugger.
-keypoints:
-- DUNE's software stack is built out of a tree of UPS products.
-- You don't have to build all of the software to make modifications -- you can check out and build one or more products to achieve your goals.
-- You can set up pre-built CVMFS versions of products you aren't developing, and UPS will check version consistency, though it is up to you to request the right versions.
-- mrb is the tool DUNE uses to check out software from multiple repositories and build it in a single test release.
-- mrb uses git and cmake, though aspects of both are exposed to the user.
----
-
-
-
-## First learn a bit about the MRB system
-
-Link to the [mrb]({{ site.baseurl }}/05.5-mrb) episode
-
-## getting set up
-
-You will need *three* login sessions. These have different
-environments set up.
-
-* Session #1 For editing code (and searching for code)
-* Session #2 For building (compiling) the software
-* Session #3 For running the programs
-
-## Session 1
-
-Start up session #1, editing code, on one of the dunegpvm*.fnal.gov
-interactive nodes. These scripts have also been tested on the
-lxplus.cern.ch interactive nodes.
-
-> ## Note
-> Remember the Apptainer! See below for the special Apptainer invocations for CERN and the build machines.
-{: .callout}
-
-Create two scripts in your home directory:
-
-`newDev2024Tutorial.sh` should have these contents:
-
-~~~
-#!/bin/bash
-export DUNELAR_VERSION=v10_00_04d00
-export PROTODUNEANA_VERSION=$DUNELAR_VERSION
-DUNELAR_QUALIFIER=e26:prof
-DIRECTORY=2024tutorial
-USERNAME=`whoami`
-export WORKDIR=/exp/dune/app/users/${USERNAME}
-if [ ! -d "$WORKDIR" ]; then
- export WORKDIR=`echo ~`
-fi
-
-source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
-
-cd ${WORKDIR}
-touch ${DIRECTORY}
-rm -rf ${DIRECTORY}
-mkdir ${DIRECTORY}
-cd ${DIRECTORY}
-mrb newDev -q ${DUNELAR_QUALIFIER}
-source ${WORKDIR}/${DIRECTORY}/localProducts*/setup
-mkdir work
-cd srcs
-mrb g -t ${PROTODUNEANA_VERSION} protoduneana
-
-cd ${MRB_BUILDDIR}
-mrbsetenv
-mrb i -j16
-~~~
-{: .language-bash}
-
-and `setup2024Tutorial.sh` should have these contents:
-
-~~~
-DIRECTORY=2024tutorial
-USERNAME=`whoami`
-
-source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
-export WORKDIR=/exp/dune/app/users/${USERNAME}
-if [ ! -d "$WORKDIR" ]; then
- export WORKDIR=`echo ~`
-fi
-
-cd $WORKDIR/$DIRECTORY
-source localProducts*/setup
-cd work
-setup dunesw $DUNELAR_VERSION -q $DUNELAR_QUALIFIER
-mrbslp
-~~~
-{: .language-bash}
-
-Execute this command to make the first script executable.
-
-~~~
- chmod +x newDev2024Tutorial.sh
-~~~
-{: .language-bash}
-
-It is not necessary to chmod the setup script. Problems writing
-to your home directory? Check to see if your Kerberos ticket
-has been forwarded.
-
-~~~
- klist
-~~~
-{: .language-bash}
-
-## Session 2
-
-Start up session #2 by logging in to one of the build nodes,
-`dunebuild02.fnal.gov` or `dunebuild03.fnal.gov`. They have at least 16 cores
-apiece and the dunegpvm's have only four, so builds run much faster
-on them. If all tutorial users log on to the same one and try
-building all at once, the build nodes may become very slow or run
-out of memory. The `lxplus` nodes are generally big enough to build
-sufficiently quickly. The Fermilab build nodes should not be used
-to run programs (people need them to build code!)
-
-Note -- interactive computers at Fermilab print out how much RAM and swap, and how many CPU threads, the node has when you log in. In general, builds that launch more processes than a machine has threads will not run any faster, but they will use more memory. So the command `mrb i -j16` above is intended to be run on a build node with at least 16 threads and enough memory to support 16 simultaneous invocations of the C++ compiler, which may take up to 2 GB per invocation.
-
-> ## Note
-> You need a modified container on the build machines and at CERN, as they don't mount /pnfs. Omitting /pnfs discourages people from running interactive data-analysis jobs on the dedicated build machines (people need them to build code!).
-{: .callout}
-
-### FNAL build machines
-~~~
-# remove /pnfs/ for build machines
-/cvmfs/oasis.opensciencegrid.org/mis/apptainer/current/bin/apptainer shell --shell=/bin/bash \
--B /cvmfs,/exp,/nashome,/opt,/run/user,/etc/hostname,/etc/hosts,/etc/krb5.conf --ipc --pid \
-/cvmfs/singularity.opensciencegrid.org/fermilab/fnal-dev-sl7:latest
-~~~
-{: .language-bash}
-
-### CERN
-~~~
-/cvmfs/oasis.opensciencegrid.org/mis/apptainer/current/bin/apptainer shell --shell=/bin/bash \
--B /cvmfs,/afs,/opt,/run/user,/etc/hostname,/etc/krb5.conf --ipc --pid \
-/cvmfs/singularity.opensciencegrid.org/fermilab/fnal-dev-sl7:latest
-~~~
-{: .language-bash}
-
-### Download source code and build it
-
-On the build node, execute the `newDev` script:
-
-~~~
- ./newDev2024Tutorial.sh
-~~~
-{: .language-bash}
-
-Note that this script *deletes* the directory intended to hold the source
-and built code and makes a new one, in order to start clean. Be careful
-not to execute this script again after you have worked on the code,
-as it will wipe your work out and start fresh.
-
-This build script will take a few minutes to check code out and compile it.
-
-The `mrb g` command does a `git clone` of the specified repository with an optional tag and destination name. More information is available [here][dunetpc-wiki] and [here][mrb-reference-guide].
-
-Some comments on the build command
-
-~~~
- mrb i -j16
-~~~
-{: .language-bash}
-
-The `-j16` says how many concurrent processes to run. Set the number to no more than the number of cores on the computer you're running it on. A dunegpvm machine has four cores, and the two build nodes each have 16. Running more concurrent processes on a computer with a limited number of cores won't make the build finish any faster, but you may run out of memory. The dunegpvms do not have enough memory to run 16 instances of the C++ compiler at a time, and you may see the word `killed` in your error messages if you ask to run many more concurrent compile processes than the interactive computer can handle.
-
-You can find the number of cores a machine has with
-
-~~~
- cat /proc/cpuinfo
-~~~
-{: .language-bash}
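Counting the `processor` lines by eye is tedious; on a typical Linux node, `nproc` (from GNU coreutils) gives the number directly:

```shell
# Number of processing units available to this process.
nproc
# Equivalent count from /proc/cpuinfo:
grep -c '^processor' /proc/cpuinfo
```

You can then size the build directly, e.g. `mrb i -j$(nproc)`, subject to the machine having enough memory to support that many compiler invocations.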
-
-The `mrb` system builds code in a directory distinct from the source code. Source code is in `$MRB_SOURCE` and built code is in `$MRB_BUILDDIR`. If the build succeeds (no error messages -- compiler warnings are treated as errors and will stop the build, forcing you to fix the problem), then the built artifacts are installed in `$MRB_TOP/localProducts*`. `mrbslp` directs ups to search `$MRB_TOP/localProducts*` first for software and necessary components like `fcl` files. It is good to separate the build directory from the install directory: a failed build will not prevent you from running the program from the last successful build. But you do have to look at the error messages from the build step before running a program. If you edit source code, make a mistake, and build unsuccessfully, the program may still run successfully using the last version that compiled, and you may wonder why your code changes are having no effect. You can look in `$MRB_TOP/localProducts*` to see whether new code has been added (look for the "lib" directory under the architecture-specific directory of your product).
-
-Because you ran the `newDev2024Tutorial.sh` script instead of sourcing it, the environment it
-set up is not retained in the login session you ran it from. You will need to set up your environment again.
-You will need to do this when you log in anyway, so it is good to have
-that setup script. In session #2, type this:
-
-~~~
- source setup2024Tutorial.sh
- cd $MRB_BUILDDIR
- mrbsetenv
-~~~
-{: .language-bash}
-
-The shell command "source" instructs the command interpreter (bash) to read commands from the file `setup2024Tutorial.sh` as if they were typed at the terminal. This way, environment variables set up by the script stay set up.
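The difference can be seen with a tiny sketch (the script and variable names here are invented for illustration):

```shell
# An executed script runs in a child shell: variables it sets disappear
# when that shell exits. A sourced script runs in the current shell.
unset TUTORIAL_VAR
printf 'export TUTORIAL_VAR=hello\n' > demo_setup.sh

bash demo_setup.sh                              # child shell; no effect here
echo "after executing: ${TUTORIAL_VAR:-unset}"

source demo_setup.sh                            # "." is the portable spelling
echo "after sourcing:  ${TUTORIAL_VAR:-unset}"
rm -f demo_setup.sh
```

This is exactly why the setup scripts in this tutorial must be sourced rather than executed.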
-Do the following in session #1, the source editing session:
-
-~~~
- source setup2024Tutorial.sh
- cd $MRB_SOURCE
- mrbslp
-~~~
-{: .language-bash}
-
-## Run your program
-
-[YouTube Lecture Part 2](https://youtu.be/8-M2ZV-zNXs): Start up the session for running programs -- log in to a `dunegpvm` interactive
-computer for session #3
-
-~~~
- source setup2024Tutorial.sh
- mrbslp
- setup_fnal_security
-~~~
-{: .language-bash}
-
-We need to locate an input file. Here are some tips for finding input data:
-
-[https://wiki.dunescience.org/wiki/Look_at_ProtoDUNE_SP_data][dune-wiki-protodune-sp]
-
-Data and MC files are typically on tape, but can be cached on disk so you don't have to wait possibly a long time for the
-file to be staged in. Check to see if a sample file is in dCache or only on tape:
-
-~~~
-cache_state.py PDSPProd4a_protoDUNE_sp_reco_stage1_p1GeV_35ms_sce_datadriven_18800650_2_20210414T012053Z.root
-~~~
-{: .language-bash}
-
-Get the `xrootd` URL:
-
-~~~
-samweb get-file-access-url --schema=root PDSPProd4a_protoDUNE_sp_reco_stage1_p1GeV_35ms_sce_datadriven_18800650_2_20210414T012053Z.root
-~~~
-{: .language-bash}
-
-which should print the following URL:
-
-~~~
-root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune-sp/full-reconstructed/2021/mc/out1/PDSPProd4a/18/80/06/50/PDSPProd4a_protoDUNE_sp_reco_stage1_p1GeV_35ms_sce_datadriven_18800650_2_20210414T012053Z.root
-~~~
-{: .language-bash}
-
-Now run the program with the input file accessed by that URL:
-
-~~~
-lar -c analyzer_job.fcl root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune-sp/full-reconstructed/2021/mc/out1/PDSPProd4a/18/80/06/50/PDSPProd4a_protoDUNE_sp_reco_stage1_p1GeV_35ms_sce_datadriven_18800650_2_20210414T012053Z.root
-~~~
-{: .language-bash}
-
-For CERN users without access to Fermilab's `dCache`: example input files for this tutorial have been copied to `/afs/cern.ch/work/t/tjunk/public/2024tutorialfiles/`.
-
-After running the program, you should have an output file `tutorial_hist.root`. Note -- please do not
-store large rootfiles in `/exp/dune/app`! The disk is rather small, and we'd like to
-save it for applications, not data. But this file ought to be quite small.
-Open it in root
-
-~~~
- root tutorial_hist.root
-~~~
-{: .language-bash}
-
-and look at the histograms and trees with a `TBrowser`. It is empty!
-
-#### Adjust the program's job configuration
-
-In Session #1, the code editing session,
-
-~~~
- cd ${MRB_SOURCE}/protoduneana/protoduneana/TutorialExamples/
-~~~
-{: .language-bash}
-
-See that `analyzer_job.fcl` includes `clustercounter.fcl`. The `module_type`
-line in that `fcl` file defines the name of the module to run, and
-`ClusterCounter_module.cc` just prints a line to stdout for each event
-in its `analyze()` method, without making any histograms or trees.
-
-Aside on module labels and types: A module label is used to identify
-which modules to run in which order in a trigger path in an art job, and also
-to label the output data products. The "module type" is the name of the source
-file: `moduletype_module.cc` is the filename of the source code for a module
-with class name moduletype. The build system preserves this and makes a shared object (`.so`)
-library that art loads when it sees a particular module_type in the configuration document.
-The reason there are two names here is so you
-can run a module multiple times in a job, usually with different inputs. Underscores
-are not allowed in module types or module labels because they are used in
-contexts that separate fields with underscores.
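To make the label/type distinction concrete, here is a hypothetical fcl fragment (the labels `countpandora` and `countlinecluster` are invented for illustration) that configures the same module type twice with different inputs:

```
# One module type, two labels: each instance gets its own configuration.
physics.analyzers.countpandora:     { module_type: "ClusterCounter3" ClusterModuleLabel: "pandora" }
physics.analyzers.countlinecluster: { module_type: "ClusterCounter3" ClusterModuleLabel: "lineclusterdc" }
```

Both labels would also need to appear in the job's end path for the instances to actually run.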
-
-Let's do something more interesting than ClusterCounter_module's print
-statement.
-
-Let's first experiment with the configuration to see if we can get
-some output. In Session #3 (the running session),
-
-~~~
- fhicl-dump analyzer_job.fcl > tmp.txt
-~~~
-{: .language-bash}
-
-and open tmp.txt in a text editor. You will see what blocks in there
-contain the fcl parameters you need to adjust.
-Make a new fcl file in the work directory
-called `myana.fcl` with these contents:
-
-~~~
-#include "analyzer_job.fcl"
-
-physics.analyzers.clusterana.module_type: "ClusterCounter3"
-~~~
-{: .source}
-
-Try running it:
-
-~~~
- lar -c myana.fcl root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune-sp/full-reconstructed/2021/mc/out1/PDSPProd4a/18/80/06/50/PDSPProd4a_protoDUNE_sp_reco_stage1_p1GeV_35ms_sce_datadriven_18800650_2_20210414T012053Z.root
-~~~
-{: .language-bash}
-
-
-but you will get error messages about "product not found".
-Inspection of `ClusterCounter3_module.cc` in Session #1 shows that it is
-looking for input clusters. Let's see if the input file has any clusters,
-perhaps stored under a different module label than the one configured.
-
-Look at the contents of the input file:
-
-~~~
- product_sizes_dumper root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune-sp/full-reconstructed/2021/mc/out1/PDSPProd4a/18/80/06/50/PDSPProd4a_protoDUNE_sp_reco_stage1_p1GeV_35ms_sce_datadriven_18800650_2_20210414T012053Z.root | grep -i cluster
-~~~
-{: .language-bash}
-
-There are clusters with the module label "pandora", but none with the label
-`lineclusterdc`, the default you can find in the tmp.txt file above. Now edit `myana.fcl` to say
-
-~~~
-#include "analyzer_job.fcl"
-
-physics.analyzers.clusterana.module_type: "ClusterCounter3"
-physics.analyzers.clusterana.ClusterModuleLabel: "pandora"
-~~~
-{: .source}
-
-and run it again:
-
-~~~
- lar -c myana.fcl root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune-sp/full-reconstructed/2021/mc/out1/PDSPProd4a/18/80/06/50/PDSPProd4a_protoDUNE_sp_reco_stage1_p1GeV_35ms_sce_datadriven_18800650_2_20210414T012053Z.root
-~~~
-{: .language-bash}
-
-
-Lots of information on job configuration via FHiCL is available at this [link][redmine-327].
-
-#### Editing the example module and building it
-
-[YouTube Lecture Part 3](https://youtu.be/S29HEzIoGwc): Now in session #1, edit `${MRB_SOURCE}/protoduneana/protoduneana/TutorialExamples/ClusterCounter3_module.cc`
-
-Add
-
-~~~
-#include "TH1F.h"
-~~~
-{: .source}
-
-to the section with includes.
-
-Add a private data member
-
-~~~
-TH1F *fTutorialHisto;
-~~~
-{: .source}
-
-to the class. Create the histogram in the `beginJob()` method:
-
-~~~
-fTutorialHisto = tfs->make<TH1F>("TutorialHisto","NClus",100,0,500);
-~~~
-{: .source}
-
-Fill the histo in the `analyze()` method, after the loop over clusters:
-
-~~~
-fTutorialHisto->Fill(fNClusters);
-~~~
-{: .source}
-
-Go to session #2 and build it. The current working directory should be the build directory:
-
-~~~
-make install -j16
-~~~
-{: .language-bash}
-
-
-Note -- this is the quicker way to re-build a product. The `-j16` says to use 16 parallel processes,
-which matches the number of cores on a build node. The command
-
-~~~
-mrb i -j16
-~~~
-{: .language-bash}
-
-first does a cmake step -- it looks through all the `CMakeLists.txt` files and processes them,
-making makefiles. If you didn't edit a `CMakeLists.txt` file or add new modules or fcl files
-or other code, a simple make can save you some time in running the single-threaded `cmake` step.
-
-Rerun your program in session #3 (the run session)
-
-~~~
- lar -c myana.fcl root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune-sp/full-reconstructed/2021/mc/out1/PDSPProd4a/18/80/06/50/PDSPProd4a_protoDUNE_sp_reco_stage1_p1GeV_35ms_sce_datadriven_18800650_2_20210414T012053Z.root
-~~~
-{: .language-bash}
-
-Open the output file in a TBrowser:
-
-~~~
- root tutorial_hist.root
-~~~
-{: .language-bash}
-
-and browse it to see your new histogram. You can also run on some data.
-
-~~~
- lar -c myana.fcl -T dataoutputfile.root root://fndca1.fnal.gov/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune-sp/full-reconstructed/2020/detector/physics/PDSPProd4/00/00/53/87/np04_raw_run005387_0041_dl7_reco1_13832298_0_20201109T215042Z.root
-~~~
-{: .language-bash}
-
-The `-T dataoutputfile.root` changes the output filename for the `TTrees` and
-histograms to `dataoutputfile.root` so it doesn't clobber the one you made
-for the MC output.
-
-This iteration of course is rather slow -- rebuilding and running on files in `dCache`. Far better,
-if you are just changing histogram binning, for example, is to use the output TTree.
-`TTree::MakeClass` is a very useful way to make a script that reads in the `TBranches` of a `TTree` on
-a file. The workflow in this tutorial is also useful in case you decide to add more content
-to the example `TTree`.
-
-#### Run your program in the debugger
-
-##### gdb and ddd
-
-As of January 2025, the Fermilab license for the forge_tools ddt and map products has expired and will not be renewed. To debug programs, we now have access to command-line gdb and ddd. Instructions for using both are available on the web. The version of gdb that comes with SL7 is quite old, but gdb is set up with dunesw, so you get a version that can debug programs compiled with modern versions of gcc and clang. The GUI debugger ddd is installed both in the AL9 suite on the dunegpvms and in the default SL7 container. ddd uses gdb under the hood, but it provides convenience features for displaying data and setting breakpoints in the source window. There is an issue with assigning a pseudo-terminal in an SL7 container session that is fixed with a preloaded shared library.
-
-~~~
- source /etc/profile.d/ddd.sh
-~~~
-{: .language-bash}
-defines an alias for ddd that sets LD_PRELOAD before running the debugger GUI. Some of the advice below on using the forge_tools debugger, such as finding the appropriate version of the source and stepping through code to find bugs, also applies to running ddd and gdb at the command line.
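The underlying idiom is to scope the environment variable to the one command. A minimal sketch (the shared-library path below is a placeholder, not the real ddd fix):

```shell
# General pattern presumably used by ddd.sh: an environment variable set
# only for a single command. The library path is a placeholder.
# alias ddd='LD_PRELOAD=/path/to/pty-fix.so ddd'

# Demonstration of the same one-command scoping with a harmless variable:
MYVAR=hello sh -c 'echo "$MYVAR"'   # prints "hello"
```

The variable exists only in the environment of that one child process; the rest of your session is unaffected.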
-
-
-##### Old forge_tools ddt instructions
-
-[YouTube Lecture Part 4](https://youtu.be/xcgVKmpKgfw): In session #3 (the running session)
-
-~~~
- setup forge_tools
-
- ddt `which lar` -c myana.fcl root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune-sp/full-reconstructed/2021/mc/out1/PDSPProd4a/18/80/06/50/PDSPProd4a_protoDUNE_sp_reco_stage1_p1GeV_35ms_sce_datadriven_18800650_2_20210414T012053Z.root
-~~~
-{: .language-bash}
-
-Click the "Run" button in the window that pops up. The `which lar` is needed because ddt cannot find
-executables in your path -- you have to specify their locations explicitly.
-
-In session #1, look at `ClusterCounter3_module.cc` in a text editor that lets you know what the line numbers are.
-Find the line number that fills your new histogram. In the debugger window, select the "Breakpoints"
-tab in the bottom window, and use the right mouse button (sorry, Mac users -- you may need to get an external
-mouse if you are using VNC; `XQuartz` emulates a three-button mouse, I believe). Make sure the "line"
-radio button is selected, and type `ClusterCounter3_module.cc` for the filename. Set the breakpoint
-line at the line you want, for the histogram filling or some other place you find interesting. Click
-Okay, and "Yes" to the dialog box that says ddt doesn't know about the source code yet but will try to
-find it when it is loaded.
-
-Click the right green arrow to start the program. Watch the program in the Input/Output section.
-When the breakpoint is hit, you can browse the stack, inspect values (sometimes -- it is better when
-compiled with debug), set more breakpoints, etc.
-
-You will need Session #1 to search for code that ddt cannot find. Shared object libraries contain
-information about the location of the source code when it was compiled. So debugging something you
-just compiled usually results in a shared object that knows the location of the source, but installed
-code in CVMFS points to locations on the Jenkins build nodes.
-
-#### Looking for source code
-
-Your environment has lots of variables pointing at installed code. Look for variables like
-
-~~~
- PROTODUNEANA_DIR
-~~~
-{: .source}
-
-which points to a directory in `CVMFS`.
-
-~~~
- ls $PROTODUNEANA_DIR/source
- ls $LARDATAOBJ_DIR/include
-~~~
-{: .language-bash}
-
-are good examples of places to look for code.
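As a concrete (hedged) sketch, `grep -rn` over a product's `include` area is a quick way to locate a declaration. The tiny product tree below is a stand-in fabricated on the fly so the commands can be tried anywhere; in a real session `$LARDATAOBJ_DIR` would already point into CVMFS:

```shell
# Stand-in for a UPS product area (the real one is defined by "setup"
# and lives in CVMFS); we fabricate a tiny tree just to demonstrate grep.
LARDATAOBJ_DIR=$(mktemp -d)
mkdir -p "$LARDATAOBJ_DIR/include/lardataobj/RecoBase"
echo "class Cluster;" > "$LARDATAOBJ_DIR/include/lardataobj/RecoBase/Cluster.h"

# The search you would actually run against the installed headers:
grep -rn "class Cluster" "$LARDATAOBJ_DIR/include"
```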
-
-#### Checking out and committing code to the git repository
-
-For protoduneana and dunesw, this [wiki page][dunetpc-wiki-tutorial] is quite good. LArSoft uses GitHub with a pull-request model. See
-
-[https://cdcvs.fnal.gov/redmine/projects/larsoft/wiki/Developing_With_LArSoft][redmine-dev-larsoft]
-
-[https://cdcvs.fnal.gov/redmine/projects/larsoft/wiki/Working_with_GitHub][redmine-working-github]
-
-### Some handy tools for working with search paths
-
-Tom has written some scripts and defined some aliases for convenience -- finding files in search paths like FHICL_FILE_PATH or FW_SEARCH_PATH, and searching within those files for content. Have a look on the dunegpvms at /exp/dune/data/users/trj/texttools. There is a list of aliases in aliases.txt that can be run in your login script (such as .profile). Put the perl scripts and tkdiff and newtkdiff somewhere in your PATH. A common place to put your favorite convenience scripts is ${HOME}/bin, but make sure to add that to your PATH. The scripts tkdiff and newtkdiff are open-source graphical diff tools that run using TCL/TK.
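A minimal sketch of the `${HOME}/bin` setup described above (the guard avoids adding the directory to PATH twice if your login script is sourced more than once):

```shell
# Create a personal bin directory and prepend it to PATH exactly once.
mkdir -p "${HOME}/bin"
case ":$PATH:" in
  *":${HOME}/bin:"*) ;;                  # already on PATH; do nothing
  *) export PATH="${HOME}/bin:$PATH" ;;  # otherwise prepend it
esac
```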
-
-## Common errors and recovery
-
-#### Version mismatch between source code and installed products
-
-When you perform an mrbsetenv or a mrbslp, sometimes you get a version mismatch. The most common reason for this is that you have set up an older version of the dependent products. `Dunesw` depends on `protoduneana`, which depends on `dunecore`, which depends on `larsoft`, which depends on *art*, ROOT, GEANT4, and many other products. This [picture][dunesw-dependency-tree] shows the software dependency tree for dunesw v09_72_01_d00. If the source code is newer than the installed products, the versions may mismatch. You can check out an older version of the source code (see the example above) with
-
-~~~
- mrb g -t <tag> <repository>
-~~~
-{: .language-bash}
-
-Alternatively, if you have already checked out some code, you can switch to a different tag using your local clone of the git repository.
-
-~~~
- cd $MRB_SOURCE/<repository>
- git checkout <tag>
-~~~
-{: .language-bash}
-
-Try `mrbsetenv` again after checking out a consistent version.
-
-#### Telling what version is the right one
-
-The versions of dependent products for a product you're building from source are listed in the file `$MRB_SOURCE/<product>/ups/product_deps`.
-
-Sometimes you may want to know what the version number is of a product way down on the dependency tree so you can check out its source and edit it. Set up the product in a separate login session:
-
-~~~
- source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
- setup dunesw $DUNELAR_VERSION -q $DUNELAR_QUALIFIER
- ups active
-~~~
-{: .language-bash}
-
-It usually is a good idea to pipe the output through grep to find a particular product version. You can get dependency information with
-
-~~~
- ups depend dunesw $DUNELAR_VERSION -q $DUNELAR_QUALIFIER
-~~~
-{: .language-bash}
-
-Note: not all dependencies of dependent products are listed by this command. If a product is already listed, it sometimes is not listed a second time, even if two products in the tree depend on it. Some products are listed multiple times.
-
-There is a script in duneutil called `dependency_tree.sh` which makes graphical displays of dependency trees.
-
-#### Inconsistent build directory
-
-The directory $MRB_BUILDDIR contains copies of built code before it gets installed to localProducts. If you change versions of the source or delete things, sometimes the build directory will have clutter in it that has to be removed.
-
-~~~
- mrb z
-~~~
-{: .language-bash}
-
-will delete the contents of `$MRB_BUILDDIR` and you will have to type `mrbsetenv` again.
-
-~~~
- mrb zd
-~~~
-{: .language-bash}
-
-will also delete the contents of localProducts. This can be useful if you are removing code and want to make sure it is also gone from the installed area.
-
-### Inconsistent environment
-
-When you use UPS's setup command, a lot of variables get defined. For each product, a variable called `<PRODUCT>_DIR` is defined, which points to the location of the version and flavor of the product. UPS has a command `unsetup` which often succeeds in undoing what setup does, but it is not perfect. It is possible to get a polluted environment in which inconsistent versions of packages are set up and it is too hard to repair it one product at a time. Logging out, logging back in, and setting up the session again is often the best way to start fresh.
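To see what is currently set up, you can list these variables. A sketch (the product name and value below are stand-ins; in a real session the list comes from whatever `setup` commands you have run):

```shell
# Fake one product variable so the listing shows something in a bare shell;
# after "setup dunesw ..." there would be dozens of real entries.
export PROTODUNEANA_DIR="/tmp/fake/protoduneana/v0"   # stand-in value
env | grep -E '^[A-Z0-9_]+_DIR=' | sort
```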
-
-### The setup command is the wrong one
-
-If you have not sourced the DUNE software setup script
-
-~~~
- source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
-~~~
-{: .language-bash}
-
-you will find that the setup command used instead is one provided by the operating system; it requires root privilege to execute, and it will ask you for the root password. If you get in this situation, do not type the password: press ctrl-c, source the setup_dune.sh script, and try again.
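A quick, hedged way to check which `setup` you are about to run: after sourcing `setup_dune.sh`, `setup` shows up as a shell function or alias, whereas the OS version is a file under a system path. In a fresh shell, neither may be defined:

```shell
# "type" reports whether setup is a function/alias (DUNE) or a file (OS),
# or not defined at all (a fresh shell before sourcing anything).
type setup 2>/dev/null || echo "setup is not defined in this shell yet"
```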
-
-#### Compiler and linker warnings and errors
-
-Common messages from the `g++` compiler concern undeclared variables, uninitialized variables, mismatched parentheses or brackets, missing semicolons, checking unsigned variables to see if they are positive (yes, that's a warning!), and other things. mrb is set up to tell `g++` and clang to treat warnings as errors, so they will stop the build and you will have to fix them. Messages about undeclared variables, or about methods that aren't members of a class, often result from forgetting to include the appropriate header file.
-
-The linker has fewer ways to fail than the compiler. Usually the error message is "Undefined symbol". The compiler does not emit this message, so you always know this is in the link step. If you have an undefined symbol, one of three things may have gone wrong. 1) You may have mistyped it (usually this gets caught by the compiler because names are defined in header files). More likely, 2) You introduced a new dependency without updating the `CMakeLists.txt` file. Look in the `CMakeLists.txt` file that steers the building of the source code that has the problem. Look at other `CMakeLists.txt` files in other directories for examples of how to refer to libraries. ` MODULE_LIBRARIES` are linked with modules in the `ART_MAKE` blocks, and `LIB_LIBRARIES` are linked when building non-module libraries (free-floating source code, for algorithms). 3) You are writing new code and just haven't gotten around to finishing writing something you called.
-
-#### Out of disk quota
-
-Do not store data files on the app disk! Sometimes the app disk fills up nonetheless, and there is a quota of 100 GB per user on it. If you need more than that for several builds, you have some options. 1) Use `/exp/dune/data/users/<username>`. You have a 400 GB quota on this volume. It is slower than the app disk and can get even slower if many users are accessing it simultaneously or transferring large amounts of data to or from it. 2) Clean up some space on app. You may want to tar up an old release and store the tarball on the data volume or in `dCache` for later use.
-
-#### Runtime errors
-
-Segmentation faults: These do not throw errors that *art* can catch. They terminate the program immediately. Use the debugger to find out where they happened and why.
-
-Exceptions that are caught: the `ddt` debugger has in its menu a set of standard breakpoints. You can instruct the debugger to stop any time an exception is thrown. A common exception is a vector accessed past its size using `at()`, but often these are hard to track down because they could be anywhere. Start your program with the debugger, but it is often a good idea to turn off the break-on-exception feature until after the geometry has been read in. Some of the XML parsing code throws a lot of exceptions that are later caught as part of its normal mode of operation, and if you hit a breakpoint on each of these and push the "go" button with your mouse each time, you could be there all day. Wait until the initialization is over, press "pause", and then turn on the breakpoints by exception.
-
-If you miss, start the debugging session over again. Starting the session over is also a useful technique when you want to know what happened *before* a known error condition occurs. You may find yourself asking "how did it get in *that* condition?" Set a breakpoint that's earlier in the execution and restart the session. Keep backing up -- it's kind of like running the program in reverse, but it's very slow. Sometimes it's the only way.
-
-Print statements are also quite useful for rare error conditions. If a piece of code fails infrequently, based on the input data, sometimes a breakpoint is not very useful because most of the time it's fine and you need to catch the program in the act of misbehaving. Putting in a low-tech print statement, sometimes with a uniquely-identifying string so you can grep the output, can let you put some logic in there to print only when things have gone bad, or even if you print on each iteration, you can just look at the last bit of printout before a crash.
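On the retrieval side, a unique tag makes the interesting lines trivial to pull out of a long log with grep. A toy sketch (the tag `MYDEBUG` and the log contents are invented for illustration):

```shell
# Simulate a job log containing one tagged debug line, then grep for the tag.
log=$(mktemp)
printf 'event 1 ok\nMYDEBUG: NClusters=612 looks suspicious\nevent 2 ok\n' > "$log"
grep MYDEBUG "$log"
```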
-
-#### No authentication/permission
-
-You will almost always need to have a valid Kerberos ticket in your session. Accessing your home directory on the Fermilab machines requires it. Find your tickets with the command
-
-~~~
- klist
-~~~
-{: .language-bash}
-
-By default, they last for 25 hours or so (a bit more than a day). You can refresh them for another 25 hours (up to
-one week's worth of refreshes are allowed) with
-
-~~~
- kinit -R
-~~~
-{: .language-bash}
-
-If you have a valid ticket on one machine and want to refresh tickets on another, you can
-
-~~~
-k5push
-~~~
-{: .language-bash}
-
-The safest way to get a new ticket on a machine is to kinit on your local computer (like your laptop) and log in again,
-making sure to forward all tickets. In a pinch, you can run kinit on a dunegpvm and enter your Kerberos password, but this is discouraged because bad actors can install (and have installed!) keyloggers on shared systems and have stolen passwords. DO NOT KEEP PRIVATE, PERSONAL INFORMATION ON FERMILAB COMPUTERS! Things like bank account numbers, passwords, and social security numbers are definitely not to be stored on public, shared computers. Running `kinit -R` on a shared machine is fine.
-
-You will need a grid proxy to submit jobs and access data in `dCache` via `xrootd` or `ifdh`.
-
-~~~
- setup_fnal_security
-~~~
-{: .language-bash}
-
-will use your valid Kerberos ticket to generate the necessary certificates and proxies.
-
-#### Link to art/LArSoft tutorial May 2021
-
-
-[https://wiki.dunescience.org/wiki/Presentation_of_LArSoft_May_2021][dune-larsoft-may21]
-
-
-[dunetpc-wiki]: https://cdcvs.fnal.gov/redmine/projects/dunetpc/wiki/_Tutorial_
-[mrb-reference-guide]: https://cdcvs.fnal.gov/redmine/projects/mrb/wiki/MrbRefereceGuide
-[dune-wiki-protodune-sp]: https://wiki.dunescience.org/wiki/Look_at_ProtoDUNE_SP_data
-[redmine-327]: https://cdcvs.fnal.gov/redmine/documents/327
-[dunetpc-wiki-tutorial]: https://cdcvs.fnal.gov/redmine/projects/dunetpc/wiki/_Tutorial_
-[redmine-dev-larsoft]: https://cdcvs.fnal.gov/redmine/projects/larsoft/wiki/Developing_With_LArSoft
-[redmine-working-github]: https://cdcvs.fnal.gov/redmine/projects/larsoft/wiki/Working_with_GitHub
-[dune-larsoft-may21]: https://wiki.dunescience.org/wiki/Presentation_of_LArSoft_May_2021
-[dunesw-dependency-tree]: https://wiki.dunescience.org/w/img_auth.php/6/6f/Dunesw_v09_72_01_e20_prof_graph.pdf
-
-{%include links.md%}
-
-
diff --git a/_episodes/07-grid-job-submission.md b/_episodes/07-grid-job-submission.md
deleted file mode 100644
index 0c7e1b9..0000000
--- a/_episodes/07-grid-job-submission.md
+++ /dev/null
@@ -1,574 +0,0 @@
----
-title: Grid Job Submission and Common Errors
-teaching: 65
-exercises: 0
-questions:
-- How to submit grid jobs?
-objectives:
-- Submit a basic batch job and understand what's happening behind the scenes
-- Monitor the job and look at its outputs
-- Review best practices for submitting jobs (including what NOT to do)
-- Extension; submit a small job with POMS
-keypoints:
-- When in doubt, ask! Understand that policies and procedures that seem annoying, overly complicated, or unnecessary (especially when compared to running an interactive test) are there to ensure efficient operation and scalability. They are also often the result of someone breaking something in the past, or of simpler approaches not scaling well.
-- Send test jobs after creating new workflows or making changes to existing ones. If things don't work, don't blindly resubmit and expect things to magically work the next time.
-- Only copy what you need in input tar files. In particular, avoid copying log files, .git directories, temporary files, etc. from interactive areas.
-- Take care to follow best practices when setting up input and output file locations.
-- Always, always, always prestage input datasets. No exceptions.
----
-
-> ## Note:
-> This section describes basic job submission. Large scale submission of jobs to read DUNE data files are described in the [next section]({{ site.baseurl }}/08-submit-jobs-w-justin/index.html).
-
-#### Session Video
-
-This session will be captured on video and placed here after the workshop for asynchronous study.
-
-The video from the two day version of this training in May 2022 is provided [here](https://www.youtube.com/embed/QuDxkhq64Og) as a reference.
-
-
-
-#### Live Notes
-
-Participants are encouraged to monitor and utilize the [livedoc](https://docs.google.com/document/d/1QNK-hKPqLIVaecRyg9q4QZOHNwAZgq32oHVuboG_AvQ/edit?usp=sharing) to ask questions and learn. For reference, the [Livedoc from Jan. 2023](https://docs.google.com/document/d/1sgRQPQn1OCMEUHAk28bTPhZoySdT5NUSDnW07aL-iQU/edit?usp=sharing) is provided.
-
-#### Temporary Instructor Note:
-
-The May 2023 training event was cloned from the [May 2022](https://github.com/DUNE/computing-training-basics/blob/gh-pages/_episodes/) event; both were two-day events.
-
-This lesson (07-grid-job-submission.md) was imported from the [Jan. 2023 lesson](https://github.com/DUNE/computing-training-basics-short/blob/gh-pages/_episodes/07-grid-job-submission.md) which was a one half day version of the training.
-
-Quiz blocks have been added at the bottom of this page; we invite your review, modifications, and additional comments.
-
-The official timetable for this training event is on the [Indico site](https://indico.fnal.gov/event/59762/timetable/#20230524).
-
-## Notes on changes in the 2023/2024 versions
-
-The past few months have seen significant changes in how DUNE (as well as other FNAL experiments) submits jobs and interacts with storage elements. While every effort was made to preserve backward compatibility, a few things will be slightly different (and some are easier!) than what was shown in previous versions. Therefore, even if you've attended this tutorial multiple times in the past and know the difference between copying and streaming, tokens vs. proxies, and know your schedds from your shadows, you are encouraged to attend this session. Here is a partial list of significant changes:
-
-* The jobsub_client product generally used for job submission has been replaced by the [jobsub_lite](https://fifewiki.fnal.gov/wiki/Jobsub_Lite)
- product, which is very similar to jobsub_client except there is no server on the other side (i.e. there is more direct HTCondor interaction). You no longer need to set up the jobsub_client product as part of your software setup; it is installed via RPM now on all DUNE interactive machines. See [this Wiki page](https://fifewiki.fnal.gov/wiki/Differences_between_jobsub_lite_and_legacy_jobsub_client/server) for some differences between jobsub_lite and legacy jobsub.
-* __As of May 2024 you cannot submit batch jobs from SL7 containers, but many submission scripts only run on SL7. You need to record the submission command from SL7 and then open a separate window running Alma9 and execute that command.__
-* Authentication via tokens instead of proxies is now rolling out and is now the primary authentication method. Please note that not only are tokens used for job submission now, they are also used for storage element access.
-* It is no longer possible to write to certain directories from grid jobs as analysis users, namely the persistent area. Read access to the full /pnfs tree is still available. Bulk copies of job outputs from scratch to persistent have to be done outside of grid jobs.
-* Multiple `--tar_file_name` options are now supported (and will be unpacked) if you need things in multiple tarballs.
-* The `-f` behavior with and without dropbox:// in front is slightly different from legacy jobsub; see the [documentation](https://fifewiki.fnal.gov/wiki/Differences_between_jobsub_lite_and_legacy_jobsub_client/server#Bug_with_-f_dropbox:.2F.2F.2Fa.2Fb.2Fc.tar) for details.
-* jobsub_lite will probably not work directly from lxplus at the moment, though work is underway to make it possible to submit batch jobs to non-FNAL schedulers.
-
-## Submit a job
-
-**Note that job submission requires an FNAL account or access to another HTCondor submission point connected to the Fermilab pool (currently BNL or RAL).**
-
-First, log in to a `dunegpvm` machine. Then you will need to set up the job submission tools (`jobsub`). If you set up `dunesw` it will be included, but if not, you need to do
-
-```bash
-mkdir -p /pnfs/dune/scratch/users/${USER}/DUNE_tutorial_may2023 # if you have not done this before
-mkdir -p /pnfs/dune/scratch/users/${USER}/may2023tutorial
-```
-Having done that, let us submit a prepared script:
-
-~~~
-jobsub_submit -G dune --mail_always -N 1 --memory=1000MB --disk=1GB --cpu=1 --expected-lifetime=1h --singularity-image /cvmfs/singularity.opensciencegrid.org/fermilab/fnal-wn-sl7:latest --append_condor_requirements='(TARGET.HAS_Singularity==true&&TARGET.HAS_CVMFS_dune_opensciencegrid_org==true&&TARGET.HAS_CVMFS_larsoft_opensciencegrid_org==true&&TARGET.CVMFS_dune_opensciencegrid_org_REVISION>=1105)' -e GFAL_PLUGIN_DIR=/usr/lib64/gfal2-plugins -e GFAL_CONFIG_DIR=/etc/gfal2.d file:///exp/dune/app/users/kherner/submission_test_singularity.sh
-~~~
-{: .language-bash}
-
-If all goes well you should see something like this:
-
-~~~
-Attempting to get token from https://htvaultprod.fnal.gov:8200 ... succeeded
-Storing bearer token in /tmp/bt_token_dune_Analysis_11469
-Transferring files to web sandbox...
-Copying file:///nashome/k/kherner/.cache/jobsub_lite/js_2023_05_21_205736_877318d7-d14a-4c5a-b2fc-7486f1e54fa2/submission_test_singularity.sh [DONE] after 2s
-Copying file:///nashome/k/kherner/.cache/jobsub_lite/js_2023_05_21_205736_877318d7-d14a-4c5a-b2fc-7486f1e54fa2/simple.cmd [DONE] after 5s
-Copying file:///nashome/k/kherner/.cache/jobsub_lite/js_2023_05_21_205736_877318d7-d14a-4c5a-b2fc-7486f1e54fa2/simple.sh [DONE] after 5s
-Submitting job(s).
-1 job(s) submitted to cluster 710165.
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-
-Use job id 710165.0@jobsub05.fnal.gov to retrieve output
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-
-~~~
-{: .output}
-
-> ## Quiz
->
-> 1. What is your job ID?
->
-{: .solution}
-
-
-*Note if you have not submitted a DUNE batch job within the past 30 days:* you may be asked to (re-)authenticate. If so you will see the following:
-
-~~~
-Attempting OIDC authentication with https://htvaultprod.fnal.gov:8200
-
-Complete the authentication at:
- https://cilogon.org/device/?user_code=ABC-D1E-FGH
-No web open command defined, please copy/paste the above to any web browser
-Waiting for response in web browser
-~~~
-
-The user code will be different, of course. In this particular case, you do want to follow the instructions and copy and paste the link into your browser (it can be any browser). There is a time limit on it, so it's best to do it right away. Always choose Fermilab as the identity provider in the menu, even if your home institution is listed. After you hit "Log On", you'll get a message saying you approved the access request, and then after a short delay (it may be several seconds) in the terminal you will see
-
-~~~
-Saving credkey to /nashome/u/username/.config/htgettoken/credkey-dune-default
-Saving refresh token ... done
-Attempting to get token from https://htvaultprod.fnal.gov:8200 ... succeeded
-Storing bearer token in /tmp/bt_token_dune_Analysis_somenumber.othernumber
-Storing condor credentials for dune
-Submitting job(s)
-.
-1 job(s) submitted to cluster 57110235.
-~~~
-
-Now, let's look at some of these options in more detail.
-
-* `--mail_always` sends mail after the job completes whether it was successful or not. To disable all emails, use `--mail_never`.
-* `-N` controls the number of identical jobs submitted with each cluster. Also called the process ID, the number ranges from 0 to N-1 and forms the part of the job ID number after the period, e.g. 12345678.N.
-* `--memory, --disk, --cpu, --expected-lifetime` request this much memory, disk, number of cpus, and max run time. Jobs that exceed the requested amounts will go into a held state. Defaults are 2000 MB, 10 GB, 1, and 8h, respectively. Note that jobs are charged against the DUNE FermiGrid quota according to the greater of memory/2000 MB and number of CPUs, with fractional values possible. For example, a 3000 MB request is charged 1.5 "slots", and 4000 MB would be charged 2. You are charged for the amount **requested**, not what is actually used, so you should not request any more than you actually need (your jobs will also take longer to start the more resources you request). Note also that jobs that run offsite do NOT count against the FermiGrid quota. **In general, aim for memory and run time requests that will cover 90-95% of your jobs and use the [autorelease feature][job-autorelease] to deal with the remainder**.
-* `-l` (or `--lines=`) allows you to pass additional arbitrary HTCondor-style `classad` variables into the job. In this case, we're specifying exactly what `Singularity` image we want to use in the job. It will be automatically set up for us when the job starts. Any other valid HTCondor `classad` is possible. In practice you don't have to do much beyond the `Singularity` image. Here, pay particular attention to the quotes and backslashes.
-* `--append_condor_requirements` allows you to pass additional `HTCondor-style` requirements to your job. This helps ensure that your jobs don't start on a worker node that might be missing something you need (a corrupt or out of date `CVMFS` repository, for example). Some checks run at startup for a variety of `CVMFS` repositories. Here, we check that Singularity invocation is working and that the `CVMFS` repos we need ( [dune.opensciencegrid.org][dune-openscience-grid-org] and [larsoft.opensciencegrid.org][larsoft-openscience-grid-org] ) are in working order. Optionally you can also place version requirements on CVMFS repos (as we did here as an example), useful in case you want to use software that was published very recently and may not have rolled out everywhere yet.
-* `-e VAR=VAL` will set the environment variable VAR to the value VAL inside the job. You can pass this option multiple times for each variable you want to set. You can also just do `-e VAR` and that will set VAR inside the job to be whatever value it's set to in your current environment (make sure it's actually set though!) **One thing to note here as of May 2023 is that these two gfal variables may need to be set as shown to prevent problems with output copyback at a few sites.** It is safe to set these variable to the values shown here in all jobs at all sites, since the locations exist in the default container (assuming you're using that).
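The slot-charging rule above can be checked with a few lines of arithmetic. This helper simply restates the formula from the text, charge = max(memory / 2000 MB, number of CPUs):

```shell
# Compute the FermiGrid "slots" charged for a resource request.
mem_mb=3000
ncpu=1
awk -v m="$mem_mb" -v c="$ncpu" \
  'BEGIN { s = m/2000.0; v = (s > c) ? s : c; print v " slots" }'
# prints "1.5 slots"
```

Requesting 4000 MB with one CPU would print "2 slots", matching the examples in the text.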
-
-## Job Output
-
-This particular test writes a file to `/pnfs/dune/scratch/users/<username>/job_output_<jobid>.log`.
-Verify that the file exists and is non-zero size after the job completes.
-You can delete it after that; it just prints out some information about the environment.
-
-## Manipulating submitted jobs
-
-If you want to remove existing jobs, you can do
-
-```bash
-jobsub_rm -G dune --jobid=12345678.9@jobsub0N.fnal.gov
-```
-
-To remove all jobs in a given submission (i.e. if you used -N with more than one job) you can do
-
-```bash
-jobsub_rm -G dune --jobid=12345678@jobsub0N.fnal.gov
-```
-To remove all of your jobs, you can do
-```bash
-jobsub_rm -G dune --user=username
-```
-If you want to manipulate only a certain subset of jobs, you can use a HTCondor-style constraint. For example, if I want to remove only held jobs asking for more than say 8 GB of memory that went held because they went over their request, I could do something like
-```bash
-jobsub_rm -G dune --constraint='Owner=="username"&&JobStatus==5&&RequestMemory>=8000&&(HoldReasonCode==34||(HoldReasonCode==26&&HoldReasonSubCode==1))'
-```
-To hold jobs, it's the same procedure as `jobsub_rm`; just replace that with `jobsub_hold`. To release a held job (which will restart from the beginning), it's the same commands as above, only use `jobsub_release` in place of rm or hold.
-
-If you get tired of typing `-G dune` all the time, you can set the JOBSUB_GROUP environment variable to dune and then omit the -G option.
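For instance (a session-local convenience; adjust if you submit for more than one experiment in the same session):

```shell
# With JOBSUB_GROUP set, -G dune can be dropped from jobsub commands
# (jobsub_submit, jobsub_q, jobsub_rm, ...).
export JOBSUB_GROUP=dune
echo "jobsub will default to group: $JOBSUB_GROUP"   # prints "jobsub will default to group: dune"
```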
-
-## Submit a job using the tarball containing custom code
-
-First off, a very important point: for running analysis jobs, **you may not actually need to pass an input tarball**, especially if you are just using code from the base release and you don't actually modify any of it. In that case, it is much more efficient to use everything from the release and refrain from using a tarball.
-All you need to do is set up any required software from CVMFS (e.g. dunetpc and/or protoduneana), and you are ready to go.
-If you're just modifying a fcl file, for example, but no code, it's actually more efficient to copy just the fcl file(s) you're changing to the scratch directory within the job and edit them as part of your job script (copies of a fcl file in the current working directory have priority over others by default).
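A hedged sketch of that fcl-only pattern as it might appear inside a job script (the fcl name, contents, and override are stand-ins, and the "copy" step is faked here so the snippet runs anywhere; in a real job you would copy the fcl from CVMFS or your tarball):

```shell
# Work in the job's scratch area (stand-in for the HTCondor job directory).
work=$(mktemp -d)
cd "$work"

# Fabricate a minimal fcl to stand in for the copied file,
# then append the override as the job script would.
printf '#include "analyzer_job.fcl"\n' > myana.fcl
echo 'physics.analyzers.clusterana.ClusterModuleLabel: "pandora"' >> myana.fcl

cat myana.fcl   # the local copy takes priority when lar runs from here
```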
-
-Sometimes, though, we need to run some custom code that isn't in a release.
-We need a way to efficiently get code into jobs without overwhelming our data transfer systems.
-We have to make a few minor changes to the scripts you made in the previous tutorial section, generate a tarball, and invoke the proper jobsub options to get that into your job.
-There are many ways of doing this but by far the best is to use the Rapid Code Distribution Service (RCDS), as shown in our example.
-
-If you have finished up the LArSoft follow-up and want to use your own code for this next attempt, feel free to tar it up (you don't need anything besides the localProducts* and work directories) and use your own tar ball in lieu of the one in this example.
-You will have to change the last line with your own submit file instead of the pre-made one.
-
-First, we should make a tarball. Here is what we can do (assuming you are starting from /exp/dune/app/users/username/):
-
-```bash
-cp /exp/dune/app/users/kherner/setupmay2023tutorial-grid.sh /exp/dune/app/users/${USER}/
-cp /exp/dune/app/users/kherner/may2023tutorial/localProducts_larsoft_v09_72_01_e20_prof/setup-grid /exp/dune/app/users/${USER}/may2023tutorial/localProducts_larsoft_v09_72_01_e20_prof/setup-grid
-```
-
-Before we continue, let's examine these files a bit. We will source the first one in our job script, and it will set up the environment for us.
-
-~~~
-#!/bin/bash
-
-DIRECTORY=may2023tutorial
-# we cannot rely on "whoami" in a grid job. We have no idea what the local username will be.
-# Use the GRID_USER environment variable instead (set automatically by jobsub).
-USERNAME=${GRID_USER}
-
-source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
-export WORKDIR=${_CONDOR_JOB_IWD} # if we use the RCDS then our tarball will be placed in $INPUT_TAR_DIR_LOCAL.
-if [ ! -d "$WORKDIR" ]; then
- export WORKDIR=.
-fi
-
-source ${INPUT_TAR_DIR_LOCAL}/${DIRECTORY}/localProducts*/setup-grid
-mrbslp
-~~~
-{: .source}
-
-
-Now let's look at the difference between the setup-grid script and the plain setup script.
-Assuming you are currently in the /exp/dune/app/users/username directory:
-
-```bash
-diff may2023tutorial/localProducts_larsoft_v09_72_01_e20_prof/setup may2023tutorial/localProducts_larsoft_v09_72_01_e20_prof/setup-grid
-```
-
-~~~
-< setenv MRB_TOP "/exp/dune/app/users//may2023tutorial"
-< setenv MRB_TOP_BUILD "/exp/dune/app/users//may2023tutorial"
-< setenv MRB_SOURCE "/exp/dune/app/users//may2023tutorial/srcs"
-< setenv MRB_INSTALL "/exp/dune/app/users//may2023tutorial/localProducts_larsoft_v09_72_01_e20_prof"
----
-> setenv MRB_TOP "${INPUT_TAR_DIR_LOCAL}/may2023tutorial"
-> setenv MRB_TOP_BUILD "${INPUT_TAR_DIR_LOCAL}/may2023tutorial"
-> setenv MRB_SOURCE "${INPUT_TAR_DIR_LOCAL}/may2023tutorial/srcs"
-> setenv MRB_INSTALL "${INPUT_TAR_DIR_LOCAL}/may2023tutorial/localProducts_larsoft_v09_72_01_e20_prof"
-~~~
-
-As you can see, we have switched from the hard-coded directories to directories defined by environment variables; the `INPUT_TAR_DIR_LOCAL` variable will be set for us (see below).
-Now, let's actually create our tar file. Again assuming you are in `/exp/dune/app/users/kherner/`:
-```bash
-tar --exclude '.git' -czf may2023tutorial.tar.gz may2023tutorial/localProducts_larsoft_v09_72_01_e20_prof may2023tutorial/work setupmay2023tutorial-grid.sh
-```
-Note how we have excluded the contents of ".git" directories in the various packages, since we don't need any of that in our jobs. It turns out that the .git directory can sometimes account for a substantial fraction of a package's size on disk!
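You can check the effect of the exclusion on a throwaway directory before tarring up your real area (directory and file names below are made up for the demo):

```shell
# Throwaway demo of the --exclude '.git' behavior
mkdir -p demo/srcs/.git demo/work
echo "code" > demo/srcs/module.cc
echo "junk" > demo/srcs/.git/config
tar --exclude '.git' -czf demo.tar.gz demo
tar -tzf demo.tar.gz   # module.cc is listed; nothing under .git appears
```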
-
-Then submit another job (in the following we keep the same submit file as above):
-
-```bash
-jobsub_submit -G dune --mail_always -N 1 --memory=2500MB --disk=2GB --expected-lifetime=3h --cpu=1 --tar_file_name=dropbox:///exp/dune/app/users//may2023tutorial.tar.gz --singularity-image /cvmfs/singularity.opensciencegrid.org/fermilab/fnal-wn-sl7:latest --append_condor_requirements='(TARGET.HAS_Singularity==true&&TARGET.HAS_CVMFS_dune_opensciencegrid_org==true&&TARGET.HAS_CVMFS_larsoft_opensciencegrid_org==true&&TARGET.CVMFS_dune_opensciencegrid_org_REVISION>=1105&&TARGET.HAS_CVMFS_fifeuser1_opensciencegrid_org==true&&TARGET.HAS_CVMFS_fifeuser2_opensciencegrid_org==true&&TARGET.HAS_CVMFS_fifeuser3_opensciencegrid_org==true&&TARGET.HAS_CVMFS_fifeuser4_opensciencegrid_org==true)' -e GFAL_PLUGIN_DIR=/usr/lib64/gfal2-plugins -e GFAL_CONFIG_DIR=/etc/gfal2.d file:///exp/dune/app/users/kherner/run_may2023tutorial.sh
-```
-
-You'll see this is very similar to the previous case, but there are some new options:
-
-* `--tar_file_name=dropbox://` automatically **copies and untars** the given tarball into a directory on the worker node, accessed via the INPUT_TAR_DIR_LOCAL environment variable in the job. The value of INPUT_TAR_DIR_LOCAL is by default $CONDOR_DIR_INPUT/name_of_tar_file_without_extension, so if you have a tar file named e.g. may2023tutorial.tar.gz, it would be $CONDOR_DIR_INPUT/may2023tutorial.
-* Notice that the `--append_condor_requirements` line is longer now, because we also check for the fifeuser[1-4].opensciencegrid.org CVMFS repositories.
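A quick sketch of the default-name convention: the parameter expansion below mirrors how the tarball name maps to the directory (the `$CONDOR_DIR_INPUT` part is only printed literally here, since the batch system sets it inside the job):

```shell
TARBALL=may2023tutorial.tar.gz
NAME=${TARBALL%.tar.gz}   # strip the extension -> may2023tutorial
echo "Default INPUT_TAR_DIR_LOCAL: \${CONDOR_DIR_INPUT}/${NAME}"
```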
-
-The submission output will look something like this:
-
-~~~
-Attempting to get token from https://htvaultprod.fnal.gov:8200 ... succeeded
-Storing bearer token in /tmp/bt_token_dune_Analysis_11469
-Using bearer token located at /tmp/bt_token_dune_Analysis_11469 to authenticate to RCDS
-Checking to see if uploaded file is published on RCDS
-Could not locate uploaded file on RCDS. Will retry in 30 seconds.
-Could not locate uploaded file on RCDS. Will retry in 30 seconds.
-Could not locate uploaded file on RCDS. Will retry in 30 seconds.
-Found uploaded file on RCDS.
-Transferring files to web sandbox...
-Copying file:///nashome/k/kherner/.cache/jobsub_lite/js_2023_05_24_224713_9669e535-daf9-496f-8332-c6ec8a4238d9/run_may2023tutorial.sh [DONE] after 0s
-Copying file:///nashome/k/kherner/.cache/jobsub_lite/js_2023_05_24_224713_9669e535-daf9-496f-8332-c6ec8a4238d9/simple.cmd [DONE] after 0s
-Copying file:///nashome/k/kherner/.cache/jobsub_lite/js_2023_05_24_224713_9669e535-daf9-496f-8332-c6ec8a4238d9/simple.sh [DONE] after 0s
-Submitting job(s).
-1 job(s) submitted to cluster 62007523.
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-
-Use job id 62007523.0@jobsub01.fnal.gov to retrieve output
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-~~~
-
-Note that the job submission will pause while it uploads the tarball to RCDS, and then it continues normally.
-
-Now, there's a small gotcha when using the RCDS: when your job runs, the files from the unpacked tarball are placed in your work area as symlinks to the CVMFS copy of each file (which is what you want, since the whole point is to avoid N separate copies of everything).
-The catch is that if your job script expects to edit one or more of those files within the job, it won't work, because the links point to a read-only area.
-Fortunately there's a very simple trick you can do in your script before trying to edit any such files:
-
-~~~
-cp ${INPUT_TAR_DIR_LOCAL}/file_I_want_to_edit mytmpfile # do a cp, not mv
-rm ${INPUT_TAR_DIR_LOCAL}/file_I_want_to_edit # This really just removes the link
-mv mytmpfile file_I_want_to_edit # now it's available as an editable regular file.
-~~~
-
-You certainly don't want to do this for every file, but for a handful of small text files this is perfectly acceptable and the overall benefits of copying in code via the RCDS far outweigh this small cost.
-This can get a little complicated when trying to do it for things several directories down, so it's easiest to have such files in the top level of your tar file.
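If you need this trick in several places, you can wrap it in a small (hypothetical) shell function; the self-test below uses a local directory as a stand-in for the read-only `${INPUT_TAR_DIR_LOCAL}` area:

```shell
# Hypothetical helper: replace a read-only symlink with an editable copy
make_editable() {
  f=$(basename "$1")
  cp "$1" "${f}.tmp"   # cp follows the symlink and copies the data
  rm -f "$f"           # removes only the link in the current directory
  mv "${f}.tmp" "$f"   # now a regular, writable file
}

# Self-test with a local stand-in for ${INPUT_TAR_DIR_LOCAL}
mkdir -p fake_tar_dir
echo "original" > fake_tar_dir/params.txt
ln -sf fake_tar_dir/params.txt params.txt   # mimic the RCDS symlink
make_editable fake_tar_dir/params.txt
echo "edited" >> params.txt                 # works: no longer a symlink
```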
-
-
-
-
-
-## Monitor your jobs
-For all links below, log in with your FNAL Services credentials (FNAL email, not Kerberos password).
-
-* What DUNE is doing overall:
-[https://fifemon.fnal.gov/monitor/d/000000053/experiment-batch-details?orgId=1&var-experiment=dune](https://fifemon.fnal.gov/monitor/d/000000053/experiment-batch-details?orgId=1&var-experiment=dune)
-
-
-* What's going on with only your jobs:
-Remember to change the url with your own username and adjust the time range to cover the region of interest.
-[https://fifemon.fnal.gov/monitor/d/000000116/user-batch-details?orgId=1&var-cluster=fifebatch&var-user=kherner](https://fifemon.fnal.gov/monitor/d/000000116/user-batch-details?orgId=1&var-cluster=fifebatch&var-user=kherner)
-
-* Why your jobs are held:
-Remember to choose your username in the upper left.
-[https://fifemon.fnal.gov/monitor/d/000000146/why-are-my-jobs-held?orgId=1](https://fifemon.fnal.gov/monitor/d/000000146/why-are-my-jobs-held?orgId=1)
-
-## View the stdout/stderr of our jobs
-Here's the link for the history page of the example job: [link](https://fifemon.fnal.gov/monitor/d/000000115/job-cluster-summary?orgId=1&var-cluster=40351757&var-schedd=jobsub01.fnal.gov&from=1611098894726&to=1611271694726).
-
-Feel free to sub in the link for your own jobs.
-
-Once there, click "View Sandbox files (job logs)".
-In general you want the .out and .err files for stdout and stderr.
-The .cmd file can sometimes be useful to see exactly what got passed in to your job.
-
-[Kibana][kibana] can also provide a lot of information.
-
-You can also download the job logs from the command line with jobsub_fetchlog:
-
-```bash
-jobsub_fetchlog --jobid=12345678.0@jobsub0N.fnal.gov --unzipdir=some_appropriately_named_directory
-```
-
-That will download them as a tarball and unzip it into the directory specified by the --unzipdir option.
-Of course replace 12345678.0@jobsub0N.fnal.gov with your own job ID.
-
-> ## Quiz
->
-> Download the log of your last submission via jobsub_fetchlog or look it up on the monitoring pages. Then answer the following questions (all should be available in the .out or .err files):
-> 1. On what site did your job run?
-> 2. How much memory did it use?
-> 3. Did it exit abnormally? If so, what was the exit code?
->
-{: .challenge}
-
-## Review of best practices in grid jobs (and a bit on the interactive machines)
-
-* When creating a new workflow or making changes to an existing one, **ALWAYS test with a single job first**. Then go up to 10, etc. Don't submit thousands of jobs immediately and expect things to work.
-* **ALWAYS** be sure to prestage your input datasets before launching large sets of jobs. This may become less necessary in the future as we move to distributed storage locations.
-* **Use RCDS**; do not copy tarballs from places like scratch dCache. There's a finite amount of transfer bandwidth available from each dCache pool. If you absolutely cannot use RCDS for a given file, it's better to put it in resilient (but be sure to remove it when you're done!). The same goes for copying files from within your own job script: if you have a large number of jobs looking for the same file, get it from resilient. Remove the copy when no longer needed. Files in resilient dCache that go unaccessed for 45 days are automatically removed.
-* Be careful about placing your output files. **NEVER place more than a few thousand files into any one directory inside dCache.** That goes for all types of dCache (scratch, persistent, resilient, etc.). Subdirectories also count against the total for these purposes, so don't put too many subdirectories at any one level.
-* **AVOID** commands like `ifdh ls /some/path` inside grid jobs unless absolutely necessary. That is an expensive operation and can cause a lot of pain for many users, especially when a directory has a large number of files in it. Remote listings take much, much longer than the corresponding operation on a machine where the directory is mounted via NFS. If you just need to verify a directory exists, there are much better ways than `ifdh ls`, for example the `gfal-stat` command. Note also that `ifdh cp` will now, by default, create an output directory if it does not exist (so be careful that you've specified your output string properly).
-* Use xrootd when opening files interactively; this is much more stable than simply doing `root /pnfs/dune/...` (and in general, do NOT do that...).
-* **NEVER** copy job outputs to a directory in resilient dCache. Remember that they are replicated by a factor of 20! **Any such files are subject to deletion without warning**.
-* **NEVER** do hadd on files in `/pnfs` areas unless you're using `xrootd`, i.e. do NOT do `hadd out.root /pnfs/dune/file1 /pnfs/dune/file2 ...`. This can cause severe performance degradation.
-* Generally aim for output file sizes of 1 GB or greater. dCache is really not a fan of small files. You may need to process multiple input files to get to that size (and we generally encourage that anyway!)
-* Very short jobs (measured in minutes) are quite inefficient, especially if you have an input tarball. In general you want to run for at least a few hours, and 8-12 is probably ideal (and of course longer jobs are allowed). Again you may need to process multiple input files, depending on what you're doing, or run multiple workflow stages in the same job. See the POMS section of the tutorial for more details.
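For the dCache directory-count rule above, one common pattern is to shard outputs into numbered subdirectories. A toy sketch (small numbers so it runs quickly; file names are made up):

```shell
# Shard output files so no single directory collects thousands of entries
NFILES=25
PER_DIR=10
for i in $(seq 0 $(( NFILES - 1 ))); do
  sub=$(( i / PER_DIR ))          # files 0-9 -> out/0, 10-19 -> out/1, ...
  mkdir -p "out/${sub}"
  touch "out/${sub}/hist_${i}.root"
done
ls out/   # subdirectories 0, 1 and 2
```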
-
-**Side note:** Some people pass file lists to their jobs instead of using a SAM dataset. We do not recommend that, for two reasons: 1) Lists do not protect you from cases where files fall out of cache at the location(s) in your list. When that happens your jobs sit idle waiting for the files to be fetched from tape, which kills your efficiency and blocks resources for others. 2) You miss out on cases where there might be a local copy of the file at the site you're running on, or at least one closer to it. So you may end up unnecessarily streaming across oceans, whereas SAM (or later Rucio) will find closer, local copies when they exist.
-
-**Another important side note:** If you are used to using other programs for your work, such as project.py (which is **NOT** officially supported by DUNE or the Fermilab Scientific Computing Division), there is a helpful tool called [Project-py][project-py-guide] that you can use to convert existing xml into POMS configs, so you don't need to start from scratch! Then you can simply switch to using POMS from that point forward. As a reminder, if you use unsupported tools you are on your own and will receive NO SUPPORT WHATSOEVER. You are still responsible for making sure that your jobs satisfy Fermilab's policy for job efficiency: https://cd-docdb.fnal.gov/cgi-bin/sso/RetrieveFile?docid=7045&filename=FIFE_User_activity_mitigation_policy_20200625.pdf&version=1
-
-## The cost of getting it wrong: a cautionary tale
-
-Earlier in May 2023 there was a fairly significant disruption to FNAL dCache, which resulted in at least five different tickets across four different experiments complaining of poor performance (resulting in jobs going held for exceeding time), timeouts, or other storage-related failures. It's unclear exactly how many jobs were affected, but it was likely in the many thousands. The root cause was a DUNE user running `ifdh ls $OUTDIR 0` to check the existence of a given directory. That command, though it only prints the directory name, was in fact doing a full internal listing of the contents of $OUTDIR. Normally that's not the end of the world (see the comment in the best practices section), but this directory had over 100,000 files in it! From what we could tell, the user was writing all job outputs into the same directory.
-
-Since the workflow was causing a systemwide disruption we immediately held all of the user's jobs and blocked new submissions until the workflow was re-engineered. Fortunately dCache performance recovered very quickly after that. The user's jobs are running again and they are also much more CPU-efficient than they were before the changes.
-
-*The bottom line: one single user not following best practices can disrupt the entire system if they get enough jobs running.* EVERYONE is responsible for following best practices. Getting it wrong affects not only you, but your collaborators!
-
-## A word on the DUNE Global Pool
-
-DUNE has also created a global glideinWMS pool, similar to the CMS Global Pool, that is intended to serve as a single point through which multiple job submission systems (e.g. HTCondor schedulers at sites outside of Fermilab) can access the same resources. Jobs using the global pool still run in exactly the same way as those that don't. We plan to move more and more work over to the global pool in 2023, and priority access to the FermiGrid quota will eventually be given to jobs submitted to the global pool. To switch to the global pool with jobsub, simply add `--global-pool dune` as an option to your submission command. The only practical difference is that your jobs will come back with IDs of the form NNNNNNN.N@dunegpschedd0X.fnal.gov instead of NNNNNNN.N@jobsub0X.fnal.gov. Everything else is identical, so feel free to test it out.
-
-## Making subsets of sam definitions
-
-Running across a very large number of files puts you at risk of system issues. It is often much nicer to run over several smaller subsets.
-Many official samweb definitions are large data collections defined only by their properties, and are not really suitable for a single job.
-
-There are two ways of reducing their size.
-
-You can create new dataset definitions; say `mydataset` has 10000 entries and you want to split it into groups of 2000:
-
-```bash
-samweb create-definition $USER-mydataset-part0 "defname:mydataset limit 2000 offset 0"
-samweb create-definition $USER-mydataset-part1 "defname:mydataset limit 2000 offset 2000"
-samweb create-definition $USER-mydataset-part2 "defname:mydataset limit 2000 offset 4000"
-samweb create-definition $USER-mydataset-part3 "defname:mydataset limit 2000 offset 6000"
-samweb create-definition $USER-mydataset-part4 "defname:mydataset limit 2000 offset 8000"
-```
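Rather than typing the five commands by hand (and risking a typo in an offset), you can generate them in a loop. The commands are only echoed here so you can inspect them before feeding them to `samweb`:

```shell
TOTAL=10000   # entries in mydataset
CHUNK=2000    # entries per part
for i in $(seq 0 $(( TOTAL / CHUNK - 1 ))); do
  echo "samweb create-definition ${USER:-me}-mydataset-part${i}" \
       "\"defname:mydataset limit ${CHUNK} offset $(( i * CHUNK ))\""
done
```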
-
-Your username needs to be in the definition name (unless it is already part of the dataset name), and make certain you don't miss a few files at the end if the total isn't an exact multiple of your chunk size...
-
-Alternatively you can use the syntax `with stride 5 offset 0` to take every 5th file:
-```bash
-samweb create-definition $USER-mydataset-part0 "defname:mydataset limit 2000 with stride 5 offset 0"
-samweb create-definition $USER-mydataset-part1 "defname:mydataset limit 2000 with stride 5 offset 1"
-samweb create-definition $USER-mydataset-part2 "defname:mydataset limit 2000 with stride 5 offset 2"
-samweb create-definition $USER-mydataset-part3 "defname:mydataset limit 2000 with stride 5 offset 3"
-samweb create-definition $USER-mydataset-part4 "defname:mydataset limit 2000 with stride 5 offset 4"
-```
-
-
-
-
-More on samweb can be found [here]({{ site.baseurl }}/sam-by-schellman).
-
-## Verify Your Learning:
-
-> ## Question 01
->
-> What are the differences in environment between Fermilab worker nodes and those at other sites (assuming the site supports Singularity)?
->
->
-> A. Fermilab workers have additional libraries available.
->
-> B. Worker nodes at other sites have additional libraries installed.
->
-> C. No difference.
->
->
-> > ## Answer
-> > The correct answer is C - No difference.
-> > {: .output}
-> > Comment:
-> {: .solution}
-{: .challenge}
-
-> ## Question 02
->
-> After setting up a new workflow, or preparing to run one that has not been exercised in a while, what is an appropriate number of test jobs to initially submit?
->
->
-> A. 1.
->
-> B. 10.
->
-> C. 100.
->
-> D. As many as needed.
->
->
-> > ## Answer
-> > The correct answer is A - 1.
-> > {: .output}
-> > Comment:
-> {: .solution}
-{: .challenge}
-
-
-> ## Question 03
->
-> project.py is supported by the Fermilab Scientific Computing Division
->
->
-> A. True.
->
-> B. False.
->
->
-> > ## Answer
-> > The correct answer is B - False.
-> > {: .output}
-> > Comment:
-> {: .solution}
-{: .challenge}
-
-
-> ## Question 04
->
-> What is generally the best way to read in a .root file for analysis within a grid job?
->
->
-> A. Open with an xrootd URI (root://).
->
-> B. Copy the entire file at the beginning of the job.
->
-> C. Both A and B.
->
->
-> > ## Answer
-> > The correct answer is A - Open with an xrootd URI (root://).
-> > {: .output}
-> > Comment:
-> {: .solution}
-{: .challenge}
-
-
-> ## Question 05
->
-> What is the best way to specify your desired operating system and environment in a grid job?
->
->
-> A. Use the --OS option in jobsub.
->
-> B. Do not specify any OS, but control it with the SingularityImage classad.
->
-> C. Don't specify anything; the grid handles it.
->
-> D. None of the above.
->
->
-> > ## Answer
-> > The correct answer is B - Do not specify any OS, but control it with the SingularityImage classad.
-> > {: .output}
-> > Comment:
-> {: .solution}
-{: .challenge}
-
-
-> ## Question 06
->
-> What is the best way to copy custom code into a grid job?
->
->
-> A. Use the RCDS (i.e. --tar_file_name=dropbox://foo/bar/) and stage the file in via CVMFS.
->
-> B. Copy a tarball to /pnfs/dune/scratch/users/username.
->
-> C. Copy a tarball to /pnfs/dune/persistent/users/username.
->
-> D. Copy a tarball to /pnfs/dune/resilient/users/username.
->
-> E. None of the above.
->
->
-> > ## Answer
-> > The correct answer is A - Use the RCDS (i.e. --tar_file_name=dropbox://foo/bar/) and stage the file in via CVMFS.
-> > {: .output}
-> > Comment:
-> {: .solution}
-{: .challenge}
-
-
-
-## Further Reading
-Some more background material on these topics (including some examples of why certain things are bad) is in these links:
-
-
-[December 2022 jobsub_lite demo and information session](https://indico.fnal.gov/event/57514/)
-
-[January 2023 additional experiment feedback session on jobsub_lite]( )
-
-[Wiki page listing differences between jobsub_lite and legacy jobsub](https://fifewiki.fnal.gov/wiki/Differences_between_jobsub_lite_and_legacy_jobsub_client/server)
-
-[DUNE Computing Tutorial: Advanced topics and best practices][DUNE_computing_tutorial_advanced_topics_20200129]
-
-[2021 Intensity Frontier Summer School](https://indico.fnal.gov/event/49414)
-
-[The Glidein-based Workflow Management System]( https://glideinwms.fnal.gov/doc.prd/index.html )
-
-[Introduction to Docker](https://hsf-training.github.io/hsf-training-docker/index.html)
-
-[job-autorelease]: https://cdcvs.fnal.gov/redmine/projects/fife/wiki/Job_autorelease
-
-[redmine-wiki-jobsub]: https://cdcvs.fnal.gov/redmine/projects/jobsub/wiki
-
-[redmine-wiki-using-the-client]: https://cdcvs.fnal.gov/redmine/projects/jobsub/wiki/Using_the_Client
-
-[fifemon-dune]: https://fifemon.fnal.gov/monitor/d/000000053/experiment-batch-details?orgId=1&var-experiment=dune
-
-[fifemon-userjobs]: https://fifemon.fnal.gov/monitor/d/000000116/user-batch-details?orgId=1&var-cluster=fifebatch&var-user=kherner
-
-[fifemon-whyheld]: https://fifemon.fnal.gov/monitor/d/000000146/why-are-my-jobs-held?orgId=1
-
-[kibana]: https://fifemon.fnal.gov/kibana/goto/8f432d2e4a40cbf81d3072d9c9d688a6
-
-[poms-page-ana]: https://pomsgpvm01.fnal.gov/poms/index/dune/analysis/
-
-[poms-user-doc]: https://cdcvs.fnal.gov/redmine/projects/prod_mgmt_db/wiki/POMS_User_Documentation
-
-[fife-launch-ref]: https://cdcvs.fnal.gov/redmine/projects/fife_utils/wiki/Fife_launch_Reference
-
-[poms-campaign-stage-info]: https://pomsgpvm01.fnal.gov/poms/campaign_stage_info/dune/analysis?campaign_stage_id=9023
-
-[project-py-guide]: https://cdcvs.fnal.gov/redmine/projects/project-py/wiki/Project-py_guide
-
-[DUNE_computing_tutorial_advanced_topics_20200129]: https://indico.fnal.gov/event/20144/contributions/55932/attachments/34945/42690/DUNE_computing_tutorial_advanced_topics_and_best_practices_20200129.pdf
-
-
-{%include links.md%}
diff --git a/_episodes/08-submit-jobs-w-justin.md b/_episodes/08-submit-jobs-w-justin.md
deleted file mode 100644
index f2e09d6..0000000
--- a/_episodes/08-submit-jobs-w-justin.md
+++ /dev/null
@@ -1,26 +0,0 @@
----
-title: Submit grid jobs with JustIn
-teaching: 20
-exercises: 0
-questions:
-- How to submit realistic grid jobs with JustIn
-objectives:
-- Demonstrate use of JustIn for job submission with more complicated setups.
-keypoints:
-- Always, always, always prestage input datasets. No exceptions.
----
-
-# PLEASE USE THE NEW JUSTIN SYSTEM INSTEAD OF POMS
-
-__The JustIn Tutorial is currently in docdb at: [JustIn Tutorial](https://docs.dunescience.org/cgi-bin/sso/RetrieveFile?docid=30145)__
-
-The JustIn system is described in detail at:
-
-__[JustIn Home](https://justin.dune.hep.ac.uk/dashboard/)__
-
-__[JustIn Docs](https://justin.dune.hep.ac.uk/docs/)__
-
-
-> ## Note More documentation coming soon
-{: .callout}
-
diff --git a/_episodes/09-grid-batch-debug.md b/_episodes/09-grid-batch-debug.md
deleted file mode 100644
index e627487..0000000
--- a/_episodes/09-grid-batch-debug.md
+++ /dev/null
@@ -1,36 +0,0 @@
----
-title: Expert in the Room Grid and Batch System
-teaching: 20
-exercises: 0
-questions:
-- How to become a grid and batch yoda master?
-objectives:
-- Learn common job failures and how to avoid them in the future
-- Get tips and tricks to debug on your own
-keypoints:
-- Debugging requires a methodical and inquisitive mindset, gained through experience and good bookkeeping (write down how you solved past issues!)
----
-
-#### Session Video
-
-This session will be captured on video and placed here after the workshop for asynchronous study.
-
-#### Live Notes
-
-Participants are encouraged to monitor and utilize the [Livedoc for May. 2023](https://docs.google.com/document/d/19XMQqQ0YV2AtR5OdJJkXoDkuRLWv30BnHY9C5N92uYs/edit?usp=sharing) to ask questions and learn. For reference, the [Livedoc from Jan. 2023](https://docs.google.com/document/d/1sgRQPQn1OCMEUHAk28bTPhZoySdT5NUSDnW07aL-iQU/edit?usp=sharing) is provided.
-
-## Debug Session
-
-This session of Expert in the Room is a Q&A. You bring content!
-Write on the live doc your current error(s) and experts will provide guidance to participants in a live debug session.
-On top of learning why a job failed, we hope it will give you the tools to solve future issues by yourself.
-
-
-*Debugging is an art.*
-
-
-*Confusion is the sweat of learning.*
-
-
-
-{%include links.md%}
diff --git a/_extras/al9_setup.md b/_extras/al9_setup.md
index 0eee419..0093283 100644
--- a/_extras/al9_setup.md
+++ b/_extras/al9_setup.md
@@ -10,20 +10,11 @@ You can store the code below as
`myal9.sh` and run it every time you log in.
> ## Note - the full LArSoft suite doesn't work yet with spack
-> Use the [Apptainer/sl7 method]({{ site.baseurl }}al9_setup.html) until we get that working if you want to use the full DUNE software suite.
+> Use the [Apptainer/SL7 method]({{ site.baseurl }}/sl7_setup.html) until we get that working if you want to use the full DUNE software suite.
{: .callout}
~~~
-# use spack to get applications
-source /cvmfs/larsoft.opensciencegrid.org/spack-packages/setup-env.sh
-
-# load metacat, rucio and sam and tell it you are on dune
-spack load r-m-dd-config experiment=dune
-spack load kx509
-export IFDH_CP_MAXRETRIES=0\0\0\0\0 # no retries
-export RUCIO_ACCOUNT=$USER
-
# access some disks
export DUNEDATA=/exp/dune/data/users/$USER
export DUNEAPP=/exp/dune/app/users/$USER
@@ -32,13 +23,6 @@ export SCRATCH=/pnfs/dune/scratch/users/$USER
# do some authentication
-voms-proxy-destroy
-kx509
-export EXPERIMENT=dune
-export ROLE=Analysis
-voms-proxy-init -rfc -noregen -voms dune:/dune/Role=$ROLE -valid 24:00
-export X509_USER_PROXY=/tmp/x509up_u`id -u`
-
htgettoken -i dune --vaultserver htvaultprod.fnal.gov
export BEARER_TOKEN_FILE=/run/user/`id -u`/bt_u`id -u`
@@ -51,8 +35,36 @@ export BEARER_TOKEN_FILE=/run/user/`id -u`/bt_u`id -u`
## setup specific versions of code here
~~~
-spack load root@6.28.12 # recent with xrootd
+# find a spack environment and set it up
+# setup spack
+
+source /cvmfs/larsoft.opensciencegrid.org/spack-v0.22.0-fermi/setup-env.sh
+export CVSROOT=minervacvs@cdcvs.fnal.gov:/cvs/mnvsoft
+
+# get the packages you need to run this - this is a total hack of guesswork
+echo "ROOT"
+spack load root@6.28.12%gcc@12.2.0 arch=linux-almalinux9-x86_64_v3
+
+echo "CMAKE"
+spack load cmake@3.27.9%gcc@11.4.1 arch=linux-almalinux9-x86_64_v3
+
+echo "GCC"
spack load gcc@12.2.0
-spack load fife-utils@3.7.4
+
+echo "Rucio and metacat"
+spack load r-m-dd-config experiment=dune lab=fnal.gov
+export RUCIO_ACCOUNT=${USER}
+export SAM_EXPERIMENT=dune
+
+echo "IFDHC"
+spack load ifdhc@2.8.0%gcc@12.2.0 arch=linux-almalinux9-x86_64_v3
+spack load ifdhc-config@2.6.20%gcc@11.4.1 arch=linux-almalinux9-x86_64_v3
+
+echo "PY-PIP"
+spack load py-pip@23.1.2%gcc@11.4.1 arch=linux-almalinux9-x86_64_v3
+
+echo "Justin"
+spack load justin
+
~~~
{: .language-bash}
diff --git a/_extras/al9_setup_2025.md b/_extras/al9_setup_2025.md
new file mode 100644
index 0000000..7cc8ccc
--- /dev/null
+++ b/_extras/al9_setup_2025.md
@@ -0,0 +1,26 @@
+---
+title: 2025 Example AL9 setup for a new session
+permalink: al9_setup
+keypoints:
+- getting basic applications on Alma9
+- getting authentication set up
+---
+
+You can store the code below as
+ `myal9.sh` and run it every time you log in.
+
+> ## Note - the full LArSoft suite doesn't work yet with spack
+> Use the [Apptainer/sl7 method]({{ site.baseurl }}/sl7_setup.html) until we get larsoft working if you want to use the full DUNE software suite.
+{: .callout}
+
+~~~
+# setup spack version 1.0 - generic env
+. /cvmfs/dune.opensciencegrid.org/dune-spack/spack-develop-fermi/setup-env.sh
+spack env activate dune-tutorial
+export RUCIO_ACCOUNT=${USER}
+~~~
+{: .language-bash}
+
+You can ignore the warning messages - this is still under development
+
+
diff --git a/_extras/setup_ruby.md b/_extras/setup_ruby.md
new file mode 100644
index 0000000..9c46cd2
--- /dev/null
+++ b/_extras/setup_ruby.md
@@ -0,0 +1,10 @@
+---
+title: Notes on building these pages on MACOS
+permalink: setup_ruby
+keypoints:
+- need a recent version of ruby
+---
+On macOS you need to use Homebrew to install Ruby,
+and then override the system version in your PATH:
+
+`export PATH=/opt/homebrew/opt/ruby/bin:$PATH`
diff --git a/index.md b/index.md
index 2f94ddf..78896f7 100644
--- a/index.md
+++ b/index.md
@@ -8,11 +8,11 @@ country: "us"
language: "en"
latitude: "45"
longitude: "-1"
-humandate: "2024"
+humandate: "2025"
humantime: "asynchronous"
-startdate: "2024-05-20"
-enddate: "2024-12-01"
-instructor: ["Heidi Schellman","Dave Demuth","Michael Kirby","Steve Timm","Tom Junk","Ken Herner"]
+startdate: "2025-09-08"
+enddate: "2025-09-12"
+instructor: ["Heidi Schellman","Dave Demuth","Michael Kirby","Steve Timm","Tom Junk","Ken Herner","Nilay Bostan"]
helper: ["mentor1", "mentor2"]
email: ["schellmh@oregonstate.edu","dmdemuth@gmail.com","mkirby@bnl.gov","timm@fnal.gov","junk@fnal.gov","herner@fnal.gov"]
collaborative_notes: "2024-05-24-dune"
@@ -24,10 +24,12 @@ This tutorial will teach you the basics of DUNE Computing.
Instructors will engage students with hands-on lessons focused in three areas:
0. Basics of logging on, getting accounts, disk spaces
-1. Data storage and management,
-2. Introduction to LArSoft
-3. How to find futher training materials for DUNE and HEP software
+1. Data storage and management
+2. How to find further training materials for DUNE and HEP software
+Other modules:
+1. Introduction to LArSoft
+2. Introduction to batch systems
Mentors will answer your questions and provide technical support.
@@ -47,7 +49,8 @@ By the end of this workshop, participants will know how to:
* Utilize data volumes at FNAL.
* Understand good data management practices.
-* Provide a basic overview of art and LArSoft to a new researcher.
+* Run basic ROOT analyses on Fermilab or CERN Unix systems.
+
There are additional materials provided that explain how to:
diff --git a/setup.md b/setup.md
index 8954619..ebd09dd 100644
--- a/setup.md
+++ b/setup.md
@@ -24,6 +24,12 @@ keypoints:
- Do an exercise to help us check if all is good
- Get streaming and grid access
+
+
+> ## If you run into problems now or later, check out the [Common Error Messages]({{ site.baseurl }}/ErrorMessages) page and the [FAQ page](https://github.com/orgs/DUNE/projects/19/)
+> if that doesn't help, use [DUNE Slack](https://dunescience.slack.com/archives/C02TJDHUQPR) channel `#computing-training-basics` to ask us about the problem - there is always a new one cropping up.
+{: .challenge}
+
## Requirements
@@ -58,11 +64,9 @@ If you have trouble getting access, please reach out to the training team severa
## Step 3: Mission setup (rest of this page)
-> ## If you run into problems, check out the [Common Error Messages]({{ site.baseurl }}/ErrorMessages) page and the [FAQ page](https://github.com/orgs/DUNE/projects/19/)
-> if that doesn't help, use Slack to ask us about the problem - there is always a new one cropping up.
-{: .challenge}
-We ask that you have completed the setup work to verify your access to the DUNE servers. It is not complicated, and should take 10 - 20 min.
+
+Before you start the tutorial, we ask that you have completed this setup to verify your access to the DUNE servers. It is not complicated, and should take 10 - 20 min.
If you are not familiar with Unix shell commands, here is a tutorial you can do on your own to be ready: [The Unix Shell](https://swcarpentry.github.io/shell-novice/)
@@ -82,7 +86,7 @@ Also check out our [Computing FAQ](https://github.com/orgs/DUNE/projects/19/view
[Computer Setup]({{ site.baseurl }}/ComputerSetup.html) goes through how to find a terminal and set up xwindows on MacOS and Windows. You can skip this if already familiar with doing that.
> ## Note
-> The instructions directly below are for FNAL accounts. If you do not have a valid FNAL account but a CERN one, go at the bottom of this page to the [Setup on CERN machines](#setup_CERN).
+> The instructions directly below are for FNAL accounts. If you do not have a valid FNAL account but a CERN one, go at the bottom of this page to the [Setup on CERN machines](#setup_CERN) section.
{: .challenge}
## 1. Kerberos business
@@ -216,7 +220,12 @@ GSSAPIDelegateCredentials yes
~~~
{: .output}
-Now you can try to log into a machine at Fermilab. There are now 15 different machines you can login to: from dunegpvm01 to dunegpvm15 (gpvm stands for 'general purpose virtual machine' because these servers run on virtual machines and not dedicated hardware, others nodes which are indented for building code run on dedicated hardware). The dunegpvm machines run Scientific Linux Fermi 7 (SLF7). To know the load on the machines, use this monitoring link: dunegpvm status.
+> ## Note: Your ssh_config may get replaced when you do operating system upgrades.
+> When you set it up, make a backup copy so that you can restore it if it gets removed in an OS upgrade.
+{: .callout}
+
+
+Now you can try to log into a machine at Fermilab. There are 15 different machines you can log in to, dunegpvm01 through dunegpvm15 (gpvm stands for 'general purpose virtual machine' because these servers run on virtual machines rather than dedicated hardware; other nodes, which are intended for building code, run on dedicated hardware). The dunegpvm machines run Alma Linux 9.6. To check the load on the machines, use this monitoring link: [dunegpvm status](https://fifemon.fnal.gov/monitor/d/000000004/experiment-overview?var-experiment=dune&orgId=1&viewPanel=30). Any load > 4 likely results in reduced performance.
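If you are already logged in somewhere, the `uptime` command reports the same 1-, 5- and 15-minute load averages the monitoring page shows, at the end of its output. A quick sketch to print just the numbers:

~~~
# Print just the load averages from uptime; on a gpvm, values well above 4
# suggest trying a different machine.
uptime | awk -F 'load average[s]*: ' '{print $2}'
~~~
{: .language-bash}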
**How to connect?** The ssh command does the job. The -Y option turns on trusted X11 forwarding so that you can have graphical display and keyboard/mouse handling (quite useful). If you already have the line "ForwardX11Trusted yes" in your ssh config file, it has the same effect as the -Y option. For connecting to e.g. dunegpvm07, the command is:
@@ -226,7 +235,7 @@ ssh username@dunegpvmXX.fnal.gov
{: .language-bash}
where XX is a number from 01 to 15.
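If you would rather not pick a machine by hand, a small helper in your shell startup file can choose one at random (a sketch only; `dunessh` is a made-up name, not a DUNE tool, and it does not check the load before connecting):

~~~
# Sketch: pick a gpvm at random so everyone does not pile onto dunegpvm01.
# The function name is an example, not an official DUNE command.
dunessh() {
    local n
    n=$(printf '%02d' $(( RANDOM % 15 + 1 )))   # 01..15
    echo "Connecting to dunegpvm${n}.fnal.gov"
    ssh -Y "${USER}@dunegpvm${n}.fnal.gov"
}
~~~
{: .language-bash}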
-If you experience long delays in loading programs or graphical output, you can try connecting with VNC. More info: [Using VNC Connections on the dunegpvms][dunegpvm-vnc].
+If you experience long delays in loading programs or graphical output, you can try connecting with VNC. More info: [Using VNC Connections on the dunegpvms][dunegpvm-vnc]. Please remember to shut down your VNC connection at least once a week - the machines can get overrun by zombie sessions.
## 3. Get a clean shell
To run DUNE software, it is necessary to have a 'clean login'. What is meant by clean here? If you work on other experiments, you may have some environment variables defined (for NOvA, MINERvA, MicroBooNE). These may conflict with the DUNE environment variables.
@@ -248,7 +257,7 @@ env | grep -i nova
~~~
{: .language-bash}
-Another useful command that will detect UPS products that have been set up is
+
Once you identify environment variables that might conflict with your DUNE work, you can tweak your login scripts, like .bashrc, .profile, .shrc, .login, etc., to temporarily comment those out (the "export" commands that set custom environment variables, or UPS's setup command). Note: files with names that begin with `.` are "hidden" in that they do not show up with a simple `ls` command. To see them, type `ls -a`, which lists **a**ll files.
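Rather than repeatedly commenting lines in and out, one common pattern is to guard each experiment's setup behind a variable. This is only a sketch - the variable name and path below are made-up examples, not real NOvA setup:

~~~
# Sketch for ~/.bashrc (SETUP_NOVA and the path are example placeholders).
# A plain login leaves SETUP_NOVA unset, so nothing is exported and your
# DUNE shell stays clean.
if [ "${SETUP_NOVA:-}" = "yes" ]; then
    export NOVA_DIR=/path/to/nova/setup   # example placeholder
fi
~~~
{: .language-bash}

You can then turn the other experiment back on in a dedicated window by exporting `SETUP_NOVA=yes` before starting a new shell.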
@@ -309,19 +318,37 @@ Here is how you set up basic DUNE software on Alma 9. We are using the super-com
~~~
# find a spack environment and set it up
-source /cvmfs/larsoft.opensciencegrid.org/spack-packages/setup-env.sh
-# get some basic things -
-# use the command spack find to find packages you might want
-# If you just type spack load ... you may be presented with a choice and will need to choose.
-#
-spack load root@6.28.12
-spack load cmake@3.27.7
+# setup spack
+
+source /cvmfs/larsoft.opensciencegrid.org/spack-v0.22.0-fermi/setup-env.sh
+export CVSROOT=minervacvs@cdcvs.fnal.gov:/cvs/mnvsoft
+
+# get the packages you need to run this - this will become simple in future
+echo "ROOT"
+spack load root@6.28.12%gcc@12.2.0 arch=linux-almalinux9-x86_64_v3
+
+echo "CMAKE"
+spack load cmake@3.27.9%gcc@11.4.1 arch=linux-almalinux9-x86_64_v3
+
+echo "GCC"
spack load gcc@12.2.0
-spack load fife-utils@3.7.4
-# load metacat, rucio and sam and tell it you are on dune
-spack load r-m-dd-config experiment=dune
-spack load kx509
+
+echo "Rucio and metacat"
+spack load r-m-dd-config experiment=dune lab=fnal.gov
+export RUCIO_ACCOUNT=${USER}
export SAM_EXPERIMENT=dune
+
+echo "IFDHC"
+spack load ifdhc@2.8.0%gcc@12.2.0 arch=linux-almalinux9-x86_64_v3
+spack load ifdhc-config@2.6.20%gcc@11.4.1 arch=linux-almalinux9-x86_64_v3
+
+
+echo "PY-PIP"
+spack load py-pip@23.1.2%gcc@11.4.1 arch=linux-almalinux9-x86_64_v3
+
+echo "Justin"
+spack load justin
+
~~~
{: .language-bash}
@@ -371,7 +398,6 @@ Apptainer>
You can then set up DUNE's code
~~~
-export UPS_OVERRIDE="-H Linux64bit+3.10-2.17" # makes certain you get the right UPS
source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
~~~
{: .language-bash}
@@ -399,7 +425,7 @@ Setting up DUNE UPS area... /cvmfs/dune.opensciencegrid.org/products/dune/
### Caveats for later
> ## Note: You cannot submit jobs from the Container
-> You cannot submit jobs from the Container - you need to open a separate window. In that window do the minimal [Alma9](#AL9_setup) setup below and submit your jobs from that window.
+> You cannot submit jobs from the Container - you need to open a separate window. In that window do the minimal [Alma9](#AL9_setup) setup above and submit your jobs from that window.
>
>If your submission is done from a script that uses ups, you may need to print your submit command to the screen or a file and run it from the clean window.
{: .callout}
@@ -444,7 +470,7 @@ Setting up DUNE UPS area... /cvmfs/dune.opensciencegrid.org/products/dune/
## 5. Exercise! (it's easy)
This exercise will help organizers see if you reached this step or need help.
-1) Start in your home area `cd ~` on the DUNE machine (normally CERN or FNAL) and create the file ```dune_presetup_2024.sh```.
+1) Start in your home area `cd ~` on the DUNE machine (normally CERN or FNAL) and create the file ```dune_presetup_2025.sh```.
Launch the *Apptainer* as described above in the [SL7 version](#SL7_setup)
@@ -460,7 +486,7 @@ alias dune_setup7='source /cvmfs/dune.opensciencegrid.org/products/dune/setup_du
{: .source}
When you start the training, you will have to source this file:
~~~
-source ~/dune_presetup_2024.sh
+source ~/dune_presetup_2025.sh
~~~
{: .language-bash}
Then, to setup DUNE, use the created alias:
@@ -470,7 +496,7 @@ setup dunesw $DUNELAR_VERSION -q $DUNELAR_QUALIFIER
~~~
{: .language-bash}
-2) Create working directories in the `dune/app` and `pnfs/dune` areas (these will be explained during the training):
+2) Create working directories in the `/exp/dune/app` and `/pnfs/dune` areas (these will be explained later in the training):
~~~
mkdir -p /exp/dune/app/users/${USER}
mkdir -p /pnfs/dune/scratch/users/${USER}
@@ -492,7 +518,7 @@ date >& /exp/dune/app/users/${USER}/my_first_login.txt
## 6. Getting setup for streaming and grid access
In addition to your kerberos access, you need to be in the DUNE VO (Virtual Organization) to access global DUNE resources. This is necessary in particular to stream data and submit jobs to the grid. If you are on the DUNE collaboration list and have a Fermilab ID, you should have been added automatically to the DUNE VO.
-To check if you are on the VO, two commands. The kx509 gets a certificate from your kerberos ticket. On a DUNE machine, type:
+
-To access the grid resources, you will need either need a proxy or a token. More information on proxy is available [here][proxy-info].
+To access the grid resources, you will need a token.
-## How to authorize with the KX509/Proxy method
+
### Tokens method
-We are moving from proxies to tokens - these are a bit different.
+We have moved from proxies to tokens for authentication as of 2025.
-#### 1. Get your token
+#### 1. Get and store your token
~~~
- htgettoken -i dune --vaultserver htvaultprod.fnal.gov
+htgettoken -i dune --vaultserver htvaultprod.fnal.gov
+export BEARER_TOKEN_FILE=/run/user/`id -u`/bt_u`id -u`
~~~
{: .language-bash}
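The bearer token is a JWT, so you can peek at what you received, and when it expires, with nothing but Python's standard library. This is only a sketch; it just reads the file the commands above created:

~~~
# Decode the middle, dot-separated field of the token (the JWT payload)
# and print the 'exp' claim as a human-readable expiry time.
python3 <<'EOF'
import base64, json, os, sys, time
path = os.environ.get('BEARER_TOKEN_FILE', '')
if not os.path.isfile(path):
    print('no token file found - run htgettoken first')
    sys.exit(0)
payload = open(path).read().strip().split('.')[1]
payload += '=' * (-len(payload) % 4)   # restore base64 padding
claims = json.loads(base64.urlsafe_b64decode(payload))
print('token expires:', time.ctime(claims['exp']))
EOF
~~~
{: .language-bash}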
@@ -609,9 +636,6 @@ With this done, you should be able to submit jobs and access remote DUNE storage
> If you have issues here, please ask [#computing-training-basics](https://dunescience.slack.com/archives/C02TJDHUQPR) in Slack to get support. Please mention in your message it is the Step 6 of the setup. Thanks!
{: .challenge}
-> ## Success
-> If you obtain the message starting with `Your proxy is valid until`... Congratulations! You are ready to go!
-{: .keypoints}
## Set up on CERN machines
@@ -627,7 +651,10 @@ See [https://github.com/DUNE/data-mgmt-ops/wiki/Using-Rucio-to-find-Protodune-fi
The directions at [AL9_setup](#AL9_setup) above should work directly at CERN; do those and proceed to step 3.
-### 2. Source the DUNE environment SL7 setup script
+### 2. For SL7
+
+#### Source the DUNE environment SL7 setup script
+
CERN access is mainly for ProtoDUNE collaborators. If you have a valid CERN ID and access to lxplus via ssh, you can set up your environment for this tutorial as follows:
log into `lxplus.cern.ch`
@@ -650,7 +677,6 @@ Set up the DUNE software
~~~
export UPS_OVERRIDE="-H Linux64bit+3.10-2.17" # makes certain you get the right UPS
source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
-setup kx509
~~~
{: .language-bash}
@@ -662,31 +688,12 @@ Setting up DUNE UPS area... /cvmfs/dune.opensciencegrid.org/products/dune/
### 3. Getting authentication for data access
-If you have a Fermilab account already, do this to get access the data catalog worldwide
-
-~~~
-kdestroy
-kinit -f @FNAL.GOV
-kx509
-export ROLE=Analysis
-voms-proxy-init -rfc -noregen -voms=dune:/dune/Role=$ROLE -valid 120:00
-~~~
-{: .language-bash}
-
-~~~
-Checking if /tmp/x509up_u79129 can be reused ... yes
-Your identity: /DC=org/DC=cilogon/C=US/O=Fermi National Accelerator Laboratory/OU=People/CN=Heidi n/CN=UID:
-Contacting voms1.fnal.gov:15042 [/DC=org/DC=incommon/C=US/ST=Illinois/O=Fermi Research Alliance/CN=voms1.fnal.gov] "dune" Done
-Creating proxy .......................................................................................... Done
-
-Your proxy is valid until Sat Aug 24 17:11:41 2024
-~~~
-{: .output}
+If you have a Fermilab account already, get a token as described in the [tokens](#tokens) section.
### 4. Access tutorial datasets
-Normally, the datasets are accessible through the grid resource. But with your CERN account, you may not be part of the DUNE VO yet (more on this during the tutorial). We found a workaround: some datasets have been copied locally for you. You can check them here:
+Normally, the datasets are accessible through the grid resources. But with your CERN account, you may not be part of the DUNE VO yet (more on this during the tutorial). We found a workaround: some datasets have been copied locally for you. You can check them here:
~~~
ls /afs/cern.ch/work/t/tjunk/public/may2023tutorialfiles/
~~~