Information for Liaisons

MC liaisons are the contacts between the Simulation Project and the Physics Working Groups. Duties include:

  • collecting MC production requests from analysts and preparing them in LHCbDIRAC,

  • producing generator statistics tables for completed productions,

  • responding to faulty productions and ensuring they get fixed,

  • relaying news between the weekly Simulation meeting and the relevant WG meetings,

  • maintaining the relevant WGConfig datapackage.

Becoming a liaison

Each working group typically appoints one or two MC liaisons for a limited time period. If you are interested in becoming an MC liaison, discuss it with your WG convenors.

The WG convenors should inform the Simulation Project coordinators of any new MC liaisons, so that they can be added to the appropriate egroups. The new liaisons should also attend the annual training session, usually held early in the year.

Handling production requests

Before submitting a request, you should check that you have all of the required information:

  • event types and numbers of events per year and magnet polarity,

  • output filetype (MDST, DST, …),

  • simulation conditions and processing passes, i.e.

    • collision type (beam particles and energies),

    • stripping version,

    • any special trigger/reconstruction versions.

Note

For filtered requests, please follow the procedure specified in the FilteredSimulationProduction TWiki page.

You should also check that any new code that the requests need has been released and deployed. This can include new DecFiles, new C++ code in Gauss, new options files in AppConfig, etc.

After a request is submitted, the WG convenors should be notified so that they can sign it and assign a priority level. An expert also makes technical checks. The request is then “accepted” by the production manager, and test jobs are launched. If the test jobs succeed, the request becomes “active” and spawns jobs on the Grid until the desired sample size is reached. If the test jobs fail, the production manager will create a Jira task and contact the MC liaison(s) for the appropriate WG, who should chase up the relevant experts to ensure that it gets fixed.

Histograms produced in the test jobs are also checked by the DQCS shifter, who will alert the production manager of any data-quality issues they might spot (see SimDQ).

Once a request is finished, the generator statistics tables should be generated.

Request size limits

While all production requests require approval from the WG convenors, additional approval from the PPG is required when the number of events exceeds certain thresholds. Requests that the system detects as exceeding the thresholds are marked with the PPG Approval Flag in the submission repository. MC liaisons do not need to deal with this directly, but should be aware that it can cause some delay.

The system will flag the request for PPG approval if:

  • DST events > 20 M

  • Others > 100 M

  • Generated Events > 500 M
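As a rough illustration, the thresholds above can be written as a small check. This is only a sketch, not the code used by the submission system: the function name and its arguments are hypothetical, it assumes that only the exact format name DST counts as "DST" (everything else, including MDST, is treated as "Others"), and it assumes that "generated events" means the number of events generated before any filtering.

```python
# Thresholds quoted in the list above; everything else here is illustrative.
DST_LIMIT = 20_000_000
OTHER_LIMIT = 100_000_000
GENERATED_LIMIT = 500_000_000


def needs_ppg_approval(file_format: str, num_events: int, generated_events: int) -> bool:
    """Return True if the given event counts exceed one of the thresholds.

    Hypothetical helper: how the real system aggregates event counts is
    not described here, so the totals are simply passed in by the caller.
    """
    output_limit = DST_LIMIT if file_format.upper() == "DST" else OTHER_LIMIT
    return num_events > output_limit or generated_events > GENERATED_LIMIT


print(needs_ppg_approval("DST", 25_000_000, 25_000_000))    # True
print(needs_ppg_approval("MDST", 25_000_000, 25_000_000))   # False
print(needs_ppg_approval("MDST", 50_000_000, 600_000_000))  # True
```

A check like this can help a liaison warn analysts early that a large request may be delayed by the extra approval step.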

Creating requests with LbMCSubmit

LbMCSubmit is a new tool for creating simulation production requests, which is intended to replace the LHCbDIRAC web interface for the majority of request types.

The submission procedure is similar to that of Analysis Productions: requests are specified in YAML and committed to a branch in the submission repository on GitLab. Automatic test productions are submitted by the CI, and once the branch is merged, the full requests are submitted.

Writing a YAML file

The minimal specification (“stage 0”) should be sufficient to cover all common use-cases — if not, please open an issue and the developers can add missing features. If you have a rare/one-off use case which is not covered by this, please get in touch with an expert.

The mandatory top-level fields to specify are sim-version, name, inform, WG, and samples, where samples is a list of dicts with the mandatory keys event-types, data-types (i.e. years) and num-events. Using only the mandatory keys will create Pythia8 proton–proton requests with MDST output and default processing pass versions for each specified year. By default, equal-size requests will be produced for both magnet polarities (MagUp and MagDown).

Tip

On this page, we will focus on examples, but please also consult the full stage 0 YAML schema.

Example: only mandatory keys

Below is a minimal example using only the mandatory keys, along with the corresponding full specification (“stage 6”) file that LbMCSubmit will produce. The example asks for samples of event types 23103005 and 23103006, produced in proton–proton collisions in the years 2012 and 2016 for the Charm WG.

Some things to notice:

  • This example generates 8 requests in total (2 years × 2 magnet polarities × 2 event types).

  • The num-events field is applied to each polarity, year and event-type, for a total of 800,000 events.

  • The inform key can take usernames or email addresses.

  • The individual requests are titled according to the pattern <name> <year> <collision type> <magnet polarity>.

  • Several default values are used (see other examples for how to customise):

    • Generator: Pythia8,

    • Collision type: proton–proton,

    • Stripping: 21 for 2012 and 28r2 for 2016 (see the source code for the full list),

    • Output filetype: MDST.

sim-version: 09
name: Ds2KKpi
inform:
  - auser
  - firstname.surname@cern.ch
WG: Charm

samples:
  - event-types:
      - 23103005
      - 23103006
    data-types:
      - 2012
      - 2016
    num-events: 100_000
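The counting in the notes above can be reproduced with a few lines of Python. This is a standalone sketch that mirrors the samples entry as a plain dict; it does not use LbMCSubmit, and the variable names are illustrative.

```python
from itertools import product

# Python mirror of the `samples` entry from the YAML above.
sample = {
    "event-types": [23103005, 23103006],
    "data-types": [2012, 2016],
    "num-events": 100_000,
}
# Both polarities are produced by default, as stated in the text.
polarities = ["MagUp", "MagDown"]

# One combination per year, magnet polarity and event type.
combinations = list(product(sample["data-types"], polarities, sample["event-types"]))
total_events = len(combinations) * sample["num-events"]

print(len(combinations))  # 8
print(total_events)       # 800000
```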

Setting priority

Priority can be set directly in the “stage 0” YAML file. It must be set at the samples level.

sim-version: 09
name: Ds2KKpi
inform:
  - auser
  - firstname.surname@cern.ch
WG: Charm

samples:
  - event-types:
      - 23103005
      - 23103006
    data-types:
      - 2012
      - 2016
    num-events: 100_000
    priority: 1a

Using event-type–num-events pairs and SI suffixes

It is possible to use event-type–num-events pairs in the YAML file, in case analysts would like different numbers of events for different event types.

Note that in this case, the num-events key must be omitted.

It is also possible to use the SI suffixes k, M for num-events, either in the num-events key or when using event-type–num-events pairs.

sim-version: 09
name: Ds2KKpi
inform:
  - auser
  - firstname.surname@cern.ch
WG: Charm

samples:
  - event-types:
      - 23103005 : 100_000
      - 23103006 : 200k 
    data-types:
      - 2012
      - 2016
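To make the suffix convention concrete, here is a minimal sketch of how values such as 100_000 and 200k could be turned into integers. The helper parse_num_events is hypothetical and is not LbMCSubmit's own parser; it only handles whole numbers with an optional k or M suffix.

```python
SI_MULTIPLIERS = {"k": 1_000, "M": 1_000_000}


def parse_num_events(value) -> int:
    """Convert values such as 100_000, "100_000", "200k" or "2M" to an int."""
    if isinstance(value, int):
        return value
    text = str(value).strip().replace("_", "")
    multiplier = 1
    if text and text[-1] in SI_MULTIPLIERS:
        multiplier = SI_MULTIPLIERS[text[-1]]
        text = text[:-1]
    return int(text) * multiplier


# Event-type/num-events pairs, written as they would appear once the YAML is loaded.
event_types = [{23103005: 100_000}, {23103006: "200k"}]
requested = {et: parse_num_events(n) for pair in event_types for et, n in pair.items()}
print(requested)  # {23103005: 100000, 23103006: 200000}
```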

Different sample sizes per year and event-type

Considering the different beam energies and integrated luminosities in different running periods, it’s quite common for analysts to request different sample sizes for different years. To achieve this in the YAML file, add more entries to the samples key:

sim-version: 09
name: Ds2KKpi
inform:
  - auser
  - firstname.surname@cern.ch
WG: Charm

samples:
  - event-types:
      - 23103005
    data-types:
      - 2011
      - 2015
    num-events: 50_000
  - event-types:
      - 23103005
    data-types:
      - 2012
    num-events: 100_000
  - event-types:
      - 23103005
    data-types:
      - 2016
      - 2017
      - 2018
    num-events: 200_000
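When analysts provide a table of sample sizes per year, the samples entries can be prepared programmatically. The sketch below groups years that share a sample size into one entry each, reproducing the structure of the example above. The function build_samples is a hypothetical helper, not part of LbMCSubmit.

```python
from collections import defaultdict


def build_samples(event_type: int, events_per_year: dict) -> list:
    """Group years that share a sample size into one `samples` entry each."""
    years_by_size = defaultdict(list)
    for year, num_events in sorted(events_per_year.items()):
        years_by_size[num_events].append(year)
    return [
        {"event-types": [event_type], "data-types": years, "num-events": size}
        for size, years in sorted(years_by_size.items())
    ]


samples = build_samples(
    23103005,
    {2011: 50_000, 2015: 50_000, 2012: 100_000,
     2016: 200_000, 2017: 200_000, 2018: 200_000},
)
for entry in samples:
    print(entry)
```

The resulting list has the same shape as the samples key and can be written out with any YAML library.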

Different stripping versions and file formats

Tip

Remember, for unfiltered MC samples, it is usually not necessary to request a completely new sample just to change the stripping version, since restripping can be run before making ntuples.

If you want a sample produced with a non-default stripping version, it can be specified using the stripping key on a per-sample basis. The example below uses incremental restripping versions for 2011 and 2012:

sim-version: 09
name: Ds2KKpi
inform:
  - auser
  - firstname.surname@cern.ch
WG: Charm

samples:
  - event-types:
      - 23103005
    data-types:
      - 2011
    num-events: 50_000
    stripping:
      version: 21r1p2
  - event-types:
      - 23103005
    data-types:
      - 2012
    num-events: 100_000
    stripping:
      version: 21r0p2

By default, the output file format is MDST. If you need full DST (or some other format) this can be specified with the file-format key, either at the top-level or on a per-sample basis.

sim-version: 09
name: Ds2KKpi
inform:
  - auser
  - firstname.surname@cern.ch
WG: Charm
file-format: DST

samples:
  - event-types:
      - 23103005
    data-types:
      - 2011
    num-events: 50_000
  - event-types:
      - 23103005
    data-types:
      - 2012
    num-events: 100_000

Note

Some other keys (generation, fast-mc and stripping) can also be used at either the top-level or per-sample basis.

Caution

Some formats will remove steps from the request (e.g. specifying DIGI will make LbMCSubmit write only the Gauss and Boole steps for each request). This should only be done on the advice of experts.

Sprucing for Run 3

Currently, no sprucing is added by default to Run 3 productions, but where available it can be added in the stage 0 YAML file. Options are provided to add either the default version or a specific required version. The example below requests the default version:

sim-version: 10d
name: Production with default sprucing
inform:
  - firstname.surname@cern.ch
WG: B2OC
file-format: DST

samples:
  - event-types:
      - 23163031
    data-types:
      - 2024.W31.34
    magnet-polarities:
      - MagUp
    num-events: 1_000_000
    sprucing: true
  

Different generators, collision types and fast simulation

The default collision type is proton–proton. Other types of collisions (proton–ion, ion–ion, SMOG) can be chosen with the collision-types and smog keys:

sim-version: 09

name: Example PbPb
inform:
  - auser
WG: IFT

samples:
  - event-types:
      - 28142001
    data-types:
      - 2015
      - 2018
    num-events: 10_000
    collision-types:
      - PbPb
    magnet-polarities:
      - MagDown
    file-format: DST

The default generator is Pythia8 for proton–proton collisions and EPOS for proton–ion, ion–ion and SMOG collisions. Another generator can be chosen using the generation key:

sim-version: 09
name: Bctochicpi
inform:
  - auser
  - firstname.surname@cern.ch
WG: BandQ

generation:
  production-tool: BcVegPy

samples:
  - event-types:
      - 14243211
    data-types:
      - 2017
      - 2018
    num-events: 100_000

Enabling ReDecay or SplitSim can be done with the fast-mc key, which can be either at the top-level or per-sample:

sim-version: 09
name: Ds2KKpi
inform:
  - auser
  - firstname.surname@cern.ch
WG: Charm

samples:
  - event-types:
      - 23103005
      - 23103006
    data-types:
      - 2012
      - 2016
    num-events: 100_000

fast-mc:
  redecay: yes

Note

Since SplitSim filters events at the Gauss step, the key retention-rate must be specified.

Running local checks and tests

Note

LbMCSubmit is available as a Python package, which can be installed using pip:

$ pip install git+https://gitlab.cern.ch/lhcb-simulation/lbmcsubmit.git@2024.01.10.0

The @2024.01.10.0 suffix can be omitted to get the latest commit on the default branch, or replaced with any tag or branch name. This is handy if you need to check out a specific version.

Alternatively, if you are using a machine with CVMFS installed, LbMCSubmit is available using LbConda:

$ lb-conda Simulation/liaisons lb-mc ...

This is kept up-to-date with the latest release of LbMCSubmit and synchronised with the version used in the request submission pipeline.

The lb-mc command from LbMCSubmit will convert stage 0 YAML files to stage 6 and perform consistency checks.

$ lb-mc my_request.stage0.yaml my_request.stage6.yaml

Some of the checks assume that you have CVMFS mounted at /cvmfs. To disable these, use the command-line option --no-validation.

Test jobs can be run locally using the LHCbDIRAC command dirac-production-request-run-local with a stage 6 YAML file. By default, this will produce 10 events per request in the file. This number can be configured with the key num-test-events, e.g. if you need to generate more events to test filtered requests.

$ lb-mc my_request.stage0.yaml my_request.stage6.yaml
$ lb-dirac dirac-production-request-run-local my_request.stage6.yaml

Optional positional arguments can be used to narrow down which requests get tested.

Usage:
  dirac-production-request-run-local [options] ... yaml_path name event_type

Arguments:
  yaml_path:   Path to the YAML file containing productions to submit
  name:        Name of the production to submit (optional)
  event_type:  The event type to generate (optional)
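If you run local tests often, the two commands can be assembled from a small script. The sketch below only builds the command lines shown above and does not execute anything; the function local_test_commands and the request name used in the example are hypothetical. Because name and event_type are positional, the sketch assumes that event_type can only be given together with name.

```python
import shlex


def local_test_commands(stage0, stage6, name=None, event_type=None):
    """Return the conversion and local-test commands as argv lists."""
    if event_type is not None and name is None:
        # Positional arguments: event_type can only follow name.
        raise ValueError("event_type requires name to be given as well")
    convert = ["lb-mc", stage0, stage6]
    run = ["lb-dirac", "dirac-production-request-run-local", stage6]
    if name is not None:
        run.append(name)
    if event_type is not None:
        run.append(str(event_type))
    return [convert, run]


commands = local_test_commands(
    "my_request.stage0.yaml",
    "my_request.stage6.yaml",
    name="My request name",  # placeholder, not a real production name
    event_type=23103005,
)
for argv in commands:
    print(shlex.join(argv))
```

Each list can be passed to subprocess.run(argv, check=True) on a machine where the commands are available.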

Submitting requests to DIRAC

Tip

This is the recommended way to submit production requests for analysis. The analysts themselves can write the YAML file(s) and create the Merge Request, but it should be reviewed by the WG’s MC liaison before submission.

If you have already used Analysis Productions, this procedure for submitting requests should be very familiar to you, and indeed it uses much of the same machinery behind the scenes.

First create a new branch in the submission repository and create a new directory containing your YAML files, which should follow the naming pattern <directory>/<filename>.yaml, where directory and filename will be used to identify and group the requests in the web interface (NB: due to a technical limitation, please keep these short — below 64 characters). For example, the directory could be an analysis, and the filename simply “request” (for simple cases) or any sensible grouping of requests for cases where it’s useful to write multiple files.
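The naming pattern can be checked before committing. The following is a minimal sketch, not a check performed by the submission repository itself: check_request_path is a hypothetical helper, and it assumes that the 64-character limit applies to the directory name and the filename (without extension) separately.

```python
from pathlib import PurePosixPath

MAX_LENGTH = 64  # names should stay below this many characters


def check_request_path(path: str) -> list:
    """Return a list of problems with a `<directory>/<filename>.yaml` path."""
    parts = PurePosixPath(path).parts
    if len(parts) != 2:
        return [f"expected <directory>/<filename>.yaml, got {path!r}"]
    directory, filename = parts
    problems = []
    if filename.endswith(".yaml"):
        stem = filename[: -len(".yaml")]
    else:
        stem = filename
        problems.append("filename must end in .yaml")
    for label, value in (("directory", directory), ("filename", stem)):
        if len(value) >= MAX_LENGTH:
            problems.append(f"{label} should be shorter than {MAX_LENGTH} characters")
    return problems


print(check_request_path("Ds2KKpi/request.yaml"))  # []
print(check_request_path("request.yaml"))
```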

Next, create a Merge Request with your new branch targeting main. The CI pipeline will launch small test productions, which you can monitor from the web interface. The pipeline will automatically add labels to the MR based on the WGs specified in the YAML files.

Once the pipeline passes, the analysts can optionally check that the request corresponds to what they asked for. Merging to main will trigger submission of the full request. Approval must be given by a liaison and a working group convenor before merging; liaisons may approve their own MRs.

After merging, a GitLab issue will be created automatically to track the request. Any problems that arise will be raised here, and the issue will be closed automatically when the production is finished.