
Tools and Monte Carlos: Jet Substructure

Jet substructure analysis is an important tool for extracting the significant signal in jets and for reconstructing characteristic jet mass and shape variables and, e.g., the internal flow inside jets, for use in, e.g., searches. This project evaluates various substructure techniques and configurations with respect to their effectiveness in suppressing pile-up for given observables and in enhancing signal-to-background ratios.

Beam conditions and pile-up

LHC in 2015 and beyond

While the exact conditions at the LHC are not yet known, we expect at least 30, and likely more, pile-up interactions in each recorded event (more than 200 is possible). The center-of-mass energy is set to $ \sqrt{s} = 13 {\rm\ TeV} $.

Minimum bias samples

Minimum bias samples have been produced with Pythia8 using the tune 4C. The samples include single-diffractive, double-diffractive and non-diffractive interactions in the default mix. Particles are generated without any phase space restriction. The generating code fills a ROOT tuple, the structure of which is documented in the BOOST2012 TWiki.

The generating code can be found in the file […]. The example Makefile is in the file makefile.txt. It is highly tailored to the setup on my machine, but should give you an idea.

Signal samples

The following signal samples are discussed:

  1. Final states with boosted objects:
    1. boosted top in $t\bar{t}$ ($\hat{p}_{T} > 200(450) {\rm\ GeV}$); fully hadronic decays are fine;
    2. inclusive jets in similar phase space as $t\bar{t}$;
    3. $HZ\to b\bar{b} \nu\nu$ with $H(125)$; boosted $b\bar{b}$ system and large $E_{\rm T}^{\rm miss}$;
  2. VBS/VBF with (not boosted) tag jets
    1. WW/ZZ/WZ continuum, high $M_{WW/ZZ/WZ} > 1 {\rm\ TeV}$ (may not be possible in Pythia8; we may have to go for a high-mass Higgs-like particle instead?)
    2. VBF Higgs $H(125)$, $H\to\gamma\gamma$

Configurations & Software

The following configurations for the (average) number of pile-up interactions $\langle\mu\rangle$ are suggested: $$ \langle\mu\rangle = \{ 30, 60, 120, 240 \} $$ The actual number of pile-up interactions added to the signal event is taken from a Poisson distribution with mean $\langle\mu\rangle$.
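As an illustration of the Poisson sampling described above, here is a minimal sketch using only the C++ standard library. The function names `suggestedMu` and `drawPileupCount` are hypothetical and are not taken from the eventmanager code, which may implement this differently.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Suggested average pile-up multiplicities <mu> from the text.
// (Hypothetical helper, not part of the project software.)
inline std::vector<int> suggestedMu() { return {30, 60, 120, 240}; }

// Draw the actual number of pile-up interactions to add to one signal
// event from a Poisson distribution with mean <mu>.
inline int drawPileupCount(double meanMu, std::mt19937& rng) {
    assert(meanMu > 0.0);  // a Poisson mean must be positive
    std::poisson_distribution<int> poisson(meanMu);
    return poisson(rng);
}
```

Calling `drawPileupCount(60.0, rng)` once per signal event, with a single seeded `std::mt19937` shared across events, gives counts that fluctuate around 60.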

Software version 1

Pile-up can be added dynamically using Peter's eventmanager software. All the software for the LH pile-up/substructure studies is being developed through a Bitbucket repository: you can check out without an account. Send a request for access if you want to commit (push) to the main code repository.

(Previous version of software as a tarball.)

The main concept here is that the ROOT-based raw data is converted into an Event object containing lists of PseudoJets (from FastJet) representing

  • the total particle (hadron) level event (signal + pile-up)
  • the particle (hadron) level signal event
  • the particle (hadron) level pile-up event

The code is not very convenient to use in this version. On most systems I expect a

make all

should work to compile the library and the example in anal02.C. For implementing your own analysis, please check anal02.C and Zprime_Py8::analyze(Event& rEvt) (your playground) in Zprime_Py8.C as examples. The program supports a few command line arguments:

anal02.exe --help --mu=<mu> --nevts=<number of (signal) events>
           --sigflist=<text file with list of signal files>
           --puflist=<text file with list of pile-up files>

Some hints:

  • --help prints a brief usage instruction (which I think is not up-to-date, so please ignore!)
  • --mu=<mu> expects the number of interactions per event. If <mu> < 0, exactly |<mu>| interactions are collected into one event. <mu> > 0 means that a Poisson-distributed number of pile-up interactions, with mean <mu>, will be collected from the pile-up (minimum bias) event samples.
    • if you specify both signal and pile-up input, <mu> should be the number of pile-up events to be added to one signal event
  • --sigflist=<file> specifies a text file (no “” around file name!) with a list of signal files to be processed. If this list is not given, Zprime_Py8::analyze(…) will not be invoked.
  • --puflist=<file> specifies a text file (no “” around file name!) with a list of pile-up files to be processed. If this list is not given, only signal events are analyzed (you should set --mu=1 in this case!).
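The sign convention for `<mu>` can be sketched as follows. This is not the actual anal02 code: `parseMu` and `interactionsPerEvent` are hypothetical names, and the treatment of `<mu>` equal to zero is an assumption, since the text does not specify it.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <string>

// Hypothetical helper: extract the value from an argument of the form
// "--mu=<mu>". Returns false if the argument is a different option.
// std::stod throws if the text after "--mu=" is not a number.
inline bool parseMu(const std::string& arg, double& mu) {
    const std::string key = "--mu=";
    if (arg.compare(0, key.size(), key) != 0) return false;
    mu = std::stod(arg.substr(key.size()));
    return true;
}

// Number of interactions to collect into one event:
//   mu < 0 : exactly |mu| interactions
//   mu > 0 : Poisson-distributed with mean mu
//   mu == 0: no interactions (assumption, not specified in the text)
inline int interactionsPerEvent(double mu, std::mt19937& rng) {
    assert(std::isfinite(mu));
    if (mu < 0.0) return static_cast<int>(std::lround(-mu));
    if (mu == 0.0) return 0;
    std::poisson_distribution<int> poisson(mu);
    return poisson(rng);
}
```

With this convention, `--mu=-40` always yields 40 interactions per event, while `--mu=40` yields a count that fluctuates around 40.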


The idea is to test filtering/grooming techniques as a way to reduce the sensitivity to soft backgrounds (initially the filter used with the BDRS tagger was meant to reduce the sensitivity to the UE).

Following the Filter tool in FastJet3, we need (i) a jet definition to break the jet into subjets and (ii) a selection criterion that decides which subjets are kept. We'll consider 3 options:

  • “Filtering”:
    • cluster with Cambridge/Aachen, $R_{\rm filt} = \eta_{\rm filt} R_{\rm jet}$ [$\eta_{\rm filt}$ between 1/3 and 1/2 sounds about right]
    • keep the $n_{\rm filt}$ hardest subjets [$n_{\rm filt}$ = 2, 3, 4]
  • “Trimming”:
    • cluster with Cambridge/Aachen, $R_{\rm filt} = 0.2$
    • keep all subjets with $p_{t,\rm sub} > f p_{t,\rm jet}$ [$f$ between 0.01 and 0.05 should be fine]
  • “Area-filtering”:
    • cluster with Cambridge/Aachen, $R_{\rm filt} = 0.2$
    • keep all subjets with $p_{t,\rm sub} > \rho A_{\rm sub} + n \sigma \sqrt{A_{\rm sub}}$ [$n$ between 2 and 5]

The 3rd option is new and based on the idea that the PU scales $\rho$ and $\sigma$ should set the scale of the PU/noise removal.
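The three selection criteria listed above can be sketched on a plain list of subjets as follows. This sketch deliberately avoids FastJet so that it stays self-contained: the `Subjet` struct and the function names are hypothetical stand-ins, the subjets are assumed to come from a Cambridge/Aachen reclustering with radius $R_{\rm filt}$, and $\rho$ and $\sigma$ are assumed to have been estimated elsewhere.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <vector>

// Minimal stand-in for a subjet (hypothetical, not a FastJet type):
// only the transverse momentum and the area enter the selections.
struct Subjet {
    double pt;    // transverse momentum p_{t,sub}
    double area;  // subjet area A_sub
};

// "Filtering": keep the nFilt hardest subjets.
inline std::vector<Subjet> keepNHardest(std::vector<Subjet> subjets,
                                        std::size_t nFilt) {
    std::sort(subjets.begin(), subjets.end(),
              [](const Subjet& a, const Subjet& b) { return a.pt > b.pt; });
    if (subjets.size() > nFilt) subjets.resize(nFilt);
    return subjets;
}

// "Trimming": keep subjets with p_{t,sub} > f * p_{t,jet}.
inline std::vector<Subjet> trim(const std::vector<Subjet>& subjets,
                                double f, double ptJet) {
    std::vector<Subjet> kept;
    std::copy_if(subjets.begin(), subjets.end(), std::back_inserter(kept),
                 [=](const Subjet& s) { return s.pt > f * ptJet; });
    return kept;
}

// "Area-filtering": keep subjets with
// p_{t,sub} > rho * A_sub + n * sigma * sqrt(A_sub).
inline std::vector<Subjet> areaFilter(const std::vector<Subjet>& subjets,
                                      double rho, double sigma, double n) {
    std::vector<Subjet> kept;
    std::copy_if(subjets.begin(), subjets.end(), std::back_inserter(kept),
                 [=](const Subjet& s) {
                     assert(s.area >= 0.0);
                     return s.pt > rho * s.area + n * sigma * std::sqrt(s.area);
                 });
    return kept;
}
```

All three functions return the kept subjets, so the groomed jet is the sum of whatever survives. In FastJet itself, the first two options would be expressed through a selector passed to `fastjet::Filter` rather than through free functions like these.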

For pile-up subtraction, one usually wants to first subtract the average PU contamination ($\rho A_{\rm sub}$) from each subjet before deciding which subjets are to be kept. This is implemented in fastjet::Filter. Note that with “Area-filtering” you should then cut with $p_{t,\rm sub} > n \sigma \sqrt{A_{\rm sub}}$, since the baseline has already been subtracted.
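The change of threshold after subtraction can be checked with a small scalar sketch. The function names are hypothetical, and clamping a negative subtracted $p_t$ to zero is an assumption made here for simplicity; it is not meant to describe what fastjet::Filter does internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Subtract the average pile-up contamination rho * A_sub from a subjet pt.
// Assumption: a negative result is clamped to zero.
inline double subtractedPt(double pt, double area, double rho) {
    assert(area >= 0.0);
    return std::max(0.0, pt - rho * area);
}

// Area-filtering decision on an already subtracted pt: the baseline
// rho * A_sub is gone, so only the fluctuation term remains in the cut.
inline bool keepAfterSubtraction(double ptSubtracted, double area,
                                 double sigma, double n) {
    return ptSubtracted > n * sigma * std::sqrt(area);
}

// The same decision expressed on the unsubtracted pt, for comparison.
inline bool keepBeforeSubtraction(double pt, double area, double rho,
                                  double sigma, double n) {
    return pt > rho * area + n * sigma * std::sqrt(area);
}
```

Away from the exact threshold, both forms select the same subjets: subtracting $\rho A_{\rm sub}$ from both sides of the original inequality leaves $p_{t,\rm sub} - \rho A_{\rm sub} > n \sigma \sqrt{A_{\rm sub}}$.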

2013/groups/tools/substructure.txt · Last modified: 2013/06/07 15:46 by andy.buckley