Differences

This shows you the differences between two versions of the page.

--- 2019:groups:tools:correlations [2019/06/25 13:41]
sabine.kraml
+++ 2019:groups:tools:correlations [2019/06/27 17:10] (current)
sezen.sekmen [Quantifying overlaps between analysis search regions using ADLs]
@@ Line 1: / Line 1: @@
  ====== Study: correlations between signal regions ======
-//Members: Sophie, Wolfgang, Humberto, Benj, Andy, Sabine //
+//Members: **Sophie**, Wolfgang, Humberto, Benj, Andy, Sabine, Sezen //
 Problem statement: to identify pairs of signal regions of the analyses in the PAD database that can safely be treated as approximately uncorrelated.
@@ Line 10: / Line 10: @@
 For the stats, we'll use bootstrapping rather than directly studying which events fall into common signal regions. Procedure: in the analysis framework the set of populated SRs is reported for //each event//, e.g. as a line of N_SR 1s and 0s in an output file. We then process this: for each event (=line) we sample N_history = O(100-1000) Poisson(lambda=1) weights, and enter these into a set of N_history histograms (each histogram has N_SR bins). We then build a correlation matrix between the bins using the standard sample covariance cov_ij = <sumw_i sumw_j> - <sumw_i> <sumw_j> and corr_ij = cov_ij / sqrt( cov_ii cov_jj ), where the averages run over the N_history histograms. Finally, convert this to a binary "sufficiently independent SRs" matrix with a corr threshold: indep_ij = (|corr_ij| < thres). The 1s in each row (or column) of this binary matrix define a set of statistically independent SRs, which can be trivially combined in a likelihood.
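The bootstrap procedure above can be sketched in a few lines of NumPy. This is only an illustration, not the code used by the group: the function names ''sr_correlations'' and ''independence_matrix'', the default ''n_history=500'' and the default threshold ''thres=0.1'' are assumptions made for the example. The input is the per-event table of 0/1 flags (one row per event, one column per SR).

```python
import numpy as np

def sr_correlations(accept, n_history=500, seed=0):
    """Bootstrap correlation matrix between signal regions.

    accept: array of shape (n_events, n_sr) holding 0/1 flags per event.
    """
    rng = np.random.default_rng(seed)
    accept = np.asarray(accept, dtype=float)
    n_events, _ = accept.shape
    # One Poisson(lambda=1) weight per event and per bootstrap history.
    weights = rng.poisson(1.0, size=(n_history, n_events))
    # sumw[h, i] = summed weight in SR bin i of history h.
    sumw = weights @ accept
    mean = sumw.mean(axis=0)
    # cov_ij = <sumw_i sumw_j> - <sumw_i> <sumw_j>, averaged over histories.
    cov = (sumw.T @ sumw) / n_history - np.outer(mean, mean)
    var = np.diag(cov)
    norm = np.sqrt(np.outer(var, var))
    # Assumption: SRs with zero variance (e.g. empty SRs) get corr = 0.
    corr = np.divide(cov, norm, out=np.zeros_like(cov), where=norm > 0)
    return corr

def independence_matrix(corr, thres=0.1):
    """indep_ij = (|corr_ij| < thres)."""
    return np.abs(corr) < thres

# Toy input: SR0 and SR1 select the same 100 events, SR2 a disjoint 100.
accept = np.zeros((300, 3))
accept[:100, 0] = 1
accept[:100, 1] = 1
accept[100:200, 2] = 1
corr = sr_correlations(accept, n_history=1000)
indep = independence_matrix(corr, thres=0.2)
```

In this toy case SR0 and SR1 come out fully correlated, while SR2 is flagged as independent of both. The residual correlation of disjoint SRs fluctuates at the level of 1/sqrt(N_history), so the threshold should sit well above that.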
+=== MA5 package to be used for correlation studies ===
+- Any version of the code from v1.8.20 onwards, to be downloaded e.g. from {{ 2019:groups:tools:ma5_v1.8.20.tgz | here }}.
+- This version contains two new dedicated functions (to be added in tools/PAD/Build/Main/main.cpp):
+   * void manager.DumpSR(std::ostream&): writes a series of 0 and 1. One entry for each considered signal region. One line per event. To be included in the event loop.
+   * void manager.HeadSR(std::ostream&): writes the header of the file: one comment line (starting with a hash) with the list of analysis-SR. To be included before entering the event loop.
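A file written by these two functions could be read back as sketched below. The exact layout is an assumption here: the sketch accepts flags separated by whitespace or written back to back, and takes the SR names in the header to be whitespace-separated. The helper name ''read_sr_dump'' and the SR labels in the example are hypothetical.

```python
import io
import numpy as np

def read_sr_dump(stream):
    """Read a header line ('# name1 name2 ...') and one 0/1 row per event."""
    names, rows = [], []
    for line in stream:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header: comment line listing the analysis-SR names.
            names = line.lstrip("#").split()
            continue
        tokens = line.split()
        if len(tokens) == 1:
            # Assumption: flags may also be written without separators.
            tokens = list(tokens[0])
        rows.append([int(tok) for tok in tokens])
    return names, np.array(rows, dtype=int)

# Hypothetical example content, not real output.
example = io.StringIO("# anaA-SR1 anaA-SR2 anaB-SR1\n1 0 0\n0 1 1\n0 0 0\n")
names, flags = read_sr_dump(example)
```

The resulting ''flags'' array has one row per event and one column per SR, which is the input needed for the bootstrap step.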
+**Generic overview of [[RecastCodeComparison|what is implemented in which recast framework]]**
@@ Line 21: / Line 34: @@
 | CMS-SUS-16-033     | multijet + MET, 36 fb-1   | T1, T1bbbb, T1tttt(off), T2, T2bb, T2tt(off) | UL |
 | CMS-SUS-16-039     | multilepton EWK  | TChiWZ(off), TChiWH, TChipmSlep... | UL |
-| CMS-SUS-16-052     | 1L stop, soft    | T2bbWWoff, T6bbWWoff | UL, agg-EM | SModelS has only PAS version of this |
+| CMS-PAS-SUS-16-052     | 1L stop, soft    | T2bbWWoff, T6bbWWoff | UL, agg-EM |  |
 | CMS-SUS-17-001     | 2L stop          | T2tt(off), T6bbWW | UL  |
+See also http://madanalysis.irmp.ucl.ac.be/wiki/PublicAnalysisDatabase
@@ Line 42: / Line 57: @@
 | CMS-SUS-16-039     | multilepton EWK  | TChiWZ(off), TChiWH, TChipmSlep... | UL |
+==== Quantifying overlaps between analysis search regions using ADLs ====
+Members: Sezen, Wolfgang (, Harrison)
-=== CheckMate - SModelS correspondence (13 TeV): ===
+Find and visualize overlaps in a model-independent way, without generating events, using simple analysis descriptions written in an [[2019:groups:tools:adl|analysis description language]]. Directly sample the event selection. Useful in the analysis design phase, or for quick comparisons within experiments (e.g. Run2 CMS SUSY pMSSM combination).
-//... to be done ...//
+  * Start from the analysis description, which lists objects and event selections.
+  * Construct a feature space from all mathematically orthogonal "basic" variables (e.g. MET, jet1.pt, jet2.pt, electron1.eta, ...).
+  * Randomly sample the feature space for each analysis based on cuts on the feature space components (jet1.pt > 100, MET > 299, etc.).
+  * Use the sampled points to compute values for "composite" variables such as HT(jets), dphi(jets), MT(lepton, MET), etc.
+  * Compare feature spaces between analyses, find and visualize overlaps and exclusions.
+  * As a very simple first step, we check whether two analyses are disjoint in any of the basic variables.
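The first step and the sampling step above can be illustrated with a minimal sketch. It assumes that each analysis is reduced to open intervals (low, high) on the basic variables, and that points are sampled uniformly, with unbounded cuts capped at a hypothetical ''cap'' value. The function names and the cut values are made up for the example and do not come from any real analysis or ADL tool.

```python
import math
import random

def disjoint_variables(sel_a, sel_b):
    """Basic variables in which the two selections cannot overlap."""
    out = []
    for var in sorted(set(sel_a) & set(sel_b)):
        lo_a, hi_a = sel_a[var]
        lo_b, hi_b = sel_b[var]
        # Open intervals: touching edges still count as disjoint.
        if hi_a <= lo_b or hi_b <= lo_a:
            out.append(var)
    return out

def sample_points(selection, n, cap=2000.0, seed=0):
    """Uniformly sample n points inside the cuts (assumed prior: flat)."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        point = {}
        for var, (lo, hi) in selection.items():
            # Unbounded cuts are capped at +-cap so they can be sampled.
            point[var] = rng.uniform(max(lo, -cap), min(hi, cap))
        points.append(point)
    return points

# Hypothetical selections on basic variables (GeV).
analysis_a = {"MET": (300.0, math.inf), "jet1.pt": (100.0, math.inf),
              "jet2.pt": (50.0, math.inf)}
analysis_b = {"MET": (0.0, 250.0), "jet1.pt": (50.0, math.inf)}

blocked = disjoint_variables(analysis_a, analysis_b)
points = sample_points(analysis_a, n=100)
# Composite variable from the sampled basic ones: HT of the two jets.
ht = [p["jet1.pt"] + p["jet2.pt"] for p in points]
```

Here the two selections are disjoint in MET, so no event can populate both. When no basic variable separates two analyses, the sampled points and the composite variables built from them are what would be compared next.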
