- A new Boosted Decision Tree (QBDT) method that incorporates systematic uncertainties into the training, for High Energy Physics
- reference: https://arxiv.org/abs/1810.08387 (accepted for publication in NIM A)
- An example in High Energy Physics, the search for Higgs -> tau tau gamma, is under the directory tautaugamma
- To use it, just git clone. No compilation is needed; only Python and ROOT are required.
- Ligang Xia, [email protected], [email protected]
- run training:
python runbdt.py trees0 0 0 10
- run testing:
python testbdt.py trees0
- After training is done, you can compare with the training and testing results I put in trees0/example/.
- run training:
python runbdt.py trees1 1 1 10
- run testing:
python testbdt.py trees1 1
- After training is done, you can compare with the training and testing results I put in trees1/example/.
- command format:
python runbdt.py dir Nsysts Switch Ntrees # see the explanation below
- dir: directory for storing training results
- Nsysts: number of systematics
- Switch: a boolean flag to switch systematics on or off in training. If Nsysts==0, Switch is always 0.
- Ntrees: number of trees used for training, 100 by default if not specified.
- command format:
python testbdt.py dir Nsysts Ntrees # see the explanation below
- dir: directory for storing training results
- Nsysts: number of systematics, 0 by default if not specified
- Ntrees: number of trees used for testing, 100 by default if not specified.
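The command-line contract above can be sketched in Python. This is only an illustration of the documented arguments and defaults (Nsysts defaults to 0, Ntrees to 100); the name `parse_testbdt_args` is hypothetical, and the real testbdt.py may parse its arguments differently:

```python
import sys

def parse_testbdt_args(argv):
    # Sketch of: python testbdt.py dir Nsysts Ntrees
    #   dir    : directory storing the training results (required)
    #   Nsysts : number of systematics, 0 by default if not specified
    #   Ntrees : number of trees used for testing, 100 by default if not specified
    if len(argv) < 2:
        raise SystemExit("usage: python testbdt.py dir [Nsysts] [Ntrees]")
    outdir = argv[1]
    nsysts = int(argv[2]) if len(argv) > 2 else 0
    ntrees = int(argv[3]) if len(argv) > 3 else 100
    return outdir, nsysts, ntrees

if __name__ == "__main__":
    print(parse_testbdt_args(sys.argv))
```

For example, `python testbdt.py trees0` would resolve to `("trees0", 0, 100)`, matching the defaults listed above.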
- qbdtmodule.py : defines the QBDT class (you do not need to touch it)
- runbdt.py : performs the training
- testbdt.py : tests and shows the performance
- root_dir : a directory storing ROOT files, including nominal and systematic ntuples
- share : a directory storing other scripts, maybe useful, but you do not need to touch it
- AtlasStyle : a config script for plotting, borrowed from ATLAS
- We have to add a branch in the ROOT file to tell the algorithm which events are used for training and which for testing. In the current example, this branch is "trainflag". It is generated randomly and uniformly between 0 and 1. Events with "trainflag<0.5" are used for training, while the other events are used for testing. I will try to split the events automatically in the future.
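The train/test split described above can be sketched as follows. This is a plain-Python illustration of the "trainflag" rule, not the repository's actual ROOT-based code; the helper names `assign_trainflag` and `split_events` are hypothetical:

```python
import random

def assign_trainflag(n_events, seed=0):
    # Mimic the "trainflag" branch: one uniform random number
    # in [0, 1) per event (seeded here for reproducibility).
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_events)]

def split_events(flags, threshold=0.5):
    # Events with trainflag < 0.5 go to training,
    # the remaining events go to testing.
    train = [i for i, f in enumerate(flags) if f < threshold]
    test = [i for i, f in enumerate(flags) if f >= threshold]
    return train, test
```

In the real workflow the flag lives as a branch in the ntuple, so each event keeps the same assignment across the nominal and systematic trees.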
- Add a function to split events for training and testing automatically.
- Try to improve the training speed. I find Python is slow; maybe I should consider rewriting it in C++.
I would like to thank my wife, who is always pushing me to publish PRL/Science/Nature papers, and whom I always disappoint ...