higgs-boson-detection's Introduction

Higgs boson identification: Machine Learning Approach

Python version: 3

Libraries:

  • NumPy
  • Matplotlib

This repository contains all the code written for the project, as well as the paper, which briefly explains our approach.

We provide the following notebooks which describe our full approach:

  • EDA & Feature importance.ipynb - Summarizes the exploratory data analysis and feature importance analysis done before training any models.
  • Modeling.ipynb - Presents the full deployment of our model, from hyperparameter tuning to predictions.

The notebooks rely on several Python files, which contain the implementations of all algorithms and general-purpose functions used in the project.

  • data_processing.py - Contains all functions used in the data processing part of the code.
  • feature_importance.py - Contains methods such as Riemann approximation and Gaussian test, which are used to find the important features.
  • implementations.py - Contains the optimization algorithms of our project. Apart from the 6 required functions, we have implemented several advanced methods that helped us achieve our best score.
  • objective_functions.py - Contains all objective functions, such as 'mse_loss' and 'logit_loss', plus the functions for calculating their gradients.
  • proj1_helpers.py - Functions provided by the teaching team for simple data loading and creation of the submission.
  • extra_helpers.py - General functions that we implemented to simplify our data processing and modeling.
  • run.py - Running this file loads our best model and produces the predictions for the test set.
  • run_functions.py - Functions that avoid code repetition in our run.py file.
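
As an illustration of the kind of objective functions and gradients described for objective_functions.py, here is a minimal NumPy sketch. Only the names 'mse_loss' and 'logit_loss' come from the list above. The gradient function names, the argument order (y, tx, w), the 1/2 factor in the MSE, and the {0, 1} label encoding for the logistic loss are all assumptions made for this example and may differ from the actual file.

```python
import numpy as np


def mse_loss(y, tx, w):
    """Half mean squared error of the linear model tx @ w (assumed convention)."""
    e = y - tx @ w
    return 0.5 * np.mean(e ** 2)


def mse_gradient(y, tx, w):
    """Gradient of mse_loss with respect to w (hypothetical helper name)."""
    e = y - tx @ w
    return -tx.T @ e / len(y)


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def logit_loss(y, tx, w):
    """Mean negative log-likelihood of logistic regression, labels assumed in {0, 1}."""
    z = tx @ w
    # log(1 + exp(z)) computed stably via logaddexp
    return np.mean(np.logaddexp(0.0, z) - y * z)


def logit_gradient(y, tx, w):
    """Gradient of logit_loss with respect to w (hypothetical helper name)."""
    return tx.T @ (sigmoid(tx @ w) - y) / len(y)
```

With functions of this shape, a plain gradient descent step would be `w = w - gamma * mse_gradient(y, tx, w)`, where `gamma` is a step size chosen by the user.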

Additionally, we provide the full collection of .csv files used for the project in the data folder. It contains the following information:

  • train.csv - labeled Higgs boson data used for training the models.
  • test.csv - dataset for which we have to predict labels.
  • run folder - contains all optimized hyperparameters and characteristics of the final models. It is used by the run file, which loads the optimal parameters directly instead of searching for them, which speeds up the process.

higgs-boson-detection's People

Contributors

anthonyyazdani, zhecho1215, nelik21
