ml-project-1-tp-link's Introduction

Machine Learning Fall 2021 Project 1 (Team: TP-Link)

This is our code for Project 1 of Machine Learning, Fall 2021.

  • Team Name: TP-Link
  • Team Members:
  1. Silin Gao ([email protected])
  2. Shaobo Cui ([email protected])
  3. Dongge Wang ([email protected])

Requirements

  • Python
  • Numpy
  • Matplotlib

Scripts

  • implementations.py: implementations of baseline models and our improved model.
  • processor.py: data processor, including data imputation, normalization, outlier filtering, feature augmentation and selection.
  • run_baselines.py: script for training and evaluating baseline models, including cross validation and test set prediction.
  • run.py: script for training and evaluating our improved model, including cross validation and test set prediction.
  • toolkits.py: toolkits including data loader, batch generator, metrics computer, file writer, etc.
  • plot_weights.py: script for plotting output weights of features in reg_logistic_regression (outlier factor = 10, polynomial factor = 4), used for feature selection.
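
A minimal sketch of weight-based feature selection, as an illustration of how the plotted weights could be turned into a top-20 feature subset. The helper name `select_top_k_features` and the ranking criterion (absolute weight) are assumptions for this example, not necessarily what plot_weights.py or processor.py do.

```python
import numpy as np


def select_top_k_features(tx, w, k=20):
    """Keep the k columns of tx whose weights have the largest magnitude.

    Hypothetical helper: ranking by |w| is an assumed selection criterion.
    """
    # Sort feature indices by decreasing absolute weight.
    order = np.argsort(-np.abs(w), kind="stable")
    # Keep the k strongest features, restored to their original column order.
    keep = np.sort(order[:k])
    return tx[:, keep], keep
```

Ranking by absolute weight is only meaningful when the features are on a comparable scale, for example after normalization.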

Running Experiments

Data Processing

Place the original training and test sets (train.csv and test.csv) in the root directory of our project.

python processor.py

Outputs:

  • filter_factor_${outlier_factor}_y.csv: training labels after outlier filtering.
  • filter_factor_${outlier_factor}_tx.csv: training features after outlier filtering.
  • filter_factor_${outlier_factor}_poly_${polynomial_factor}_tx.csv: training features after outlier filtering and feature augmentation.
  • select_feature_top20_filter_factor_${outlier_factor}_poly_${polynomial_factor}_tx.csv: training features after outlier filtering, feature augmentation and feature selection.
  • test_id.csv: sample ids in test set.
  • test_tx.csv: original test features.
  • test_poly_${polynomial_factor}_tx.csv: test features after feature augmentation.
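
The processing steps named above (imputation, normalization, outlier filtering, feature augmentation) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in processor.py: the function names, the `missing_value` sentinel, median imputation, and reading `outlier_factor` as a threshold in standard deviations are all hypothetical choices made for the example.

```python
import numpy as np


def impute_median(tx, missing_value):
    """Replace entries equal to missing_value with the median of the observed column values."""
    tx = tx.astype(float)  # astype returns a copy, so the input is not modified
    mask = tx == missing_value
    tx[mask] = np.nan
    medians = np.nanmedian(tx, axis=0)
    rows, cols = np.where(mask)
    tx[rows, cols] = medians[cols]
    return tx


def standardize(tx):
    """Shift each column to zero mean and scale it to unit standard deviation."""
    mean = tx.mean(axis=0)
    std = tx.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled
    return (tx - mean) / std


def filter_outliers(tx, y, outlier_factor):
    """Drop samples with any standardized feature beyond +/- outlier_factor (assumed rule)."""
    keep = np.all(np.abs(tx) <= outlier_factor, axis=1)
    return tx[keep], y[keep]


def build_poly(tx, polynomial_factor):
    """Stack a bias column with element-wise powers 1..polynomial_factor of every feature."""
    powers = [tx ** d for d in range(1, polynomial_factor + 1)]
    return np.hstack([np.ones((tx.shape[0], 1))] + powers)
```

With D input features, `build_poly` produces 1 + D * polynomial_factor columns, which is why a feature selection step afterwards can be useful.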

Baseline Training and Evaluation

python run_baselines.py

Note: run_baselines.py loads and pre-processes the original data (train.csv and test.csv) directly, so it does not depend on the outputs of processor.py.

Outputs:

  • results_${model_name}_k_${cross_validation_folds}.csv: cross validation results.
  • results_${model_name}.csv: final prediction results on test set.
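
For readers unfamiliar with k-fold cross validation, here is a small sketch of the idea. The names `build_k_indices`, `cross_validate`, `train_fn` and `score_fn` are hypothetical and chosen for the example; the actual scripts may split and score the data differently.

```python
import numpy as np


def build_k_indices(n_samples, k_fold, seed=1):
    """Shuffle the sample indices and split them into k_fold equally sized groups."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)
    fold_size = n_samples // k_fold  # leftover samples are dropped
    return np.array([indices[i * fold_size:(i + 1) * fold_size] for i in range(k_fold)])


def cross_validate(y, tx, k_fold, train_fn, score_fn, seed=1):
    """Average a validation score over k_fold train/validation splits.

    train_fn(y, tx) returns weights; score_fn(y, tx, w) returns a number.
    """
    k_indices = build_k_indices(len(y), k_fold, seed)
    scores = []
    for k in range(k_fold):
        val_idx = k_indices[k]
        # All folds except the k-th one form the training set.
        train_idx = np.delete(k_indices, k, axis=0).ravel()
        w = train_fn(y[train_idx], tx[train_idx])
        scores.append(score_fn(y[val_idx], tx[val_idx], w))
    return float(np.mean(scores))
```

Passing the model as `train_fn` keeps the splitting logic independent of any particular baseline.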

Improved Model Training and Evaluation

python run.py

Outputs:

  • results_reg_logistic_dynamic_k_${cross_validation_folds}_poly_${polynomial_factor}.csv: cross validation results.
  • results_reg_logistic_dynamic_poly_${polynomial_factor}.csv: final prediction results on test set (best submission).
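
As background, a minimal sketch of L2-regularized logistic regression trained by gradient descent. It assumes labels in {0, 1} and a fixed step size `gamma`; the signature is an assumption for this example and may differ from the one in implementations.py. What `dynamic` refers to in the output file names is not described here, so the sketch does not try to reproduce it.

```python
import numpy as np


def sigmoid(t):
    """Logistic function, with the input clipped to avoid overflow in exp."""
    return 1.0 / (1.0 + np.exp(-np.clip(t, -500, 500)))


def reg_logistic_regression(y, tx, lambda_, initial_w, max_iters, gamma):
    """Gradient descent on the mean logistic loss plus lambda_ * ||w||^2.

    Assumes y contains 0/1 labels. Returns the final weights and the
    unpenalized mean logistic loss at those weights.
    """
    w = initial_w.astype(float)
    n = len(y)
    for _ in range(max_iters):
        pred = sigmoid(tx @ w)
        # Gradient of the mean logistic loss plus gradient of the L2 penalty.
        grad = tx.T @ (pred - y) / n + 2 * lambda_ * w
        w = w - gamma * grad
    pred = sigmoid(tx @ w)
    eps = 1e-12  # guards against log(0)
    loss = -np.mean(y * np.log(pred + eps) + (1 - y) * np.log(1 - pred + eps))
    return w, float(loss)
```

A larger `lambda_` shrinks the weights more strongly, which matters once polynomial augmentation has multiplied the number of features.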

Results

The test set predictions of all baseline models and of our improved model are included in the predictions folder.
