paper-megameta-hyperparameter-training's Introduction

Hyperparameter-training for the Mega-Meta project

This repository is part of the so-called Mega-Meta study, which reviews factors contributing to substance use, anxiety, and depressive disorders. The study protocol has been pre-registered at Prospero. The procedure for obtaining the search terms, the exact search query, and the selection of key papers by expert consensus can be found on the Open Science Framework.

The screening was conducted in the software ASReview (Van de Schoot et al., 2020) using the protocol described in Hofstee et al. (2021). The server installation is described in Melnikov (2021), and the post-processing is described in van de Brand et al. (2021). The data can be found on DANS [LINK NEEDED].

This repository stores the scripts and plugins that were used for the creation of the three final project files. These final project files were created using a classifier based on a convolutional neural network, optimized using Optuna.

Content

The plugins folder contains the two ASReview plugins that were used. asreview-cnn-hpo was used to find the optimal settings for each dataset, and asreview-model-cnn-17-layer was used to implement these settings.

The scripts folder contains a Jupyter Notebook that can be run in Google Colab. This notebook installs the plugins found in the plugins folder and then creates an ASReview LAB instance that can be used to employ these plugins, find the optimal hyperparameters, and create the final project files.
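The install step described above can be sketched in a few lines of Python. This is a minimal illustration, not the notebook's actual code: the function names `extract_plugin`, `pip_install_command`, and `install_plugin` are hypothetical, and the sketch assumes the uploaded zip file has the plugin's setup file at the archive root.

```python
import subprocess
import sys
import zipfile
from pathlib import Path


def extract_plugin(zip_path, dest_dir):
    """Unpack an uploaded plugin archive and return the destination folder."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return dest


def pip_install_command(plugin_dir):
    """Build the pip command that installs a plugin from a local folder."""
    # sys.executable makes sure pip runs in the same interpreter as the notebook.
    return [sys.executable, "-m", "pip", "install", str(plugin_dir)]


def install_plugin(zip_path, dest_dir):
    """Extract the archive, then install the extracted folder with pip."""
    plugin_dir = extract_plugin(zip_path, dest_dir)
    subprocess.run(pip_install_command(plugin_dir), check=True)
    return plugin_dir
```

In a Colab cell this would be called once per uploaded archive, for example `install_plugin("asreview-cnn-hpo.zip", "plugins/asreview-cnn-hpo")`, where both paths are placeholders. If the archive wraps everything in a top-level folder, the folder passed to pip has to point one level deeper.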

Step-by-step quick guide

  1. Create a project file from one of the input files.
  2. Select the HPO-CNN optimizer as the classifier for the created project file.
  3. Train the project file to receive the optimal CNN hyperparameters. The optimal parameters appear in the console used to start ASReview.
  4. Plug these parameters into the CNN model (not the HPO-CNN model).
  5. Train the CNN model.
  6. Start screening records with the optimized CNN model.
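The idea behind steps 2 to 4 (try many settings, keep the best, then reuse it) can be illustrated with a small random search. This is only a stand-in: the HPO-CNN plugin uses Optuna, not this loop, and the scoring function, the search ranges, and all names below are hypothetical.

```python
import random


def toy_score(nlayers, nfilters):
    """Stand-in for 'train the CNN and measure performance'; NOT a real metric."""
    return -abs(nlayers - 3) - abs(nfilters - 200) / 100


def random_search(score, n_trials=80, seed=0):
    """Try n_trials random settings and return the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {
            "nlayers": rng.randint(1, 5),      # hypothetical search range
            "nfilters": rng.randint(50, 300),  # hypothetical search range
        }
        value = score(**params)
        if value > best_score:
            best_params, best_score = params, value
    return best_params


best = random_search(toy_score)
print(best)
```

The returned dictionary plays the role of the parameters that are later plugged into the CNN model. A dedicated optimizer such as Optuna chooses new trials based on earlier results instead of sampling blindly, which is why it typically needs fewer trials than this sketch.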

Step-by-step guide

This guide details how the CNN was optimized and trained.

  1. The process started with three different Excel files containing ASReview output.

  2. In Google Colab, upload and run the cnn_training_script_for_use_in_google_colab.ipynb file found in the folder called scripts. This file is used to optimize the CNNs for later processing.

    • A custom-made version of the HPO-CNN was used to determine the optimal hyperparameters for each of the project files. These hyperparameters were then applied to the CNN classifier model.

      • To use this custom-made version of the HPO-CNN, upload the folder containing the HPO-CNN version to Colab. A quick way of doing this is to zip the folder and upload the zip file.

      • The notebook contains code for unzipping, and then installing the plugin automatically.

    • The notebook uses NGROK to access the ASReview frontend. A personal NGROK token is needed for the NGROK_AUTH_TOKEN variable. A link with instructions on where to get such a token can be found in the notebook.

    !pip install pyngrok --quiet
    from pyngrok import ngrok
    
    # Terminate any open tunnels
    ngrok.kill()
    
    # Setting the authtoken (optional)
    # Get your authtoken from https://dashboard.ngrok.com/auth
    NGROK_AUTH_TOKEN = "fill token here"
    ngrok.set_auth_token(NGROK_AUTH_TOKEN)

    • After running the notebook, NGROK produces a link. This link opens the ASReview frontend.

    ngrok.connect(port="80", proto="http")

    <NgrokTunnel: "http://d3c5-35-197-26-146.ngrok.io" -> "http://localhost:80">

    • In this frontend, create a new project file with one of the Excel files.

    • Training the new project file takes a long time. When it finishes, the Colab cell that is still running ASReview prints the optimal parameters in its output.

    Hpo trail:  77/80
    Hpo trail:  77/80
    Hpo trail:  77/80
    Hpo trail:  77/80
    FOUND HYPERPARAMETERS:  {'nlayers': 3, 'nfilters':  209}
    
  3. These parameters are used to optimize the CNN used for the final project files.

    • To train these project files, the CNN uses the parameters that the HPO produced. Enter the results in the nlayers and nfilters variables in the asreview-plugin-model-cnn-17-layer\asreviewcontrib\models\cnn.py file.

    def _create_dense_nn_model(_size):
      def model_wrapper():
        backend.clear_session()
    
        nfilters = 209
    
        model = Sequential()

  4. Install this newly created CNN. Using the optimized CNN, create the project files for screening.
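Copying the values from the console by hand is error-prone, so the line printed at the end of the optimization can also be parsed with a few lines of Python. The sketch below assumes the output has exactly the form shown in step 2; the function name `parse_found_hyperparameters` is hypothetical and is not part of either plugin.

```python
import ast

PREFIX = "FOUND HYPERPARAMETERS:"


def parse_found_hyperparameters(console_output):
    """Return the dict printed after the last 'FOUND HYPERPARAMETERS:' line."""
    for line in reversed(console_output.splitlines()):
        line = line.strip()
        if line.startswith(PREFIX):
            # literal_eval only accepts Python literals, so it is safe on log text.
            params = ast.literal_eval(line[len(PREFIX):].strip())
            if not isinstance(params, dict):
                raise ValueError("expected a dict after the prefix")
            return params
    raise ValueError("no hyperparameter line found")


log = """Hpo trail:  77/80
FOUND HYPERPARAMETERS:  {'nlayers': 3, 'nfilters':  209}
"""
params = parse_found_hyperparameters(log)
for name in ("nlayers", "nfilters"):
    print(f"{name} = {params[name]}")
```

For the example log this prints `nlayers = 3` and `nfilters = 209`, which are the assignments to enter in cnn.py.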

Funding

This project is funded by a grant from the Centre for Urban Mental Health, University of Amsterdam, The Netherlands.

Licence

The content in this repository is published under the MIT license.

Contact

For any questions or remarks, please send an email to the ASReview-team.

Contributors

jteijema, rensvandeschoot, sagevdbrand
