skm's Introduction

Sparse Kernel Methods (SKM)

SKM is an R package for machine learning that includes functions for model training, tuning, prediction, metrics evaluation and sparse kernel computation. The goal of the package is not to provide a full toolkit for data analysis but to focus specifically on seven types of supervised models: 1) generalized boosted machines, which internally use the gbm package, 2) generalized linear models from the glmnet package, 3) support vector machines from the e1071 package, 4) random forests from the randomForestSRC package, 5) Bayesian regression models from the BGLR package, 6) deep neural networks from the keras package and 7) partial least squares, which uses the pls package.

The model functions in SKM were designed with simplicity in mind: the parameters, hyperparameters and tuning specifications are defined directly when calling the function, so the user only has to see one example to understand how the package works. The most important hyperparameters of each model can be tuned with two different methods: grid search and Bayesian optimization.
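As a rough illustration of what grid search means here, the sketch below uses only base R, not SKM itself. It enumerates every combination of two candidate hyperparameter vectors and keeps the combination with the lowest score. The names epochs_number and learning_rate are borrowed from the arguments of SKM's deep_learning function; the candidate values and the scoring function are made-up placeholders, and SKM's internal tuning code may work differently.

```r
# Candidate values for two hyperparameters (illustrative values only).
grid <- expand.grid(
  epochs_number = c(1, 5, 10),
  learning_rate = c(0.01, 0.005)
)

# Hypothetical stand-in for a cross-validated loss; a real tuner would
# fit the model on each training fold and average the validation error.
placeholder_loss <- function(epochs_number, learning_rate) {
  abs(epochs_number - 5) + abs(learning_rate - 0.01) * 100
}

grid$loss <- mapply(placeholder_loss, grid$epochs_number, grid$learning_rate)

# Grid search keeps the combination with the smallest loss.
best <- grid[which.min(grid$loss), ]
print(nrow(grid))
print(best)
```

Three candidate values for one hyperparameter and two for the other give six combinations to evaluate, so the cost of grid search grows multiplicatively with every hyperparameter that is tuned.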

The SKM paper is available at the following link: https://doi.org/10.3389/fgene.2022.887643.

Installation

Currently this package is only available on GitHub, and because it requires the latest versions of some dependencies, you have to make sure those versions are installed. You can use the following code to install SKM:

if (!require("devtools")) {
  install.packages("devtools")
}
devtools::install_github("gdlc/BGLR-R")
devtools::install_github("rstudio/keras")
devtools::install_github("rstudio/tensorflow")
devtools::install_github("brandon-mosqueda/SKM")

If you have problems installing the library, especially as a Windows user who does not have Rtools installed, you can use the installation script in this repository:

source("https://raw.githubusercontent.com/brandon-mosqueda/SKM/main/install.R")
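To see which of the modeling dependencies are already present on a machine, something like the following base R sketch may help. The package list is taken from the introduction; the sketch only reports what is installed and does not check minimum versions, since the required versions are not stated here.

```r
deps <- c("gbm", "glmnet", "e1071", "randomForestSRC", "BGLR", "keras", "pls")

# TRUE for each package that can be loaded from the current library paths.
installed <- vapply(deps, requireNamespace, logical(1), quietly = TRUE)

# Report the installed version, or NA when the package is missing.
versions <- vapply(
  deps,
  function(pkg) {
    if (installed[[pkg]]) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_
    }
  },
  character(1)
)

print(data.frame(package = deps, installed = installed, version = versions))
```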

Authors

  • Osval Antonio Montesinos López (author)
  • Brandon Alejandro Mosqueda González (author, maintainer)
  • Abel Palafox González (author)
  • Abelardo Montesinos López (author)
  • José Crossa (author)

Citation

First option, by paper:

@article{10.3389/fgene.2022.887643,
  author={Montesinos López, Osval Antonio and
    Mosqueda González, Brandon Alejandro and
    Palafox González, Abel and
    Montesinos López, Abelardo and
    Crossa, José},
  title={A General-Purpose Machine Learning R Library for
    Sparse Kernels Methods With an Application for
    Genome-Based Prediction},
  journal={Frontiers in Genetics},
  volume={13},
  year={2022},
  url={https://www.frontiersin.org/articles/10.3389/fgene.2022.887643},
  doi={10.3389/fgene.2022.887643},
  issn={1664-8021}
}

Second option, by package:

citation("SKM")

## To cite package 'SKM' in publications use:
##
##  Montesinos López O, Mosqueda González B, Palafox González A,
##  Montesinos López A, Crossa J (2022). _SKM: Sparse Kernels Methods_.
##  R package version 1.0.
##
## A BibTeX entry for LaTeX users is
##
##  @Manual{,
##    title = {SKM: Sparse Kernels Methods},
##    author = {Osval Antonio {Montesinos López} and
##    Brandon Alejandro {Mosqueda González} and
##    Abel {Palafox González} and
##    Abelardo {Montesinos López} and
##    José Crossa},
##    year = {2022},
##    note = {R package version 1.0},
##  }

skm's Issues

Deep Learning Error

Error in py_call_impl(callable, call_args$unnamed, call_args$named) :
RuntimeError: Detected a call to `Model.fit` inside a `tf.function`. `Model.fit` is a high-level endpoint that manages its own `tf.function`. Please move the call to `Model.fit` outside of all enclosing `tf.function`s. Note that you can call a `Model` directly on `Tensor`s inside a `tf.function` like: `model(x)`.

I get the above error on several attempts to run deep learning models, both univariate and multivariate.

model <- SKM::deep_learning(
  x = as.matrix(kerntest[training, 1:32]),
  y = as.matrix(use3[training, 71:82]),
  epochs_number = c(1, 5, 10),
  learning_rate = c(0.01, 0.005),
  tune_type = "grid_search",
  tune_cv_type = "k_fold",
  tune_folds_number = 5
)

Run in parallel?

Is there a way to run models in parallel (gbm, rf)?

Especially for grid search or Bayesian optimization of hyperparameters, large datasets can be very slow to test without running in parallel.

PLS methods all give same results?

Kernel, orthogonal, wide kernel, etc. all give the same prediction accuracy (same train/test split, same number of latent variables to predict with).

Is this expected?

Random CV - how does it work?

Hello,

I am using Bayesian optimization with random as my CV option, with 0.2 missing.

Does a tune_folds_number of 15 mean there are 15 separate train/test splits with 0.8 and 0.2 of the data, respectively?
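The scheme described in the question can be sketched in base R as follows. This is only an illustration of that interpretation (15 independent random partitions, each holding out 20% of the rows for testing). It is not taken from SKM's source, and the sample size n, the seed and the variable names are made-up values.

```r
set.seed(1)                 # arbitrary seed so the sketch is reproducible
n <- 100                    # hypothetical number of observations
folds_number <- 15          # plays the role of tune_folds_number
testing_proportion <- 0.2   # share of rows held out in each split

# Each element holds the row indices of one random train/test split.
splits <- lapply(seq_len(folds_number), function(i) {
  testing <- sample(n, size = round(n * testing_proportion))
  list(training = setdiff(seq_len(n), testing), testing = testing)
})

print(length(splits))
print(lengths(splits[[1]]))
```

Under this reading the splits are drawn independently, so, unlike k-fold cross-validation, the same row can land in the testing set of more than one split.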
