
ridge's Introduction

Hi there, welcome to my GitHub Profile 👋

  • 🔭 I’m currently working on Missing Value Treatment in Time Series
  • 🌱 I’m interested in machine learning, time series analysis, artificial intelligence, anomaly detection and data compression
  • 🚀 Also visit my personal homepage for more info about me: www.data-science-decaf.com
  • 📫 How to reach me: just send me an email if you have questions or suggestions for one of my projects.

ridge's People

Contributors

dfrankow, jeromepaul, steffenmoritz

ridge's Issues

Error in numeric(nPCs) : invalid 'length' argument

At https://github.com/cran/ridge/blob/master/R/linearRidge.R#L96, the code reads:

nPCs <- which.max(propVar[propVar < 90]) + 1

What if all the elements in propVar are 90 or greater? It will throw the error in the title of this issue.
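The failure can be reproduced in base R without the package. The sketch below uses a made-up propVar vector, and choose_nPCs is a hypothetical helper showing one possible guard; it is not the package's actual fix.

```r
# Hypothetical cumulative percentages of variance explained; the first
# component alone already exceeds 90, so no element is below the threshold.
propVar <- c(95.2, 99.1, 100)

idx  <- which.max(propVar[propVar < 90])  # integer(0): the subset is empty
nPCs <- idx + 1                           # still a zero-length vector

# numeric() needs a single non-negative length, so a zero-length nPCs fails.
msg <- tryCatch(numeric(nPCs), error = function(e) conditionMessage(e))
print(msg)

# One possible guard (an assumption, not the package's code): fall back to
# the first component when nothing lies below the threshold, and never
# return more components than exist.
choose_nPCs <- function(propVar, threshold = 90) {
  below <- which(propVar < threshold)
  if (length(below) == 0L) return(1L)
  min(max(below) + 1L, length(propVar))
}

choose_nPCs(propVar)
choose_nPCs(c(50, 85, 95, 100))
```

For a monotonically increasing propVar, the guarded helper agrees with the original expression whenever at least one element is below the threshold.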

logisticRidge predict

Hello,
I am using the logisticRidge function to examine how my binary variable (0/1) is related to a set of linear predictors. I am writing to ask for clarification on the ‘predict’ portion of the model.

After fitting a logisticRidge model, I am trying to use the predict function to get fitted probabilities of the outcome. However, I appear to get different values when I call ‘predict’ without new data and when I call ‘predict’ while supplying the original data that I fitted the model to as newdata:

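data(mtcars)
library(ridge)
model <- logisticRidge(vs ~ ., data = mtcars)
# Fitted probabilities without newdata
predict(model, type = "response")
# Same call, supplying the training predictors (column 8, vs, dropped)
predict(model, type = "response", newdata = mtcars[, -8])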

Which usage of predict provides accurate values?

I have tested this on several datasets and with different combinations of binary and continuous predictors. They all show the same mismatch in predicted values. I am happy to provide more reproducible examples if you would like.

Thank you for your time!

ridge:::vcov.ridgeLinear computes wrong ridge standard errors

Dear Steffen,

I believe there are some errors in the expression used to compute the standard errors of the ridge linear coefficients (ridge:::vcov.ridgeLinear). Specifically, it always penalizes the intercept (even when scaling = "none" is passed to ridge::linearRidge) and it uses an incomplete formula. The proper formula for the ridge standard errors (the square roots of the diagonal of the matrix below) is:

$$\widehat{\mathrm{Var}}(\hat{\theta}) = \hat{\sigma}^2 \, (X^\top X + \lambda \Gamma)^{-1} \, X^\top X \, (X^\top X + \lambda \Gamma)^{-1}$$

where X is the design matrix including the intercept column and Gamma is the identity matrix of dimensions (m+1)×(m+1), with element i=1, j=1 set to zero to avoid penalizing the intercept (only in the case of scaling = "none"). It is not, as programmed in ridge:::vcov.ridgeLinear:

$$\hat{\sigma}^2 \, (X^\top X + \lambda I)^{-1}$$

You may find references for this formula below:
https://www.statlect.com/fundamentals-of-statistics/ridge-regression
https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/NCSS/Ridge_Regression.pdf
https://stats.stackexchange.com/questions/2121/how-can-i-estimate-coefficient-standard-errors-when-using-ridge-regression

See also below a script reproducing what I believe the proper Ridge standard errors are:

# Load libraries
library(MASS)
library(ridge)

# Generate X, y
set.seed(1)
n = 100
m = 10
mu = rnorm(m) # Generate mean vector
Sigma = rWishart(n = 1, df = m, Sigma = diag(m))[,,1] # Generate Covariance matrix
X = mvrnorm(n = n, mu = mu, Sigma = Sigma)
X_ = cbind(1, X) # X with an intercept column
THETA = rnorm(m+1)
y = X_ %*% THETA + rnorm(n, sd = 3)

# Using unscaled ridge Regression
lambda = 100
ridge_model = linearRidge(y ~ X, lambda = lambda, scaling = "none")
I = diag(m+1); I[1,1] = 0

# Linear coefficients
solve(t(X_) %*% X_ + lambda*I) %*% t(X_) %*% y
(theta_ridge = coef(ridge_model)) # Ridge coefficients coincide

# Variance of residuals
(var_epsilon = sum((y - X_ %*% theta_ridge)^2)/(n-(m+1)))

# Covariance matrix of the coefficients (standard errors are the square roots of its diagonal)
var_epsilon * solve(t(X_) %*% X_ + lambda*I) %*% t(X_) %*% X_ %*% solve(t(X_) %*% X_ + lambda*I)

# However, ridge:::vcov.ridgeLinear computes:
ridge:::vcov.ridgeLinear(ridge_model)
var_epsilon * solve(t(X_) %*% X_ + lambda*diag(m+1)) # WRONG FORMULA

I am aware that, due to the lack of scale invariance of the ridge estimator, centering and scaling is recommended (and hence dropping the intercept column). Also, the degrees of freedom in ridge regression are not really n-(m+1) (some authors speak of effective degrees of freedom as n minus the trace of the hat matrix).

FYI, I am using ridge_3.0 on R version 4.1.0 (2021-05-18), platform: x86_64-w64-mingw32/x64 (64-bit), running under: Windows >= 8 x64 (build 9200).

Please let me know if I have overlooked anything. Have a nice day,

Best,
Ben.
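The remark above about effective degrees of freedom can be sketched in base R alone. This is a minimal illustration under stated assumptions: the data are simulated independently of the script above, the intercept is left unpenalized through Gamma, and the residual variance is divided by n minus the trace of the hat matrix. It is not what ridge:::vcov.ridgeLinear does.

```r
set.seed(1)
n <- 100
m <- 10
X  <- matrix(rnorm(n * m), n, m)         # simulated predictors (illustrative)
X_ <- cbind(1, X)                        # design matrix with intercept column
y  <- drop(X_ %*% rnorm(m + 1) + rnorm(n, sd = 3))

lambda <- 100
Gamma <- diag(m + 1)
Gamma[1, 1] <- 0                         # do not penalize the intercept

XtX   <- crossprod(X_)                   # X'X
A     <- solve(XtX + lambda * Gamma)     # (X'X + lambda * Gamma)^-1
theta <- drop(A %*% crossprod(X_, y))    # ridge coefficients

# Hat matrix and effective degrees of freedom (its trace)
H      <- X_ %*% A %*% t(X_)
df_eff <- sum(diag(H))

# Residual variance with n - tr(H) in the denominator
sigma2 <- sum((y - X_ %*% theta)^2) / (n - df_eff)

# Sandwich form of the covariance matrix and the resulting standard errors
vcov_ridge <- sigma2 * A %*% XtX %*% A
se_ridge   <- sqrt(diag(vcov_ridge))

df_eff
se_ridge
```

With lambda = 0 the trace equals m + 1 and the expression reduces to the ordinary least-squares covariance matrix, which is a quick sanity check for the formula.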

linux conda install ridge failed

  1. I tried ‘conda install -c r r-ridge’, but it keeps solving the environment and never finishes.
  2. I tried to install ridge from https://cran.r-project.org/src/contrib/ridge_3.2.tar.gz. I installed gcc and GSL with apt, but it still cannot be installed.
    The output is as follows:

install.packages("/home/shufan/software/oncoPredict/ridge_3.2.tar.gz",repos = NULL,type = "source")

  • installing source package ‘ridge’ ...
    ** package ‘ridge’ successfully unpacked and MD5 sums checked
    ** using staged installation
    /home/xxx/anaconda3/envs/drug/lib/R/bin/config: 1: eval: make: not found
    /home/xxx/anaconda3/envs/drug/lib/R/bin/config: 1: eval: make: not found
    /home/xxx/anaconda3/envs/drug/lib/R/bin/config: 1: eval: make: not found
    /home/xxx/anaconda3/envs/drug/lib/R/bin/config: 1: eval: make: not found
    /home/xxx/anaconda3/envs/drug/lib/R/bin/config: 1: eval: make: not found

checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether the compiler supports GNU C... yes
checking whether gcc accepts -g... yes
checking for gcc option to enable C11 features... none needed
checking for gsl-config... /usr/bin/gsl-config
checking for stdio.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for strings.h... yes
checking for sys/stat.h... yes
checking for sys/types.h... yes
checking for unistd.h... yes
checking for gsl/gsl_version.h... yes
configure: creating ./config.status
config.status: creating R/linearRidgeGenotypes.R
config.status: creating R/linearRidgeGenotypesPredict.R
config.status: creating R/logisticRidgeGenotypes.R
config.status: creating R/logisticRidgeGenotypesPredict.R
config.status: creating src/Makevars
config.status: creating src/config.h
config.status: src/config.h is unchanged
** libs
sh: 1: make: not found
Warning in system(cmd) : error in running command
ERROR: compilation failed for package ‘ridge’
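The log above reports "make: not found" several times, so the compiler and GSL were detected but the make utility was not on the PATH seen by R. A small base-R check along these lines can confirm which build tools R can see before retrying the source install. The list of tools is an assumption taken from the log, not from the package documentation.

```r
# Tools that appear in the configure/build log above (assumed list).
tools_needed <- c("make", "gcc", "gsl-config")

# Sys.which() returns the full path of each tool, or "" if it is not found.
found <- Sys.which(tools_needed)
missing_tools <- names(found)[found == ""]

if (length(missing_tools) > 0) {
  message("Not found on PATH: ", paste(missing_tools, collapse = ", "))
} else {
  message("All listed build tools are visible to R.")
}
```

On Debian or Ubuntu, make is usually provided by the make or build-essential package. Inside a conda environment the PATH may differ from the system shell, so the check should be run in the same R session that performs the install.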

predict.linearRidge does not work if formula goes out of scope

Test case:

library(testthat)

tol <- 0.0001

test_that("test linearRidge with formula variable that goes out of scope", {
  the_formula <- formula('mpg ~ wt + cyl')
  model1 <- lm(the_formula, data = mtcars)
  model2 <- linearRidge(the_formula, data = mtcars, lambda = 0)
  # suppose the_formula goes away:
  rm(the_formula)
  
  expect_equal(coef(model1), coef(model2), tolerance = tol, label = "coefficients")
  expect_equal(predict(model1), predict(model2), tolerance = tol, label = "predictions")
})

This is separate from my changes, and also fails on 2.4.

predict.linearRidge does not work with factors

Here is an example:

library(ridge)

foo <- mtcars
foo$cyl_factor <- as.factor(paste0("cyl", foo$cyl))
model1 <- lm(mpg ~ wt + cyl_factor, data = foo)
model2 <- linearRidge(mpg ~ wt + cyl_factor, data = foo, lambda = 0.0)

predict(model1)
# has output..
predict(model2)
Error in as.matrix(mm) %*% beta : 
  requires numeric/complex matrix/vector arguments

This happens because cyl_factor expands into multiple coefficients in beta, while mm has only one column for that factor. In predict.linearRidge, right before res <- drop(as.matrix(mm) %*% beta):

Browse[2]> names(mm)
[1] "1"          "wt"         "cyl_factor"
Browse[2]> names(beta)
[1] "(Intercept)"    "wt"             "cyl_factorcyl6" "cyl_factorcyl8"
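The mismatch between one factor column and several coefficients can be seen in base R by comparing model.frame with model.matrix. The sketch below uses lm only; how predict.linearRidge builds mm internally is not shown here, so treating model.matrix as the remedy is an assumption.

```r
foo <- mtcars
foo$cyl_factor <- as.factor(paste0("cyl", foo$cyl))
form <- mpg ~ wt + cyl_factor

# model.frame keeps the factor as a single column ...
mf <- model.frame(form, data = foo)
names(mf)

# ... while model.matrix expands it into one dummy column per non-reference level.
mm <- model.matrix(form, data = foo)
colnames(mm)

# With the expanded matrix, the columns line up with the coefficients,
# and the matrix product reproduces predict().
model1 <- lm(form, data = foo)
manual <- drop(mm %*% coef(model1))
all.equal(unname(manual), unname(predict(model1)))
```

The column names of mm match the names(beta) output shown above, which is why multiplying by the expanded matrix works where the single factor column does not.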

misplaced print statement in linearRidge.R

Since the last commit (548f48d), on line 47 in linearRidge.R there is a misplaced print(m). This is probably a leftover from a debug session that causes unwanted clutter in the output when using ridge.

Panel Data Ridge Regression?

Hello Steffen, How are you doing?

Thanks for maintaining this package.

I'm thinking of using your package for a set of panel data: 30 years, 38 c_id.

Can your package be used for such a setting?
Thank you very much

Fail to install ridge in R4.0.2 in MacOS

Hi, I keep getting an error when installing ridge in R 4.0.2, the latest version, on macOS (it works fine on Windows).
Below is the error information:
[screenshot: ff1987cbfb73477aeab700db091df28]
Any solutions?
Many thanks in advance!

Issues installing ridge

Hello,

I am having significant issues installing the ridge package (see the error below). I have used it before, so I am not sure what is going on. I have installed GSL. For reference, I am working on macOS (Catalina, 10.15.4).

Thank you,
Cristina

install.packages("ridge")
Installing package into ‘/Users/cristinaroman/Library/R/4.0/library’
(as ‘lib’ is unspecified)
Package which is only available in source form, and may need compilation of C/C++/Fortran: ‘ridge’
Do you want to attempt to install these from sources? (Yes/no/cancel) yes
installing the source package ‘ridge’

trying URL 'https://cloud.r-project.org/src/contrib/ridge_2.5.tar.gz'
Content type 'application/x-gzip' length 124719 bytes (121 KB)

downloaded 121 KB

  • installing source package ‘ridge’ ...
    ** package ‘ridge’ successfully unpacked and MD5 sums checked
    ** using staged installation
    checking for gsl-config... no
    configure: WARNING: gsl-config not found, is GSL installed?
    configure: WARNING: ridge will be installed but some functions will be unavailable
    configure: creating ./config.status
    config.status: creating R/linearRidgeGenotypes.R
    config.status: creating R/linearRidgeGenotypesPredict.R
    config.status: creating R/logisticRidgeGenotypes.R
    config.status: creating R/logisticRidgeGenotypesPredict.R
    config.status: creating src/config.h
    ** libs
    xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
    ERROR: compilation failed for package ‘ridge’
  • removing ‘/Users/cristinaroman/Library/R/4.0/library/ridge’

The downloaded source packages are in
‘/private/var/folders/p4/1n_2sb096sg87thjdrz9whwm0000gn/T/RtmpTeLO5O/downloaded_packages’
Warning message:
In install.packages("ridge") :
installation of package ‘ridge’ had non-zero exit status
