
krls's People

Contributors

chadhazlett, lukesonnet, nfultz

krls's Issues

Farm out as much as possible to Rcpp

Things that should be checked for and moved to Rcpp:

  • Any trace functions should be moved from sum(diag(x)) to trace_mat (~5x faster)
  • Any multiplications of a matrix with a diagonal matrix should be moved to mult_diag
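The two helpers named above could look roughly like the following. This is a plain-C++ sketch (not the package's Rcpp/Armadillo code); the names trace_mat and mult_diag come from the issue, but the signatures and row-major storage are assumptions:

```cpp
#include <cstddef>
#include <vector>

// Sketch: trace of an n x n matrix stored row-major.
// Reads only the diagonal, unlike sum(diag(x)) in R, which first
// materializes the diagonal vector.
double trace_mat(const std::vector<double>& x, std::size_t n) {
    double tr = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        tr += x[i * n + i];
    return tr;
}

// Sketch: right-multiply an n x n row-major matrix by diag(d).
// Scales column j by d[j] in place of forming the full diagonal
// matrix and doing a dense matrix product.
std::vector<double> mult_diag(const std::vector<double>& x,
                              const std::vector<double>& d,
                              std::size_t n) {
    std::vector<double> out(x);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            out[i * n + j] *= d[j];
    return out;
}
```

Both loops are O(n) and O(n^2) respectively, versus the O(n^2) diagonal extraction and O(n^3) dense product they replace, which is where the speedup comes from.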

Eigenvalue of 0 in some instances

There are some cases where R returns an eigenvalue of exactly 0 from the generateK() function, which causes all sorts of problems downstream. This seems to happen only without truncation, and thus comes from R's default eigen() function. Should I prevent this by adding the smallest positive double R can represent to 0 eigenvalues? I run into it very rarely, and only when lambda is very large and there is no truncation.
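The proposed guard could be sketched like this in plain C++ (the function name clamp_eigvals is hypothetical, and flooring at the smallest positive normalized double is the assumption described above):

```cpp
#include <limits>
#include <vector>

// Sketch: floor exact-zero (or tiny negative round-off) eigenvalues
// at the smallest positive normalized double, so that downstream
// divisions and logs of the eigenvalues stay finite.
std::vector<double> clamp_eigvals(std::vector<double> eig) {
    const double floor_val = std::numeric_limits<double>::min();  // ~2.2e-308
    for (double& e : eig)
        if (e < floor_val) e = floor_val;
    return eig;
}
```

In R the equivalent floor value is .Machine$double.xmin.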

Add warning for solution near lambdainterval endpoints

Great. I wonder if it's worth throwing an error (or at least a warning) if the optimization hits the edge, e.g.:
f <- function(x) (x - .5)^2

tryoptimize <- function(f, interval) {
  xmin <- optimize(f, interval = interval)
  # compare the located minimum against both endpoints of the interval
  tooclose <- abs(xmin$minimum - interval) < .Machine$double.eps^.25
  if (tooclose[1]) stop("Too close to lower bound of lambda interval; please decrease lower bound.")
  if (tooclose[2]) stop("Too close to upper bound of lambda interval; please raise upper bound.")
  xmin
}

tryoptimize(f, interval = c(0, 1))  # fine: interior minimum near 0.5
tryoptimize(f, interval = c(1, 2))  # stops: minimum pinned at the lower bound

KRLS will not install without prerequisite RSpectra

Try to install KRLS on a Windows machine without RSpectra installed; this will fail silently.

Install RSpectra then install KRLS and it works.

Presumably the intended behaviour is to resolve dependencies automatically, or to fail loudly and immediately, rather than reaching the end of the build process and failing silently?
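One likely fix, assuming RSpectra is currently missing from the package metadata, is to declare it in the DESCRIPTION file's Imports field so that R's installer pulls it in automatically (the exact field contents below are a sketch, not the package's actual DESCRIPTION):

```
Imports:
    Rcpp,
    RSpectra
```

With that declared, install.packages() with default settings installs RSpectra before attempting to build KRLS, and R CMD check flags the dependency if it is absent.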

Warnings vs. errors in inference.krls2

Currently, if someone tries to get standard errors for KRLogit without truncation, we throw a warning but still return the pwmfx and the avg pwmfx. So if you do:
summary(krls(X=X, y=y, loss='logistic'))
you get a warning and no SEs, but you still get the pwmfx. We could instead error and force them to write:
summary(krls(X=X, y=y, loss='logistic'), vcov = F)
or we could change the default so it doesn't try to get SEs without truncation, but I think that's worse than throwing the warning. So I think we currently have the right behavior, but I wanted to check.

If someone tries to get robust or clustered standard errors for KRLS without truncation, we throw a warning but still return the pwmfx and the avg pwmfx. So if you do:
summary(krls(X=X, y=y), robust = T)
you will get a warning telling you that you have to truncate, but the pwmfx is still returned. The alternative is to error out since this isn't default behavior (the user explicitly specified robust = T), but I think the warning is sufficient. So again I think we have the right behavior, though I'm less sure about this one.

Simulations comparing OLS, logistic regression, KRLS, and KRlogit

Compare the above on MSE of predicted probabilities to latent Y* (which has noise) and NLL on predicted probabilities to observed Y. Also compare on MSE of average marginal effects. Do this for the following settings:

  • Well specified logistic DGP (few + many predictors)
  • logistic DGP with misspecification in the linear component (omitted variables + interactions + higher-order terms)
  • well specified probit DGP
  • probit DGP with misspecification in linear component
  • linear DGP

Restructure CPP code into different files

All of the CPP is in one file. I'm going to split it into different files:

  • KRLS functions (functions, gradients, hessians)
  • KRLogit functions (...)
  • Kernel functions (kern_gauss...)
  • Solvers (solve_for_d_ls)
  • Inference (pwmfx)
  • Helper functions (trace_mat, mult_diag, euc_dist)

KRLS compile broken on Mac.

I am sure Luke will just looooove to get this one, since I've nagged him all day at this point, but KRLS' compile is broken on my Mac due to a missing Fortran dependency of some kind.

After a few steps of clang++ compiling, I get:

ld: warning: directory not found for option '-L/usr/local/lib/gcc/x86_64-apple-darwin13.0.0/4.8.2'
ld: library not found for -lgfortran
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [KRLS2.so] Error 1
ERROR: compilation failed for package 'KRLS2'

Works fine in Windows 10.

Dealing with missingness

Ideally we'd deal with missingness much as lm does -- i.e. rows with NAs are dropped internally via na.omit, but the predictions etc. are put back into the full-length vector with the NAs intermixed.

Compare golden search with optimize for KRLS and KRlogit

First compare golden search with optimize for KRLS. If golden search works considerably better, build it for KRlogit as well. If golden search is no better than optimize, replace golden search with optimize as the default but keep golden search as a legacy option.
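For reference, the golden-section step being benchmarked against optimize looks roughly like this generic sketch (not the package's implementation; the interval and tolerance arguments are assumptions):

```cpp
#include <cmath>
#include <functional>

// Sketch: golden-section search for the minimizer of a unimodal f on [a, b].
// Each iteration shrinks the bracket by the inverse golden ratio, reusing
// one of the two interior evaluation points.
double golden_search(const std::function<double(double)>& f,
                     double a, double b, double tol = 1e-8) {
    const double invphi = (std::sqrt(5.0) - 1.0) / 2.0;  // ~0.618
    double c = b - invphi * (b - a);
    double d = a + invphi * (b - a);
    while (b - a > tol) {
        if (f(c) < f(d)) {            // minimum lies in [a, d]
            b = d; d = c; c = b - invphi * (b - a);
        } else {                       // minimum lies in [c, b]
            a = c; c = d; d = a + invphi * (b - a);
        }
    }
    return (a + b) / 2.0;
}
```

R's optimize() combines golden-section search with successive parabolic interpolation, so on smooth objectives it typically converges in fewer function evaluations than pure golden section, which is the trade-off this comparison would measure.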

Test suite for combination of settings

Build a test suite which tests the different instances of KRLS and KRLogit. These will be fast and simple smoke tests that just make sure each combination runs, not necessarily that the answers are all correct. This should test the interaction of:

Top priority:

  • Loss function: 'leastsquares', 'logistic'
  • Truncation: T/F
  • Lambdasearch: fixed, optimize, grid
  • whichkernel: gaussian, poly2
  • standarderrors: regular, robust, clustered

Lower priority:

  • bsearch

Clean up code

  • Read through and simplify code
  • Format according to Google's R style guide unless it is a big problem for backwards compatibility
  • Update documentation
