hoxo-m / densratio_py

126.0 7.0 30.0 437 KB

A Python Package for Density Ratio Estimation

Home Page: https://github.com/hoxo-m/densratio_py

License: Other

Makefile 0.68% Python 99.32%

machine-learning machine-learning-algorithms machine-learning-library anomalydetection density-ratio-estimation

densratio_py's Issues

Choosing kernel centers from test data

Hello. I'm trying to understand density-ratio estimation, including RuLSIF, in order to implement transition detection on smart home data. Thank you for making such a useful module.

Following the reference given in RuLSIF.py, I read the paper 'A Least-squares Approach to Direct Importance Estimation' and its section on LOOCV to understand how sigma and lambda are determined.

The paper says that the kernel centers are chosen randomly from the test data "without replacement".
However, line 48 of RuLSIF.py reads:
centers = x[randint(nx, size=kernel_num)]
This samples indices with replacement, so the selected centers can contain duplicate data points.

So I think the code should be changed to:
from numpy.random import choice
centers = x[choice(nx, kernel_num, replace=False)]

Please check whether this is right and share any comments!
Thank you.
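
The difference between the two sampling calls can be checked in isolation. The following is a minimal sketch, not the package's code: the data, the seed, and the `min(...)` guard are assumptions added for illustration (the guard is there because `choice(..., replace=False)` raises an error when more centers are requested than there are data points).

```python
import numpy as np

rng = np.random.RandomState(0)          # seeded only to make the sketch repeatable
nx = 200
x = rng.normal(size=(nx, 1))            # illustrative data, not from the issue

# Assumed guard: never ask for more distinct centers than data points.
kernel_num = min(100, nx)

# With replacement: the same index may be drawn more than once.
idx_with = rng.randint(nx, size=kernel_num)

# Without replacement: every index is drawn at most once.
idx_without = rng.choice(nx, kernel_num, replace=False)
centers = x[idx_without]

print("distinct indices, with replacement:   ", len(np.unique(idx_with)))
print("distinct indices, without replacement:", len(np.unique(idx_without)))
```

Duplicate centers produce identical kernel basis functions, so sampling without replacement keeps all `kernel_num` basis functions distinct when the data points themselves are distinct.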

density ratio estimation of high dimension data

Hi. Thanks for sharing such a wonderful package. I am wondering whether it is possible to estimate the density ratio between two distributions of very high dimension, for example 50 dimensions?

tests.test_RuLSIF::test_alphadensratio_2d fails

As of 5757f36, when the compute_kernel_Gaussian function is executed with numba (with either cpu or parallel passed as the target argument of set_compute_kernel_target), the tests.test_RuLSIF::test_alphadensratio_2d test fails with the following error:

ValueError: _compute_kernel_Gaussian: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (m, p),(p),()->(m) (size 2 is different from 1)

The failure is due to density_ratio being a 2d multivariate estimator, but a 1d linear space is passed as an argument to the compute_density_ratio method. Unlike numpy, numba places a strict constraint on operand dimensions with the guvectorize signature argument; numpy simply broadcasts.
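
The shape mismatch described above can be reproduced without numba. The sketch below is an assumption-laden illustration, not the package's kernel: `gaussian_kernel_row` is a hypothetical NumPy function that enforces the `(m, p),(p),()->(m)` contract by hand, and the grid construction shows one way to pass 2-D points instead of a 1-D linear space.

```python
import numpy as np

def gaussian_kernel_row(x_list, center, sigma):
    """Gaussian kernel between each row of x_list (m, p) and one center (p,).

    Hypothetical helper: it checks shapes explicitly to mimic the strict
    gufunc signature (m, p),(p),()->(m) instead of relying on broadcasting.
    """
    x_list = np.asarray(x_list, dtype=float)
    center = np.asarray(center, dtype=float)
    if x_list.ndim != 2 or center.ndim != 1 or x_list.shape[1] != center.shape[0]:
        raise ValueError(
            f"expected shapes (m, p) and (p,), got {x_list.shape} and {center.shape}"
        )
    sq_dist = np.sum((x_list - center) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

center = np.array([0.0, 0.0])                     # a 2-D kernel center
line = np.linspace(-1.0, 1.0, 5).reshape(-1, 1)   # 1-D linear space, shape (5, 1)

# Plain NumPy broadcasting accepts the mismatch silently: (5, 1) - (2,) -> (5, 2).
broadcast_shape = (line - center).shape

# Building a proper 2-D grid gives points of shape (25, 2), which match p = 2.
gx, gy = np.meshgrid(np.linspace(-1.0, 1.0, 5), np.linspace(-1.0, 1.0, 5))
points = np.column_stack([gx.ravel(), gy.ravel()])
values = gaussian_kernel_row(points, center, sigma=1.0)
```

Passing `line` to the strict helper raises a `ValueError`, which is the analogue of the numba failure; passing `points` works under both the strict and the broadcasting behaviour.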

hi

Hi copula devs. Just stopping by to let you know that you could use this library to continuously set up estimation and submission, and maybe win a prize. It's easy to use GitHub Actions to set and forget. I'd have put this in Discussions if it were enabled.

Fork https://lnkd.in/g6AEMYW
Open up https://lnkd.in/gUm4Ns5 in colab and run it to generate yourself a write key.
Save the key as a Github secret called WRITE_KEY
Click on "accept" when Github asks you if you want to enable GitHub actions. Go to Actions and you'll see the only action used in your repo (similar to https://lnkd.in/gv_Agt6). Enable it.

Or see links from this post.
https://www.linkedin.com/posts/petercotton_micropredictionmicroactors-activity-6746618350325104640-UIW3

what's the x and y?

An excellent package for density ratio estimation, but as a beginner I am confused about how to obtain the x and y variables. For example, given training data and test data, is x a sample from the training data and y a sample from the test data? If so, how do we get the y that is associated with each x?
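
As a hedged illustration of the question above: in density ratio estimation the two inputs are generally two independent sets of samples, one per distribution, so no y is paired with any particular x. The sketch below assumes the convention w(v) = p_x(v) / p_y(v), with x drawn from the numerator distribution and y from the denominator distribution; the Gaussian parameters and sample sizes are made up, and the package itself is not called.

```python
import math
import random

def normal_pdf(v, mean, std):
    """Density of a normal distribution at v."""
    return math.exp(-0.5 * ((v - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

random.seed(0)

# Two independent samples. They need not have the same length,
# and x[i] has no relationship to y[i].
x = [random.gauss(0.0, 0.125) for _ in range(200)]   # sample from the numerator density
y = [random.gauss(0.0, 0.5) for _ in range(300)]     # sample from the denominator density

def true_ratio(v):
    """Ground-truth ratio p_x(v) / p_y(v), which an estimator would approximate."""
    return normal_pdf(v, 0.0, 0.125) / normal_pdf(v, 0.0, 0.5)

print(true_ratio(0.0))   # ratio of the two peak heights
```

An estimator only ever sees the two lists `x` and `y`; the function `true_ratio` is available here solely because the sketch uses known distributions.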

estimate density ratio of large training set and test set

Hi,

Thank you for sharing this python package for density ratio estimation.
In practical applications, the training sets are often very large.
Is it possible to use this tool to estimate the density ratio between a large training set and a test set? For example, a training set of 20 GB.

instable results

Hi there,

I'm using the package to calculate the density ratio for multi-dimensional data. I ran the program many times on the same training and test datasets, but the estimated density ratios are often slightly different. Is there a way to make the result more stable?

Another question is how to set the search ranges for sigma and lambda. These hyperparameters affect the estimates strongly!

Thanks in advance!!
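
One likely source of the run-to-run variation is the random choice of kernel centers, which (per the line quoted in the first issue) uses NumPy's global random generator. Assuming that is the only source of randomness, seeding the global generator before each run should make results repeatable. The sketch below demonstrates only the seeding effect; `pick_centers` is a hypothetical stand-in for the center selection step, not a function of the package.

```python
import numpy as np

def pick_centers(x, kernel_num):
    # Hypothetical stand-in: same style of global-RNG call as the
    # `randint(nx, size=kernel_num)` line quoted from RuLSIF.py.
    return x[np.random.randint(len(x), size=kernel_num)]

x = np.linspace(0.0, 1.0, 500)

np.random.seed(0)                 # fix the global seed before the first run
first = pick_centers(x, 100)

np.random.seed(0)                 # reset to the same seed before the second run
second = pick_centers(x, 100)

print(np.array_equal(first, second))
```

If results still differ after seeding, the variation comes from somewhere other than NumPy's global generator, and this sketch does not cover that case.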

Consider making densratio.density_ratio.DensityRatio picklable.

Depending on data volume and cross-validation parameters, computing the result of the densratio.RuLSIF.RuLSIF function can take significant time. Yet the result might be reusable, and hence valuable. It would be nice if the resulting densratio.density_ratio.DensityRatio object were picklable, in line with the "What can be pickled and unpickled?" section of the Python pickle documentation. Currently a non-top-level function is a member of the object, so trying to dump the pickle fails with the error: Can't pickle local object 'RuLSIF.<locals>.alpha_density_ratio'.
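
The failure and one possible remedy can be shown with the standard library alone. This is a sketch under assumptions: the names `make_estimator_with_closure` and `AlphaDensityRatio` and the toy computation `theta * x` are hypothetical and do not reflect the package's internals. The idea is to replace a function defined inside another function with a callable class defined at module top level, which pickle can locate by name.

```python
import pickle

def make_estimator_with_closure(theta):
    """Returns a function defined inside another function (not picklable)."""
    def alpha_density_ratio(x):
        return theta * x
    return alpha_density_ratio

class AlphaDensityRatio:
    """Hypothetical top-level callable that stores its state as attributes."""
    def __init__(self, theta):
        self.theta = theta

    def __call__(self, x):
        return self.theta * x

# Pickling the local function fails; the exception type varies by Python version.
try:
    pickle.dumps(make_estimator_with_closure(2.0))
    closure_pickled = True
except (pickle.PicklingError, AttributeError):
    closure_pickled = False

# Pickling an instance of the top-level class round-trips.
restored = pickle.loads(pickle.dumps(AlphaDensityRatio(2.0)))
print(closure_pickled, restored(3.0))
```

A similar effect could be obtained with `functools.partial` applied to a top-level function; the class form is used here only because it keeps the stored parameters visible as attributes.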
