Comments (4)
Thanks for letting me know!
This should be fixed now with the new version 0.1.20 of KTBoost.
Wow, that's quick! I have explored this library a bit and found that this approach gives better performance than well-known counterparts such as XGBoost.
That said, I suspect I could gain more performance by working on the loss function. My target variable follows a zero-inflated log-normal / zero-inflated gamma distribution, so using the Tweedie deviance as the loss function should be more appropriate. I tried the Tobit loss, but zeros make up a very high fraction of my data (>0.8), so its censored-normal assumption does not work well.
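For concreteness, the loss I have in mind is the Tweedie deviance with a log link and power parameter 1 < p < 2, which places a point mass at zero and is therefore a natural fit for a zero-inflated continuous target. A minimal sketch of the loss and its pseudo-residuals (my own function names, not anything from KTBoost):

```python
import numpy as np

def tweedie_deviance(y, raw_pred, p=1.5):
    """Mean Tweedie deviance for 1 < p < 2 under a log link (mu = exp(F))."""
    mu = np.exp(raw_pred)
    dev = 2 * (np.power(y, 2 - p) / ((1 - p) * (2 - p))
               - y * np.power(mu, 1 - p) / (1 - p)
               + np.power(mu, 2 - p) / (2 - p))
    return np.mean(dev)

def tweedie_negative_gradient(y, raw_pred, p=1.5):
    """Pseudo-residuals: -d(deviance/2)/dF = y * mu**(1-p) - mu**(2-p)."""
    mu = np.exp(raw_pred)
    return y * np.power(mu, 1 - p) - np.power(mu, 2 - p)
```

The first term of the deviance depends only on y, so it drops out of the gradient; it only matters when comparing deviance values across models.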
I am looking into how to define a custom loss, but it seems that, apart from defining the loss itself, I would also need to work with _update_terminal_region, which I am not very familiar with.
Would you have time to look into this? Or is there any documentation that would be useful here?
Thanks!
Thanks for your feedback. I agree with your assessment that other discrete-continuous losses, besides the Tobit loss, can lead to better performance.
Yes, you are right about what needs to be done code-wise to implement other losses. Unfortunately, I don't have any documentation for this, but the structure is the same as in scikit-learn; a rough skeleton is below.
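Roughly, a custom regression loss has to fill in the following methods (the names follow scikit-learn's gradient-boosting loss classes; exact signatures may differ slightly between versions, so treat this as a sketch rather than KTBoost's verified API):

```python
class CustomRegressionLoss:
    """Skeleton of a scikit-learn-style gradient-boosting loss."""

    def __call__(self, y, raw_predictions, sample_weight=None):
        # Loss value on the training data, used for monitoring.
        raise NotImplementedError

    def negative_gradient(self, y, raw_predictions, **kwargs):
        # Pseudo-residuals that the next regression tree is fitted to.
        raise NotImplementedError

    def _update_terminal_region(self, tree, terminal_regions, leaf, X, y,
                                residual, raw_predictions, sample_weight):
        # Per-leaf line search: overwrite tree.value[leaf] with the step
        # that approximately minimizes the loss over this leaf's samples,
        # typically a single Newton-Raphson step.
        raise NotImplementedError
```

In scikit-learn's version there is also an init_estimator() that supplies the initial constant prediction; under a log link, the log of the (weighted) mean of y is the natural choice.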
For what type of application do you plan to use this? We might collaborate on it; if that is an option, send me an e-mail.
The application is modeling claims for a Kaggle-like competition.
After spending some time reading the code, I found that _update_terminal_region mostly performs just a single Newton-Raphson step. I think the implementation of the Tweedie loss should be similar to the Poisson loss; a sketch of the leaf update I have in mind is below. I will give it a try and open a pull request later.
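By analogy with the Poisson case, the Newton-Raphson leaf update for the Tweedie loss under a log link sets each leaf value to the ratio of summed gradients to summed Hessians. A sketch, assuming the scikit-learn-style _update_terminal_region signature and a hypothetical self.power attribute (not verified against KTBoost's internals):

```python
import numpy as np

def _update_terminal_region(self, tree, terminal_regions, leaf, X, y,
                            residual, raw_predictions, sample_weight):
    """One Newton-Raphson step for the Tweedie loss with log link.

    With mu = exp(F), per sample:
      gradient: -l'(F) = y * mu**(1-p) - mu**(2-p)
      Hessian:   l''(F) = (p-1) * y * mu**(1-p) + (2-p) * mu**(2-p)
    """
    p = self.power  # hypothetical attribute holding the Tweedie power
    region = np.where(terminal_regions == leaf)[0]
    y_r = y.take(region, axis=0)
    w = sample_weight.take(region, axis=0)
    mu = np.exp(raw_predictions.take(region, axis=0).ravel())

    numerator = np.sum(w * (y_r * mu ** (1 - p) - mu ** (2 - p)))
    denominator = np.sum(w * ((p - 1) * y_r * mu ** (1 - p)
                              + (2 - p) * mu ** (2 - p)))

    # Guard against a vanishing Hessian sum before taking the step.
    tree.value[leaf, 0, 0] = (0.0 if abs(denominator) < 1e-150
                              else numerator / denominator)
```

As a sanity check, at p = 1 this reduces to sum(y - mu) / sum(mu), the single Newton step one would use for the Poisson loss.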
Thanks!