Comments (8)
At the moment boruta tries to set the random state on every estimator. cuML's RF classifier does not have this parameter.
You can try a fix like the one for lightgbm. Something like this before the else part could help you.
if isinstance(self.estimator, cuml_type_here): pass
Lines 340 to 344 in f2f1e3c
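As a variation on the isinstance check above, here is a minimal sketch that only sets the seed when the estimator reports a random_state parameter. It assumes the estimator follows the scikit-learn get_params/set_params convention. WithSeed, WithoutSeed and set_random_state_if_supported are hypothetical names made up for illustration; they are not part of BorutaPy or cuML.

```python
class WithSeed:
    """Hypothetical stand-in for an estimator that accepts random_state."""

    def __init__(self, random_state=None):
        self.random_state = random_state

    def get_params(self, deep=True):
        return {"random_state": self.random_state}

    def set_params(self, **params):
        for key, value in params.items():
            setattr(self, key, value)
        return self


class WithoutSeed:
    """Hypothetical stand-in for an estimator without random_state."""

    def __init__(self, n_estimators=100):
        self.n_estimators = n_estimators

    def get_params(self, deep=True):
        return {"n_estimators": self.n_estimators}

    def set_params(self, **params):
        # Reject unknown parameters, as scikit-learn style estimators do.
        valid = self.get_params()
        for key, value in params.items():
            if key not in valid:
                raise ValueError(f"Invalid parameter {key!r}")
            setattr(self, key, value)
        return self


def set_random_state_if_supported(estimator, seed):
    """Set random_state only when the estimator exposes it.

    Returns True if the seed was applied, False if it was skipped.
    """
    if "random_state" in estimator.get_params():
        estimator.set_params(random_state=seed)
        return True
    return False
```

With this kind of check the skip does not depend on importing the cuML type, but it only works if the estimator's get_params is accurate.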
from boruta_py.
Thanks @Wuuzzaa.
I made the adjustment you recommended but now I'm receiving this error: "ValueError: Only methods with feature_importance_ attribute are currently supported in BorutaPy."
Any recommendations on this issue?
Seems like the implementation of cuML's random forest differs quite a lot from sklearn's. I just took a look at the docs and did not find anything similar to the feature importance.
cuML Random Forest
Some kind of feature importance is necessary for boruta to determine which features are useful. I think there is no easy way to work around this issue.
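To make the requirement concrete, here is a rough sketch of the kind of check that produces the error quoted above: after fitting, the estimator has to expose a feature_importances_ attribute (the scikit-learn name). This is not BorutaPy's actual code. get_importances, TreeLike and NoImportances are hypothetical names used only for illustration.

```python
class TreeLike:
    """Hypothetical fitted estimator that exposes importances."""

    feature_importances_ = [0.7, 0.3]


class NoImportances:
    """Hypothetical fitted estimator without any importance attribute."""


def get_importances(fitted_estimator):
    """Return per-feature importances or raise a ValueError if unavailable."""
    try:
        return list(fitted_estimator.feature_importances_)
    except AttributeError:
        raise ValueError(
            "Estimator has no feature_importances_ attribute, "
            "so features cannot be ranked."
        ) from None
```

You can run a check like this on a fitted estimator before handing it to boruta to see whether it will hit the same error.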
@curtisraymond and @Wuuzzaa Hi ... any solution for this?
I'm going through the same problem. However, I'm getting a different error: "integer required"
Error
TypeError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/boruta/boruta_py.py in _get_imp(self, X, y)
383 try:
--> 384
385 self.estimator.fit(X, y)
randomforestclassifier.pyx in cuml.ensemble.randomforestclassifier.RandomForestClassifier.fit()
TypeError: an integer is required
ValueError: Please check your X and y variable. The provided estimator cannot be fitted to your data.
an integer is required
My blind guess would be an error in your y data. y must be integers. Did you check your X and y for compatible data types?
For the types see: docu
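A quick way to do that check, sketched with NumPy. It assumes the labels should end up as int32, which is an assumption here; please confirm the expected dtypes in the docs linked above. labels_are_integer and as_int32_labels are hypothetical helper names.

```python
import numpy as np


def labels_are_integer(y):
    """Return True if y has an integer dtype."""
    return bool(np.issubdtype(np.asarray(y).dtype, np.integer))


def as_int32_labels(y):
    """Cast labels to int32, refusing values that are not whole numbers."""
    y = np.asarray(y)
    if np.issubdtype(y.dtype, np.integer):
        return y.astype(np.int32)
    # Floats are accepted only if every value is a whole number (NaN fails).
    if np.issubdtype(y.dtype, np.floating) and np.all(y == np.round(y)):
        return y.astype(np.int32)
    raise TypeError(f"labels with dtype {y.dtype} cannot be cast to int32 safely")
```

For example, print X.dtype and y.dtype first, then pass as_int32_labels(y) to fit if the labels turn out to be floats holding whole numbers.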
Hi @Wuuzzaa ..
Thank you for the quick reply.
y are integers. It works fine when I use sklearn's RF classifier. But I get this error when I use cuML's RF classifier.
My guess is that there might be an incompatibility between cuML and BorutaPy.
BorutaPy was never planned to be used with cuML. Seems like it still does not work. As beckernick mentioned, there is still an open issue on cuML for the implementation of feature importance, which boruta needs in order to work.
Thanks for linking that issue @Wuuzzaa !
@lindeberg25 , we'd love to learn more about your use case and performance impact of using cuML's Random Forest vs. scikit-learn's RF. Let's continue the discussion on the linked issue.