Comments (2)
Hi,
What are the dimensions of X and y? Are you sure they're both numpy arrays?
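A quick way to check both, as a minimal sketch. The helper name `describe_inputs` and the stand-in arrays are made up for illustration and are not part of boruta_py:

```python
import numpy as np

def describe_inputs(X, y):
    """Report whether X and y are numpy arrays and what shapes they have."""
    return {
        "X_is_ndarray": isinstance(X, np.ndarray),
        "y_is_ndarray": isinstance(y, np.ndarray),
        "X_shape": np.shape(X),
        "y_shape": np.shape(y),
    }

# Stand-in data; replace with your own X and y.
X = np.random.random((100, 5))
y = np.random.randint(0, 2, size=100)

info = describe_inputs(X, y)
print(info)
```

X should be 2-D (samples by features) and y should be 1-D with one entry per row of X.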
from boruta_py.
Hi Daniel,
I have the same problem as @robinbing. Here is my test code
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from boruta_py import BorutaPy
# load X and y
# NOTE BorutaPy accepts numpy arrays only, hence the .values attribute
#X = pd.read_csv('my_X_table.csv', index_col=0).values
#y = pd.read_csv('my_y_vector.csv', index_col=0).values
X = 10*np.random.random((1000,210))
y = np.zeros(1000, dtype=int)
y[np.random.random(1000) >= 0.5] = 1
# define random forest classifier, with utilising all cores and
# sampling in proportion to y labels
rf = RandomForestClassifier(n_jobs=-1, class_weight='auto', max_depth=5)
# define Boruta feature selection method
feat_selector = BorutaPy(rf, n_estimators='auto', verbose=2, max_iter=1000)
# find all relevant features
feat_selector.fit(X, y)
# check selected features
feat_selector.support_
# check ranking of features
feat_selector.ranking_
# call transform() on X to filter it down to selected features
X_filtered = feat_selector.transform(X)
It's basically the same as your example code, but with randomly generated data. Here is the error:
Traceback (most recent call last):
  File "boruta_example.py", line 23, in <module>
    feat_selector.fit(X, y)
  File "/home/martin/Repositories/svm/lib/boruta_py.py", line 191, in fit
    return self._fit(X, y)
  File "/home/martin/Repositories/svm/lib/boruta_py.py", line 325, in _fit
    iter_ranks = self._nanrankdata(imp_history_rejected, axis=1)
  File "/home/martin/Repositories/svm/lib/boruta_py.py", line 493, in _nanrankdata
    ranks = sp.stats.mstats.rankdata(np.ma.masked_invalid(X), axis=axis)
  File "/home/martin/miniconda2/envs/python3/lib/python3.5/site-packages/scipy/stats/mstats_basic.py", line 260, in rankdata
    return ma.apply_along_axis(_rank1d,axis,data,use_missing).view(ndarray)
  File "/home/martin/miniconda2/envs/python3/lib/python3.5/site-packages/numpy/ma/extras.py", line 394, in apply_along_axis
    res = func1d(arr[tuple(i.tolist())], *args, **kwargs)
  File "/home/martin/miniconda2/envs/python3/lib/python3.5/site-packages/scipy/stats/mstats_basic.py", line 248, in _rank1d
    for r in repeats[0]:
TypeError: iteration over a 0-d array
It seems to be an error in SciPy's rankdata function.
Note: I tested this on Anaconda's Python 2 and Python 3.
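To see whether the failure comes from SciPy alone, the call from the last boruta_py frame in the traceback can be tried in isolation. This is only a minimal sketch: the small array with NaNs is made up for illustration and is not the imp_history_rejected matrix from the failing run, so it may or may not trigger the same TypeError on a given SciPy version.

```python
import numpy as np
from scipy.stats import mstats

# Made-up importance history: rows are iterations, columns are features,
# NaN marks entries that should be ignored when ranking.
data = np.array([
    [0.3, np.nan, 0.1],
    [0.2, 0.5, np.nan],
])

# Same pattern as the _nanrankdata frame in the traceback:
# mask the invalid entries, then rank along each row.
masked = np.ma.masked_invalid(data)
ranks = mstats.rankdata(masked, axis=1)

print(ranks)
```

If this runs cleanly on your installation but the full script still fails, the next thing to look at is the shape and contents of the array that boruta_py passes in.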