Comments (8)
Hello, the problem still remains, unless I have missed the solution.
boruta 0.1.5
Python 3.6 running on Docker
`TypeError Traceback (most recent call last)
<ipython-input-124-be126db958b1> in <module>()
1 # find all relevant features
----> 2 feat_selector.fit(X, y)
/opt/conda/lib/python3.6/site-packages/boruta/boruta_py.py in fit(self, X, y)
199 """
200
--> 201 return self._fit(X, y)
202
203 def transform(self, X, weak=False):
/opt/conda/lib/python3.6/site-packages/boruta/boruta_py.py in _fit(self, X, y)
333 imp_history_rejected = imp_history[1:, not_selected] * -1
334 # calculate ranks in each iteration, then median of ranks across feats
--> 335 iter_ranks = self._nanrankdata(imp_history_rejected, axis=1)
336 rank_medians = np.nanmedian(iter_ranks, axis=0)
337 ranks = self._nanrankdata(rank_medians, axis=0)
/opt/conda/lib/python3.6/site-packages/boruta/boruta_py.py in _nanrankdata(self, X, axis)
500 Replaces bottleneck's nanrankdata with scipy and numpy alternative.
501 """
--> 502 ranks = sp.stats.mstats.rankdata(X, axis=axis)
503 ranks[np.isnan(X)] = np.nan
504 return ranks
/opt/conda/lib/python3.6/site-packages/scipy/stats/mstats_basic.py in rankdata(data, axis, use_missing)
264 return _rank1d(data, use_missing)
265 else:
--> 266 return ma.apply_along_axis(_rank1d,axis,data,use_missing).view(ndarray)
267
268
/opt/conda/lib/python3.6/site-packages/numpy/ma/extras.py in apply_along_axis(func1d, axis, arr, *args, **kwargs)
394 i.put(indlist, ind)
395 j = i.copy()
--> 396 res = func1d(arr[tuple(i.tolist())], *args, **kwargs)
397 # if res is a number, then we have a smaller output array
398 asscalar = np.isscalar(res)
/opt/conda/lib/python3.6/site-packages/scipy/stats/mstats_basic.py in _rank1d(data, use_missing)
252
253 repeats = find_repeats(data.copy())
--> 254 for r in repeats[0]:
255 condition = (data == r).filled(False)
256 rk[condition] = rk[condition].mean()
TypeError: iteration over a 0-d array
`
from boruta_py.
Same issue here:
Traceback (most recent call last):
    boruta_selector.fit(dataX.values, dataY.values.ravel())
  File "/spare/local/fchen/virtualenv/lib/python3.5/site-packages/boruta/boruta_py.py", line 201, in fit
    return self._fit(X, y)
  File "/spare/local/fchen/virtualenv/lib/python3.5/site-packages/boruta/boruta_py.py", line 335, in _fit
    iter_ranks = self._nanrankdata(imp_history_rejected, axis=1)
  File "/spare/local/fchen/virtualenv/lib/python3.5/site-packages/boruta/boruta_py.py", line 502, in _nanrankdata
    ranks = sp.stats.mstats.rankdata(X, axis=axis)
  File "/spare/local/fchen/virtualenv/lib/python3.5/site-packages/scipy/stats/mstats_basic.py", line 265, in rankdata
    return ma.apply_along_axis(_rank1d,axis,data,use_missing).view(ndarray)
  File "/spare/local/fchen/virtualenv/lib/python3.5/site-packages/numpy/ma/extras.py", line 395, in apply_along_axis
    res = func1d(arr[tuple(i.tolist())], *args, **kwargs)
  File "/spare/local/fchen/virtualenv/lib/python3.5/site-packages/scipy/stats/mstats_basic.py", line 253, in _rank1d
    for r in repeats[0]:
TypeError: iteration over a 0-d array
What is the solution to this issue? I am still facing it.
I can confirm that this remains an issue. When this problem was previously reported (#8, #12), the issues appear to have been closed without any real resolution.
(In my case at least) the problem is caused by the inability to handle NaN data, as reported in #8. A (local) solution is to change the `_nanrankdata` method back to `bottleneck.nanrankdata`.
Commit 80a74c1 explicitly removed the dependency on bottleneck, but this isn't really a solution, as the new functionality is broken.
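For anyone who would rather not reinstall bottleneck, here is a minimal sketch of a NaN-aware ranking helper written with NumPy only. It is not the code from the commit or from any PR in this repository: the names `nanrank_1d` and `nanrankdata` are hypothetical, and it assumes that what `_nanrankdata` needs is average (tie-sharing) ranks along an axis with NaN entries left as NaN.

```python
import numpy as np


def nanrank_1d(values):
    """Average ranks (1-based) of a 1-D array; NaN entries stay NaN."""
    values = np.asarray(values, dtype=float)
    ranks = np.full(values.shape, np.nan)
    valid = ~np.isnan(values)
    if not valid.any():
        return ranks  # all-NaN slice: nothing to rank
    # Group equal values; tied values share the mean of the ranks they span.
    _, inverse, counts = np.unique(
        values[valid], return_inverse=True, return_counts=True
    )
    last_rank = np.cumsum(counts)               # highest rank in each tie group
    mean_rank = last_rank - (counts - 1) / 2.0  # mean rank of each tie group
    ranks[valid] = mean_rank[inverse]
    return ranks


def nanrankdata(X, axis=1):
    """Hypothetical stand-in for BorutaPy._nanrankdata (an assumption, not the library's code)."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        return nanrank_1d(X)
    return np.apply_along_axis(nanrank_1d, axis, X)


print(nanrank_1d([3.0, np.nan, 1.0, 3.0]))  # [2.5 nan 1.  2.5]
```

Ranking only the non-NaN entries of each slice means the ranking step never sees missing values, and slices that are entirely NaN are returned unchanged.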
@bittremieux's PR apparently fixed this. Let me know if the issue persists, but for the time being, I'll close this. Thanks again!
Facing the same issue here.
Same issue:
TypeError Traceback (most recent call last)
in ()
1 feat_selector = BorutaPy(rf, n_estimators = 50, verbose = 2, random_state = 1)
2
----> 3 feat_selector.fit(X, y)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\boruta\boruta_py.py in fit(self, X, y)
199 """
200
--> 201 return self._fit(X, y)
202
203 def transform(self, X, weak=False):
~\AppData\Local\Continuum\anaconda3\lib\site-packages\boruta\boruta_py.py in _fit(self, X, y)
333 imp_history_rejected = imp_history[1:, not_selected] * -1
334 # calculate ranks in each iteration, then median of ranks across feats
--> 335 iter_ranks = self._nanrankdata(imp_history_rejected, axis=1)
336 rank_medians = np.nanmedian(iter_ranks, axis=0)
337 ranks = self._nanrankdata(rank_medians, axis=0)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\boruta\boruta_py.py in _nanrankdata(self, X, axis)
500 Replaces bottleneck's nanrankdata with scipy and numpy alternative.
501 """
--> 502 ranks = sp.stats.mstats.rankdata(X, axis=axis)
503 ranks[np.isnan(X)] = np.nan
504 return ranks
~\AppData\Local\Continuum\anaconda3\lib\site-packages\scipy\stats\mstats_basic.py in rankdata(data, axis, use_missing)
263 return _rank1d(data, use_missing)
264 else:
--> 265 return ma.apply_along_axis(_rank1d,axis,data,use_missing).view(ndarray)
266
267
~\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\ma\extras.py in apply_along_axis(func1d, axis, arr, *args, **kwargs)
392 i.put(indlist, ind)
393 j = i.copy()
--> 394 res = func1d(arr[tuple(i.tolist())], *args, **kwargs)
395 # if res is a number, then we have a smaller output array
396 asscalar = np.isscalar(res)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\scipy\stats\mstats_basic.py in _rank1d(data, use_missing)
251
252 repeats = find_repeats(data.copy())
--> 253 for r in repeats[0]:
254 condition = (data == r).filled(False)
255 rk[condition] = rk[condition].mean()
TypeError: iteration over a 0-d array
Same error here. Are there any solutions? It seems to happen across different Boruta versions.