Comments (4)
Hi, creating a new environment did work. I tested both numpy 1.26 and 2.0.1 on my Windows PC with no issues. Something else in my other environment must be conflicting. No worries, I will just create a fork, make the changes I need, and hopefully have more time later to pinpoint the issue. Thanks for your help :)
from arfs.
Hello @jmrichardson, could you print out the version of numpy and arfs you are using?
import numpy as np
import arfs
print(f"numpy {np.__version__} and ARFS {arfs.__version__}")
As the error says, the array is read-only. It might be due to how you instantiate X and y. A simple solution is to copy your array or change the numpy flag. Everything should be fine if you use a pandas DataFrame.
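To illustrate the two workarounds mentioned above, here is a minimal sketch using plain numpy. The read-only array is simulated with `setflags(write=False)`; that is an assumption for the demo, not necessarily how the array in this issue became read-only.

```python
import numpy as np

# Simulate a read-only array (assumption: stands in for the real data).
a = np.arange(5)
a.setflags(write=False)

# In-place shuffling of a read-only array is expected to fail.
try:
    np.random.shuffle(a)
except ValueError as exc:
    print(f"shuffle failed: {exc}")

# Workaround 1: shuffle a copy; a fresh copy owns its data and is writeable.
b = a.copy()
np.random.shuffle(b)

# Workaround 2: flip the flag back. This works here because `a` owns its
# memory; numpy refuses to do so for a view onto a read-only base.
a.setflags(write=True)
np.random.shuffle(a)
```

Copying is the safer of the two, since it leaves the original data untouched.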
Are you able to run the time series tutorial?
It runs fine with numpy 1.26.4, numpy 2.0.1 and ARFS 2.3.0.
I prefer not to change shuffle to permutation, as permutation creates a copy of the numpy array. The issue can be solved upstream, in how X, y and w are instantiated.
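The difference between the two calls can be sketched as follows. This is only a small numpy illustration of the copy behaviour, with a simulated read-only array; it is not taken from the ARFS code base.

```python
import numpy as np

# Simulated read-only input (assumption for the demo).
a = np.arange(5)
a.setflags(write=False)

# permutation returns a new, shuffled array and never writes to its input,
# so it also accepts read-only arrays, at the cost of an extra copy.
p = np.random.permutation(a)

print("input unchanged:", np.array_equal(a, np.arange(5)))
print("shares memory with input:", np.shares_memory(a, p))
print("result writeable:", p.flags.writeable)
```

shuffle, by contrast, works in place and allocates nothing, which is why it needs a writeable array.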
Let me know if that works, thanks for reaching out
Hi @ThomasBury ,
Thank you for the fast reply!
import numpy as np
import arfs
print(f"numpy {np.__version__} and ARFS {arfs.__version__}")
numpy 1.26.4 and ARFS 2.3.0
It fails on the tutorial. I just pasted the tutorial code below into my Python terminal and got the same error:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_openml
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import cross_validate
from sklearn.model_selection import TimeSeriesSplit
from arfs.benchmark import highlight_tick
from arfs.feature_selection.allrelevant import GrootCV
bike_sharing = fetch_openml("Bike_Sharing_Demand", version=2, as_frame=True)
df = bike_sharing.frame
y = df["count"] #/ df["count"].max()
X = df.drop("count", axis="columns")
X["weather"] = (
    X["weather"]
    .astype(object)
    .replace(to_replace="heavy_rain", value="rain")
    .astype("category")
)
ts_cv = TimeSeriesSplit(
    n_splits=5,
    gap=48,
    max_train_size=10000,
    test_size=1000,
)
feat_selector = GrootCV(
    objective="poisson",
    cutoff=1,
    n_folds=5,
    folds=ts_cv,
    n_iter=5,
    silent=True,
    fastshap=False,
    n_jobs=0,
)
feat_selector.fit(X, y, sample_weight=None)
Python 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:27:34) [MSC v.1937 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.20.0 -- An enhanced Interactive Python. Type '?' for help.
PyDev console: using IPython 8.20.0
Cross Validation: 0%| | 0/5 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "D:\Anaconda3\envs\mld\lib\site-packages\IPython\core\interactiveshell.py", line 3553, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-1-9ec4508f8aff>", line 39, in <module>
    feat_selector.fit(X, y, sample_weight=None)
  File "D:\Anaconda3\envs\mld\lib\site-packages\arfs\feature_selection\allrelevant.py", line 2077, in fit
    self.selected_features_, self.cv_df, self.sha_cutoff = _reduce_vars_lgb_cv(
  File "D:\Anaconda3\envs\mld\lib\site-packages\arfs\feature_selection\allrelevant.py", line 2307, in _reduce_vars_lgb_cv
    new_x_tr, shadow_names = _create_shadow(X_train)
  File "D:\Anaconda3\envs\mld\lib\site-packages\arfs\feature_selection\allrelevant.py", line 1696, in _create_shadow
    np.random.shuffle(X_shadow[c].values)
  File "numpy\\random\\mtrand.pyx", line 4594, in numpy.random.mtrand.RandomState.shuffle
ValueError: array is read-only
My X and y are a pandas DataFrame and Series, respectively. I've added a .copy() to both X and y and got the same error:
feat_selector.fit(X.copy(), y.copy(), sample_weight=None)
I'm not sure what differs between our environments that could cause the issue.
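One way to narrow this down might be to inspect the flags of the array that the failing line hands to shuffle. The helper below is a hypothetical diagnostic (the name `writeability_report` is made up for this sketch); for a DataFrame you would pass it `X[c].values` for a column `c`, as in the traceback.

```python
import numpy as np


def writeability_report(arr):
    """Summarise the flags that decide whether in-place shuffling can work."""
    arr = np.asarray(arr)
    return {
        "writeable": bool(arr.flags.writeable),  # False triggers the error
        "owndata": bool(arr.flags.owndata),      # False means it is a view
        "has_base": arr.base is not None,        # the object it is a view of
    }


# Toy inputs standing in for real columns (assumptions for the demo).
owned = np.arange(4.0)
locked = np.arange(4.0)
locked.setflags(write=False)
view_of_locked = locked[:2]

for name, arr in [("owned", owned), ("locked", locked), ("view", view_of_locked)]:
    print(name, writeability_report(arr))
```

If the report shows `writeable` as False for a column in one environment but not the other, the difference lies in how the underlying array is exposed rather than in the data itself.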
Alright, we can try two things:
- run in a fresh env:
conda create -n arfs python jupyter ipykernel
conda activate arfs
pip install arfs -U
Then run the tutorial using this Python kernel.
- If it still fails, try changing the numpy flag (see the link in my previous message).
If neither works, I'll need to investigate further. I just tested on two different laptops with fresh envs and it works fine (Linux and Windows, numpy 1.26 and 2.0.1).
🤞
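To compare the working and failing environments, a small script along these lines could be run in both and the output diffed. It is a sketch using only the standard library; the package list is an assumption about what is likely to matter here, so adjust it as needed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


# Assumed list of packages involved in the tutorial; edit freely.
packages = ["numpy", "pandas", "scikit-learn", "lightgbm", "arfs"]
for name, ver in installed_versions(packages).items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```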