Hi, I am testing GrootCV and got the following error:

numpy.random.mtrand.RandomState.shuffle ValueError: array is read-only about arfs HOT 4 CLOSED

thomasbury commented on September 27, 2024

numpy.random.mtrand.RandomState.shuffle ValueError: array is read-only

from arfs.

Comments (4)

jmrichardson commented on September 27, 2024

Hi, creating a new environment did work. I tested both numpy 1.26 and 2.0.1 on my Windows PC and had no issues. There must be something else in my other environment that conflicts. No worries, I will just create a fork and make the changes I need, and hopefully I will have more time later to pinpoint the issue. Thanks for your help :)

ThomasBury commented on September 27, 2024

Hello @jmrichardson, could you print out the version of numpy and arfs you are using?

import numpy as np
import arfs
print(f"numpy {np.__version__} and ARFS {arfs.__version__}")

As the error says, the array is read-only. It might be due to how you instantiate X and y. A simple solution is to copy your array or to change the numpy writeable flag. Everything should be fine if you use a pandas DataFrame.
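As a minimal sketch of the two workarounds mentioned above (not ARFS code; the array `a` is a hypothetical stand-in for the buffer being shuffled), the following reproduces the error on a read-only array and then avoids it, first by copying and then by resetting the flag:

```python
import numpy as np

# Hypothetical stand-in for the array that gets shuffled in place.
a = np.arange(5)
a.setflags(write=False)  # mark the array read-only

try:
    np.random.shuffle(a)  # an in-place shuffle needs a writeable buffer
    error = None
except ValueError as exc:
    error = str(exc)

# Workaround 1: shuffle a copy; a copy of a read-only array is writeable.
b = a.copy()
np.random.shuffle(b)

# Workaround 2: flip the flag back. This works here because `a` owns its data;
# it can fail for views onto memory that the array does not own.
a.setflags(write=True)
np.random.shuffle(a)
```

Copying is the safer of the two, since resetting the flag is not always permitted.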

Are you able to run the time series tutorial?
It runs fine with numpy 1.26.4, numpy 2.0.1 and ARFS 2.3.0.

I prefer not to change shuffle to permutation, as permutation creates a copy of the numpy array. The read-only issue can be solved upstream instead, when instantiating X, y and w.
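For reference, a small sketch of the difference between the two numpy calls, using a toy array `x` rather than anything from ARFS: `np.random.permutation` returns a new shuffled array and leaves its input alone, while `np.random.shuffle` reorders its argument in place and returns `None`.

```python
import numpy as np

x = np.arange(10)

# permutation: allocates and returns a new shuffled array; x is untouched.
p = np.random.permutation(x)
unchanged = np.array_equal(x, np.arange(10))
separate = not np.shares_memory(p, x)

# shuffle: reorders x itself, so no extra array is allocated.
result = np.random.shuffle(x)
```

The extra allocation is the cost referred to above; the in-place variant avoids it but requires a writeable array.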

Let me know if that works, thanks for reaching out

jmrichardson commented on September 27, 2024

Hi @ThomasBury ,

Thank you for the fast reply!

import numpy as np
import arfs
print(f"numpy {np.__version__} and ARFS {arfs.__version__}")
numpy 1.26.4 and ARFS 2.3.0

It fails on the tutorial. I just pasted the tutorial below in my python terminal and got the same error:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_openml
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import cross_validate
from sklearn.model_selection import TimeSeriesSplit
from arfs.benchmark import highlight_tick
from arfs.feature_selection.allrelevant import GrootCV
bike_sharing = fetch_openml("Bike_Sharing_Demand", version=2, as_frame=True)
df = bike_sharing.frame
y = df["count"] #/ df["count"].max()
X = df.drop("count", axis="columns")
X["weather"] = (
    X["weather"]
    .astype(object)
    .replace(to_replace="heavy_rain", value="rain")
    .astype("category")
)
ts_cv = TimeSeriesSplit(
    n_splits=5,
    gap=48,
    max_train_size=10000,
    test_size=1000,
)
feat_selector = GrootCV(
    objective="poisson",
    cutoff=1,
    n_folds=5,
    folds=ts_cv,
    n_iter=5,
    silent=True,
    fastshap=False,
    n_jobs=0,
)
feat_selector.fit(X, y, sample_weight=None)
Python 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:27:34) [MSC v.1937 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.20.0 -- An enhanced Interactive Python. Type '?' for help.
PyDev console: using IPython 8.20.0
Cross Validation:   0%|          | 0/5 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "D:\Anaconda3\envs\mld\lib\site-packages\IPython\core\interactiveshell.py", line 3553, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-1-9ec4508f8aff>", line 39, in <module>
    feat_selector.fit(X, y, sample_weight=None)
  File "D:\Anaconda3\envs\mld\lib\site-packages\arfs\feature_selection\allrelevant.py", line 2077, in fit
    self.selected_features_, self.cv_df, self.sha_cutoff = _reduce_vars_lgb_cv(
  File "D:\Anaconda3\envs\mld\lib\site-packages\arfs\feature_selection\allrelevant.py", line 2307, in _reduce_vars_lgb_cv
    new_x_tr, shadow_names = _create_shadow(X_train)
  File "D:\Anaconda3\envs\mld\lib\site-packages\arfs\feature_selection\allrelevant.py", line 1696, in _create_shadow
    np.random.shuffle(X_shadow[c].values)
  File "numpy\\random\\mtrand.pyx", line 4594, in numpy.random.mtrand.RandomState.shuffle
ValueError: array is read-only

My X and y are a pandas DataFrame and Series respectively. I've added a .copy() to both X and y and got the same error:

feat_selector.fit(X.copy(), y.copy(), sample_weight=None)

I'm not sure what differs between our environments that could cause the issue.
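One way to narrow this down is to check whether the arrays behind the DataFrame columns are writeable in the failing environment. The helper below is a hypothetical diagnostic, not part of ARFS. One possible cause, which the thread does not confirm, is a pandas setting such as copy-on-write mode, under which `.values` returns read-only views even after `.copy()`.

```python
import pandas as pd


def readonly_columns(df):
    """Return the names of columns whose underlying array is not writeable."""
    found = []
    for col in df.columns:
        values = df[col].values
        # Extension arrays (e.g. Categorical) expose no numpy flags; skip them.
        flags = getattr(values, "flags", None)
        if flags is not None and not flags.writeable:
            found.append(col)
    return found


# Toy frame standing in for X; in practice, call readonly_columns(X).
toy = pd.DataFrame({"a": [1, 2, 3], "b": [0.1, 0.2, 0.3]})
toy["c"] = pd.Series(["x", "y", "x"]).astype("category")
flagged = readonly_columns(toy)
```

If the list is non-empty in one environment and empty in the other, the difference lies in how pandas hands out its arrays rather than in numpy or ARFS.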

ThomasBury commented on September 27, 2024

Alright, we can try two things:

run in a fresh env:
- conda create -n arfs python jupyter ipykernel
- conda activate arfs
- pip install arfs -U

Then run the tutorial using this Python kernel.

If it still fails, try to change the numpy flag (see the link in my previous message)

If neither works, I'll need to investigate further. I just tested on two different laptops with fresh environments and it works fine (Linux and Windows, numpy 1.26 and 2.0.1).

🤞
