
Comments (4)

jmrichardson commented on September 27, 2024

Hi, creating a new environment did work. I tested both numpy 1.26 and 2.0.1 on my Windows PC with no issues. Something else in my other environment must be conflicting. No worries, I will just create a fork, make the changes I need, and hopefully have more time later to pinpoint the issue. Thanks for your help :)

from arfs.

ThomasBury commented on September 27, 2024

Hello @jmrichardson, could you print out the versions of numpy and arfs you are using?

import numpy as np
import arfs
print(f"numpy {np.__version__} and ARFS {arfs.__version__}")

As the error says, the array is read-only. It might be due to how you instantiate X and y. A simple solution is to copy your array or change the numpy writeable flag. Everything should be fine if you use a pandas DataFrame.
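For illustration, here is a minimal sketch of the two workarounds mentioned above, applied to a plain NumPy array. The array `arr` is made-up demo data, marked read-only on purpose to mimic the failure; it is not taken from ARFS or from the tutorial.

```python
import numpy as np

# Hypothetical demo data: mark the array read-only to mimic the reported failure.
arr = np.arange(5.0)
arr.flags.writeable = False

try:
    np.random.shuffle(arr)  # in-place shuffle needs a writable buffer
except ValueError as exc:
    print(f"shuffle failed: {exc}")

# Workaround 1: shuffle a copy. A fresh copy owns its data and is writable.
writable = arr.copy()
np.random.shuffle(writable)

# Workaround 2: flip the flag back. This only works when the array owns
# its data (a view of a read-only base cannot be made writable this way).
arr.flags.writeable = True
np.random.shuffle(arr)
```

Copying is the safer of the two, since flipping the flag modifies an array that other code may assume is immutable.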

Are you able to run the time series tutorial?
It runs fine for me with numpy 1.26.4 and numpy 2.0.1, using ARFS 2.3.0.

I prefer not to change shuffle to permutation, as permutation creates a copy of the numpy array. The problem can instead be solved upstream, when instantiating X, y and w.
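To make the trade-off concrete, here is a small sketch of the difference between the two legacy NumPy functions. The array `source` is hypothetical demo data; the snippet only illustrates NumPy behaviour and says nothing about how ARFS uses these calls internally.

```python
import numpy as np

# Hypothetical demo data, marked read-only like the array in the error.
source = np.arange(10)
source.flags.writeable = False

# permutation returns a new, shuffled array and leaves its input untouched,
# so it accepts read-only input at the cost of an extra allocation.
permuted = np.random.permutation(source)
shares_memory = np.shares_memory(permuted, source)

# shuffle reorders its argument in place and returns None,
# so it needs a writable array but allocates nothing new.
buffer = source.copy()
returned = np.random.shuffle(buffer)
```

The extra copy made by permutation is the cost referred to above; shuffle avoids it but requires the caller to pass writable data.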

Let me know if that works. Thanks for reaching out.


jmrichardson commented on September 27, 2024

Hi @ThomasBury ,

Thank you for the fast reply!

import numpy as np
import arfs
print(f"numpy {np.__version__} and ARFS {arfs.__version__}")
numpy 1.26.4 and ARFS 2.3.0

It fails on the tutorial. I pasted the tutorial code below into my Python terminal and got the same error:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_openml
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import cross_validate
from sklearn.model_selection import TimeSeriesSplit
from arfs.benchmark import highlight_tick
from arfs.feature_selection.allrelevant import GrootCV
bike_sharing = fetch_openml("Bike_Sharing_Demand", version=2, as_frame=True)
df = bike_sharing.frame
y = df["count"] #/ df["count"].max()
X = df.drop("count", axis="columns")
X["weather"] = (
    X["weather"]
    .astype(object)
    .replace(to_replace="heavy_rain", value="rain")
    .astype("category")
)
ts_cv = TimeSeriesSplit(
    n_splits=5,
    gap=48,
    max_train_size=10000,
    test_size=1000,
)
feat_selector = GrootCV(
    objective="poisson",
    cutoff=1,
    n_folds=5,
    folds=ts_cv,
    n_iter=5,
    silent=True,
    fastshap=False,
    n_jobs=0,
)
feat_selector.fit(X, y, sample_weight=None)
Python 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:27:34) [MSC v.1937 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.20.0 -- An enhanced Interactive Python. Type '?' for help.
PyDev console: using IPython 8.20.0
Cross Validation:   0%|          | 0/5 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "D:\Anaconda3\envs\mld\lib\site-packages\IPython\core\interactiveshell.py", line 3553, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-1-9ec4508f8aff>", line 39, in <module>
    feat_selector.fit(X, y, sample_weight=None)
  File "D:\Anaconda3\envs\mld\lib\site-packages\arfs\feature_selection\allrelevant.py", line 2077, in fit
    self.selected_features_, self.cv_df, self.sha_cutoff = _reduce_vars_lgb_cv(
  File "D:\Anaconda3\envs\mld\lib\site-packages\arfs\feature_selection\allrelevant.py", line 2307, in _reduce_vars_lgb_cv
    new_x_tr, shadow_names = _create_shadow(X_train)
  File "D:\Anaconda3\envs\mld\lib\site-packages\arfs\feature_selection\allrelevant.py", line 1696, in _create_shadow
    np.random.shuffle(X_shadow[c].values)
  File "numpy\\random\\mtrand.pyx", line 4594, in numpy.random.mtrand.RandomState.shuffle
ValueError: array is read-only

My X and y are a pandas DataFrame and Series, respectively. I've added a .copy() to both X and y and got the same error:

feat_selector.fit(X.copy(), y.copy(), sample_weight=None)

I'm not sure what differs between our environments that could cause the issue.
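One way to compare environments is to check whether pandas hands out read-only arrays at all. The sketch below assumes a hypothetical helper, `read_only_columns`, and a made-up `demo` frame; neither is part of ARFS. Pandas' Copy-on-Write mode is one known reason for `.values` to be read-only, although nothing in this thread establishes that it is the cause here.

```python
import numpy as np
import pandas as pd


def read_only_columns(frame):
    """Return the names of columns whose .values is a read-only NumPy array.

    Hypothetical diagnostic helper, not part of ARFS.
    """
    flagged = []
    for name in frame.columns:
        values = frame[name].values
        # Categorical and other extension columns do not return an ndarray,
        # so only plain NumPy-backed columns are checked here.
        if isinstance(values, np.ndarray) and not values.flags.writeable:
            flagged.append(name)
    return flagged


# Made-up demo frame; replace it with the real X to inspect the failing setup.
demo = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4, 5, 6]})
flagged = read_only_columns(demo)
print(f"pandas {pd.__version__}, numpy {np.__version__}")
print("read-only columns:", flagged)
```

Running this in both the working and the failing environment, and comparing the printed versions and flagged columns, should narrow down where the read-only arrays come from.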


ThomasBury commented on September 27, 2024

Alright, we can try two things:

  • run in a fresh env:
    • conda create -n arfs python jupyter ipykernel
    • conda activate arfs
    • pip install arfs -U

Then run the tutorial using this Python kernel.

If it still fails, try changing the numpy flag (see the link in my previous message).

If neither works, I'll need to investigate further. I just tested on two different laptops with fresh environments and it works fine (Linux and Windows, numpy 1.26 and 2.0.1).

🤞
