
Comments (4)

AutoViML commented on July 28, 2024

Hi @arjay55 👍
This problem has been fixed. Please upgrade via:
pip install featurewiz --upgrade

Then let us know if it works.
AutoViML team

from featurewiz.

arjay55 commented on July 28, 2024

Hi. I'm still encountering an error.

Skipping feature engineering since no feature_engg input...
Skipping category encoding since no category encoders specified in input...
Loading train data...
Shape of your Data Set loaded: (143974, 2880)
Loading test data...
No file given. Continuing...
Classifying features using 10000 rows...
loading a random sample of 10000 rows into pandas for EDA


ValueError Traceback (most recent call last)
/tmp/ipykernel_328/641257510.py in <module>
----> 1 outputs = featurewiz(dataset, collist, corr_limit=0.93, verbose=1, dask_xgboost_flag=False)

~/miniconda3/envs/CS280/lib/python3.7/site-packages/featurewiz/featurewiz.py in featurewiz(dataname, target, corr_limit, verbose, sep, header, test_data, feature_engg, category_encoders, dask_xgboost_flag, nrows, **kwargs)
1082 targets = copy.deepcopy(target)
1083 ##### you can use
-> 1084 train_small = select_rows_from_dataframe(dataname, targets, nrows_limit, DS_LEN=dataname.shape[0])
1085 features_dict = classify_features(train_small, target)
1086 else:

~/miniconda3/envs/CS280/lib/python3.7/site-packages/featurewiz/featurewiz.py in select_rows_from_dataframe(train_dataframe, targets, nrows_limit, DS_LEN)
4001 list_of_few_classes = train_dataframe[each_target].value_counts()[train_dataframe[each_target].value_counts()<=3].index.tolist()
4002 train_dataframe = train_dataframe.loc[~(train_dataframe[each_target].isin(list_of_few_classes))]
-> 4003 train_small, _ = train_test_split(train_dataframe, test_size=test_size, stratify=train_dataframe[targets])
4004 else:
4005 ### For Regression problems: load a small sample of data into a pandas dataframe ##

~/miniconda3/envs/CS280/lib/python3.7/site-packages/sklearn/model_selection/_split.py in train_test_split(*arrays, **options)
2129 n_samples = _num_samples(arrays[0])
2130 n_train, n_test = _validate_shuffle_split(n_samples, test_size, train_size,
-> 2131 default_test_size=0.25)
2132
2133 if shuffle is False:

~/miniconda3/envs/CS280/lib/python3.7/site-packages/sklearn/model_selection/_split.py in _validate_shuffle_split(n_samples, test_size, train_size, default_test_size)
1812 'resulting train set will be empty. Adjust any of the '
1813 'aforementioned parameters.'.format(n_samples, test_size,
-> 1814 train_size)
1815 )
1816

ValueError: With n_samples=0, test_size=0.9 and train_size=None, the resulting train set will be empty. Adjust any of the aforementioned parameters.
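
The `n_samples=0` in the message means the frame that reached `train_test_split` was already empty after the rare-class filtering step at line 4002. A minimal, standard-library-only sketch of how that can happen is shown below. It assumes the filter is meant to drop classes with three or fewer rows (matching the `value_counts()<=3` condition in the traceback); the helper names `drop_rare_classes` and `train_rows_after_split` are hypothetical and are not part of featurewiz or scikit-learn, and the split arithmetic is a simplified stand-in for scikit-learn's own check.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def drop_rare_classes(labels: Sequence[Hashable], max_rare_count: int = 3) -> List[Hashable]:
    """Keep only labels whose class occurs more than `max_rare_count` times."""
    counts = Counter(labels)
    return [y for y in labels if counts[y] > max_rare_count]


def train_rows_after_split(n_samples: int, test_size: float) -> int:
    """Simplified estimate of the rows left for training after a fractional split."""
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test


if __name__ == "__main__":
    # Two well-populated classes: nothing is dropped, so the split has rows to work with.
    balanced = ["a"] * 50 + ["b"] * 50
    kept = drop_rare_classes(balanced)
    print(len(kept), train_rows_after_split(len(kept), 0.9))

    # A near-unique target (every value occurs once): every row counts as a rare class,
    # so the filter removes everything and the split would have an empty train set.
    near_unique = list(range(20))
    kept = drop_rare_classes(near_unique)
    print(len(kept), train_rows_after_split(len(kept), 0.9))
```

Whether the reporter's target columns actually looked like the second case cannot be determined from the thread; the sketch only shows one way an empty frame can arise from this kind of filter.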


AutoViML commented on July 28, 2024

Hi @arjay55 👍
Sorry, I cannot tell where this error is coming from without the data. I also noticed that you have not provided the version number of featurewiz you are using.

Please note that the latest version is 0.0.70. Please confirm that you have upgraded to this version and, if you can, attach a small sample dataset (as a zip file) here.
Thanks
AutoViML
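
For reporting the installed version as requested above, a small sketch using the standard library is shown here. The helper name `installed_version` is hypothetical. Note that `importlib.metadata` is only in the standard library from Python 3.8 onward; on Python 3.7, as in the traceback paths, running `pip show featurewiz` in the same environment reports the same information.

```python
from importlib import metadata  # standard library on Python 3.8+
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints None if featurewiz is not installed in the active environment.
    print("featurewiz version:", installed_version("featurewiz"))
```

Pasting that output into the issue alongside the traceback makes it easier to tell whether the upgrade took effect in the environment the notebook is using.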


arjay55 commented on July 28, 2024

Hi. I have rerun featurewiz on my dataset and confirmed that the bug is resolved. Thanks!
