Comments (3)
Great question @MySciencePlayground 👍
I chose this method because of the drawbacks I found in the permutation-based alternatives:
- Selecting multiple permutations of features sequentially is very time-consuming, and on large datasets the cost can explode.
- Randomly selecting baskets of features may leave out certain features, and that could be costly.
The best compromise between the two is this sequential selection method, in which smaller baskets of features are selected one after another. It scales well even when a dataset has hundreds or thousands of features, so it works for large datasets as well as small ones.
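To make the idea of selecting in small baskets concrete, here is a minimal sketch in plain Python. It is not featurewiz's actual implementation: the function names, the correlation-based scoring, and the thresholds `min_relevance` and `max_redundancy` are all assumptions chosen for illustration. Each basket is a consecutive slice of the columns, so the cost grows with the number of baskets rather than with the number of possible feature combinations.

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def select_in_baskets(columns, target, basket_size=2,
                      min_relevance=0.3, max_redundancy=0.95):
    """Hypothetical sequential selector (illustration only, not the library's code).

    columns: dict mapping column name -> list of values, in column order.
    Columns are processed in consecutive baskets of `basket_size`.
    """
    names = list(columns)
    selected = []
    for start in range(0, len(names), basket_size):
        basket = names[start:start + basket_size]
        # Within a basket, consider the most target-relevant columns first.
        basket.sort(key=lambda n: abs(pearson(columns[n], target)), reverse=True)
        for name in basket:
            if abs(pearson(columns[name], target)) < min_relevance:
                continue  # too weakly related to the target
            if any(abs(pearson(columns[name], columns[s])) > max_redundancy
                   for s in selected):
                continue  # near-duplicate of a column kept earlier
            selected.append(name)
    return selected


# Toy data, made up for this example.
target = [1, 2, 3, 4, 5, 6, 7, 8]
columns = {
    "a": [1, 2, 3, 4, 5, 6, 7, 9],
    "noise": [5, 1, 4, 2, 5, 1, 4, 2],
    "a_copy": [2, 4, 6, 8, 10, 12, 14, 18],  # exact multiple of "a"
    "b": [1, 3, 2, 4, 3, 5, 4, 6],
}
selected = select_in_baskets(columns, target)
print(selected)
```

Because a column is only compared against columns kept from earlier baskets, every column is looked at exactly once, unlike random baskets, which may never sample some columns.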
Hope this clarifies
AutoVimal
from featurewiz.
Thanks for the quick response, @AutoViML! So with the current approach, the column order can have some effect on the final selection. Out of curiosity, is there any test or evaluation of how different the results could be (applying the same dataset but changing the column order)?
Hi @MySciencePlayground
That's a good question, but I was hoping you would try it yourself by changing the column order and testing it. I have done my own tests, so I will leave it to you to try it and report back your findings and any questions 👍
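One way to run the column-order experiment is sketched below. It shuffles the column order several times, reruns a selector, and compares each result with the baseline using Jaccard similarity. The selector here, `greedy_select`, is a hypothetical order-dependent stand-in and not featurewiz itself; `order_sensitivity`, `jaccard`, and the toy data are likewise assumptions for illustration. To test the real library, you would pass in your own function that wraps it and returns the selected column names.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def greedy_select(columns, max_redundancy=0.95):
    """Hypothetical stand-in selector: keep a column unless it nearly
    duplicates one that was kept earlier (so column order matters)."""
    kept = []
    for name in columns:  # dicts preserve insertion order
        if all(abs(pearson(columns[name], columns[k])) <= max_redundancy
               for k in kept):
            kept.append(name)
    return kept


def jaccard(a, b):
    """Jaccard similarity of two collections of names (1.0 means identical sets)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 1.0


def order_sensitivity(columns, select, n_shuffles=20, seed=0):
    """Rerun `select` on shuffled column orders; return similarity to the baseline."""
    rng = random.Random(seed)
    baseline = select(columns)
    names = list(columns)
    scores = []
    for _ in range(n_shuffles):
        shuffled = names[:]
        rng.shuffle(shuffled)
        result = select({n: columns[n] for n in shuffled})
        scores.append(jaccard(baseline, result))
    return scores


# Toy data, made up for this example: "a_copy" is an exact multiple of "a".
columns = {
    "a": [1, 2, 3, 4, 5, 6, 7, 9],
    "a_copy": [2, 4, 6, 8, 10, 12, 14, 18],
    "b": [1, 3, 2, 4, 3, 5, 4, 6],
}
scores = order_sensitivity(columns, greedy_select)
print(min(scores), sum(scores) / len(scores))
```

A minimum score of 1.0 would mean the selection never changed across shuffles; lower values show how far the selected set can drift when only the column order differs.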
Please upgrade to the latest version since it fixes a bug related to this issue:
!pip install featurewiz --ignore-installed --no-deps --upgrade
This is the tryout medal for your efforts!🥇
AutoVimal