Method to transform only data, after fitting (featurewiz, 9 comments, closed)

autoviml commented on July 28, 2024

Requesting a method that only transforms data once featurewiz has been fitted.

from featurewiz.

Comments (9)

rsesha commented on July 28, 2024

I am working on a fit_transform(train) and transform(test) version of featurewiz. I will announce it here. Make sure you star or watch this GitHub repository to get an update.
AutoViML team

…

On Fri, Jun 11, 2021 at 5:37 PM Doug Nicholson wrote: Nifty package! I second nemar3's suggestion of having a separate transform method. I'd love to use this package in a production workflow, but this current limitation means that I would have to either a) retrain every time I wanted to score new data or b) reverse engineer the code that transforms my raw features into featwiz features.

dougnicholson commented on July 28, 2024

Nifty package! I second nemar3's suggestion of having a separate transform method.

I'd love to use this package in a production workflow, but this current limitation means that I would have to either a) retrain every time I wanted to score new data or b) reverse engineer the code that transforms my raw features into featwiz features.
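The fit-once, transform-many pattern being asked for can be sketched in plain Python. This is only an illustration of the contract, not featurewiz's selection logic: `CorrelationSelector` and its threshold rule are hypothetical stand-ins, chosen because they need no third-party libraries.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


class CorrelationSelector:
    """Hypothetical stand-in: keeps columns whose |correlation| with y meets a threshold."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.features = None  # column indices chosen during fit

    def fit(self, X, y):
        columns = list(zip(*X))  # rows -> columns
        self.features = [
            i for i, col in enumerate(columns)
            if abs(pearson(col, y)) >= self.threshold
        ]
        return self

    def transform(self, X):
        if self.features is None:
            raise RuntimeError("call fit() before transform()")
        return [[row[i] for i in self.features] for row in X]


X_train = [[1, 5, 0], [2, 3, 1], [3, 5, 1], [4, 3, 2]]
y_train = [1, 2, 3, 4]
selector = CorrelationSelector(threshold=0.5).fit(X_train, y_train)

# New data is reduced to the columns learned at fit time; nothing is refit.
X_new = [[10, 20, 30]]
X_new_selected = selector.transform(X_new)
```

The point of the split is that `transform` only reads state stored by `fit`, so scoring new rows needs neither the target column nor a retraining run.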

rsesha commented on July 28, 2024

Hi Nemar3:
These are very good suggestions. Let me see what I can do.
Thanks
Ram

rankwe commented on July 28, 2024

Hi @rsesha, any progress on the fit_transform(train) and transform(test) class of the featurewiz library? When should we expect it?

rsesha commented on July 28, 2024

Hi: I have nearly 20 functions that do not currently follow the fit-transform syntax. Would you need all of them or only some of them? Can you please look at the code and tell me?
Thanks
Ram

…

On Wed, Aug 25, 2021 at 12:10 PM rankwe wrote: Hi @rsesha, any progress on the fit_transform(train) and transform(test) class of the featurewiz library? When should we expect it?

chrico-bu-uab commented on July 28, 2024

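import numpy as np
import pandas as pd
from featurewiz import featurewiz
from sklearn.base import BaseEstimator, TransformerMixin


class FeatureWiz(BaseEstimator, TransformerMixin):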
    def __init__(self):
        self.features = None

    def fit(self, X, y):

        # Build dataframe. The column names are numbers represented as strings
        df = pd.DataFrame(X, columns=[str(i) for i in range(X.shape[1])])

        # Add a column with binarized labels
        target_col = str(X.shape[1])
        df[target_col] = y

        # Select features using featurewiz
        features, _ = featurewiz(df, target_col)

        # Convert the remaining column names back to integers and drop the
        # column of labels
        self.features = [int(s) for s in np.squeeze(features)][:-1]

        return self

    def transform(self, x):
        return x[:, self.features]

AutoViML commented on July 28, 2024

Hi @chrico-bu-uab 👍

Hahaha. I love your thoughtfulness 💯 I would have created this kind of class long ago if the hundreds of things that featurewiz does could simply be turned into a transformer mixin.

There are 2 ways to solve the problem you all have mentioned here:

1. Keep featurewiz as rich and full-featured as it is: You can't just throw featurewiz into an existing TransformerMixin class. It will blow up... The reason is that you would need to containerize the entire featurewiz library to do what you are suggesting here. But if you think that is not a problem for many folks like yourself, I can try to offer this version.

2. Create a featurewiz-lite version: I am working on a new version of featurewiz, based on the TransformerMixin above, that takes in a dataset, transforms it completely into numeric variables, and then selects the best features from it using recursive XGBoost. It is a fully scikit-learn compatible Pipeline object, which I call featurewiz_lite, so it will work in any Python data pipeline you create and won't need a special container. However, this version cannot do SULOV, since SULOV uses the networkx library (graph networks), which makes it an extremely complicated piece of code to support there. Do you think that featurewiz-lite will be noteworthy and helpful to data and ML engineers?
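The first stage described in option 2, turning every column into numbers in a way that can be replayed on test data, might look like the sketch below. It covers only the encoding step, not the XGBoost-based selection or SULOV, and `OrdinalEncoderSketch` is a hypothetical name, not part of featurewiz.

```python
class OrdinalEncoderSketch:
    """Hypothetical encoder: maps each category in a column to an integer code."""

    UNKNOWN = -1  # code used for categories not seen during fit

    def __init__(self):
        self.mappings = None  # one {category: code} dict per column

    def fit(self, X, y=None):
        columns = list(zip(*X))  # rows -> columns
        self.mappings = [
            {value: code for code, value in enumerate(sorted(set(col)))}
            for col in columns
        ]
        return self

    def transform(self, X):
        return [
            [mapping.get(value, self.UNKNOWN)
             for mapping, value in zip(self.mappings, row)]
            for row in X
        ]


train = [["red", "S"], ["blue", "M"], ["red", "L"]]
encoder = OrdinalEncoderSketch().fit(train)
train_encoded = encoder.transform(train)

# A category that never appeared in training falls back to the UNKNOWN code.
test_encoded = encoder.transform([["green", "M"]])
```

Because the category-to-code mappings are stored at fit time, the same encoding is applied to train and test rows, which is what makes a later selection step reusable on new data.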

Looking for comments and feedback on above.
AutoViML

chrico-bu-uab commented on July 28, 2024

Thanks, and point taken! I had just hacked something together (have only lightly tested it) and thought I would share. I would definitely be in favor of a featurewiz_lite version. The full version has so many "bells and whistles" (which is a great thing) that it wouldn't be a huge loss if a little of that functionality were sacrificed in order to have a scikit-learn compatible version.

As they say, "A designer knows he has achieved perfection, not when there is nothing left to add, but when there is nothing left to take away."

AutoViML commented on July 28, 2024

Hi @chrico-bu-uab, @rankwe, @nemar3:
👍

You are in luck! I was finally able to create a Transformer class out of featurewiz called FeatureWiz. You can now use it to perform feature selection using the fit and transform syntax of scikit-learn as follows:

from featurewiz import FeatureWiz
features = FeatureWiz(corr_limit=0.70, feature_engg='', category_encoders='', dask_xgboost_flag=False, nrows=None, verbose=2)
X_train_selected = features.fit_transform(X_train, y_train)
X_test_selected = features.transform(X_test)
features.features  ### provides the list of selected features ###

You will get a Transformer that can select the top variables from your dataset.

You must first upgrade your featurewiz version to 0.0.90 or higher via:

pip install featurewiz --upgrade
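For the production scoring case raised earlier in the thread, one lightweight option is to persist just the selected feature names and reapply them to new records. The sketch below assumes the list comes from `features.features` after fitting; the feature names and records are made up for illustration, and this captures only column selection, not any encoding or feature engineering the transformer may also perform.

```python
import json

# Assumption: `selected` stands in for the list returned by `features.features`
# after fitting; these names are invented for the example.
selected = ["age", "income"]

# Training time: persist the selected feature names (a file would be used in practice).
saved = json.dumps({"selected_features": selected})

# Scoring time: reload the names and keep only those fields in each new record.
loaded = json.loads(saved)["selected_features"]
new_records = [{"age": 41, "income": 52000, "zip": "35294"}]
scored_inputs = [{name: rec[name] for name in loaded} for rec in new_records]
```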

Hope you like it. Please provide your feedback and comments here.
AutoViML
