Ex-No.5 FEATURE GENERATION

AIM:

To read the given data, perform the Feature Generation process, and save the data to a file.

EXPLANATION:

Feature Generation (also known as feature construction, feature extraction or feature engineering) is the process of transforming existing features into new features that relate more closely to the target variable.

ALGORITHM:

STEP 1:

Read the given Data.

STEP 2:

Clean the data set using a data cleaning process.

STEP 3:

Apply Feature Generation techniques to the features of the data set.

STEP 4:

Save the data to the file.
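The four steps above can be sketched end to end as follows. This is only a minimal illustration, not the exercise's actual solution: the in-memory CSV, the column names `Ord_1` and `City`, the "drop rows with missing values" cleaning rule, and the output file name `encoded_data.csv` are all assumptions made for the example.

```python
import io
import os
import tempfile

import pandas as pd

# STEP 1: read the data (an in-memory CSV stands in for the real file).
raw_csv = io.StringIO(
    "Ord_1,City\n"
    "Cold,Delhi\n"
    "Hot,Chennai\n"
    ",Delhi\n"
    "Warm,Mumbai\n"
)
df = pd.read_csv(raw_csv)

# STEP 2: clean the data (assumed rule: drop rows with missing values).
df = df.dropna().reset_index(drop=True)

# STEP 3: generate new features.
# Ordinal feature: map ordered categories to increasing integers.
temperature_order = {"Cold": 0, "Warm": 1, "Hot": 2, "Very Hot": 3}
df["Ord_1_encoded"] = df["Ord_1"].map(temperature_order)
# Nominal feature: one indicator column per category.
city_flags = pd.get_dummies(df["City"], prefix="City", dtype=int)
df = pd.concat([df, city_flags], axis=1)

# STEP 4: save the result to a file (hypothetical file name in a temp folder).
out_path = os.path.join(tempfile.mkdtemp(), "encoded_data.csv")
df.to_csv(out_path, index=False)
print(df)
```

Passing `index=False` to `to_csv` keeps the row index out of the saved file, so reading it back gives the same columns.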

CODE:

Data set:

Ordinal Encoder:

import pandas as pd

import numpy as np

import seaborn as sns

from google.colab import files

uploaded = files.upload()

df = pd.read_csv("data.csv")

from sklearn.preprocessing import LabelEncoder,OrdinalEncoder

classes = ['Cold','Warm','Hot','Very Hot']

enc = OrdinalEncoder(categories = [classes])

enc.fit_transform(df[["Ord_1"]])

df['ord_1']=enc.fit_transform(df[["Ord_1"]])

df
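To see what the ordinal encoder produces without uploading a file, here is a small self-contained sketch. The toy values in `Ord_1` are assumed for illustration; only the `classes` list is taken from the code above.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Toy data standing in for the "Ord_1" column (values are assumed).
df = pd.DataFrame({"Ord_1": ["Hot", "Cold", "Very Hot", "Warm", "Cold"]})

# The position of each label in this list becomes its code:
# Cold -> 0, Warm -> 1, Hot -> 2, Very Hot -> 3.
classes = ["Cold", "Warm", "Hot", "Very Hot"]
enc = OrdinalEncoder(categories=[classes])

# fit_transform expects a 2-D input and returns a 2-D float array;
# ravel() flattens it so it can be stored as a single column.
df["ord_1"] = enc.fit_transform(df[["Ord_1"]]).ravel()
print(df)
```

Listing the categories explicitly matters here: without it, the encoder would order the labels alphabetically, and the codes would no longer follow the temperature scale.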

Label Encoder:

import pandas as pd

import numpy as np

import seaborn as sns

from google.colab import files

uploaded = files.upload()

df = pd.read_csv("data.csv")

from sklearn.preprocessing import LabelEncoder,OrdinalEncoder

# LabelEncoder takes a single 1-D column and assigns integer codes to its labels

le = LabelEncoder()

df['target']=le.fit_transform(df["Target"])

df
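A minimal sketch of how `LabelEncoder` assigns codes, using assumed toy labels rather than the real `Target` column:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy labels (assumed); the real "Target" column may hold other values.
df = pd.DataFrame({"Target": ["yes", "no", "no", "yes"]})

le = LabelEncoder()
# LabelEncoder works on a 1-D column and numbers the labels in sorted order.
df["target"] = le.fit_transform(df["Target"])

# classes_ lists the labels; the index of each label is its code.
print(list(le.classes_))
# inverse_transform maps the integer codes back to the original labels.
original = le.inverse_transform(df["target"])
print(df)
```

Unlike `OrdinalEncoder`, `LabelEncoder` does not accept a custom category order, so it suits target labels where the numbering carries no meaning.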

Binary Encoder:

import pandas as pd

import numpy as np

import seaborn as sns

from google.colab import files

uploaded = files.upload()

df = pd.read_csv("data.csv")

!pip install category_encoders

from category_encoders import BinaryEncoder

be=BinaryEncoder()

newdata=be.fit_transform(df['bin_1'])

df1=pd.concat([df,newdata],axis=1)

df1
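The idea behind binary encoding can be sketched in plain pandas: give each category an integer code, then spread the binary digits of that code across columns. The helper `binary_encode` below is hypothetical and written only for illustration; it is not the `category_encoders` implementation, and the code assignment and column names it produces may differ from what `BinaryEncoder` returns.

```python
import pandas as pd


def binary_encode(series: pd.Series) -> pd.DataFrame:
    """Encode categories as the binary digits of an integer code (illustrative)."""
    # Assign integer codes 1, 2, 3, ... in order of first appearance.
    codes = pd.factorize(series)[0] + 1
    # Number of bits needed to represent the largest code.
    n_bits = max(int(codes.max()).bit_length(), 1)
    # Extract one bit per column, most significant bit first.
    columns = {
        f"{series.name}_{i}": (codes >> (n_bits - 1 - i)) & 1
        for i in range(n_bits)
    }
    return pd.DataFrame(columns, index=series.index)


# Toy data (assumed values) with four distinct categories.
df = pd.DataFrame({"bin_1": ["A", "B", "C", "A", "D"]})
df1 = pd.concat([df, binary_encode(df["bin_1"])], axis=1)
print(df1)
```

Four categories fit into three bit columns here, whereas one-hot encoding would need four columns; the saving grows with the number of categories.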

OneHotEncoder:

import pandas as pd

import numpy as np

import seaborn as sns

from google.colab import files

uploaded = files.upload()

df = pd.read_csv("data.csv")

from sklearn.preprocessing import OneHotEncoder

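# scikit-learn 1.2 renamed the `sparse` argument to `sparse_output`

ohe = OneHotEncoder(sparse_output=False)

df1 = df.copy()

# Name the new columns after the categories they indicate

enc = pd.DataFrame(ohe.fit_transform(df1[['City']]), columns=ohe.get_feature_names_out())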

df1 = pd.concat([df1,enc],axis=1)

df1
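The keyword that controls dense output (`sparse` or `sparse_output`) depends on the installed scikit-learn release. The sketch below sidesteps it by keeping the default sparse result and converting it with `toarray()`. The toy `City` values and the `City_` column prefix are assumptions made for the example.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Toy data standing in for the "City" column (values are assumed).
df = pd.DataFrame({"City": ["Delhi", "Chennai", "Delhi", "Mumbai"]})

ohe = OneHotEncoder()
# The default result is a SciPy sparse matrix; toarray() makes it dense.
dense = ohe.fit_transform(df[["City"]]).toarray()

# categories_[0] holds the sorted categories found in the first input column.
columns = [f"City_{category}" for category in ohe.categories_[0]]
enc = pd.DataFrame(dense, columns=columns, index=df.index)

df1 = pd.concat([df, enc], axis=1)
print(df1)
```

Reusing `df.index` for the encoded frame keeps `pd.concat` from misaligning rows when the original index is not the default 0, 1, 2, ... range.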

Encoding Data Set:

Ordinal Encoder:

import pandas as pd

import numpy as np

import seaborn as sns

from google.colab import files

uploaded = files.upload()

df = pd.read_csv("Encoding Data.csv")

from sklearn.preprocessing import LabelEncoder,OrdinalEncoder

classes = ['Red','Blue','Green']

enc = OrdinalEncoder(categories = [classes])

enc.fit_transform(df[["nom_0"]])

df['Nom_0']=enc.fit_transform(df[["nom_0"]])

df

Binary Encoder:

import pandas as pd

import numpy as np

import seaborn as sns

from google.colab import files

uploaded = files.upload()

df = pd.read_csv("Encoding Data.csv")

!pip install category_encoders

from category_encoders import BinaryEncoder

be=BinaryEncoder()

newdata=be.fit_transform(df['bin_1'])

df1=pd.concat([df,newdata],axis=1)

df1

Titanic Data Set:

Ordinal Encoder:

import pandas as pd

import numpy as np

import seaborn as sns

from google.colab import files

uploaded = files.upload()

df = pd.read_csv("titanic_dataset.csv")

from sklearn.preprocessing import LabelEncoder,OrdinalEncoder

classes = [1,2,3]

enc = OrdinalEncoder(categories = [classes])

enc.fit_transform(df[["Pclass"]])

df['ord_2']=enc.fit_transform(df[["Pclass"]])

df

import pandas as pd

import numpy as np

import seaborn as sns

from google.colab import files

uploaded = files.upload()

df = pd.read_csv("titanic_dataset.csv")

from sklearn.preprocessing import LabelEncoder,OrdinalEncoder

classes = ['C','Q','S',np.nan]

enc = OrdinalEncoder(categories = [classes])

enc.fit_transform(df[["Embarked"]])

df['ord']=enc.fit_transform(df[["Embarked"]])

df
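As an alternative to listing `np.nan` among the categories, the missing values can be filled during the cleaning step before encoding. The sketch below assumes that replacing missing ports with the most frequent one is acceptable. That rule and the toy `Embarked` values are illustrative choices, not part of the original exercise.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Toy data standing in for the "Embarked" column, with one missing value.
df = pd.DataFrame({"Embarked": ["S", "C", np.nan, "Q", "S"]})

# Assumed cleaning rule: replace missing ports with the most frequent one.
most_frequent = df["Embarked"].mode()[0]
df["Embarked"] = df["Embarked"].fillna(most_frequent)

# With no missing values left, only the real categories need to be listed.
classes = ["C", "Q", "S"]
enc = OrdinalEncoder(categories=[classes])
df["ord"] = enc.fit_transform(df[["Embarked"]]).ravel()
print(df)
```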

Label Encoder:

import pandas as pd

import numpy as np

import seaborn as sns

from google.colab import files

uploaded = files.upload()

df = pd.read_csv("titanic_dataset.csv")

from sklearn.preprocessing import LabelEncoder,OrdinalEncoder

le = LabelEncoder()

df['Name1']=le.fit_transform(df['Name'])

df

Binary Encoder:

import pandas as pd

import numpy as np

import seaborn as sns

from google.colab import files

uploaded = files.upload()

df = pd.read_csv("titanic_dataset.csv")

!pip install category_encoders

from category_encoders import BinaryEncoder

be=BinaryEncoder()

newdata=be.fit_transform(df['Sex'])

df1=pd.concat([df,newdata],axis=1)

df1

OUTPUT:

Data set:

Ordinal Encoder:

Screenshot (43)

Label Encoder:

Screenshot (44)

Binary Encoder:

Screenshot (45)

OneHotEncoder:

Screenshot (46)

Encoding Data Set:

Ordinal Encoder:

Screenshot (47)

Binary Encoder:

Screenshot (48)

Titanic Data Set:

Ordinal Encoder:

Screenshot (49) Screenshot (50)

Label Encoder:

Screenshot (51)

Binary Encoder:

Screenshot (52)

RESULT:

Thus, Feature Generation was performed on the given data sets and the output was verified successfully.
