Ex02-Outlier

You are given bhp.csv, which contains property prices in the city of Bangalore, India. Examine the price_per_sqft column and do the following:

(1) Remove outliers using IQR

(2) After removing outliers in step 1, you get a new dataframe.

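(3) Use a z-score threshold of 3 to remove outliers from the dataframe obtained in step 2. This is similar in spirit to the IQR method, but the two methods will not necessarily remove exactly the same rows.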

(4) For the data set height_weight.csv, find the following:

(i) Using IQR detect weight outliers and print them

(ii) Using IQR, detect height outliers and print them
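The IQR rule used in the tasks above can be worked through by hand on a tiny sample. The sketch below uses made-up numbers (not values from bhp.csv or height_weight.csv), and the variable names are only illustrative. It assumes the usual 1.5 × IQR fences and pandas' default linear-interpolation quantiles.

```python
import pandas as pd

# Made-up values for illustration only; not taken from bhp.csv.
prices = pd.Series([2, 4, 6, 8, 10, 12, 14, 16, 50])

q1 = prices.quantile(0.25)   # first quartile
q3 = prices.quantile(0.75)   # third quartile
iqr = q3 - q1                # interquartile range

# Fences: anything outside [low, high] is treated as an outlier.
low = q1 - 1.5 * iqr
high = q3 + 1.5 * iqr

outliers = prices[(prices < low) | (prices > high)]
kept = prices[(prices >= low) & (prices <= high)]

print("q1 =", q1, "q3 =", q3, "iqr =", iqr)
print("fences =", low, high)
print("outliers =", outliers.tolist())
```

Here q1 = 6 and q3 = 14, so the IQR is 8 and the fences are -6 and 26; only the value 50 falls outside them.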

DEVELOPED BY: Lavanya S

REFERENCE NO: 212221220030

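import pandas as pd
import seaborn as sns
import numpy as np
from scipy import stats

# Load the house-price data and inspect the price_per_sqft column
df = pd.read_csv('bhp.csv')
print(df['price_per_sqft'])
# Box plot before any outlier removal
sns.boxplot(x="price_per_sqft", data=df)
print(df.shape)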


# Steps 1 and 2: remove outliers with the IQR rule
q1 = df['price_per_sqft'].quantile(0.25)
q3 = df['price_per_sqft'].quantile(0.75)
iqr = q3 - q1
print("FIRST QUARTILE=", q1, "\nTHIRD QUARTILE=", q3)
low = q1 - 1.5 * iqr
high = q3 + 1.5 * iqr
# Keep only the rows that lie inside the IQR fences
df_filtered = df[(df['price_per_sqft'] >= low) & (df['price_per_sqft'] <= high)]
sns.boxplot(x="price_per_sqft", data=df_filtered)


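from scipy import stats
import numpy as np
# Step 3: z-score filter on the IQR-filtered dataframe.
# The scores are computed on df_filtered (not df) so the mask has the same length as the rows it selects.
z = np.abs(stats.zscore(df_filtered['price_per_sqft']))
df_filtered = df_filtered[z < 3]
sns.boxplot(x='price_per_sqft', data=df_filtered)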
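To make the z-score step concrete, here is a minimal sketch that computes z-scores by hand on made-up numbers (not from bhp.csv) and compares the result with the IQR fences. It uses `ddof=0` because that is the default of `scipy.stats.zscore`, while `Series.std()` in pandas defaults to `ddof=1`. The variable names are only illustrative.

```python
import pandas as pd

# Made-up values for illustration only; not taken from bhp.csv.
values = pd.Series([2, 4, 6, 8, 10, 12, 14, 16, 50])

# z-score by hand: distance from the mean in units of standard deviation.
# ddof=0 (population standard deviation) matches the scipy.stats.zscore default.
z = (values - values.mean()) / values.std(ddof=0)
kept_by_z = values[z.abs() < 3]

# IQR fences on the same values, for comparison.
q1, q3 = values.quantile(0.25), values.quantile(0.75)
low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
kept_by_iqr = values[(values >= low) & (values <= high)]

print("largest |z|:", round(z.abs().max(), 2))
print("kept by z-score:", len(kept_by_z), "kept by IQR:", len(kept_by_iqr))
```

On this toy sample the value 50 has |z| of about 2.68, so a threshold of 3 keeps all nine values, while the IQR rule drops 50. The two methods can therefore disagree, especially on small or heavily skewed samples.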


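# Step 4: load the height/weight data set named in the task
df = pd.read_csv('height_weight.csv')

# (i) IQR fences for the weight column
q1 = df['weight'].quantile(0.25)
q3 = df['weight'].quantile(0.75)
iqr = q3 - q1
low = q1 - 1.5 * iqr
high = q3 + 1.5 * iqr
# Values outside the fences are the outliers
outliers = df.loc[(df['weight'] < low) | (df['weight'] > high), 'weight'].values
print("THE OUTLIERS IN THE DATA SET WEIGHT:", outliers)


# (ii) IQR fences for the height column
q1 = df['height'].quantile(0.25)
q3 = df['height'].quantile(0.75)
iqr = q3 - q1
low = q1 - 1.5 * iqr
high = q3 + 1.5 * iqr
# Values outside the fences are the outliers
outliers = df.loc[(df['height'] < low) | (df['height'] > high), 'height'].values
print("THE OUTLIERS IN THE DATA SET HEIGHT:", outliers)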
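Since the weight and height checks repeat the same steps, they could be wrapped in one small helper. The sketch below is one possible way to do that. The function name `iqr_outliers`, its `k` parameter, and the toy dataframe are hypothetical and not part of the original exercise; the real height_weight.csv is not read here.

```python
import pandas as pd

def iqr_outliers(frame, column, k=1.5):
    """Return the values of `column` that fall outside the IQR fences."""
    q1 = frame[column].quantile(0.25)
    q3 = frame[column].quantile(0.75)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    mask = (frame[column] < low) | (frame[column] > high)
    return frame.loc[mask, column]

# Made-up data standing in for height_weight.csv (illustration only).
toy = pd.DataFrame({
    "height": [150, 155, 158, 160, 162, 165, 168, 170, 230],
    "weight": [50, 150, 58, 60, 62, 65, 68, 70, 55],
})

# Run the same check on each column of interest.
result = {col: iqr_outliers(toy, col) for col in ["height", "weight"]}
for col, found in result.items():
    print("Outliers in", col, ":", found.tolist())
```

Working per column avoids calling `quantile` on the whole dataframe, which can fail if the file also contains non-numeric columns.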


Contributors

karthi-govindharaju-ai, lavanyasit
