The Probability Density Function (PDF) - Lab

Introduction

In this lab, we will build visualizations known as density plots to estimate the probability density for a given set of data.

Objectives

You will be able to:

  • Calculate the PDF from a given dataset containing real-valued random variables
  • Plot density functions and comment on the shape of the plot
  • Plot density functions using seaborn

Let's get started!

We'll import all the required libraries for you for this lab.

# Import required libraries
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('ggplot')
import pandas as pd 

Import the dataset 'weight-height.csv' as a pandas DataFrame. Calculate the mean and standard deviation of height and weight for males and females separately.

Hint: Use your pandas DataFrame subsetting skills, such as .loc[], .iloc[], and groupby().
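The subsetting styles named in the hint can be sketched on a tiny made-up frame. This is not the lab solution: the column names `Gender`, `Height`, and `Weight` are assumptions about the CSV layout, and the numbers are invented for illustration.

```python
import pandas as pd

# Tiny made-up sample; column names are assumed, not read from weight-height.csv
toy = pd.DataFrame({
    "Gender": ["Male", "Male", "Male", "Female", "Female", "Female"],
    "Height": [68.0, 70.0, 72.0, 62.0, 64.0, 66.0],
    "Weight": [180.0, 190.0, 200.0, 130.0, 135.0, 140.0],
})

# Option 1: boolean-mask subsetting with .loc, one frame per group
male_toy = toy.loc[toy["Gender"] == "Male"]
female_toy = toy.loc[toy["Gender"] == "Female"]
print("Male Height mean:", male_toy["Height"].mean())
print("Male Height sd:", male_toy["Height"].std())

# Option 2: groupby computes every group/column statistic in one call
summary = toy.groupby("Gender")[["Height", "Weight"]].agg(["mean", "std"])
print(summary)
```

Note that pandas `.std()` uses the sample standard deviation (`ddof=1`) by default, while `np.std` defaults to `ddof=0`, so the two can differ slightly on the same data.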

data = None
male_df =  None
female_df =  None

  

# Male Height mean: 69.02634590621737
# Male Height sd: 2.8633622286606517
# Male Weight mean: 187.0206206581929
# Male Weight sd: 19.781154516763813
# Female Height mean: 63.708773603424916
# Female Height sd: 2.696284015765056
# Female Weight mean: 135.8600930074687
# Female Weight sd: 19.022467805319007

Plot overlapping normalized histograms for male and female heights. Use 10 bins (bins=10) and set the alpha level so that the overlap is visible.
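As a rough sketch of what a normalized, overlapping histogram looks like in matplotlib, the example below uses synthetic samples drawn with the height means and standard deviations quoted above rather than the real CSV. The sample size of 5000 and the alpha value are arbitrary assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Synthetic stand-ins for the real columns (assumed sample size of 5000)
male_heights = rng.normal(69.03, 2.86, 5000)
female_heights = rng.normal(63.71, 2.70, 5000)

fig, ax = plt.subplots()
# density=True rescales the bars so that each histogram has total area 1;
# alpha < 1 makes the bars translucent so the overlap stays visible
m_heights, m_edges, _ = ax.hist(male_heights, bins=10, density=True,
                                alpha=0.7, label="Male heights")
f_heights, f_edges, _ = ax.hist(female_heights, bins=10, density=True,
                                alpha=0.7, label="Female heights")
ax.legend()
```

With `density=True`, each bar height equals the bin count divided by (sample size × bin width), which is why the two groups stay comparable even if their sample sizes differ.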

# Record your observations - are these in line with your personal observations?

Write a function density() that takes in the observed values of a random variable and calculates the density function using np.histogram and interpolation. The function should return two lists carrying the x and y coordinates for plotting the density function.
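One possible reading of this prompt is sketched below, under stated assumptions: NumPy's histogram function is `np.histogram`, the bin midpoints serve as x coordinates, and `np.interp` fills in a finer grid by linear interpolation. The name `density_sketch`, the default of 10 bins, and the 100-point grid are all hypothetical choices, not the lab's reference solution.

```python
import numpy as np

def density_sketch(x, bins=10, points=100):
    """Approximate a density curve from a normalized histogram (illustrative only)."""
    # density=True returns bar heights that integrate to 1 over the bins
    heights, edges = np.histogram(x, bins=bins, density=True)
    # Use the midpoint of each bin as the x coordinate of its height
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Linearly interpolate between midpoints onto a finer, evenly spaced grid
    grid = np.linspace(centers[0], centers[-1], points)
    curve = np.interp(grid, centers, heights)
    return grid.tolist(), curve.tolist()

np.random.seed(5)
sample = np.random.normal(0, 0.1, 100)
xs, ys = density_sketch(sample)
```

Because the curve is built from only 10 bins, it is a coarse estimate. More bins give finer detail but a noisier curve on small samples.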

def density(x):
    
    pass



# Generate test data and test the function - uncomment to run the test
# np.random.seed(5)
# mu, sigma = 0, 0.1 # mean and standard deviation
# s = np.random.normal(mu, sigma, 100)
# x,y = density(s)
# plt.plot(x,y, label = 'test')
# plt.legend()

Add overlapping density plots for male and female heights to the histograms plotted earlier
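A minimal sketch of the overlay idea, again on synthetic heights drawn with the means and standard deviations quoted earlier (the sample size is assumed): `ax.hist` returns the bar heights and bin edges it drew, so a curve through the bin midpoints can be plotted on the same axes.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
# Synthetic stand-ins for the real height columns (assumed sample size)
samples = {
    "Male": rng.normal(69.03, 2.86, 5000),
    "Female": rng.normal(63.71, 2.70, 5000),
}

fig, ax = plt.subplots()
for name, sample in samples.items():
    # Reuse the heights and edges that ax.hist returns for the overlay curve
    heights, edges, _ = ax.hist(sample, bins=10, density=True,
                                alpha=0.5, label=f"{name} heights")
    centers = 0.5 * (edges[:-1] + edges[1:])
    ax.plot(centers, heights, label=f"{name} density")
ax.legend()
```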

# Your code here

Repeat the above exercise for male and female weights

# Your code here 

Write your observations in the cell below.

# Record your observations - are these in line with your personal observations?


# So what's the takeaway when comparing male and female heights and weights?

Repeat the above experiments in seaborn and compare with your results.

[Figure: plot titled "Comparing weights"]

[Figure: plot titled "Comparing Weights"]

# Your comments on the two approaches here.
# Are they similar? If not, what makes them different?

Summary

In this lab, we saw how to build probability density curves for a given dataset and compare distributions visually by looking at their spread, center, and overlap. This is a useful exploratory data analysis (EDA) technique that can answer some initial questions before embarking on a more complex analysis.
