The ds-vis-stats_seaborn-readme from learn-co-curriculum

Statistical visualizations with `seaborn`

Scatter plots, as we briefly saw in our introductory lessons and labs, display the values of 2 sets of data on 2 dimensions. Each dot represents an observation. Its position on the X (horizontal) and Y (vertical) axes represents the values of the 2 variables. Scatter plots are useful for studying the relationship between different variables. It is common to convey even more information using colors or shapes (to show groups, or a third variable).

The sample scatter plot above shows the relationship between income and health index for a given sample of data. The overall trend suggests that an increase in income may have some effect on the health of individuals.

Creating Scatter Plots in Seaborn

As seen earlier, scatter plots are simple to draw in matplotlib using the .scatter() method. In this lesson, we shall use a different plotting library available in Python called seaborn. Seaborn provides a high-level interface for drawing attractive and informative statistical graphics. Here is some of the functionality that seaborn offers out of the box:

An API for examining relationships between multiple variables
Support for using categorical variables and their aggregate statistics
Visualizing univariate or bivariate distributions
High-level abstractions for structuring multi-plot grids
Concise control over matplotlib figure styling with several built-in themes
Tools for choosing color palettes that faithfully reveal patterns in your data

Let's focus on scatter plots for now. In order to use seaborn, we first need to import it alongside matplotlib as shown below:

import matplotlib.pyplot as plt
import seaborn as sns

Seaborn comes packaged with a number of datasets for practice and exploration. We shall import the famous iris dataset for drawing our scatter plots in this lesson.

# Load the iris dataset into a pandas dataframe
iris_data = sns.load_dataset('iris')
# View the head of dataset
iris_data.head()

	sepal_length	sepal_width	petal_length	petal_width	species
0	5.1	3.5	1.4	0.2	setosa
1	4.9	3.0	1.4	0.2	setosa
2	4.7	3.2	1.3	0.2	setosa
3	4.6	3.1	1.5	0.2	setosa
4	5.0	3.6	1.4	0.2	setosa

A detailed list of all the datasets included with seaborn is available at this GitHub repository.

With seaborn, scatter plots can be made using the regplot() function (comparable to .scatter() in matplotlib). Here is an example showing regplot() in action with the most basic settings. The function needs two sequences of values (for example lists or pandas Series) giving the positions of the points on the X and Y axes. By default it also draws a linear regression fit, which we shall remove with fit_reg=False.

Let's draw a scatter plot between the sepal length and sepal width using regplot.

# Use seaborn to draw a scatter plot between the sepal length and sepal width columns from the dataset
sns.regplot(x=iris_data["sepal_length"], y=iris_data["sepal_width"], fit_reg=False);
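
As a rough illustration of what fit_reg controls, the sketch below draws the same points twice: once with fit_reg=False and once with the default regression fit. It passes column names together with the data argument instead of two Series. The small hand-typed `sample` DataFrame is an assumption made so the snippet runs without downloading anything; with the lesson's data you would pass iris_data instead. ci=None is set only to skip the confidence band and keep the comparison simple.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hand-typed stand-in for iris_data (assumption: a few illustrative rows only)
sample = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7, 4.6, 5.0, 7.0, 6.4, 6.3, 5.8],
    "sepal_width":  [3.5, 3.0, 3.2, 3.1, 3.6, 3.2, 3.2, 3.3, 2.7],
})

fig, (ax_points, ax_fit) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: points only, no regression line
sns.regplot(data=sample, x="sepal_length", y="sepal_width",
            fit_reg=False, ax=ax_points)
ax_points.set_title("fit_reg=False")

# Right panel: default behaviour, points plus a linear fit
# (ci=None skips the bootstrapped confidence band)
sns.regplot(data=sample, x="sepal_length", y="sepal_width",
            ci=None, ax=ax_fit)
ax_fit.set_title("fit_reg=True (default)")

fig.tight_layout()
# In a notebook the figure renders automatically; in a script,
# call plt.show() or fig.savefig(...) to see it.
```

Passing ax explicitly is what lets both plots share one figure; without it, regplot draws on the current axes.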

Seaborn offers custom coloring and further customization of scatter plots with shapes. The scatter_kws argument can be used to specify the size, color and transparency of the markers, as shown in the example below:

# More marker customization:
sns.regplot(x=iris_data["sepal_length"], 
            y=iris_data["sepal_width"], 
            fit_reg=False, 
            scatter_kws={"color":"darkblue",
                         "alpha":0.3,
                         "s":200} )
plt.show()
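
The marker shape can be changed as well. A minimal sketch, assuming the same kind of hand-typed `sample` DataFrame as a stand-in for iris_data: regplot() takes the shape through its own marker parameter, while the entries of scatter_kws are forwarded to matplotlib's scatter call, where "s" is the marker area in points squared.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs without a display; omit in a notebook
import pandas as pd
import seaborn as sns

# Hand-typed stand-in for iris_data (assumption: a few illustrative rows only)
sample = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7, 4.6, 5.0, 7.0, 6.4, 6.3, 5.8],
    "sepal_width":  [3.5, 3.0, 3.2, 3.1, 3.6, 3.2, 3.2, 3.3, 2.7],
})

ax = sns.regplot(
    data=sample,
    x="sepal_length",
    y="sepal_width",
    fit_reg=False,
    marker="D",                       # diamond markers instead of the default circles
    scatter_kws={"color": "darkblue",
                 "alpha": 0.3,        # transparency: 0 is invisible, 1 is opaque
                 "s": 200},           # marker area in points squared
)

# regplot returns the matplotlib Axes, so the drawn points can be inspected
points = ax.collections[0]
print(points.get_alpha(), points.get_sizes())
```

Because regplot() returns a regular matplotlib Axes, any further styling (titles, limits, labels) can be done with the usual Axes methods.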

Once you understand how to make a basic scatter plot with seaborn and how to customize shapes and colors, you will probably want the color to correspond to a categorical variable (a group). This is possible using the hue argument of lmplot(): it’s here that you specify the column used to map the color.

# Use the 'hue' argument to provide a factor variable
sns.lmplot(x="sepal_length", 
           y="sepal_width", 
           data=iris_data, 
           fit_reg=False, 
           hue='species', 
           legend=True);
plt.show()
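
Groups can be distinguished by shape as well as color. The sketch below is one way to do that, assuming a hand-typed `sample` DataFrame with three rows per species as a stand-in for iris_data. It uses lmplot()'s markers parameter, which takes one marker per hue level, and then sets the labels through the returned grid object; the label and title strings are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs without a display; omit in a notebook
import pandas as pd
import seaborn as sns

# Hand-typed stand-in for iris_data (assumption: a few illustrative rows per species)
sample = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7, 7.0, 6.4, 6.9, 6.3, 5.8, 7.1],
    "sepal_width":  [3.5, 3.0, 3.2, 3.2, 3.2, 3.1, 3.3, 2.7, 3.0],
    "species": ["setosa"] * 3 + ["versicolor"] * 3 + ["virginica"] * 3,
})

# One color and one marker shape per species
grid = sns.lmplot(
    data=sample,
    x="sepal_length",
    y="sepal_width",
    hue="species",
    markers=["o", "s", "D"],   # must contain one entry per hue level
    fit_reg=False,
    legend=True,
)

# lmplot returns a FacetGrid rather than an Axes, so labels go through the grid
grid.set_axis_labels("Sepal length (cm)", "Sepal width (cm)")
ax = grid.axes[0, 0]           # the single Axes inside the grid
ax.set_title("Sepal size by species")
```

Recent seaborn releases (0.9 and later) also provide sns.scatterplot(), an axes-level function that accepts hue and style directly; it is not used in this lesson.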
