Friday, February 3, 2017

Visualising univariate data using seaborn



In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
import warnings
warnings.filterwarnings('ignore')
%config InlineBackend.figure_format = 'retina'
In [4]:
cars = pd.read_csv("http://web.pdx.edu/~gerbing/data/cars.csv")
In [9]:
cars.columns = ["model","mpg","cyl","engine","hp","weight","accelerate","year","origin"]
In [10]:
cars.head()
Out[10]:
                     model   mpg  cyl  engine   hp  weight  accelerate  year    origin
0       amc ambassador dpl  15.0    8   390.0  190    3850         8.5    70  American
1              amc gremlin  21.0    6   199.0   90    2648        15.0    70  American
2               amc hornet  18.0    6   199.0   97    2774        15.5    70  American
3            amc rebel sst  16.0    8   304.0  150    3433        12.0    70  American
4  buick estate wagon (sw)  14.0    8   455.0  225    3086        10.0    70  American

Strip plots

(A single continuous variable plotted against a categorical variable)
In [16]:
plt.figure(figsize=(10,8))
plt.subplot(2,1,1)
sns.stripplot(x='cyl', y='hp', data=cars)
Out[16]:
<matplotlib.axes._subplots.AxesSubplot at 0x123de9828>
In [17]:
plt.figure(figsize=(10,8))
plt.subplot(2,1,2)
sns.stripplot(x='cyl', y='hp', data=cars, jitter=True, size=5)
Out[17]:
<matplotlib.axes._subplots.AxesSubplot at 0x1240b2e10>
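
The effect of jitter=True in the cell above can be pictured as follows: every point keeps its value, but its position on the categorical axis is shifted by a small random offset, so points with equal values no longer sit exactly on top of each other. The sketch below is only an illustration with NumPy, not seaborn's own implementation; the function name add_jitter, the 0.1 half-width and the sample values are assumptions made for the example.

```python
import numpy as np

def add_jitter(positions, width=0.1, seed=0):
    """Shift each categorical position by a uniform random offset in [-width, width]."""
    rng = np.random.default_rng(seed)
    positions = np.asarray(positions, dtype=float)
    return positions + rng.uniform(-width, width, size=positions.shape)

# Hypothetical data: five cars in category slot 0 and three in slot 1.
slots = np.array([0, 0, 0, 0, 0, 1, 1, 1])
hp = np.array([90, 90, 97, 90, 97, 150, 150, 190])

x_plain = slots.astype(float)   # without jitter: equal hp values overlap exactly
x_jittered = add_jitter(slots)  # with jitter: same hp, slightly different x

# Count how many distinct (x, hp) positions would actually be visible.
plain_positions = len(set(zip(x_plain.tolist(), hp.tolist())))
jittered_positions = len(set(zip(x_jittered.tolist(), hp.tolist())))
print(plain_positions, jittered_positions)
```

Without jitter the eight cars collapse onto four visible positions, while the jittered version keeps all eight apart. The hp values themselves are never changed; only the placement along the categorical axis moves.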

Swarm plots

  • A single continuous variable plotted against a categorical variable
  • Points are spread out along the categorical axis so that they do not overlap, which makes the distribution easier to read
In [19]:
plt.figure(figsize=(10,8))
plt.subplot(2,1,1)
sns.swarmplot(x='cyl', y='hp', data=cars)
Out[19]:
<matplotlib.axes._subplots.AxesSubplot at 0x124395cc0>
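
To make the idea of "spreading out" concrete, here is a minimal sketch of a beeswarm-style layout in plain Python. It is a simplified illustration and not the algorithm seaborn uses: points are placed in order of increasing value, and each one takes the sideways offset closest to the centre line that keeps it at least one marker diameter away from every point placed before it. The function name swarm_offsets, the step size and the sample values are assumptions made for the example.

```python
import math

def swarm_offsets(values, diameter=1.0, step=0.25):
    """Return one sideways offset per value so that no two markers overlap."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    placed = []                      # (offset, value) of points already positioned
    offsets = [0.0] * len(values)
    for i in order:
        k = 0
        while True:
            # Candidates alternate around the centre: 0, +step, -step, +2*step, ...
            candidates = [0.0] if k == 0 else [k * step, -k * step]
            free = [c for c in candidates
                    if all(math.hypot(c - ox, values[i] - oy) >= diameter
                           for ox, oy in placed)]
            if free:
                offsets[i] = free[0]
                placed.append((free[0], values[i]))
                break
            k += 1
    return offsets

# Hypothetical horsepower-like values, already scaled to marker-diameter units.
hp_scaled = [1.0, 1.0, 1.2, 1.0, 3.0, 1.1]
offsets = swarm_offsets(hp_scaled)
print(offsets)
```

The isolated value (3.0) stays on the centre line, while the cluster around 1.0 fans out sideways. This is also why a swarm gets wider as more points share similar values.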
In [22]:
plt.figure(figsize=(10,10))
plt.subplot(2,1,2)
sns.swarmplot(x='hp', y='cyl', data=cars, hue='origin', orient='h')
Out[22]:
<matplotlib.axes._subplots.AxesSubplot at 0x124aced30>
Both strip plots and swarm plots become hard to read on large datasets because of overplotting: points pile up on top of each other and the shape of the distribution is lost.
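
A rough way to see the overplotting problem is to count how many points would land on a position that is already occupied. The sketch below does this for a strip plot without jitter, using synthetic data rather than the cars dataset; the three cylinder categories, the integer horsepower range of 50 to 230 and the function name hidden_fraction are all assumptions made for the example.

```python
import numpy as np

def hidden_fraction(n_points, seed=0):
    """Fraction of points drawn at a position already occupied by another point."""
    rng = np.random.default_rng(seed)
    # Assumed ranges: 3 cylinder categories, integer horsepower from 50 to 230.
    cyl = rng.choice([4, 6, 8], size=n_points)
    hp = rng.integers(50, 231, size=n_points)
    distinct = len(set(zip(cyl.tolist(), hp.tolist())))
    return 1 - distinct / n_points

small = hidden_fraction(100)
large = hidden_fraction(10_000)
print(f"hidden at n=100: {small:.0%}, hidden at n=10000: {large:.0%}")
```

With these assumed ranges there are only 3 × 181 = 543 possible positions, so at 10,000 points at least 94% of the markers are hidden behind others. Jitter and swarming postpone the problem but do not remove it, which is where density-based plots such as violin plots help.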

Violin plots

  • Similar to box plots, but they also show a kernel density estimate of the distribution for each category
In [26]:
plt.figure(figsize=(10,10))
plt.subplot(2,1,1)
sns.violinplot(x='cyl', y='hp', data=cars)
Out[26]:
<matplotlib.axes._subplots.AxesSubplot at 0x125ae4208>
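
The outline of each violin is a kernel density estimate (KDE), and the box-plot style summary is drawn inside it. The sketch below computes both pieces by hand for one group, as an illustration rather than seaborn's actual code. The sample values and the function name gaussian_kde_on_grid are made up for the example, and the use of Scott's rule for the bandwidth is an assumption that should be checked against the seaborn version in use.

```python
import numpy as np

def gaussian_kde_on_grid(samples, grid):
    """Return (density, bandwidth) of a Gaussian KDE evaluated on `grid`.

    Bandwidth follows Scott's rule for 1-D data: h = std * n**(-1/5).
    """
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    h = samples.std(ddof=1) * n ** (-1 / 5)
    # One Gaussian bump per sample, averaged over all samples.
    z = (grid[:, None] - samples[None, :]) / h
    density = np.exp(-0.5 * z ** 2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))
    return density, h

# Hypothetical horsepower values for one cylinder group.
hp = np.array([90, 97, 100, 105, 110, 95, 88, 150, 120, 92], dtype=float)

grid = np.linspace(hp.min() - 60, hp.max() + 60, 1001)
density, bandwidth = gaussian_kde_on_grid(hp, grid)

# Box-plot summary that a violin draws inside the density outline.
q1, median, q3 = np.percentile(hp, [25, 50, 75])

# Sanity check: the density should integrate to roughly 1 over the grid.
area = density.sum() * (grid[1] - grid[0])
print(f"bandwidth={bandwidth:.1f}, median={median}, area={area:.3f}")
```

Mirroring the density curve around the category position gives the violin shape, so a wide section of the violin simply means that many cars have horsepower near that value.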

Combining strip plot and violin plot

In [27]:
plt.figure(figsize=(10,10))
plt.subplot(2,1,2)
sns.violinplot(x='cyl', y='hp', data=cars, inner=None, color='lightgray')
# Overlay a strip plot on the violin plot
sns.stripplot(x='cyl', y='hp', data=cars, jitter=True, size=1.5)
Out[27]:
<matplotlib.axes._subplots.AxesSubplot at 0x126752278>
