Friday, February 3, 2017

Visualising Multivariate data using seaborn



In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from mpl_toolkits.basemap import Basemap
%matplotlib inline
import warnings
warnings.filterwarnings('ignore')
In [2]:
cars = pd.read_csv("http://web.pdx.edu/~gerbing/data/cars.csv")
In [3]:
cars.columns = ["model","mpg","cyl","engine","hp","weight","accelerate","year","origin"]
In [4]:
cars.head()
Out[4]:
                     model   mpg  cyl  engine   hp  weight  accelerate  year    origin
0       amc ambassador dpl  15.0    8   390.0  190    3850         8.5    70  American
1              amc gremlin  21.0    6   199.0   90    2648        15.0    70  American
2               amc hornet  18.0    6   199.0   97    2774        15.5    70  American
3            amc rebel sst  16.0    8   304.0  150    3433        12.0    70  American
4  buick estate wagon (sw)  14.0    8   455.0  225    3086        10.0    70  American

Joint plot

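A joint plot visualises the bivariate distribution of two variables: the central panel shows how they vary together, and the margins show the distribution of each variable on its own.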
In [5]:
sns.jointplot(x='hp', y='mpg', data=cars, height=8)  # 'size' was renamed to 'height' in seaborn 0.9
Out[5]:
<seaborn.axisgrid.JointGrid at 0x11a8bdb70>

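The 'kind' parameter of jointplot

The kind parameter selects how the joint and marginal panels are drawn. The values covered here are scatter (the default), reg, resid, kde and hex.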
In [6]:
sns.jointplot(x='hp', y='mpg', data=cars, kind='hex')
Out[6]:
<seaborn.axisgrid.JointGrid at 0x11d62d390>
In [7]:
sns.jointplot(x='hp', y='mpg', data=cars, kind='reg')
Out[7]:
<seaborn.axisgrid.JointGrid at 0x11d62dcc0>

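pairplot()

A pair plot draws a scatter plot for every pair of numeric columns in the data frame and shows each column's own distribution on the diagonal.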
In [8]:
sns.pairplot(cars)
Out[8]:
<seaborn.axisgrid.PairGrid at 0x11dbe4828>
In [9]:
sns.pairplot(cars, hue='origin', kind='reg')
Out[9]:
<seaborn.axisgrid.PairGrid at 0x120531128>