
Displaying Pair Plot In Pandas Data Frame

I am trying to display a pair plot created with scatter_matrix from a pandas DataFrame. This is how the pair plot is created: # Create dataframe from data in X_train # Label the co

Solution 1:

This code worked for me using Python 3.5.2:

import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn import datasets

iris_dataset = datasets.load_iris()
X = iris_dataset.data
Y = iris_dataset.target

iris_dataframe = pd.DataFrame(X, columns=iris_dataset.feature_names)

# Create a scatter matrix from the dataframe, colored by the class labels in Y
grr = pd.plotting.scatter_matrix(iris_dataframe, c=Y, figsize=(15, 15), marker='o',
                                 hist_kwds={'bins': 20}, s=60, alpha=.8)

For pandas versions before 0.20.0, call pd.scatter_matrix instead. Thanks to michael-szczepaniak for pointing out that this older API has been deprecated.

grr = pd.scatter_matrix(iris_dataframe, c=Y, figsize=(15, 15), marker='o',
                        hist_kwds={'bins': 20}, s=60, alpha=.8)

I just had to remove the cmap=mglearn.cm3 argument, because I could not get mglearn to work; there is a version mismatch with sklearn.
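If you still want one fixed color per class without mglearn, one option is to pass a matplotlib ListedColormap as cmap. This is only a sketch: the three colors are arbitrary picks and are not claimed to match mglearn.cm3, and the small random frame is a stand-in for the iris features.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs anywhere; drop this line in a notebook
import numpy as np
import pandas as pd
from matplotlib.colors import ListedColormap

# Three arbitrary colors, one per class (an assumption, not mglearn's palette).
three_class_cmap = ListedColormap(["#1f77b4", "#d62728", "#2ca02c"])

# Synthetic stand-in for the iris features and their class labels.
rng = np.random.default_rng(0)
features = pd.DataFrame(rng.normal(size=(30, 3)), columns=["a", "b", "c"])
labels = np.repeat([0, 1, 2], 10)

# Extra keyword arguments such as c and cmap are forwarded to the scatter calls.
axes = pd.plotting.scatter_matrix(features, c=labels, cmap=three_class_cmap,
                                  figsize=(6, 6), marker='o',
                                  hist_kwds={'bins': 20}, s=60, alpha=.8)
```

With the real data, replace features and labels with iris_dataframe and Y.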

To save the image directly to a file instead of displaying it, call this after creating the plot:

plt.savefig('foo.png')

Also remove (or comment out) the notebook magic:

# %matplotlib inline
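Put together, a save-only script might look like the sketch below. Selecting the non-interactive Agg backend is my own addition, not part of the original answer; the random frame stands in for the iris data, and the output path in the temp directory is just an example location.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen: no window and no inline output
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the iris features and labels.
rng = np.random.default_rng(0)
frame = pd.DataFrame(rng.normal(size=(30, 3)), columns=["a", "b", "c"])
labels = np.repeat([0, 1, 2], 10)

axes = pd.plotting.scatter_matrix(frame, c=labels, figsize=(6, 6), marker='o',
                                  hist_kwds={'bins': 20}, s=60, alpha=.8)

# Save the figure that owns the axes, then release it.
out_path = os.path.join(tempfile.gettempdir(), "foo.png")
fig = axes[0, 0].figure
fig.savefig(out_path)
plt.close(fig)
```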


Solution 2:

Just an update to Vikash's excellent answer. The last two lines should now be:

grr = pd.plotting.scatter_matrix(iris_dataframe, c=Y, figsize=(15, 15), marker='o',
                                 hist_kwds={'bins': 20}, s=60, alpha=.8)

The scatter_matrix function has been moved to the pandas.plotting module, so the original answer, while correct at the time, now uses a deprecated call.

So the complete code would now be:

import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn import datasets

iris_dataset = datasets.load_iris()
X = iris_dataset.data
Y = iris_dataset.target

iris_dataframe = pd.DataFrame(X, columns=iris_dataset.feature_names)
# Create a scatter matrix from the dataframe, colored by the class labels in Y
grr = pd.plotting.scatter_matrix(iris_dataframe, c=Y, figsize=(15, 15), marker='o',
                                 hist_kwds={'bins': 20}, s=60, alpha=.8)
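If the same script has to run on both old and new pandas installs, a guarded import is one way to cope with the move. This is a sketch: the fallback module path pandas.tools.plotting is my assumption about where the function lived before 0.20, so check it against the version you actually have.

```python
try:
    # Location since pandas 0.20
    from pandas.plotting import scatter_matrix
except ImportError:
    # Assumed pre-0.20 location; verify against your installed version
    from pandas.tools.plotting import scatter_matrix
```

After this, call scatter_matrix(iris_dataframe, c=Y, ...) directly instead of going through pd.plotting.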

Solution 3:

This is also possible using seaborn:

import seaborn as sns

df = sns.load_dataset("iris")
sns.pairplot(df, hue="species")

Seaborn pairplot of iris data
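If your data comes from sklearn arrays rather than sns.load_dataset, you need a label column for hue. A minimal sketch of building one with pd.Categorical.from_codes follows; the tiny arrays and name lists are illustrative stand-ins for iris_dataset.data, .target, .feature_names and .target_names.

```python
import numpy as np
import pandas as pd

# Illustrative stand-ins for the sklearn iris arrays and name lists.
X = np.array([[5.1, 3.5], [7.0, 3.2], [6.3, 3.3]])
Y = np.array([0, 1, 2])
feature_names = ["sepal length (cm)", "sepal width (cm)"]
target_names = ["setosa", "versicolor", "virginica"]

df = pd.DataFrame(X, columns=feature_names)
# Map the integer class codes to readable species names.
df["species"] = pd.Categorical.from_codes(Y, categories=target_names)
```

The resulting frame can then be passed to sns.pairplot(df, hue="species") as above.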

Solution 4:

I finally know how to do it with PyCharm.

Just import matplotlib.pyplot as plt and call plt.show() at the end:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import mglearn
from pandas.plotting import scatter_matrix

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris_dataset = load_iris()

X_train, X_test, Y_train, Y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0)
iris_dataframe = pd.DataFrame(X_train, columns=iris_dataset.feature_names)

# Color the points by the training labels
grr = scatter_matrix(iris_dataframe, c=Y_train, figsize=(15, 15), marker='o',
                     hist_kwds={'bins': 20}, s=60, alpha=.8, cmap=mglearn.cm3)
# Outside a notebook, nothing is drawn until plt.show() is called
plt.show()

Then it works perfectly, as shown below:

Plot image
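The value returned by scatter_matrix (grr above) is a NumPy array of matplotlib Axes, one per panel, so you can reach the figure through it before showing or saving. A small sketch on a random three-column frame (a stand-in for the iris data):

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs anywhere; drop this line in an IDE or notebook
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
frame = pd.DataFrame(rng.normal(size=(30, 3)), columns=["a", "b", "c"])

grr = pd.plotting.scatter_matrix(frame, figsize=(6, 6))

# One Axes per (row, column) pair of variables, all on the same figure.
fig = grr[0, 0].figure
fig.suptitle("Pair plot")
```

In a script run from an IDE, follow this with plt.show() to open the window.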

Solution 5:

First of all, run pip install mglearn, then import mglearn.

The code will look like this:

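from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd
import mglearn
import matplotlib.pyplot as plt

# Load the data and split it so that X_train and y_train are defined
iris_dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0)

iris_dataframe = pd.DataFrame(X_train, columns=iris_dataset.feature_names)
# pd.scatter_matrix is deprecated; use pd.plotting.scatter_matrix (pandas >= 0.20)
grr = pd.plotting.scatter_matrix(iris_dataframe,
                  c=y_train, figsize=(15, 15), marker='o', hist_kwds={'bins': 20},
                  s=60, alpha=.8, cmap=mglearn.cm3)
plt.show()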
