How to Extract Data from Matplotlib Plot

Retrieve XY data from matplotlib figure

This works:

In [1]: import matplotlib.pyplot as plt

In [2]: plt.plot([1,2,3],[4,5,6])
Out[2]: [<matplotlib.lines.Line2D at 0x30b2b10>]

In [3]: ax = plt.gca() # get axis handle

In [4]: line = ax.lines[0] # get the first line, there might be more

In [5]: line.get_xdata()
Out[5]: array([1, 2, 3])

In [6]: line.get_ydata()
Out[6]: array([4, 5, 6])

In [7]: line.get_xydata()
Out[7]:
array([[ 1., 4.],
       [ 2., 5.],
       [ 3., 6.]])

I found these by digging around in the axis object. I could only find minimal information about these functions; apparently you can give them a boolean flag to get either the original or the processed data, but I am not sure what that means.
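
On the boolean flag: in current matplotlib, get_xdata and get_ydata accept an orig argument that defaults to True. As far as I can tell, orig=True returns the data as it was passed to plot, while orig=False returns it after unit conversion. Here is a minimal sketch, assuming string (categorical) x values so the difference is visible; the Agg backend line is my own addition so that it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# String x values are plotted as categories, so they go through unit conversion
(line,) = ax.plot(["a", "b", "c"], [4, 5, 6])

original = line.get_xdata(orig=True)    # the values as they were passed to plot
processed = line.get_xdata(orig=False)  # the values after conversion to floats

print(original)
print(processed)
```

For plain numeric input like the [1, 2, 3] above, the two calls should hold the same numbers, which is probably why the flag looks like it does nothing.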

Edit: Joe Kington showed a slightly neater way to do this:

In [1]: import matplotlib.pyplot as plt

In [2]: lines = plt.plot([1,2,3],[4,5,6],[7,8],[9,10])

In [3]: lines[0].get_data()
Out[3]: (array([1, 2, 3]), array([4, 5, 6]))

In [4]: lines[1].get_data()
Out[4]: (array([7, 8]), array([ 9, 10]))
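
If the figure holds several lines, the same get_data() call can be applied to every line on the axes. A small sketch, assuming each line was given a label when it was plotted (the labels "first" and "second" are made up for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [4, 5, 6], label="first")
ax.plot([7, 8], [9, 10], label="second")

# Map each line's label to its (x, y) arrays
data = {line.get_label(): line.get_data() for line in ax.get_lines()}

for label, (x, y) in data.items():
    print(label, x, y)
```

Keying by label is only a convenience; a plain list of line.get_data() results works just as well when the order of the lines is known.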

How to extract data from matplotlib plot

Jakub is right about modifying the Python script to write out the data directly from the source from which it was sent into the plot; that's the way I'd prefer to do this. But for reference, if you do need to get data out of a plot, I think this should do it (with matplotlib.pyplot imported as plt, and n the index of the line you want):

plt.gca().get_lines()[n].get_xydata()

Alternatively, you can get the x and y data sets separately:

line = plt.gca().get_lines()[n]
xd = line.get_xdata()
yd = line.get_ydata()
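
For the preferred route, writing the data out at the point where it is plotted, something like the following might do. It is only a sketch: the arrays are sample values, and the in-memory buffer stands in for a real file (a name such as "plot_data.csv" would be my invention, not something from the question).

```python
import io

import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

fig, ax = plt.subplots()
ax.plot(x, y)

# Save the same arrays that went into the plot; a real script would pass a file name
buffer = io.StringIO()
np.savetxt(buffer, np.column_stack([x, y]), delimiter=",", header="x,y", comments="")

# Read the data back to check the round trip
buffer.seek(0)
restored = np.loadtxt(buffer, delimiter=",", skiprows=1)
print(restored)
```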

How to extract data from pyplot figure

I think the following code does what you want for a simple line plot:

import numpy as np
import matplotlib.pyplot as plt

def foo(xaxis, yaxis):
    fig, ax = plt.subplots(figsize=(20, 10), dpi=100)
    curve = ax.plot(xaxis, yaxis)
    # curve = [Line2D object]
    return curve[0].get_xdata(), curve[0].get_ydata()

x,y = foo(range(10), range(0,20,2))
print(x,y)

How to extract data from plt.imshow() or plt.matshow()?

To get the RGBA array of the image you plotted on a matplotlib axes, first grab the image object (here im3). Second, get its colormap (here ccmap). Finally, pass the data array (im3._A, also available through im3.get_array()) to ccmap, which expects values scaled to the range 0 to 1.

import matplotlib.cm as cm
import numpy as np
import matplotlib.pyplot as plt

data = np.random.random((10,10))
# imshow or matshow is OK
#im3 = plt.imshow(data, cmap="viridis_r") #any colormap will do
im3 = plt.matshow(data, cmap="viridis_r")
plt.axis('off')
#plt.savefig("question2.png",bbox_inches='tight',pad_inches=0)
plt.show()

# get the colormap used by the previous imshow()
ccmap = im3.get_cmap() #it is a function
print(ccmap.name) # 'viridis_r'

# get the image data ***YOU ASK FOR THIS***
# the data is scaled with the image's norm, then passed to the colormap to get RGBA values
img_rgba_array = ccmap(im3.norm(im3.get_array()))

# plot the image data
ax = plt.subplot(111)
ax.imshow(img_rgba_array)  # an RGBA array needs no cmap to plot
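
The same result should also be reachable through public methods only. The image object is a ScalarMappable, so get_array() returns its data and to_rgba() applies the norm and the colormap in one call. A sketch under those assumptions, with the Agg backend added by me so it runs without a display:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

data = np.random.random((10, 10))
fig, ax = plt.subplots()
im = ax.imshow(data, cmap="viridis_r")

values = im.get_array()    # the scalar data held by the image
rgba = im.to_rgba(values)  # applies the image's norm, then its colormap

# The same conversion done by hand, for comparison
manual = im.get_cmap()(im.norm(values))

print(rgba.shape)
```

The norm step matters whenever the data is not already spread over exactly 0 to 1, because imshow rescales the data range before colouring it.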


How to get data from matplotlib bar chart

  • matplotlib.pyplot.bar returns a BarContainer object.
    • The container is iterable, and each item in it is a matplotlib.patches.Rectangle.
    • The location components of each Rectangle can be extracted with the appropriate methods, as shown below; the matplotlib.patches.Rectangle documentation gives a full accounting of the Patch methods.

import matplotlib.pyplot as plt

rects = plt.bar([1,2,3], [4,5,6])

for rect in rects:
    print(rect)
    xy = rect.get_xy()
    x = rect.get_x()
    y = rect.get_y()
    height = rect.get_height()
    width = rect.get_width()

[out]:
Rectangle(xy=(0.6, 0), width=0.8, height=4, angle=0)
Rectangle(xy=(1.6, 0), width=0.8, height=5, angle=0)
Rectangle(xy=(2.6, 0), width=0.8, height=6, angle=0)
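
Note from the output that get_x() is the left edge of each bar, not the x value passed to bar. To recover the original x positions and heights as arrays, a sketch like this should work, assuming the default center alignment of the bars:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
rects = ax.bar([1, 2, 3], [4, 5, 6])

# get_x() is the left edge, so add half the width to recover the bar center
centers = np.array([r.get_x() + r.get_width() / 2 for r in rects])
heights = np.array([r.get_height() for r in rects])

print(centers)
print(heights)
```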

Extract and plot data from a file in python

You could read and plot as follows:

import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv('xyz.dat', sep=' ', header=None, usecols=(1, 2, 3)).T
df.plot(lw=0.5)
plt.show()

This would give you output:

[Figure: matplotlib output]

Your 3 rows contain very similar data.


If the number of columns is unknown, you could just read it in normally and drop the first CSV column (actually first row after being transposed):

df = pd.read_csv('xyz.dat', sep=' ', header=None).T.drop(0)
df.plot(lw=0.5)
plt.show()
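
To see what the transpose and drop do, here is a small self-contained sketch. The contents of xyz.dat are not shown in the question, so the sample text below is a made-up stand-in with a numeric first column followed by the values of each row:

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for xyz.dat: an id column, then the values of each row
sample = io.StringIO("0 1.0 2.0 3.0\n1 1.1 2.1 3.1\n2 0.9 1.9 2.9\n")

# After .T each file row becomes a column; drop(0) removes the former first column
df = pd.read_csv(sample, sep=" ", header=None).T.drop(0)
print(df)

ax = df.plot(lw=0.5)  # one line per original file row
```

If the first column held text rather than numbers, the transposed frame would end up with object dtype, and it would probably need converting (for example with astype(float)) before plotting.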

A python function to extract data within a griddata plot

The first thing you need is a boolean mask of your circle.
Then you can apply it to grid_z to isolate all the values inside the circle and compute their mean.

A better approach is to use the mask directly on grid_x and grid_y to interpolate the function only on the needed points.

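import numpy as np
from scipy.interpolate import griddata

# points, values, grid_x, grid_y and grid_z are the arrays from the question

# Compute the mask of a circle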
center_grid_x = 0.5
center_grid_y = 0.5
radius = 0.25
mask = (grid_x - center_grid_x) ** 2 + (grid_y - center_grid_y) ** 2 < radius **2

# The mask can be visualized with
# plt.imshow(mask)

# Apply the mask to grid_z and compute the mean
mean1 = np.mean(grid_z[mask])

# Or better compute only the values of points inside the circle
values_z = griddata(points, values, (grid_x[mask], grid_y[mask]), method='linear')
mean2 = np.mean(values_z)
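
For a version that runs on its own, the snippet can be combined with some sample data. Everything about the data here is my assumption, not the question's: scattered points on the unit square (with the corners added so the convex hull covers the whole grid) and the test function z = x + y, whose mean over a circle centered at (0.5, 0.5) should come out close to 1.

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)

# Assumed sample data: random points plus the four corners of the unit square
corners = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
points = np.vstack([corners, rng.random((500, 2))])
values = points[:, 0] + points[:, 1]  # test function z = x + y

# Regular grid and the full interpolation, as in the question's setup
grid_x, grid_y = np.mgrid[0:1:100j, 0:1:100j]
grid_z = griddata(points, values, (grid_x, grid_y), method="linear")

# Boolean mask of the circle
mask = (grid_x - 0.5) ** 2 + (grid_y - 0.5) ** 2 < 0.25 ** 2

# Mean from the full grid, then from interpolating only inside the circle
mean1 = np.mean(grid_z[mask])
values_z = griddata(points, values, (grid_x[mask], grid_y[mask]), method="linear")
mean2 = np.mean(values_z)

print(mean1, mean2)
```

Both means use the same grid points, so they should agree; the second form just avoids interpolating the points outside the circle.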

