Use Index in Pandas to Plot Data

You can use reset_index to turn the index back into a column:

monthly_mean.reset_index().plot(x='index', y='A')

Look at monthly_mean.reset_index() by itself: the date is no longer in the index but is a column in the DataFrame, which is now indexed by plain integers. The documentation for reset_index shows how to get more control over the process, including assigning sensible names to the index.
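As a minimal sketch of the naming point, assuming a small stand-in for monthly_mean (the data below is hypothetical, not from the question): an unnamed index comes back as a column called 'index', while naming it first with rename_axis gives the column a meaningful name.

```python
import pandas as pd

# Hypothetical stand-in for monthly_mean: unnamed DatetimeIndex, one column 'A'
monthly_mean = pd.DataFrame(
    {"A": [1.0, 2.5, 4.0]},
    index=pd.to_datetime(["2020-01-31", "2020-02-29", "2020-03-31"]),
)

# An unnamed index becomes a column called 'index'
default_cols = list(monthly_mean.reset_index().columns)

# Naming the index first gives the new column that name instead
named = monthly_mean.rename_axis("date").reset_index()
named_cols = list(named.columns)

print(default_cols)
print(named_cols)
```

With the named version, the plot call would use x='date' rather than x='index'.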

How to plot the index column in pandas/matplotlib?

I have updated my answer based on your feedback so that it reproduces the issue more accurately.

With this:

import pandas as pd

df_topex = pd.read_csv(
    'datasets/TOPEX.dat',
    sep=r'\s+',                 # one or more whitespace characters as separator
    index_col=0,                # use the first column as the index
    names=["Time", "Anomaly"],  # name the columns
)

You've got something like this, where the column "Time" is the index:

    Time    Anomaly
--------- ---------
1992.9595 2.0000
1992.9866 3.0000
1993.0138 4.0000
1993.0409 5.0000
1993.0681 6.0000
1993.0952 7.0000
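To try this without the TOPEX.dat file, the same read_csv call can be pointed at an in-memory string. This is only a sketch: the six rows are the sample values shown above, and the exact spacing of the real file is an assumption.

```python
import io

import pandas as pd

# Sample rows copied from the table above; spacing in the real file may differ
raw = """1992.9595  2.0000
1992.9866  3.0000
1993.0138  4.0000
1993.0409  5.0000
1993.0681  6.0000
1993.0952  7.0000
"""

df_topex = pd.read_csv(
    io.StringIO(raw),
    sep=r"\s+",                 # one or more whitespace characters as separator
    index_col=0,                # use the first column as the index
    names=["Time", "Anomaly"],  # name the columns
)

print(df_topex.index.name)
print(df_topex.columns.tolist())
```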

To plot it, we can do the following, as you say. Just FYI, there is a known concern with the inplace approach (https://github.com/pandas-dev/pandas/issues/16529), though it is not a big deal for now:

df_topex.reset_index(inplace=True)
tabulate_df(df_topex)

A safer alternative is to reassign the result:

df_topex = df_topex.reset_index()

Either way, "Time" is now a column, ready to be used in a plot (note that "Time" does not appear to be in a date/time format):

            Time    Anomaly
------ --------- ---------
0 1992.9595 2.0000
1 1992.9866 3.0000
2 1993.0138 4.0000
3 1993.0409 5.0000
4 1993.0681 6.0000
5 1993.0952 7.0000

To plot it:

df_topex.plot(kind='scatter', x='Time', y='Anomaly', color='red')

Following on from your last question: we have the plot, but now we lose the advantages of having "Time" as the index, don't we?

An index has a significant performance impact when filtering millions of rows. You may want to keep "Time" as the index because you have, or foresee, a high volume of data. Plotting millions of points can be done (with data shading, for example) but is not very common. Filtering a DataFrame before plotting it is quite common, and at that point having an index on the column you filter by can really help; the plot normally comes after that.
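The filter-then-plot workflow might look like the sketch below. It assumes the sample rows shown earlier; the slice bounds 1993.0 and 1993.05 are arbitrary values chosen for illustration.

```python
import pandas as pd

# Sample data with "Time" as a sorted, unique float index
df = pd.DataFrame(
    {"Anomaly": [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]},
    index=pd.Index(
        [1992.9595, 1992.9866, 1993.0138, 1993.0409, 1993.0681, 1993.0952],
        name="Time",
    ),
)

# Label-based slice on the index: keeps rows with 1993.0 <= Time <= 1993.05
subset = df.loc[1993.0:1993.05]

# Only the (small) filtered result is turned back into columns for plotting
plot_ready = subset.reset_index()

print(plot_ready)
```

plot_ready can then be passed to plot(kind='scatter', x='Time', y='Anomaly') as before.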

So we can work in phases with different DataFrames, or do everything in one DataFrame by running the following right after the CSV import. This keeps the index available for filtering and lets us plot against the Time2 column at any time:

df_topex['Time2'] = df_topex.index

So we keep "Time" as index:

    Time    Anomaly      Time2
--------- --------- ---------
1992.9595 2.0000 1992.9595
1992.9866 3.0000 1992.9866
1993.0138 4.0000 1993.0138
1993.0409 5.0000 1993.0409
1993.0681 6.0000 1993.0681
1993.0952 7.0000 1993.0952
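As a side note, the copy is needed mainly because a scatter plot requires a named x column. A line-style plot uses the index as x by default, so a marker-only line is another option. The sketch below assumes the sample rows above and uses the non-interactive Agg backend only so that it runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line for on-screen plots
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame(
    {"Anomaly": [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]},
    index=pd.Index(
        [1992.9595, 1992.9866, 1993.0138, 1993.0409, 1993.0681, 1993.0952],
        name="Time",
    ),
)

# No x= argument: the index is used for the x-axis; style="o" draws markers only
ax = df.plot(y="Anomaly", style="o", color="red")

# The x values actually drawn are the index values
x_plotted = ax.get_lines()[0].get_xdata()
print(list(x_plotted))
```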

How to take advantage of indexing?
This post measures the performance of filtering over the index: What is the performance impact of non-unique indexes in pandas?

In short, you want the index to be unique, or at least sorted.

# Index types, from best to worst performance for filtering tasks:
# 1) unique
# 2) if not unique, at least sorted (monotonically increasing or decreasing)
# 3) worst combination: non-unique and unsorted

# Let's check:
print("Is unique?", df_topex.index.is_unique)
print("Is is_monotonic increasing?", df_topex.index.is_monotonic_increasing)
print("Is is_monotonic decreasing?", df_topex.index.is_monotonic_decreasing)

From the sample data:

Is unique? True
Is is_monotonic increasing? True
Is is_monotonic decreasing? False

If the index is not sorted, you can sort it with:

df_topex = df_topex.sort_index()
# Ready to go on filtering...
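To see the effect of sort_index on the monotonicity check, here is a small sketch with a deliberately shuffled index (the three rows are taken from the sample data, in a made-up order):

```python
import pandas as pd

# Same kind of data as above, but with the rows deliberately out of order
unsorted = pd.DataFrame(
    {"Anomaly": [4.0, 2.0, 3.0]},
    index=pd.Index([1993.0138, 1992.9595, 1992.9866], name="Time"),
)

was_sorted = unsorted.index.is_monotonic_increasing

# sort_index returns a new DataFrame ordered by the index
sorted_df = unsorted.sort_index()
is_sorted_now = sorted_df.index.is_monotonic_increasing

print(was_sorted, is_sorted_now)
```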

Hope it helps.

Scatter plot from dataframe with index on x-axis

This is kind of ugly (I think the matplotlib solution you used in your question is better, FWIW), but you can always create a temporary DataFrame with the index as a column using

df.reset_index()

If the index was nameless, the default name will be 'index'. Assuming this is the case, you could use

df.reset_index().plot(kind='scatter', x='index', y='columnA')
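The question's own matplotlib code is not reproduced here, so the following is only a guess at what such an approach might look like: pass the index straight to matplotlib's scatter, with no temporary DataFrame. The data and the column name columnA are placeholders.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line for on-screen plots
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder data with an unnamed integer index
df = pd.DataFrame({"columnA": [3, 1, 2]}, index=[10, 20, 30])

fig, ax = plt.subplots()

# The index supplies the x values directly
points = ax.scatter(df.index, df["columnA"])
ax.set_xlabel("index")
ax.set_ylabel("columnA")
```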

How to plot graph against index in python

To add on to the discussion, I am going to demonstrate three plotting methods (from most straightforward to least) using the first five points of your data frame.

Import libraries

import pandas as pd
import matplotlib.pyplot as plt

The sample dataframe

         1     2     3   4
foo
t0 95.00 95.5 75.5 85
t1 95.75 95.5 75.5 85
t2 96.50 95.5 75.5 85
t3 96.50 95.5 75.5 85
t4 96.50 95.5 75.5 85

Method 1

df.plot()
plt.show()

Method 2

df.reset_index().plot(x='foo')
plt.show()

Method 3

plt.plot(df.index,df)
plt.legend(df.columns)
plt.xlabel(df.index.name)
plt.show()

Output (the same for all methods)

Sample Image

Using a Pandas dataframe index as values for x-axis in matplotlib plot

It seems the issue was that I had .values. Without it (i.e. site2.index) the graph displays correctly.

Can I make a pie chart based on indexes in Python?

You can use matplotlib to plot the pie chart using dataframe and its indexes as labels of the chart:

import matplotlib.pyplot as plt
import pandas as pd

# The values must be numeric for plt.pie; the index holds the labels
example = pd.DataFrame(
    {'percentage': [70, 20, 10]},
    index=['Lasiogl', 'Centella', 'Osmia'],
)

# Pass a single (1D) column as the wedge sizes and the index as the labels
plt.pie(example['percentage'], labels=example.index, autopct='%1.1f%%')
plt.show()
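pandas can also draw the pie itself: Series.plot.pie uses the Series index as the wedge labels by default. A short sketch, using numeric versions of the three percentages and the non-interactive Agg backend so it runs without a display:

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line for on-screen plots
import matplotlib.pyplot as plt
import pandas as pd

shares = pd.Series(
    [70, 20, 10],
    index=["Lasiogl", "Centella", "Osmia"],
    name="percentage",
)

# The index values become the wedge labels automatically
ax = shares.plot.pie(autopct="%1.1f%%")

# Collect the text drawn on the chart (wedge labels plus percentage strings)
label_texts = [t.get_text() for t in ax.texts]
print(label_texts)
```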

DataFrame: how to draw a 3D graph using Index and Columns as x and y, and data as z?

You need to create a meshgrid from your row and col indices and then you can use plot_surface:

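import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.DataFrame([[150, 120, 170], [190, 160, 130]],
                  index=[2, 4], columns=[10, 30, 70])

# x varies along the columns, y along the rows; both have the same shape as df
x, y = np.meshgrid(df.columns, df.index)

fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
ax.plot_surface(x, y, df.to_numpy())  # z values come from the DataFrame body
plt.show()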

Sample Image
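The reason meshgrid is called with the columns first and the index second can be checked without plotting: the resulting x and y grids then have the same shape as the DataFrame, so each z value lines up with its own column and row label. A small sketch using the same data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[150, 120, 170], [190, 160, 130]],
                  index=[2, 4], columns=[10, 30, 70])

x, y = np.meshgrid(df.columns, df.index)
z = df.to_numpy()

# All three grids share the DataFrame's shape: (rows, columns)
print(x.shape, y.shape, z.shape)

# The cell in the last row and last column pairs column label 70 and row label 4
print(x[1, 2], y[1, 2], z[1, 2])
```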


