Right Way to Reverse a Pandas Dataframe

Right way to reverse a pandas DataFrame?


data.reindex(index=data.index[::-1])

or simply:

data.iloc[::-1]

will reverse your DataFrame. If you want a for loop that goes from the bottom row to the top, you can do:

for idx in reversed(data.index):
    print(idx, data.loc[idx, 'Even'], data.loc[idx, 'Odd'])

or

for idx in reversed(data.index):
    print(idx, data.Even[idx], data.Odd[idx])

You are getting an error because reversed first calls data.__len__(), which returns 6. It then tries to call data[j - 1] for j in range(6, 0, -1), so the first call is data[5]. In a pandas DataFrame, data[5] means column 5, and since there is no column 5 it throws an exception (see the Python docs for reversed).
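To see the failure described above in isolation, here is a minimal sketch. The sample frame is an assumption (six rows with 'Even' and 'Odd' columns, matching the loops above), and the names failed, labels and rows exist only for this illustration.

```python
import pandas as pd

# Hypothetical sample data: six rows, columns 'Even' and 'Odd'.
data = pd.DataFrame({'Even': [0, 2, 4, 6, 8, 10],
                     'Odd': [1, 3, 5, 7, 9, 11]})

# reversed(data) falls back to len(data) and data[5]; data[5] is a column
# lookup, and there is no column named 5, so iterating fails.
try:
    list(reversed(data))
    failed = False
except KeyError:
    failed = True

# Reversing the index instead yields the row labels from last to first.
labels = list(reversed(data.index))
rows = [(idx, data.loc[idx, 'Even'], data.loc[idx, 'Odd']) for idx in labels]
```

Iterating over reversed(data.index) sidesteps the column lookup entirely, because the index is an ordinary sequence of row labels.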

reverse dataframe's rows' order with pandas

Check out http://pandas.pydata.org/pandas-docs/stable/indexing.html

You can do

reversed_df = df.iloc[::-1]

Reversing the order of only the rows in Pandas Python

You can just use a reset_index to reset your index after you've reversed it.
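A short sketch of what that looks like in practice. The frame and the names reversed_df and renumbered are made up for the example; drop=True is used on the assumption that you do not want the old index kept as a column.

```python
import pandas as pd

# Hypothetical example frame; any DataFrame behaves the same way.
df = pd.DataFrame({'value': [10, 20, 30]})

# Reversing keeps the original labels, so the index reads 2, 1, 0.
reversed_df = df.iloc[::-1]

# reset_index(drop=True) discards the old labels and renumbers from 0.
renumbered = reversed_df.reset_index(drop=True)
```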


Reverse rows in time series dataframe

NumPy has support for this via its datetime and timedelta data types.

First you reverse both columns in your time series as follows:

import pandas as pd
import numpy as np
df2 = df.iloc[::-1]  # Reverse the row order

where df is your original time series data and df2 (shown below) is the reversed time series.

   value        date
7  66.45  2017-04-25
6  65.36  2017-03-25
5  62.13  2017-03-05
4  63.84  2017-02-12
3  64.02  2017-02-05
2  63.88  2017-01-29
1  63.95  2017-01-22
0  63.85  2017-01-15

Next you find the day differences and store them as timedelta objects:

dates_np = np.array(df2.date).astype(np.datetime64)  # Convert dates to np.datetime64 objects
timeDeltas = np.insert(abs(np.diff(dates_np)), 0, 0)  # Prepend 0 because np.diff returns one fewer element

d2 = {'value': df2.value, 'day_diff': timeDeltas}  # Create new dataframe (df3)
df3 = pd.DataFrame(data=d2)

where df3 (the day differences table) looks like this:

   value  day_diff
7  66.45    0 days
6  65.36   31 days
5  62.13   20 days
4  63.84   21 days
3  64.02    7 days
2  63.88    7 days
1  63.95    7 days
0  63.85    7 days

Lastly, to get back to dates accumulating from a start date, you do the following:

startDate = np.datetime64('2000-01-01')  # You can change this if you like
df4 = df2.copy()  # Copy column data from df2
df4['date'] = np.array(np.cumsum(df3.day_diff) + startDate)  # np.cumsum accumulates the day_diff sum

where df4 (the start date accumulation) looks like this:

   value        date
7  66.45  2000-01-01
6  65.36  2000-02-01
5  62.13  2000-02-21
4  63.84  2000-03-13
3  64.02  2000-03-20
2  63.88  2000-03-27
1  63.95  2000-04-03
0  63.85  2000-04-10

I noticed a 1-day discrepancy in my final table; this is most likely due to how timedelta inclusivity/exclusivity is implemented.
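For comparison, the same three steps can be sketched with pandas alone, without converting to NumPy arrays. This is an alternative to the approach above, not the original author's code. The sample frame is rebuilt from the tables shown, and the names day_diff, start and shifted are made up for the illustration.

```python
import pandas as pd

# Sample data rebuilt from the tables above.
df = pd.DataFrame({
    'value': [63.85, 63.95, 63.88, 64.02, 63.84, 62.13, 65.36, 66.45],
    'date': pd.to_datetime(['2017-01-15', '2017-01-22', '2017-01-29',
                            '2017-02-05', '2017-02-12', '2017-03-05',
                            '2017-03-25', '2017-04-25']),
})

# Step 1: reverse the row order.
df2 = df.iloc[::-1]

# Step 2: gap between consecutive rows. The first row has no predecessor,
# so its NaT is replaced with a zero-length timedelta.
day_diff = df2['date'].diff().abs().fillna(pd.Timedelta(0))

# Step 3: accumulate the gaps from an arbitrary start date.
start = pd.Timestamp('2000-01-01')
shifted = df2.assign(date=day_diff.cumsum() + start)
```

Because everything stays a pandas Series, the reversed index (7 down to 0) is carried through each step and no manual padding of the diff result is needed.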

Reverse DataFrame column order

A solution close to what you have already tried is to use:

>>> football[football.columns[::-1]]
   losses  wins     team  year
0       5    11    Bears  2010
1       8     8    Bears  2011
2       6    10    Bears  2012
3       1    15  Packers  2011
4       5    11  Packers  2012
5      10     6    Lions  2010
6       6    10    Lions  2011
7      12     4    Lions  2012

football.columns[::-1] reverses the order of the DataFrame's sequence of columns, and football[...] reindexes the DataFrame using this new sequence.

A more succinct way to achieve the same thing is with the iloc indexer:

football.iloc[:, ::-1]

The first : means "take all rows", the ::-1 means step backwards through the columns.

The loc indexer mentioned in @PietroBattiston's answer works in the same way.
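A small sketch putting the two column-reversal approaches side by side. The frame is a two-row subset of the football data shown above, with the original column order (year, team, wins, losses) inferred from the reversed output; by_label and by_position are illustrative names.

```python
import pandas as pd

# A small subset of the football data shown above.
football = pd.DataFrame({
    'year': [2010, 2011],
    'team': ['Bears', 'Bears'],
    'wins': [11, 8],
    'losses': [5, 8],
})

# Select columns by their labels in reversed order.
by_label = football[football.columns[::-1]]

# Take all rows and step backwards through the columns by position.
by_position = football.iloc[:, ::-1]
```

Both results list the columns as losses, wins, team, year. The iloc form does not depend on the column labels at all, which may matter if labels are duplicated.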

Pandas DataFrame reversed rolling window

Based on related answers, you can get away with df["column_name"][::-1] or something similar.

Also, you can use .rolling(num).agg(["mean", "std"]) to compute both statistics in a single pass over the DataFrame.
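One way to read the suggestion above, as a rough sketch: reverse the column, apply the rolling window, then reverse back so that each row aggregates itself and the rows after it. The column name "x", the window size of 2 and the variable names are assumptions for the example.

```python
import pandas as pd

# Hypothetical data; "x" stands in for the real column name.
df = pd.DataFrame({'x': [1.0, 2.0, 3.0, 4.0]})

# Reverse, roll over 2 rows, then reverse back: each row now averages
# itself and the row that follows it.
forward_mean = df['x'][::-1].rolling(2).mean()[::-1]

# Mean and standard deviation from a single rolling object.
stats = df['x'][::-1].rolling(2).agg(['mean', 'std'])[::-1]
```

The last row has no following row to fill its window, so its result is NaN, mirroring the NaN that an ordinary rolling window leaves on the first row.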

Working With Series in Reverse Order (Latest First)

In [28]: s = pd.Series([20, 10, 30], ['c', 'a', 'b'])

In [29]: s
c 20
a 10
b 30
dtype: int64

Sorting on the index

In [30]: s.sort_index(ascending=False)
c 20
b 30
a 10
dtype: int64

Sorting on the values

In [31]: s = s.sort_values()

In [32]: s[::-1]
b 30
c 20
a 10
dtype: int64
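The sort-then-reverse steps above can also be collapsed into a single call. A minimal sketch, using the same series; latest_first is just an illustrative name.

```python
import pandas as pd

s = pd.Series([20, 10, 30], ['c', 'a', 'b'])

# Sort by value, largest first, without a separate reversal step.
latest_first = s.sort_values(ascending=False)
```

With distinct values, as here, this gives the same order as sorting ascending and then slicing with [::-1].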

