How to sort a Pandas DataFrame by index?
DataFrames have a sort_index method, which returns a sorted copy by default. Pass inplace=True to sort in place instead.
import pandas as pd
df = pd.DataFrame([1, 2, 3, 4, 5], index=[100, 29, 234, 1, 150], columns=['A'])
df.sort_index(inplace=True)
print(df.to_string())
Gives me:
A
1 4
29 2
100 1
150 5
234 3
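As a small sketch of the copy-versus-in-place behaviour described above, the default call can be compared with the original frame. The variable names sorted_df and desc_df are only illustrative.

```python
import pandas as pd

df = pd.DataFrame([1, 2, 3, 4, 5], index=[100, 29, 234, 1, 150], columns=['A'])

# Default call: returns a new, sorted DataFrame and leaves df untouched.
sorted_df = df.sort_index()

# Descending order is requested with the ascending flag.
desc_df = df.sort_index(ascending=False)

print(list(df.index))         # original order is preserved
print(list(sorted_df.index))  # ascending index
print(list(desc_df.index))    # descending index
```

Assigning the result is usually easier to follow than inplace=True, because the unsorted frame stays available for comparison.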
Sort pandas dataframe on index that is string+digits
One idea is to convert the index to a Series, split it into a DataFrame with Series.str.split, convert the second column to integers, and sort by both columns. The index of the sorted result is then used to reorder the original df with DataFrame.reindex:
df1 = df.index.to_series().str.split('_',expand=True)
df1[1] = df1[1].astype(int)
df1 = df1.sort_values([0, 1], ascending=[True, False])
print (df1)
0 1
index
A_100 A 100
A_60 A 60
A_30 A 30
B_100 B 100
B_60 B 60
B_30 B 30
df = df.reindex(df1.index)
print (df)
vals
index
A_100 0
A_60 12
A_30 13
B_100 12
B_60 6
B_30 6
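The snippet above does not show how df was built, so here is a self-contained sketch of the same split-and-reindex steps. The input frame is an assumption: its labels and values mirror the printed output, but the starting order is made up.

```python
import pandas as pd

# Hypothetical input; only the labels and values come from the output above.
df = pd.DataFrame(
    {'vals': [13, 6, 0, 12, 12, 6]},
    index=pd.Index(['A_30', 'B_60', 'A_100', 'B_100', 'A_60', 'B_30'], name='index'),
)

# Split each label into its text part (column 0) and numeric part (column 1).
parts = df.index.to_series().str.split('_', expand=True)
parts[1] = parts[1].astype(int)

# Sort by text ascending, then by number descending, and keep the label order.
order = parts.sort_values([0, 1], ascending=[True, False]).index

# Reorder the original frame by those labels.
result = df.reindex(order)
print(result)
```

Converting the numeric part to int is the important step: as strings, '100' would sort before '30'.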
Sort Index by list - Python Pandas
Unexpected NaNs after reindexing are often caused by the new index labels not exactly matching the old ones. For example, if the original index labels contain trailing whitespace but the new labels don't, you'll get NaNs:
import numpy as np
import pandas as pd
df = pd.DataFrame({'col':[1,2,3]}, index=['April ', 'June ', 'May ', ])
print(df)
# col
# April 1
# June 2
# May 3
df2 = df.reindex(['April', 'May', 'June'])
print(df2)
# col
# April NaN
# May NaN
# June NaN
This can be fixed by removing the whitespace to make the labels match:
df.index = df.index.str.strip()
df3 = df.reindex(['April', 'May', 'June'])
print(df3)
# col
# April 1
# May 3
# June 2
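When it is not obvious why labels fail to match, one possible check (a sketch, not the only way) is to list the requested labels that are absent from the index and to print the existing labels with repr so that stray whitespace becomes visible. The name wanted is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col': [1, 2, 3]}, index=['April ', 'June ', 'May '])
wanted = ['April', 'May', 'June']

# Requested labels that do not exist in the current index;
# reindex would fill these rows with NaN.
missing = pd.Index(wanted).difference(df.index)
print(list(missing))

# repr() shows the trailing spaces that a plain print hides.
print([repr(label) for label in df.index])

# After stripping, nothing is missing any more.
stripped = df.index.str.strip()
still_missing = pd.Index(wanted).difference(stripped)
print(list(still_missing))
```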
How to sort pandas dataframe by custom order on string index
Just reindex
df.reindex(reorderlist)
Out[89]:
Age G Tm Year id
Player
Maurice Baker 25 7 VAN 2004 5335
Adrian Caldwell 31 81 DAL 1997 6169
Ratko Varda 22 60 TOT 2001 13950
Ryan Bowen 34 52 OKC 2009 6141
Cedric Hunter 27 6 CHH 1991 2967
Update: if you have multiple players with the same name, use:
out = df.iloc[pd.Categorical(df.index,reorderlist).argsort()]
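To illustrate the duplicate-name case, here is a minimal sketch with made-up data in which one player appears twice. reindex raises an error when the index holds duplicate labels, whereas the Categorical approach positions every row by its rank in reorderlist. The frame contents are hypothetical.

```python
import pandas as pd

# Hypothetical data: 'Ryan Bowen' appears twice in the index.
df = pd.DataFrame(
    {'Year': [2009, 2004, 2001, 2010]},
    index=pd.Index(
        ['Ryan Bowen', 'Maurice Baker', 'Ratko Varda', 'Ryan Bowen'],
        name='Player',
    ),
)
reorderlist = ['Maurice Baker', 'Ratko Varda', 'Ryan Bowen']

# Each label gets an integer code equal to its position in reorderlist;
# argsort on the Categorical orders rows by that code.
cat = pd.Categorical(df.index, categories=reorderlist)
out = df.iloc[cat.argsort()]
print(out)
```

The relative order of the two 'Ryan Bowen' rows is not guaranteed by this sketch, since the default sort kind is not documented as stable.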
Sort pandas dataframe by index then by alphabetical order
According to the official documentation, you can pass the index name to sort_values:
df.sort_values(['index','values'])
Output:
values
index
0 a
0 c
1 b
2 d
Fun: You can also sort by values, then sort again by index with a stable algorithm:
df.sort_values('values').sort_index(kind='mergesort')
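A self-contained sketch comparing the two approaches follows. The input frame is an assumption, reconstructed from the output shown above; its index is named 'index' so that the name can be passed to sort_values.

```python
import pandas as pd

# Hypothetical input with a named, non-unique index.
df = pd.DataFrame(
    {'values': ['c', 'b', 'a', 'd']},
    index=pd.Index([0, 1, 0, 2], name='index'),
)

# Approach 1: sort by the index level name, then by the column.
by_name = df.sort_values(['index', 'values'])

# Approach 2: sort by the column first, then stably by the index,
# so rows with equal index labels keep their value order.
two_step = df.sort_values('values').sort_index(kind='mergesort')

print(by_name)
print(two_step)
```

The second approach relies on the sort being stable; with the default kind the order of tied index labels is not guaranteed.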
Pandas: how to sort dataframe by column AND by index
Using lexsort from NumPy is another option, and it may be a little faster as well:
df.iloc[np.lexsort((df.index, df.A.values))] # Sort by A.values, then by index
Result:
A
3 2
4 4
6 4
5 5
2 6
Comparing with timeit:
%%timeit
df.iloc[np.lexsort((df.index, df.A.values))] # Sort by A.values, then by index
Result:
1000 loops, best of 3: 278 µs per loop
With reset_index, sort_values and set_index instead:
%%timeit
df.reset_index().sort_values(by=['A','index']).set_index('index')
Result:
100 loops, best of 3: 2.09 ms per loop
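The two variants timed above can be checked against each other with a small runnable sketch. The input frame is an assumption chosen to reproduce the result shown earlier. Note that np.lexsort treats the last key as the primary one, which is why the column comes last in the tuple.

```python
import numpy as np
import pandas as pd

# Hypothetical data chosen to match the result printed above.
df = pd.DataFrame({'A': [6, 2, 4, 5, 4]}, index=[2, 3, 6, 5, 4])

# Last key is the primary sort key: sort by A, break ties by index.
order = np.lexsort((df.index.to_numpy(), df['A'].to_numpy()))
by_lexsort = df.iloc[order]

# Same ordering via reset_index / sort_values / set_index.
by_reset = df.reset_index().sort_values(by=['A', 'index']).set_index('index')

print(by_lexsort)
print(by_reset)
```

The timings quoted above depend on data size and pandas version, so it is worth re-measuring on your own data before choosing one variant for speed.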
Sorting pandas dataframe by column index instead of column name
sort_values is a method, not an indexer. You called it with [] instead of (), although that does not seem to be the main problem.
If you want to sort your dataframe by the second column whatever the name, use:
>>> df.sort_values(df.columns[1])
name score
1 Joe 3
0 Mary 10
2 Jessie 13
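For completeness, a self-contained sketch of sorting by column position follows. The frame is rebuilt from the output above, and the second call, which sorts by several positions at once, is an illustrative extension rather than part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Mary', 'Joe', 'Jessie'], 'score': [10, 3, 13]})

# df.columns[1] looks up the label of the second column, whatever it is called.
second = df.columns[1]
by_second = df.sort_values(second)

# Several positions at once: second column first, then the first column.
by_both = df.sort_values(list(df.columns[[1, 0]]))

print(by_second)
```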
Pandas DataFrame sorting issues by value and index
If I understand correctly, you could call sort_values after resetting the index, so that it sorts on both the Date column and the index (Date ascending and index descending):
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
lookForwardData = pd.concat([lookForwardData, lookForwardDataShell], ignore_index=True)
output = (lookForwardData.reset_index()
.sort_values(['Date','index'],ascending=[True,False]).set_index("index"))
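Since the question's data is not shown, here is a minimal sketch of the same pattern on a made-up frame. Only the Date column name and the sort directions come from the answer above; the dates themselves are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for lookForwardData.
lookForwardData = pd.DataFrame(
    {'Date': pd.to_datetime(['2020-01-02', '2020-01-01', '2020-01-02', '2020-01-01'])}
)

# reset_index exposes the index as a column named 'index', so it can be
# used as a second sort key with its own direction.
output = (
    lookForwardData.reset_index()
    .sort_values(['Date', 'index'], ascending=[True, False])
    .set_index('index')
)
print(output)
```

Within each date, the row with the larger original index comes first.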
Sort pandas Series both on values and index
You can first sort the index, then sort the values with a stable algorithm:
s.sort_index().sort_values(ascending=False, kind='stable')
output:
f 7
l 6
b 5
d 3
n 3
a 2
g 2
t 1
z 1
dtype: int64
Input used:
s = pd.Series({'a': 2, 'b': 5, 'd': 3, 'z': 1, 't': 1, 'g': 2, 'n': 3, 'l': 6, 'f': 7})
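Putting the input and the one-liner together gives a runnable sketch. The alternative at the end is not from the original answer: it is an assumed cross-check that sorts on both keys explicitly instead of relying on sort stability, and the names val and key are introduced only for that purpose.

```python
import pandas as pd

s = pd.Series({'a': 2, 'b': 5, 'd': 3, 'z': 1, 't': 1, 'g': 2, 'n': 3, 'l': 6, 'f': 7})

# Sort labels first, then sort values with a stable algorithm so that
# equal values keep their alphabetical label order.
result = s.sort_index().sort_values(ascending=False, kind='stable')

# Cross-check: sort explicitly on (value descending, label ascending).
alt = (
    s.rename('val')
    .rename_axis('key')
    .reset_index()
    .sort_values(['val', 'key'], ascending=[False, True])
    .set_index('key')['val']
)

print(result)
```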