How to Sort a Dataframe in Python Pandas by Two or More Columns

How to sort a DataFrame in Python pandas by two or more columns?

As of the 0.17.0 release, the sort method was deprecated in favor of sort_values. sort was completely removed in the 0.20.0 release. The arguments (and results) remain the same:

df.sort_values(['a', 'b'], ascending=[True, False])

In versions before 0.17.0, you could use the ascending argument of sort:

df.sort(['a', 'b'], ascending=[True, False])

For example:

In [11]: df1 = pd.DataFrame(np.random.randint(1, 5, (10,2)), columns=['a','b'])

In [12]: df1.sort_values(['a', 'b'], ascending=[True, False])
Out[12]:
   a  b
2  1  4
7  1  3
1  1  2
3  1  2
4  3  2
6  4  4
0  4  3
9  4  3
5  4  1
8  4  1

As commented by @renadeen

Sorting isn't in place by default! So you should assign the result of the sort method to a variable or add inplace=True to the method call.

That is, if you want to reuse df1 as a sorted DataFrame:

df1 = df1.sort_values(['a', 'b'], ascending=[True, False])

or

df1.sort_values(['a', 'b'], ascending=[True, False], inplace=True)
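
A minimal sketch of the difference between the two forms, written with sort_values (the replacement for sort). The small frame and the variable names sorted_copy, unchanged_index and result are illustrative assumptions, not part of the original answer:

```python
import pandas as pd

# Illustrative data (an assumption, not from the original answer)
df1 = pd.DataFrame({"a": [4, 1, 3, 1], "b": [3, 2, 2, 4]})

# Without inplace, sort_values returns a new DataFrame and leaves df1 untouched
sorted_copy = df1.sort_values(["a", "b"], ascending=[True, False])
unchanged_index = list(df1.index)

# With inplace=True the call returns None and df1 itself is reordered
result = df1.sort_values(["a", "b"], ascending=[True, False], inplace=True)

print(sorted_copy)
print(result)
```

Because the in-place form returns None, writing df1 = df1.sort_values(..., inplace=True) would replace the DataFrame with None; pick one form or the other.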

How to sort a Pandas DataFrame with multiple columns, some in ascending order and others in descending order?

The DataFrame.sort_values method can handle this very easily. Just use the ascending argument and provide a list of boolean values.

import pandas as pd

my_df = pd.DataFrame({'col1': ['a','a','a','a','b','b','b','b','c','c','c','c'],
                      'col2': [1,1,2,2,1,1,2,2,1,1,2,2],
                      'col3': [1,2,1,2,1,2,1,2,1,2,1,2]})

my_df = my_df.sort_values(by=['col1','col2','col3'],
                          ascending=[False, False, True])

Note that the list provided in the ascending argument must have the same length as the one provided in the by argument.
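
As a rough illustration of that length rule, the sketch below passes a single boolean (which applies to every column in by) and then a list of the wrong length, which pandas should reject with a ValueError. The small frame and the names all_desc and mismatch_error are assumptions made for this example:

```python
import pandas as pd

# Illustrative data (an assumption, smaller than the frame in the answer)
my_df = pd.DataFrame({"col1": ["a", "b", "b"],
                      "col2": [1, 2, 1],
                      "col3": [2, 1, 1]})

# A single boolean applies to every column listed in `by`
all_desc = my_df.sort_values(by=["col1", "col2"], ascending=False)

# A list whose length differs from `by` is rejected
try:
    my_df.sort_values(by=["col1", "col2", "col3"], ascending=[False, True])
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(all_desc)
print(mismatch_error)
```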

How to sort dataframe rows by multiple columns

Use sort_values, which can accept a list of sorting targets. In this case it sounds like you want to sort by S/N, then Dis, then Rate:

df = df.sort_values(['S/N', 'Dis', 'Rate'])

#     S/N      Dis       Rate
# 0   332   4.6030  91.204062
# 3   332   9.1985  76.212943
# 6   332  14.4405  77.664282
# 9   332  20.2005  76.725955
# 12  332  25.4780  31.597510
# 15  332  30.6670  74.096975
# 1   445   5.4280  60.233917
# 4   445   9.7345  31.902842
# 7   445  14.6015  36.261851
# 10  445  19.8630  40.705467
# 13  445  24.9050   4.897008
# 16  445  30.0550  35.217889
# ...

Pandas: Sort a dataframe based on multiple columns

You can swap the order of the columns in the list, together with the matching values in the ascending parameter.

Explanation:

The order of the column names is the order of sorting. Rows are first sorted by Employee_Count in descending order; rows that share the same Employee_Count are then sorted among themselves by Department in ascending order.

df1 = df.sort_values(['Employee_Count', 'Department'], ascending=[False, True])
print(df1)
  Department  Employee_Count
4        xyz              15
2        bca              11
0        abc              10 <-
1        adc              10 <-
3        cde               9

Or, as a test, pass False for the second column as well; the rows with a duplicated Employee_Count are then sorted by Department in descending order:

df2 = df.sort_values(['Employee_Count', 'Department'], ascending=[False, False])
print(df2)
  Department  Employee_Count
4        xyz              15
2        bca              11
1        adc              10 <-
0        abc              10 <-
3        cde               9
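
The answer does not show how df was built. The sketch below reconstructs it from the index labels in the printed output (an assumption) so that both calls can be run side by side:

```python
import pandas as pd

# Input reconstructed from the printed output (an assumption)
df = pd.DataFrame({
    "Department": ["abc", "adc", "bca", "cde", "xyz"],
    "Employee_Count": [10, 10, 11, 9, 15],
})

# Ties on Employee_Count are broken by Department, ascending
df1 = df.sort_values(["Employee_Count", "Department"], ascending=[False, True])

# Same primary sort, but ties are broken by Department, descending
df2 = df.sort_values(["Employee_Count", "Department"], ascending=[False, False])

print(df1)
print(df2)
```

Only the two rows with Employee_Count equal to 10 change places between df1 and df2; the secondary column matters only where the primary column has ties.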

How to sort a Pandas DataFrame by multiple columns in Python with ordered number

There is an issue when you write:

ordered = trade_value_df.sort_values(by=['Trade Value'], ascending=False,ignore_index=True)

You are assigning something new to the name ordered, so you're effectively losing the dataframe you had previously assigned to that name.

One possibility is to do all the operations on the same DataFrame, rather than keeping multiple DataFrames:

import pandas as pd

df = pd.DataFrame({'Code': ['Apple', 'Amazon', 'Facebook', 'Samsung'],
                   'Volume': [500, 1000, 250, 100],
                   'Trade Value': [1000, 500, 750, 1500]})

df = df.sort_values(by=['Volume'], ascending=False, ignore_index=True)
df['Volume Order'] = df.index + 1

df = df.sort_values(by=['Trade Value'], ascending=False, ignore_index=True)
df['Trade Order'] = df.index + 1

print(df)
#        Code  Volume  Trade Value  Volume Order  Trade Order
# 0   Samsung     100         1500             4            1
# 1     Apple     500         1000             2            2
# 2  Facebook     250          750             3            3
# 3    Amazon    1000          500             1            4
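
A different technique from the one in the answer, shown only as a sketch: Series.rank can assign the order numbers directly, so the frame needs to be sorted just once at the end. The choice of method="first" (ties numbered in order of appearance) is an assumption about the desired tie handling, and ignore_index requires pandas 1.0 or later:

```python
import pandas as pd

df = pd.DataFrame({"Code": ["Apple", "Amazon", "Facebook", "Samsung"],
                   "Volume": [500, 1000, 250, 100],
                   "Trade Value": [1000, 500, 750, 1500]})

# rank(ascending=False) gives 1 to the largest value; method="first"
# numbers ties in order of appearance, so every row gets a distinct integer
df["Volume Order"] = df["Volume"].rank(ascending=False, method="first").astype(int)
df["Trade Order"] = df["Trade Value"].rank(ascending=False, method="first").astype(int)

# One final sort for display
df = df.sort_values(by="Trade Value", ascending=False, ignore_index=True)
print(df)
```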

Pandas: Sort by sum of 2 columns

This does the trick:

import pandas as pd

data = {"Column 1": [1, 3, 1], "Column 2": [1, 2, 3]}
df = pd.DataFrame(data)

sorted_indices = (df["Column 1"] + df["Column 2"]).sort_values().index

df.loc[sorted_indices, :]

I created a Series holding the sum of both columns, sorted it, took the index of the sorted Series, and used that index with loc to reorder the DataFrame.

(I changed the data a little so you can see the sorting in action. Using the data you provided, you wouldn't have been able to see the sorted data as it would have been the same as the original one.)
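
The loc lookup above is label-based, so it relies on the index labels being unique; with duplicate labels it can return repeated rows. A positional variant, sketched here with NumPy's argsort and iloc, avoids that. The duplicate index labels and the names totals, order and sorted_df are assumptions chosen to illustrate the case:

```python
import numpy as np
import pandas as pd

# Same values as in the answer, but with duplicate index labels on purpose
df = pd.DataFrame({"Column 1": [1, 3, 1], "Column 2": [1, 2, 3]},
                  index=["x", "x", "y"])

# Row sums as a plain NumPy array
totals = (df["Column 1"] + df["Column 2"]).to_numpy()

# Positions that would sort the sums; "stable" keeps tied rows in original order
order = np.argsort(totals, kind="stable")

# Reorder by position rather than by label
sorted_df = df.iloc[order]
print(sorted_df)
```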

How to sort multiple columns in a given order in pandas

For me it works when all the sorting is done in a single sort_values call:

df = df.sort_values(by=["A", "C", "B"], ascending=[True, False, True], ignore_index=True)
print(df)
   A   B    C
0  a  c0  170
1  b  c0  170
2  c  c0   88
3  d  c1   12
4  e  c1   28
5  f  c2  160
6  g  c2   37
7  h  c1   12
8  i  c2  160
9  j  c0   88

If you change the order of the sort columns, the output is correct:

df = df.sort_values(by=["B", "C", "A"], ascending=[True, False, True], ignore_index=True)
print(df)
   A   B    C
0  a  c0  170
1  b  c0  170
2  c  c0   88
3  j  c0   88
4  e  c1   28
5  d  c1   12
6  h  c1   12
7  f  c2  160
8  i  c2  160
9  g  c2   37

