Sorting columns in pandas dataframe based on column name
df = df.reindex(sorted(df.columns), axis=1)
This assumes that sorting the column names will give the order you want. If your column names won't sort lexicographically (e.g., if you want column Q10.3 to appear after Q9.1), you'll need to sort differently, but that has nothing to do with pandas.
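For names like Q9.1 and Q10.3, one option is a sort key that compares the numeric chunks as integers. The following is a minimal sketch in plain Python; the helper name natural_key and the sample column names are assumptions for illustration only.

```python
import re

def natural_key(name):
    """Split a label into text and integer chunks so numbers compare numerically."""
    # re.split with a capturing group alternates non-digit and digit chunks,
    # so strings are only ever compared with strings and ints with ints.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r'(\d+)', name)]

columns = ['Q10.3', 'Q9.1', 'Q1.2', 'Q10.1']  # hypothetical column names
ordered = sorted(columns, key=natural_key)
print(ordered)  # ['Q1.2', 'Q9.1', 'Q10.1', 'Q10.3']
```

The same key can be passed to the pandas call shown above: df.reindex(sorted(df.columns, key=natural_key), axis=1).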
Python/Pandas: Sort dataframe columns based on a column name
You can change the order of columns like this:
import pandas as pd
data = {'X1': ['11', '12'],
        'X2': ['21', '22'],
        'X3': ['31', '32']}
df = pd.DataFrame(data)
df
X1 X2 X3
0 11 21 31
1 12 22 32
df = df.reindex(['X3','X1','X2'], axis=1)
df
X3 X1 X2
0 31 11 21
1 32 12 22
Note: you need to provide the desired order explicitly.
You can create a function that moves a given column to the front:
def sorter(desired, df):
    # Move the column named `desired` to the front; the others keep their order.
    columns = df.columns.tolist()
    columns.remove(desired)
    columns.insert(0, desired)
    return df.reindex(columns, axis=1)
sorter('X2',df)
X2 X1 X3
0 21 11 31
1 22 12 32
Re-ordering columns in pandas dataframe based on column header names, where the name of the columns are string, contains the numeric number at the end
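Sort with a key that extracts the trailing number. Note that df.columns[1:] leaves out the first column, and because reindex keeps only the labels it is given, that column is dropped from the result: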
df = df.reindex(sorted(df.columns[1:], key=lambda x: int(x.split('_')[-1][1:])), axis=1)
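If the first column should stay in place, it can be prepended to the sorted labels. A minimal sketch, assuming a hypothetical frame whose remaining columns end in '_t<number>' (the names id, score_t10 and the helper trailing_number are made up for illustration):

```python
import pandas as pd

# Hypothetical frame: an 'id' column followed by columns ending in '_t<number>'.
df = pd.DataFrame([[7, 10, 2, 1]],
                  columns=['id', 'score_t10', 'score_t2', 'score_t1'])

def trailing_number(name):
    # 'score_t10' -> last '_' segment 't10' -> drop the first character -> 10
    return int(name.split('_')[-1][1:])

first = list(df.columns[:1])                        # keep the leading column as is
rest = sorted(df.columns[1:], key=trailing_number)  # numeric order, not lexicographic
df = df.reindex(first + rest, axis=1)
print(list(df.columns))  # ['id', 'score_t1', 'score_t2', 'score_t10']
```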
Sort a subset of columns of a pandas dataframe alphabetically by column name
You can split your dataframe by column names using the normal indexing operator [], sort the remaining columns alphabetically with sort_index(axis=1), and concat the pieces back together:
>>> pd.concat([df[['subject', 'timepoint']],
...            df[df.columns.difference(['subject', 'timepoint'])].sort_index(axis=1)],
...           ignore_index=False, axis=1)
subject timepoint a b c d
0 1 1 2 2 2 2
1 1 2 3 3 3 3
2 1 3 4 4 4 4
3 1 4 5 5 5 5
4 1 5 6 6 6 6
5 1 6 7 7 7 7
6 2 1 3 3 3 3
7 2 2 4 4 4 4
8 2 3 1 1 1 1
9 2 4 2 2 2 2
10 2 5 3 3 3 3
11 2 6 4 4 4 4
12 3 1 5 5 5 5
13 3 2 4 4 4 4
14 3 4 5 5 5 5
15 4 1 8 8 8 8
16 4 2 4 4 4 4
17 4 3 5 5 5 5
18 4 4 6 6 6 6
19 4 5 2 2 2 2
20 4 6 3 3 3 3
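The same column order can also be built as a plain list of labels, which may be easier to read than concat. A minimal sketch, assuming a small hypothetical frame with the same two key columns:

```python
import pandas as pd

# Hypothetical data: two key columns mixed in with measurement columns.
df = pd.DataFrame({'subject': [1, 1], 'd': [2, 3], 'timepoint': [1, 2],
                   'b': [2, 3], 'a': [2, 3], 'c': [2, 3]})

fixed = ['subject', 'timepoint']                       # columns that stay in front
others = sorted(c for c in df.columns if c not in fixed)  # the rest, alphabetically
result = df[fixed + others]
print(list(result.columns))  # ['subject', 'timepoint', 'a', 'b', 'c', 'd']
```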
Reordering Pandas Columns based on Column name
This type of sorting is called natural sorting. There are more details in "Naturally sorting Pandas DataFrame", which demonstrates how to sort rows using natsort.
Setup with natsort
import pandas as pd
from natsort import natsorted
df = pd.DataFrame(columns=[f'company_{i}' for i in [5, 2, 3, 4, 1, 10]])
# Before column sort
print(df)
df = df.reindex(natsorted(df.columns), axis=1)
# After column sort
print(df)
Before sort:
Empty DataFrame
Columns: [company_5, company_2, company_3, company_4, company_1, company_10]
Index: []
After sort:
Empty DataFrame
Columns: [company_1, company_2, company_3, company_4, company_5, company_10]
Index: []
Compared to lexicographic sorting with sorted:
df = df.reindex(sorted(df.columns), axis=1)
Empty DataFrame
Columns: [company_1, company_10, company_2, company_3, company_4, company_5]
Index: []
Sorting pandas dataframe by column index instead of column name
sort_values is a method, not an indexer: you called it with [] instead of (), although that does not seem to be the actual problem.
If you want to sort your dataframe by the second column whatever the name, use:
>>> df.sort_values(df.columns[1])
name score
1 Joe 3
0 Mary 10
2 Jessie 13
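A self-contained version of the call above, with the frame reconstructed from the printed output (so the input data is an assumption):

```python
import pandas as pd

# Frame reconstructed from the output shown above.
df = pd.DataFrame({'name': ['Mary', 'Joe', 'Jessie'], 'score': [10, 3, 13]})

# df.columns[1] resolves to the label of the second column, here 'score'.
by_second = df.sort_values(df.columns[1])
print(by_second)
```

Because df.columns[1] is only a label lookup, this works as long as that label is unique in the frame.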
Pandas: How to sort dataframe on columns with same column labels
You can create temporary helper columns that copy the 2 columns by position using iloc, sort by those helper columns, and finally drop them, as follows:
df_test = df_test.assign(A=df_test.iloc[:, 0], B=df_test.iloc[:, 1]).sort_values(by=['A', 'B'], ascending=(False,False)).drop(['A', 'B'], axis=1)
Result:
print(df_test)
Region Region
2 San Francisco 4.0
4 San Francisco 1.0
1 Portland 1.0
0 Peninsula 2.0
3 Los Angeles 3.0
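A variant that avoids helper columns is to sort a copy with positional column labels and reuse its row order. This is only a sketch: the input frame is reconstructed from the printed result, and the approach assumes the row index is unique.

```python
import pandas as pd

# Frame reconstructed from the printed result; both columns share the label 'Region'.
df_test = pd.DataFrame(
    [['Peninsula', 2.0], ['Portland', 1.0], ['San Francisco', 4.0],
     ['Los Angeles', 3.0], ['San Francisco', 1.0]],
    columns=['Region', 'Region'],
)

# Work on a copy labelled 0, 1 so sort_values can address each column unambiguously.
tmp = df_test.copy()
tmp.columns = range(tmp.shape[1])
order = tmp.sort_values(by=[0, 1], ascending=[False, False]).index

# Reorder the original rows; this relies on the row index being unique.
result = df_test.loc[order]
print(result)
```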
how to sort pandas dataframe from one column
Use sort_values to sort the df by a specific column's values:
In [18]:
df.sort_values('2')
Out[18]:
0 1 2
4 85.6 January 1.0
3 95.5 February 2.0
7 104.8 March 3.0
0 354.7 April 4.0
8 283.5 May 5.0
6 238.7 June 6.0
5 152.0 July 7.0
1 55.4 August 8.0
11 212.7 September 9.0
10 249.6 October 10.0
9 278.8 November 11.0
2 176.5 December 12.0
If you want to sort by two columns, pass a list of column labels to sort_values, with the labels ordered according to sort priority. If you use df.sort_values(['2', '0']), the result would be sorted by column 2 and then by column 0. Granted, this does not really make sense for this example because each value in df['2'] is unique.
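To see the second key take effect, the first column needs repeated values. A minimal sketch with hypothetical data (the column names group and value are made up for illustration):

```python
import pandas as pd

# Hypothetical data with repeated values in 'group' so the second key matters.
df = pd.DataFrame({'group': ['b', 'a', 'b', 'a'], 'value': [2, 4, 1, 3]})

# Sort by 'group' ascending first, then by 'value' descending within each group.
result = df.sort_values(['group', 'value'], ascending=[True, False])
print(result)
```

Passing a list to ascending gives each key its own direction; a single boolean applies to all keys.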