How to Change the Order of Dataframe Columns

pandas how to swap or reorder columns

Swapping two columns

cols = list(df.columns)
a, b = cols.index('LastName'), cols.index('MiddleName')
cols[b], cols[a] = cols[a], cols[b]
df = df[cols]

Swapping two pairs of columns (2 swaps)

cols = list(df.columns)
a, b, c, d = cols.index('LastName'), cols.index('MiddleName'), cols.index('Contact'), cols.index('EmployeeID')
cols[a], cols[b], cols[c], cols[d] = cols[b], cols[a], cols[d], cols[c]
df = df[cols]
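
Both snippets above follow the same pattern, so they could be folded into one helper. A minimal sketch, assuming unique column names; the helper name swap_columns and the sample row are made up for illustration and are not part of pandas:

```python
import pandas as pd

def swap_columns(df, *pairs):
    """Return a new DataFrame with each (left, right) pair of columns swapped."""
    cols = list(df.columns)
    for left, right in pairs:
        i, j = cols.index(left), cols.index(right)
        cols[i], cols[j] = cols[j], cols[i]
    return df[cols]

# Hypothetical sample data, for illustration only.
df = pd.DataFrame(
    [[1, "Ann", "Lee", "B.", "555-0100"]],
    columns=["EmployeeID", "FirstName", "LastName", "MiddleName", "Contact"],
)

swapped = swap_columns(df, ("LastName", "MiddleName"), ("Contact", "EmployeeID"))
print(list(swapped.columns))
```

The swaps are applied one after another, so a column named in two pairs is moved twice.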

Swapping multiple columns

Now it comes down to how you can play with list slices:

cols = list(df.columns)
cols = cols[1::2] + cols[::2]
df = df[cols]
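
To see what that slice expression does, here is a small sketch on a made-up six-column frame (the column names a to f are an assumption for illustration):

```python
import pandas as pd

# Hypothetical frame with six columns, for illustration only.
df = pd.DataFrame([[0, 1, 2, 3, 4, 5]], columns=list("abcdef"))

cols = list(df.columns)
odd_positions = cols[1::2]    # columns at positions 1, 3, 5
even_positions = cols[::2]    # columns at positions 0, 2, 4
reordered = df[odd_positions + even_positions]
print(list(reordered.columns))
```

The columns at odd positions come first, followed by the columns at even positions, each group keeping its original relative order.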

Set order of columns in pandas dataframe

Just select the order yourself by typing in the column names. Note the double brackets:

frame = frame[['column I want first', 'column I want second'...etc.]]
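
When the frame has many columns, typing every name gets tedious. One possible variation, sketched here with made-up column names, is to list only the columns that should come first and append the rest in their existing order:

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
frame = pd.DataFrame({
    "id": [1, 2],
    "name": ["x", "y"],
    "score": [0.5, 0.9],
    "date": ["2020-01-01", "2020-01-02"],
})

first = ["date", "score"]                             # columns to put in front
rest = [c for c in frame.columns if c not in first]   # everything else, order kept
frame = frame[first + rest]
print(list(frame.columns))
```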

Change order of columns in pandas dataframes in a loop

I spent a while on this; it actually gave me a nice puzzle.

It works this way because your first loop modifies the existing objects, whereas your second loop creates new objects and rebinds the loop variable to them. The list dfs still holds the original df1 and df2, which are left unchanged. If you want the changes from the second loop to show up in df1 and df2, you can only use methods that operate on the original DataFrame in place and do not require reassignment.

I'm not convinced that my way is the optimal one, but this is what I mean:

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.random.rand(5, 5))
df2 = pd.DataFrame(np.random.rand(5, 5))

dfs = [ df1, df2 ]

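for df in dfs:
    df.columns = [ 'a', 'b', 'c', 'd', 'e' ]

for df in dfs:
    # append copies of the columns in the new order (all in place)
    for c in ['e', 'd', 'c', 'b', 'a']:
        df.insert(df.shape[1], c+'_new', df[c])
    # df.drop(['e', 'd', 'c', 'b', 'a'], axis=1) would return a new object instead
    for c in [ 'a', 'b', 'c', 'd', 'e' ]:
        del df[c]
    # strip the '_new' suffix by renaming in place
    df.columns = ['e', 'd', 'c', 'b', 'a']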

Then calling df1 prints:

           e           d           c           b           a
0 0.550885 0.879557 0.202626 0.218867 0.266057
1 0.344012 0.767083 0.139642 0.685141 0.559385
2 0.271689 0.247322 0.749676 0.903162 0.680389
3 0.643675 0.317681 0.217223 0.776192 0.665542
4 0.480441 0.981850 0.558303 0.780569 0.484447
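
A shorter way to get the same in-place effect might be to pop each column and insert it back at its target position. This is only a sketch: the helper name reorder_inplace is hypothetical, and the integer data is made up so the result is easy to check.

```python
import numpy as np
import pandas as pd

def reorder_inplace(df, order):
    """Move the listed columns to the front of df, in the given order, in place."""
    for position, name in enumerate(order):
        # pop removes the column from df; insert puts it back at `position`
        df.insert(position, name, df.pop(name))

df1 = pd.DataFrame(np.arange(10).reshape(2, 5), columns=list("abcde"))
df2 = pd.DataFrame(np.arange(10, 20).reshape(2, 5), columns=list("abcde"))
dfs = [df1, df2]

for df in dfs:
    reorder_inplace(df, ["e", "d", "c", "b", "a"])

print(list(df1.columns))
```

Because nothing is reassigned, df1 and df2 themselves end up reordered, not just the loop variable.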

How to rearrange Pandas column sequence?

from pandas import DataFrame

def _col_seq_set(df, col_list, seq_list):
    ''' place each column in col_list at the matching position in seq_list '''
    # start from the columns that are not being moved, in their current order
    col_not_in_col_list = [x for x in list(df.columns) if x not in col_list]
    for i in range(len(col_list)):
        col_not_in_col_list.insert(seq_list[i], col_list[i])

    return df[col_not_in_col_list]

# attach the helper as a DataFrame method
DataFrame.col_seq_set = _col_seq_set
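
A usage sketch may help here. It restates the same logic as a plain function (the name set_col_positions is hypothetical) so that it runs on its own without patching DataFrame, and the sample frame is made up:

```python
import pandas as pd

def set_col_positions(df, col_list, seq_list):
    """Same logic as the helper above, written as a standalone function."""
    cols = [c for c in df.columns if c not in col_list]
    for name, position in zip(col_list, seq_list):
        cols.insert(position, name)
    return df[cols]

# Hypothetical sample data, for illustration only.
df = pd.DataFrame([[1, 2, 3, 4]], columns=["a", "b", "c", "d"])

# Put 'd' at position 0 and 'a' at position 2.
result = set_col_positions(df, ["d", "a"], [0, 2])
print(list(result.columns))
```

The insertions happen one after another, so the positions end up as final positions only when seq_list is given in ascending order.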

Moving a dataframe column and changing column order

Use this:

df = df[['date','A','B','C','D','E','F','G','H','F','I']]

--- Edit

columnsName = list(df.columns)
F, H = columnsName.index('F'), columnsName.index('H')
columnsName[F], columnsName[H] = columnsName[H],columnsName[F]
df = df[columnsName]
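
If the goal is to move a single column to a given position rather than swap two, the same list-based idea can be sketched as follows. The helper name move_column and the sample columns are assumptions for illustration:

```python
import pandas as pd

def move_column(df, name, position):
    """Return a new DataFrame with column `name` moved to index `position`."""
    cols = list(df.columns)
    cols.insert(position, cols.pop(cols.index(name)))
    return df[cols]

# Hypothetical sample data, for illustration only.
df = pd.DataFrame([[0, 1, 2, 3]], columns=["date", "A", "B", "F"])

moved = move_column(df, "F", 1)
print(list(moved.columns))
```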

How to change the column order according to the category list?

Your pivot seems incorrect. Fix the parameters and combine it with sort_index on axis=1:

df2 = (df
       .pivot(index='groups', columns='cats', values='value_name')
       .sort_index(axis=1)
      )

output:

cats      A+     A    B+     B     -
groups
group1 0.12 0.02 0.25 0.00 0.04
group2 0.30 0.05 0.04 0.09 0.00
group3 NaN NaN NaN NaN 0.13

You can check that the columns are an ordered CategoricalIndex:

df2.columns
CategoricalIndex(['A+', 'A', 'B+', 'B', '-'],
categories=['A+', 'A', 'B+', 'B', 'C', '-'],
ordered=True, dtype='category', name='cats')
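
For a self-contained illustration, the sketch below uses made-up data and plain string categories, then sets the ordered CategoricalIndex on the columns explicitly before sorting. That explicit step is a variation on the answer above (which assumes 'cats' is already an ordered categorical), and the category list is taken from the output shown:

```python
import pandas as pd

# Desired column order, expressed as an ordered category list.
order = ["A+", "A", "B+", "B", "C", "-"]

# Hypothetical long-format data, for illustration only.
long = pd.DataFrame({
    "groups": ["group1", "group1", "group2", "group2"],
    "cats": ["-", "A+", "B", "A"],
    "value_name": [0.04, 0.12, 0.09, 0.05],
})

wide = long.pivot(index="groups", columns="cats", values="value_name")
# Make the column labels an ordered categorical so sorting follows `order`.
wide.columns = pd.CategoricalIndex(
    wide.columns, categories=order, ordered=True, name="cats"
)
wide = wide.sort_index(axis=1)
print(list(wide.columns))
```

Categories that never occur in the data (here 'B+' and 'C') simply do not appear as columns.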

Pandas dataframe, change column order using reindex does not give expected result in for loop

This is because df in your for loop is a local variable. When you do df.loc[:,'col3']=[5,6], you modify the object that df references, which therefore affects df1. However, df.reindex(['col3','col2','col1'],axis=1) does not modify the original DataFrame; it creates a new copy, which is then assigned to the local variable df inside the for loop, while df1 and df2 remain unchanged. To see this, try printing df after the for loop: it should show the reindexed version of df2.
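
The difference can be reproduced with a small sketch. The frames and values below are made up, and keeping the results in a dict is just one possible way to hold on to the reindexed copies:

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df1 = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})
df2 = pd.DataFrame({"col1": [7, 8], "col2": [9, 10]})
new_order = ["col3", "col2", "col1"]

for df in [df1, df2]:
    df["col3"] = [5, 6]                  # in place: df1 and df2 gain col3
    df = df.reindex(new_order, axis=1)   # rebinds the loop variable only

print(list(df1.columns))   # df1 still has its original column order
print(list(df.columns))    # the loop variable holds the reindexed copy of df2

# Keep the reindexed copies by storing what reindex returns.
frames = {"df1": df1, "df2": df2}
frames = {name: frame.reindex(new_order, axis=1) for name, frame in frames.items()}
```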

Python Pandas: Is Order Preserved When Using groupby() and agg()?

See this enhancement issue

The short answer is yes, the groupby will preserve the orderings as passed in. You can prove this by using your example like this:

In [20]: df.sort_index(ascending=False).groupby('A').agg([np.mean, lambda x: x.iloc[1] ])
Out[20]:
           B             C
        mean <lambda> mean <lambda>
A
group1  11.0       10  101      100
group2  17.5       10  175      100
group3  11.0       10  101      100

This is NOT true for resample, however, because it requires a monotonic index (it WILL work with a non-monotonic index, but will sort it first).

There is a sort= flag to groupby, but this relates to the sorting of the groups themselves and not the observations within a group.

FYI: df.groupby('A').nth(1) is a safe way to get the 2nd value of a group (your method above will fail if a group has fewer than 2 elements).
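
A quick way to check the ordering claim yourself is to collect each group's values into a list. The data below is made up for illustration:

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"A": ["g2", "g1", "g2", "g1"], "B": [1, 2, 3, 4]})

# Rows within each group keep the order in which they were passed in.
forward = df.groupby("A")["B"].agg(list)
backward = df.sort_index(ascending=False).groupby("A")["B"].agg(list)

# sort=False only changes the order of the groups, not the rows inside them.
unsorted_groups = df.groupby("A", sort=False)["B"].agg(list)

print(forward)
print(backward)
print(unsorted_groups)
```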


