pandas how to swap or reorder columns
Two column Swapping
cols = list(df.columns)
a, b = cols.index('LastName'), cols.index('MiddleName')
cols[b], cols[a] = cols[a], cols[b]
df = df[cols]
Reorder column Swapping (2 swaps)
cols = list(df.columns)
a, b, c, d = cols.index('LastName'), cols.index('MiddleName'), cols.index('Contact'), cols.index('EmployeeID')
cols[a], cols[b], cols[c], cols[d] = cols[b], cols[a], cols[d], cols[c]
df = df[cols]
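The index bookkeeping in the two snippets above can be wrapped in a small helper. This is only a sketch, not part of the original answer: swap_columns and the sample row are hypothetical, and it assumes the column labels are unique.

```python
import pandas as pd

def swap_columns(df, pairs):
    """Return a new DataFrame with each (left, right) pair of columns swapped."""
    cols = list(df.columns)
    for left, right in pairs:
        i, j = cols.index(left), cols.index(right)
        cols[i], cols[j] = cols[j], cols[i]
    # Selecting with the reordered label list returns a new frame.
    return df[cols]

# Hypothetical sample data reusing the column names from the snippets above.
df = pd.DataFrame(
    [[1, "Ann", "B.", "Lee", "555-0100"]],
    columns=["EmployeeID", "FirstName", "MiddleName", "LastName", "Contact"],
)
swapped = swap_columns(df, [("LastName", "MiddleName"), ("Contact", "EmployeeID")])
print(list(swapped.columns))
```

The values travel with their labels, so only the display order changes; the input frame is left untouched.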
Swapping Multiple
From here it comes down to how you slice the column list:
cols = list(df.columns)
cols = cols[1::2] + cols[::2]
df = df[cols]
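To make the slice arithmetic concrete, here is the same idea on a small made-up frame (the column names a to e are an assumption for illustration only):

```python
import pandas as pd

df = pd.DataFrame([[0, 1, 2, 3, 4]], columns=["a", "b", "c", "d", "e"])

cols = list(df.columns)
# cols[1::2] takes positions 1, 3, ... and cols[::2] takes positions 0, 2, 4, ...
cols = cols[1::2] + cols[::2]
df = df[cols]

print(list(df.columns))  # ['b', 'd', 'a', 'c', 'e']
```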
Set order of columns in pandas dataframe
Just select the order yourself by typing in the column names. Note the double brackets:
frame = frame[['column I want first', 'column I want second'...etc.]]
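If you only care about the first few positions, you may not want to type every name. A possible variation, assuming a made-up frame with columns a to d, is to list the leading columns and append the rest in their existing order:

```python
import pandas as pd

# Hypothetical frame; only the column names matter here.
frame = pd.DataFrame({"a": [1], "b": [2], "c": [3], "d": [4]})

first = ["c", "a"]  # columns wanted at the front, in this order
rest = [col for col in frame.columns if col not in first]
frame = frame[first + rest]

print(list(frame.columns))  # ['c', 'a', 'b', 'd']
```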
Change order of columns in pandas dataframes in a loop
I've spent a while on it, it actually gave me a nice puzzle.
It works this way because in your first loop you modify the existing objects, but in the second loop you create new objects and overwrite the old ones; as a result, the list dfs loses its references to df1 and df2. If you want the changes made in the second loop to be applied to df1 and df2, you can only use methods that operate on the original dataframe and do not require overwriting.
I'm not convinced that my way is the optimal one, but that's what I mean:
import numpy as np
import pandas as pd
df1 = pd.DataFrame(np.random.rand(5, 5))
df2 = pd.DataFrame(np.random.rand(5, 5))
dfs = [ df1, df2 ]
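for df in dfs:
    # Assigning to .columns changes the existing object in place.
    df.columns = [ 'a', 'b', 'c', 'd', 'e' ]
for df in dfs:
    # Append copies of the columns in the new order, then delete the originals.
    for c in ['e', 'd', 'c', 'b', 'a']:
        df.insert(df.shape[1],c+'_new',df[c])
    #df.drop(['e', 'd', 'c', 'b', 'a'], axis=1)
    for c in [ 'a', 'b', 'c', 'd', 'e' ]:
        del df[c]
    df.columns = ['e', 'd', 'c', 'b', 'a']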
Then calling df1 prints:
          e         d         c         b         a
0  0.550885  0.879557  0.202626  0.218867  0.266057
1  0.344012  0.767083  0.139642  0.685141  0.559385
2  0.271689  0.247322  0.749676  0.903162  0.680389
3  0.643675  0.317681  0.217223  0.776192  0.665542
4  0.480441  0.981850  0.558303  0.780569  0.484447
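A shorter in-place variant is possible with DataFrame.pop and DataFrame.insert. This is a sketch rather than the method from the answer above: moving each column to the end, in the desired order, leaves the frame in that order without ever rebinding the loop variable.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.random.rand(5, 5), columns=list("abcde"))
df2 = pd.DataFrame(np.random.rand(5, 5), columns=list("abcde"))
original_e = df1["e"].copy()  # kept only to check the data afterwards

for df in [df1, df2]:
    for name in ["e", "d", "c", "b", "a"]:
        # pop() removes the column from the existing object ...
        column = df.pop(name)
        # ... and insert() puts it back as the last column, also in place.
        df.insert(len(df.columns), name, column)

print(list(df1.columns))  # ['e', 'd', 'c', 'b', 'a']
```

Note that the pop has to happen before the insert position is computed, because the frame is one column narrower once the column has been removed.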
How to rearrange Pandas column sequence?
from pandas import DataFrame

def _col_seq_set(df, col_list, seq_list):
    ''' set dataframe 'df' col_list's sequence by seq_list '''
    col_not_in_col_list = [x for x in list(df.columns) if x not in col_list]
    for i in range(len(col_list)):
        col_not_in_col_list.insert(seq_list[i], col_list[i])
    return df[col_not_in_col_list]

# Attach the helper as a method, so it can be called as df.col_seq_set(col_list, seq_list)
DataFrame.col_seq_set = _col_seq_set
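A usage sketch may help to see what the positions mean. The function is repeated here as a plain function so the block runs on its own; the frame and its column names are made up. Each column is inserted at its target position one after another, so positions are easiest to reason about when listed in ascending order.

```python
import pandas as pd

def col_seq_set(df, col_list, seq_list):
    """Place each column of col_list at the matching position in seq_list."""
    remaining = [x for x in df.columns if x not in col_list]
    for name, position in zip(col_list, seq_list):
        remaining.insert(position, name)
    return df[remaining]

# Hypothetical frame: move 'd' to position 0 and 'a' to position 2.
df = pd.DataFrame([[1, 2, 3, 4]], columns=["a", "b", "c", "d"])
result = col_seq_set(df, ["d", "a"], [0, 2])

print(list(result.columns))  # ['d', 'b', 'a', 'c']
```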
Moving a dataframe column and changing column order
Use this:
df = df[['date','A','B','C','D','E','F','G','H','F','I']]
--- Edit
columnsName = list(df.columns)
F, H = columnsName.index('F'), columnsName.index('H')
columnsName[F], columnsName[H] = columnsName[H],columnsName[F]
df = df[columnsName]
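If the goal is to move a single column rather than swap two, the same list manipulation works with remove and insert. The helper name move_column and the sample frame below are hypothetical, and unique column labels are assumed.

```python
import pandas as pd

def move_column(df, name, position):
    """Return a new DataFrame with column `name` moved to `position`."""
    cols = list(df.columns)
    cols.remove(name)            # take the label out of its current slot
    cols.insert(position, name)  # and put it back where it should go
    return df[cols]

df = pd.DataFrame([[0, 1, 2, 3]], columns=["date", "A", "B", "F"])
moved = move_column(df, "F", 1)

print(list(moved.columns))  # ['date', 'F', 'A', 'B']
```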
How to change the column order according to the category list?
Your pivot seems incorrect; you should fix the parameters and combine it with sort_index on axis=1:
df2 = (df
       .pivot(index='groups', columns='cats', values='value_name')
       .sort_index(axis=1)
      )
output:
cats      A+     A    B+     B     -
groups
group1  0.12  0.02  0.25  0.00  0.04
group2  0.30  0.05  0.04  0.09  0.00
group3   NaN   NaN   NaN   NaN  0.13
You can check that you have an ordered CategoricalIndex as column:
df2.columns
CategoricalIndex(['A+', 'A', 'B+', 'B', '-'],
categories=['A+', 'A', 'B+', 'B', 'C', '-'],
ordered=True, dtype='category', name='cats')
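For a self-contained illustration, the sketch below builds a small long-format frame with made-up values, reusing the column names groups, cats and value_name from the snippet above. It assumes the cats column is an ordered categorical; that is what lets sort_index follow the category order instead of alphabetical order.

```python
import pandas as pd

order = ["A+", "A", "B+", "B", "-"]

# Hypothetical long-format data; each (groups, cats) pair appears once.
long_df = pd.DataFrame({
    "groups": ["group1", "group1", "group1", "group2", "group2"],
    "cats": ["B", "A+", "-", "A", "B+"],
    "value_name": [0.10, 0.12, 0.04, 0.05, 0.04],
})
long_df["cats"] = pd.Categorical(long_df["cats"], categories=order, ordered=True)

df2 = (
    long_df
    .pivot(index="groups", columns="cats", values="value_name")
    .sort_index(axis=1)  # sorts by category order, not alphabetically
)
print(list(df2.columns))
```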
Pandas dataframe, change column order using reindex does not give expected result in for loop
This is because df in your for loop is a local variable. When you do df.loc[:,'col3']=[5,6], you modify the object that df references, which therefore affects df1. However, df.reindex(['col3','col2','col1'],axis=1) does not modify the original DataFrame; it creates a new copy, which is then assigned to the local variable df inside the for loop, while df1 and df2 remain unchanged. To see this, you can try printing df at the end of the for loop. It should print the value you want for df2 (with the reindexing).
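The difference can be reproduced with a short sketch. The frames and values here are made up, and rebuilding the list and unpacking it again is just one possible fix, not necessarily what the question needs:

```python
import pandas as pd

df1 = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})
df2 = pd.DataFrame({"col1": [7, 8], "col2": [9, 10]})
dfs = [df1, df2]

for df in dfs:
    df["col3"] = [5, 6]  # in-place change: visible through df1 and df2
    df = df.reindex(["col3", "col2", "col1"], axis=1)  # only rebinds the name df

after_loop = list(df1.columns)
print(after_loop)  # ['col1', 'col2', 'col3'] : new column kept, order unchanged

# One way out: collect the reindexed copies and rebind the names explicitly.
dfs = [df.reindex(["col3", "col2", "col1"], axis=1) for df in dfs]
df1, df2 = dfs
print(list(df1.columns))  # ['col3', 'col2', 'col1']
```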
Python Pandas: Is Order Preserved When Using groupby() and agg()?
See this enhancement issue
The short answer is yes, the groupby will preserve the orderings as passed in. You can prove this by using your example like this:
In [20]: df.sort_index(ascending=False).groupby('A').agg([np.mean, lambda x: x.iloc[1] ])
Out[20]:
           B             C
        mean <lambda> mean <lambda>
A
group1  11.0       10  101      100
group2  17.5       10  175      100
group3  11.0       10  101      100
This is NOT true for resample, however, as it requires a monotonic index (it WILL work with a non-monotonic index, but will sort it first).
There is a sort= flag to groupby, but this relates to the sorting of the groups themselves and not the observations within a group.
FYI: df.groupby('A').nth(1) is a safe way to get the 2nd value of a group (as your method above will fail if a group has < 2 elements).
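These three points can be checked on a tiny made-up frame (the values below are assumptions chosen so that one group has a single row). The index of the nth result differs between pandas versions, so the sketch only looks at its values.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["g2", "g2", "g1", "g1", "g3"],
    "B": [9, 1, 4, 8, 5],
})

# Rows inside each group keep their original relative order.
within = df.groupby("A")["B"].agg(list)
print(within["g2"])  # [9, 1]

# sort= only controls the order of the groups themselves.
sorted_groups = df.groupby("A")["B"].sum().index.tolist()
unsorted_groups = df.groupby("A", sort=False)["B"].sum().index.tolist()
print(sorted_groups)    # ['g1', 'g2', 'g3']
print(unsorted_groups)  # ['g2', 'g1', 'g3']

# nth(1) skips g3, which has only one row, instead of raising an error.
second = df.groupby("A")["B"].nth(1)
print(sorted(second.tolist()))  # [1, 8]
```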