Move a Column to First Position in a Data Frame

Move column by name to front of table in pandas

We can use loc to reorder by passing a list:

In [27]:
# get a list of columns
cols = list(df)
# move the column to head of list using index, pop and insert
cols.insert(0, cols.pop(cols.index('Mid')))
cols
Out[27]:
['Mid', 'Net', 'Upper', 'Lower', 'Zsore']
In [28]:
# use ix to reorder
df = df.loc[:, cols]
df
Out[28]:
Mid Net Upper Lower Zsore
Answer_option
More_than_once_a_day 2 0% 0.22% -0.12% 65
Once_a_day 3 0% 0.32% -0.19% 45
Several_times_a_week 4 2% 2.45% 1.10% 78
Once_a_week 6 1% 1.63% -0.40% 65

Another method is to take a reference to the column and reinsert it at the front:

In [39]:
mid = df['Mid']
df.drop(labels=['Mid'], axis=1,inplace = True)
df.insert(0, 'Mid', mid)
df
Out[39]:
Mid Net Upper Lower Zsore
Answer_option
More_than_once_a_day 2 0% 0.22% -0.12% 65
Once_a_day 3 0% 0.32% -0.19% 45
Several_times_a_week 4 2% 2.45% 1.10% 78
Once_a_week 6 1% 1.63% -0.40% 65

You can, with very early versions of Pandas, also use ix to achieve the same results:

df = df.ix[:, cols]

But ix was deprecated from pandas 0.20.0 onwards and was discontinued as of Pandas 1.0.

Move a column to first position in a data frame

I don't know if it's worth adding this as an answer or if a comment would be fine, but I wrote a function called moveme that lets you do what you want to do with the language you describe. You can find the function at this answer: https://stackoverflow.com/a/18540144/1270695

It works on the names of your data.frame and produces a character vector that you can use to reorder your columns:

mydf <- data.frame(matrix(1:12, ncol = 4))
mydf
moveme(names(mydf), "X4 first")
# [1] "X4" "X1" "X2" "X3"
moveme(names(mydf), "X4 first; X1 last")
# [1] "X4" "X2" "X3" "X1"

mydf[moveme(names(mydf), "X4 first")]
# X4 X1 X2 X3
# 1 10 1 4 7
# 2 11 2 5 8
# 3 12 3 6 9

If you're shuffling things around like this, I suggest converting your data.frame to a data.table and using setcolorder (with my moveme function, if you wish) to make the change by reference.


In your question, you also mentioned "I just want to pick one column and move it to the start". If it's an arbitrary column, and not specifically the last one, you could also look at using setdiff.

Imagine you're working with the "mtcars" dataset and want to move the "am" column to the start.

x <- "am"
mtcars[c(x, setdiff(names(mtcars), x))]

move column in pandas dataframe

You can rearrange columns directly by specifying their order:

df = df[['a', 'y', 'b', 'x']]

In the case of larger dataframes where the column titles are dynamic, you can use a list comprehension to select every column not in your target set and then append the target set to the end.

>>> df[[c for c in df if c not in ['b', 'x']] 
+ ['b', 'x']]
a y b x
0 1 -1 2 3
1 2 -2 4 6
2 3 -3 6 9
3 4 -4 8 12

To make it more bullet proof, you can ensure that your target columns are indeed in the dataframe:

cols_at_end = ['b', 'x']
df = df[[c for c in df if c not in cols_at_end]
+ [c for c in cols_at_end if c in df]]

Moving a dataframe column and changing column order

Use this :

df = df[['date','A','B','C','D','E','F','G','H','F','I']]

--- Edit

columnsName = list(df.columns)
F, H = columnsName.index('F'), columnsName.index('H')
columnsName[F], columnsName[H] = columnsName[H],columnsName[F]
df = df[columnsName]

How to move a column in a pandas dataframe

Correcting your error

I'm not sure how to interpret this error because it actually should be
a list, right?

No: colnames[0] and colnames[4] are scalars, not lists. You can't concatenate a scalar with a list. To make them lists, use square brackets:

colnames = [colnames[0]] + [colnames[4]] + colnames[:-1]

You can either use df[[colnames]] or df.reindex(columns=colnames): both necessarily trigger a copy operation as this transformation cannot be processed in place.

Generic solution

But converting arrays to lists and then concatenating lists manually is not only expensive, but prone to error. A related answer has many list-based solutions, but a NumPy-based solution is worthwhile since pd.Index objects are stored as NumPy arrays.

The key here is to modify the NumPy array via slicing rather than concatenation. There are only 2 cases to handle: when the desired position exists after the current position, and vice versa.

import pandas as pd, numpy as np
from string import ascii_uppercase

df = pd.DataFrame(columns=list(ascii_uppercase))

def shifter(df, col_to_shift, pos_to_move):
arr = df.columns.values
idx = df.columns.get_loc(col_to_shift)
if idx == pos_to_move:
pass
elif idx > pos_to_move:
arr[pos_to_move+1: idx+1] = arr[pos_to_move: idx]
else:
arr[idx: pos_to_move] = arr[idx+1: pos_to_move+1]
arr[pos_to_move] = col_to_shift
df = df.reindex(columns=arr)
return df

df = df.pipe(shifter, 'J', 1)

print(df.columns)

Index(['A', 'J', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'K', 'L', 'M', 'N',
'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'],
dtype='object')

Performance benchmarking

Using NumPy slicing is more efficient with a large number of columns versus a list-based method:

n = 10000
df = pd.DataFrame(columns=list(range(n)))

def shifter2(df, col_to_shift, pos_to_move):
cols = df.columns.tolist()
cols.insert(pos_to_move, cols.pop(df.columns.get_loc(col_to_shift)))
df = df.reindex(columns=cols)
return df

%timeit df.pipe(shifter, 590, 5) # 381 µs
%timeit df.pipe(shifter2, 590, 5) # 1.92 ms

Shift column in pandas dataframe up by one?

In [44]: df['gdp'] = df['gdp'].shift(-1)

In [45]: df
Out[45]:
y gdp cap
0 1 3 5
1 2 7 9
2 8 4 2
3 3 7 7
4 6 NaN 7

In [46]: df[:-1]
Out[46]:
y gdp cap
0 1 3 5
1 2 7 9
2 8 4 2
3 3 7 7

Move column level to top in multi column index pandas DataFrame

Here is a generic function to move a column level (by label or index) to the top:

def move_top(df, col, inplace=False):
if col in df.columns.names:
idx = df.columns.names.index(col)
elif isinstance(col, int) and 0 < col < len(df.columns.names):
idx = col
else:
raise IndexError(f'invalid index "{col}"')
order = list(range(len(df.columns.names)))
order.pop(idx)
order = [idx]+order
if inplace:
df.columns = df.columns.reorder_levels(order=order)
else:
return df.reorder_levels(order, axis=1)

move_top(df, 'known_level_name')

output:

known_level_name         x1            
unknown_level_name_0 a1
unknown_level_name_1 b1 b2
unknown_level_name_last c1 c1 c1 c1
0 33 30 23 77
1 10 73 80 33
2 7 54 52 9
3 71 99 22 22
4 83 15 86 40

position or move pandas column to a specific column index

How to proceed:

  1. store the names of columns in a list;
  2. swap the names in that list;
  3. apply the new order on the dataframe.

code:

l = list(df)

i1, i2 = l.index('Col2'), l.index('Col3')
l[i2], l[i1] = l[i1], l[i2]

df = df[l]


Related Topics



Leave a reply



Submit