How to Add a New Column to a DataFrame (at the Front, Not the End)

How do I insert a column at a specific column index in pandas?

See the docs: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.insert.html

Using loc=0 inserts the column at the beginning.

df.insert(loc, column, value)

import pandas as pd
df = pd.DataFrame({'B': [1, 2, 3], 'C': [4, 5, 6]})

df
Out:
   B  C
0  1  4
1  2  5
2  3  6

idx = 0
new_col = [7, 8, 9] # can be a list, a Series, an array or a scalar
df.insert(loc=idx, column='A', value=new_col)

df
Out:
   A  B  C
0  7  1  4
1  8  2  5
2  9  3  6
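
The comment on new_col notes that the value can be a list, a Series, an array or a scalar. A minimal sketch of how a scalar and a Series behave, assuming a fresh copy of the frame above (the column names 'flag' and 'S' are made up for illustration): a scalar is broadcast to every row, while a Series is matched on its index labels rather than inserted by position.

```python
import pandas as pd

df = pd.DataFrame({'B': [1, 2, 3], 'C': [4, 5, 6]})

# A scalar is broadcast to every row.
df.insert(loc=0, column='flag', value=0)

# A Series is aligned on its index labels, not on position:
# row label 2 has no match in the Series, so that row becomes NaN.
s = pd.Series([10, 20], index=[1, 0])
df.insert(loc=1, column='S', value=s)

print(df)
```

If positional behaviour is what you want, passing a plain list or array of the right length avoids the label matching.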

Insert a column at the beginning (leftmost end) of a DataFrame

DataFrame.insert

df = pd.DataFrame({'A': ['x'] * 3, 'B': ['x'] * 3})
df

   A  B
0  x  x
1  x  x
2  x  x

seq = ['a', 'b', 'c']

# This works in-place.
df.insert(0, 'C', seq)
df

   C  A  B
0  a  x  x
1  b  x  x
2  c  x  x

pd.concat

# Starting again from the original two-column df (A, B), not the one modified by insert.
df = pd.concat([pd.Series(seq, index=df.index, name='C'), df], axis=1)
df

   C  A  B
0  a  x  x
1  b  x  x
2  c  x  x

DataFrame.reindex + assign

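Reindexing first creates a placeholder column C in the desired position; assign then overwrites that placeholder, so the column keeps its position. Starting again from the original two-column df, this returns a new DataFrame rather than modifying df.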

df.reindex(['C', *df.columns], axis=1).assign(C=seq)

   C  A  B
0  a  x  x
1  b  x  x
2  c  x  x
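
Of the three approaches, only DataFrame.insert modifies the frame in place; pd.concat and reindex + assign build a new DataFrame. A small sketch that starts from a fresh two-column frame each time (make_df is a hypothetical helper, not part of pandas) and prints the resulting column order for each:

```python
import pandas as pd

seq = ['a', 'b', 'c']

def make_df():
    # Fresh copy of the two-column example frame.
    return pd.DataFrame({'A': ['x'] * 3, 'B': ['x'] * 3})

# 1. insert: mutates the frame and returns None.
via_insert = make_df()
via_insert.insert(0, 'C', seq)

# 2. concat: builds a new frame; `base` itself is left untouched.
base = make_df()
via_concat = pd.concat([pd.Series(seq, index=base.index, name='C'), base], axis=1)

# 3. reindex + assign: also returns a new frame.
base = make_df()
via_reindex = base.reindex(['C', *base.columns], axis=1).assign(C=seq)

for name, frame in [('insert', via_insert), ('concat', via_concat), ('reindex', via_reindex)]:
    print(name, list(frame.columns), list(frame['C']))
```

Which one to pick is mostly a matter of style: insert is the shortest, while the other two fit better into method chains.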

Add column with constant value to pandas dataframe

The reason this puts NaN into a column is that df.index and the index of your right-hand-side object are different. @zach shows the proper way to assign a new column of zeros. In general, pandas tries to align indices as much as possible. One downside is that wherever the indices do not line up, you get NaN. Play around with the reindex and align methods to gain some intuition for how alignment works with objects that have partially aligned, fully aligned, and completely unaligned indices. For example, here's how DataFrame.align() works with partially aligned indices:

In [7]: from pandas import DataFrame

In [8]: from numpy.random import randint

In [9]: df = DataFrame({'a': randint(3, size=10)})

In [10]: df
Out[10]:
   a
0  0
1  2
2  0
3  1
4  0
5  0
6  0
7  0
8  0
9  0

In [11]: s = df.a[:5]

In [12]: dfa, sa = df.align(s, axis=0)

In [13]: dfa
Out[13]:
   a
0  0
1  2
2  0
3  1
4  0
5  0
6  0
7  0
8  0
9  0

In [14]: sa
Out[14]:
0      0
1      2
2      0
3      1
4      0
5    NaN
6    NaN
7    NaN
8    NaN
9    NaN
Name: a, dtype: float64
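
The same alignment rule explains NaN in a newly assigned column. A minimal sketch, using an invented frame and the illustrative column names 'aligned', 'by_position' and 'zeros': assigning a Series whose index does not match df.index yields NaN, assigning its underlying array is purely positional, and a scalar is simply broadcast.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3]}, index=['x', 'y', 'z'])
s = pd.Series([10, 20, 30])  # default index 0, 1, 2

# Assignment aligns on index labels; none of 0, 1, 2 match 'x', 'y', 'z',
# so every row of the new column is NaN.
df['aligned'] = s

# Dropping the index (here via to_numpy) assigns by position instead.
df['by_position'] = s.to_numpy()

# A scalar is broadcast to every row, so a constant column never hits this problem.
df['zeros'] = 0

print(df)
```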

Python: how to add a column to a pandas dataframe between two columns?

You can use insert:

df.insert(4, 'new_col_name', tmp)

Note: The insert method mutates the original DataFrame and does not return a copy.

If you use df = df.insert(4, 'new_col_name', tmp), df will be None.
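
When the target position is known by neighbouring column name rather than by number, the integer position can be looked up first. A short sketch, assuming unique column labels (the frame and the names 'A', 'B', 'C' are illustrative, not taken from the question):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})

# Integer position just after column 'A'.
# With unique column labels, get_loc returns a plain int.
pos = df.columns.get_loc('A') + 1

# insert works in place, so its return value is None.
result = df.insert(pos, 'new_col_name', [0, 0])

print(result)
print(list(df.columns))
```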

How to add an empty column to a dataframe?

If I understand correctly, assigning a scalar should fill the whole column:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"A": [1,2,3], "B": [2,3,4]})
>>> df
   A  B
0  1  2
1  2  3
2  3  4
>>> df["C"] = ""
>>> df["D"] = np.nan
>>> df
   A  B C   D
0  1  2   NaN
1  2  3   NaN
2  3  4   NaN
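
The two placeholder columns are not equivalent, which matters for later filtering: "" is a real (empty) string value, while np.nan is a missing value stored in a float column. A quick sketch that checks this on the same frame:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})
df["C"] = ""      # empty strings: present values, just empty
df["D"] = np.nan  # missing values, stored as float64

print(df.dtypes)
print(df["C"].isna().any())  # empty strings do not count as missing
print(df["D"].isna().all())  # every entry of D is missing
```

If the column will later hold numbers, np.nan is usually the more convenient placeholder, since isna() and dropna() recognise it.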

Move a column in a pandas DataFrame

You can rearrange columns directly by specifying their order:

df = df[['a', 'y', 'b', 'x']]

In the case of larger dataframes where the column titles are dynamic, you can use a list comprehension to select every column not in your target set and then append the target set to the end.

>>> df[[c for c in df if c not in ['b', 'x']]
...    + ['b', 'x']]
   a  y  b   x
0  1 -1  2   3
1  2 -2  4   6
2  3 -3  6   9
3  4 -4  8  12

To make it more robust, you can ensure that your target columns are actually present in the DataFrame:

cols_at_end = ['b', 'x']
df = df[[c for c in df if c not in cols_at_end]
        + [c for c in cols_at_end if c in df]]
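
The same idea can be wrapped in a small function that also handles moving columns to the front. This is only a sketch: move_columns and its to_front parameter are hypothetical names, not a pandas API, and the sample data is invented.

```python
import pandas as pd

def move_columns(df, cols, to_front=False):
    """Return a new DataFrame with `cols` moved to the front or the end.

    Hypothetical helper for illustration; names missing from df are ignored.
    """
    present = [c for c in cols if c in df.columns]
    rest = [c for c in df.columns if c not in present]
    order = present + rest if to_front else rest + present
    return df[order]

df = pd.DataFrame({'a': [1, 2], 'b': [2, 4], 'x': [3, 6], 'y': [-1, -2]})

at_end = move_columns(df, ['b', 'x'])
at_front = move_columns(df, ['x', 'missing'], to_front=True)

print(list(at_end.columns))
print(list(at_front.columns))
```

Because the function selects columns with df[order], it returns a new DataFrame and leaves the original column order untouched.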

