How do I insert a column at a specific column index in pandas?
See the docs: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.insert.html
DataFrame.insert takes the target position as loc; loc=0 inserts the column at the beginning.
df.insert(loc, column, value)
import pandas as pd
df = pd.DataFrame({'B': [1, 2, 3], 'C': [4, 5, 6]})
df
Out:
B C
0 1 4
1 2 5
2 3 6
idx = 0
new_col = [7, 8, 9] # can be a list, a Series, an array or a scalar
df.insert(loc=idx, column='A', value=new_col)
df
Out:
A B C
0 7 1 4
1 8 2 5
2 9 3 6
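When the target position is known only relative to an existing column, the integer for loc can be looked up first. The following is a minimal sketch, assuming unique column labels; the column name 'B2' and its values are made up for illustration. Note that loc must lie between 0 and len(df.columns).

```python
import pandas as pd

df = pd.DataFrame({'A': [7, 8, 9], 'B': [1, 2, 3], 'C': [4, 5, 6]})

# 0-based position of the existing column 'B'; get_loc returns an int
# when the label is unique.
pos = df.columns.get_loc('B')

# Insert the hypothetical column 'B2' immediately to the right of 'B'.
df.insert(loc=pos + 1, column='B2', value=[10, 20, 30])

cols_after = list(df.columns)
print(cols_after)
```

Using pos instead of pos + 1 would place the new column to the left of 'B'.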
Insert a column at the beginning (leftmost end) of a DataFrame
DataFrame.insert
df = pd.DataFrame({'A': ['x'] * 3, 'B': ['x'] * 3})
df
A B
0 x x
1 x x
2 x x
seq = ['a', 'b', 'c']
# This works in-place.
df.insert(0, 'C', seq)
df
C A B
0 a x x
1 b x x
2 c x x
pd.concat
# Starting again from the original two-column df; concat returns a new DataFrame.
df = pd.concat([pd.Series(seq, index=df.index, name='C'), df], axis=1)
df
C A B
0 a x x
1 b x x
2 c x x
DataFrame.reindex + assign
Reindexing first fixes the position of the new column (initially filled with NaN); assign then overwrites its values while keeping that position.
# Again starting from the original two-column df.
df.reindex(['C', *df.columns], axis=1).assign(C=seq)
C A B
0 a x x
1 b x x
2 c x x
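The three approaches differ mainly in whether they mutate the DataFrame. The sketch below runs each one against the same starting data so the results can be compared; the variable names via_insert, via_concat and via_reindex are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': ['x'] * 3, 'B': ['x'] * 3})
seq = ['a', 'b', 'c']

# In-place variant, applied to a copy so the original df stays untouched.
via_insert = df.copy()
via_insert.insert(0, 'C', seq)

# Non-mutating variants: each returns a new DataFrame and leaves df alone.
via_concat = pd.concat([pd.Series(seq, index=df.index, name='C'), df], axis=1)
via_reindex = df.reindex(['C', *df.columns], axis=1).assign(C=seq)

print(list(df.columns))          # the original still has only A and B
print(list(via_reindex.columns))
```

The non-mutating forms are convenient inside method chains, while insert avoids building a second DataFrame.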
Add column with constant value to pandas dataframe
The reason this puts NaN into a column is that df.index and the Index of your right-hand-side object are different. @zach shows the proper way to assign a new column of zeros. In general, pandas tries to do as much alignment of indices as possible. One downside is that when indices are not aligned you get NaN wherever they aren't aligned. Play around with the reindex and align methods to gain some intuition for how alignment works with objects that have partially aligned, fully aligned, and completely unaligned indices. For example, here's how DataFrame.align() works with partially aligned indices:
In [7]: from pandas import DataFrame
In [8]: from numpy.random import randint
In [9]: df = DataFrame({'a': randint(3, size=10)})
In [10]: df
Out[10]:
a
0 0
1 2
2 0
3 1
4 0
5 0
6 0
7 0
8 0
9 0
In [11]: s = df.a[:5]
In [12]: dfa, sa = df.align(s, axis=0)
In [13]: dfa
Out[13]:
a
0 0
1 2
2 0
3 1
4 0
5 0
6 0
7 0
8 0
9 0
In [14]: sa
Out[14]:
0 0
1 2
2 0
3 1
4 0
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
Name: a, dtype: float64
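The same alignment happens on column assignment, which is where the unexpected NaN values come from. Below is a minimal sketch with made-up data; the column names 'aligned', 'positional' and 'zeros' are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'a': [10, 20, 30]})  # default index 0, 1, 2

# A Series whose index only partially overlaps df.index.
s = pd.Series([1, 2, 3], index=[1, 2, 3])

# Assignment aligns on index labels: label 0 has no match, so it becomes
# NaN, and label 3 is dropped because df has no such row.
df['aligned'] = s

# Passing the underlying array skips alignment and assigns by position.
df['positional'] = s.to_numpy()

# A scalar is broadcast to every row, so no alignment is involved.
df['zeros'] = 0

print(df)
```

Assigning a scalar is the simplest way to get a constant column; converting to an array is only appropriate when the values really are meant to be matched by position.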
Python: how to add a column to a pandas dataframe between two columns?
You can use insert:
df.insert(4, 'new_col_name', tmp)
Note: The insert method mutates the original DataFrame and does not return a copy. If you use df = df.insert(4, 'new_col_name', tmp), df will be None.
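A short sketch of that behaviour, using a made-up two-column DataFrame. It also shows that inserting a label that already exists raises ValueError by default.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# insert() modifies df in place and returns None, so do not reassign df.
result = df.insert(1, 'new_col_name', [10, 20])
print(result)            # None
print(list(df.columns))

# Inserting a column label that already exists raises ValueError by default.
try:
    df.insert(0, 'A', [0, 0])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
print(duplicate_rejected)
```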
How to add an empty column to a dataframe?
If I understand correctly, assignment should fill:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"A": [1,2,3], "B": [2,3,4]})
>>> df
A B
0 1 2
1 2 3
2 3 4
>>> df["C"] = ""
>>> df["D"] = np.nan
>>> df
A B C D
0 1 2 NaN
1 2 3 NaN
2 3 4 NaN
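The two "empty" columns are not equivalent: an empty string is an ordinary value, while NaN is treated as missing. The sketch below repeats the assignment and checks how each column responds to isna(); the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})
df["C"] = ""      # column of empty strings
df["D"] = np.nan  # column of missing values, stored as float64

# Empty strings are not counted as missing; NaN is.
c_missing = df["C"].isna().sum()
d_missing = df["D"].isna().sum()
print(c_missing, d_missing)
```

Which placeholder fits depends on what will later be stored in the column: NaN suits numeric data, while an empty string only makes sense for text.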
Move column in pandas dataframe
You can rearrange columns directly by specifying their order:
df = df[['a', 'y', 'b', 'x']]
In the case of larger dataframes where the column titles are dynamic, you can use a list comprehension to select every column not in your target set and then append the target set to the end.
>>> df[[c for c in df if c not in ['b', 'x']]
+ ['b', 'x']]
a y b x
0 1 -1 2 3
1 2 -2 4 6
2 3 -3 6 9
3 4 -4 8 12
To make it more robust, you can ensure that your target columns are actually in the dataframe:
cols_at_end = ['b', 'x']
df = df[[c for c in df if c not in cols_at_end]
+ [c for c in cols_at_end if c in df]]
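To move a single column to an arbitrary position rather than to the end, pop and insert can be combined. This is a different technique from the list-based reordering above, sketched here with made-up data.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [2, 4], 'x': [3, 6], 'y': [-1, -2]})

# pop() removes 'x' and returns it as a Series; insert() then puts it back
# at position 0. The argument is evaluated first, so 'x' is already gone
# when insert() runs. Both calls modify df in place.
df.insert(0, 'x', df.pop('x'))

print(list(df.columns))
```

Unlike df = df[[...]], this changes the existing DataFrame instead of creating a reordered copy.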