How to Add a New Column to an Existing DataFrame

How to add a new column to an existing DataFrame?

Edit 2017

As indicated in the comments and by @Alexander, the best method at present for adding the values of a Series as a new column of a DataFrame is probably assign:

df1 = df1.assign(e=pd.Series(np.random.randn(sLength)).values)
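The .values call matters because pandas aligns a Series on its index when it is assigned as a column. A minimal sketch of the difference, assuming a hypothetical two-row frame whose index labels are 6 and 8 (the data and the name s_length are illustrative only):

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose index (6, 8) does not start at 0.
df1 = pd.DataFrame({"a": [-0.27, 0.92], "b": [-0.03, 0.85]}, index=[6, 8])
s_length = len(df1)

# A fresh Series gets the default index 0..n-1.
new_values = pd.Series(np.random.randn(s_length))

# Assigning the Series directly aligns on index labels: 0 and 1 do not
# match 6 and 8, so every value in the new column is NaN.
aligned = df1.assign(e=new_values)

# Using .values drops the index, so the values are placed by position.
positional = df1.assign(e=new_values.values)

print(aligned["e"].isna().all())     # True
print(positional["e"].isna().any())  # False
```

Passing index=df1.index when building the Series, as the older answers on this page do, is another way to get the same positional result.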

Edit 2015

Some reported getting the SettingWithCopyWarning with this code.

However, the code still ran correctly with pandas 0.16.1, the current version at the time of this edit.

>>> sLength = len(df1['a'])
>>> df1
a b c d
6 -0.269221 -0.026476 0.997517 1.294385
8 0.917438 0.847941 0.034235 -0.448948

>>> df1['e'] = pd.Series(np.random.randn(sLength), index=df1.index)
>>> df1
a b c d e
6 -0.269221 -0.026476 0.997517 1.294385 1.757167
8 0.917438 0.847941 0.034235 -0.448948 2.228131

>>> pd.version.short_version
'0.16.1'

The SettingWithCopyWarning aims to inform you of a possibly invalid assignment on a copy of the DataFrame. It doesn't necessarily mean you did it wrong (it can trigger false positives), but since 0.13.0 it lets you know there are more appropriate methods for the same purpose. So if you get the warning, just follow its advice: Try using .loc[row_index,col_indexer] = value instead

>>> df1.loc[:,'f'] = pd.Series(np.random.randn(sLength), index=df1.index)
>>> df1
a b c d e f
6 -0.269221 -0.026476 0.997517 1.294385 1.757167 -0.050927
8 0.917438 0.847941 0.034235 -0.448948 2.228131 0.006109
>>>

In fact, this is currently the more efficient method, as described in the pandas docs.
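The warning typically appears when the frame being modified was itself produced by slicing or filtering another frame. One common way to avoid the ambiguity, sketched here with hypothetical data, is to take an explicit copy before adding the column:

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"a": [-0.27, 0.92, 1.5], "b": [0.1, 0.2, 0.3]})

# Filtering returns a new object that pandas may treat as derived from df;
# .copy() makes it an independent frame.
subset = df[df["a"] > 0].copy()

# Adding a column to the explicit copy is unambiguous.
subset["e"] = np.random.randn(len(subset))

print(list(subset.columns))  # ['a', 'b', 'e']
print(list(df.columns))      # ['a', 'b']
```

The original frame is left untouched, which is usually what is intended when working on a filtered subset.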


Original answer:

Use the original df1 index to create the Series:

df1['e'] = pd.Series(np.random.randn(sLength), index=df1.index)

Inserting a new Column in existing pandas dataframe

It's not working because you already have a column with that name. If you are OK with having duplicate columns, you can pass allow_duplicates=True.

df.insert(len(df.columns),"Trigger_Type", cat_1, allow_duplicates=True)

Otherwise, you will have to rename the column to something else.

If you want to completely replace the column, you can also use:

df['Trigger_Type'] = cat_1
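The three behaviours can be compared side by side. This is only a sketch: the frame contents and the values in cat_1 are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "Trigger_Type": ["x", "y"]})
cat_1 = ["a", "b"]  # hypothetical replacement values

# Inserting a column whose name already exists raises ValueError by default.
try:
    df.insert(len(df.columns), "Trigger_Type", cat_1)
    raised = False
except ValueError:
    raised = True
print(raised)  # True

# With allow_duplicates=True, a second column with the same name is appended.
df.insert(len(df.columns), "Trigger_Type", cat_1, allow_duplicates=True)
print(list(df.columns))  # ['id', 'Trigger_Type', 'Trigger_Type']

# Plain assignment on a fresh frame overwrites the existing column instead.
df2 = pd.DataFrame({"id": [1, 2], "Trigger_Type": ["x", "y"]})
df2["Trigger_Type"] = cat_1
print(list(df2["Trigger_Type"]))  # ['a', 'b']
```

Duplicate column names tend to make later selection awkward, since df["Trigger_Type"] then returns a DataFrame rather than a Series, so renaming or overwriting is often the simpler choice.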

How to assign new column to existing DataFrame in pandas

The pandas assign method returns a new DataFrame with the added column; it does not modify the original in place.

import pandas as pd
df = pd.DataFrame(data = {"test":["mkt1","mkt2","mkt3"],
"test2":["cty1","cty2","cty3"]})
print("Before",df.columns)
df = df.assign(test3="Hello") # <--- Note the variable reassignment
print("After",df.columns)
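As a further sketch of the same point, the snippet below shows that calling assign without rebinding leaves the frame unchanged, and that assign can add several columns in one call. The column test4 and the lambda are illustrative additions, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"test": ["mkt1", "mkt2", "mkt3"]})

# Calling assign without rebinding the result leaves df untouched.
df.assign(test3="Hello")
print("test3" in df.columns)  # False

# assign accepts several columns at once; a callable receives the frame
# being built, so it can refer to existing columns.
result = df.assign(
    test3="Hello",
    test4=lambda d: d["test"].str.upper(),
)
print(list(result.columns))   # ['test', 'test3', 'test4']
print(list(result["test4"]))  # ['MKT1', 'MKT2', 'MKT3']
```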

Adding a new column to DataFrame with different values in different row

I believe what you are looking for is answered here: Add column in dataframe from list

mylist = [1,2,3,4,5] # needs one value per row of df
print(len(df.columns)) # 50
df['new_col'] = mylist
print(len(df.columns)) # 51

Alternatively, you could give the whole column a default value and then overwrite slices of rows with .loc, like so:

data['new_col'] = 1
data.loc[2880:5760, 'new_col'] = 0
data.loc[12960:15840, 'new_col'] = 0
data.loc[23040:25920, 'new_col'] = 0
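A scaled-down sketch of the same pattern, assuming a hypothetical ten-row frame with the default integer index. It also shows that .loc slices are label-based and include both endpoints:

```python
import pandas as pd

# Hypothetical frame with a default RangeIndex 0..9.
data = pd.DataFrame({"value": range(10)})

# Start with a default for every row.
data["new_col"] = 1

# .loc slices by label and includes both endpoints: rows 2, 3 and 4 are set.
data.loc[2:4, "new_col"] = 0

print(list(data["new_col"]))  # [1, 1, 0, 0, 0, 1, 1, 1, 1, 1]
```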

Python Pandas when i add a column in an existing dataframe my new column is not correct

You can directly assign a list as a column of the DataFrame if the length of the list is the same as the length of the DataFrame and the values are in the required order.

islam_values = [
1.307603e+08,
2.941211e+08,
3.440720e+08,
4.351231e+08,
5.146341e+08,
5.923423e+08,
6.636743e+08,
6.471395e+08,
7.457716e+08,
9.986003e+08,
1.153186e+09,
1.314048e+09,
1.426454e+09,
1.555483e+09,
]

df = pd.DataFrame({'year': list(range(1945, 2011, 5))})
df["islam"] = islam_values

Output

    year    islam
0 1945 1.307603e+08
1 1950 2.941211e+08
2 1955 3.440720e+08
3 1960 4.351231e+08
4 1965 5.146341e+08
5 1970 5.923423e+08
6 1975 6.636743e+08
7 1980 6.471395e+08
8 1985 7.457716e+08
9 1990 9.986003e+08
10 1995 1.153186e+09
11 2000 1.314048e+09
12 2005 1.426454e+09
13 2010 1.555483e+09
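For contrast, here is a sketch of what happens when the lengths do not match. The list too_short is a made-up example; the frame is the same 14-row frame of years as above.

```python
import pandas as pd

df = pd.DataFrame({"year": list(range(1945, 2011, 5))})  # 14 rows
too_short = [1.0, 2.0, 3.0]  # hypothetical list with the wrong length

# A plain list of the wrong length is rejected.
try:
    df["islam"] = too_short
    error = None
except ValueError as exc:
    error = exc

print(type(error).__name__)  # ValueError

# A Series is aligned on the index instead, so missing labels become NaN.
df["islam"] = pd.Series(too_short)
print(df["islam"].notna().sum())  # 3
```

The silent NaN fill in the Series case is a frequent cause of a new column that "is not correct", so it is worth checking the index of whatever is being assigned.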

add a new column in dataframe

Set the new column to a scalar by selecting the first value of the one-element list:

activity = ["sitting"]
a['activity'] = activity[0]

Or remove the [] so that the value is already a scalar:

activity = "sitting"
a['activity'] = activity

EDIT:

Use DataFrame.insert to create the new column on the left side, at position 0:

a.insert(0, 'activity', activity)

which is the same as:

a.insert(0, 'activity', "sitting")
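A self-contained sketch of the insert call, assuming a small hypothetical frame named a. Note that insert modifies the frame in place and returns None.

```python
import pandas as pd

a = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})  # hypothetical frame

# A scalar is broadcast to every row; position 0 makes it the first column.
a.insert(0, "activity", "sitting")

print(list(a.columns))      # ['activity', 'x', 'y']
print(list(a["activity"]))  # ['sitting', 'sitting', 'sitting']
```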

adding a new column to existing dataframe and fill with numpy array

Code from https://www.geeksforgeeks.org/adding-new-column-to-existing-dataframe-in-pandas/

# Import pandas package
import pandas as pd

# Define a dictionary containing data
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Height': [5.1, 6.2, 5.1, 5.2]
        }

# Convert the dictionary into DataFrame
original_df = pd.DataFrame(data)

# Using 'Qualification' as the column name and equating it to the list
altered_df = original_df.assign(Qualification = ['Msc', 'MA', 'Msc', 'Msc'])

# Observe the result
altered_df
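The example above fills the column from a list. Since the question asks about a NumPy array, here is a sketch of the same idea with an array; the Weight column and its values are invented for illustration.

```python
import numpy as np
import pandas as pd

original_df = pd.DataFrame({"Name": ["Jai", "Princi", "Gaurav", "Anuj"],
                            "Height": [5.1, 6.2, 5.1, 5.2]})

# Hypothetical NumPy array; its length must match the number of rows.
weights = np.array([60.5, 72.0, 58.3, 65.1])

# Either form works: assign returns a new frame,
# while bracket assignment edits the frame in place.
altered_df = original_df.assign(Weight=weights)
original_df["Weight"] = weights

print(altered_df["Weight"].tolist())  # [60.5, 72.0, 58.3, 65.1]
print(altered_df["Weight"].dtype)     # float64
```

Because an array carries no index, the values are placed by position, so no alignment surprises arise here.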

