How to add a new column to an existing DataFrame?
Edit 2017
As indicated in the comments and by @Alexander, the best method currently for adding the values of a Series as a new column of a DataFrame is probably assign:
df1 = df1.assign(e=pd.Series(np.random.randn(sLength)).values)
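As a minimal, self-contained sketch of the assign approach (the frame below is made up for illustration; only the names df1, sLength and the column e come from the answer), the following shows that assign returns a new frame and that .values places the numbers by position rather than by index label:

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame with a non-default index, similar in shape to df1.
df1 = pd.DataFrame({"a": [0.1, 0.2], "b": [1.0, 2.0]}, index=[6, 8])
sLength = len(df1["a"])

# .values hands assign a plain NumPy array, so the numbers are placed by
# position and the Series' default 0..n-1 index plays no role.
new_values = pd.Series(np.random.randn(sLength)).values
result = df1.assign(e=new_values)

print(result.columns.tolist())  # columns of the new frame
print(df1.columns.tolist())     # the original frame is left unchanged
```

Rebinding the name (df1 = df1.assign(...)) is what makes the new column visible under df1.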
Edit 2015
Some reported getting the SettingWithCopyWarning with this code.
However, the code still runs perfectly with the current pandas version 0.16.1.
>>> sLength = len(df1['a'])
>>> df1
a b c d
6 -0.269221 -0.026476 0.997517 1.294385
8 0.917438 0.847941 0.034235 -0.448948
>>> df1['e'] = pd.Series(np.random.randn(sLength), index=df1.index)
>>> df1
a b c d e
6 -0.269221 -0.026476 0.997517 1.294385 1.757167
8 0.917438 0.847941 0.034235 -0.448948 2.228131
>>> pd.version.short_version
'0.16.1'
The SettingWithCopyWarning aims to inform you of a possibly invalid assignment on a copy of the DataFrame. It doesn't necessarily mean you did it wrong (it can trigger false positives), but since 0.13.0 it lets you know that there are more adequate methods for the same purpose. So if you get the warning, just follow its advice: Try using .loc[row_index,col_indexer] = value instead
>>> df1.loc[:,'f'] = pd.Series(np.random.randn(sLength), index=df1.index)
>>> df1
a b c d e f
6 -0.269221 -0.026476 0.997517 1.294385 1.757167 -0.050927
8 0.917438 0.847941 0.034235 -0.448948 2.228131 0.006109
>>>
In fact, this is currently the more efficient method, as described in the pandas docs.
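The warning typically shows up when the frame being modified was itself derived from another frame by filtering or slicing. One common way to make the intent explicit, sketched here on made-up data, is to take a copy first and then add the column with .loc. This is an illustration under those assumptions, not the only remedy:

```python
import numpy as np
import pandas as pd

# Hypothetical parent frame; names are illustrative only.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})

# Filtering returns a new object; .copy() states that we want an
# independent frame rather than something tied to df.
subset = df[df["a"] > 2].copy()

# Add the new column through .loc, as the warning message suggests.
subset.loc[:, "f"] = np.arange(len(subset))

print(subset)
print(df.columns.tolist())  # the parent frame has no column 'f'
```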
Original answer:
Use the original df1 indexes to create the series:
df1['e'] = pd.Series(np.random.randn(sLength), index=df1.index)
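Why the index matters: when a Series is assigned as a column, pandas aligns it on index labels, not on position. A small sketch with a made-up frame (the column names misaligned and aligned are hypothetical) shows the difference:

```python
import numpy as np
import pandas as pd

# Frame with index labels 6 and 8, like df1 in the answer.
df1 = pd.DataFrame({"a": [1.0, 2.0]}, index=[6, 8])
values = np.array([10.0, 20.0])

# Default index is 0, 1: no label matches 6 or 8, so the column is all NaN.
df1["misaligned"] = pd.Series(values)

# Reusing df1.index makes the labels match, so the values land where expected.
df1["aligned"] = pd.Series(values, index=df1.index)

print(df1)
```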
Inserting a new Column in existing pandas dataframe
It's not working because you already have a column with that name. If you are OK with having duplicate columns, you can pass allow_duplicates=True:
df.insert(len(df.columns),"Trigger_Type", cat_1, allow_duplicates=True)
Otherwise, you will have to rename the column to something else.
If you want to completely replace the column, you can also use:
df['Trigger_Type'] = cat_1
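The three behaviours can be compared side by side. This is a sketch on a made-up frame; only the names Trigger_Type and cat_1 come from the question, the data is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3], "Trigger_Type": ["a", "b", "c"]})
cat_1 = ["x", "y", "z"]

# 1) insert refuses to add a column whose name already exists.
error_message = None
try:
    df.insert(len(df.columns), "Trigger_Type", cat_1)
except ValueError as exc:
    error_message = str(exc)

# 2) allow_duplicates=True appends a second column with the same name.
with_duplicate = df.copy()
with_duplicate.insert(len(with_duplicate.columns), "Trigger_Type", cat_1,
                      allow_duplicates=True)

# 3) Plain assignment overwrites the existing column.
df["Trigger_Type"] = cat_1

print(error_message)
print(with_duplicate.columns.tolist())
print(df)
```

Duplicate column names make later selection by name return a DataFrame instead of a Series, so renaming is usually the less surprising option.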
How to assign new column to existing DataFrame in pandas
The pandas assign method returns a new, modified DataFrame with the new column; it does not modify the original in place.
import pandas as pd
df = pd.DataFrame(data = {"test":["mkt1","mkt2","mkt3"],
"test2":["cty1","cty2","cty3"]})
print("Before",df.columns)
df = df.assign(test3="Hello") # <--- Note the variable reassignment
print("After",df.columns)
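To see what happens when the return value is not bound back to df, here is a short sketch using the same data. The names unassigned and combined are hypothetical and only serve the illustration; the lambda form of assign computes a column from existing ones:

```python
import pandas as pd

df = pd.DataFrame(data={"test": ["mkt1", "mkt2", "mkt3"],
                        "test2": ["cty1", "cty2", "cty3"]})

# The result is stored under another name, so df itself stays as it was.
unassigned = df.assign(test3="Hello")
columns_before = df.columns.tolist()

# Rebinding df keeps the new columns; a callable receives the frame and
# can derive a column from the existing ones.
df = df.assign(test3="Hello",
               combined=lambda d: d["test"] + "-" + d["test2"])

print(columns_before)
print(df.columns.tolist())
```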
Adding a new column to DataFrame with different values in different row
I believe what you are looking for is actually answered here: Add column in dataframe from list
my_list = [1, 2, 3, 4, 5]  # must contain exactly len(df) values
print(len(df.columns)) # e.g. 50
df['new_col'] = my_list
print(len(df.columns)) # one more than before, e.g. 51
Alternatively, you could set a default value for the whole column and then overwrite slices of rows like so:
data['new_col'] = 1
data.loc[2880:5760, 'new_col'] = 0
data.loc[12960:15840, 'new_col'] = 0
data.loc[23040:25920, 'new_col'] = 0
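One detail worth knowing about this pattern: label-based slices in .loc include both endpoints. A scaled-down sketch with a made-up ten-row frame (the column x is hypothetical):

```python
import pandas as pd

# Ten rows with the default integer index 0..9.
data = pd.DataFrame({"x": range(10)})

data["new_col"] = 1              # default value for every row
data.loc[2:4, "new_col"] = 0     # labels 2, 3 AND 4 are overwritten

print(data["new_col"].tolist())
```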
Python Pandas when i add a column in an existing dataframe my new column is not correct
You can directly assign a list as a column of the DataFrame if the length of the list is the same as the length of the DataFrame and the values are in the required order:
islam_values = [
1.307603e+08,
2.941211e+08,
3.440720e+08,
4.351231e+08,
5.146341e+08,
5.923423e+08,
6.636743e+08,
6.471395e+08,
7.457716e+08,
9.986003e+08,
1.153186e+09,
1.314048e+09,
1.426454e+09,
1.555483e+09,
]
df = pd.DataFrame({'year': list(range(1945, 2011, 5))})
df["islam"] = islam_values
Output
year islam
0 1945 1.307603e+08
1 1950 2.941211e+08
2 1955 3.440720e+08
3 1960 4.351231e+08
4 1965 5.146341e+08
5 1970 5.923423e+08
6 1975 6.636743e+08
7 1980 6.471395e+08
8 1985 7.457716e+08
9 1990 9.986003e+08
10 1995 1.153186e+09
11 2000 1.314048e+09
12 2005 1.426454e+09
13 2010 1.555483e+09
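Both conditions can be checked with a short sketch. It assumes a three-row excerpt of the data above; the dictionary values_by_year and the column name too_short are hypothetical helpers. A list of the wrong length is rejected, and when the order of the values is not guaranteed, mapping them by key is one way to avoid relying on position:

```python
import pandas as pd

df = pd.DataFrame({"year": [1945, 1950, 1955]})

# A list whose length differs from len(df) raises a ValueError.
length_error = None
try:
    df["too_short"] = [1.0, 2.0]
except ValueError as exc:
    length_error = str(exc)

# Values keyed by year, deliberately out of order.
values_by_year = {1955: 3.440720e+08, 1945: 1.307603e+08, 1950: 2.941211e+08}

# map looks each year up in the dict, so the input order does not matter.
df["islam"] = df["year"].map(values_by_year)

print(length_error)
print(df)
```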
add a new column in dataframe
Set the new column to a scalar by selecting the first value of the one-element list:
activity = ["sitting"]
a['activity'] = activity[0]
Or remove the [] to get a scalar:
activity = "sitting"
a['activity'] = activity
EDIT:
Use DataFrame.insert to create the new column on the left side, at position 0:
a.insert(0, 'activity', activity)
which is the same as:
a.insert(0, 'activity', "sitting")
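A runnable sketch of the insert variant, assuming a small made-up frame a (its columns x and y are hypothetical). The scalar is broadcast to every row, and insert modifies the frame in place:

```python
import pandas as pd

a = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})
activity = "sitting"

# Position 0 puts the new column first; insert returns None and changes a.
a.insert(0, "activity", activity)

print(a)
```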
adding a new column to existing dataframe and fill with numpy array
Code from https://www.geeksforgeeks.org/adding-new-column-to-existing-dataframe-in-pandas/
# Import pandas package
import pandas as pd
# Define a dictionary containing data
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Height': [5.1, 6.2, 5.1, 5.2]
        }
# Convert the dictionary into a DataFrame
original_df = pd.DataFrame(data)
# Use 'Qualification' as the column name and set it equal to the list
altered_df = original_df.assign(Qualification = ['Msc', 'MA', 'Msc', 'Msc'])
# Observe the result
altered_df
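Since the question asks about filling the column from a NumPy array, the same call works with an array in place of the list, as long as it has one value per row. A sketch under that assumption (the column name Score and its values are hypothetical):

```python
import numpy as np
import pandas as pd

original_df = pd.DataFrame({"Name": ["Jai", "Princi", "Gaurav", "Anuj"],
                            "Height": [5.1, 6.2, 5.1, 5.2]})

# One array element per row; the values are placed by position.
scores = np.linspace(0.0, 1.0, num=len(original_df))
altered_df = original_df.assign(Score=scores)

print(altered_df)
```

Plain assignment, original_df["Score"] = scores, would do the same thing in place instead of returning a new frame.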