How to set the value of a pandas column as list
This is not straightforward. One possible solution is to create a helper Series, which pandas aligns to the selected rows by index:
df.loc[df.col1 == 1, 'new_col'] = pd.Series([['a', 'b']] * len(df))
print (df)
   col1  col2 new_col
0     1     4  [a, b]
1     2     5     NaN
2     3     6     NaN
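The helper Series above relies on the DataFrame having the default 0..n-1 index. A minimal sketch of the same idea for an arbitrary index follows; the labels 'x', 'y', 'z' and the names mask and helper are made up for illustration. The helper is indexed only by the matching rows, so every other row comes out as NaN after alignment.

```python
import pandas as pd

# Hypothetical frame with a non-default index
df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]}, index=['x', 'y', 'z'])

mask = df['col1'] == 1

# One fresh list per matching row, labelled with the matching index values
helper = pd.Series(
    [['a', 'b'] for _ in range(mask.sum())],
    index=df.index[mask],
    dtype=object,
)

# Column assignment aligns on the index; rows missing from helper become NaN
df['new_col'] = helper
print(df)
```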
Another solution, if you also need the non-matching rows to hold an empty list, is to use a list comprehension:
#df['new_col'] = [['a', 'b'] if x == 1 else np.nan for x in df['col1']]
df['new_col'] = [['a', 'b'] if x == 1 else [] for x in df['col1']]
print (df)
   col1  col2 new_col
0     1     4  [a, b]
1     2     5      []
2     3     6      []
But a column of lists has object dtype, so you lose the vectorised functionality which goes with using NumPy arrays held in contiguous memory blocks.
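To make that cost concrete, here is a small sketch (the names has_a and lengths are only illustrative): once a column holds lists, questions about its contents have to be answered element by element in Python, for example with apply or map.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]})
df['new_col'] = [['a', 'b'] if x == 1 else [] for x in df['col1']]

# Each check runs a Python function per row; nothing here is vectorised
has_a = df['new_col'].apply(lambda items: 'a' in items)
lengths = df['new_col'].map(len)

print(df[has_a])
print(lengths)
```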
set list as value in a column of a pandas dataframe
You'd have to do:
df['new_col'] = [my_list] * len(df)
Example:
In [13]:
df = pd.DataFrame(np.random.randn(5,3), columns=list('abc'))
df
Out[13]:
          a         b         c
0 -0.010414  1.859791  0.184692
1 -0.818050 -0.287306 -1.390080
2 -0.054434  0.106212  1.542137
3 -0.226433  0.390355  0.437592
4 -0.204653 -2.388690  0.106218
In [17]:
df['b'] = [[234]] * len(df)
df
Out[17]:
          a      b         c
0 -0.010414  [234]  0.184692
1 -0.818050  [234] -1.390080
2 -0.054434  [234]  0.437592
3 -0.226433  [234]  0.437592
4 -0.204653  [234]  0.106218
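One thing to watch with [my_list] * len(df): it repeats a reference to the same list object, so mutating one cell in place changes every row. A minimal sketch, with made-up column names shared and independent, comparing it against building a fresh copy per row:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3]})
my_list = [234]

# Every row points at the very same list object
df['shared'] = [my_list] * len(df)

# Every row gets its own copy of the list
df['independent'] = [list(my_list) for _ in range(len(df))]

# Mutate the list stored in row 0 of each column
df.at[0, 'shared'].append(999)
df.at[0, 'independent'].append(999)

print(df)
```

If the cells are only ever read, or replaced wholesale, the shared form is fine; the per-row copy matters only when the lists are modified in place.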
Note that DataFrames are optimised for scalar values. In my opinion, storing non-scalar values defeats the point: filtering, looking up, getting and setting all become problematic, to the point that it becomes a pain.
Pandas replace column values with a list
This should do it for you:
# Find the name of the column by index
n = df.columns[1]
# Drop that column
df.drop(n, axis = 1, inplace = True)
# Put whatever series you want in its place
df[n] = newCol
...where [1] can be whatever the index is; axis = 1 should not change.
This answers your question very literally, where you asked to drop a column and then add one back in. But the reality is that there is no need to drop the column if you just replace it with newCol.
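A short sketch of the difference, using a made-up three-column frame and a hypothetical new_col: assigning straight to the existing name keeps the column where it was, while dropping first and re-adding moves it to the end.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})
new_col = pd.Series([30, 40])  # hypothetical replacement values

# Find the name of the column by position
n = df.columns[1]

# Drop and re-add: the column ends up last
df_dropped = df.drop(columns=n)
df_dropped[n] = new_col

# Direct replacement: the column keeps its position
df[n] = new_col

print(list(df_dropped.columns))
print(list(df.columns))
```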
Set value to an entire column of a pandas dataframe
Python can do unexpected things when new objects are defined from existing ones. You stated in a comment above that your dataframe is defined along the lines of df = df_all.loc[df_all['issueid']==specific_id,:]. In this case, pandas cannot guarantee that df is independent of df_all: it may be treated as a stand-in for the rows stored in the df_all object rather than as a new object in memory, and that is what triggers the warning.
To avoid these issues altogether, I often have to remind myself to use the copy module, which explicitly forces objects to be copied in memory so that methods called on the new objects are not applied to the source object. I had the same problem as you, and avoided it using the deepcopy function.
In your case, this should get rid of the warning message:
from copy import deepcopy
df = deepcopy(df_all.loc[df_all['issueid']==specific_id,:])
df['industry'] = 'yyy'
EDIT: Also see David M.'s excellent comment below! The DataFrame's own copy method does the same job without the extra import:
df = df_all.loc[df_all['issueid']==specific_id,:].copy()
df['industry'] = 'yyy'
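A self-contained sketch of the .copy() approach, with invented data for df_all and specific_id, showing that the source frame is left untouched after the assignment:

```python
import pandas as pd

# Hypothetical source data
df_all = pd.DataFrame({
    'issueid': [1, 1, 2],
    'industry': ['aaa', 'bbb', 'ccc'],
})
specific_id = 1

# Take an explicit, independent copy of the filtered rows
df = df_all.loc[df_all['issueid'] == specific_id, :].copy()

# Set the whole column on the copy
df['industry'] = 'yyy'

print(df)
print(df_all)  # unchanged
```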
How to set values based on a list in Pandas (python)
Your question still doesn't seem to have enough information to find the real problem. This quick example shows that your attempt can work just fine:
import pandas as pd
df = pd.DataFrame({'x': [4, 5, 6], 'month': [1, 2, 3]})
some_list = [2, 3]
df[df['month'].isin(some_list)] = 99
df
Out[13]:
   month   x
0      1   4
1     99  99
2     99  99
...suggesting that your problem is more likely because you've mixed up the types of your variables. Currently the only thing I can suggest is only doing the assignment to specific columns, as you may be trying to assign an int value to a datetime column or something, e.g.:
df = pd.DataFrame({'x': [4, 5, 6], 'month': [1, 2, 3]})
some_list = [2, 3]
df.loc[df['month'].isin(some_list), 'x'] = 99
df
Out[14]:
   month   x
0      1   4
1      2  99
2      3  99
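One common kind of type mix-up is a list of strings tested against an integer column. The sketch below is only an illustration of that possibility, not a diagnosis of the original problem; the names list_as_text, mismatched and matched are made up.

```python
import pandas as pd

df = pd.DataFrame({'x': [4, 5, 6], 'month': [1, 2, 3]})

# Hypothetical list that arrived as text, e.g. from user input
list_as_text = ['2', '3']

# String values never equal the integers in the column, so nothing matches
mismatched = df['month'].isin(list_as_text)

# Converting to the column's type first gives the intended mask
matched = df['month'].isin([int(m) for m in list_as_text])

print(df.dtypes)
df.loc[matched, 'x'] = 99
print(df)
```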
Change column value based on list pandas python
I would suggest using np.where and isin for your problem, like so:
order_data['churn in 2019?'] = np.where(order_data['Customer_id'].isin(churn_customers_2019), 'Y', 'N')
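For a runnable version of that line, here is a sketch with invented customer IDs; only the column names and the churn_customers_2019 variable come from the snippet above.

```python
import numpy as np
import pandas as pd

# Hypothetical data
order_data = pd.DataFrame({'Customer_id': [101, 102, 103, 104]})
churn_customers_2019 = [102, 104]

# 'Y' where the customer is in the list, 'N' everywhere else
order_data['churn in 2019?'] = np.where(
    order_data['Customer_id'].isin(churn_customers_2019), 'Y', 'N'
)

print(order_data)
```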
Pandas/Python: Set value of one column based on value in another column
One way to do this would be to use indexing with .loc.
Example
In the absence of an example dataframe, I'll make one up here:
import numpy as np
import pandas as pd
df = pd.DataFrame({'c1': list('abcdefg')})
df.loc[5, 'c1'] = 'Value'
>>> df
      c1
0      a
1      b
2      c
3      d
4      e
5  Value
6      g
Assuming you wanted to create a new column c2, equivalent to c1 except where c1 is Value, in which case you would like to assign it to 10:
First, you could create a new column c2, and set it equal to c1, using one of the following two lines (they essentially do the same thing):
df = df.assign(c2 = df['c1'])
# OR:
df['c2'] = df['c1']
Then, find all the indices where c1 is equal to 'Value' using .loc, and assign your desired value in c2 at those indices:
df.loc[df['c1'] == 'Value', 'c2'] = 10
And you end up with this:
>>> df
      c1  c2
0      a   a
1      b   b
2      c   c
3      d   d
4      e   e
5  Value  10
6      g   g
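The copy-then-overwrite steps can also be collapsed into a single expression. This sketch swaps in np.where for the .loc assignment, so it is an alternative rather than the method described above, and the four-row frame is invented for brevity.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'c1': ['a', 'b', 'Value', 'd']})

# 10 where c1 equals 'Value', otherwise the original c1 entry
df['c2'] = np.where(df['c1'] == 'Value', 10, df['c1'])

print(df)
```

The result mixes strings and an integer, so c2 ends up with object dtype either way.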
If, as you suggested in your question, you would perhaps sometimes just want to replace the values in the column you already have, rather than create a new column, then just skip the column creation, and do the following:
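# Avoid the chained form df['c1'].loc[df['c1'] == 'Value'] = 10:
# it may write to a temporary copy and leave df unchanged.
# Use a single .loc call on the DataFrame instead:
df.loc[df['c1'] == 'Value', 'c1'] = 10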
Giving you:
>>> df
   c1
0   a
1   b
2   c
3   d
4   e
5  10
6   g
python pandas: set a value of column based on another value of a column in a list
You need isin to build the mask:
lst_n = ['rv', 'ag', 'rg']
df.loc[df['n'].isin(lst_n), 'class'] = 'class_a'
print (df)
                       f1        f2    class        n
0           weekly_return  0.155796       ab   weekly
1          monthly_return  0.153907       ab  monthly
2            volume_ratio  0.123844      NaN   volume
3  margin_selling_balance  0.115411       ad   margin
4     margin_debt_balance  0.107883       ae   margin
5                rv_ratio  0.077373  class_a       rv
Another solution, with Series.mask:
df['class'] = df['class'].mask(df.n.isin(lst_n), 'class_a')
print (df)
                       f1        f2    class        n
0           weekly_return  0.155796       ab   weekly
1          monthly_return  0.153907       ab  monthly
2            volume_ratio  0.123844      NaN   volume
3  margin_selling_balance  0.115411       ad   margin
4     margin_debt_balance  0.107883       ae   margin
5                rv_ratio  0.077373  class_a       rv
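Both approaches can be tried side by side on a small stand-in frame. The data below is invented (only the column names n and class and the list lst_n follow the answer), and via_loc and via_mask are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    'n': ['weekly', 'rv', 'margin', 'ag'],
    'class': ['ab', 'ac', 'ad', 'ae'],
})
lst_n = ['rv', 'ag', 'rg']

# Option 1: boolean mask with .loc, modifying a copy in place
via_loc = df.copy()
via_loc.loc[via_loc['n'].isin(lst_n), 'class'] = 'class_a'

# Option 2: Series.mask returns a new Series with masked rows replaced
via_mask = df['class'].mask(df['n'].isin(lst_n), 'class_a')

print(via_loc)
print(via_mask)
```

The .loc form changes the frame in place, while Series.mask leaves the original column alone until the result is assigned back.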