How to Set the Value of a Pandas Column as List

How to set the value of a pandas column as list

This is not straightforward. One possible solution is to create a helper Series:

df.loc[df.col1 == 1, 'new_col'] = pd.Series([['a', 'b']] * len(df))
print (df)
   col1  col2 new_col
0     1     4  [a, b]
1     2     5     NaN
2     3     6     NaN
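A minimal runnable sketch of the same idea, assuming a small sample frame like the one printed above (the data here is made up). As a variant, the helper Series is built only for the matching rows and labelled with their index values, so plain column assignment can align it:

```python
import pandas as pd

# Hypothetical sample data matching the printed frame above.
df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]})

mask = df['col1'] == 1
# One fresh list per matching row, labelled with those rows' index values.
helper = pd.Series([['a', 'b'] for _ in range(int(mask.sum()))],
                   index=df.index[mask])

# Column assignment aligns on the index; rows absent from helper become NaN.
df['new_col'] = helper
print(df)
```

Because the alignment is by index label, this should also work when the index is not a simple 0..n-1 range.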

Another solution, if you also need to set the non-matching rows to an empty list, is to use a list comprehension:

#df['new_col'] = [['a', 'b'] if x == 1 else np.nan for x in df['col1']]

df['new_col'] = [['a', 'b'] if x == 1 else [] for x in df['col1']]
print (df)
   col1  col2 new_col
0     1     4  [a, b]
1     2     5      []
2     3     6      []

But then you lose the vectorised functionality which goes with using NumPy arrays held in contiguous memory blocks.

set list as value in a column of a pandas dataframe

You'd have to do:

df['new_col'] = [my_list] * len(df)

Example:

In [13]:
df = pd.DataFrame(np.random.randn(5,3), columns=list('abc'))
df

Out[13]:
          a         b         c
0 -0.010414  1.859791  0.184692
1 -0.818050 -0.287306 -1.390080
2 -0.054434  0.106212  1.542137
3 -0.226433  0.390355  0.437592
4 -0.204653 -2.388690  0.106218

In [17]:
df['b'] = [[234]] * len(df)
df

Out[17]:
          a      b         c
0 -0.010414  [234]  0.184692
1 -0.818050  [234] -1.390080
2 -0.054434  [234]  1.542137
3 -0.226433  [234]  0.437592
4 -0.204653  [234]  0.106218

Note that DataFrames are optimised for scalar values. Storing non-scalar values defeats the point, in my opinion, because filtering, looking up, getting and setting all become problematic, to the point of being a pain.
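One caveat worth knowing about the [my_list] * len(df) pattern: list multiplication repeats a reference to the same inner list, so, as far as I can tell, every row ends up pointing at one shared object. A small sketch (the column names are made up for illustration) comparing it with a comprehension that builds a separate list per row:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3]})

# [[234]] * len(df) repeats a reference to the same inner list.
df['shared'] = [[234]] * len(df)
# A comprehension builds a separate list for each row.
df['separate'] = [[234] for _ in range(len(df))]

df['shared'].iloc[0].append(0)    # mutates the one list every row points to
df['separate'].iloc[0].append(0)  # mutates only row 0's list
print(df)
```

If the lists will never be mutated in place, the shorter multiplication form is fine; otherwise the comprehension is the safer choice.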

Pandas replace column values with a list

This should do it for you:

# Find the name of the column by index
n = df.columns[1]

# Drop that column
df.drop(n, axis = 1, inplace = True)

# Put whatever series you want in its place
df[n] = newCol

...where [1] can be whatever the column's position is; axis = 1 should not change.

This answers your question very literally, since you asked to drop a column and then add one back in. In reality, there is no need to drop the column first: assigning newCol to the existing column name replaces it.
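To illustrate the difference, here is a small sketch with made-up data (new_col and the frame contents are hypothetical). Dropping and re-adding moves the column to the end, while direct assignment keeps it in place:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})
new_col = [['x'], ['y']]  # hypothetical replacement values, one list per row

n = df.columns[1]

# Drop-then-add: the column comes back as the last column.
dropped = df.drop(columns=n)
dropped[n] = new_col
print(list(dropped.columns))

# Direct assignment: the column is overwritten where it already sits.
df[n] = new_col
print(list(df.columns))
```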

Set value to an entire column of a pandas dataframe

pandas can do unexpected things when new objects are derived from existing ones. You stated in a comment above that your dataframe is defined along the lines of df = df_all.loc[df_all['issueid']==specific_id,:]. In this case, df may be just a stand-in for the rows stored in the df_all object: pandas does not guarantee that a new, independent object is created in memory, which is what the warning is about.

To avoid these issues altogether, I often have to remind myself to use the copy module, which explicitly forces objects to be copied in memory so that methods called on the new objects are not applied to the source object. I had the same problem as you, and avoided it using the deepcopy function.

In your case, this should get rid of the warning message:

from copy import deepcopy
df = deepcopy(df_all.loc[df_all['issueid']==specific_id,:])
df['industry'] = 'yyy'

EDIT: Also see David M.'s excellent comment, which suggests using the DataFrame's own copy method instead:

df = df_all.loc[df_all['issueid']==specific_id,:].copy()
df['industry'] = 'yyy'
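A self-contained sketch of the .copy() approach, using made-up data for df_all and specific_id, to show that the source frame is left untouched:

```python
import pandas as pd

# Hypothetical source data; only the column names come from the question.
df_all = pd.DataFrame({'issueid': [1, 1, 2],
                       'industry': ['a', 'b', 'c']})
specific_id = 1

# .copy() makes df an independent object, so the assignment below
# cannot leak back into df_all.
df = df_all.loc[df_all['issueid'] == specific_id, :].copy()
df['industry'] = 'yyy'

print(df)
print(df_all)
```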

How to set values based on a list in Pandas (python)

Your question still doesn't seem to have enough information to find the real problem. This quick example shows that your attempt can work just fine:

import pandas as pd
df = pd.DataFrame({'x': [4, 5, 6], 'month': [1, 2, 3]})
some_list = [2, 3]
df[df['month'].isin(some_list)] = 99
df
Out[13]:
   month   x
0      1   4
1     99  99
2     99  99

...which suggests that your problem is more likely caused by mixed-up variable types. For now, the only thing I can suggest is restricting the assignment to specific columns, as you may be trying to assign an int value to a datetime column or similar, e.g.:

df = pd.DataFrame({'x': [4, 5, 6], 'month': [1, 2, 3]})
some_list = [2, 3]
df.loc[df['month'].isin(some_list), 'x'] = 99
df
Out[14]:
   month   x
0      1   4
1      2  99
2      3  99

Change column value based on list pandas python

I would suggest using np.where and isin for your problem, like so:

order_data['churn in 2019?'] = np.where(order_data['Customer_id'].isin(churn_customers_2019), 'Y', 'N')
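A runnable version with made-up data (the customer ids below are hypothetical; only the column and variable names come from the line above):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
order_data = pd.DataFrame({'Customer_id': [101, 102, 103, 104]})
churn_customers_2019 = [102, 104]

# isin builds a boolean mask; np.where picks 'Y' where it is True, else 'N'.
order_data['churn in 2019?'] = np.where(
    order_data['Customer_id'].isin(churn_customers_2019), 'Y', 'N')
print(order_data)
```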

Pandas/Python: Set value of one column based on value in another column

One way to do this would be to use indexing with .loc.

Example

In the absence of an example dataframe, I'll make one up here:

import numpy as np
import pandas as pd

df = pd.DataFrame({'c1': list('abcdefg')})
df.loc[5, 'c1'] = 'Value'

>>> df
      c1
0      a
1      b
2      c
3      d
4      e
5  Value
6      g

Suppose you want to create a new column c2 that is equal to c1, except where c1 is 'Value', in which case it should be set to 10.

First, create the new column c2 as a copy of c1, using one of the following two lines (they essentially do the same thing):

df = df.assign(c2 = df['c1'])
# OR:
df['c2'] = df['c1']

Then, select the rows where c1 is equal to 'Value' with a boolean mask inside .loc, and assign your desired value to c2 in those rows:

df.loc[df['c1'] == 'Value', 'c2'] = 10

And you end up with this:

>>> df
      c1  c2
0      a   a
1      b   b
2      c   c
3      d   d
4      e   e
5  Value  10
6      g   g

If, as you suggested in your question, you sometimes just want to replace the values in the existing column rather than create a new one, skip the column creation and do the following:

# Avoid the chained form df['c1'].loc[df['c1'] == 'Value'] = 10: it may
# modify a temporary copy instead of df. Use a single .loc call instead:
df.loc[df['c1'] == 'Value', 'c1'] = 10

Giving you:

>>> df
   c1
0   a
1   b
2   c
3   d
4   e
5  10
6   g

python pandas: set a value of column based on another value of a column in a list

You need isin to build the mask:

lst_n = ['rv', 'ag', 'rg']
df.loc[df['n'].isin(lst_n), 'class'] = 'class_a'
print (df)
                       f1        f2    class        n
0           weekly_return  0.155796       ab   weekly
1          monthly_return  0.153907       ab  monthly
2            volume_ratio  0.123844      NaN   volume
3  margin_selling_balance  0.115411       ad   margin
4     margin_debt_balance  0.107883       ae   margin
5                rv_ratio  0.077373  class_a       rv

Another solution with Series.mask:

df['class'] = df['class'].mask(df.n.isin(lst_n), 'class_a')
print (df)
                       f1        f2    class        n
0           weekly_return  0.155796       ab   weekly
1          monthly_return  0.153907       ab  monthly
2            volume_ratio  0.123844      NaN   volume
3  margin_selling_balance  0.115411       ad   margin
4     margin_debt_balance  0.107883       ae   margin
5                rv_ratio  0.077373  class_a       rv
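For anyone who wants to run both variants side by side, here is a sketch on a cut-down, made-up version of the frame above (three rows, and the f2 column is left out). It only checks that the two approaches produce the same result:

```python
import numpy as np
import pandas as pd

# Hypothetical, shortened version of the frame shown above.
df = pd.DataFrame({
    'f1': ['weekly_return', 'volume_ratio', 'rv_ratio'],
    'class': ['ab', np.nan, np.nan],
    'n': ['weekly', 'volume', 'rv'],
})
lst_n = ['rv', 'ag', 'rg']

# Variant 1: boolean mask inside .loc
by_loc = df.copy()
by_loc.loc[by_loc['n'].isin(lst_n), 'class'] = 'class_a'

# Variant 2: Series.mask replaces values where the condition is True
by_mask = df.copy()
by_mask['class'] = by_mask['class'].mask(by_mask['n'].isin(lst_n), 'class_a')

print(by_loc)
print(by_loc.equals(by_mask))
```

Note that rows whose n is not in lst_n keep whatever class they had, including NaN.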

