Pandas/Python: Set Value of One Column Based on Value in Another Column

Change one value based on another value in pandas

One option is to use Python's slicing and indexing features to logically evaluate the places where your condition holds and overwrite the data there.

Assuming you can load your data directly into pandas with pandas.read_csv, the following code might be helpful:

import pandas
df = pandas.read_csv("test.csv")
df.loc[df.ID == 103, 'FirstName'] = "Matt"
df.loc[df.ID == 103, 'LastName'] = "Jones"

As mentioned in the comments, you can also do the assignment to both columns in one shot:

df.loc[df.ID == 103, ['FirstName', 'LastName']] = 'Matt', 'Jones'
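
If you don't have a test.csv at hand, here is a minimal self-contained sketch of the same .loc assignment. The sample rows are made up for illustration; only the column names ID, FirstName and LastName come from the code above, and mask is just a name chosen here.

```python
import pandas as pd

# Hypothetical stand-in for the contents of test.csv
df = pd.DataFrame({
    "ID": [101, 102, 103],
    "FirstName": ["Ann", "Bob", "Mat"],
    "LastName": ["Lee", "Ray", "Jnes"],
})

# Build the boolean mask once so it can be reused
mask = df["ID"] == 103

# Rows are selected by the mask, columns by label;
# each listed column receives the corresponding value
df.loc[mask, ["FirstName", "LastName"]] = ["Matt", "Jones"]

print(df)
```

Storing the mask in a variable keeps the condition in one place if several columns need to be updated in separate steps.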

Note that you'll need pandas version 0.11 or newer to use loc for overwrite assignment. In older versions such as 0.8, chained assignment was the correct way to do this, which is why it is still worth knowing about even though it should be avoided in modern versions of pandas.


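Another way to do it is to use what is called chained assignment. Its behavior is less predictable: depending on whether the first indexing step returns a view or a copy, the assignment may never reach df, and pandas may emit a SettingWithCopyWarning. It is therefore not considered the best solution (it is explicitly discouraged in the docs), but it is useful to know about: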

import pandas
df = pandas.read_csv("test.csv")
df['FirstName'][df.ID == 103] = "Matt"
df['LastName'][df.ID == 103] = "Jones"

Set value of one Pandas column based on value in another column

One way to do this would be to use indexing with .loc.

Example

In the absence of an example dataframe, I'll make one up here:

import numpy as np
import pandas as pd

df = pd.DataFrame({'c1': list('abcdefg')})
df.loc[5, 'c1'] = 'Value'

>>> df
      c1
0      a
1      b
2      c
3      d
4      e
5  Value
6      g

Suppose you want to create a new column c2 that is equal to c1, except where c1 is 'Value', in which case c2 should be 10.

First, create the new column c2 as a copy of c1, using either of the following two lines (they do essentially the same thing):

df = df.assign(c2 = df['c1'])
# OR:
df['c2'] = df['c1']

Then, find all the indices where c1 is equal to 'Value' using .loc, and assign your desired value in c2 at those indices:

df.loc[df['c1'] == 'Value', 'c2'] = 10

And you end up with this:

>>> df
      c1  c2
0      a   a
1      b   b
2      c   c
3      d   d
4      e   e
5  Value  10
6      g   g
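
Put together, the two steps can be sketched as one runnable snippet. The astype(object) cast is a precaution of this sketch, not part of the steps above: I'm assuming that some newer pandas versions store text in a dedicated string dtype that may reject a non-string value such as 10.

```python
import pandas as pd

# Same made-up frame as in the example above
df = pd.DataFrame({"c1": ["a", "b", "c", "d", "e", "Value", "g"]})

# Step 1: copy c1 into a new column c2 (c1 itself stays untouched).
# The cast to object is an assumption of this sketch so that c2
# can hold the integer 10 next to strings.
df["c2"] = df["c1"].astype(object)

# Step 2: overwrite c2 only in the rows where c1 equals 'Value'
df.loc[df["c1"] == "Value", "c2"] = 10

print(df)
```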

If, as you suggested in your question, you would sometimes rather replace the values in the column you already have than create a new column, skip the column creation and do the following:

# Preferred: a single .loc call on the DataFrame itself
df.loc[df['c1'] == 'Value', 'c1'] = 10
# Avoid df['c1'].loc[df['c1'] == 'Value'] = 10: that is chained assignment
# and may modify a temporary copy instead of df

Giving you:

>>> df
   c1
0   a
1   b
2   c
3   d
4   e
5  10
6   g

Assign one column value to another column based on condition in pandas

Based on the answers to this similar question, you can do the following:

  • Using np.where:

    df['column2'] = np.where((df['column2'] == 'Null') | (df['column2'] == 0), df['column1'], df['column2'])
  • Alternatively, using only pandas (a single .loc call rather than chained indexing, which may fail to update df):

    df.loc[(df['column2'] == 0) | (df['column2'] == 'Null'), 'column2'] = df['column1']
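
To see both options side by side, here is a small sketch on made-up data. The values and the names needs_fill and filled are assumptions for illustration; column2 is given object dtype so it can mix numbers with the string 'Null'.

```python
import numpy as np
import pandas as pd

# Made-up data; object dtype lets column2 hold numbers and the string 'Null'
df = pd.DataFrame({
    "column1": [10, 20, 30, 40],
    "column2": pd.Series([1, "Null", 0, 4], dtype=object),
})

# True where column2 holds one of the placeholder values
needs_fill = (df["column2"] == "Null") | (df["column2"] == 0)

# Option 1: np.where takes column1 where the mask is True, column2 elsewhere
filled = np.where(needs_fill, df["column1"], df["column2"])

# Option 2: one .loc assignment; the right-hand Series is aligned on the index,
# so only the masked rows receive their column1 value
df.loc[needs_fill, "column2"] = df["column1"]

print(list(filled))
print(df)
```

np.where builds a new array and leaves df alone until you assign the result, while the .loc version modifies column2 in place.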

Pandas: How to set values from another column based on conditions column-wise

Use indexing and map to replace letters:

df.iloc[:, 2:] = df.apply(lambda x: x[2:].map(x[:2]), axis=1)
print(df)

# Output:
   A  B  i  j  y  z
0  1  2  2  1  1  1
1  2  3  3  2  3  3

Setup:

df = pd.DataFrame({'A': [1, 2], 'B': [2, 3], 'i': ['B', 'B'],
                   'j': ['A', 'A'], 'y': ['A', 'B'], 'z': ['A', 'B']})
print(df)

# Output:
   A  B  i  j  y  z
0  1  2  B  A  A  A
1  2  3  B  A  B  B

Details:

The function is applied with axis=1, so x contains one whole row at each iteration.

The values of the remaining columns (x[2:] <- i, j, y, z) are looked up in the first two columns (x[:2] <- indexed by A, B), which act like a dictionary (a Series can be used as a lookup table; check the map method).

For the first iteration:

A    1  # <- index A
B    2  # <- index B
i    B  # <- value B
j    A  # <- value A
y    A  # <- value A
z    A  # <- value A
Name: 0, dtype: object
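
The same idea can be written with a named function, which may be easier to follow than the one-line lambda. This is a sketch of a variant, not the exact code above: lookup_row, mapped and result are names introduced here, .iloc is used to make the positional slicing explicit, and the mapped columns are assigned back by name instead of through df.iloc.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [2, 3], 'i': ['B', 'B'],
                   'j': ['A', 'A'], 'y': ['A', 'B'], 'z': ['A', 'B']})

def lookup_row(row):
    # The first two entries (labels 'A' and 'B') serve as the lookup table
    table = row.iloc[:2]
    # The remaining entries hold letters; map replaces each letter
    # with the value stored under that label in the table
    return row.iloc[2:].map(table)

# axis=1 passes one row at a time to lookup_row
mapped = df.apply(lookup_row, axis=1)

# Write the looked-up values into a copy, leaving df unchanged
result = df.copy()
result[mapped.columns] = mapped

print(result)
```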

Change Column value based on part of another column using pandas

Try loc assignment:

df.loc[pd.to_datetime(df['Time']).dt.hour == 2, 'Value'] = 30

Or:

df.loc[df['Time'].str[:2] == '02', 'Value'] = 30
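
Here is a small sketch of both conditions on made-up data. I'm assuming Time holds 'HH:MM:SS' strings, since the original frame isn't shown; the explicit format argument and the name hours are additions of this sketch.

```python
import pandas as pd

# Made-up data; Time is assumed to hold 'HH:MM:SS' strings
df = pd.DataFrame({
    "Time": ["01:30:00", "02:15:00", "02:45:00", "13:00:00"],
    "Value": [1, 2, 3, 4],
})

# Parse with an explicit format so the hour is read unambiguously
hours = pd.to_datetime(df["Time"], format="%H:%M:%S").dt.hour

# Equivalent mask based on the first two characters of the string
starts_with_02 = df["Time"].str[:2] == "02"

# Set Value to 30 for every row whose hour is 2
df.loc[hours == 2, "Value"] = 30

print(df)
```

The datetime version is more robust if the strings are not zero-padded, whereas the string slice avoids parsing altogether.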

Conditionally fill column values based on another columns value in pandas

You probably want to do

df['Normalized'] = np.where(df['Currency'] == '$', df['Budget'] * 0.78125, df['Budget'])
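
As a quick check, here is that line applied to a made-up frame. The budgets and the second currency symbol are invented for illustration; 0.78125 is the factor from the line above.

```python
import numpy as np
import pandas as pd

# Made-up budgets in two currencies
df = pd.DataFrame({
    "Currency": ["$", "£", "$"],
    "Budget": [100.0, 200.0, 64.0],
})

# Convert only the dollar rows; all other rows keep their Budget unchanged
is_dollar = df["Currency"] == "$"
df["Normalized"] = np.where(is_dollar, df["Budget"] * 0.78125, df["Budget"])

print(df)
```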

Extract column value based on another column in Pandas

You could use loc to get the Series of values that satisfy your condition, and then iloc to get its first element:

In [2]: df
Out[2]:
    A  B
0  p1  1
1  p1  2
2  p3  3
3  p2  4

In [3]: df.loc[df['B'] == 3, 'A']
Out[3]:
2    p3
Name: A, dtype: object

In [4]: df.loc[df['B'] == 3, 'A'].iloc[0]
Out[4]: 'p3'
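
One caveat: .iloc[0] raises an IndexError when no row matches. A minimal sketch of a guard, using a hypothetical helper first_match (the name and its default parameter are not from pandas):

```python
import pandas as pd

df = pd.DataFrame({"A": ["p1", "p1", "p3", "p2"], "B": [1, 2, 3, 4]})

def first_match(frame, where_col, where_val, take_col, default=None):
    """Return take_col from the first row where where_col == where_val."""
    matches = frame.loc[frame[where_col] == where_val, take_col]
    # An empty selection has no first element, so fall back to the default
    return matches.iloc[0] if not matches.empty else default

print(first_match(df, "B", 3, "A"))
print(first_match(df, "B", 99, "A"))
```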

