Set Value For Particular Cell in Pandas Dataframe Using Index

Set value for particular cell in pandas DataFrame using index

RukTech's answer, df.set_value('C', 'x', 10), is far and away faster than the options I've suggested below. However, set_value was deprecated in pandas 0.21 and removed in pandas 1.0.

Going forward, the recommended method is .iat/.at.


Why df.xs('C')['x']=10 does not work:

By default, df.xs('C') returns a new dataframe with a copy of the data, so

df.xs('C')['x']=10

modifies this new dataframe only.

df['x'] returns a view of the df dataframe, so

df['x']['C'] = 10

modifies df itself.

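Warning: It is sometimes difficult to predict whether an operation returns a copy or a view. For this reason the docs recommend avoiding assignments with "chained indexing". With Copy-on-Write enabled (the default from pandas 3.0), a chained assignment such as df['x']['C'] = 10 never modifies df.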


So the recommended alternative is

df.at['C', 'x'] = 10

which does modify df.
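As a minimal, self-contained sketch of the recommended accessors (the sample data below is made up for illustration and is not the dataframe from the question):

```python
import pandas as pd

# Hypothetical sample data with a unique string index.
df = pd.DataFrame({"x": [1, 4, 0], "y": [5, 6, 3]}, index=["A", "B", "C"])

# Label-based scalar assignment: row label 'C', column label 'x'.
df.at["C", "x"] = 10

# Position-based scalar assignment: first row, second column.
df.iat[0, 1] = 50

print(df)
```

Both calls write into df directly, so there is no copy-versus-view question to worry about.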


In [18]: %timeit df.set_value('C', 'x', 10)
100000 loops, best of 3: 2.9 µs per loop

In [20]: %timeit df['x']['C'] = 10
100000 loops, best of 3: 6.31 µs per loop

In [81]: %timeit df.at['C', 'x'] = 10
100000 loops, best of 3: 9.2 µs per loop
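The numbers above come from an old pandas version and an IPython session. To re-measure on your own install, something like the following sketch with the standard timeit module should work; the sample dataframe and the helper function names are assumptions, and set_value is left out because it no longer exists in current pandas:

```python
import timeit

import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"x": [1, 4, 0], "y": [5, 6, 3]}, index=["A", "B", "C"])


def set_with_at():
    df.at["C", "x"] = 10


def set_with_loc():
    df.loc["C", "x"] = 10


def set_with_iat():
    df.iat[2, 0] = 10


runs = 1000
timings = {
    func.__name__: timeit.timeit(func, number=runs) / runs * 1e6
    for func in (set_with_at, set_with_loc, set_with_iat)
}

for name, micros in timings.items():
    print(f"{name}: {micros:.2f} us per call")
```

Absolute timings depend heavily on the pandas version and the machine, so only the relative order is worth reading.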

Set value for particular cell in pandas DataFrame with iloc

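For mixed positional and label-based indexing, .ix used to be the usual choice, but it was deprecated in pandas 0.20 and removed in pandas 1.0. Even while it was available, you had to make sure the index was not integer-valued; otherwise it was ambiguous whether 0 meant a label or a position.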

df.ix[0, 'COL_NAME'] = x

Update:

Alternatively, try

df.iloc[0, df.columns.get_loc('COL_NAME')] = x

Example:

import pandas as pd
import numpy as np

# your data
# ========================
np.random.seed(0)
df = pd.DataFrame(np.random.randn(10, 2), columns=['col1', 'col2'], index=np.random.randint(1,100,10)).sort_index()

print(df)


       col1     col2
10   1.7641   0.4002
24   0.1440   1.4543
29   0.3131  -0.8541
32   0.9501  -0.1514
33   1.8676  -0.9773
36   0.7610   0.1217
56   1.4941  -0.2052
58   0.9787   2.2409
75  -0.1032   0.4106
76   0.4439   0.3337

# .iloc with get_loc
# ===================================
df.iloc[0, df.columns.get_loc('col2')] = 100

df

       col1      col2
10   1.7641  100.0000
24   0.1440    1.4543
29   0.3131   -0.8541
32   0.9501   -0.1514
33   1.8676   -0.9773
36   0.7610    0.1217
56   1.4941   -0.2052
58   0.9787    2.2409
75  -0.1032    0.4106
76   0.4439    0.3337

Pandas: Get cell value by row index and column name

Use .loc to get rows by label and .iloc to get rows by position:

>>> df.loc[3, 'age']
23

>>> df.iloc[2, df.columns.get_loc('age')]
23
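The snippet above does not show the dataframe it runs on. Here is a runnable sketch with made-up data, chosen so that the index labels start at 1 and label 3 therefore sits at position 2:

```python
import pandas as pd

# Hypothetical data: labels 1..4, so label 3 is the third row (position 2).
df = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cid", "Dee"], "age": [31, 45, 23, 52]},
    index=[1, 2, 3, 4],
)

# Row by label, column by name.
by_label = df.loc[3, "age"]

# Row by position; the column name is converted to a position first.
by_position = df.iloc[2, df.columns.get_loc("age")]

print(by_label, by_position)  # 23 23
```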

More about Indexing and selecting data

How to set a particular cell value in pandas?

Use pd.DataFrame.iat to reference and/or assign to the ordinal location of a single cell.

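import numpy as np
import pandas as pd

# np.int was removed in NumPy 1.24; the built-in int works instead
ZEROS = np.zeros((4, 4), dtype=int)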

df = pd.DataFrame(ZEROS, columns=['A1','B1','C1','D1'])
df.iat[2,3] = 32
df

   A1  B1  C1  D1
0   0   0   0   0
1   0   0   0   0
2   0   0   0  32
3   0   0   0   0

You could also use iloc; however, iloc also accepts array-like input. This makes iloc more flexible but adds overhead. So if you only want to change a single cell, use iat.


Also see this post for more information

loc/iloc/at/iat/set_value

Set value for particular cell in pandas DataFrame

You could go for the ordinal indexes (they are always unique) like so:

In [13]: df.iloc[3, 1] = 100

In [14]: df
Out[14]:
   x    y
A  1    5
B  4    6
C  0    3
C  5  100
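To see why position matters when labels repeat, here is a small sketch. The starting value of the last y cell is an assumption, since the output above only shows it after the assignment:

```python
import pandas as pd

# Hypothetical reconstruction of the frame; the original y value in the
# last row is unknown, so 4 is an assumed placeholder.
df = pd.DataFrame(
    {"x": [1, 4, 0, 5], "y": [5, 6, 3, 4]},
    index=["A", "B", "C", "C"],
)

# Label-based assignment hits every row labelled 'C'.
by_label = df.copy()
by_label.loc["C", "y"] = 100

# Positional assignment hits exactly one cell.
by_position = df.copy()
by_position.iloc[3, 1] = 100

print(by_label)
print(by_position)
```

With a duplicated label, the label-based assignment changes both 'C' rows, while iloc changes only the fourth row.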

how to set value for a specific cell in a dataframe

Mixing label and positional indexing requires a little extra work since:

  • loc / at are designed for labels;
  • iloc / iat are designed for positional indexing.

You can convert the row positional index to a label:

df.at[df.index[0], 'B'] = 10

Or you can convert the column label to a positional index:

df.iat[0, df.columns.get_loc('B')] = 10

Note: at / iat should be preferred to loc / iloc for fast scalar access / setting.
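Both conversions can be tried on a throwaway frame. This is only a sketch; the data and the row labels are invented for the example:

```python
import pandas as pd

# Hypothetical data with string row labels.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["r1", "r2", "r3"])

# Row position 0 converted to its label, column given by label.
df.at[df.index[0], "B"] = 10

# Row given by position, column label converted to its position.
df.iat[1, df.columns.get_loc("B")] = 20

print(df)
```

Note that df.index[0] only identifies a single row if the index labels are unique.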

set value to new column according to specific index

You could use index.isin + where:

df['pred_date'] = df['date'].where(df.index.isin([1,2,3]))

Output:

   date  label  pred_date
1   1.1      1        1.1
2   2.1      0        2.1
3   3.1      1        3.1
4   4.1      1        NaN
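For a version that can be pasted and run, the input frame can be rebuilt from the output shown above (the construction itself is an assumption about how the original data looked):

```python
import pandas as pd

# Input frame rebuilt from the printed output.
df = pd.DataFrame(
    {"date": [1.1, 2.1, 3.1, 4.1], "label": [1, 0, 1, 1]},
    index=[1, 2, 3, 4],
)

# Keep 'date' where the index is in the list; everything else becomes NaN.
df["pred_date"] = df["date"].where(df.index.isin([1, 2, 3]))

print(df)
```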

i need to return a value from a dataframe cell as a variable not a series

.values[0] will do what OP wants.

Assuming one wants to obtain the value 30, the following will do the work

value = df.loc[df['State'] == 2, 'Ah-Step'].values[0]

print(value)

[Out]: 30.0

So, in OP's specific case, the operation 30+3.7 could be done as follows

df.loc[df['State'] == 2, 'Ah-Step'].values[0] + df['Ah-Step'].loc[df['State']==3].values[0]

[Out]: 33.7
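OP's dataframe is not shown here, so the following sketch uses a made-up frame that only assumes the two column names and the values 30 and 3.7 mentioned above:

```python
import pandas as pd

# Hypothetical data; only the 'State' and 'Ah-Step' column names and the
# values 30 and 3.7 come from the question.
df = pd.DataFrame({"State": [1, 2, 3], "Ah-Step": [12.5, 30.0, 3.7]})

# Boolean selection returns a Series, even when only one row matches.
matches = df.loc[df["State"] == 2, "Ah-Step"]

# Either of these pulls the first matching value out as a scalar.
first_value = matches.values[0]
same_value = matches.iloc[0]

total = first_value + df.loc[df["State"] == 3, "Ah-Step"].iloc[0]

print(first_value, total)
```

Both .values[0] and .iloc[0] raise an IndexError if no row matches, so it may be worth checking that matches is not empty first.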

Index a pandas dataframe based on row string value conditionally containing some row specific regex

Well, I will let you timeit the code below:

First, concatenate the "regex" Series to the original DataFrame:

df = pd.DataFrame(["a", "a", "b", "c", "de", "de"], columns=["value"])
regex = pd.Series(["a|b|c", "a", "d|e", "c", "c|a", "f|e"], name="regex" )
df = pd.concat([df, regex], axis=1)
df

Result:

  value  regex
0     a  a|b|c
1     a      a
2     b    d|e
3     c      c
4    de    c|a
5    de    f|e


