Conditional Replace Pandas
The .ix indexer works in pandas versions before 0.20.0, but it has been deprecated since 0.20.0 (and was removed in 1.0), so you should avoid it. Use the .loc or .iloc indexers instead. You can solve this problem with:
mask = df.my_channel > 20000
column_name = 'my_channel'
df.loc[mask, column_name] = 0
Or, in one line,
df.loc[df.my_channel > 20000, 'my_channel'] = 0
mask selects the rows in which df.my_channel > 20000 is True, while df.loc[mask, column_name] = 0 sets the value 0 in those rows, in the column whose name is column_name.
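To make the mask-then-assign pattern concrete, here is a minimal sketch on made-up data. Only the column name my_channel comes from the answer above; the values and the extra column are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data: only the column name "my_channel" appears in the answer.
df = pd.DataFrame({
    "my_channel": [100, 25000, 20000, 30000],
    "other": [1, 2, 3, 4],
})

# Boolean Series that is True only where the value exceeds 20000.
mask = df["my_channel"] > 20000

# Overwrite the matching rows in that one column; everything else is untouched.
df.loc[mask, "my_channel"] = 0

print(df)
```

Note that the comparison is strict, so a value of exactly 20000 is left unchanged.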
Update:
In this case you should use loc, because with iloc you will get a NotImplementedError telling you that iLocation based boolean indexing on an integer type is not available.
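If you do need iloc for some reason, one possible workaround is to pass it a plain NumPy boolean array and an integer column position instead of the boolean Series. This is only a sketch on hypothetical data, not part of the original answer; loc remains the simpler choice.

```python
import pandas as pd

# Hypothetical data reusing the column name from the answer.
df = pd.DataFrame({"my_channel": [100, 25000, 30000]})
mask = df["my_channel"] > 20000

# iloc is purely positional, so the column must be given as an integer position.
col_pos = df.columns.get_loc("my_channel")

# Passing the bare boolean array (not the indexed Series) avoids the error
# described above, because there are no index labels left to interpret.
df.iloc[mask.to_numpy(), col_pos] = 0

print(df)
```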
Pandas: Conditionally replace values based on other columns values
Now my goal is for each add_rd in the event column, the associated
NaN-value in the environment column should be replaced with a string
RD.
As per @Zero's comment, use pd.DataFrame.loc and Boolean indexing:
df.loc[df['event'].eq('add_rd') & df['environment'].isnull(), 'environment'] = 'RD'
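A self-contained version of the same idea may help show which rows are affected. The column names event and environment and the values add_rd and RD come from the question; the other rows are assumed sample data.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; only the column names and "add_rd"/"RD" come from the question.
df = pd.DataFrame({
    "event": ["add_rd", "add_rd", "add_env", "add_rd"],
    "environment": [np.nan, "PROD", np.nan, np.nan],
})

# Both conditions must hold: the event is add_rd AND the environment is missing.
cond = df["event"].eq("add_rd") & df["environment"].isna()

# Fill only those cells; existing values and other events are left alone.
df.loc[cond, "environment"] = "RD"

print(df)
```

Row 1 keeps its existing value because it is not missing, and row 2 stays NaN because its event is not add_rd.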
Conditionally replace certain values in my dataframe with other values in the dataframe in R
Using your sample data:
library(dplyr)
library(tidyr)
df %>%
replace(. == "NA", NA_character_) %>%
group_by(studyID) %>%
fill(c(q1,q11,q2B,q2C,q2a,q4,q9),.direction = "down")
This gives us:
# A tibble: 9 x 9
# Groups: studyID [4]
studyID effect q1 q11 q2B q2C q2a q4 q9
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 s100 All.outcomes low NA low high low low NA
2 s100 Study.1..Effect.1 low high low high low low low
3 s100 Study.1..Effect.2 low high low high low low low
4 s101 All.outcomes low low low high low low low
5 s102 All.outcomes low low low high low high low
6 s104 All.outcomes low NA low high low low NA
7 s104 Study.1..Effect.1 low low low high low low low
8 s104 Study.2..Effect.1 low high low high low low high
9 s104 Study.3..Effect.1 low low low high low low low
How to conditionally replace values in a dataframe?
Use Series.map only for the values filtered by boolean indexing:
#added values for match
lookup_dict = {'13:00:00':1, '16:00:00':2, '23:00:00':0}
m = df['XX'] == -1
df.loc[m, 'XX'] = df.loc[m, 'Time'].map(lookup_dict)
print(df)
XX Date Time
0 0 2016-05-01 19:00:00
1 1 2016-05-01 18:00:00
2 3 2016-05-01 17:00:00
3 2 2016-05-01 16:00:00
4 5 2016-05-01 15:00:00
5 7 2016-05-01 14:00:00
6 1 2016-05-01 13:00:00
7 6 2016-05-01 12:00:00
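One thing to watch with this approach: Series.map returns NaN for any key that is missing from the dictionary. The sketch below, on assumed data with Time stored as strings, keeps the original value in that case instead of writing NaN. The fallback step is an addition for illustration, not part of the answer above.

```python
import pandas as pd

# Hypothetical frame; -1 marks the values to be replaced, as in the answer.
df = pd.DataFrame({
    "XX": [0, -1, 3, -1, 5, -1],
    "Time": ["19:00:00", "13:00:00", "17:00:00",
             "16:00:00", "15:00:00", "09:00:00"],
})
lookup_dict = {"13:00:00": 1, "16:00:00": 2, "23:00:00": 0}

m = df["XX"] == -1

# Look up replacements only for the masked rows; unmatched times become NaN.
mapped = df.loc[m, "Time"].map(lookup_dict)

# Fall back to the original value where the lookup failed, then restore the
# integer dtype that the NaN forced to float.
mapped = mapped.fillna(df.loc[m, "XX"]).astype(df["XX"].dtype)

df.loc[m, "XX"] = mapped
print(df)
```

Here the last row has no entry for 09:00:00 in the dictionary, so it keeps its -1.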
Replace all values in a data frame, conditionally
library(tidyverse)
df <- data.frame(
var1 = c(2L, 3L, 5L),
var2 = c(3L, 6L, 3L),
var3 = c(5L, 8L, 7L),
var4 = c(8L, 7L, 4L)
)
df %>%
mutate(across(everything(), ~ . >= 4)) %>%
summarise(across(everything(), ~ sum(.) / length(.)))
#> var1 var2 var3 var4
#> 1 0.3333333 0.3333333 1 1
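For readers working in pandas rather than R, a rough equivalent of the computation above (the share of values in each column that are at least 4) could look like the following. This is a sketch of the same calculation, not a translation of the dplyr code.

```python
import pandas as pd

# Same values as the R data frame above.
df = pd.DataFrame({
    "var1": [2, 3, 5],
    "var2": [3, 6, 3],
    "var3": [5, 8, 7],
    "var4": [8, 7, 4],
})

# Comparing the whole frame gives a boolean frame; the column-wise mean of
# True/False values is the proportion of True in each column.
share = (df >= 4).mean()

print(share)
```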
Pandas DataFrame: replace all values in a column, based on condition
You need to select that column:
In [41]:
df.loc[df['First Season'] > 1990, 'First Season'] = 1
df
Out[41]:
Team First Season Total Games
0 Dallas Cowboys 1960 894
1 Chicago Bears 1920 1357
2 Green Bay Packers 1921 1339
3 Miami Dolphins 1966 792
4 Baltimore Ravens 1 326
5 San Franciso 49ers 1950 1003
So the syntax here is:
df.loc[<mask>, <optional column(s)>]
Here the mask generates the labels to index. You can check the docs and the 10 Minutes to pandas guide, which show the semantics.
EDIT
If you want to generate a boolean indicator, you can use the boolean condition to generate a boolean Series and cast its dtype to int; this converts True and False to 1 and 0 respectively:
In [43]:
df['First Season'] = (df['First Season'] > 1990).astype(int)
df
Out[43]:
Team First Season Total Games
0 Dallas Cowboys 0 894
1 Chicago Bears 0 1357
2 Green Bay Packers 0 1339
3 Miami Dolphins 0 792
4 Baltimore Ravens 1 326
5 San Franciso 49ers 0 1003
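As a small variation, the indicator can be written to a new column so the original years are not lost. The sketch below uses two assumed rows and a hypothetical column name, after_1990; neither the data nor the new column appears in the answer above.

```python
import pandas as pd

# Two hypothetical rows; the year values are assumptions for illustration.
df = pd.DataFrame({
    "Team": ["Dallas Cowboys", "Baltimore Ravens"],
    "First Season": [1960, 1996],
})

# The comparison yields a boolean Series; astype(int) maps True/False to 1/0.
# "after_1990" is a made-up column name that keeps "First Season" intact.
df["after_1990"] = (df["First Season"] > 1990).astype(int)

print(df)
```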
Conditional replacement of values in dataframe with NA
Using the code that GenesRus handed me, I was able to modify the code to select the trials that I want:
trialdata_filter <- trialdata %>%
mutate(direction= as.logical(direction)) %>%
mutate(is.special = case_when(direction == FALSE & Y > 180 ~ TRUE, direction == TRUE & Y <20 ~ TRUE, TRUE ~ FALSE)) %>%
group_by(bartrial) %>%
filter(!any(is.special[1:25] == TRUE))
Thanks for the help!