R: replace all values in a dataframe lower than a threshold with NA
Try this:
df[df < minval] <- NA
df < minval creates a logical matrix, which is then used to select the values to replace with NA.
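To make the role of the logical matrix concrete, here is a minimal base R sketch. The data frame contents and the threshold value are made up for illustration; only the names df and minval come from the answer above.

```r
# Hypothetical example data; the values are illustrative only.
df <- data.frame(a = c(1, 5, 9), b = c(7, 2, 4))
minval <- 4

# Comparing a data frame with a scalar yields a logical matrix of the same shape.
mask <- df < minval

# Only the positions where mask is TRUE are overwritten.
df[mask] <- NA

print(df)
```

Values equal to the threshold are kept, because the comparison is strict.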
Replace all values lower than threshold in R
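pmax is a good candidate for this. It returns the element-wise maximum, so every value below the threshold is raised to the threshold: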
> pmax(x, 1)
[1] 1 1 1 2 3
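The answer does not show how x was defined, so the following sketch assumes a small input vector purely for illustration. It also contrasts clamping with pmax against replacing the same values with NA, since the two are easy to confuse.

```r
# Hypothetical input vector, chosen only for illustration.
x <- c(-1, 0, 1, 2, 3)

# Element-wise maximum of x and 1: values below 1 are raised to 1.
clamped <- pmax(x, 1)

# By contrast, replace() with a logical condition sets those values to NA.
masked <- replace(x, x < 1, NA)

print(clamped)
print(masked)
```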
Use dplyr to change all values above threshold to NA
Assign the output back to the original object to keep the changes. On its own, a %>% pipeline
only prints the result to the console and leaves df unchanged.
df <- df %>%
mutate(across(everything(), ~ ifelse(. > 8, NA, .)))
Another option is the %<>% assignment pipe from magrittr, which pipes df into the expression and assigns the result back to df:
library(magrittr)
df %<>%
mutate(across(everything(), ~ ifelse(. > 8, NA, .)))
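The point about assignment is not specific to dplyr. As a rough base R analogue (replace() is swapped in here for mutate(), and the data is hypothetical), the sketch below shows that the original object stays untouched until the result is assigned back.

```r
# Hypothetical example data; the values are illustrative only.
df <- data.frame(a = c(3, 9, 10), b = c(8, 12, 1))

# replace() returns a modified copy; df itself is not changed by this call.
out <- replace(df, df > 8, NA)
changed_before_assign <- anyNA(df)   # FALSE: df still holds the original values

# The change only takes effect on df once the result is assigned back.
df <- out

print(df)
```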
Replace columns less than a threshold to 0
dplyr:
df <- mutate(df, across(everything(), ~ ifelse(. < 0.5, 0, .)))
base R:
df[df < 0.05] <- 0
Replacing values bigger than threshold with 0 in specified range of columns in R dataframe
dplyr (version 1.0.0 or later) provides the across() function for use inside mutate(). Update your dplyr package if the installed version is older.
library(dplyr)
df1 %>% mutate(across(where(is.numeric), function(x) ifelse(x >= 10, 0, x)))
   ID string1 S_2018_p S_2019_p S_2020_p S_2021_p string2
1: a1      x2        3        3        0        4      si
2: a2      g3        5        5        4        0      q2
3: a3      n2        0        6        0        3      oq
4: a4      m3        3        0        9        8      mx
5: a5      2w        9        1        0        5      ix
6: a6     ps2        0        4        7        4      p2
7: a7     kg2        6        0        9        6      2q
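The answer above targets every numeric column. If only a named set of columns should be touched, one possible base R sketch is shown below. The data frame contents, the extra column other, and the vector cols are assumptions made for illustration and do not come from the original question.

```r
# Hypothetical example data; column "other" is numeric but must stay untouched.
df1 <- data.frame(
  ID = c("a1", "a2", "a3"),
  S_2018_p = c(3, 15, 10),
  S_2019_p = c(12, 6, 4),
  other = c(20, 30, 40)
)

# Names of the columns the threshold should apply to (illustrative choice).
cols <- c("S_2018_p", "S_2019_p")

# Build the logical matrix on the selected columns only, then overwrite in place.
df1[cols][df1[cols] >= 10] <- 0

print(df1)
```

Because the logical matrix is computed from df1[cols] alone, values of 10 or more in other columns are left as they are.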
Replace values in a dataframe column that are below a certain threshold with NaN
Use np.where:
import numpy as np
df['A'] = np.where(df['A'] <= cutoff, np.nan, df['A'])
Using rollmean filtering out NA with threshold
1) Define a function that returns NaN if its input contains more than thresh NAs and otherwise returns the mean of the non-NA values. Then use it with rollapply. Convert the result to a data frame with as.data.frame if desired, but since the data is entirely numeric, leaving it as a matrix may be sufficient.
library(zoo)
w <- 5
thresh <- w / 2
Mean <- function(x, thresh) if (sum(is.na(x)) > thresh) NaN else mean(x, na.rm = TRUE)
rollapply(df, w, Mean, thresh = thresh, fill = NA)
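To see what the helper does on a single window, here is a small base R check that does not need zoo. The two input vectors are hypothetical windows of width 5, made up for illustration; the definition of Mean is copied from approach (1).

```r
w <- 5
thresh <- w / 2
Mean <- function(x, thresh) if (sum(is.na(x)) > thresh) NaN else mean(x, na.rm = TRUE)

# Hypothetical windows of width w.
mostly_missing <- c(1, NA, NA, NA, 5)   # 3 NAs, which exceeds thresh = 2.5
mostly_present <- c(1, NA, 3, NA, 5)    # 2 NAs, which does not exceed thresh

r1 <- Mean(mostly_missing, thresh)   # too many NAs, so the result is NaN
r2 <- Mean(mostly_present, thresh)   # mean of 1, 3 and 5

print(r1)
print(r2)
```

rollapply then applies this same function to each window of width w in turn.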
2) Another possibility is to check whether the window around each cell contains more than thresh NAs and, if so, return NaN; otherwise return the rolling mean. Again, use as.data.frame on the result if a data frame is needed. (1) has the advantage over this one that it calls a roll* function only once instead of twice.
w <- 5
thresh <- w/2
ifelse(rollsum(is.na(df), w, fill = NA) > thresh, NaN,
rollmean(df, w, na.rm = TRUE, fill = NA))