Replace all particular values in a data frame
Like this:
> df[df==""]<-NA
> df
A B
1 <NA> 12
2 xyz <NA>
3 jkl 100
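The snippet above is R. For readers following the pandas answers further down, a roughly equivalent sketch is shown here. It is an addition rather than part of the original answer, and the sample values are assumed from the printed output above.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mirroring the R example: empty strings mark missing cells.
df = pd.DataFrame({"A": ["", "xyz", "jkl"], "B": ["12", "", "100"]})

# Replace every empty string, in any column, with a missing value.
df = df.replace("", np.nan)

print(df.isna())
```

Because `replace` is called on the whole frame, the substitution applies to all columns at once, just as `df[df==""] <- NA` does in R.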
Replace all specific values in data.frame with values from another data.frame sequentially R
With base R, we can use max.col to return the last column index for each row of 'df2' where the 'Age' columns are not ".", cbind it with the sequence of rows to form a row/column index, extract those elements, and assign them to the 'Age' column in 'df1' wherever 'Age' is ".".
df1$Age <- ifelse(df1$Age == ".", df2[-1][cbind(seq_len(nrow(df2)),
max.col(df2[-1] != ".", "last"))], df1$Age)
df1 <- type.convert(df1, as.is = TRUE)
-output
df1
# Sample Age
#1 1 50
#2 2 49
#3 3 30
or using tidyverse, by reshaping into 'long' format and then doing a join after slicing the last row grouped by 'Sample'
library(dplyr)
library(tidyr)
df2 %>%
  mutate(across(starts_with('Age'), as.integer)) %>%
  pivot_longer(cols = starts_with('Age'), values_drop_na = TRUE) %>%
  group_by(Sample) %>%
  slice_tail(n = 1) %>%
  ungroup() %>%
  select(-name) %>%
  right_join(df1) %>%
  transmute(Sample, Age = coalesce(as.integer(Age), value))
-output
# A tibble: 3 x 2
# Sample Age
# <int> <int>
#1 1 50
#2 2 49
#3 3 30
data
df1 <- structure(list(Sample = 1:3, Age = c("50", ".", ".")),
class = "data.frame",
row.names = c(NA,
-3L))
df2 <- structure(list(Sample = 1:3, Age_1 = c(40L, 35L, 30L), Age_2 = c("42",
"49", "."), Age_3 = c("44", ".", ".")), class = "data.frame",
row.names = c(NA,
-3L))
Replacing all values based on specific value in column dataframe
Based on 1) my initial idea of using multiplication instead of replace and 2) riding on @piRSquared's syntax together with 3) modification to exclude first column for operation, you can use:
df.iloc[:-1, 1:] *= df.iloc[-1, 1:]
Test run:
import pandas as pd

data = {'number': {0: '1', 1: '2', 2: '3', 3: 'result'},
'error1': {0: 0.0, 1: 1.0, 2: 0.0, 3: 0.5},
'error2': {0: 0.0, 1: 1.0, 2: 1.0, 3: 0.6},
'error2040': {0: 1.0, 1: 1.0, 2: 0.0, 3: 0.001}}
df = pd.DataFrame(data)
print(df)
number error1 error2 error2040
0 1 0.0 0.0 1.000
1 2 1.0 1.0 1.000
2 3 0.0 1.0 0.000
3 result 0.5 0.6 0.001
df.iloc[:-1, 1:] *= df.iloc[-1, 1:]
print(df)
number error1 error2 error2040
0 1 0.0 0.0 0.001
1 2 0.5 0.6 0.001
2 3 0.0 0.6 0.000
3 result 0.5 0.6 0.001
Replace specific values in a dataframe column using Pandas
A clean syntax for this kind of "find and replace" uses a dict that maps each old value to its replacement:
df.Num_of_employees = df.Num_of_employees.replace({"10-Jan": "1-10",
"Nov-50": "11-50"})
I'm trying to replace a specific value in my dataframe
You assign only the day of week column into sales, so it makes sense that you get only one column. Try:
sales["day of week"]=sales["day of week"].replace(0, "Thru")
If that doesn't work (because day of week is an object type column), try:
sales["day of week"]=sales["day of week"].replace('0', "Thru")
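To see why the two calls behave differently, here is a small sketch. The data is made up, and the two Series stand in for the same column stored once as integers and once as text.

```python
import pandas as pd

# Hypothetical data: the same values as integers and as strings.
as_int = pd.Series([0, 1, 2], name="day of week")
as_str = pd.Series(["0", "1", "2"], name="day of week", dtype=object)

# replace() compares whole values, so the integer 0 never matches the string "0".
unchanged = as_str.replace(0, "Thru")
fixed_str = as_str.replace("0", "Thru")
fixed_int = as_int.replace(0, "Thru")

print(unchanged.tolist())
print(fixed_str.tolist())
print(fixed_int.tolist())
```

Checking `sales["day of week"].dtype` first tells you which of the two forms applies.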
Pandas DataFrame: replace all values in a column, based on condition
You need to select that column:
In [41]:
df.loc[df['First Season'] > 1990, 'First Season'] = 1
df
Out[41]:
Team First Season Total Games
0 Dallas Cowboys 1960 894
1 Chicago Bears 1920 1357
2 Green Bay Packers 1921 1339
3 Miami Dolphins 1966 792
4 Baltimore Ravens 1 326
5 San Franciso 49ers 1950 1003
So the syntax here is:
df.loc[<mask>, <optional column(s)>]
(here the mask generates the labels to index)
You can check the docs and also the "10 minutes to pandas" guide, which shows the semantics.
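The effect of the column selector can be sketched as follows. This is an illustration under assumptions: the frame holds two rows of the table above, the Team column is left out for brevity, and the pre-replacement value 1996 is assumed since the original data is not shown.

```python
import pandas as pd

# Two rows from the table above; 1996 is an assumed original value.
df = pd.DataFrame({
    "First Season": [1960, 1996],
    "Total Games": [894, 326],
})

mask = df["First Season"] > 1990

# Without a column selector, every column of the matching rows is overwritten.
whole_row = df.copy()
whole_row.loc[mask] = 1

# With a column selector, only that column changes.
one_col = df.copy()
one_col.loc[mask, "First Season"] = 1

print(whole_row)
print(one_col)
```

In `whole_row`, "Total Games" for the matching row is also set to 1, which is the unwanted result that selecting the column avoids.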
EDIT
If you want to generate a boolean indicator, you can use the boolean condition to generate a boolean Series and cast the dtype to int. This converts True and False to 1 and 0 respectively:
In [43]:
df['First Season'] = (df['First Season'] > 1990).astype(int)
df
Out[43]:
Team First Season Total Games
0 Dallas Cowboys 0 894
1 Chicago Bears 0 1357
2 Green Bay Packers 0 1339
3 Miami Dolphins 0 792
4 Baltimore Ravens 1 326
5 San Franciso 49ers 0 1003
Replace values from rows with specific values from another dataframe in R
You could use dplyr:
df %>%
group_by(ID) %>%
mutate(Min_Range_New = ifelse(is.na(Range), NA, min(Range, na.rm=TRUE)))
which returns
ID Range Min_Range Min_Range_New
<dbl> <dbl> <dbl> <dbl>
1 1 10 10 10
2 1 15 10 10
3 1 20 10 10
4 2 30 30 30
5 2 35 30 30
6 3 40 40 40
7 3 45 40 40
8 3 50 40 40
9 3 NA NA NA
10 4 NA NA NA
11 4 NA NA NA