Replace NA in column with value in adjacent column
It didn't work because STATUS was a factor. When you assign factor values into a numeric vector, R inserts the factor's underlying integer codes rather than its labels. Converting STATUS to character first gives the result you're after, and UNIT becomes a character vector:
TEST$UNIT[is.na(TEST$UNIT)] <- as.character(TEST$STATUS[is.na(TEST$UNIT)])
##        UNIT   STATUS TERMINATED      START       STOP
## 1    ACTIVE   ACTIVE 1999-07-06 2007-04-23 2008-12-05
## 2  INACTIVE INACTIVE 2008-12-05 2008-12-06 4712-12-31
## 3       200   ACTIVE 2000-08-18 2004-06-01 2007-01-31
## 4       200   ACTIVE 2000-08-18 2007-02-01 2008-04-18
## 5       200 INACTIVE 2000-08-18 2008-04-19 2010-11-28
## 6       200   ACTIVE 2008-08-18 2010-11-29 2010-12-29
## 7       200 INACTIVE 2008-08-18 2010-12-30 4712-12-31
## 8       300   ACTIVE 2006-09-19 2007-10-29 2008-02-04
## 9       300   ACTIVE 2006-09-19 2008-02-05 2008-06-29
## 10      300   ACTIVE 2006-09-19 2008-06-30 2009-02-06
## 11      300 INACTIVE 1999-03-15 2009-02-07 4712-12-31
Replace NA in row with value in adjacent row (not only one row)
Here's an approach using dplyr. First, I count the NAs in each row to identify the rows with no NAs. Then I use the cumulative count of those complete rows to define groups. Within each group, I paste each column's values (excluding NAs) together.
library(dplyr)
df1 %>%
  # count the NAs in each row
  rowwise() %>% mutate(full = sum(is.na(c_across()))) %>% ungroup() %>%
  # a row with no NAs starts a new group
  group_by(group = cumsum(full == 0)) %>%
  summarize(across(everything(), ~paste0(na.omit(.x), collapse = ""))) %>%
  select(-group, -full)
# A tibble: 2 × 5
  V1    V2    V3    V4    V5
  <chr> <chr> <chr> <chr> <chr>
1 a     bfj   cg    di    e
2 a1    b1    c1f1  d1g1  e1
How to replace with values with adjacent column using pandas
You need to fill each column's NaN values from the column two positions after it. One way is to call fillna with the value parameter:
# assign the result back; inplace=True on a selected column may not update df
df['A'] = df['A'].fillna(value=df['C'])
df['B'] = df['B'].fillna(value=df['D'])
If for some reason you have a lot of columns and want to keep filling NaN using values from the second-after column, then use a for loop over the first n-2 columns:
columns = ['A', 'B', 'C', 'D']
for i in range(len(columns) - 2):
    # fill column i from column i+2 and assign the result back
    df[columns[i]] = df[columns[i]].fillna(df[columns[i + 2]])
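To see the loop at work, here is a minimal sketch on made-up data. The sample values and the helper name fill_from_offset are assumptions for illustration only and are not part of the original question; the only pandas call used is Series.fillna.

```python
import numpy as np
import pandas as pd


def fill_from_offset(frame, columns, offset=2):
    """Hypothetical helper: fill NaN in each column from the column `offset` places after it."""
    out = frame.copy()  # leave the caller's DataFrame untouched
    for i in range(len(columns) - offset):
        target, source = columns[i], columns[i + offset]
        # fillna aligns on the index, so each NaN takes the value from the same row
        out[target] = out[target].fillna(out[source])
    return out


# Made-up sample data: A and B have gaps, C and D hold the fallback values.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [np.nan, 5.0, np.nan],
    "C": [10.0, 20.0, 30.0],
    "D": [40.0, 50.0, 60.0],
})

filled = fill_from_offset(df, ["A", "B", "C", "D"])
print(filled)
```

Returning a filled copy instead of modifying df keeps the original around for comparison; assign the result back to df if you do not need it.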
Replace NA in row with value in adjacent row ROW not column
You could make use of zoo::na.locf for this. It takes the most recent non-NA value and fills all NA values on the way:
library(dplyr)
library(zoo)
df %>%
  mutate(V1 = zoo::na.locf(V1)) %>%
  group_by(V1) %>%
  summarise(V2 = paste0(V2, collapse = " "))
# A tibble: 4 x 2
  V1    V2
  <chr> <chr>
1 c1    a
2 c2    b c d
3 c3    e f
4 c4    g
Replace NA values when they are in two adjacent columns
You can also use the following solution. It iterates over each row and finds the indices of the values equal to "Na". If there is more than one such index, those values are replaced with "0"; otherwise the row is left as it is:
library(dplyr)
library(purrr)
df %>%
  pmap_df(., ~ {
    ind <- which(c(...) == "Na")
    if (length(ind) > 1) {
      replace(c(...), ind, "0")
    } else {
      c(...)
    }
  }) %>%
  mutate(across(ID, as.integer))
# A tibble: 10 x 3
      ID Rep1  Rep2
   <int> <chr> <chr>
 1     1 6     8
 2     2 5     4
 3     3 3     4
 4     4 0     0
 5     5 Na    3
 6     6 9     Na
 7     7 4     6
 8     8 0     0
 9     9 Na    2
10    10 2     1
P.S. I almost went crazy wondering why I could not get it to work, only to realize your NAs are in fact Na.
Replacing NA in column with values in adjacent column
Perform similar steps for the other columns too!
df$E <- ifelse(is.na(df$E), ifelse(df$C-df$D <0,"small decrease","small increase"), df$E)