Move NAs to the end of each column in a data frame
After completely misunderstanding the question, here is my final answer:
# named after beetroot for being the first to ever need this functionality
beetroot <- function(x) {
  # count NA
  num.na <- sum(is.na(x))
  # remove NA
  x <- x[!is.na(x)]
  # glue the number of NAs at the end
  x <- c(x, rep(NA, num.na))
  return(x)
}
# apply beetroot over each column in the dataframe
as.data.frame(lapply(df, beetroot))
It will count the NAs, remove the NAs, and glue NAs at the bottom for each column in the data frame.
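As a quick check of that description, here is a minimal sketch on a toy data frame. The name `df_demo` and its values are made up for illustration, and the function is repeated so the snippet runs on its own.

```r
# Hypothetical example data (not from the original question).
df_demo <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6))

# Same logic as beetroot() above, repeated so this runs standalone.
beetroot <- function(x) {
  num.na <- sum(is.na(x))     # count NA
  x <- x[!is.na(x)]           # remove NA
  c(x, rep(NA, num.na))       # glue the NAs back on at the end
}

res <- as.data.frame(lapply(df_demo, beetroot))
print(res)
```

Each column keeps its non-NA values in their original order, with the NAs collected at the bottom.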
How to move NA to the top of the column of an R data.frame?
Perhaps, something like this?
DF$F <- c(rep(NA, sum(is.na(DF$F))), na.omit(DF$F))
Add all the NAs first and then append all the non-NA values.
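A small sketch of the same idea on made-up data. `DF_demo` is a hypothetical data frame, and the `order()` variant is an alternative I would expect to give the same result, not part of the original answer.

```r
# Hypothetical example data.
DF_demo <- data.frame(F = c(2, NA, 5, NA, 7))

# The one-liner from above: NAs first, then the non-NA values.
na_first <- c(rep(NA, sum(is.na(DF_demo$F))), na.omit(DF_demo$F))

# Alternative sketch: order on !is.na(), so NA positions (FALSE) sort first.
na_first_alt <- DF_demo$F[order(!is.na(DF_demo$F))]

print(na_first)
print(na_first_alt)
```

In both versions the non-NA values stay in their original relative order.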
Move NA to the bottom
A simple base R solution could be:
> df <- data.frame(aaa=c(1,2,3,4,NA,6,7),
+ bbb=c(1,9,5,NA,3,NA,9),
+ ccc=c(NA,3,NA,4,8,NA,2))
> ok <- complete.cases(df)
> rbind(df[ok,], df[!ok,])
aaa bbb ccc
2 2 9 3
7 7 9 2
1 1 1 NA
3 3 5 NA
4 4 NA 4
5 NA 3 8
6 6 NA NA
And to consider only some columns when checking for NAs:
> ok <- complete.cases(df[, c("bbb","ccc")])
> rbind(df[ok,], df[!ok,])
aaa bbb ccc
2 2 9 3
5 NA 3 8
7 7 9 2
1 1 1 NA
3 3 5 NA
4 4 NA 4
6 6 NA NA
How to shift single NA to bottom of column in R
Here is a method using is.na and subsetting with [. Starting with this dataset:
example=data.frame(x=c(1,NA,3),y=c(NA,5,6))
example
x y
1 1 NA
2 NA 5
3 3 6
You run through each variable with lapply, take the values that are not missing, and append the missing values at the end. Then feed this result back into the original dataset using example[] <-, which maintains the data.frame structure.
example[] <- lapply(example, function(x) c(x[!is.na(x)], x[is.na(x)]))
example
x y
1 1 5
2 3 6
3 NA NA
We can also use the newer (R 3.3.3) grouping function like this:
example[] <- lapply(example, function(x) x[grouping(is.na(x))])
or order:
example[] <- lapply(example, function(x) x[order(is.na(x))])
In the last two, the key is to order on is.na rather than on the elements themselves. This way you preserve the original order of the non-missing elements.
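To make the difference concrete, here is a minimal sketch on a made-up vector `x`, comparing ordering on the values with ordering on is.na:

```r
x <- c(3, NA, 1, 2)  # hypothetical example vector

# Ordering on the values sorts them; NA ends up last by default.
sorted_values <- x[order(x)]

# Ordering on is.na() only separates missing from non-missing,
# so 3, 1, 2 keep their original relative order.
na_last <- x[order(is.na(x))]

print(sorted_values)
print(na_last)
```

The second form works because order() breaks ties by position, so all the FALSE entries come out in the order they appeared.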
How to move cells with a value row-wise to the left in a dataframe
yourdata[] <- t(apply(yourdata, 1, function(x) {
  c(x[!is.na(x)], x[is.na(x)])
}))
This should work: each row is replaced by a vector that consists of, first, the values that are not NA, then the NA values.
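A minimal sketch of that on toy data, assuming all columns are numeric. `yourdata_demo` is a made-up data frame. Note that apply converts the data frame to a matrix first, so with mixed column types everything would be coerced to character.

```r
# Hypothetical example data, all numeric.
yourdata_demo <- data.frame(a = c(NA, 1, NA),
                            b = c(2, NA, NA),
                            c = c(3, 4, 5))

# For each row: non-NA values first, NAs pushed to the right.
yourdata_demo[] <- t(apply(yourdata_demo, 1, function(x) {
  c(x[!is.na(x)], x[is.na(x)])
}))

print(yourdata_demo)
```

Here the first row NA, 2, 3 becomes 2, 3, NA, and the last row NA, NA, 5 becomes 5, NA, NA.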
R: how to shift a varying number of NAs from the bottom to the top of columns in a dataframe
Try
dat[] <- apply(dat,2, function(x) c(x[is.na(x)], x[!is.na(x)]))
dat
# V1 V2 V3 V4 V5 V6
#1 1 NA 3 4 NA 6
#2 6 2 4 3 NA 1
#3 1 5 3 4 5 6
Or a better method, which avoids the matrix coercion that apply performs, would be
dat[] <- lapply(dat, function(x) c(x[is.na(x)], x[!is.na(x)]))
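A small sketch of why the lapply version is the safer choice when column types differ. `dat_demo` is a made-up data frame with one numeric and one character column.

```r
# Hypothetical mixed-type example data.
dat_demo <- data.frame(n = c(1, 2, NA),
                       s = c("a", NA, "c"),
                       stringsAsFactors = FALSE)

na_to_top <- function(x) c(x[is.na(x)], x[!is.na(x)])

# apply() works on as.matrix(dat_demo), which is a character matrix here.
via_apply <- dat_demo
via_apply[] <- apply(dat_demo, 2, na_to_top)

# lapply() handles each column separately, so types are kept.
via_lapply <- dat_demo
via_lapply[] <- lapply(dat_demo, na_to_top)

print(sapply(via_apply, class))
print(sapply(via_lapply, class))
```

With apply the numeric column comes back as character, while lapply leaves it numeric.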
Or using data.table (suggested by @David Arenburg)
library(data.table)
setDT(dat)[, names(dat) := lapply(.SD, function(x)
  c(x[is.na(x)], x[!is.na(x)]))]
Dropping all left NAs in a dataframe and left shifting the cleaned rows
I don't think you can do this without a loop.
dat <- as.data.frame(rbind(c(NA,NA,1,3,5,NA,NA,NA), c(NA,1:3,6:8,NA), c(1:7,NA)))
dat[3,2] <- NA
# V1 V2 V3 V4 V5 V6 V7 V8
# 1 NA NA 1 3 5 NA NA NA
# 2 NA 1 2 3 6 7 8 NA
# 3 1 NA 3 4 5 6 7 NA
t(apply(dat, 1, function(x) {
  if (is.na(x[1])) {
    y <- x[-seq_len(which.min(is.na(x))-1)]
    length(y) <- length(x)
    y
  } else x
}))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#[1,] 1 3 5 NA NA NA NA NA
#[2,] 1 2 3 6 7 8 NA NA
#[3,] 1 NA 3 4 5 6 7 NA
Then turn the matrix into a data.frame if you must.
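For the conversion step, a minimal sketch reusing the dat from above. The names `m` and `res` are just for illustration, and restoring the column names is an extra step I am assuming you would want.

```r
dat <- as.data.frame(rbind(c(NA,NA,1,3,5,NA,NA,NA), c(NA,1:3,6:8,NA), c(1:7,NA)))
dat[3,2] <- NA

# Same row-wise shift as above, stored as a matrix.
m <- t(apply(dat, 1, function(x) {
  if (is.na(x[1])) {
    y <- x[-seq_len(which.min(is.na(x)) - 1)]  # drop the leading NAs
    length(y) <- length(x)                     # pad with NA on the right
    y
  } else x
}))

# Turn the matrix back into a data.frame and restore the column names.
res <- as.data.frame(m)
names(res) <- names(dat)
print(res)
```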
Remove leading NAs to align data
We can loop over the columns (lapply(..)) and apply na.trim. Then, pad NAs at the end of each of the list elements by assigning length as the maximum length from the list elements.
library(zoo)
lst <- lapply(df, na.trim)
df[] <- lapply(lst, `length<-`, max(lengths(lst)))
df
# var1 var2 var3 var4
#1 1 6 8 5
#2 2 2 6 NA
#3 3 4 3 2
#4 4 7 7 6
#5 5 3 NA 2
#6 6 NA NA 9
#7 7 NA NA NA
#8 8 NA NA NA
#9 9 NA NA NA
#10 10 NA NA NA
Or as @G.Grothendieck mentioned in the comments
replace(df, TRUE, do.call("merge", lapply(lst, zoo)))
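If zoo is not available, the same trim-then-pad idea can be sketched in base R. This is a stand-in under stated assumptions, not what na.trim does internally: `drop_leading_na` is a hypothetical helper that removes only leading NAs (na.trim by default trims both ends), and `df_demo` is made-up data.

```r
# Hypothetical example data with leading NAs in some columns.
df_demo <- data.frame(var1 = 1:5,
                      var2 = c(NA, NA, 6, 2, 4),
                      var3 = c(NA, 8, NA, 6, 3))

# Hypothetical helper: keep everything from the first non-NA value onward.
drop_leading_na <- function(x) x[cumsum(!is.na(x)) > 0]

lst <- lapply(df_demo, drop_leading_na)

# Pad each element with NA up to the longest length, as in the answer above.
df_demo[] <- lapply(lst, `length<-`, max(lengths(lst)))
print(df_demo)
```

Interior NAs (such as the one in var3) are left where they are; only the leading ones are dropped and the padding goes at the end.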
How to move elements of a column up to top of dataframe in R
We could loop across the columns, order based on the NA elements, and then filter only rows having at least one non-NA value.
library(dplyr)
df1 %>%
  mutate(across(everything(), ~ .x[order(is.na(.x))])) %>%
  filter(if_any(everything(), complete.cases))
Output:
A B C
1 a1 b1 c1
2 a2 <NA> c2
Or using base R
df1[] <- lapply(df1, \(x) x[order(is.na(x))])
df1[rowSums(!is.na(df1)) > 0,]
A B C
1 a1 b1 c1
2 a2 <NA> c2
Data:
df1 <- structure(list(A = c("a1", "a2", NA, NA, NA), B = c(NA, NA, "b1",
NA, NA), C = c(NA, NA, NA, "c1", "c2")), class = "data.frame",
row.names = c(NA, -5L))