R: How to Check If All Columns in a Data.Frame Are the Same

You could check whether the number of distinct column vectors is equal to one:

length(unique(as.list(df))) == 1
# [1] FALSE
length(unique(as.list(df2))) == 1
# [1] TRUE

Another way could be to check if each variable is identical to the first variable:

# df[[1]] always extracts the first column as a vector (df[,1] does not for tibbles)
all(sapply(df, identical, df[[1]]))
# [1] FALSE
all(sapply(df2, identical, df2[[1]]))
# [1] TRUE
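
The snippets above do not show how df and df2 were built. As a minimal sketch, assuming two small hypothetical data frames (the values below are invented for illustration), both approaches can be run end to end:

```r
# Hypothetical sample data: df has one differing column, df2 has three identical ones
df  <- data.frame(a = c(1, 2, 3), b = c(1, 2, 4), c = c(1, 2, 3))
df2 <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3), c = c(1, 2, 3))

# Approach 1: count the distinct column vectors
unique_df  <- length(unique(as.list(df))) == 1
unique_df2 <- length(unique(as.list(df2))) == 1

# Approach 2: compare every column with the first one
first_df  <- all(sapply(df, identical, df[[1]]))
first_df2 <- all(sapply(df2, identical, df2[[1]]))

c(unique_df, unique_df2, first_df, first_df2)
# [1] FALSE  TRUE FALSE  TRUE
```

Note that identical is strict about type, so an integer column and a double column holding the same numbers would not count as the same.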

How to check whether all values in grouped columns are the same?

We may use n_distinct to count the distinct values of yes in each group, compare that count to 1 to get a logical, and then convert the logical to 0/1 with as.integer or unary +:

library(dplyr)
df %>%
  group_by(id, category) %>%
  mutate(same = +(n_distinct(yes) == 1)) %>%
  ungroup()

Or using data.table

library(data.table)
setDT(df)[, same := +(uniqueN(yes) == 1), by = .(id, category)]
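
If neither package is available, the same flag can be sketched in base R with ave(). This is a substitute technique, not the dplyr or data.table code above, and the sample data is hypothetical; only the column names id, category and yes come from the question.

```r
# Hypothetical sample data with the column names used above
df <- data.frame(
  id       = c(1, 1, 1, 2, 2),
  category = c("a", "a", "b", "b", "b"),
  yes      = c(5, 5, 7, 3, 4)
)

# Within each id/category group, flag whether 'yes' holds a single distinct value
df$same <- ave(df$yes, df$id, df$category,
               FUN = function(x) as.integer(length(unique(x)) == 1))

df$same
# [1] 1 1 1 0 0
```

The groups (1, a) and (1, b) each contain one distinct value, while (2, b) contains two, so only the last two rows are flagged 0.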

How to test if column names of multiple dataframes are the same

We place the datasets in a list, loop over the list with lapply to get each set of column names, convert the names to a single case and sort them, then take the unique results and check that only one remains:

length(unique(lapply(lst1, function(x) sort(toupper(names(x)))))) == 1
#[1] TRUE

data

lst1 <- list(mtcars, mtcars, mtcars)
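
When the check returns FALSE, it helps to know which data frame is the odd one out. A minimal sketch, assuming a hypothetical list lst2 whose third element lacks one column, and comparing every element against the first:

```r
# Hypothetical list: the third data frame is missing its first column (mpg)
lst2 <- list(mtcars, mtcars, mtcars[, -1])

# Normalised (upper-cased, sorted) column names for each data frame
nms <- lapply(lst2, function(x) sort(toupper(names(x))))

# TRUE only if every data frame has the same set of names
all_same <- length(unique(nms)) == 1

# Positions of the data frames whose names differ from the first one
differs <- which(!vapply(nms, identical, logical(1), nms[[1]]))

all_same
# [1] FALSE
differs
# [1] 3
```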

Checking for identical columns in a data frame in R

You could use identical, which returns TRUE only when the two columns match exactly, including their type:

identical(DT[['A']],DT[['B']])
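
Because identical is strict, columns that print the same can still compare as different. The sketch below uses a hypothetical DT (a plain data frame here, though the [[ ]] extraction works the same way on a data.table) and shows all.equal as a more tolerant alternative for numeric data:

```r
# Hypothetical data: A is integer, B is double, C is a copy of A
DT <- data.frame(A = 1:3, B = c(1, 2, 3), C = 1:3)

same_ac  <- identical(DT[["A"]], DT[["C"]])          # same type and values
same_ab  <- identical(DT[["A"]], DT[["B"]])          # integer vs double
close_ab <- isTRUE(all.equal(DT[["A"]], DT[["B"]]))  # values equal within a tolerance

c(same_ac, same_ab, close_ab)
# [1]  TRUE FALSE  TRUE
```

all.equal returns a character description rather than FALSE when the inputs differ, which is why it is wrapped in isTRUE.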

To find whether a column exists in a data frame or not

Assuming that the name of your data frame is dat and that your column name to check is "d", you can use the %in% operator:

if ("d" %in% colnames(dat)) {
  cat("Yep, it's in there!\n")
}
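
The same operator extends to several column names at once. A minimal sketch, assuming a hypothetical dat and a hypothetical vector wanted of names to look for:

```r
# Hypothetical data frame and list of required column names
dat <- data.frame(a = 1:3, b = letters[1:3], d = c(TRUE, FALSE, TRUE))
wanted <- c("a", "d", "z")

present      <- wanted %in% colnames(dat)      # one logical per requested name
all_present  <- all(present)                   # are all of them there?
missing_cols <- setdiff(wanted, colnames(dat)) # which ones are absent?

present
# [1]  TRUE  TRUE FALSE
missing_cols
# [1] "z"
```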

Check if every row in two columns has the same sign

You can compare the sign of each value with sign(), which returns -1, 0, or 1. This handles zero as its own case: a zero matches only another zero.

df$comparison <- sign(df$V1) == sign(df$V2)
df

#    V1   V2 comparison
#1 -1.0  2.3      FALSE
#2  3.6  2.0       TRUE
#3 -2.0 -4.0       TRUE
#4  0.0  4.0      FALSE
#5  0.0  0.0       TRUE

data

df <- structure(list(V1 = c(-1, 3.6, -2, 0, 0), V2 = c(2.3, 2, -4, 
4, 0)), class = "data.frame", row.names = c(NA, -5L))

Finding all columns in a data frame that hold a certain value, by row

library(data.table)

# dummy data
# use setDT(df) if yours isn't a datatable already
df <- data.table(id = 1:3
                 , a = c(4,4,0)
                 , b = c(0,4,0)
                 , c = c(4,0,4)
                 ); df
#    id a b c
# 1:  1 4 0 4
# 2:  2 4 4 0
# 3:  3 0 0 4

# find 1st & last column with target value
df[, .(id
       , first = apply(.SD, 1, \(i) names(df)[min(which(i==4))])
       , last = apply(.SD, 1, \(i) names(df)[max(which(i==4))])
       )
   ]
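
For comparison, here is a base R sketch of the same idea. It is a substitute for the data.table code, not a copy of it: it searches only the value columns, so id is never matched by accident, and it returns NA for a row where the target does not appear. The helper first_last and the vector value_cols are hypothetical names introduced for illustration.

```r
# Same dummy data as above, as a plain data frame
df <- data.frame(id = 1:3, a = c(4, 4, 0), b = c(0, 4, 0), c = c(4, 0, 4))
value_cols <- c("a", "b", "c")   # columns to search; 'id' is left out on purpose

# Return the first and last column name holding 'target', or NA if none does
first_last <- function(row, target = 4) {
  hits <- value_cols[which(row == target)]
  if (length(hits) == 0) return(c(first = NA_character_, last = NA_character_))
  c(first = hits[1], last = hits[length(hits)])
}

# apply() returns one column per row of df, so transpose before binding to id
res <- cbind(df["id"], t(apply(df[value_cols], 1, first_last)))
res
```

With this data, first comes out as a, a, c and last as c, b, c.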

Checking all columns in data frame for missing values in R

The anyNA function is built for this. You can apply it to every column of a data frame with sapply(books, anyNA), which returns one TRUE/FALSE per column. To count the NA values in each column, akrun's suggestion of colSums(is.na(books)) works well.
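
As a minimal sketch, assuming a hypothetical books data frame (the original data is not shown, so the columns and values here are invented), the two calls give:

```r
# Hypothetical 'books' data frame with some missing values
books <- data.frame(
  title = c("A", "B", NA),
  pages = c(100, NA, NA),
  year  = c(2001, 2005, 2010)
)

has_na   <- sapply(books, anyNA)    # does each column contain any NA?
na_count <- colSums(is.na(books))   # how many NAs are in each column?

has_na
# title pages  year 
#  TRUE  TRUE FALSE 
na_count
# title pages  year 
#     1     2     0 
```

names(has_na)[has_na] then lists just the columns that need attention.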


