R: how to check if all columns in a data.frame are the same
You could check if the number of unique variable vectors is equal to one:
length(unique(as.list(df))) == 1
# [1] FALSE
length(unique(as.list(df2))) == 1
# [1] TRUE
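Another way is to check whether each variable is identical to the first one (df[[1]] extracts the column as a vector for both data frames and tibbles, whereas df[,1] on a tibble returns a one-column tibble):
all(sapply(df, identical, df[[1]]))
# [1] FALSE
all(sapply(df2, identical, df2[[1]]))
# [1] TRUE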
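The answer above does not show df and df2. A minimal sketch with made-up data (the column names and values are assumptions, not the asker's data) that reproduces both checks:

```r
# Hypothetical data: in df the last column differs, in df2 all columns match.
df  <- data.frame(a = 1:3, b = 1:3, c = c(1L, 2L, 4L))
df2 <- data.frame(a = 1:3, b = 1:3, c = 1:3)

# Approach 1: count the distinct column vectors.
res_unique_df  <- length(unique(as.list(df)))  == 1
res_unique_df2 <- length(unique(as.list(df2))) == 1

# Approach 2: compare every column with the first one.
res_ident_df  <- all(sapply(df,  identical, df[[1]]))
res_ident_df2 <- all(sapply(df2, identical, df2[[1]]))

print(c(res_unique_df, res_unique_df2, res_ident_df, res_ident_df2))
```

Both approaches compare values and types but ignore the column names, so two columns with different names and the same contents count as equal.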
How to check whether all values in grouped columns are the same?
We may use n_distinct to count the unique elements in each group, convert the result to logical (== 1), and then to binary with as.integer or +
library(dplyr)
df %>%
group_by(id, category) %>%
mutate(same = +(n_distinct(yes) == 1)) %>%
ungroup
Or using data.table
library(data.table)
setDT(df)[, same := +(uniqueN(yes) == 1), by = .(id, category)]
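For readers without dplyr or data.table, the same idea can be sketched in base R with ave. The data below is invented for illustration (only the column names id, category and yes come from the answer), and yes is assumed to be numeric so that ave can return the per-group counts:

```r
# Hypothetical data; 'yes' is numeric here by assumption.
df <- data.frame(
  id       = c(1, 1, 1, 2, 2, 2),
  category = c("a", "a", "b", "a", "a", "a"),
  yes      = c(1, 1, 0, 1, 0, 1)
)

# Number of distinct 'yes' values within each id/category group,
# repeated for every row of that group.
n_unique <- ave(df$yes, df$id, df$category,
                FUN = function(x) length(unique(x)))

# 1 if the group holds a single distinct value, 0 otherwise.
df$same <- as.integer(n_unique == 1)
print(df)
```

If yes were a character column, the counts would be coerced to character by ave, so this sketch would need a numeric helper vector instead.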
How to test if column names of multiple dataframes are same
We place the datasets in a list, loop over the list with lapply, get the column names, convert them to a single case and sort them, take the unique sets, and check whether the length is 1
length(unique(lapply(lst1, function(x) sort(toupper(names(x)))))) == 1
#[1] TRUE
data
lst1 <- list(mtcars, mtcars, mtcars)
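When the check returns FALSE, it can help to see which data frame is the odd one out. A small sketch under assumed data (lst2 and the renamed column MPG2 are hypothetical), comparing every normalized name set with the first:

```r
# Hypothetical list: the third data frame has one renamed column.
lst2 <- list(mtcars, mtcars,
             setNames(mtcars, sub("mpg", "MPG2", names(mtcars))))

# Same normalization as above: upper case, then sort.
norm_names <- lapply(lst2, function(x) sort(toupper(names(x))))

# TRUE where an element has the same column names as the first one.
same_as_first <- vapply(norm_names, identical, logical(1), norm_names[[1]])
print(same_as_first)

# Positions of the data frames whose names differ.
mismatched <- which(!same_as_first)
print(mismatched)
```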
Checking for identical columns in a data frame in R
You could use identical
identical(DT[['A']],DT[['B']])
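Note that identical is an exact comparison, so numeric columns that differ only by floating-point rounding are reported as different. A sketch with made-up values (this DT is a hypothetical stand-in, not the asker's table) showing all.equal as a tolerant alternative:

```r
# Hypothetical columns that are equal up to floating-point rounding.
DT <- data.frame(A = c(0.1 + 0.2, 1), B = c(0.3, 1))

# Exact comparison: the tiny rounding difference makes this FALSE.
exact <- identical(DT[["A"]], DT[["B"]])

# Comparison within a small numeric tolerance; all.equal returns TRUE
# or a character description of the difference, hence isTRUE().
approx <- isTRUE(all.equal(DT[["A"]], DT[["B"]]))

print(c(exact = exact, approx = approx))
```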
To find whether a column exists in data frame or not
Assuming that the name of your data frame is dat and that your column name to check is "d", you can use the %in% operator:
if("d" %in% colnames(dat))
{
cat("Yep, it's in there!\n");
}
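The same operator extends to several column names at once. A minimal sketch, with a hypothetical dat and a hypothetical vector of required names:

```r
# Hypothetical data frame and list of columns we expect to find.
dat <- data.frame(a = 1:3, d = letters[1:3])
required <- c("a", "d", "z")

# One logical per requested name.
present <- required %in% colnames(dat)

# TRUE only if every requested column exists.
all_present <- all(present)

# Names that were requested but are not in the data frame.
missing_cols <- setdiff(required, colnames(dat))

print(present)
print(all_present)
print(missing_cols)
```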
Check if every row in two columns has the same sign
You can compare their signs with sign, which also handles 0: sign(0) is 0, so a zero only matches another zero.
df$comparison <- sign(df$V1) == sign(df$V2)
df
# V1 V2 comparison
#1 -1.0 2.3 FALSE
#2 3.6 2.0 TRUE
#3 -2.0 -4.0 TRUE
#4 0.0 4.0 FALSE
#5 0.0 0.0 TRUE
data
df <- structure(list(V1 = c(-1, 3.6, -2, 0, 0), V2 = c(2.3, 2, -4,
4, 0)), class = "data.frame", row.names = c(NA, -5L))
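The comparison column answers the question row by row. To get a single answer for the whole data frame, the row-wise result can be collapsed with all, sketched here on the same data:

```r
# Same data as above.
df <- data.frame(V1 = c(-1, 3.6, -2, 0, 0),
                 V2 = c(2.3, 2, -4, 4, 0))

# Row-wise comparison of signs.
same_sign <- sign(df$V1) == sign(df$V2)

# TRUE only if every row has matching signs.
every_row_matches <- all(same_sign)

# Row numbers where the signs differ.
differing_rows <- which(!same_sign)

print(every_row_matches)
print(differing_rows)
```

If either column contains NA, the comparison for that row is NA as well, so all(..., na.rm = TRUE) may be needed depending on how missing values should be treated.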
Finding all columns in data frame of a certain value by row
library(data.table)
# dummy data
# use setDT(df) if yours isn't a datatable already
df <- data.table(id = 1:3
, a = c(4,4,0)
, b = c(0,4,0)
, c = c(4,0,4)
); df
#    id a b c
# 1:  1 4 0 4
# 2:  2 4 4 0
# 3:  3 0 0 4
# find 1st & last column with target value
df[, .(id
, first = apply(.SD, 1, \(i) names(df)[min(which(i==4))])
, last = apply(.SD, 1, \(i) names(df)[max(which(i==4))])
)
]
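A base R variant of the same idea, offered as a sketch rather than a replacement for the data.table answer: it swaps in max.col, which returns the position of the row maximum and lets ties be resolved toward the first or last column. The data mirrors the dummy table above; the guard for rows without the target value is an addition.

```r
# Same dummy data as above, as a plain data frame.
df <- data.frame(id = 1:3,
                 a = c(4, 4, 0),
                 b = c(0, 4, 0),
                 c = c(4, 0, 4))

# 1 where the value columns equal the target, 0 elsewhere.
hit <- (as.matrix(df[, c("a", "b", "c")]) == 4) * 1

# Rows with at least one hit; others get NA instead of a column name.
has_hit <- rowSums(hit) > 0

first <- ifelse(has_hit, colnames(hit)[max.col(hit, ties.method = "first")], NA)
last  <- ifelse(has_hit, colnames(hit)[max.col(hit, ties.method = "last")],  NA)

result <- data.frame(id = df$id, first = first, last = last)
print(result)
```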
Checking all columns in data frame for missing values in R
The anyNA function is built for this. You can apply it to all columns of a data frame with sapply(books, anyNA). To count NA values, akrun's suggestion of colSums(is.na(books)) is good.
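A short sketch of both calls on an invented books data frame (the columns and values are assumptions for illustration only):

```r
# Hypothetical data with one missing title.
books <- data.frame(title = c("A", NA, "C"),
                    pages = c(100, 200, 300))

# Does each column contain at least one NA?
has_na <- sapply(books, anyNA)

# How many NA values does each column contain?
na_count <- colSums(is.na(books))

print(has_na)
print(na_count)
```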