Counting Unique Values Across Variables (Columns) in R

Counting unique values across variables (columns) in R

The trick is to use apply() over rows (MARGIN = 1), which passes each row to your function as a single vector (here called x). You can then write a custom function, in this case one that uses unique() and length() to get the count you want.

df <- data.frame('2012' = c(3, 5, 6), '2009' = c(1, 3, 7), '2006' = c(4, 2, 3),
                 '2003' = c(4, 2, 5), '2000' = c(1, 3, 6))

# MARGIN = 1 applies the function to each row
df$nunique <- apply(df, 1, function(x) length(unique(x)))
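
For reference, since apply() works row-wise here, the result on this example data should look roughly like this (note that data.frame() prepends an X to the purely numeric column names by default):

df
#  X2012 X2009 X2006 X2003 X2000 nunique
#1     3     1     4     4     1       3
#2     5     3     2     2     3       3
#3     6     7     3     5     6       4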

Count unique values across columns in R

With tidyverse, first convert the factor columns to character, then use map2 to split 'partners' into individual vectors of strings and count the unique values together with 'names' using n_distinct.

library(tidyverse)

df %>%
  mutate_all(as.character) %>%
  mutate(uniquecounts = map2_dbl(names, partners,
                 ~ n_distinct(c(.x, str_split(.y, ", ")[[1]]))))


#    names                  partners uniquecounts
#1    John  Mary, Ashley, John, Kate            4
#2    Mary Charlie, John, Mary, John            3
#3 Charlie               Kate, Marcy            3
#4   David              Mary, Claire            3

With the same logic in base R:

df[] <- lapply(df, as.character)
as.numeric(mapply(function(x, y) length(unique(c(x, y))),
                  df$names, strsplit(df$partners, ", ")))
#[1] 4 3 3 3
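
data

The input data frame isn't shown in this answer; reconstructed from the output above, it would look something like this (treat it as an approximation of the original data):

df <- data.frame(
  names = c("John", "Mary", "Charlie", "David"),
  partners = c("Mary, Ashley, John, Kate", "Charlie, John, Mary, John",
               "Kate, Marcy", "Mary, Claire"),
  stringsAsFactors = FALSE  # already character, so the as.character step is a no-op here
)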

R - Count unique/distinct values in two columns together per group

You can subset the data with cur_data() and unlist it to get a vector, then use n_distinct to count the number of unique values.

library(dplyr)

df %>%
  group_by(ID) %>%
  mutate(Count = n_distinct(unlist(select(cur_data(), Party, Party2013)),
                            na.rm = TRUE)) %>%
  ungroup()


#     ID  Wave Party Party2013 Count
#  <int> <int> <chr> <chr>     <int>
#1     1     1 A     A             2
#2     1     2 A     NA            2
#3     1     3 B     NA            2
#4     1     4 B     NA            2
#5     2     1 A     C             3
#6     2     2 B     NA            3
#7     2     3 B     NA            3
#8     2     4 B     NA            3
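
As a side note, cur_data() is superseded by pick() in dplyr 1.1.0 and later, so on a recent dplyr the same idea can be written as (a sketch, assuming the same df):

df %>%
  group_by(ID) %>%
  mutate(Count = n_distinct(unlist(pick(Party, Party2013)), na.rm = TRUE)) %>%
  ungroup()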

data

It is easier to help if you provide data in a reproducible format

df <- structure(list(ID = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), Wave = c(1L, 
2L, 3L, 4L, 1L, 2L, 3L, 4L), Party = c("A", "A", "B", "B", "A",
"B", "B", "B"), Party2013 = c("A", NA, NA, NA, "C", NA, NA, NA
)), class = "data.frame", row.names = c(NA, -8L))

Group by and count unique values in several columns in R

Here's an approach using dplyr::across, which is a handy way to calculate across multiple columns:

my_data <- data.frame(
  city = c(rep("A", 3), rep("B", 3)),
  col1 = 1:6,
  col2 = 0,
  col3 = c(1:3, 4, 4, 4),
  col4 = 1:2
)

library(dplyr)
my_data %>%
  group_by(city) %>%
  summarize(across(col1:col4, n_distinct))

# A tibble: 2 x 5
  city   col1  col2  col3  col4
* <chr> <int> <int> <int> <int>
1 A         3     1     3     2
2 B         3     1     1     2
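
If you don't want to name the columns explicitly, across(everything(), n_distinct) should count every non-grouping column the same way:

my_data %>%
  group_by(city) %>%
  summarize(across(everything(), n_distinct))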

How to count unique values over multiple columns using R?

You could unlist and use table to get the counts in base R:

stack(table(unlist(df)))
#Same as
#stack(table(as.matrix(df)))
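
With the data shown at the bottom of this answer, the base-R version should print something close to:

stack(table(unlist(df)))
#  values               ind
#1      1      home,leisure
#2      3         home,work
#3      1      leisure,work
#4      3         work,home
#5      1 work,home,leisure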

If you prefer tidyverse, get the data in long format using pivot_longer and then count:

df %>%
  tidyr::pivot_longer(cols = everything()) %>%
  dplyr::count(value)

# A tibble: 5 x 2
#  value                 n
#  <chr>             <int>
#1 home,leisure          1
#2 home,work             3
#3 leisure,work          1
#4 work,home             3
#5 work,home,leisure     1
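
If what you actually need is a single count of distinct values across all the columns rather than a per-value tally, unlist first and count once:

length(unique(unlist(df)))
#[1] 5

# or, equivalently, with dplyr
dplyr::n_distinct(unlist(df))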

data

df <- structure(list(X1 = c("home,work", "leisure,work", "home,leisure"
), X2 = c("work,home", "work,home,leisure", "work,home"), X3 = c("home,work",
"work,home", "home,work")), class = "data.frame", row.names = c(NA, -3L))

How to count unique values in a column in R

We can use n_distinct() from dplyr to count the number of unique values for a column in a data frame.

textFile <- "id var1
111 A
109 A
112 A
111 A
108 A"

df <- read.table(text = textFile, header = TRUE)
library(dplyr)
df %>% summarise(count = n_distinct(id))

...and the output:

> df %>% summarise(count = n_distinct(id))
  count
1     4
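
The base-R equivalent, if you would rather avoid the dplyr dependency, is simply:

length(unique(df$id))
#[1] 4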

We can also summarise the counts within one or more group_by() columns.

textFile <- "id var1
111 A
109 A
112 A
111 A
108 A
201 B
202 B
202 B
111 B
112 B
109 B"

df <- read.table(text = textFile, header = TRUE)
df %>% group_by(var1) %>% summarise(count = n_distinct(id))

...and the output:

`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 2 x 2
  var1  count
  <chr> <int>
1 A         4
2 B         5
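
The ungrouping message is only informational; to silence it, set the .groups argument explicitly:

df %>% group_by(var1) %>% summarise(count = n_distinct(id), .groups = "drop")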

Find the count of unique values in all columns in a dataframe without including NA values (R)

You can use dplyr::n_distinct with na.rm = T:

library(dplyr)
sapply(dat, n_distinct, na.rm = T)
#map_dbl(dat, n_distinct, na.rm = T)

#nat_country         age
#          3           8

In base R, you can use na.omit as well:

sapply(dat, \(x) length(unique(na.omit(x))))
#nat_country         age
#          3           8
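
If you prefer the result as a one-row data frame rather than a named vector, an across() version along these lines should give the same counts (assuming the same dat):

dat %>%
  summarise(across(everything(), ~ n_distinct(.x, na.rm = TRUE)))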

