# How to Find Common Elements from Multiple Vectors

## How to find common elements from multiple vectors?

``intersect(intersect(a,b),c)``

will do the job.

EDIT: More cleverly, and more conveniently if you have a lot of arguments:

``Reduce(intersect, list(a,b,c))``
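As a quick check that the two forms agree, here is a minimal sketch. The vectors `v1`, `v2` and `v3` are made up for illustration and are not from the original question.

```r
# Hypothetical sample vectors (not from the original question)
v1 <- c(1, 3, 5, 7, 9)
v2 <- c(3, 6, 9, 12)
v3 <- c(2, 3, 9, 10)

# Nested calls: fine for a handful of vectors
nested <- intersect(intersect(v1, v2), v3)

# Reduce folds intersect over the list, so it scales to any number of vectors
reduced <- Reduce(intersect, list(v1, v2, v3))

print(reduced)                    # 3 9
print(identical(nested, reduced)) # TRUE
```

Because `Reduce` takes a list, the same call works unchanged when the vectors are already stored in a list of unknown length.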

## Find common element between multiple vectors (no integer elements)

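You could put the elements into a hash table and count them. Each time an element shows up again, bump its counter. If the counter for a particular item equals the number of vectors, you have found an element of the intersection. There is no need to pre-sort the vectors of pairs or to define a strict weak ordering. Note that this assumes each element occurs at most once per vector; otherwise, deduplicate each vector first.

Along these lines: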

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <list>
#include <unordered_map>
#include <vector>

// should be std::pair<int, int> or similar (std::pair would also need a custom hash)
using Qpair = std::uint32_t;
using Qpairs = std::vector<Qpair>;

std::size_t intersections(const std::list<Qpairs>& allpairs) {
    std::unordered_map<Qpair, std::size_t> m; // element vs counter

    const auto count = allpairs.size(); // number of vectors to scan

    for (const auto& pairs : allpairs) { // loop over all vectors
        for (const auto& p : pairs) {    // loop over elements in particular vector
            m[p] += 1;                   // and count them
        }
    }

    std::size_t total_count = 0; // how many common elements are here
    for (const auto& e : m) {
        if (e.second == count) {
            ++total_count;
            // you could add e.first to output vector as well
        }
    }
    return total_count;
}

int main() {
    Qpairs v1{ 4, 2, 6, 8, 9 };
    Qpairs v2{ 1, 3, 8, 9, 4 };
    Qpairs v3{ 2, 8, 9, 5, 0 };

    std::list<Qpairs> l{ v1, v2, v3 };

    auto q = intersections(l);

    std::cout << q << '\n'; // prints 2 (8 and 9 occur in all three vectors)

    return 0;
}
```

## R: how to find common elements with the same indices in multiple vectors

We can do this using a simple comparison check:

``x == y``

and subsetting x by it: `x[x==y]`. Then the question is how to best loop it over the combinations.

Here, I'll use `outer` to apply a vectorized anonymous function to every pairwise combination of the vectors in the list.

```r
v1 <- c(1, 99, 10, 11, 23)
v2 <- c(1, 99, 10, 23, 11)
v3 <- c(2, 4, 10, 13, 23)
l <- list(v1, v2, v3)
output <- outer(l, l, Vectorize(function(x, y) {x[x == y]}))
output
     [,1]      [,2]      [,3]     
[1,] Numeric,5 Numeric,3 Numeric,2
[2,] Numeric,3 Numeric,5 10       
[3,] Numeric,2 10        Numeric,5
```

If you look at the output matrix, each cell holds the overlap of the two vectors indexed by its row and column:

```r
output[1,2]
[[1]]
[1]  1 99 10
```
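If the goal is instead the positions where all vectors agree at once, rather than pairwise, one possible sketch is to bind the vectors into a matrix and compare every column with the first. This assumes the vectors have equal length; the names `m`, `keep`, `shared_idx` and `shared_val` are illustrative only.

```r
v1 <- c(1, 99, 10, 11, 23)
v2 <- c(1, 99, 10, 23, 11)
v3 <- c(2, 4, 10, 13, 23)

# One column per vector (assumes all vectors have the same length)
m <- cbind(v1, v2, v3)

# m[, 1] is recycled down each column, so every column is compared with v1;
# a row qualifies when all of its entries match
keep <- rowSums(m == m[, 1]) == ncol(m)

shared_idx <- which(keep)  # 3
shared_val <- v1[keep]     # 10

print(shared_idx)
print(shared_val)
```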

## Find the common elements from multiple vectors that appear in at least a given percentage of them

I think this would work. We use the `table` function to do most of the heavy lifting.

```r
find_perc <- function(..., perc = .75){
    list_len <- length(list(...)) # how many vectors
    tab_it <- table(c(...)) # tabulate all the names
    tab_it_perc <- tab_it / list_len # calculate the frequencies
    names(tab_it_perc[tab_it_perc >= perc]) # return those with freq >= perc
}

> find_perc(a, b, c, d)
[1] "Greg"   "Mark"   "Mathew"
> find_perc(a, b, c, d, perc = .5)
[1] "Greg"   "Igor"   "Kate"   "Mark"   "Mary"   "Mathew" "Robin"  "Tobias"
```
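The vectors `a` to `d` are not shown here, so the following is a self-contained sketch with made-up names. It also covers a case the function above does not handle: if a name can repeat inside a single vector, `table(c(...))` counts it more than once. The variant `find_perc_unique` is a hypothetical helper that deduplicates each vector first.

```r
# Hypothetical variant: each vector contributes at most one count per value
find_perc_unique <- function(..., perc = 0.75) {
  vecs <- lapply(list(...), unique)  # drop repeats inside each vector
  tab <- table(unlist(vecs))         # in how many vectors does each value occur
  share <- tab / length(vecs)        # fraction of vectors containing the value
  names(share[share >= perc])
}

# Made-up sample data; "Cid" is repeated inside v1 on purpose
v1 <- c("Ann", "Bob", "Cid", "Cid")
v2 <- c("Ann", "Bob", "Dee")
v3 <- c("Ann", "Cid", "Eve")
v4 <- c("Ann", "Bob", "Eve")

res_75 <- find_perc_unique(v1, v2, v3, v4)              # "Ann" "Bob"
res_50 <- find_perc_unique(v1, v2, v3, v4, perc = 0.5)  # "Ann" "Bob" "Cid" "Eve"

print(res_75)
print(res_50)
```

Without the `unique` step, "Cid" would be counted three times across four vectors and would pass the default 75% threshold even though it occurs in only two of them.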

## How to find common elements from multiple vectors and from a matrix?

We can use `Map` to intersect each row of the matrix `c` with the elements common to `a` and `b`:

``Map(intersect, split(c, row(c)), list(intersect(a,b)))``
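The inputs are not shown here, so this is a small sketch with made-up data. The matrix is named `m` instead of `c` purely for readability; `a`, `b`, `m`, `common` and `res` are all illustrative.

```r
# Hypothetical inputs
a <- c(1, 2, 3, 4)
b <- c(2, 3, 4, 5)
m <- matrix(c(2, 9, 4, 3, 3, 8), nrow = 2)  # rows: (2, 4, 3) and (9, 3, 8)

common <- intersect(a, b)                   # 2 3 4

# split(m, row(m)) yields one vector per matrix row, named "1", "2", ...;
# the length-one list is recycled, so every row is intersected with `common`
res <- Map(intersect, split(m, row(m)), list(common))

print(res[["1"]])  # 2 4 3
print(res[["2"]])  # 3
```

The result is a list with one entry per matrix row, which keeps the per-row intersections separate instead of merging them.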

## Finding Index of Common Elements in Multiple Vectors in R

```r
a <- c(5,2); b <- c(5,3); d <- c(4,5)
mylist = list(a = a, b = b, d = d)  #OR  mylist = mget(c("a", "b", "d"))
common_values = Reduce(intersect, mylist)
lapply(mylist, function(x) which(x %in% common_values))
#$a
#[1] 1
#$b
#[1] 1
#$d
#[1] 2
```

It is not clear how you want to handle the case where there is more than one common value, but here is one way:

```r
a = 1:3
b = 2:4
d = c(2, 7, 3, 5)
mylist = mget(c("a", "b", "d"))
common_values = Reduce(intersect, mylist)
lapply(mylist, function(x)
    sapply(setNames(common_values, common_values), function(y)
        which(x %in% y)))
#$a
#2 3 
#2 3 
#$b
#2 3 
#1 2 
#$d
#2 3 
#1 3 
```
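If only the first position of each common value is needed, base R's `match` is a possible alternative to the nested `sapply`. The sketch below reuses the same data; `first_pos` is an illustrative name, and unlike `which`, `match` reports only the first occurrence when a value repeats within a vector.

```r
a <- 1:3
b <- 2:4
d <- c(2, 7, 3, 5)
mylist <- list(a = a, b = b, d = d)

common_values <- Reduce(intersect, mylist)  # 2 3

# match(x, table) gives the position of the first occurrence of each x in table
first_pos <- lapply(mylist, function(x)
  setNames(match(common_values, x), common_values))

print(first_pos$a)  # positions 2 and 3
print(first_pos$b)  # positions 1 and 2
print(first_pos$d)  # positions 1 and 3
```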

## How to find elements common in at least 2 vectors?

It is much simpler than a lot of people are making it look. This should be very efficient.

1. Put everything into a vector:

   ``x <- unlist(list(a, b, c, d, e))``

2. Look for duplicates:

   ```r
   unique(x[duplicated(x)])
   # [1] 2 3 1 4 8
   ```

and `sort` if needed.

Note: If there can be duplicates within a single list element (which your example does not seem to imply), define `x` as `x <- unlist(lapply(list(a, b, c, d, e), unique))` instead.

Edit: as the OP has expressed interest in a more general solution (elements common to at least `n` vectors, for any n >= 2), I would do:

``which(tabulate(x) >= n)``

if the data consists only of positive integers (1, 2, etc.) as in the example. If not:

```r
f <- table(x)
names(f)[f >= n]
```

This is now not too far from James's solution, but it avoids the costly-ish `sort`. And it is miles faster than computing all possible combinations.
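Putting the pieces together, a minimal sketch of the general case might look like the following. The helper name `common_in_at_least` and the sample vectors are made up for illustration; the per-vector `unique` step follows the note above about duplicates inside a single vector.

```r
# Hypothetical helper: values that occur in at least n of the given vectors
common_in_at_least <- function(vecs, n) {
  x <- unlist(lapply(vecs, unique))  # each vector counts a value at most once
  f <- table(x)
  names(f)[f >= n]                   # note: values come back as character strings
}

# Made-up sample data; the last vector repeats 3 on purpose
vecs <- list(c(1, 2, 3), c(2, 3, 4), c(3, 4, 5), c(3, 3, 9))

at_least_2 <- common_in_at_least(vecs, 2)  # "2" "3" "4"
at_least_3 <- common_in_at_least(vecs, 3)  # "3"

print(at_least_2)
print(at_least_3)

# For positive integers only, tabulate() keeps the numeric type
x <- unlist(lapply(vecs, unique))
print(which(tabulate(x) >= 2))             # 2 3 4
```

Since `table` returns its labels as strings, wrap the result in `as.numeric()` if numeric values are needed downstream.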