Create a Co-Occurrence Matrix from Dummy-Coded Observations

Take the cross-product of the matrix with itself, then zero the diagonal:

X <- as.matrix(X)
out <- crossprod(X)  # Same as: t(X) %*% X
diag(out) <- 0       # (b/c you don't count co-occurrences of an aspect with itself)
out
#      [,1] [,2] [,3] [,4]
# [1,]    0    0    1    0
# [2,]    0    0    2    1
# [3,]    1    2    0    1
# [4,]    0    1    1    0

To get the results into a data.frame exactly like the one you showed, you can then do something like:

nms <- paste0("X", seq_len(ncol(out)))
dimnames(out) <- list(nms, nms)
out <- as.data.frame(out)

Constructing a co-occurrence matrix in Python pandas

This is simple linear algebra: multiply the transpose of the matrix by the matrix itself. Your example contains strings, so convert them to integers first:

>>> df_asint = df.astype(int)
>>> coocc = df_asint.T.dot(df_asint)
>>> coocc
       Dop  Snack  Trans
Dop      4      2      3
Snack    2      3      2
Trans    3      2      4

If, as in the R answer, you want to reset the diagonal to zero, you can use NumPy's fill_diagonal:

>>> import numpy as np
>>> np.fill_diagonal(coocc.values, 0)
>>> coocc
       Dop  Snack  Trans
Dop      0      2      3
Snack    2      0      2
Trans    3      2      0
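
Writing through `coocc.values` relies on that array being a writable view of the frame, which may not hold in every pandas version, so it is worth checking on your setup. Below is a minimal sketch that avoids in-place mutation of the frame. It assumes a small made-up `df` (not the data from the question), and `cooccurrence` is a hypothetical helper name:

```python
import numpy as np
import pandas as pd

def cooccurrence(df: pd.DataFrame) -> pd.DataFrame:
    """Co-occurrence counts of 0/1 columns, with the diagonal set to zero."""
    x = df.astype(int).to_numpy()
    counts = x.T @ x             # entry (i, j) = rows where columns i and j are both 1
    np.fill_diagonal(counts, 0)  # counts is a freshly allocated array, so this is safe
    return pd.DataFrame(counts, index=df.columns, columns=df.columns)

# Made-up example data, stored as strings like the frame in the question
df = pd.DataFrame({
    "Dop":   ["1", "1", "0", "1"],
    "Snack": ["0", "1", "1", "0"],
    "Trans": ["1", "1", "0", "0"],
})
coocc = cooccurrence(df)
```

Building a new DataFrame from the NumPy result keeps the original column labels on both axes and leaves `df` untouched.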

Finding the number of co-occurrences of multiple binary variables in R

Try crossprod, which computes t(df) %*% df. For 0/1 data the diagonal holds the number of 1s in each variable:

> crossprod(df)
   v1 v2 v3
v1  2  2  2
v2  2  4  4
v3  2  4  6

Convert dummy-coded matrix to adjacency matrix

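You can use the outer function:

# Number of rows in which both columns are 1; x and y are blocks of selected columns
count1s <- function(x, y) colSums(x == 1 & y == 1)
n <- seq_len(ncol(data))
# outer() passes all index pairs in one call, so count1s must work column-wise
mat <- outer(n, n, function(x, y) count1s(data[, x], data[, y]))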
diag(mat) <- 0
dimnames(mat) <- list(colnames(data), colnames(data))
mat

#  A B C D
#A 0 1 2 0
#B 1 0 0 1
#C 2 0 0 0
#D 0 1 0 0

Convert co-occurrence dataframe to square matrix

What you described in words sounds like ordinary matrix multiplication followed by setting the diagonal to 0:

temp <- t(as.matrix(d)) %*% as.matrix(d)
diag(temp) <- 0


> temp
  A B C D E F G H
A 0 6 1 0 0 0 0 3
B 6 0 1 0 0 0 0 3
C 0 1 0 0 0 0 0 0
D 0 0 0 0 0 0 0 0
E 0 0 0 0 0 0 0 0
F 0 0 0 0 0 0 0 0
G 0 0 0 0 0 0 0 0
H 3 3 0 0 0 0 0 0

crossprod(as.matrix(d)) computes the same product and is usually slightly faster, but either of these methods will surely outperform your nested loop.

How to create a logical AND contingency table in R?

sapply(df, function(x) sapply(df, function(y) sum(x * y)))
# or, equivalently, as a matrix product:
t(df) %*% as.matrix(df)
#      typeA typeB typeC
#typeA     4     3     2
#typeB     3     4     2
#typeC     2     2     4
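
The two expressions above agree because, for 0/1 values, x * y equals 1 exactly when both values are 1, so summing the products counts the rows where both types occur. A minimal sketch of that equivalence in Python (the language of the pandas answer above), using made-up observations rather than the question's df:

```python
from itertools import product

# Made-up 0/1 observations: one dict per row, one key per type
rows = [
    {"typeA": 1, "typeB": 1, "typeC": 0},
    {"typeA": 1, "typeB": 0, "typeC": 1},
    {"typeA": 0, "typeB": 1, "typeC": 1},
    {"typeA": 1, "typeB": 1, "typeC": 1},
]
names = list(rows[0])

# Way 1: count rows where both indicators are 1 (logical AND)
by_and = {(a, b): sum(1 for r in rows if r[a] == 1 and r[b] == 1)
          for a, b in product(names, repeat=2)}

# Way 2: sum of element-wise products, i.e. entry (a, b) of t(X) %*% X
by_product = {(a, b): sum(r[a] * r[b] for r in rows)
              for a, b in product(names, repeat=2)}

assert by_and == by_product
```

The equivalence only holds for strictly binary data. With counts larger than 1 the matrix product weights rows by their values, while the logical AND version does not.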

Co-occurrence (matrix) of values based on group and time

Here you go:

library(data.table)
library(magrittr)
options(stringsAsFactors = FALSE)

dat <- read.table(text="Group ID Time
Trx1 A 1980
Trx1 B 1980
Trx1 C 1980
Trx2 E 1980
Trx2 B 1980
Trx3 B 1981
Trx3 C 1981
Trx4 C 1983
Trx4 E 1983
Trx4 B 1983
Trx5 F 1984
Trx5 B 1984
Trx5 C 1984
Trx6 A 1986", header = TRUE)

str(dat)
dat = as.data.table(dat)

priorYears = 3
unqIDs = unique(dat$ID)


results = data.table(ID = character(), year = numeric(), total = numeric(), diff = numeric(), repeatSum = numeric())

for(i in 1:nrow(dat)){

  endYear = dat$Time[i] 
  startYear = endYear - priorYears
  this.ID = dat$ID[i]
  this.group = dat$Group[i]

  #Dates filtering
  subset.DT = dat[dat$Time >= startYear & dat$Time < endYear] 

  # Keep projects where my current ID collaborated 
  groupsToKeep = subset.DT$Group[subset.DT$ID == this.ID] %>% unique
  subset.DT = subset.DT[subset.DT$Group %in% groupsToKeep,]


  # Calculations
  unqMembers = unique(subset.DT$ID) %>% .[. != this.ID]
  currentMembers = dat$ID[dat$Group == this.group] %>% .[. != this.ID]

  total = length(which(subset.DT$ID != this.ID))
  diff = length(unqMembers)
  repeatSum = sum(table(subset.DT$ID)[currentMembers], na.rm = TRUE)

  # Add results
  results = rbind(results, data.frame(ID = this.ID, year = endYear, total, diff, repeatSum))

}
