How to count occurrences of combinations in data.table in R
You just need to add by=list(a,b).
DT1[,count_combination_in_dt2:=nrow(DT2[J(a,b),nomatch=0]), by=list(a,b)]
DT1
##
## a b count_combination_in_dt2
## 1: 3 8 3
## 2: 2 3 1
EDIT: Some more details: In your original version, you effectively used DT2[DT1, nomatch=0], because without by, J(a,b) is built from all a, b combinations at once, so every row gets the same total. If you want to use J(a,b) for each a, b combination separately, you need to use the by argument. The data.table is then grouped by a, b and the nrow(...) is evaluated within each group.
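As a cross-check on the grouping logic described above, the same per-combination count can be sketched with a join and by=.EACHI. The contents of DT1 and DT2 below are hypothetical, chosen only for illustration, and this is an alternative formulation rather than the code from the answer.

```r
library(data.table)

# Hypothetical data (not from the original question)
DT1 <- data.table(a = c(3, 2), b = c(8, 3))
DT2 <- data.table(a = c(3, 3, 3, 2, 4), b = c(8, 8, 8, 3, 1))

# Join DT2 to DT1 on (a, b); by = .EACHI evaluates .N once per row of DT1,
# i.e. once per (a, b) combination
counts <- DT2[DT1, on = .(a, b), .N, by = .EACHI]
print(counts)
```

With this data, the row a=3, b=8 matches three rows of DT2 and the row a=2, b=3 matches one.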
How to find all combinations in column and count occurrences in data
If I have understood you correctly, you need to group_by PersonID and paste all the unique Animal values in each group. The number of occurrences of the combination is then the number of rows in the group (n()) divided by the number of distinct values (n_distinct).
library(dplyr)
df %>%
group_by(PersonID) %>%
summarise(AnimalComb = paste(unique(Animal), collapse = ""),
CountbyID = n() / n_distinct(Animal))
# PersonID AnimalComb CountbyID
# <int> <chr> <dbl>
#1 1 DogBird 1
#2 2 SnakeSpider 1
#3 3 Cat 1
#4 4 CatDog 1
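To make the n() / n_distinct() reasoning concrete, here is a minimal base R sketch of the same idea. The data frame is hypothetical: person 1 has the Dog/Bird combination recorded twice, so 4 rows divided by 2 distinct animals gives a count of 2. It assumes every animal in a group appears the same number of times.

```r
# Hypothetical data (not from the original question)
df <- data.frame(
  PersonID = c(1, 1, 1, 1, 2, 2, 3),
  Animal   = c("Dog", "Bird", "Dog", "Bird", "Snake", "Spider", "Cat")
)

# Combination string per person: unique animals pasted together
comb <- tapply(df$Animal, df$PersonID,
               function(x) paste(unique(x), collapse = ""))

# Occurrences of the combination: rows in group / distinct animals in group
count <- tapply(df$Animal, df$PersonID,
                function(x) length(x) / length(unique(x)))

result <- data.frame(PersonID   = names(comb),
                     AnimalComb = as.vector(comb),
                     CountbyID  = as.vector(count))
print(result)
```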
R data table unique record count based on all combination of a given list of values from 2 columns
In base R, you can do:
data.frame(table(dt))
Var1 Var2 Freq
1 Col1Value1 Col2Value1 1
2 Col1Value2 Col2Value1 1
3 Col1Value3 Col2Value1 1
4 Col1Value1 Col2Value2 1
5 Col1Value2 Col2Value2 0
6 Col1Value3 Col2Value2 1
7 Col1Value1 Col2Value3 1
8 Col1Value2 Col2Value3 1
9 Col1Value3 Col2Value3 1
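A small self-contained sketch of the same approach, using a hypothetical two-column data frame: table() cross-tabulates every pairing of the observed values, so combinations that never occur show up with a frequency of 0.

```r
# Hypothetical data (not from the original question)
dt <- data.frame(Col1 = c("x", "x", "y"),
                 Col2 = c("p", "q", "p"))

# Cross-tabulate all value combinations and flatten to long form
freq <- as.data.frame(table(dt))
print(freq)

# The pair (y, q) never occurs in dt, so its Freq is 0
missing_pair <- freq$Freq[freq$Col1 == "y" & freq$Col2 == "q"]
```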
Counting the number of occurrences of a combination of values in r
I found this temporary solution (thanks to iod's solution on the first example using group_by and mutate).
df1 %>% filter(is.na(old_fase) | old_fase == "Finished") %>% # indicates the beginning of a new process
group_by(id) %>%
mutate(occurrence = row_number()) %>%
select(id, time, occurrence) %>%
left_join(df1, ., by = c("id", "time")) %>%
fill(occurrence)
count number of combinations by group
Create a "combination" column in summarise; we can count this column afterwards.
An easy way to make the combinations comparable is to sort the categories first, so that the same set of categories always produces the same string.
library(dplyr)
dd %>%
group_by(id) %>%
arrange(id, cat) %>%
summarize(combination = paste0(cat, collapse = "-"), .groups = "drop") %>%
count(combination)
# A tibble: 3 x 2
combination n
<chr> <int>
1 c-d-f 1
2 c-f 2
3 d-f 2
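The sort-then-paste idea can also be sketched in base R. The data below is hypothetical; ids 1 and 2 hold the same categories in a different order, and sorting makes them collapse to the same string.

```r
# Hypothetical data (not from the original question)
dd <- data.frame(id  = c(1, 1, 2, 2, 3, 3, 3),
                 cat = c("f", "c", "c", "f", "d", "f", "c"))

# Sort categories within each id, then paste them into one string
combo <- tapply(dd$cat, dd$id,
                function(x) paste(sort(x), collapse = "-"))

# Count how many ids share each combination
counts <- table(combo)
print(counts)
```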
Count occurrences of factors across multiple columns in grouped dataframe
You can stack col1 & col2 together, count the number of each combination, and then transform the table to a wide form.
library(dplyr)
library(tidyr)
df %>%
pivot_longer(col1:col2) %>%
count(grp, name, value) %>%
pivot_wider(id_cols = grp, names_from = c(name, value), names_sort = TRUE,
values_from = n, values_fill = 0)
# A tibble: 3 x 6
grp col1_A col1_B col2_B col2_C col2_D
<chr> <int> <int> <int> <int> <int>
1 a 1 2 2 0 1
2 b 2 0 0 2 0
3 c 1 2 0 2 1
A base R solution (thanks to @GKi for refining the code):
table(cbind(df["grp"], col=do.call(paste0, stack(df[-1])[2:1])))
col
grp col1A col1B col2B col2C col2D
a 1 2 2 0 1
b 2 0 0 2 0
c 1 2 0 2 1
How to count number of occurrences of data combinations and save in a matrix in R?
You could initialize the matrix with zeros instead of NA and then increment the matrix value like this:
pre = c(3,1,3,2,2,4,3,5,3,4,6,5,6,5,4,5,6,6,5,6,5,7,6,7,7,7,4,8,4,8,8,4,4,8,3,9,8,6,9,8)
post = c(4,3,5,3,4,6,5,6,5,4,5,6,6,5,6,5,7,6,7,7,7,4,8,4,8,8,4,4,8,3,9,8,6,9,8,8,9,7,9,9)
df = data.frame(pre,post)
matrix = matrix(0, nrow=20, ncol=20)
colnames(matrix) = seq(1,20,1)
rownames(matrix) = seq(1,20,1)
for (i in 1:40){matrix[df$post[i], df$pre[i]] = matrix[df$post[i], df$pre[i]] + 1}
By the way, setting the matrix colnames and rownames is not needed unless you need them for other reasons.
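As a possible alternative to the loop, the same counts can be obtained with table() on factors whose levels fix the matrix dimensions. This is a sketch on a small hypothetical input (shortened vectors and a 5 x 5 grid), not the code from the answer; it also runs the loop version to confirm that the two agree.

```r
# Hypothetical, shortened input (not the vectors from the answer)
pre  <- c(3, 1, 3, 2)
post <- c(4, 3, 4, 3)
n    <- 5  # assumed size of the grid

# Vectorised: fixed factor levels force an n x n table, zeros included
counts <- table(post = factor(post, levels = 1:n),
                pre  = factor(pre,  levels = 1:n))

# Loop version, as in the answer, for comparison
m <- matrix(0, nrow = n, ncol = n)
for (i in seq_along(pre)) {
  m[post[i], pre[i]] <- m[post[i], pre[i]] + 1
}

same <- all(counts == m)
```

The factor levels play the role of the pre-sized zero matrix: any value in 1:n that never occurs still gets a row and a column.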