How to get frequency counts on two variables in R?
Since we only need the frequency of 'value' and 'value_x' for rows where 'id' is not NA, subset the rows with a non-NA 'id', select the columns of interest, build the table, and convert it to a data.frame:
as.data.frame(table(subset(frequency.data.frame,
                           !is.na(id),
                           select = c('value', 'value_x'))))
The tidyverse syntax for the above solution would be:
library(dplyr)
frequency.data.frame %>%
  filter(!is.na(id)) %>%
  count(var1 = value, var2 = value_x)
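The two approaches differ in one detail that is easy to miss: table() reports every combination of levels, including those with a count of zero, while count() only returns combinations that occur in the data. A minimal sketch on made-up data (the values below are hypothetical; only the column names come from the question):

```r
library(dplyr)

# Hypothetical toy data; column names follow the question, values are invented
frequency.data.frame <- data.frame(
  id      = c(1, 2, NA, 4, 5, NA),
  value   = c("a", "a", "b", "b", "a", "a"),
  value_x = c("x", "x", "y", "y", "y", "x")
)

# Base R: the table holds every value/value_x combination, zero counts included
base_counts <- as.data.frame(table(subset(frequency.data.frame,
                                          !is.na(id),
                                          select = c(value, value_x))))

# dplyr: only combinations that actually occur are returned
dplyr_counts <- frequency.data.frame %>%
  filter(!is.na(id)) %>%
  count(value, value_x)

nrow(base_counts)   # 4 rows: the (b, x) combination appears with Freq 0
nrow(dplyr_counts)  # 3 rows: (b, x) is absent
```

If the zero rows are not wanted in the base R result, they can be dropped afterwards, for example with subset(base_counts, Freq > 0).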
Make a table of variable counts in R
You could do:
as.data.frame(t(apply(sample[-1], 2, function(x) table(factor(x, 1:6)))))
#> 1 2 3 4 5 6
#> home 0 1 1 1 2 1
#> office 1 2 1 0 1 1
#> other 2 0 1 1 1 1
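The factor(x, 1:6) call is what keeps the result rectangular: it fixes the levels so that every column returns six counts, zeros included. A small sketch of that idea, using a hypothetical data frame named ratings (not the original sample data, which is not shown):

```r
# Hypothetical data: first column is an id, the others hold scores from 1 to 6
ratings <- data.frame(
  id     = 1:4,
  home   = c(2, 3, 5, 5),
  office = c(1, 2, 2, 6)
)

# Without fixed levels, each column gives a table of a different length
ragged <- lapply(ratings[-1], table)

# With fixed levels 1:6, every column gives six counts, so apply() returns a matrix
counts <- as.data.frame(
  t(apply(ratings[-1], 2, function(x) table(factor(x, levels = 1:6))))
)
counts
```

Here ragged$home has three entries and ragged$office has three different ones, so they cannot be stacked directly, whereas counts has one row per variable and one column per possible score.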
frequency counts with categorical variables
We can use table from base R after subsetting the columns of interest:
table(df1[c('A', 'B')])
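As a rough illustration, here is the same call on an invented df1 (the column names A and B follow the answer; the values and the extra column C are assumptions):

```r
# Hypothetical data frame; only the names A and B come from the answer
df1 <- data.frame(
  A = c("x", "x", "y", "y", "y"),
  B = c("u", "v", "u", "u", "v"),
  C = 1:5
)

# Two-way contingency table of A by B
tab <- table(df1[c("A", "B")])
tab

# Long format: one row per A/B combination, with the count in column Freq
long <- as.data.frame(tab)
long
```

The wide table is convenient for reading, while the long form is usually easier to merge or plot.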
count frequency of variable dependent on other variable in an R dataframe
Try this dplyr solution with summarise() and n():
library(dplyr)
df %>% group_by(samples,source) %>% summarise(N=n())
Output:
# A tibble: 4 x 3
# Groups: samples [2]
samples source N
<chr> <chr> <int>
1 45fe.K2 f 2
2 45fe.K2 o 1
3 45hi.K1 k 1
4 45hi.K1 o 1
A base R solution would be to create an indicator variable N filled with ones and then sum it with aggregate():
#Data
df$N <- 1
#Code
aggregate(N~samples+source,df,sum)
Output:
samples source N
1 45fe.K2 f 2
2 45hi.K1 k 1
3 45fe.K2 o 1
4 45hi.K1 o 1
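For reference, dplyr's count() wraps the group_by() + summarise(n()) pattern in one call. A short sketch, with the data frame reconstructed from the output above (an assumption, since the original df is not shown):

```r
library(dplyr)

# Data reconstructed from the printed output; the original df is not shown
df <- data.frame(
  samples = c("45fe.K2", "45fe.K2", "45fe.K2", "45hi.K1", "45hi.K1"),
  source  = c("f", "f", "o", "k", "o")
)

# count() groups by the given columns and tallies the rows in each group;
# name = "N" sets the name of the count column
counts <- df %>% count(samples, source, name = "N")
counts
```

Unlike the aggregate() version, this does not need the helper column of ones.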
How to get the frequency (count) of Variable C when Variables A and B are mentioned together?
n() returns the number of cases of that particular combination in group_by. As you showed two different outputs, I'm not sure exactly how you got them, so I'm not sure how to interpret your %s.
Without a reproducible example, it's hard to help you fully. But if I got it right, you're on the right track. I'd just be careful with counting inside different group settings.
There is definitely a more clever way of doing it, but I'd break it down into two steps, as in the code below, so that counts computed under different grouping variables don't get mixed up:
library(dplyr)
## Create some fake data
set.seed(101)
df <-
  data.frame("Q6" = sample(8:10, size = 50, replace = TRUE),
             "Q9" = round(rnorm(n = 50, mean = 32, sd = 2), digits = 0),
             "Q11" = sample(1:2, size = 50, replace = TRUE))
## Then summarise the number of occurrences
## based on combinations of Q6 and Q9
## i.e. how many times that combination of Q6 and Q9 happened
out1 <-
  df %>%
  group_by(Q6, Q9) %>%
  summarise(n_q6_q9 = n())
## Then count the number of Y/N (your Q11) by combinations of Q6 and Q9
## i.e. how many Y or N for each Q6~Q9 combination
out2 <-
  df %>%
  group_by(Q6, Q9, Q11) %>%
  summarise(n_q11 = n())
## Merge them and calculate the percentage
out_final <-
  left_join(out2, out1, by = c("Q6", "Q9")) %>% ## Note order of out2 and out1
  mutate(per = paste0(round(n_q11/n_q6_q9 * 100, digits = 2), "%"))
  # %>% ## Not sure if you need to arrange it?
  # group_by(Q6, Q9) %>%
  # arrange(per)
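One possible single-pipeline variant, offered as a sketch rather than a replacement for the two-step version: count the Q6/Q9/Q11 combinations once, then regroup by Q6 and Q9 and sum those counts to get the denominator. The name out_single is made up for this example; the fake data is the same as above.

```r
library(dplyr)

## Same fake data as in the two-step version
set.seed(101)
df <-
  data.frame("Q6" = sample(8:10, size = 50, replace = TRUE),
             "Q9" = round(rnorm(n = 50, mean = 32, sd = 2), digits = 0),
             "Q11" = sample(1:2, size = 50, replace = TRUE))

out_single <-
  df %>%
  ## one row per Q6/Q9/Q11 combination, with its count
  count(Q6, Q9, Q11, name = "n_q11") %>%
  ## within each Q6/Q9 pair, the total is the sum of the Q11 counts
  group_by(Q6, Q9) %>%
  mutate(n_q6_q9 = sum(n_q11),
         per = paste0(round(n_q11 / n_q6_q9 * 100, digits = 2), "%")) %>%
  ungroup()
```

This avoids the join, at the cost of relying on the reader noticing that the grouping changes between count() and mutate().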