How to Create a Prop.Table() for a Three Dimension Table

Calculations proportions of a three way table

Is this what you are chasing?

The data:

test <- structure(c(42L, 151L, 313L, 69L, 22L, 46L, 47L, 24L, 17L, 36L, 108L, 16L), .Dim = c(2L, 2L, 3L), .Dimnames = list(c("0", "1" ), c("female", "male"), c("adult", "child", "unknown")), class = "table")

Get the column percentages (2) within each sub-table (3):

prop.table(test,c(2,3))

Results in:

, , adult

     female      male
0 0.2176166 0.8193717
1 0.7823834 0.1806283

, , child

     female      male
0 0.3235294 0.6619718
1 0.6764706 0.3380282

, , unknown

     female      male
0 0.3207547 0.8709677
1 0.6792453 0.1290323

How can I create a prop.table from a dataset with multiple variables?

With sapply and tapply you can do :

cols <- 4:10
t(sapply(df[cols], function(x) tapply(x, df$poverty_t, sum)/sum(x)))

#                  -1         0         1          2
#n_fem      0.3333333 0.3333333 0.2500000 0.08333333
#n_male     0.5000000 0.2000000 0.3000000 0.00000000
#n_Sec_Edu  0.0000000 0.3333333 0.6666667 0.00000000
#n_High_Edu 0.5000000 0.3750000 0.1250000 0.00000000
#n_emp      0.3333333 0.4444444 0.2222222 0.00000000
#n_noemp    0.3333333 0.0000000 0.3333333 0.33333333
#n_stud     0.4166667 0.1666667 0.3333333 0.08333333

Working with Three-Dimensional Frequency Tables in R

You are very close, but you need to sum over 2 margins. I'm re-arranging your example so the "vote" is at the end as in your original question:

> tab <- xtabs(~cyl+gear+I(mpg>20), mtcars)
> prop.table(tab, 1:2)
, , I(mpg > 20) = FALSE

   gear
cyl   3   4   5
  4 0.0 0.0 0.0
  6 0.5 0.5 1.0
  8 1.0     1.0

, , I(mpg > 20) = TRUE

   gear
cyl   3   4   5
  4 1.0 1.0 1.0
  6 0.5 0.5 0.0
  8 0.0     0.0

> prop.table(tab, 1:2)[ , , 2] # Proportion TRUE for each combo
   gear
cyl   3   4 5
  4 1.0 1.0 1
  6 0.5 0.5 0
  8 0.0 NaN 0

All 4 cylinder cars get over 20mpg and no 8 cylinder cars do. To get a data frame:

> as.data.frame.table(prop.table(tab, 1:2)[ , , 2])
  cyl gear Freq
1   4    3  1.0
2   6    3  0.5
3   8    3  0.0
4   4    4  1.0
5   6    4  0.5
6   8    4  NaN
7   4    5  1.0
8   6    5  0.0
9   8    5  0.0

s3 is there a way to combine prop.table for character variables?

Maybe this is what you wanted. Values in each category sum up to 100%.

lis <- sapply( Student1995, function(x) t( sapply( x, table ) ) )

sapply( lis, function(x) colSums(prop.table(x)) )
$alcohol
                  Not  Once.or.Twice.a.week          Once.a.month
                  0.0                   0.6                   0.4
          Once.a.week More.than.once.a.week
                  0.0                   0.0

$drugs
       Not Tried.once Occasional    Regular
       0.8        0.2        0.0        0.0

$smoke
       Not Occasional    Regular
       0.6        0.2        0.2

$sport
Not.regular     Regular
        0.4         0.6

and the whole summary...

prop.table( table(as.vector( sapply( Student1995, unlist ))) )

                 Not          Not regular           Occasional
                0.35                 0.10                 0.05
        Once a month Once or Twice a week              Regular
                0.10                 0.15                 0.20
          Tried once
                0.05

Create proportion matrix for multivariate categorical data

Tried to put logic at appropriate points in code sequence.

props <- data.frame(Cat = sort(unique(c(data))) )  # Just the Cat column
#Now fill in the entries
# the entries will be obtained with table function
    apply(data, 2, table)  # run `table(.)` over the columns individually
[[1]]

0 1 2   # these are actually character valued names
3 4 3   # while these are the count values

[[2]]

 2  3  6  8 10 
 2  4  1  2  1 

[[3]]

2 3 4 5 6 
2 3 1 3 1

Now iterate over that list to fill in values that match the Cat column:

 props2 <- cbind(props,  # using dfrm first argument returns dataframe object 
    lapply( apply(data, 2, table) ,   # irregular results are a list
                function(col) { # first make a named vector of zeros
                             x <- setNames(rep(0,length(props$Cat)), props$Cat)
                             # could have skipped that step by using `tabulate`
                             # then fill with values using names as indices
                             x[names(col)] <- col # values to matching names
                             x}) )
props2
#-------------
   Cat V1 V2 V3
0    0  3  0  0
1    1  4  0  0
2    2  3  2  2
3    3  0  4  3
4    4  0  0  1
5    5  0  0  3
6    6  0  1  1
8    8  0  2  0
10  10  0  1  0
#---
# now just "proportionalize" those counts
props2[2:4] <- prop.table(data.matrix(props2[2:4]), margin=2)
props2
 #-------------
   Cat  V1  V2  V3
0    0 0.3 0.0 0.0
1    1 0.4 0.0 0.0
2    2 0.3 0.2 0.2
3    3 0.0 0.4 0.3
4    4 0.0 0.0 0.1
5    5 0.0 0.0 0.3
6    6 0.0 0.1 0.1
8    8 0.0 0.2 0.0
10  10 0.0 0.1 0.0

How to Calculate the percent difference in a three dimensional array using R apply()

You should always provide reproducible data. Printouts can leave out important details regarding the structure. Use dput(HEC1) and copy the results into your question like this:

HEC1 <- structure(c(32L, 53L, 10L, 3L, 11L, 50L, 10L, 30L, 10L, 25L, 
7L, 5L, 3L, 15L, 7L, 8L, 36L, 66L, 16L, 4L, 9L, 34L, 7L, 64L, 
5L, 29L, 7L, 5L, 2L, 14L, 7L, 8L), .Dim = c(4L, 4L, 2L), .Dimnames = list(
    c("Black", "Brown", "Red", "Blond"), c("Brown", "Blue", "Hazel", 
    "Green"), c("Male", "Female")))

Now we need to be clear on what you are computing. Your desired percentage was based on the total number of brown-eyed, black haired individuals, but this is misleading since the sample sizes of males and females is not the same:

margin.table(HEC1, 3)
  Male Female 
   279    313

If instead we compute the tables separately and subtract them:

Male.pct <- prop.table(HEC1[, , 1]) * 100
round(Male.pct, 2)
#       Brown  Blue Hazel Green
# Black 11.47  3.94  3.58  1.08
# Brown 19.00 17.92  8.96  5.38
# Red    3.58  3.58  2.51  2.51
# Blond  1.08 10.75  1.79  2.87

Female.pct <- prop.table(HEC1[, , 2]) * 100
round(Female.pct, 2)
#       Brown  Blue Hazel Green
# Black 11.50  2.88  1.60  0.64
# Brown 21.09 10.86  9.27  4.47
# Red    5.11  2.24  2.24  2.24
# Blond  1.28 20.45  1.60  2.56

round(Male.pct - Female.pct, 2)
#       Brown  Blue Hazel Green
# Black -0.03  1.07  1.99  0.44
# Brown -2.09  7.06 -0.30  0.90
# Red   -1.53  1.35  0.27  0.27
# Blond -0.20 -9.69  0.19  0.31

Your original apply function computed the proportions a third way, over the entire array combining both tables. You should have used:

apply(prop.table(HEC1, 3), 1:2, diff) * 100
#            Brown      Blue      Hazel      Green
# Black 0.03206339 -1.067253 -1.9867853 -0.4362912
# Brown 2.08984621 -7.058527  0.3046022 -0.9035006
# Red   1.52759170 -1.347808 -0.2725388 -0.2725388
# Blond 0.20268645  9.694596 -0.1946706 -0.3114730

The diff function subtracts the second value from the first (Female - Male) so the negative signs are reversed compared to Male - Female above.

R: How do I sort a prop.table by one of the two percent dimensions?

If we wanted to order based on the 2nd row, subset the 2nd row (A1[2,] by specifying the row index on i), order the vector and use that as column index to reorder in j

A2 <- A1[,order(A1[2,])]
A2
#       8    1    5   13    9    6   11    2    7    4   12   10    3
#  -1 1.00 0.90 0.80 0.75 0.65 0.60 0.50 0.30 0.30 0.25 0.15 0.10 0.00
#  1  0.00 0.10 0.20 0.25 0.35 0.40 0.50 0.70 0.70 0.75 0.85 0.90 1.00

colSums(A2)
# 8  1  5 13  9  6 11  2  7  4 12 10  3 
# 1  1  1  1  1  1  1  1  1  1  1  1  1