R Reshape2 'Aggregation Function Missing: Defaulting to Length'

Thanks to @akrun who pointed it out.

Well, there's a high chance that your data has duplicate rows that look either like this:

student    test    score
Adam      Exam1     80
Adam      Exam1     85
Adam      Exam2     90
John      Exam1     70
John      Exam2     60

Or like this:

student   class     test    score
Adam      Biology   Exam1     80
Adam      Theology  Exam1     85
Adam      Theology  Exam2     90
John      Biology   Exam1     70
John      Theology  Exam2     60

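When you cast it like this: dcast(data, student + class ~ test, value.var='score'), each combination of the formula variables has to map to a single score. If any combination matches more than one row, dcast has to aggregate and, with no fun.aggregate supplied, it defaults to length.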
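As a rough sketch of how to confirm this, the first table above can be rebuilt and checked for duplicate cells before casting. The object names scores, dup_counts and wide are made up for illustration, and mean is only one possible choice of aggregation function:

```r
library(reshape2)

# The first example table from above, rebuilt as a data frame.
scores <- data.frame(
  student = c("Adam", "Adam", "Adam", "John", "John"),
  test    = c("Exam1", "Exam1", "Exam2", "Exam1", "Exam2"),
  score   = c(80, 85, 90, 70, 60)
)

# Count rows per student/test cell; any count above 1 is a duplicate cell.
dup_counts <- aggregate(score ~ student + test, data = scores, FUN = length)
dup_counts[dup_counts$score > 1, ]

# Supplying fun.aggregate explicitly avoids the silent fallback to length.
wide <- dcast(scores, student ~ test, value.var = "score", fun.aggregate = mean)
wide
```

Whether mean, max, or something else is right depends on what the duplicate rows represent, so it is worth looking at dup_counts before picking one.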

Reshaping data in R with multiple variable levels - aggregate function missing warning

The data.table package extends dcast to accept multiple value.var columns and provides rowid, so...

library(data.table)
dcast(setDT(DF), id ~ rowid(id), value.var=setdiff(names(DF), "id"))

   id visit.date_1 visit.date_2 visit.id_1 visit.id_2 bill.num_1 bill.num_2 dx.code_1 dx.code_2 FY_1 FY_2 Dx.num_1 Dx.num_2
1:  1       1/2/12       3/4/12        203        506       1234       4567       409       512 2012 2013        1        1
2:  2       5/6/18       5/6/18        222        222       3452       3452       488       122 2018 2018        1        2
3:  3       2/9/14         <NA>        567         NA       6798         NA       923        NA 2014   NA        1       NA
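The input DF is not shown here, so as a minimal sketch, assume a small made-up table with a similar shape (the values and the two value columns below are hypothetical). rowid(id) numbers the rows within each id, which is what gives every row its own cell in the wide result:

```r
library(data.table)

# Hypothetical stand-in for DF: ids 1 and 2 appear twice, id 3 once.
DF <- data.frame(
  id       = c(1, 1, 2, 2, 3),
  visit.id = c(203, 506, 222, 222, 567),
  dx.code  = c(409, 512, 488, 122, 923)
)

# Within-group counter: restarts at 1 for each new id.
rowid(DF$id)

# One wide column per value column and per within-id position.
wide <- dcast(setDT(DF), id ~ rowid(id),
              value.var = c("visit.id", "dx.code"))
wide
```

Ids with fewer rows than the maximum get NA in the extra columns, as in row 3 of the output above.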

Can dcast be used without an aggregate function?

I don't think there is a way to do it directly, but we can add an extra column that makes every row unique:

df2 <- structure(list(id = c("A", "B", "C", "A", "B", "C", "C"), cat = c("SS", 
"SS", "SS", "SV", "SV", "SV", "SV"), val = c(220L, 222L, 223L, 
224L, 225L, 220L, 1L)), .Names = c("id", "cat", "val"), class = "data.frame", row.names = c(NA, 
-7L))

library(reshape2)
library(plyr)
# Add a variable for how many times the id*cat combination has occurred
tmp <- ddply(df2, .(id, cat), transform, newid = paste(id, seq_along(cat)))
# Aggregate using this newid and toss in the id so we don't lose it
out <- dcast(tmp, id + newid ~ cat, value.var = "val")
# Remove newid if we want
out <- out[,-which(colnames(out) == "newid")]
> out
#  id  SS  SV
#1  A 220 224
#2  B 222 225
#3  C 223 220
#4  C  NA   1
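If plyr is not available, the same counter can probably be built with base R's ave. This is a sketch of an alternative, not the approach used above; it uses a plain integer counter for newid instead of the pasted string:

```r
library(reshape2)

df2 <- data.frame(
  id  = c("A", "B", "C", "A", "B", "C", "C"),
  cat = c("SS", "SS", "SS", "SV", "SV", "SV", "SV"),
  val = c(220L, 222L, 223L, 224L, 225L, 220L, 1L)
)

# Number the rows within each id/cat group: 1 for the first, 2 for the second, ...
df2$newid <- ave(seq_along(df2$id), df2$id, df2$cat, FUN = seq_along)

# Cast on id + newid so every cell holds exactly one value.
out <- dcast(df2, id + newid ~ cat, value.var = "val")

# Drop the helper column once the reshape is done.
out$newid <- NULL
out
```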

dcast for numeric and character columns in R - returning length by default

We can specify length in fun.aggregate if the count per cell is what is actually needed:

library(data.table)
dcast(setDT(data), zip + date + calories ~ data_source,
      value.var = c("user", "price"), fun.aggregate = length)
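To see what the length aggregation produces, here is a small sketch on made-up data. The table orders and its values are hypothetical, and only a single value column is used to keep the output easy to read:

```r
library(data.table)

# Hypothetical data: zip 10001 has two rows for source "A".
orders <- data.table(
  zip         = c(10001, 10001, 10001, 20002),
  data_source = c("A", "A", "B", "A"),
  price       = c(1.5, 2.5, 2.0, 3.0)
)

# Each cell holds the number of rows that fell into it, not a price.
counts <- dcast(orders, zip ~ data_source,
                value.var = "price", fun.aggregate = length)
counts
```

Cells showing a count above 1 are the ones where an aggregation function other than length would have to combine several values.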

Based on the data shown, there are no duplicates, so this works without an aggregation function:

dcast(setDT(data), zip + date + calories ~ data_source, value.var=c("user","price"))

If there are duplicates, make the combinations unique by adding rowid over the grouping variables:

dcast(setDT(data), rowid(zip, date, calories) + zip + date + calories 
          ~ data_source, value.var=c("user","price"))
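The original data is not shown, so the following is only a sketch on a made-up table with the same column names (all values are hypothetical). It illustrates that adding rowid keeps duplicated combinations as separate rows instead of collapsing them into a count:

```r
library(data.table)

# Hypothetical data: rows 1 and 3 share zip, date, calories and data_source.
data <- data.table(
  zip         = c(10001, 10001, 10001, 20002),
  date        = c("2020-01-01", "2020-01-01", "2020-01-01", "2020-01-02"),
  calories    = c(500, 500, 500, 300),
  data_source = c("A", "B", "A", "A"),
  user        = c("u1", "u2", "u3", "u4"),
  price       = c(1.5, 2.0, 2.5, 3.0)
)

# rowid() numbers rows within each zip/date/calories group, so no cell
# receives more than one value and no aggregation is needed.
wide <- dcast(data, rowid(zip, date, calories) + zip + date + calories
              ~ data_source, value.var = c("user", "price"))
wide
```

Because the counter runs over zip, date and calories only, rows from different sources in the same group land on different output rows, with NA in the columns of the other source.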
