Faster ways to calculate frequencies and cast from long to wide
You don't need ddply for this. The dcast function from reshape2 is sufficient:
dat <- data.frame(
id = c(rep(1, 4), 2),
week = c(1:3, 1, 3)
)
library(reshape2)
dcast(dat, id~week, fun.aggregate=length)
id 1 2 3
1 1 2 1 1
2 2 0 0 1
Edit: For a base R solution (other than table, as posted by Joshua Ulrich), try xtabs:
xtabs(~id+week, data=dat)
week
id 1 2 3
1 2 1 1
2 0 0 1
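If the counts are needed as a data frame rather than a table object, the xtabs result can be converted in base R. This is a minimal sketch reusing the dat example from this answer; the names tab and wide are only illustrative.

```r
dat <- data.frame(
  id = c(rep(1, 4), 2),
  week = c(1:3, 1, 3)
)

# Cross-tabulate id by week (same counts as the xtabs call above)
tab <- xtabs(~ id + week, data = dat)

# Convert the two-way table to a wide data frame:
# one row per id (kept as row names), one column per week
wide <- as.data.frame.matrix(tab)
wide
```

Note that plain as.data.frame(tab) would give the long format (one row per id/week pair with a Freq column) instead.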
Count from long to wide format
In case you need it as a data.frame, here's an option with data.table:
library(data.table)
setDT(df)
dcast(df, id ~ text, fun.aggregate = length)
# id arrange stock
# 1: 1 1 2
# 2: 2 2 0
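The answer does not show its input, so here is a self-contained sketch. The contents of df are an assumption, chosen only so that the counts match the output shown above; value.var is set explicitly so that dcast does not have to guess the value column.

```r
library(data.table)

# Hypothetical input: the original answer does not include df
df <- data.frame(
  id = c(1, 1, 1, 2, 2),
  text = c("arrange", "stock", "stock", "arrange", "arrange")
)

setDT(df)  # convert to a data.table by reference

# Count the occurrences of each text value per id
wide <- dcast(df, id ~ text, fun.aggregate = length, value.var = "text")
wide
```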
Easy way to convert long to wide format with counts
You can accomplish this with a simple table() statement. You can play with setting factor levels to get your responses the way you want.
sample.data$Decision <- factor(x = sample.data$Decision,
levels = c("Referred","Approved","Declined"))
table(Case = sample.data$Case,sample.data$Decision)
Case Referred Approved Declined
1 3 1 0
2 1 0 1
3 2 0 1
4 0 1 0
5 0 0 1
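To illustrate what the factor levels control, here is a sketch with made-up data. The contents of sample.data are an assumption chosen to match the table above, and the extra "Pending" level is hypothetical: it shows that a level with no observations still gets a column of zeros.

```r
# Hypothetical data chosen to reproduce the counts shown above
sample.data <- data.frame(
  Case = c(1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 5),
  Decision = c("Referred", "Referred", "Referred", "Approved",
               "Referred", "Declined",
               "Referred", "Referred", "Declined",
               "Approved",
               "Declined")
)

# The order of the levels fixes the order of the columns;
# "Pending" never occurs, so its column is all zeros
sample.data$Decision <- factor(
  sample.data$Decision,
  levels = c("Referred", "Approved", "Declined", "Pending")
)

counts <- table(Case = sample.data$Case, Decision = sample.data$Decision)
counts
```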
long to wide format aggregate R tidyverse
I'm not really sure how you get the count of 3 for GENEa and READSb, but assuming you want the count, you can try the following:
library(tidyverse)
df <- tibble(
READS = rep(c("READa", "READb", "READc"), each = 3),
GENE = rep(c("GENEa", "GENEb", "GENEc"), each = 3),
COMMENT = rep(c("CommentA", "CommentA", "CommentA"), each = 3)
)
df
#> # A tibble: 9 x 3
#> READS GENE COMMENT
#> <chr> <chr> <chr>
#> 1 READa GENEa CommentA
#> 2 READa GENEa CommentA
#> 3 READa GENEa CommentA
#> 4 READb GENEb CommentA
#> 5 READb GENEb CommentA
#> 6 READb GENEb CommentA
#> 7 READc GENEc CommentA
#> 8 READc GENEc CommentA
#> 9 READc GENEc CommentA
df %>%
count(READS, GENE) %>%
pivot_wider(
names_from = GENE, values_from = n,
values_fill = list(n = 0)
)
#> # A tibble: 3 x 4
#> READS GENEa GENEb GENEc
#> <chr> <int> <int> <int>
#> 1 READa 3 0 0
#> 2 READb 0 3 0
#> 3 READc 0 0 3
Created on 2019-12-13 by the reprex package (v0.3.0)
Many Hot encoder in R
Using tidyverse:
df %>%
mutate(week = paste("week", week, sep = "")) %>%
group_by(id, week) %>%
summarise(n = n()) %>%
ungroup() %>%
spread(key = week, value = n) %>%
mutate_all(~ replace(., is.na(.), 0)) # formula shorthand; funs() is deprecated since dplyr 0.8.0
# A tibble: 5 x 6
id week1 week2 week3 week4 week5
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 222. 0. 0. 0. 1. 0.
2 264. 0. 0. 1. 0. 1.
3 277. 0. 1. 0. 0. 0.
4 345. 1. 2. 0. 0. 1.
5 351. 0. 1. 0. 0. 0.
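With more recent tidyr versions, the same reshaping can be written more compactly. This is a sketch under two assumptions: tidyr 1.1.0 or later is installed (for names_sort and a scalar values_fill), and the input df, which the answer does not show, looks like the hypothetical data below.

```r
library(dplyr)
library(tidyr)

# Hypothetical input chosen to match the output shown above
df <- tibble(
  id   = c(222, 264, 264, 277, 345, 345, 345, 345, 351),
  week = c(4, 3, 5, 2, 1, 2, 2, 5, 2)
)

wide <- df %>%
  count(id, week) %>%          # one row per id/week pair with its count n
  pivot_wider(
    names_from = week, values_from = n,
    names_prefix = "week",     # week1, week2, ...
    names_sort = TRUE,         # order the new columns by week number
    values_fill = 0            # absent combinations become 0 instead of NA
  )
wide
```

Here the count columns are integers rather than doubles, because count() returns integer counts.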
reshape two column data to sparse matrix in r long to wide
You can do something like this:
library(tidyverse)
dat <- tribble(~"ID", ~"Click",
1, "A",
1, "B",
1, "E",
2, "A",
2, "Q",
3, "B",
3, "D",
3, "F")
table(dat)
#> ID A B D E F Q
#> 1 1 1 0 1 0 0
#> 2 1 0 0 0 0 1
#> 3 0 1 1 0 1 0
Created on 2019-02-25 by the reprex package (v0.2.1)
EDIT: To clarify my post: you don't need library(tidyverse) or to build your data with tribble(). The function you are looking for is table().
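Since the question title mentions a sparse matrix, one possible variant is xtabs() with sparse = TRUE, which returns a sparse matrix from the Matrix package instead of a dense table. This sketch assumes the Matrix package is installed (it ships with standard R distributions); the name sparse_counts is only illustrative.

```r
# Same data as above, built without tribble()
dat <- data.frame(
  ID = c(1, 1, 1, 2, 2, 3, 3, 3),
  Click = c("A", "B", "E", "A", "Q", "B", "D", "F")
)

# sparse = TRUE stores only the non-zero counts
sparse_counts <- xtabs(~ ID + Click, data = dat, sparse = TRUE)
sparse_counts
```

This mainly pays off when there are many distinct ID and Click values and most combinations never occur.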
Reshape data in R, cast function arguments
The OP asked for help with the arguments to the cast() function of the reshape package. However, the reshape package was superseded by the reshape2 package from the same package author. According to the package description, the reshape2 package is
A Reboot of the Reshape Package
Using reshape2, the desired result can be produced with
reshape2::dcast(wc, PARENT_MOL_CHEMBL_ID ~ TARGET_TYPE, fun.aggregate = length,
value.var = "TARGET_TYPE")
# PARENT_MOL_CHEMBL_ID ABL EGFR TP53
#1 C10 1 1 0
#2 C939 0 0 1
BTW: The data.table package has implemented (and enhanced) dcast() as well. So, the same result can be produced with
data.table::dcast(wc, PARENT_MOL_CHEMBL_ID ~ TARGET_TYPE, fun.aggregate = length,
value.var = "TARGET_TYPE")
Additional columns
The OP mentioned other columns in the data frame which should be shown together with the spread or wide data. Unfortunately, the OP hasn't supplied particular sample data, so we have to consider two use cases.
Case 1: Additional columns go along with the id column
The data could look like
wc
# PARENT_MOL_CHEMBL_ID TARGET_TYPE extra_col1
#1 C10 ABL a
#2 C10 EGFR a
#3 C939 TP53 b
Note that the values in extra_col1 are in line with PARENT_MOL_CHEMBL_ID.
This is an easy case, because the formula in dcast() accepts ... which represents all other variables not used in the formula:
reshape2::dcast(wc, ... ~ TARGET_TYPE, fun.aggregate = length,
value.var = "TARGET_TYPE")
# PARENT_MOL_CHEMBL_ID extra_col1 ABL EGFR TP53
#1 C10 a 1 1 0
#2 C939 b 0 0 1
The resulting data.frame does contain all other columns.
Case 2: Additional columns don't go along with the id column
Now, another column is added:
wc
# PARENT_MOL_CHEMBL_ID TARGET_TYPE extra_col1 extra_col2
#1 C10 ABL a 1
#2 C10 EGFR a 2
#3 C939 TP53 b 3
Note that extra_col2 has two different values for C10. This will cause the simple approach to fail. So, a two-step approach has to be implemented: reshaping first and joining afterwards with the original data frame. The data.table package is now used for both steps:
library(data.table)
# reshape from long to wide, result has only one row per id column
wide <- dcast(setDT(wc), PARENT_MOL_CHEMBL_ID ~ TARGET_TYPE, fun.aggregate = length,
value.var = "TARGET_TYPE")
# right join, i.e., all rows of wc are included
wide[wc, on = "PARENT_MOL_CHEMBL_ID"]
# PARENT_MOL_CHEMBL_ID ABL EGFR TP53 TARGET_TYPE extra_col1 extra_col2
#1: C10 1 1 0 ABL a 1
#2: C10 1 1 0 EGFR a 2
#3: C939 0 0 1 TP53 b 3
The result shows the aggregated values in wide format together with any other columns.
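The same two-step idea (count first, then join back) can also be sketched in base R, for readers without data.table. This is an alternative technique, not the answer's method: it swaps dcast() for table() and the data.table join for merge(). The wc data is rebuilt here from the printed example, and the names counts and result are only illustrative.

```r
# Data rebuilt from the printed example in Case 2
wc <- data.frame(
  PARENT_MOL_CHEMBL_ID = c("C10", "C10", "C939"),
  TARGET_TYPE = c("ABL", "EGFR", "TP53"),
  extra_col1 = c("a", "a", "b"),
  extra_col2 = 1:3,
  stringsAsFactors = FALSE
)

# Step 1: count target types per id, one row per id
counts <- as.data.frame.matrix(
  table(wc$PARENT_MOL_CHEMBL_ID, wc$TARGET_TYPE)
)
counts$PARENT_MOL_CHEMBL_ID <- rownames(counts)

# Step 2: join the counts back onto every row of the original data
result <- merge(wc, counts, by = "PARENT_MOL_CHEMBL_ID")
result
```

The column order differs from the data.table output: merge() puts the key first, then the remaining columns of wc, then the count columns.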
Manipulation of data frame using Group by or Aggregate in R
A simple way to do this is:
table(df)
R aggregating a column values into rows
library(reshape2) # or you could use data.table's dcast function
dcast(df, ID + Zoo ~ Last_date)
# ID Zoo Feb_2018 Jan_2018 Nov_2017 Oct_2017
# 1 ABC-DEF DENVER 0 0 3 2
# 2 HG-IJK MEMPHIS 0 1 0 0
# 3 JK-LMO MEMPHIS 1 0 0 0
This prints a message about the value variable and aggregation function not being specified. You can be a little more explicit to avoid it:
dcast(df, ID + Zoo ~ Last_date, value.var = 'Last_date', fun.aggregate = length)
Data used
df <- data.table::fread("
ID Zoo Last_date
ABC-DEF DENVER Oct_2017
ABC-DEF DENVER Oct_2017
ABC-DEF DENVER Nov_2017
ABC-DEF DENVER Nov_2017
ABC-DEF DENVER Nov_2017
HG-IJK MEMPHIS Jan_2018
JK-LMO MEMPHIS Feb_2018
")
How to convert data from rows into a specific columns and count them up in R?
We can use table from base R:
table(df1)
If there are many columns, subset the dataset to the specific columns and then apply table:
table(df1[c("PLAYER", "SURFACE")])