Count number of rows per group and add result to original data frame
Using data.table:
library(data.table)
dt = as.data.table(df)
# or coerce to data.table by reference:
# setDT(df)
dt[ , count := .N, by = .(name, type)]
For a pre-data.table 1.8.2 alternative, see the edit history.
Using dplyr:
library(dplyr)
df %>%
  group_by(name, type) %>%
  mutate(count = n())
Or simply:
add_count(df, name, type)
Using plyr:
# plyr::.() is namespaced here because plyr is not attached with library()
plyr::ddply(df, plyr::.(name, type), transform, count = length(num))
How to calculate number of rows per group in pandas dataframe and add it to original data
You are looking for transform:
df['window_count'] = df.groupby(['ID','CHAMBER_TYPE','COMMODITY_CODE','DELIVERY_TYPE','DAY'])['ID'].transform('size')
By the way, there is no 'CHAMBER_TYPE' column in your sample data.
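To make the transform approach concrete, here is a minimal self-contained sketch. The sample data and the shortened list of grouping columns are assumptions for illustration only; they are not the asker's data.

```python
import pandas as pd

# Hypothetical sample data; only two grouping columns are used for brevity.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 3],
    "DAY": ["Mon", "Mon", "Tue", "Mon", "Mon", "Tue"],
    "VALUE": [10, 20, 30, 40, 50, 60],
})

# transform('size') returns one value per original row, so the result
# is aligned with df and can be assigned straight back as a new column.
df["window_count"] = df.groupby(["ID", "DAY"])["ID"].transform("size")
print(df)
```

Unlike a plain groupby(...).size(), which returns one row per group, transform keeps the original row count, which is what makes the direct assignment possible.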
Count number of rows in a data frame in R based on group
Here's an example that shows how table(.) (or, more closely matching your desired output, data.frame(table(.))) does what it sounds like you are asking for.
Note also how to share reproducible sample data in a way that others can copy and paste into their session.
Here's the (reproducible) sample data:
mydf <- structure(list(ID = c(110L, 111L, 121L, 131L, 141L),
MONTH.YEAR = c("JAN. 2012", "JAN. 2012",
"FEB. 2012", "FEB. 2012",
"MAR. 2012"),
VALUE = c(1000L, 2000L, 3000L, 4000L, 5000L)),
.Names = c("ID", "MONTH.YEAR", "VALUE"),
class = "data.frame", row.names = c(NA, -5L))
mydf
# ID MONTH.YEAR VALUE
# 1 110 JAN. 2012 1000
# 2 111 JAN. 2012 2000
# 3 121 FEB. 2012 3000
# 4 131 FEB. 2012 4000
# 5 141 MAR. 2012 5000
Here's the calculation of the number of rows per group, in two output display formats:
table(mydf$MONTH.YEAR)
#
# FEB. 2012 JAN. 2012 MAR. 2012
# 2 2 1
data.frame(table(mydf$MONTH.YEAR))
# Var1 Freq
# 1 FEB. 2012 2
# 2 JAN. 2012 2
# 3 MAR. 2012 1
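For readers working in pandas rather than R, a rough counterpart of the two displays above might look like the following sketch. It rebuilds the same sample data; the column name "Freq" is chosen here only to mirror the R output.

```python
import pandas as pd

# Same sample data as the R example above.
mydf = pd.DataFrame({
    "ID": [110, 111, 121, 131, 141],
    "MONTH.YEAR": ["JAN. 2012", "JAN. 2012", "FEB. 2012",
                   "FEB. 2012", "MAR. 2012"],
    "VALUE": [1000, 2000, 3000, 4000, 5000],
})

# Roughly table(mydf$MONTH.YEAR): a Series of counts indexed by group.
counts = mydf.groupby("MONTH.YEAR").size()

# Roughly data.frame(table(...)): the same counts as a two-column frame.
counts_df = counts.reset_index(name="Freq")
print(counts)
print(counts_df)
```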
Add a column that counts the number of rows until the first 1, by group in R
df <- data.frame(Group=c(1,1,1,1,2,2),
var1=c(1,0,0,1,1,1),
var2=c(0,0,1,1,0,0),
var3=c(0,1,0,0,0,1))
This works for any number of variables as long as the structure is the same as in the example (i.e. Group + many variables that are 0 or 1):
library(dplyr)
library(tidyr)
df %>%
  mutate(rownr = row_number()) %>%
  pivot_longer(-c(Group, rownr)) %>%
  group_by(Group, name) %>%
  # position of the first 1 within each Group/variable, or 0 if there is none
  mutate(out = cumsum(value != 1 & (cumsum(value) < 1)) + 1,
         out = ifelse(max(out) > n(), 0, max(out))) %>%
  pivot_wider(names_from = name, values_from = c(value, out)) %>%
  select(-rownr)
Returns:
Group value_var1 value_var2 value_var3 out_var1 out_var2 out_var3
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 0 0 1 3 2
2 1 0 0 1 1 3 2
3 1 0 1 0 1 3 2
4 1 1 1 0 1 3 2
5 2 1 0 0 1 0 2
6 2 1 0 1 1 0 2
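The same idea can be sketched in pandas, for comparison. This is not a translation of the pipeline above but an alternative under the same assumptions (a Group column plus any number of 0/1 columns); the helper name first_one_position is made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Group": [1, 1, 1, 1, 2, 2],
    "var1": [1, 0, 0, 1, 1, 1],
    "var2": [0, 0, 1, 1, 0, 0],
    "var3": [0, 1, 0, 0, 0, 1],
})

def first_one_position(s):
    """1-based position of the first 1 in the group, or 0 if there is none."""
    hits = (s == 1).to_numpy()
    return int(hits.argmax()) + 1 if hits.any() else 0

# Collect the 0/1 columns before adding any output columns.
value_cols = [c for c in df.columns if c != "Group"]
for col in value_cols:
    # The scalar returned per group is broadcast to every row of that group.
    df["out_" + col] = df.groupby("Group")[col].transform(first_one_position)

print(df)
```

With the sample data this gives the same out_ values as the table above, including 0 for var2 in group 2, where no 1 occurs.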
Pandas, group by count and add count to original dataframe?
IIUC:
In [247]: df['count'] = df.groupby('kind')['msg'].transform('count')
In [248]: df
Out[248]:
kind msg count
0 aaa aaa text 1 3
1 aaa aaa text 2 3
2 aaa aaa text 3 3
3 bb bb text 1 4
4 bb bb text 2 4
5 bb bb text 3 4
6 bb bb text 4 4
7 cccc cccc text 1 2
8 cccc cccc text 2 2
9 dd dd text 1 1
10 e e text 1 1
11 fff fff text 1 1
Sorting:
In [249]: df.sort_values('count', ascending=False)
Out[249]:
kind msg count
3 bb bb text 1 4
4 bb bb text 2 4
5 bb bb text 3 4
6 bb bb text 4 4
0 aaa aaa text 1 3
1 aaa aaa text 2 3
2 aaa aaa text 3 3
7 cccc cccc text 1 2
8 cccc cccc text 2 2
9 dd dd text 1 1
10 e e text 1 1
11 fff fff text 1 1
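One detail worth checking when picking the aggregation: transform('count') counts non-missing values in the selected column, while transform('size') counts rows. The sketch below uses made-up data with one missing msg to show the difference.

```python
import pandas as pd

# Hypothetical data with one missing message in group 'aaa'.
df = pd.DataFrame({
    "kind": ["aaa", "aaa", "bb", "bb", "bb"],
    "msg": ["t1", None, "t1", "t2", "t3"],
})

# 'count' skips missing values in msg ...
df["count"] = df.groupby("kind")["msg"].transform("count")
# ... while 'size' counts every row in the group.
df["size"] = df.groupby("kind")["msg"].transform("size")
print(df)
```

If the goal is the number of rows per group regardless of missing data, 'size' is probably the safer choice.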
Count number of rows within each group
Current best practice (tidyverse) is:
library(dplyr)
df1 %>% count(Year, Month)
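For comparison, a pandas sketch that produces a similar one-row-per-group summary. The data is invented for illustration, and the column name "n" is chosen to mirror the output of dplyr's count().

```python
import pandas as pd

# Hypothetical data with the same column names as the R snippet.
df1 = pd.DataFrame({
    "Year": [2020, 2020, 2020, 2021, 2021],
    "Month": [1, 1, 2, 1, 1],
})

# One row per (Year, Month) combination with its row count in column 'n'.
n_per_group = df1.groupby(["Year", "Month"]).size().reset_index(name="n")
print(n_per_group)
```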
Count rows in data table with certain values by group
You can solve it as follows:
cols <- c("number_of_offices", "number_of_apartments")
df[, (cols) := .(sum(Type == "office"), sum(Type == "apartment")), by = Property]
# Property Type number_of_offices number_of_apartments
# 1: 1 apartment 1 1
# 2: 1 office 1 1
# 3: 2 office 2 0
# 4: 2 office 2 0
# 5: 3 apartment 1 2
# 6: 3 apartment 1 2
# 7: 3 office 1 2
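A pandas sketch of the same conditional count, for comparison. The data is reconstructed from the printed output above; everything else follows the column names used there.

```python
import pandas as pd

# Data reconstructed from the printed data.table output.
df = pd.DataFrame({
    "Property": [1, 1, 2, 2, 3, 3, 3],
    "Type": ["apartment", "office", "office", "office",
             "apartment", "apartment", "office"],
})

# Turn each condition into 0/1, then sum it within each Property and
# broadcast the group total back to every row of that group.
for label, col in [("office", "number_of_offices"),
                   ("apartment", "number_of_apartments")]:
    flag = (df["Type"] == label).astype(int)
    df[col] = flag.groupby(df["Property"]).transform("sum")

print(df)
```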
Count observations of distinct values per group and add a new column of counts for each value
Or without any additional library, you can just use table:
table(df$group, df$letter)
As you seem to work with data.table, you can also use dcast():
dcast(df, group ~ letter, fun.aggregate = length)
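In pandas, a comparable wide table of counts can be sketched with crosstab. The data below is made up; only the column names group and letter come from the snippets above.

```python
import pandas as pd

# Hypothetical data using the column names from the R snippets.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b"],
    "letter": ["x", "y", "x", "y", "y"],
})

# Rows are groups, columns are letters, cells are counts (0 where absent).
wide = pd.crosstab(df["group"], df["letter"])
print(wide)
```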