Python pandas: Add a column to my dataframe that counts a variable
Call transform; this returns a Series aligned with the original df, so it can be assigned directly as a new column:
In [223]:
df['count'] = df.groupby('group')['group'].transform('count')
df

Out[223]:
    org  group  count
0  org1      1      2
1  org2      1      2
2  org3      2      1
3  org4      3      3
4  org5      3      3
5  org6      3      3
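The df above can be reconstructed to try this; a minimal runnable sketch, with the column values taken from the output shown:

```python
import pandas as pd

# Data reconstructed from the output above
df = pd.DataFrame({
    "org": ["org1", "org2", "org3", "org4", "org5", "org6"],
    "group": [1, 1, 2, 3, 3, 3],
})

# transform('count') returns one value per row, aligned with df,
# so it can be assigned directly as a new column
df["count"] = df.groupby("group")["group"].transform("count")
print(df)
```

Unlike a plain groupby aggregation, which collapses to one row per group, transform broadcasts the group result back to every row.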
Adding a column to df that counts occurrences of a value in another column
You can use the following solution:
library(dplyr)
df %>%
group_by(id) %>%
add_count(name = "id_occurrence")
# A tibble: 10 x 3
# Groups:   id [5]
       id places   id_occurrence
    <dbl> <chr>            <int>
 1 204850 kitchen              3
 2 204850 kitchen              3
 3 204850 garden               3
 4 312512 salon                2
 5 312512 salon                2
 6 452452 salon                1
 7 285421 bathroom             1
 8 758412 garden               3
 9 758412 bathroom             3
10 758412 garden               3
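For pandas users, the same per-id count that dplyr's add_count produces can be sketched with transform; the data below is reconstructed from the tibble above:

```python
import pandas as pd

# Data reconstructed from the tibble shown above
df = pd.DataFrame({
    "id": [204850, 204850, 204850, 312512, 312512,
           452452, 285421, 758412, 758412, 758412],
    "places": ["kitchen", "kitchen", "garden", "salon", "salon",
               "salon", "bathroom", "garden", "bathroom", "garden"],
})

# 'size' counts all rows in the group (like dplyr's add_count)
df["id_occurrence"] = df.groupby("id")["id"].transform("size")
print(df)
```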
Add column with counts of another
You may try ave:
# first, convert 'gender' to class character
df$gender <- as.character(df$gender)
df$count <- as.numeric(ave(df$gender, df$gender, FUN = length))
df
#   gender age count
# 1      m  18     4
# 2      f  14     2
# 3      m  18     4
# 4      m  18     4
# 5      m  15     4
# 6      f  15     2
Update following @flodel's comment - thanks!
df <- transform(df, count = ave(age, gender, FUN = length))
How to add columns to my data frame, including the counts of another column, for two different columns, in R?
Here is a tidyverse approach. You can group_by your ID column and count the rows that are not NA.
library(tidyverse)
df %>%
group_by(ID, l) %>%
summarize(n.x = sum(!is.na(x)), n.y = sum(!is.na(y)), .groups = "drop")
# A tibble: 4 x 4
     ID l       n.x   n.y
  <int> <chr> <int> <int>
1     1 s         5     4
2     2 ss        3     2
3     3 m         7     3
4     4 mm        2     2
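A pandas analogue of this non-NA count relies on 'count' ignoring NaN, just as sum(!is.na(x)) does in R; the data below is made up for illustration:

```python
import numpy as np
import pandas as pd

# Made-up data: x and y contain some NaN values
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2],
    "l":  ["s", "s", "s", "ss", "ss"],
    "x":  [1.0, np.nan, 3.0, 4.0, np.nan],
    "y":  [np.nan, 2.0, 3.0, np.nan, np.nan],
})

# 'count' excludes NaN, mirroring sum(!is.na(x)) in the R answer
out = df.groupby(["ID", "l"], as_index=False).agg(
    n_x=("x", "count"), n_y=("y", "count"))
print(out)
```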
Adding a column to df that contains count of a value of a different column in the df?
Use transform() if you want to map the counts back onto each row:
df['New_Col'] = df.groupby('Day')['Day'].transform('count')
Or you can use map together with value_counts():
df['New_Col'] = df['Day'].map(df['Day'].value_counts())
Output:
       Day  New_Col
0  Morning        2
1      Day        4
2    Night        2
3    Night        2
4      Day        4
5  Morning        2
6      Day        4
7      Day        4
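Both approaches give the same result; a quick runnable check using the Day values implied by the output above:

```python
import pandas as pd

# Day values reconstructed from the output shown above
df = pd.DataFrame({"Day": ["Morning", "Day", "Night", "Night",
                           "Day", "Morning", "Day", "Day"]})

# Approach 1: aligned group counts
via_transform = df.groupby("Day")["Day"].transform("count")
# Approach 2: look up each value's total in its value_counts
via_map = df["Day"].map(df["Day"].value_counts())

assert via_transform.tolist() == via_map.tolist()
print(via_transform.tolist())
```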
Add column with numbers based on count of value in other column in Pandas
Use groupby with cumcount:
df['colB'] = df.groupby('colA').cumcount().add(1)
print(df)
# Output
colA colB
0 BJ02 1
1 BJ02 2
2 CJ02 1
3 CJ03 1
4 CJ02 2
5 DJ01 1
6 DJ02 1
7 DJ07 1
8 DJ07 2
9 DJ07 3
As suggested by @HenryEcker, use zfill to zero-pad the counter:
df['colB'] = df.groupby('colA').cumcount().add(1).astype(str).str.zfill(3)
print(df)
# Output:
colA colB
0 BJ02 001
1 BJ02 002
2 CJ02 001
3 CJ03 001
4 CJ02 002
5 DJ01 001
6 DJ02 001
7 DJ07 001
8 DJ07 002
9 DJ07 003
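A self-contained sketch of the cumcount-plus-zfill version, with the colA values copied from the example:

```python
import pandas as pd

# colA values copied from the example above
df = pd.DataFrame({"colA": ["BJ02", "BJ02", "CJ02", "CJ03", "CJ02",
                            "DJ01", "DJ02", "DJ07", "DJ07", "DJ07"]})

# cumcount() is 0-based, so add(1); zfill(3) left-pads with zeros
df["colB"] = df.groupby("colA").cumcount().add(1).astype(str).str.zfill(3)
print(df)
```

Note that non-adjacent duplicates (the two CJ02 rows) still share one running counter, since cumcount numbers rows within each group regardless of position.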
SQL: create a column that counts occurrences of another column's values
You want count(*) as a window function:
select t.*, count(*) over (partition by name) as name_count
from t;
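This can be tried in SQLite (window functions need SQLite 3.25+), here via Python's sqlite3 module with a hypothetical single-column table t:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT)")
conn.executemany("INSERT INTO t VALUES (?)",
                 [("a",), ("a",), ("b",), ("a",), ("c",)])

# count(*) over a partition repeats the group total on every row,
# without collapsing rows the way GROUP BY would
rows = conn.execute(
    "SELECT name, count(*) OVER (PARTITION BY name) AS name_count FROM t"
).fetchall()
print(rows)
```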
Add column with counts of another, depending on another column
Using data.table you could do something like the following:
library(data.table)
setDT(df)
df[, WeeklyAT := .N, by = .(Contact.ID, Week)]
df

    Contact.ID       Date     Time Week Attendance WeeklyAT
 1:          A 2012-10-06 18:54:48   44         30        2
 2:          A 2012-10-08 20:50:18   44         30        2
 3:          A 2013-05-24 20:18:44   21         30        1
 4:          B 2012-11-15 16:58:15   46         40        1
 5:          B 2013-01-09 10:57:02    2         40        3
 6:          B 2013-01-11 17:31:22    2         40        3
 7:          B 2013-01-14 18:37:00    2         40        3
 8:          C 2013-02-22 17:46:07    8          5        1
 9:          C 2013-02-27 11:21:00    9          5        1
10:          D 2012-10-28 14:48:33   43         12        1

Note that := adds the column to df by reference, so no merge back is needed.
EDIT:
Apparently dplyr can do something very similar:
library(dplyr)
merge(df,
      df %>% group_by(Contact.ID, Week) %>% summarise(WeeklyAT = n()))
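The same count-then-merge pattern can be sketched in pandas with a made-up frame; how="left" keeps the original row order:

```python
import pandas as pd

# Made-up attendance data for illustration
df = pd.DataFrame({
    "Contact.ID": ["A", "A", "A", "B", "B"],
    "Week": [44, 44, 21, 46, 2],
})

# Count rows per (Contact.ID, Week) and merge the counts back,
# mirroring the data.table/dplyr merge shown above
counts = (df.groupby(["Contact.ID", "Week"])
            .size().rename("WeeklyAT").reset_index())
out = df.merge(counts, on=["Contact.ID", "Week"], how="left")
print(out)
```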
How to create a new column that counts and resets based on a string value in another column
You can compare Trend with its shift() to find where each new run starts, take the cumsum() of those breaks to get a run id, and then cumcount() within each run:
runs = df.Trend.ne(df.Trend.shift()).cumsum()
df['Counter'] = df.groupby(runs).cumcount().add(1)
Output:
             A       B       C  B_shifted  C_shifted Trend  Counter
0   553.666667  533.50  574.00        NaN        NaN  Flat        1
1   590.818182  575.50  595.50     533.50     574.00    Up        1
2   531.333333  527.50  536.50     575.50     595.50  Down        1
3   562.000000  562.00  562.00     527.50     536.50    Up        1
4   551.857143  538.50  557.50     562.00     562.00  Down        1
5   592.000000  585.50  598.50     538.50     557.50    Up        1
6   511.000000  511.00  511.00     585.50     598.50  Down        1
7   564.333333  548.00  590.50     511.00     511.00    Up        1
8   574.333333  552.00  580.00     548.00     590.50  Flat        1
9   537.500000  513.25  574.50     552.00     580.00  Down        1
10  609.500000  582.25  636.75     513.25     574.50    Up        1
11  535.000000  531.00  565.00     582.25     636.75  Down        1
12  567.142857  539.50  588.50     531.00     565.00    Up        1
13  566.625000  546.25  594.25     539.50     588.50    Up        2
14  576.631579  556.00  598.00     546.25     594.25    Up        3
15  558.333333  538.00  584.00     556.00     598.00  Down        1
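A self-contained sketch of a run counter that resets whenever the value changes, using the compare-with-shift, cumsum, cumcount pattern on made-up Trend values:

```python
import pandas as pd

# Made-up Trend values for illustration
df = pd.DataFrame({"Trend": ["Flat", "Up", "Up", "Up", "Down", "Up", "Up"]})

# A new run starts wherever the value differs from the previous row;
# cumsum of those breaks gives each run a distinct id
runs = df["Trend"].ne(df["Trend"].shift()).cumsum()

# cumcount restarts at 0 within each run id; add 1 to count from 1
df["Counter"] = df.groupby(runs).cumcount().add(1)
print(df)
```

Because the counter is keyed on run ids rather than the Trend value itself, two separate runs of the same direction (the two Up runs here) each restart at 1.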