Count Unique Values Per Groups with Pandas

Count unique values per groups with Pandas

You need nunique:

df = df.groupby('domain')['ID'].nunique()

print (df)
domain
'facebook.com'    1
'google.com'      1
'twitter.com'     2
'vk.com'          3
Name: ID, dtype: int64

If you need to strip ' characters:

df = df.ID.groupby([df.domain.str.strip("'")]).nunique()
print (df)
domain
facebook.com    1
google.com      1
twitter.com     2
vk.com          3
Name: ID, dtype: int64

Or as Jon Clements commented:

df.groupby(df.domain.str.strip("'"))['ID'].nunique()

You can retain the column name like this:

df = df.groupby(by='domain', as_index=False).agg({'ID': pd.Series.nunique})
print(df)
    domain  ID
0       fb   1
1      ggl   1
2  twitter   2
3       vk   3

The difference is that nunique() returns a Series and agg() returns a DataFrame.

Count unique values using pandas groupby

I think you can use SeriesGroupBy.nunique:

print (df.groupby('param')['group'].nunique())
param
a    2
b    1
Name: group, dtype: int64

Another solution with unique, then create new df by DataFrame.from_records, reshape to Series by stack and last value_counts:

a = df[df.param.notnull()].groupby('group')['param'].unique()
print (pd.DataFrame.from_records(a.values.tolist()).stack().value_counts())
a    2
b    1
dtype: int64

Python group by and count distinct values in a column and create delimited list

You can use str.len in your code:

df3 = (df.groupby('company')['product']
         .apply(lambda x: list(x.unique()))
         .reset_index()
         .assign(count=lambda d: d['product'].str.len())  ## added line
      )

output:

     company            product  count
0     Amazon           [E-comm]      1
1   Facebook     [Social Media]      1
2     Google  [Search, Android]      2
3  Microsoft        [OS, X-box]      2

Count of unique values per group as new column with pandas

`GroupBy.transform('nunique')`

On v0.23.4, your solution works for me.

df['ncount'] = df.groupby('mID')['uID'].transform('nunique')
df
      uID mID  ncount
0   James   A       5
1   Henry   B       2
2     Abe   A       5
3   James   B       2
4   Henry   A       5
5   Brian   A       5
6  Claude   A       5
7   James   C       1

`GroupBy.nunique` + `pd.Series.map`

Additionally, with your existing solution, you could map the series back to mID:

df['ncount'] = df.mID.map(df.groupby('mID')['uID'].nunique())
df
      uID mID  ncount
0   James   A       5
1   Henry   B       2
2     Abe   A       5
3   James   B       2
4   Henry   A       5
5   Brian   A       5
6  Claude   A       5
7   James   C       1

Count the number of unique values per group

Looks like you want transform + nunique;

df['a_b_3'] = df.groupby('_a')['_b'].transform('nunique')        
df
   _a  _b  a_b_3
0   1   3      3
1   1   4      3
2   1   5      3
3   2   3      1
4   2   3      1
5   3   3      2
6   3   9      2

This is effectively groupby + nunique + map:

v = df.groupby('_a')['_b'].nunique()
df['a_b_3'] = df['_a'].map(v)

df
   _a  _b  a_b_3
0   1   3      3
1   1   4      3
2   1   5      3
3   2   3      1
4   2   3      1
5   3   3      2
6   3   9      2

Pandas groupby and count unique value of column

We can drop all lines with start=='P1', then groupby id and count unique finish:

(df[df['start'].ne('P1')]       # drop rows with `start` == 'P1'
   .groupby('id')               # group by `id`
   ['finish'].nunique()         # count unique `finish`
   .reset_index(name='result')  # match the output
)

Output:

  id  result
0  A       3
1  B       1

Pandas count average number of unique numbers across groups

IIUC, first you want to aggregate over each age and household:

agg = (df.groupby(['age_group', 'household_key'])
         .agg({'DAY': 'nunique'})
      )

and then groupby again for the mean, e.g.,

agg.groupby('age_group').mean()

will give you the mean for each age_group across the household_key.

Count Unique Values Per Groups with Pandas