How to do a conditional count after groupby on a Pandas Dataframe?
I think you need to add the condition first:
# if you also need category c, which has no values of 'one'
df11 = df.groupby('key1')['key2'].apply(lambda x: (x == 'one').sum()).reset_index(name='count')
print (df11)
  key1  count
0    a      2
1    b      1
2    c      0
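For reference, here is a minimal input frame that reproduces this output (my assumption; the original question's data is not shown):

```python
import pandas as pd

# Assumed sample data: 'c' appears in key1 but never with key2 == 'one'
df = pd.DataFrame({'key1': ['a', 'a', 'a', 'b', 'b', 'c'],
                   'key2': ['one', 'one', 'two', 'one', 'two', 'two']})

# Count 'one' per group; groups with no 'one' (here 'c') get 0
df11 = (df.groupby('key1')['key2']
          .apply(lambda x: (x == 'one').sum())
          .reset_index(name='count'))
print(df11)
```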
Or convert key1 to categorical; the missing category is then added by size:
df['key1'] = df['key1'].astype('category')
df1 = df[df['key2'] == 'one'].groupby(['key1']).size().reset_index(name='count')
print (df1)
  key1  count
0    a      2
1    b      1
2    c      0
If you need all combinations:
df2 = df.groupby(['key1', 'key2']).size().reset_index(name='count')
print (df2)
  key1 key2  count
0    a  one      2
1    a  two      1
2    b  one      1
3    b  two      1
4    c  two      1
df3 = df.groupby(['key1', 'key2']).size().unstack(fill_value=0)
print (df3)
key2  one  two
key1
a       2    1
b       1    1
c       0    1
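The same key1 x key2 table can also be produced in one call with pd.crosstab, which fills missing combinations with 0 (sample data is my assumption, matching the outputs above):

```python
import pandas as pd

# Assumed sample data, consistent with the printed outputs
df = pd.DataFrame({'key1': ['a', 'a', 'a', 'b', 'b', 'c'],
                   'key2': ['one', 'one', 'two', 'one', 'two', 'two']})

# Cross-tabulation: rows are key1, columns are key2, values are counts
df3 = pd.crosstab(df['key1'], df['key2'])
print(df3)
```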
Conditional counts in pandas group by
You can try replacing .count() with .sum() in your 2 lines, as follows:
d['Zero_Balance_days'] = (x['Balance'] < 0).sum()
d['Over_Credit_days'] = (x['Balance'] > x['Max Credit']).sum()
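For context, these two lines presumably live inside a function handed to groupby().apply(). A runnable sketch, where the 'Balance' and 'Max Credit' columns come from the question but the grouping key and data are my assumptions:

```python
import pandas as pd

# Assumed sample data; 'Account' is an invented grouping column
df = pd.DataFrame({
    'Account': ['A', 'A', 'A', 'B', 'B'],
    'Balance': [-5, 120, 80, -10, -20],
    'Max Credit': [100, 100, 100, 50, 50],
})

def summarize(x):
    d = {}
    # .sum() counts only the True rows; .count() would count every non-null row
    d['Zero_Balance_days'] = (x['Balance'] < 0).sum()
    d['Over_Credit_days'] = (x['Balance'] > x['Max Credit']).sum()
    return pd.Series(d)

res = df.groupby('Account')[['Balance', 'Max Credit']].apply(summarize)
print(res)
```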
.count() returns the number of non-NA/null observations in the boolean Series; both True and False are non-null, so both are counted. .sum() returns the number of True entries, since True is interpreted as 1 and False as 0 in the summation.
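A quick illustration of the difference on a boolean Series:

```python
import pandas as pd

mask = pd.Series([True, False, True, False])

print(mask.count())  # 4 -> every non-null entry, True or False alike
print(mask.sum())    # 2 -> only the True entries (True == 1, False == 0)
```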
pandas dataframe groupby conditional count on multi-level column
The only way to select MultiIndex columns from a groupby is with a list of tuples or a MultiIndex (as indicated by the error message). So, instead of [('exp0', 'rdn')] it needs to be [[('exp0', 'rdn')]], and it also needs to be a valid column name, such as ('exp0', 'rnd0').
df['exp0_cnt', 'rdn0'] = (
    df.groupby([('grp1', 'cat')])[[('exp0', 'rnd0')]]
    #                             ^ need to use a valid column name
    #          ^ needs to be a list of tuples
    .transform(lambda x: x.gt(50).sum())  # Some function that works
)
*I've also changed the applied function, because the original seems to be missing its lambda:
.apply(sum(x > 50 for x in df[('exp0', 'rdn')]))
My guess at an equivalent uses transform, since the result is assigned back to the DataFrame:
.transform(lambda x: x.gt(50).sum())
df:
experiments exp0      exp1      grp1 grp2 exp0_cnt
rnd_runs    rnd0 rnd1 rnd0 rnd1  cat cat2     rdn0
0             66   92   98   17    A    C        2
1             83   57   86   97    A    C        2  # 2 values over 50 (in group)
2             96   47   73   32    B    C        1
3             46   96   25   83    B    B        1  # 1 value over 50 (in group)
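A runnable reconstruction of this frame (the level names experiments/rnd_runs and the values are taken from the printed output; the rest of the setup is my guess):

```python
import pandas as pd

# Rebuild the MultiIndex-column frame shown in the answer
cols = pd.MultiIndex.from_tuples(
    [('exp0', 'rnd0'), ('exp0', 'rnd1'), ('exp1', 'rnd0'), ('exp1', 'rnd1'),
     ('grp1', 'cat'), ('grp2', 'cat2')],
    names=['experiments', 'rnd_runs'])
df = pd.DataFrame([[66, 92, 98, 17, 'A', 'C'],
                   [83, 57, 86, 97, 'A', 'C'],
                   [96, 47, 73, 32, 'B', 'C'],
                   [46, 96, 25, 83, 'B', 'B']], columns=cols)

# Per-group count of ('exp0', 'rnd0') values over 50, broadcast back to rows
df['exp0_cnt', 'rdn0'] = (
    df.groupby([('grp1', 'cat')])[[('exp0', 'rnd0')]]
      .transform(lambda x: x.gt(50).sum()))
print(df)
```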
Please note: this means a SeriesGroupBy cannot be created by selecting MultiIndex columns; only DataFrameGroupBy operations are available.
type(df.groupby([('grp1', 'cat')])[[('exp0', 'rnd0')]])
# <class 'pandas.core.groupby.generic.DataFrameGroupBy'>
This excludes a few operations, like SeriesGroupBy.unique:
df.groupby([('grp1', 'cat')])[[('exp0', 'rnd0')]].unique()
AttributeError: 'DataFrameGroupBy' object has no attribute 'unique'
However, we can force a SeriesGroupBy by selecting the Series from the DataFrame and passing the grouping Series to groupby directly:
df[('exp0', 'rnd0')].groupby(df[('grp1', 'cat')]).unique()
# ^ select specific column ^ pass the Series to groupby directly
(grp1, cat)
A [66, 83]
B [96, 46]
Name: (exp0, rnd0), dtype: object
How can I use pandas groupby.count() for a condition
You can do it like this:
df_Grouped = df.groupby(['Ticker']).agg({'Trade Results': [('Count', 'count'), ('Profitable', lambda x: len(x[x>0]))]}).reset_index()
Output:
  Ticker Trade Results
                 Count Profitable
0    BTC             3          2
1    ETH             3          2
2    LTC             3          2
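The tuple-in-dict spelling above produces MultiIndex columns. Since pandas 0.25, named aggregation gives flat column names directly; a sketch with invented sample data (the question's values are not shown):

```python
import pandas as pd

# Assumed sample data: three trades per ticker, two profitable each
df = pd.DataFrame({'Ticker': ['BTC', 'BTC', 'BTC', 'ETH', 'ETH', 'ETH'],
                   'Trade Results': [1.0, -0.5, 2.0, 3.0, -1.0, 0.5]})

# Named aggregation: keyword = (column, aggregation)
df_grouped = df.groupby('Ticker').agg(
    Count=('Trade Results', 'count'),
    Profitable=('Trade Results', lambda x: (x > 0).sum()),
).reset_index()
print(df_grouped)
```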
Conditional Counting in Groupby Pandas
Using groupby.agg
with a dictionary of calculations:
import numpy as np
import pandas as pd
from collections import OrderedDict

df.columns = ['ticker', 'date', 'accuracy']
groupers = OrderedDict([('mean', np.mean),
('>_0.20_pct', lambda x: (x > 0.20).sum()/len(x)),
('>_0.50_pct', lambda x: (x > 0.50).sum()/len(x)),
('>_0.70_pct', lambda x: (x > 0.70).sum()/len(x))])
res = df.groupby('ticker')['accuracy'].agg(groupers)
print(res)
mean >_0.20_pct >_0.50_pct >_0.70_pct
ticker
AAAP 0.806244 1.000000 0.666667 0.666667
AAL 0.298683 0.666667 0.000000 0.000000
ZAYO 0.164886 0.333333 0.000000 0.000000
ZBH 0.103811 0.000000 0.000000 0.000000
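Note that this dict-renaming form of agg was removed in pandas 1.0 (it raises SpecificationError there). Keyword-based named aggregation expresses the same thing on modern pandas; a sketch with my own sample data, using (x > t).mean() as a shorthand for the fraction:

```python
import pandas as pd

# Assumed sample data, two tickers with three accuracy readings each
df = pd.DataFrame({'ticker': ['AAAP', 'AAAP', 'AAAP', 'AAL', 'AAL', 'AAL'],
                   'accuracy': [0.9, 0.3, 0.6, 0.25, 0.1, 0.15]})

# (x > t).mean() is the fraction of entries above the threshold
res = df.groupby('ticker')['accuracy'].agg(
    mean='mean',
    **{'>_0.20_pct': lambda x: (x > 0.20).mean(),
       '>_0.50_pct': lambda x: (x > 0.50).mean(),
       '>_0.70_pct': lambda x: (x > 0.70).mean()})
print(res)
```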
Conditionally count values in a pandas groupby object
I think you need:
import numpy as np
import pandas as pd

np.random.seed(6)
N = 15
master_lso = pd.DataFrame({'lsoa11': np.random.randint(4, size=N),
                           'TOTAL_FLOOR_AREA': np.random.choice([0, 30, 40, 50], size=N)})
master_lso['lsoa11'] = 'a' + master_lso['lsoa11'].astype(str)
print (master_lso)
    TOTAL_FLOOR_AREA lsoa11
0                 40     a2
1                 50     a1
2                 30     a3
3                  0     a0
4                 40     a2
5                  0     a1
6                 30     a3
7                  0     a2
8                 40     a0
9                  0     a2
10                 0     a1
11                50     a1
12                50     a3
13                40     a1
14                30     a1
First filter the rows by the condition with boolean indexing - doing this before grouping is faster, because there are fewer rows.
df = master_lso[master_lso['TOTAL_FLOOR_AREA'] > 30]
print (df)
    TOTAL_FLOOR_AREA lsoa11
0                 40     a2
1                 50     a1
4                 40     a2
8                 40     a0
11                50     a1
12                50     a3
13                40     a1
Then groupby and aggregate with size:
df1 = df.groupby('lsoa11')['TOTAL_FLOOR_AREA'].size().reset_index(name='Count')
print (df1)
  lsoa11  Count
0     a0      1
1     a1      3
2     a2      2
3     a3      1
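The filter-then-size pattern can also be collapsed into value_counts; a sketch with a small hand-built frame (my own data, not the seeded one above):

```python
import pandas as pd

# Hand-built sample: five rows have TOTAL_FLOOR_AREA > 30
master_lso = pd.DataFrame({
    'lsoa11': ['a2', 'a1', 'a3', 'a0', 'a2', 'a1', 'a1'],
    'TOTAL_FLOOR_AREA': [40, 50, 30, 0, 40, 50, 40],
})

# Filter first, then count occurrences per lsoa11 value
counts = (master_lso.loc[master_lso['TOTAL_FLOOR_AREA'] > 30, 'lsoa11']
          .value_counts()
          .rename_axis('lsoa11')
          .reset_index(name='Count'))
print(counts)
```

Unlike groupby().size(), value_counts returns the groups sorted by descending count.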