Group by in Group by and Average

Group by columns under conditions to calculate average

Use DataFrame.pivot_table with a helper column new that is a copy of ColB, then flatten the resulting MultiIndex columns and join the output to a new DataFrame created by aggregating Counter with sum:

df1 = (df.assign(new=df['ColB'])
         .pivot_table(index=['ColA', 'ColB'],
                      columns='new',
                      values=['interval', 'duration'],
                      fill_value=0,
                      aggfunc='mean'))
df1.columns = df1.columns.map(lambda x: f'{x[0]}{x[1]}')
df = (df.groupby(['ColA', 'ColB'])['Counter']
        .sum()
        .to_frame(name='SumCounter')
        .join(df1).reset_index())
print(df)
  ColA ColB  SumCounter  durationSD  durationUD  intervalSD  intervalUD
0    A   SD           3         2.5         0.0         3.5           0
1    A   UD          10         0.0         2.0         0.0           1
2    B   SD          32         2.0         0.0         3.5           0
3    B   UD           4         0.0         1.5         0.0           2

SQL AVG applied to more than one by group?

This is more simply done in separate rows with union all:

SELECT 'a' as which, a, AVG(c)
FROM mytable
GROUP BY a
UNION ALL
SELECT 'b' as which, b, AVG(c)
FROM mytable
GROUP BY b;
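
To try the UNION ALL form without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The table mytable and columns a, b, c come from the query above; the sample rows and the grp and avg_c aliases are made up for illustration.

```python
import sqlite3

# In-memory database; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (a TEXT, b TEXT, c REAL)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("x", "p", 1.0), ("x", "q", 3.0), ("y", "p", 5.0), ("y", "q", 7.0)],
)

# One result row per group; 'which' records which grouping produced the row.
# The aliases grp and avg_c are illustrative, not part of the original query.
query = """
SELECT 'a' AS which, a AS grp, AVG(c) AS avg_c FROM mytable GROUP BY a
UNION ALL
SELECT 'b' AS which, b AS grp, AVG(c) AS avg_c FROM mytable GROUP BY b
"""
rows = sorted(conn.execute(query).fetchall())
for row in rows:
    print(row)
```

Each average ends up on its own row, so the two groupings never have to line up with each other.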

If you really want them side-by-side, the query is quite a bit more complicated:

SELECT MAX(a) as a, MAX(a_avg_c) as a_avg_c,
       MAX(b) as b, MAX(b_avg_c) as b_avg_c
FROM ((SELECT a, AVG(c) as a_avg_c, null as b, null as b_avg_c,
              ROW_NUMBER() OVER (ORDER BY a) as seqnum
       FROM mytable
       GROUP BY a
      ) UNION ALL
      (SELECT null, null, b, AVG(c) as b_avg_c,
              ROW_NUMBER() OVER (ORDER BY b) as seqnum
       FROM mytable
       GROUP BY b
      )
     ) ab
GROUP BY seqnum;

This is more complicated because SQL treats a row as a single entity. You actually want columns on each row that are entirely unrelated to each other. So, this version creates a "relation" by assigning a sequential value and then aggregating by that value to get what you want.
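
The seqnum pairing can be checked with a small sketch, again using Python's sqlite3 module as a stand-in database. It assumes SQLite 3.25 or later (needed for window functions). SQLite does not accept parentheses around the individual SELECTs of a UNION ALL, so they are dropped here. The sample rows are invented, with two distinct values of a and three of b, to show how the shorter side is padded with NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (a TEXT, b TEXT, c REAL)")
# Invented sample data: a has 2 groups (x, y), b has 3 groups (p, q, r).
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("x", "p", 1.0), ("x", "q", 3.0), ("y", "p", 5.0), ("y", "r", 7.0)],
)

# Each branch numbers its own groups; the outer query then collapses the
# rows that share a sequence number into a single side-by-side row.
query = """
SELECT MAX(a) AS a, MAX(a_avg_c) AS a_avg_c,
       MAX(b) AS b, MAX(b_avg_c) AS b_avg_c
FROM (SELECT a, AVG(c) AS a_avg_c, NULL AS b, NULL AS b_avg_c,
             ROW_NUMBER() OVER (ORDER BY a) AS seqnum
      FROM mytable
      GROUP BY a
      UNION ALL
      SELECT NULL, NULL, b, AVG(c),
             ROW_NUMBER() OVER (ORDER BY b)
      FROM mytable
      GROUP BY b
     ) ab
GROUP BY seqnum
ORDER BY seqnum
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The third output row has NULL (None in Python) in the a columns, because there is no third group of a to pair with the third group of b.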

Calculating the AVG value per GROUP in the GROUP BY Clause

The grouping happens on the values that get spit out of datepart(hour, ...). You're already filtering on that value so you know they're going to range between 6 and 18. That's all that the grouping is going to see.

Now of course the datepart() function does what you're looking for in that it looks at the clock and gives the hour component of the time. If you want your group to coincide with HH:00:00 to HH:59:59.997 then you're in luck.

I've already noted in comments that you probably meant to filter your range from 6 to 17 and that your query will probably perform better if you change that and compare your raw CallTime value against a static range instead. Your reasoning looks correct to me. And because your reasoning is correct, you don't need the inner query (derived table) at all.

Also, if WaitDuration is an integer then you're going to be doing integer division in your output. You'd need to cast to decimal in that case, or change the divisor to a decimal value like 60.00.
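
The points above (group on the hour, filter the raw CallTime against a static range, no derived table, decimal divisor) can be sketched as follows. This is only an illustration under several assumptions: it runs on SQLite through Python's sqlite3 module, so strftime('%H', ...) stands in for datepart(hour, ...); the table name Calls, the date, and the sample rows are hypothetical; and WaitDuration is assumed to be an integer number of seconds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; only the column names CallTime and WaitDuration
# come from the discussion above.
conn.execute("CREATE TABLE Calls (CallTime TEXT, WaitDuration INTEGER)")
conn.executemany(
    "INSERT INTO Calls VALUES (?, ?)",
    [
        ("2024-01-15 05:59:00", 500),  # before 06:00, filtered out
        ("2024-01-15 06:10:00", 30),
        ("2024-01-15 06:50:00", 60),
        ("2024-01-15 17:59:59", 90),
        ("2024-01-15 18:00:00", 700),  # 18:00 is outside the 6..17 hours
    ],
)

# The filter compares the raw CallTime against a fixed range, and the
# grouping uses the hour directly, so no inner query is needed.
query = """
SELECT CAST(strftime('%H', CallTime) AS INTEGER) AS hr,
       SUM(WaitDuration) / (COUNT(*) * 60)   AS avg_min_int,  -- integer division truncates
       SUM(WaitDuration) / (COUNT(*) * 60.0) AS avg_min_dec   -- decimal divisor keeps the fraction
FROM Calls
WHERE CallTime >= '2024-01-15 06:00:00'
  AND CallTime <  '2024-01-15 18:00:00'
GROUP BY hr
ORDER BY hr
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With these rows the 6 o'clock bucket averages 45 seconds, which the integer version reports as 0 minutes and the decimal version as 0.75.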

SQL - AVG and Group by

If you just need the left (integer) part of product, cast it to int and then aggregate by the resulting value and date.

select date,
       cast(product as int) as product,
       avg(price) as Price
from table1
group by date, cast(product as int)

Result:

date      product  Price
------------------------
05-12-17  1        30
06-12-17  1        25

Update:

If product is of varchar datatype, use cast twice: first to a decimal, then to int.

select date,
       cast(cast(product as dec(3,1)) as int) as product,
       avg(price) as Price
from table1
group by date, cast(cast(product as dec(3,1)) as int)
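
As a rough check of the double cast, here is a sketch on SQLite via Python's sqlite3 module. It makes two assumptions: REAL is used in place of dec(3,1), and the input rows are invented (product stored as text such as '1.1'), chosen only so that the output has the same shape as the result shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (date TEXT, product TEXT, price REAL)")
# Invented sample rows; product is text, as in the varchar case.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("05-12-17", "1.1", 20),
        ("05-12-17", "1.2", 40),
        ("06-12-17", "1.1", 25),
    ],
)

# Text -> REAL -> INTEGER keeps only the part left of the decimal point,
# so '1.1' and '1.2' land in the same group.
query = """
SELECT date,
       CAST(CAST(product AS REAL) AS INTEGER) AS product,
       AVG(price) AS Price
FROM table1
GROUP BY date, CAST(CAST(product AS REAL) AS INTEGER)
ORDER BY date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```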
