Why Can't You Mix Aggregate Values and Non-Aggregate Values in a Single Select

Aggregates don't operate on a complete result; they only operate on a group within a result.

Consider a table containing:

Person   Pet
-------- --------
Amy      Cat
Amy      Dog
Amy      Canary
Dave     Dog
Susan    Snake
Susan    Spider

If you use a query that groups on Person, it will divide the data into these groups:

Amy:
Amy Cat
Amy Dog
Amy Canary
Dave:
Dave Dog
Susan:
Susan Snake
Susan Spider

If you use an aggregate, for example the count aggregate, it will produce one result for each group:

Amy:
Amy Cat
Amy Dog
Amy Canary count(*) = 3
Dave:
Dave Dog count(*) = 1
Susan:
Susan Snake
Susan Spider count(*) = 2

So, the query select Person, count(*) from People group by Person gives you one record for each group:

Amy      3
Dave     1
Susan    2

If you try to get the Pet field in the result also, that doesn't work because there may be multiple values for that field in each group.
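The grouping described above can be tried out with a small sketch. It uses Python's built-in sqlite3 module and an in-memory copy of the Person/Pet table; the choice of SQLite is an assumption made only so the example runs anywhere, and the variable names are illustrative.

```python
import sqlite3

# In-memory database holding the Person/Pet table from the text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Person TEXT, Pet TEXT)")
conn.executemany(
    "INSERT INTO People VALUES (?, ?)",
    [("Amy", "Cat"), ("Amy", "Dog"), ("Amy", "Canary"),
     ("Dave", "Dog"), ("Susan", "Snake"), ("Susan", "Spider")],
)

# One row per group: the grouping column plus an aggregate over that group.
counts = conn.execute(
    "SELECT Person, COUNT(*) FROM People GROUP BY Person ORDER BY Person"
).fetchall()
print(counts)

# The Amy group alone holds several Pet values, so there is no single
# Pet that could stand next to Amy's count in the grouped result.
amy_pets = [row[0] for row in conn.execute(
    "SELECT Pet FROM People WHERE Person = 'Amy' ORDER BY Pet"
)]
print(amy_pets)
```

The first query returns one row per person, while the second shows that the Amy group contains three different candidates for the Pet column.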

(Some databases, such as MySQL when its ONLY_FULL_GROUP_BY mode is disabled, do allow this anyway and return an arbitrary value from within the group; it is then your responsibility to know whether the result is sensible.)

If you use an aggregate but don't specify any grouping, the query is still grouped: the entire result forms a single group. So the query select count(*) from People creates a single group containing all records, and the aggregate counts the records in that group. The result contains one row for each group, and as there is only one group, there is one row in the result.
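A minimal sketch of the single-group case, again using sqlite3 as an assumed stand-in database with the Person/Pet rows from the text:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Person TEXT, Pet TEXT)")
conn.executemany(
    "INSERT INTO People VALUES (?, ?)",
    [("Amy", "Cat"), ("Amy", "Dog"), ("Amy", "Canary"),
     ("Dave", "Dog"), ("Susan", "Snake"), ("Susan", "Spider")],
)

# No GROUP BY: all six rows form one implicit group,
# so the result has exactly one row.
rows = conn.execute("SELECT COUNT(*) FROM People").fetchall()
print(rows)
```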

Combined aggregated and non-aggregate query in SQL

You can use ROW_NUMBER for the career-year:

SELECT player, year, games,
       cyear = ROW_NUMBER() OVER (PARTITION BY player ORDER BY year),
       gamesPerMax = 1.0 * games / MAX(games) OVER (PARTITION BY player)
FROM dbo.TableName

Have a look at the powerful OVER clause.
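As a rough illustration of what the OVER clause does, the sketch below runs an equivalent query in SQLite through Python's sqlite3 module. Several things here are assumptions: the sample rows are invented, the column aliases use standard AS syntax because the alias = expression form is specific to SQL Server, and the bundled SQLite must be version 3.25 or later for window functions to be available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (player TEXT, year INTEGER, games INTEGER)")
# Hypothetical sample rows, not taken from the original question.
conn.executemany(
    "INSERT INTO TableName VALUES (?, ?, ?)",
    [("ann", 2001, 10), ("ann", 2002, 20), ("bob", 2005, 8)],
)

# Window functions keep every input row and add a per-partition value,
# so aggregated and non-aggregated columns can sit side by side.
rows = conn.execute("""
    SELECT player, year, games,
           ROW_NUMBER() OVER (PARTITION BY player ORDER BY year) AS cyear,
           1.0 * games / MAX(games) OVER (PARTITION BY player) AS gamesPerMax
    FROM TableName
    ORDER BY player, year
""").fetchall()

for row in rows:
    print(row)
```

Unlike GROUP BY, the query returns all three input rows; each one carries its career-year number and its share of that player's best season.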

Can I use non-aggregate columns with group by?

You can't get the Id of the row that MAX found, because more than one row may share the maximum age, so there is no single Id to return.
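The tie can be made concrete with a small sketch. The table name Persons, its columns and its rows are all hypothetical, and sqlite3 is used only as a convenient database to run against. Filtering on the maximum with a subquery returns every matching Id rather than a single one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: two people share the maximum age.
conn.execute("CREATE TABLE Persons (Id INTEGER, Age INTEGER)")
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?)",
    [(1, 30), (2, 45), (3, 45)],
)

# The aggregate alone yields one value for the whole table...
max_age = conn.execute("SELECT MAX(Age) FROM Persons").fetchone()[0]

# ...but several rows can carry that value, so ask for all of them.
ids_with_max = [row[0] for row in conn.execute(
    "SELECT Id FROM Persons "
    "WHERE Age = (SELECT MAX(Age) FROM Persons) ORDER BY Id"
)]
print(max_age, ids_with_max)
```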

How to combine aggregate with nonaggregated in mysql query

When MySQL's only_full_group_by mode is on, a query with GROUP BY can select only two kinds of expression: the columns you group by, and aggregate functions such as MAX() or MIN().
If you do not want to change your sql_mode (which may cause problems in the future),
the query below should work for you.

SELECT max(name) as name,
       max(team) as team,
       max(timestamp) as timestamp
FROM competitors join location using(competitor_id)
where competitor_id in (
    select distinct competitor_id from competitors )
group by competitor_id
order by timestamp DESC ;

http://www.sqlfiddle.com/#!9/2cc8a6/1

Reference: SELECT list is not in GROUP BY clause and contains nonaggregated column .... incompatible with sql_mode=only_full_group_by
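To see what result the MAX() workaround produces, here is a sketch run against SQLite via sqlite3. The schema and rows are guesses (the original tables are not shown), the last alias is renamed to latest to keep it distinct from the source column, and SQLite does not enforce only_full_group_by, so this only illustrates the shape of the result and not the MySQL error itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per competitor, many location rows each.
conn.execute("CREATE TABLE competitors (competitor_id INTEGER, name TEXT, team TEXT)")
conn.execute("CREATE TABLE location (competitor_id INTEGER, timestamp TEXT)")
conn.executemany(
    "INSERT INTO competitors VALUES (?, ?, ?)",
    [(1, "Ana", "Red"), (2, "Bo", "Blue")],
)
conn.executemany(
    "INSERT INTO location VALUES (?, ?)",
    [(1, "2020-01-01"), (1, "2020-03-01"), (2, "2020-02-01")],
)

# Every selected expression is an aggregate, so the query is valid
# under strict grouping rules; one row comes back per competitor_id.
rows = conn.execute("""
    SELECT MAX(name) AS name,
           MAX(team) AS team,
           MAX(timestamp) AS latest
    FROM competitors JOIN location USING (competitor_id)
    GROUP BY competitor_id
    ORDER BY latest DESC
""").fetchall()
print(rows)
```

Wrapping name and team in MAX() is harmless here because each competitor_id is assumed to have exactly one name and team. Each MAX() is computed independently, so if those columns varied within a group the returned values could come from different rows.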

Using a pandas dataframe how to aggregate and groupby and bring in non aggregated/groupby columns

You can do it with sort_values followed by drop_duplicates, keeping the last or the first row depending on the case, and then merge. Merge only on month, and specify suffixes to rename the week column, which appears in both dataframes.

new_df = df[['month', 'high_temp', 'week']].sort_values('high_temp').drop_duplicates('month', keep='last')\
    .merge(df[['month', 'low_temp', 'week']].sort_values('low_temp').drop_duplicates('month', keep='first'),
           on='month', suffixes=('_high_temp', '_low_temp'))

print(new_df)
  month  high_temp week_high_temp  low_temp week_low_temp
0   jan         20            wk2         4           wk1
1   feb         30            wk1        23           wk1
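For a version that can be run end to end, the sketch below splits the same chain into named steps. The input dataframe is hypothetical: the original question's data is not shown, so the rows were chosen only so that the result matches the printed output above.

```python
import pandas as pd

# Hypothetical sample data; the original dataframe is not shown.
df = pd.DataFrame({
    "month": ["jan", "jan", "feb", "feb"],
    "week": ["wk1", "wk2", "wk1", "wk2"],
    "high_temp": [15, 20, 30, 25],
    "low_temp": [4, 10, 23, 24],
})

# Row with the highest high_temp per month: sort ascending, keep the last.
highs = (df[["month", "high_temp", "week"]]
         .sort_values("high_temp")
         .drop_duplicates("month", keep="last"))

# Row with the lowest low_temp per month: sort ascending, keep the first.
lows = (df[["month", "low_temp", "week"]]
        .sort_values("low_temp")
        .drop_duplicates("month", keep="first"))

# Merge on month only; suffixes disambiguate the shared week column.
new_df = highs.merge(lows, on="month", suffixes=("_high_temp", "_low_temp"))
print(new_df)
```

Keeping highs and lows as separate variables makes it easier to inspect which week each extreme came from before the merge.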

How to think of aggregate functions in terms of sets

Aggregate functions are computed after
a) Joins
b) filtering rows based on the where clause.

Picture your data set once the joins are complete and the rows have been filtered by the where clause of your query. The group by clause then subdivides this data set into distinct groups based on the columns specified in the group by clause. All the rows in a given group have the same value for every column specified in the group by clause.

Once the rows have been classified into groups, you can only select things that are common to a whole group. In your second example, where you grouped by c.user_id, there is one group for each distinct user_id in your collections table. If your query has no HAVING clause, it returns one row per group. You can think (only think) of each group as a row containing columns. These columns represent things that are common to the entire group, such as count(*), SUM, MAX and MIN. The value of the column on which the groups are formed is also the same for every row in a group.

The having clause is like a where clause for groups. It filters out entire groups that do not satisfy its predicate.
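The order of evaluation described in this answer (where, then group by, then having) can be sketched as follows. The collections table, its price column and its rows are hypothetical, and sqlite3 is used only so the example is runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the collections table.
conn.execute("CREATE TABLE collections (user_id INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO collections VALUES (?, ?)",
    [(1, 5), (1, 7), (1, 0), (2, 3), (3, 4), (3, 9)],
)

rows = conn.execute("""
    SELECT user_id, COUNT(*), SUM(price)
    FROM collections
    WHERE price > 0          -- 1) filter individual rows
    GROUP BY user_id         -- 2) split the remaining rows into groups
    HAVING COUNT(*) >= 2     -- 3) drop whole groups
    ORDER BY user_id
""").fetchall()
print(rows)
```

User 1 loses its zero-priced row in the where step but keeps two rows, so the group survives; user 2 forms a group of one row and is removed as a whole by the having clause.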


