Snowflake: "SQL Compilation Error:... Is Not a Valid Group by Expression"

Snowflake: SQL compilation error:... is not a valid group by expression

The commented out code is not correct. The reason is that the window function is parsed "after" the group by, and there is no product_id or retail_price after the group by.

This is easily fixed:

select city, state,
count(*) as city_sale_cnt,
count(*) over (partition by state) as state_sale_cnt,
sum(count(product_id)) over (partition by state) as ,
sum(retail_price) as city_price,
sum(sum(retail_price)) over ( partition by state) as state_price
from sales
group by 1, 2;

At first, using window functions in an aggregation query looks a bit confusing -- the nested aggregation functions look awkward. I find, though that it is pretty easy to get used to the syntax, once you have used it a few times.

line_item.value is not a valid group by expression

If you need to have some additional "total" (of any kind, supported functions are listed in the documentation) column along with your original columns, you need to use analytical function here.

This way you need to put all the column to aggregate by in partition by <column list> clause and you do not need a group by.

select
line_item.value:product_id::varchar as id,
discount_allocation.value:amount::decimal(36,2) as discount,
sum(discount)
over(
partition by
line_item.value:product_id,
discount_allocation.value:amount
) as total
from RAW_DATA_BUFFER,
lateral flatten(input=>src:line_items) line_item,
lateral flatten(input=>line_item.value:discount_allocations) discount_allocation
where record_type='shopify_order'
limit 10;

Also it's possible to calculate totals on totals with

select
display_cols,
sum(column_to_aggregate) as aggregated_column_to_aggregate,
sum(sum(column_to_aggregate))
over(partition by columns_to_total_by) as total_column_to_aggregate
from your_table
group by group_by_cols

where the first sum calculates total grouped by group_by_cols and the second (outer) sum calculates an analytical function (totals) on aggregated result by columns_to_total_by. One notice: both display_cols and columns_to_total_by should be in group_by_cols columns since group by is executed first).

Running total doesnt work Snowflake: error SQL compilation error:... is not a valid group by expression

You can use something like following (remove group by) -

SNOWFLAKE1#COMPUTE_WH@TEST_DB.PUBLIC>select closedate,sum(RECURRING_REVENUE_AMOUNTS) over (partition by closedate order
by closedate rows between unbounded preceding and current row) running_total from
OPPORTUNITY_OVERVIEW;
+------------+---------------+
| CLOSEDATE | RUNNING_TOTAL |
|------------+---------------|
| 2021-03-26 | 2000 |
| 2021-03-21 | 2944 |
| 2021-03-25 | 3444 |
| 2021-03-20 | 2300 |
| 2021-03-24 | 2444 |
| 2021-03-17 | 10200 |
| 2021-03-23 | 44323 |
| 2021-03-19 | 300 |
| 2021-03-22 | 2333 |
| 2021-03-18 | 30200 |
+------------+---------------+
10 Row(s) produced. Time Elapsed: 0.236s

Also refer to following which mentions partition is now always compatible with GROUP BY -
https://docs.snowflake.com/en/sql-reference/functions-analytic.html#general-tips

Using columns in window function that are not in Group By, I get : SQL compilation error: [COLUMN_A] is not a valid group by expression

The first_value() makes sense because there is only one value per group. However, the max() does not make sense.

I wonder if this does what you really want:

SELECT DISTINCT Col_A, Col_B,
FIRST_VALUE(Col_C) OVER (PARTITION BY Col_A, Col_B
ORDER BY Col_TimeStamp ASC
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
),
MAX(Col_D) OVER (PARTITION BY Col_A, Col_B)
FROM mytable;

This returns the overall max of Col_D for each group.

Invalid group by expression error when using any_value with max and window function in Snowflake

When you added your QUALIFY clause you dropped the GROUP BY clause from your SQL, aggregate function like MAX, need all selections to be aggregate function OR to have a GROUP BY clause.

So if you only want the best row per the grouping clause, which you note, you aggregate functions need to be explicitly windowed. Thus

SELECT 
data_index::string AS data_index,
comp_id::string AS comp_id,
account_id::string AS account_id,
comp_version::string AS comp_version,
name::string AS name,
description::string AS description,
MAX(object_dict:"startshape-type")OVER(PARTITION BY comp_id)[0]::string AS startshape_type,
MAX(object_dict:"startshape-connector-type")OVER (PARTITION BY comp_id)[0]::string AS startshape_connector_type ,
MAX(object_dict:"startshape-action-type")OVER (PARTITION BY comp_id)[0]::string AS statshape_action_type,
MAX(object_dict:"overrides-enabled")OVER(PARTITION BY comp_id)[0]::string AS overrides_enabled,
FROM COMP_DATA
QUALIFY 1 = ROW_NUMBER() OVER (PARTITION BY comp_id ORDER BY comp_version DESC);

There is a small chance you will need to add a set of brackets around those MAX's like

(MAX(object_dict:"overrides-enabled")OVER(PARTITION BY comp_id))[0]::string AS overrides_enabled,

But I suspect it will work out of the box. And I assumed you don't want the row_number so pushed it into the qualify (because it will always be the value 1)

Not a GROUP BY expression error

The error says it all, you're not grouping by MEMBERS.MEMBER_ID and MEMBERS.MEMBER_NAME.

SELECT MEMBERS.MEMBER_ID, MEMBERS.MEMBER_NAME
, COUNT(personal_training_sessions.session_id)
FROM MEMBERS
JOIN personal_training_sessions
ON personal_training_sessions.member_id = members.member_id
GROUP BY MEMBERS.MEMBER_ID, MEMBERS.MEMBER_NAME

You want the count of personal sessions per member, so you need to group by the member information.

The basic (of course it can get a lot more complex) GROUP BY, SELECT query is:

SELECT <column 1>, <column n>
, <aggregate function 1>, <aggregate function n>
FROM <table_name>
GROUP BY <column 1>, <column n>

An aggregate function being, as Ken White says, something like MIN(), MAX(), COUNT() etc. You GROUP BY all the columns that are not aggregated.

This will only work as intended if your MEMBERS table is unique on MEMBER_ID, but based on your query I suspect it is. To clarify what I mean, if your table is not unique on MEMBER_ID then you're not counting the number of sessions per MEMBER_ID but the number of sessions per MEMBER_ID and per MEMBER_NAME. If they're in a 1:1 relationship then it's effectively the same thing but if you can have multiple MEMBER_NAMEs per MEMBER_ID then it's not.



Related Topics



Leave a reply



Submit