Grouping but with keeping all non-NULL values
With GROUP BY, aggregate functions can be applied to columns that aren't in the GROUP BY.
In this case I assume you want to use MAX, to get only a 1 or a NULL.
SUM or COUNT can also be wrapped around a CASE WHEN, but those would return a total.
SELECT
Name,
MAX(CASE WHEN Intolerance = 'Lactose' THEN 1 END) AS Lactose,
MAX(CASE WHEN Intolerance = 'Gluten' THEN 1 END) AS Gluten
FROM Table
GROUP BY Name
ORDER BY Name
Or, if you don't want to see NULLs, let the CASE return a varchar instead of a number.
SELECT
Name,
MAX(CASE WHEN Intolerance = 'Lactose' THEN '1' ELSE '' END) AS Lactose,
MAX(CASE WHEN Intolerance = 'Gluten' THEN '1' ELSE '' END) AS Gluten
FROM Table
GROUP BY Name
ORDER BY Name
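To see the difference between MAX and COUNT around the CASE, here is a minimal sketch using Python's built-in sqlite3 module. The table name intolerances and the sample rows are assumptions made up for illustration; they are not from the question.

```python
import sqlite3

# Hypothetical sample data: table name and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE intolerances (Name TEXT, Intolerance TEXT)")
conn.executemany(
    "INSERT INTO intolerances VALUES (?, ?)",
    [("Ann", "Lactose"), ("Ann", "Lactose"), ("Ann", "Gluten"), ("Bob", "Gluten")],
)

# MAX yields a 1-or-NULL flag; COUNT yields how many rows matched.
rows = conn.execute(
    """
    SELECT
        Name,
        MAX(CASE WHEN Intolerance = 'Lactose' THEN 1 END) AS LactoseFlag,
        COUNT(CASE WHEN Intolerance = 'Lactose' THEN 1 END) AS LactoseCount
    FROM intolerances
    GROUP BY Name
    ORDER BY Name
    """
).fetchall()

print(rows)  # [('Ann', 1, 2), ('Bob', None, 0)]
```

Ann has two Lactose rows, so COUNT returns 2 while MAX still returns 1. Bob has none, so MAX returns NULL (None in Python) and COUNT returns 0.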
How do I group by all non-null values and do not group null values
Try this:
SELECT itemname, SUM(whatever)
FROM tab
WHERE itemname IS NOT NULL
GROUP BY itemname
UNION ALL
SELECT itemname, whatever
FROM tab
WHERE itemname IS NULL
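A small sketch of the same UNION ALL idea, run through Python's sqlite3 module. The sample rows are invented for illustration, and the column alias total is an assumption.

```python
import sqlite3

# Hypothetical sample data for the "tab" table from the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (itemname TEXT, whatever INTEGER)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?)",
    [("a", 1), ("a", 2), ("b", 5), (None, 10), (None, 20)],
)

# First branch groups the non-NULL names; second branch passes
# the NULL-named rows through one by one, without grouping.
rows = conn.execute(
    """
    SELECT itemname, SUM(whatever) AS total
    FROM tab
    WHERE itemname IS NOT NULL
    GROUP BY itemname
    UNION ALL
    SELECT itemname, whatever
    FROM tab
    WHERE itemname IS NULL
    """
).fetchall()

for row in rows:
    print(row)
```

The two rows with a NULL itemname come back as two separate rows, while the "a" rows are summed into one. Row order is not guaranteed without an ORDER BY.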
group by and select non null value if present
From your sample data I think that you don't need d in the group by clause, so get its max:
select
a, b, c,
max(d) d,
count(distinct e) as something
from tableX
where f between '2019-07-01 00:00:00' and '2019-07-01 23:59:59.999'
group by a, b, c
Group by Only when no NULL Values are present on another Column
Put the condition in the HAVING clause:
select v.id, v.title, v.description, v.UserId, v.createdAt, v.updatedAt, min(usercerid) usercerid
from ViewName v
group by v.id, v.title, v.description, v.UserId, v.createdAt, v.updatedAt
having sum(v.usercerid is null) = 0
You must group by all the columns that you select. I used min(usercerid) as the output column, although it's not obvious that you want it in the results at all. If you don't need it, remove it.
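The HAVING condition works because usercerid is null evaluates to 1 or 0, so the sum counts the NULLs in each group. A minimal sketch with Python's sqlite3 module, which treats the expression the same way; the table name view_rows, the reduced column list, and the sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical, simplified stand-in for the view in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE view_rows (id INTEGER, title TEXT, usercerid INTEGER)")
conn.executemany(
    "INSERT INTO view_rows VALUES (?, ?, ?)",
    [(1, "a", 10), (1, "a", None), (2, "b", 20), (2, "b", 30)],
)

# Groups containing at least one NULL usercerid are filtered out.
rows = conn.execute(
    """
    SELECT v.id, v.title, MIN(v.usercerid) AS usercerid
    FROM view_rows v
    GROUP BY v.id, v.title
    HAVING SUM(v.usercerid IS NULL) = 0
    """
).fetchall()

print(rows)  # [(2, 'b', 20)]
```

Group 1 has one NULL, so its sum is 1 and it is dropped; group 2 has none and is kept.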
GROUP BY not NULL values
Turns out, I can just put the NULL check in the GROUP BY clause:
SELECT
any(Y) AS Y,
any(X) AS X
FROM my_table
GROUP BY COALESCE(Y, CAST(reflect("java.util.UUID", "randomUUID") AS STRING));
My version of Hive doesn't support IFNULL(), so COALESCE() is a good alternative. My version of Hive also doesn't support UUID(), so I called reflect() to get a unique id.
group by not-null values
I think the following does what you want:
SELECT *, (To_days(date_expires)-TO_DAYS(NOW())) as dayDiff, COUNT(id) AS change_count
FROM mytable
GROUP BY (case when source_id is null then id else source_id end)
HAVING dayDiff < 4
ORDER BY (case when source_id is null then 1 else 0 end), date_created DESC
It does a conditional group by so that rows with a NULL source_id will not be grouped. It then puts them last using logic in the order by.
I didn't understand what you meant by last occurrence. Now I think I do:
SELECT coalesce(s.id, mytable.id) as id,
max(case when s.maxid is not null and s.maxid = mytable.id then mytable.name
when s.maxid is null then NULL
else mytable.name
end) as name,
(To_days(date_expires)-TO_DAYS(NOW())) as dayDiff, COUNT(id) AS change_count
FROM mytable left outer join
(select source_id, MAX(id) as maxid
from mytable
where source_id is not null
group by source_id
) s
on mytable.id = s.maxid
GROUP BY (case when source_id is null then id else source_id end)
HAVING dayDiff < 4
ORDER BY (case when source_id is null then 1 else 0 end), date_created DESC
This joins in the information from the latest record (based on highest id).
GROUP BY - do not group NULL
Perhaps you should add something to the null columns to make them unique and group on that? I was looking for some sort of sequence to use instead of UUID() but this might work just as well.
SELECT `table1`.*,
IFNULL(ancestor,UUID()) as unq_ancestor,
GROUP_CONCAT(id SEPARATOR ',') AS `children_ids`
FROM `table1`
WHERE (enabled = 1)
GROUP BY unq_ancestor
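The idea of giving each NULL its own grouping key can be tried out with Python's sqlite3 module. SQLite has no UUID() function, so this sketch swaps in a key built from the row's own id ('row-' || id), which is an assumption that id is unique; the sample rows are also invented.

```python
import sqlite3

# Hypothetical sample data for table1; ancestor is NULL for rows 3 and 4.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, ancestor TEXT, enabled INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, "x", 1), (2, "x", 1), (3, None, 1), (4, None, 1), (5, "x", 0)],
)

# Substitute for IFNULL(ancestor, UUID()): a per-row key derived from id,
# so every NULL ancestor ends up in its own group.
rows = conn.execute(
    """
    SELECT
        COALESCE(ancestor, 'row-' || id) AS unq_ancestor,
        GROUP_CONCAT(id, ',') AS children_ids
    FROM table1
    WHERE enabled = 1
    GROUP BY COALESCE(ancestor, 'row-' || id)
    """
).fetchall()

for row in rows:
    print(row)
```

Rows 1 and 2 share the group "x", while rows 3 and 4 each form a group of one. A key derived from id is deterministic, unlike UUID(), but it could collide if a real ancestor value happened to look like 'row-3'.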
MySQL get first non null value after group by
Try using MAX, like this:
SELECT
email,
MAX(`name`)
FROM
(
SELECT
email,
`name`
FROM
multiple_tables_and_unions
) AS emails
GROUP BY email
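This works because MAX ignores NULLs, so a group returns NULL only when every name in it is NULL. A minimal sketch with Python's sqlite3 module; the table name emails and the sample rows are assumptions standing in for the derived table above.

```python
import sqlite3

# Hypothetical stand-in for the "emails" derived table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emails (email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO emails VALUES (?, ?)",
    [("a@x.com", None), ("a@x.com", "Ann"), ("b@x.com", None)],
)

# MAX skips NULL names; a group with only NULLs yields NULL.
rows = conn.execute(
    """
    SELECT email, MAX(name) AS name
    FROM emails
    GROUP BY email
    ORDER BY email
    """
).fetchall()

print(rows)  # [('a@x.com', 'Ann'), ('b@x.com', None)]
```

Note that MAX returns the largest non-NULL value, not necessarily the first one encountered, so the two differ when a group has several distinct names.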
Pandas Grouping by Id and getting non-NaN values
This should do what you want:
df.groupby('salesforce_id').first().reset_index(drop=True)
That will merge the rows for each salesforce_id into one, keeping the first non-NaN value in each column (unless a column has no non-NaN values for that id; then the value in the merged row will be NaN).