Find first non-null values for multiple columns
Using first_value()
first_value(col) can be used with OVER (ORDER BY CASE WHEN col IS NOT NULL THEN sortcol ELSE maxvalue END). The ELSE maxvalue branch is required because SQL Server sorts nulls first.
CREATE TABLE foo(a int, b int, c int, sortCol int);
INSERT INTO foo VALUES
(null, 4, 8, 1),
(1, null, 0, 2),
(5, 7, null, 3);
Now you can see what we have to do to force nulls to sort after every sortcol value. To sort descending instead, the substitute value has to be lower than every sortcol value, for example a negative number.
-- 2147483647 is the int maximum; in T-SQL ^ is bitwise XOR, so 2^31-1 would not produce it
SELECT TOP(1)
first_value(a) OVER (ORDER BY CASE WHEN a IS NOT NULL THEN sortcol ELSE 2147483647 END) AS a,
first_value(b) OVER (ORDER BY CASE WHEN b IS NOT NULL THEN sortcol ELSE 2147483647 END) AS b,
first_value(c) OVER (ORDER BY CASE WHEN c IS NOT NULL THEN sortcol ELSE 2147483647 END) AS c
FROM foo;
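The ascending and descending variants described above can be tried without a SQL Server instance. The following is a minimal sketch using Python's built-in sqlite3 module, chosen here only because SQLite also sorts nulls first in ascending order; it assumes the bundled SQLite is 3.25 or newer (window function support). The helper name edge_values and the sentinel -1 for the descending case are illustrative assumptions, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo(a int, b int, c int, sortCol int)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?, ?)",
    [(None, 4, 8, 1), (1, None, 0, 2), (5, 7, None, 3)],
)

def edge_values(sentinel, direction):
    """Hypothetical helper: one first_value() window per column.

    Rows where the column is NULL get `sentinel` as their sort key,
    so they end up last in the chosen sort direction.
    """
    cols = ", ".join(
        f"first_value({col}) OVER (ORDER BY CASE WHEN {col} IS NOT NULL "
        f"THEN sortCol ELSE {sentinel} END {direction}) AS {col}"
        for col in ("a", "b", "c")
    )
    # Every row carries the same window results, so any single row will do.
    return conn.execute(f"SELECT {cols} FROM foo LIMIT 1").fetchone()

first_non_null = edge_values(2147483647, "ASC")  # sentinel above every sortCol
last_non_null = edge_values(-1, "DESC")          # sentinel below every sortCol
print(first_non_null)  # (1, 4, 8)
print(last_non_null)   # (5, 7, 0)
```

The sentinel only has to lie outside the range of real sortCol values on the correct side; the int maximum and -1 are convenient choices for this sample data.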
PostgreSQL
PostgreSQL is slightly simpler, because it sorts nulls last by default in ascending order:
CREATE TABLE foo(a,b,c,sortCol)
AS VALUES
(null, 4, 8, 1),
(1, null, 0, 2),
(5, 7, null, 3);
SELECT
first_value(a) OVER (ORDER BY CASE WHEN a IS NOT NULL THEN sortcol END) AS a,
first_value(b) OVER (ORDER BY CASE WHEN b IS NOT NULL THEN sortcol END) AS b,
first_value(c) OVER (ORDER BY CASE WHEN c IS NOT NULL THEN sortcol END) AS c
FROM foo
FETCH FIRST ROW ONLY;
I believe all of this goes away when RDBMSs start to adopt IGNORE NULLS. Then it'll just be first_value(a IGNORE NULLS).
Get the first non-null value from selected cells in a row
One option with dplyr could be:
df %>%
rowwise() %>%
mutate(liv6 = with(rle(c_across(liv:liv5)), values[which.max(values != 0)]))
     MD   liv  liv2  liv3  liv4  liv5  liv6
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1   100     0     6     1     1     0     6
2   200     0     2     1     0     2     2
3   300     1     0     1     0     7     1
4   400     3     4     1     3     9     3
5   500     4     5     1     5    10     4
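Note that the dplyr expression treats 0 as the "empty" value, not NA: it picks the first non-zero entry among liv to liv5. As a rough illustration of that row-wise logic, here is a plain Python sketch over the same numbers shown in the output above; the function name first_nonzero is an assumption made for this example.

```python
# Values of liv..liv5 per MD, copied from the output above.
rows = {
    100: [0, 6, 1, 1, 0],
    200: [0, 2, 1, 0, 2],
    300: [1, 0, 1, 0, 7],
    400: [3, 4, 1, 3, 9],
    500: [4, 5, 1, 5, 10],
}

def first_nonzero(values, default=0):
    """Return the first value that is not 0, or `default` if all are 0."""
    return next((v for v in values if v != 0), default)

liv6 = {md: first_nonzero(vals) for md, vals in rows.items()}
print(liv6)  # {100: 6, 200: 2, 300: 1, 400: 3, 500: 4}
```

Falling back to 0 for an all-zero row mirrors what which.max() does in the R version, where an all-FALSE vector yields index 1.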
How do I get the first non-null value from multiple columns based on another datetime column order and grouped by ID?
(df.sort_values('ts', ascending=False).bfill().groupby('id')[['site', 'type']]
.agg(lambda x:x.bfill().head(1)).reset_index())
id site type
0 111 A 1.0
1 222 C 1.0
Note that if you are sure there is at least one non-NaN value per id, you can do:
(df.sort_values('ts', ascending=False).bfill().groupby('id')[['site', 'type']]
.first().reset_index())
id site type
0 111 A 1.0
1 222 C 1.0
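To make the intent of the pandas chain easier to follow, here is a dependency-free sketch of the same idea: walk the rows newest first and keep, per id and per column, the first value that is not missing. The sample records are hypothetical (the original data frame is not shown) and were chosen only so that the result matches the output above; None stands in for NaN.

```python
from datetime import datetime

# Hypothetical rows: (id, ts, site, type); None marks a missing value.
records = [
    (111, datetime(2021, 1, 1), "B", None),
    (111, datetime(2021, 1, 2), "A", 1.0),
    (111, datetime(2021, 1, 3), None, None),
    (222, datetime(2021, 1, 1), None, 1.0),
    (222, datetime(2021, 1, 2), "C", None),
]

def latest_non_null(records):
    """Per id, return the most recent non-missing site and type."""
    result = {}
    # Newest rows first, mirroring sort_values('ts', ascending=False).
    for rec_id, _ts, site, typ in sorted(records, key=lambda r: r[1], reverse=True):
        slot = result.setdefault(rec_id, {"site": None, "type": None})
        # Keep only the first non-missing value seen for each column.
        if slot["site"] is None and site is not None:
            slot["site"] = site
        if slot["type"] is None and typ is not None:
            slot["type"] = typ
    return result

latest = latest_non_null(records)
print(latest)
# {111: {'site': 'A', 'type': 1.0}, 222: {'site': 'C', 'type': 1.0}}
```

Because each id is handled separately, a missing value in one id is never filled from a different id.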
SQL how to find the first non null value between 2 columns
Use COALESCE:
SELECT
COALESCE(p.name, ph.name) as name
FROM partner p
FULL JOIN partner_history ph
ON p.idPartner = ph.idPartner
First non-null value per group in a table with many columns
There is a trick, which is to use array_agg() and remove nulls. That would be:
select groupid,
(array_remove(array_agg(num1 order by submitted desc), null))[1] as num1,
(array_remove(array_agg(num2 order by submitted desc), null))[1] as num2,
. . .
from t
group by groupid;
Here is a db<>fiddle.
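The aggregate-then-strip-nulls trick can be mimicked step by step in Python, which may help when checking what the query should return. This is only a sketch: the rows and the function name first_non_null_per_group are made up for illustration, and None plays the role of SQL null.

```python
from collections import defaultdict

# Hypothetical rows: (groupid, submitted, num1, num2).
rows = [
    (1, 1, 10, None),
    (1, 2, None, 20),
    (1, 3, None, None),
    (2, 1, 7, 8),
    (2, 2, 9, None),
]

def first_non_null_per_group(rows):
    groups = defaultdict(list)
    for groupid, submitted, num1, num2 in rows:
        groups[groupid].append((submitted, num1, num2))

    out = {}
    for groupid, items in groups.items():
        # array_agg(... order by submitted desc)
        items.sort(key=lambda r: r[0], reverse=True)
        picked = []
        for col in (1, 2):
            # array_remove(..., null)
            non_null = [r[col] for r in items if r[col] is not None]
            # [1] in PostgreSQL: first element, or null if the array is empty
            picked.append(non_null[0] if non_null else None)
        out[groupid] = tuple(picked)
    return out

per_group = first_non_null_per_group(rows)
print(per_group)  # {1: (10, 20), 2: (9, 8)}
```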
Combining non-null values from two columns into one column
np.nan != np.nan evaluates to True. So there are differences between the two commands (what happens when offer id is nan?).
Why don't you just use fillna:
transcript_cp['offer id'].fillna(transcript_cp['offer_id'])
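The NaN comparison pitfall is not specific to NumPy; it follows from IEEE 754 floats and can be seen with plain Python. The sketch below also includes a small stand-in for the fillna idea on lists. The function fill_missing and the sample values are assumptions for illustration, not pandas API.

```python
import math

nan = float("nan")
print(nan != nan)       # True: NaN never compares equal, not even to itself
print(math.isnan(nan))  # True: the reliable way to test for NaN

def fill_missing(primary, fallback):
    """Take values from `primary`, using `fallback` where primary is NaN."""
    return [
        f if isinstance(p, float) and math.isnan(p) else p
        for p, f in zip(primary, fallback)
    ]

# Hypothetical contents of the two columns.
offer_id_a = ["x1", nan, "x3"]
offer_id_b = [nan, "y2", "y3"]
combined = fill_missing(offer_id_a, offer_id_b)
print(combined)  # ['x1', 'y2', 'x3']
```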
How to select last three non-NULL columns across multiple columns
This will do what you request (db <> fiddle)
Edit: The initial version probably didn't do what you want if there were fewer than three NOT NULL values in a row. This version will shift them left.
SELECT Id,
CA.Col1,
CA.Col2,
CA.Col3,
NULL AS Col4,
NULL AS Col5,
NULL AS Col6
FROM YourTable
CROSS APPLY (SELECT MAX(CASE WHEN RN = 1 THEN val END) AS Col1,
MAX(CASE WHEN RN = 2 THEN val END) AS Col2,
MAX(CASE WHEN RN = 3 THEN val END) AS Col3
FROM (SELECT val,
ROW_NUMBER() OVER (ORDER BY ord) AS RN
FROM
(SELECT TOP 3 *
FROM (VALUES(1, col1),
(2, col2),
(3, col3),
(4, col4),
(5, col5),
(6, col6) ) v(ord, val)
WHERE val IS NOT NULL
ORDER BY ord DESC
) d1
) d2
) CA
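For checking expected results by hand, the behaviour of this query for a single row can be sketched in a few lines of Python: drop the nulls, keep the last three remaining values in their original column order, and pad the rest with nulls. The function name and sample rows are hypothetical.

```python
def last_three_non_null(row, width=6):
    """Return the last three non-None values, shifted left and padded."""
    non_null = [v for v in row if v is not None]  # WHERE val IS NOT NULL
    kept = non_null[-3:]                          # TOP 3 ... ORDER BY ord DESC
    return kept + [None] * (width - len(kept))    # Col4..Col6 and gaps stay NULL

print(last_three_non_null([1, None, 3, 4, None, 6]))
# [3, 4, 6, None, None, None]
print(last_three_non_null([None, 2, None, None, None, None]))
# [2, None, None, None, None, None]
```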
BigQuery - get values from different columns based on first() non null value
Below is for BigQuery Standard SQL
For first non NULL article
#standardSQL
SELECT AS VALUE ARRAY_AGG(
STRUCT(session_id, article AS first_article, article_type, n_page)
ORDER BY n_page LIMIT 1
)[OFFSET(0)]
FROM `project.dataset.table`
WHERE NOT article IS NULL
GROUP BY session_id
For last non NULL article
#standardSQL
SELECT AS VALUE ARRAY_AGG(
STRUCT(session_id, article AS last_article, article_type, n_page)
ORDER BY n_page DESC LIMIT 1
)[OFFSET(0)]
FROM `project.dataset.table`
WHERE NOT article IS NULL
GROUP BY session_id
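Both queries follow the same pattern: filter out rows without an article, order the rest by n_page, and keep one whole row per session. A rough Python equivalent, useful for reasoning about the expected result, might look like this. The sample rows and the function name edge_article are assumptions; only the column names come from the queries above.

```python
# Hypothetical page views.
rows = [
    {"session_id": "s1", "article": None,  "article_type": None,   "n_page": 1},
    {"session_id": "s1", "article": "a10", "article_type": "news", "n_page": 2},
    {"session_id": "s1", "article": "a11", "article_type": "blog", "n_page": 3},
    {"session_id": "s2", "article": "a20", "article_type": "news", "n_page": 1},
    {"session_id": "s2", "article": None,  "article_type": None,   "n_page": 2},
]

def edge_article(rows, last=False):
    """Per session, return the row holding the first (or last) non-null article."""
    picked = {}
    candidates = [r for r in rows if r["article"] is not None]  # WHERE NOT article IS NULL
    # ORDER BY n_page [DESC]; the first row seen per session wins (LIMIT 1).
    for r in sorted(candidates, key=lambda r: r["n_page"], reverse=last):
        picked.setdefault(r["session_id"], r)
    return picked

first_rows = edge_article(rows)
last_rows = edge_article(rows, last=True)
print({s: r["article"] for s, r in first_rows.items()})  # {'s1': 'a10', 's2': 'a20'}
print({s: r["article"] for s, r in last_rows.items()})   # {'s1': 'a11', 's2': 'a20'}
```

Keeping the whole row, as the STRUCT does in the BigQuery version, guarantees that article_type and n_page come from the same row as the chosen article.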