Find First Non-Null Values for Multiple Columns


Using first_value()

first_value(col) can be used with an OVER (ORDER BY CASE WHEN col IS NOT NULL THEN sortcol ELSE maxvalue END) clause. The ELSE maxvalue branch is required because SQL Server sorts nulls first in ascending order.

CREATE TABLE foo(a int, b int, c int, sortCol int);
INSERT INTO foo VALUES
(null, 4, 8, 1),
(1, null, 0, 2),
(5, 7, null, 3);

The query below shows what is needed to force the nulls to sort after every sortcol value. For a descending sort, the replacement value must instead be smaller than every sortcol value, for example a negative number.

-- 2147483647 is the largest int value. In T-SQL, ^ is bitwise XOR, so 2^31-1 evaluates to 28, not 2147483647.
SELECT TOP(1)
  first_value(a) OVER (ORDER BY CASE WHEN a IS NOT NULL THEN sortcol ELSE 2147483647 END) AS a,
  first_value(b) OVER (ORDER BY CASE WHEN b IS NOT NULL THEN sortcol ELSE 2147483647 END) AS b,
  first_value(c) OVER (ORDER BY CASE WHEN c IS NOT NULL THEN sortcol ELSE 2147483647 END) AS c
FROM foo;
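
To try the idea without a SQL Server instance, here is a minimal sketch that runs the same CASE-based sort key in SQLite through Python's standard sqlite3 module. SQLite is only a stand-in chosen for this example, not part of the original answer. Like SQL Server, it sorts nulls first in ascending order, so the large replacement value is still needed. LIMIT 1 takes the place of TOP(1), and window functions need SQLite 3.25 or later.

```python
import sqlite3

# SQLite stands in for SQL Server here; both sort NULLs first in ascending order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo(a int, b int, c int, sortCol int)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?, ?)",
    [(None, 4, 8, 1), (1, None, 0, 2), (5, 7, None, 3)],
)

SENTINEL = 2147483647  # larger than any sortCol value, so NULL rows sort last

query = """
SELECT
  first_value(a) OVER (ORDER BY CASE WHEN a IS NOT NULL THEN sortCol ELSE :s END) AS a,
  first_value(b) OVER (ORDER BY CASE WHEN b IS NOT NULL THEN sortCol ELSE :s END) AS b,
  first_value(c) OVER (ORDER BY CASE WHEN c IS NOT NULL THEN sortCol ELSE :s END) AS c
FROM foo
LIMIT 1
"""
first_non_null = conn.execute(query, {"s": SENTINEL}).fetchone()
print(first_non_null)  # (1, 4, 8)
```

Each window function sees the whole table, so every row carries the same three values and keeping any single row is enough.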

PostgreSQL

PostgreSQL is slightly simpler because it sorts nulls last in ascending order by default, so the CASE needs no ELSE branch:

CREATE TABLE foo(a,b,c,sortCol)
AS VALUES
(null, 4, 8, 1),
(1, null, 0, 2),
(5, 7, null, 3);

SELECT
first_value(a) OVER (ORDER BY CASE WHEN a IS NOT NULL THEN sortcol END) AS a,
first_value(b) OVER (ORDER BY CASE WHEN b IS NOT NULL THEN sortcol END) AS b,
first_value(c) OVER (ORDER BY CASE WHEN c IS NOT NULL THEN sortcol END) AS c
FROM foo
FETCH FIRST ROW ONLY;

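I believe all of this goes away as RDBMSs adopt IGNORE NULLS. Then it is just first_value(a) IGNORE NULLS OVER (ORDER BY sortcol). The SQL standard places IGNORE NULLS after the closing parenthesis; Oracle also accepts first_value(a IGNORE NULLS).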

Get the first non-null value from selected cells in a row

One option with dplyr, which treats 0 as the missing value and picks the first non-zero entry in each row, could be:

df %>%
rowwise() %>%
mutate(liv6 = with(rle(c_across(liv:liv5)), values[which.max(values != 0)]))

     MD   liv  liv2  liv3  liv4  liv5  liv6
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1   100     0     6     1     1     0     6
2   200     0     2     1     0     2     2
3   300     1     0     1     0     7     1
4   400     3     4     1     3     9     3
5   500     4     5     1     5    10     4
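
For comparison, the same row-wise rule can be sketched in plain Python. The rows are copied from the output above. The helper name first_nonzero and its fallback of 0 for an all-zero row are assumptions made for this illustration.

```python
# Rows copied from the example output above: (MD, liv, liv2, liv3, liv4, liv5).
rows = [
    (100, 0, 6, 1, 1, 0),
    (200, 0, 2, 1, 0, 2),
    (300, 1, 0, 1, 0, 7),
    (400, 3, 4, 1, 3, 9),
    (500, 4, 5, 1, 5, 10),
]


def first_nonzero(values, default=0):
    """Return the first value that is not 0, or `default` if every value is 0."""
    return next((v for v in values if v != 0), default)


# Skip the MD column (index 0) and scan liv..liv5 from left to right.
liv6 = [first_nonzero(row[1:]) for row in rows]
print(liv6)  # [6, 2, 1, 3, 4]
```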

How do I get the first non-null value from multiple columns based on another datetime column order and grouped by ID?

(df.sort_values('ts', ascending=False).bfill().groupby('id')[['site', 'type']]
.agg(lambda x:x.bfill().head(1)).reset_index())

id site type
0 111 A 1.0
1 222 C 1.0

Note that if you are sure there is at least one non-NaN value per id, you can do:

(df.sort_values('ts', ascending=False).bfill().groupby('id')[['site', 'type']]
.first().reset_index())

id site type
0 111 A 1.0
1 222 C 1.0

SQL how to find the first non null value between 2 columns

Use COALESCE:

SELECT 
COALESCE(p.name, ph.name) as name
FROM partner p
FULL JOIN partner_history ph
ON p.idPartner = ph.idPartner
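
A small way to check the behaviour is the sketch below, which uses SQLite through Python's sqlite3 module. The sample rows are hypothetical. LEFT JOIN replaces FULL JOIN only so that the example also runs on SQLite versions without FULL JOIN support. The COALESCE call itself is unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: partner 2 has no current name, only a historical one.
conn.execute("CREATE TABLE partner(idPartner int, name text)")
conn.execute("CREATE TABLE partner_history(idPartner int, name text)")
conn.executemany("INSERT INTO partner VALUES (?, ?)", [(1, "Acme"), (2, None)])
conn.executemany(
    "INSERT INTO partner_history VALUES (?, ?)", [(1, "Acme Ltd"), (2, "Globex")]
)

# COALESCE returns its first non-NULL argument, reading left to right.
names = conn.execute(
    """
    SELECT p.idPartner, COALESCE(p.name, ph.name) AS name
    FROM partner p
    LEFT JOIN partner_history ph ON p.idPartner = ph.idPartner
    ORDER BY p.idPartner
    """
).fetchall()
print(names)  # [(1, 'Acme'), (2, 'Globex')]
```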

First non-null value per group in a table with many columns

There is a trick, which is to use array_agg() and remove nulls. That would be:

select groupid,
(array_remove(array_agg(num1 order by submitted desc), null))[1] as num1,
(array_remove(array_agg(num2 order by submitted desc), null))[1] as num2,
. . .
from t
group by groupid;
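
The three steps in that expression (aggregate in order, drop the nulls, take the first element) can be sketched in plain Python as follows. The sample rows and the function name first_non_null_per_group are hypothetical, and None plays the role of NULL.

```python
from collections import defaultdict

# Hypothetical rows: (groupid, submitted, num1, num2); None plays the role of NULL.
rows = [
    (1, "2024-01-01", 10, None),
    (1, "2024-01-02", None, 20),
    (1, "2024-01-03", None, None),
    (2, "2024-01-01", None, None),
    (2, "2024-01-02", 7, None),
]


def first_non_null_per_group(rows):
    groups = defaultdict(list)
    for groupid, submitted, num1, num2 in rows:
        groups[groupid].append((submitted, num1, num2))

    result = {}
    for groupid, items in groups.items():
        # array_agg(... order by submitted desc)
        items.sort(key=lambda item: item[0], reverse=True)
        picked = []
        for col in (1, 2):
            # array_remove(..., null)
            non_null = [item[col] for item in items if item[col] is not None]
            # [1] in PostgreSQL; an empty array yields NULL, here None
            picked.append(non_null[0] if non_null else None)
        result[groupid] = tuple(picked)
    return result


latest = first_non_null_per_group(rows)
print(latest)  # {1: (10, 20), 2: (7, None)}
```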


Combining non-null values from two columns into one column

np.nan != np.nan evaluates to True, so the two commands do not behave the same way (consider what happens when offer id is NaN).

Why don't you just use fillna:

transcript_cp['offer id'].fillna(transcript_cp['offer_id'])
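
The NaN comparison rule is not specific to NumPy; it can be checked with the standard library alone. The sketch below also shows an element-wise equivalent of the fillna call on two plain lists. The sample values and the helper is_missing are hypothetical.

```python
import math

nan = math.nan
print(nan != nan)       # True: NaN never compares equal, even to itself
print(math.isnan(nan))  # True: the reliable way to test for NaN

# Hypothetical column values; NaN marks a missing offer id.
offer_id_a = ["x1", nan, nan]
offer_id_b = [nan, "y2", nan]


def is_missing(value):
    """True only for float NaN; strings and other values count as present."""
    return isinstance(value, float) and math.isnan(value)


# Element-wise equivalent of a.fillna(b): keep a's value unless it is missing.
combined = [b if is_missing(a) else a for a, b in zip(offer_id_a, offer_id_b)]
print(combined[:2])  # ['x1', 'y2']; the third entry stays NaN
```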

How to select last three non-NULL columns across multiple columns

This will do what you request.

Edit: The initial version probably didn't do what you want if there were fewer than three NOT NULL values in a row. This version shifts them left.

SELECT Id,
       CA.Col1,
       CA.Col2,
       CA.Col3,
       NULL AS Col4,
       NULL AS Col5,
       NULL AS Col6
FROM   YourTable
       CROSS APPLY (SELECT MAX(CASE WHEN RN = 1 THEN val END) AS Col1,
                           MAX(CASE WHEN RN = 2 THEN val END) AS Col2,
                           MAX(CASE WHEN RN = 3 THEN val END) AS Col3
                    FROM   (SELECT val,
                                   ROW_NUMBER() OVER (ORDER BY ord) AS RN
                            FROM   (SELECT TOP 3 *
                                    FROM   (VALUES (1, col1),
                                                   (2, col2),
                                                   (3, col3),
                                                   (4, col4),
                                                   (5, col5),
                                                   (6, col6)) v(ord, val)
                                    WHERE  val IS NOT NULL
                                    ORDER  BY ord DESC) d1
                           ) d2
                   ) CA
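
The row-level logic of the query (keep the last three non-null values in their original column order, shift them left, pad with NULL) can be sketched in plain Python. The function name last_three_non_null and the sample rows are hypothetical.

```python
def last_three_non_null(row, width=6, keep=3):
    """Keep the last `keep` non-None values in order, left-aligned and None-padded."""
    non_null = [v for v in row if v is not None]  # WHERE val IS NOT NULL
    kept = non_null[-keep:]                       # TOP 3 ... ORDER BY ord DESC, re-sorted by ord
    return kept + [None] * (width - len(kept))    # Col4..Col6 and any unfilled slot stay NULL


shifted_full = last_three_non_null([1, None, 3, 4, None, 6])
shifted_short = last_three_non_null([None, 2, None, None, 5, None])
print(shifted_full)   # [3, 4, 6, None, None, None]
print(shifted_short)  # [2, 5, None, None, None, None]
```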

BigQuery - get values from different columns based on first() non null value

Below is for BigQuery Standard SQL

For the first non-NULL article:

#standardSQL
SELECT AS VALUE ARRAY_AGG(
    STRUCT(session_id, article AS first_article, article_type, n_page)
    ORDER BY n_page LIMIT 1
  )[OFFSET(0)]
FROM `project.dataset.table`
WHERE NOT article IS NULL
GROUP BY session_id

For the last non-NULL article:

#standardSQL
SELECT AS VALUE ARRAY_AGG(
    STRUCT(session_id, article AS last_article, article_type, n_page)
    ORDER BY n_page DESC LIMIT 1
  )[OFFSET(0)]
FROM `project.dataset.table`
WHERE NOT article IS NULL
GROUP BY session_id
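
Both queries pick one whole row per session after discarding rows without an article. A minimal Python sketch of that selection follows. The sample page views and the function name first_and_last_article are hypothetical.

```python
# Hypothetical page views: (session_id, n_page, article, article_type); None is NULL.
views = [
    ("s1", 1, None, None),
    ("s1", 2, "intro", "news"),
    ("s1", 3, "deep-dive", "feature"),
    ("s2", 1, "recap", "sports"),
    ("s2", 2, None, None),
]


def first_and_last_article(views):
    by_session = {}
    for view in views:
        if view[2] is not None:  # WHERE NOT article IS NULL
            by_session.setdefault(view[0], []).append(view)  # GROUP BY session_id
    return {
        session_id: (
            min(rows, key=lambda r: r[1]),  # ORDER BY n_page LIMIT 1
            max(rows, key=lambda r: r[1]),  # ORDER BY n_page DESC LIMIT 1
        )
        for session_id, rows in by_session.items()
    }


articles = first_and_last_article(views)
print(articles["s1"][0])  # ('s1', 2, 'intro', 'news')
print(articles["s1"][1])  # ('s1', 3, 'deep-dive', 'feature')
```

As in the SQL, a session whose rows all have a NULL article does not appear in the result, because the filter runs before the grouping.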

