Count Number of Null Values in Each Column in SQL

Count NULL Values from multiple columns with SQL

SELECT COUNT(*)-COUNT(A) As A, COUNT(*)-COUNT(B) As B, COUNT(*)-COUNT(C) As C
FROM YourTable;

Count number of NULL values in each column in SQL

As Paolo said, but here is an example:

DECLARE @TableName VARCHAR(512) = 'invoiceTbl';
DECLARE @SQL VARCHAR(1024);
WITH SQLText AS (
SELECT
ROW_NUMBER() OVER (ORDER BY c.Name) AS RowNum,
'SELECT ''' + c.name + ''', SUM(CASE WHEN ' + c.Name + ' IS NULL THEN 1 ELSE 0 END) AS NullValues FROM ' + @TableName AS SQLRow
FROM
sys.tables t
INNER JOIN sys.columns c ON c.object_id = t.object_id
WHERE
t.name = @TableName),
Recur AS (
SELECT
RowNum,
CONVERT(VARCHAR(MAX), SQLRow) AS SQLRow
FROM
SQLText
WHERE
RowNum = 1
UNION ALL
SELECT
t.RowNum,
CONVERT(VARCHAR(MAX), r.SQLRow + ' UNION ALL ' + t.SQLRow)
FROM
SQLText t
INNER JOIN Recur r ON t.RowNum = r.RowNum + 1
)
SELECT @SQL = SQLRow FROM Recur WHERE RowNum = (SELECT MAX(RowNum) FROM Recur);
EXEC(@SQL);

Snowflake: Count number of NULL values in each column SQL

some setup:

create table test.test.some_nulls(a string, b int, c boolean) as
select * from values
('a', 1, true),
('b', 2, false),
('c', 3, null),
('d', null, true),
('e', null, false),
('f', null, null),

(null, 7, true),
('h', 8, false),
(null, 9, null),
('j', null, true),
(null, null, false),
('l', null, null);

and:

SET DBName = 'TEST';
SET SchemaName = 'TEST';
SET TableName = upper('some_nulls');

this is the sql we are wanting to have run in the end:

select count(*)-count(a) as a, 
count(*)-count(b) as b,
count(*)-count(c) as c
from TEST.TEST.some_nulls;
















ABC
364

Count null values in each column of a table : PSQL

You can use an aggregate with a filter:

select count(*) filter (where col1 is null) as col1_nulls,
count(*) filter (where col2 is null) as col2_nulls,
count(*) filter (where col3 is null) as col3_nulls,
count(*) filter (where col4 is null) as col4_nulls
from the_table;

Number of Null and non-null values for each column in SQL for a given table

The reason for the error is that INFORMATION_SCHEMA.COLUMNS does not contain columns dbo and StudentTable, so using QUOTENAME(dbo), QUOTENAME(StudentScore) and WITHIN GROUP(ORDER BY dbo, StudentScore) is an eror.

If you want to count the NULL and NOT NULL values for NULLABLE columns for one specific table, the statement should be:

SELECT @sql = STRING_AGG(
FORMATMESSAGE('
SELECT table_schema = ''%s''
,table_name = ''%s''
,table_col_name = ''%s''
,row_num = COUNT(*)
,row_num_non_nulls = COUNT(%s)
,row_num_nulls = COUNT(*) - COUNT(%s)
FROM %s.%s
',
QUOTENAME(c.TABLE_SCHEMA),
QUOTENAME(c.TABLE_NAME),
QUOTENAME(c.COLUMN_NAME),
QUOTENAME(c.COLUMN_NAME),
QUOTENAME(c.COLUMN_NAME),
QUOTENAME(c.TABLE_SCHEMA),
QUOTENAME(c.TABLE_NAME)
),
' UNION ALL '
) WITHIN GROUP(ORDER BY c.TABLE_SCHEMA, c.TABLE_NAME)
FROM INFORMATION_SCHEMA.COLUMNS c
WHERE
(c.TABLE_SCHEMA = 'dbo') AND
(c.TABLE_NAME = 'StudentScore') AND
(c.IS_NULLABLE = 'YES');

SELECT @sql;
EXEC (@sql);

Count the number of nulls in each column

Try this

declare @Table_Name nvarchar(max), @Columns nvarchar(max), @stmt nvarchar(max)

declare table_cursor cursor local fast_forward for
select
s.name,
stuff(
(
select
', count(case when ' + name +
' is null then 1 else null end) as count_' + name
from sys.columns as c
where c.object_id = s.object_id
for xml path(''), type
).value('data(.)', 'nvarchar(max)')
, 1, 2, '')
from sys.tables as s

open table_cursor
fetch table_cursor into @Table_Name, @Columns

while @@FETCH_STATUS = 0
begin
select @stmt = 'select ''' + @Table_Name + ''' as Table_Name, ' + @Columns + ' from ' + @Table_Name

exec sp_executesql
@stmt = @stmt

fetch table_cursor into @Table_Name, @Columns
end

close table_cursor
deallocate table_cursor

Snowflake count nulls in all columns

But is there really not a way to count nulls in a table with say, over 30 columns? Like I don't want to specify them all by name?

yes exactly that. I don't understand why it's so difficult - it's like 1 line in pandas?

Keypoint here is if something is not provided as "batteries included" then you need to write your own version. It is not so hard as it may look.

Let's say the input table is as follow:

CREATE OR REPLACE TABLE t AS SELECT $1 AS col1, $2 AS col2, $3 AS col3, $4 AS col4 
FROM VALUES (1,2,3,10),(NULL,2,3,10),(NULL,NULL,4,10),(NULL,NULL,NULL,10);

SELECT * FROM t;
/*
+------+------+------+------+
| COL1 | COL2 | COL3 | COL4 |
+------+------+------+------+
| 1 | 2 | 3 | 10 |
| NULL | 2 | 3 | 10 |
| NULL | NULL | 4 | 10 |
| NULL | NULL | NULL | 10 |
+------+------+------+------+
*/

You probably know how to write the query that gives the desired output, but as it was not provided in the question I will use my own version:

WITH cte AS (
SELECT
COUNT(*) AS total_rows
,total_rows - COUNT(col1) AS col1
,total_rows - COUNT(col2) AS col2
,total_rows - COUNT(col3) AS col3
,total_rows - COUNT(col4) AS col4
FROM t
)
SELECT COLUMN_NAME, NULLS_COLUMN_COUNT,SUM(NULLS_COLUMN_COUNT) OVER() AS NULLS_TOTAL_COUNT
FROM cte
UNPIVOT (NULLS_COLUMN_COUNT FOR COLUMN_NAME IN (col1,col2,col3, col4))
ORDER BY COLUMN_NAME;
/*
+-------------+--------------------+-------------------+
| COLUMN_NAME | NULLS_COLUMN_COUNT | NULLS_TOTAL_COUNT |
+-------------+--------------------+-------------------+
| COL1 | 3 | 6 |
| COL2 | 2 | 6 |
| COL3 | 1 | 6 |
| COL4 | 0 | 6 |
+-------------+--------------------+-------------------+
*/

Here we could see that the query is "static" in nature with few moving parts(column_count_list/table_name/column_list):

WITH cte AS (
SELECT
COUNT(*) AS total_rows
<column_count_list>
FROM <table_name>
)
SELECT COLUMN_NAME, NULLS_COLUMN_COUNT,SUM(NULLS_COLUMN_COUNT) OVER() AS NULLS_TOTAL_COUNT
FROM cte
UNPIVOT (NULLS_COLUMN_COUNT FOR COLUMN_NAME IN (<column_list>))
ORDER BY COLUMN_NAME;

Now using the metadata and variables:

-- input
SET sch_name = 'my_schema';
SET tab_name = 't';

SELECT
LISTAGG(c.COLUMN_NAME, ', ') WITHIN GROUP(ORDER BY c.COLUMN_NAME) AS column_list
,ANY_VALUE(c.TABLE_SCHEMA || '.' || c.TABLE_NAME) AS full_table_name
,LISTAGG(REPLACE(SPACE(6) || ',total_rows - COUNT(<col_name>) AS <col_name>'
|| CHAR(13)
, '<col_name>', c.COLUMN_NAME), '')
WITHIN GROUP(ORDER BY COLUMN_NAME) AS column_count_list

,REPLACE(REPLACE(REPLACE(
'WITH cte AS (
SELECT
COUNT(*) AS total_rows
<column_count_list>
FROM <table_name>
)
SELECT COLUMN_NAME, NULLS_COLUMN_COUNT,SUM(NULLS_COLUMN_COUNT) OVER() AS NULLS_TOTAL_COUNT
FROM cte
UNPIVOT (NULLS_COLUMN_COUNT FOR COLUMN_NAME IN (<column_list>))
ORDER BY COLUMN_NAME;'
,'<column_count_list>', column_count_list)
,'<table_name>', full_table_name)
,'<column_list>', column_list) AS query_to_run

FROM INFORMATION_SCHEMA.COLUMNS c
WHERE TABLE_SCHEMA = UPPER($sch_name)
AND TABLE_NAME = UPPER($tab_name);

Running the code will generate the query to be run:

Sample Image

Copying the output and running it will give the output. This template could be further refined and wrapped with stored procedure if needed(but I will left it as an exercise).



Related Topics



Leave a reply



Submit