Count NULL Values from multiple columns with SQL
SELECT COUNT(*)-COUNT(A) As A, COUNT(*)-COUNT(B) As B, COUNT(*)-COUNT(C) As C
FROM YourTable;
Count number of NULL values in each column in SQL
As Paolo said, but here is an example:
DECLARE @TableName VARCHAR(512) = 'invoiceTbl';
DECLARE @SQL VARCHAR(1024);
WITH SQLText AS (
SELECT
ROW_NUMBER() OVER (ORDER BY c.Name) AS RowNum,
'SELECT ''' + c.name + ''', SUM(CASE WHEN ' + c.Name + ' IS NULL THEN 1 ELSE 0 END) AS NullValues FROM ' + @TableName AS SQLRow
FROM
sys.tables t
INNER JOIN sys.columns c ON c.object_id = t.object_id
WHERE
t.name = @TableName),
Recur AS (
SELECT
RowNum,
CONVERT(VARCHAR(MAX), SQLRow) AS SQLRow
FROM
SQLText
WHERE
RowNum = 1
UNION ALL
SELECT
t.RowNum,
CONVERT(VARCHAR(MAX), r.SQLRow + ' UNION ALL ' + t.SQLRow)
FROM
SQLText t
INNER JOIN Recur r ON t.RowNum = r.RowNum + 1
)
SELECT @SQL = SQLRow FROM Recur WHERE RowNum = (SELECT MAX(RowNum) FROM Recur);
EXEC(@SQL);
Snowflake: Count number of NULL values in each column SQL
some setup:
create table test.test.some_nulls(a string, b int, c boolean) as
select * from values
('a', 1, true),
('b', 2, false),
('c', 3, null),
('d', null, true),
('e', null, false),
('f', null, null),
(null, 7, true),
('h', 8, false),
(null, 9, null),
('j', null, true),
(null, null, false),
('l', null, null);
and:
SET DBName = 'TEST';
SET SchemaName = 'TEST';
SET TableName = upper('some_nulls');
this is the sql we are wanting to have run in the end:
select count(*)-count(a) as a,
count(*)-count(b) as b,
count(*)-count(c) as c
from TEST.TEST.some_nulls;
A | B | C |
---|---|---|
3 | 6 | 4 |
Count null values in each column of a table : PSQL
You can use an aggregate with a filter:
select count(*) filter (where col1 is null) as col1_nulls,
count(*) filter (where col2 is null) as col2_nulls,
count(*) filter (where col3 is null) as col3_nulls,
count(*) filter (where col4 is null) as col4_nulls
from the_table;
Number of Null and non-null values for each column in SQL for a given table
The reason for the error is that INFORMATION_SCHEMA.COLUMNS
does not contain columns dbo
and StudentTable
, so using QUOTENAME(dbo)
, QUOTENAME(StudentScore)
and WITHIN GROUP(ORDER BY dbo, StudentScore)
is an eror.
If you want to count the NULL
and NOT NULL
values for NULLABLE
columns for one specific table, the statement should be:
SELECT @sql = STRING_AGG(
FORMATMESSAGE('
SELECT table_schema = ''%s''
,table_name = ''%s''
,table_col_name = ''%s''
,row_num = COUNT(*)
,row_num_non_nulls = COUNT(%s)
,row_num_nulls = COUNT(*) - COUNT(%s)
FROM %s.%s
',
QUOTENAME(c.TABLE_SCHEMA),
QUOTENAME(c.TABLE_NAME),
QUOTENAME(c.COLUMN_NAME),
QUOTENAME(c.COLUMN_NAME),
QUOTENAME(c.COLUMN_NAME),
QUOTENAME(c.TABLE_SCHEMA),
QUOTENAME(c.TABLE_NAME)
),
' UNION ALL '
) WITHIN GROUP(ORDER BY c.TABLE_SCHEMA, c.TABLE_NAME)
FROM INFORMATION_SCHEMA.COLUMNS c
WHERE
(c.TABLE_SCHEMA = 'dbo') AND
(c.TABLE_NAME = 'StudentScore') AND
(c.IS_NULLABLE = 'YES');
SELECT @sql;
EXEC (@sql);
Count the number of nulls in each column
Try this
declare @Table_Name nvarchar(max), @Columns nvarchar(max), @stmt nvarchar(max)
declare table_cursor cursor local fast_forward for
select
s.name,
stuff(
(
select
', count(case when ' + name +
' is null then 1 else null end) as count_' + name
from sys.columns as c
where c.object_id = s.object_id
for xml path(''), type
).value('data(.)', 'nvarchar(max)')
, 1, 2, '')
from sys.tables as s
open table_cursor
fetch table_cursor into @Table_Name, @Columns
while @@FETCH_STATUS = 0
begin
select @stmt = 'select ''' + @Table_Name + ''' as Table_Name, ' + @Columns + ' from ' + @Table_Name
exec sp_executesql
@stmt = @stmt
fetch table_cursor into @Table_Name, @Columns
end
close table_cursor
deallocate table_cursor
Snowflake count nulls in all columns
But is there really not a way to count nulls in a table with say, over 30 columns? Like I don't want to specify them all by name?
yes exactly that. I don't understand why it's so difficult - it's like 1 line in pandas?
Keypoint here is if something is not provided as "batteries included" then you need to write your own version. It is not so hard as it may look.
Let's say the input table is as follow:
CREATE OR REPLACE TABLE t AS SELECT $1 AS col1, $2 AS col2, $3 AS col3, $4 AS col4
FROM VALUES (1,2,3,10),(NULL,2,3,10),(NULL,NULL,4,10),(NULL,NULL,NULL,10);
SELECT * FROM t;
/*
+------+------+------+------+
| COL1 | COL2 | COL3 | COL4 |
+------+------+------+------+
| 1 | 2 | 3 | 10 |
| NULL | 2 | 3 | 10 |
| NULL | NULL | 4 | 10 |
| NULL | NULL | NULL | 10 |
+------+------+------+------+
*/
You probably know how to write the query that gives the desired output, but as it was not provided in the question I will use my own version:
WITH cte AS (
SELECT
COUNT(*) AS total_rows
,total_rows - COUNT(col1) AS col1
,total_rows - COUNT(col2) AS col2
,total_rows - COUNT(col3) AS col3
,total_rows - COUNT(col4) AS col4
FROM t
)
SELECT COLUMN_NAME, NULLS_COLUMN_COUNT,SUM(NULLS_COLUMN_COUNT) OVER() AS NULLS_TOTAL_COUNT
FROM cte
UNPIVOT (NULLS_COLUMN_COUNT FOR COLUMN_NAME IN (col1,col2,col3, col4))
ORDER BY COLUMN_NAME;
/*
+-------------+--------------------+-------------------+
| COLUMN_NAME | NULLS_COLUMN_COUNT | NULLS_TOTAL_COUNT |
+-------------+--------------------+-------------------+
| COL1 | 3 | 6 |
| COL2 | 2 | 6 |
| COL3 | 1 | 6 |
| COL4 | 0 | 6 |
+-------------+--------------------+-------------------+
*/
Here we could see that the query is "static" in nature with few moving parts(column_count_list/table_name/column_list):
WITH cte AS (
SELECT
COUNT(*) AS total_rows
<column_count_list>
FROM <table_name>
)
SELECT COLUMN_NAME, NULLS_COLUMN_COUNT,SUM(NULLS_COLUMN_COUNT) OVER() AS NULLS_TOTAL_COUNT
FROM cte
UNPIVOT (NULLS_COLUMN_COUNT FOR COLUMN_NAME IN (<column_list>))
ORDER BY COLUMN_NAME;
Now using the metadata and variables:
-- input
SET sch_name = 'my_schema';
SET tab_name = 't';
SELECT
LISTAGG(c.COLUMN_NAME, ', ') WITHIN GROUP(ORDER BY c.COLUMN_NAME) AS column_list
,ANY_VALUE(c.TABLE_SCHEMA || '.' || c.TABLE_NAME) AS full_table_name
,LISTAGG(REPLACE(SPACE(6) || ',total_rows - COUNT(<col_name>) AS <col_name>'
|| CHAR(13)
, '<col_name>', c.COLUMN_NAME), '')
WITHIN GROUP(ORDER BY COLUMN_NAME) AS column_count_list
,REPLACE(REPLACE(REPLACE(
'WITH cte AS (
SELECT
COUNT(*) AS total_rows
<column_count_list>
FROM <table_name>
)
SELECT COLUMN_NAME, NULLS_COLUMN_COUNT,SUM(NULLS_COLUMN_COUNT) OVER() AS NULLS_TOTAL_COUNT
FROM cte
UNPIVOT (NULLS_COLUMN_COUNT FOR COLUMN_NAME IN (<column_list>))
ORDER BY COLUMN_NAME;'
,'<column_count_list>', column_count_list)
,'<table_name>', full_table_name)
,'<column_list>', column_list) AS query_to_run
FROM INFORMATION_SCHEMA.COLUMNS c
WHERE TABLE_SCHEMA = UPPER($sch_name)
AND TABLE_NAME = UPPER($tab_name);
Running the code will generate the query to be run:
Copying the output and running it will give the output. This template could be further refined and wrapped with stored procedure if needed(but I will left it as an exercise).
Related Topics
What Is the SQL for 'Next' and 'Previous' in a Table
Calculate Delta(Difference of Current and Previous Row) in SQL
Merging Date Intervals in SQL Server
Does Oracle Roll Back the Transaction on an Error
Update and Select in One Query
Update Values in Struct Arrays in Bigquery
Window Functions: Partition by One Column After Order by Another
How to Deal with Single Quote in Word Vba SQL Query
Grant Privileges on Future Tables in Postgresql
Selecting Specific Row Number in SQL
Teradata Equivalent for Lead and Lag Function of Oracle
How to Transform Rows to Columns
Combining Union All and Order by in Firebird
Using Dynamic SQL to Specify a Column Name by Adding a Variable to Simple SQL Query