How to Find Which Columns Don't Have Any Data (All Values Are Null)

How to find which columns don't have any data (all values are NULL)?

For a single column, count(ColumnName) returns the number of rows where ColumnName is not null:

select  count(TheColumn)
from YourTable

You can generate a query for all columns. Per Martin's suggestion, you can exclude columns that cannot be null by filtering on is_nullable = 1. For example:

select  'count(' + name + ') as ' + name + ', '
from sys.columns
where object_id = object_id('YourTable')
and is_nullable = 1

If the number of tables is large, you can generate a query for all tables in a similar way. The list of all tables is in sys.tables.
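
If you would rather run the generated query as well, here is a minimal Python sketch using pyodbc (the DSN and YourTable names are placeholders): it fetches the nullable columns, stitches the generated count() pieces into a single query, and reports the columns whose count is zero, i.e. the all-NULL ones.

import pyodbc

conn = pyodbc.connect("DSN=YourDsn")  # placeholder connection string
cursor = conn.cursor()

# Nullable columns of the target table, as in the query above
cursor.execute("""
    select name
    from sys.columns
    where object_id = object_id('YourTable')
      and is_nullable = 1
""")
cols = [row[0] for row in cursor.fetchall()]

# Stitch the generated pieces into one runnable query (no trailing comma)
query = "select " + ", ".join(f"count([{c}]) as [{c}]" for c in cols) + " from YourTable"
counts = cursor.execute(query).fetchone()

# count() skips NULLs, so a count of 0 means the column has no data at all
print([c for c, n in zip(cols, counts) if n == 0])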

Select columns with NULL values only

Here is a version for SQL Server 2005 or later. Replace ADDR_Address with your table name.

declare @col varchar(255), @cmd varchar(max)

-- Cursor over every column of the target table
DECLARE getinfo CURSOR FOR
SELECT c.name
FROM sys.tables t
JOIN sys.columns c ON t.object_id = c.object_id
WHERE t.name = 'ADDR_Address'

OPEN getinfo

FETCH NEXT FROM getinfo INTO @col

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Print the column name if no row has a non-NULL value in that column
    SELECT @cmd = 'IF NOT EXISTS (SELECT TOP 1 * FROM ADDR_Address WHERE [' + @col + '] IS NOT NULL) BEGIN PRINT ''' + @col + ''' END'
    EXEC(@cmd)

    FETCH NEXT FROM getinfo INTO @col
END

CLOSE getinfo
DEALLOCATE getinfo
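
The same per-column check can also be driven from client code instead of a T-SQL cursor. A sketch, again assuming the pyodbc connection conn from the earlier example (the NOT EXISTS probe becomes a check for an empty result set):

cursor = conn.cursor()
cursor.execute("""
    SELECT c.name
    FROM sys.tables t
    JOIN sys.columns c ON t.object_id = c.object_id
    WHERE t.name = 'ADDR_Address'
""")
columns = [row[0] for row in cursor.fetchall()]

for col in columns:
    # TOP 1 stops at the first non-NULL value, so columns with data are cheap;
    # only all-NULL columns force a full scan
    cursor.execute(f"SELECT TOP 1 1 FROM ADDR_Address WHERE [{col}] IS NOT NULL")
    if cursor.fetchone() is None:
        print(col)  # this column holds NULLs only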

How to find which columns contain any NaN value in Pandas dataframe

UPDATE: using Pandas 0.22.0

Newer Pandas versions have the methods DataFrame.isna() and DataFrame.notna():

In [71]: df
Out[71]:
     a    b  c
0  NaN  7.0  0
1  0.0  NaN  4
2  2.0  NaN  4
3  1.0  7.0  0
4  1.0  3.0  9
5  7.0  4.0  9
6  2.0  6.0  9
7  9.0  6.0  4
8  3.0  0.0  9
9  9.0  0.0  1

In [72]: df.isna().any()
Out[72]:
a     True
b     True
c    False
dtype: bool

As a list of columns:

In [74]: df.columns[df.isna().any()].tolist()
Out[74]: ['a', 'b']

To select those columns (containing at least one NaN value):

In [73]: df.loc[:, df.isna().any()]
Out[73]:
     a    b
0  NaN  7.0
1  0.0  NaN
2  2.0  NaN
3  1.0  7.0
4  1.0  3.0
5  7.0  4.0
6  2.0  6.0
7  9.0  6.0
8  3.0  0.0
9  9.0  0.0
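
Note the distinction: .any() flags columns with at least one NaN. For the all-values-missing case this page opened with, use .all() instead:

In [75]: df.columns[df.isna().all()].tolist()
Out[75]: []

(Empty here, because no column of this sample frame is entirely NaN.)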

OLD answer:

Try using isnull():

In [97]: df
Out[97]:
     a    b  c
0  NaN  7.0  0
1  0.0  NaN  4
2  2.0  NaN  4
3  1.0  7.0  0
4  1.0  3.0  9
5  7.0  4.0  9
6  2.0  6.0  9
7  9.0  6.0  4
8  3.0  0.0  9
9  9.0  0.0  1

In [98]: pd.isnull(df).sum() > 0
Out[98]:
a     True
b     True
c    False
dtype: bool

Or, as @root proposed, a clearer version:

In [5]: df.isnull().any()
Out[5]:
a     True
b     True
c    False
dtype: bool

In [7]: df.columns[df.isnull().any()].tolist()
Out[7]: ['a', 'b']

To select a subset (all columns containing at least one NaN value):

In [31]: df.loc[:, df.isnull().any()]
Out[31]:
     a    b
0  NaN  7.0
1  0.0  NaN
2  2.0  NaN
3  1.0  7.0
4  1.0  3.0
5  7.0  4.0
6  2.0  6.0
7  9.0  6.0
8  3.0  0.0
9  9.0  0.0

SQL query to find columns having at least one non null value

I would not recommend using count(distinct) because it incurs overhead for removing duplicate values. You can just use count().

You can construct the query for counts using a query like this:

select count(col1) as col1_cnt, count(col2) as col2_cnt, . . .
from t;

If you have a list of columns, you can do this as dynamic SQL. Something like this:

declare @list nvarchar(max) = 'col1,col2,col3';  -- your comma-separated column list
declare @sql nvarchar(max);

select @sql = concat('select ',
                     string_agg(concat('count(', quotename(s.value), ') as cnt_', s.value), ', '),
                     ' from t')
from string_split(@list, ',') s;

exec sp_executesql @sql;

This might not quite work if your columns have special characters in them, but it illustrates the idea.
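
For comparison, a hypothetical Python sketch that assembles the same counts query from a plain column list (all names are placeholders):

columns = ["col1", "col2", "col3"]   # your column list
table = "t"                          # your table name

select_list = ", ".join(
    f"count([{c}]) as cnt_{c}" for c in columns  # [] plays the role of quotename()
)
print(f"select {select_list} from {table};")
# select count([col1]) as cnt_col1, count([col2]) as cnt_col2, count([col3]) as cnt_col3 from t;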

Find all those columns which have only null values, in a MySQL table

You can avoid using a procedure by dynamically creating (from the INFORMATION_SCHEMA.COLUMNS table) a string that contains the SQL you wish to execute, then preparing a statement from that string and executing it.

The SQL we wish to build will look like:

SELECT * FROM (
SELECT 'tableA' AS `table`,
IF(COUNT(`column_a`), NULL, 'column_a') AS `column`
FROM tableA
UNION ALL
SELECT 'tableB' AS `table`,
IF(COUNT(`column_b`), NULL, 'column_b') AS `column`
FROM tableB
UNION ALL
-- etc.
) t WHERE `column` IS NOT NULL

This can be done using the following:

SET group_concat_max_len = 4294967295; -- to overcome default 1KB limitation

SELECT CONCAT(
'SELECT * FROM ('
, GROUP_CONCAT(
'SELECT ', QUOTE(TABLE_NAME), ' AS `table`,'
, 'IF('
, 'COUNT(`', REPLACE(COLUMN_NAME, '`', '``'), '`),'
, 'NULL,'
, QUOTE(COLUMN_NAME)
, ') AS `column` '
, 'FROM `', REPLACE(TABLE_NAME, '`', '``'), '`'
SEPARATOR ' UNION ALL '
)
, ') t WHERE `column` IS NOT NULL'
)
INTO @sql
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = DATABASE();

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
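
The build-and-execute step can also be driven from client code. A sketch using mysql-connector-python (connection parameters are placeholders; identifiers containing backticks would still need the REPLACE escaping shown above), trading the single UNION ALL statement for one small COUNT query per column:

import mysql.connector

conn = mysql.connector.connect(user="user", password="pw", database="mydb")
cur = conn.cursor()

# Every column of the current schema, from the same catalog view
cur.execute("""
    SELECT TABLE_NAME, COLUMN_NAME
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_SCHEMA = DATABASE()
""")

for table, column in cur.fetchall():
    # COUNT(col) ignores NULLs: a count of 0 means the column is all NULL
    cur.execute(f"SELECT COUNT(`{column}`) FROM `{table}`")
    if cur.fetchone()[0] == 0:
        print(table, column)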


Pandas select all columns without NaN

You can create a DataFrame containing only the non-NaN columns using

df = df[df.columns[~df.isnull().all()]]

Or

null_cols = df.columns[df.isnull().all()]
df.drop(null_cols, axis = 1, inplace = True)

If you wish to remove columns based on a certain percentage of NaNs, say columns with more than 90% of their values null:

cols_to_delete = df.columns[df.isnull().sum()/len(df) > .90]
df.drop(cols_to_delete, axis = 1, inplace = True)
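
A quick demonstration on a toy frame (the values are invented for illustration):

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],           # complete column
    "b": [np.nan, np.nan, np.nan],  # all NaN
    "c": [np.nan, np.nan, 3.0],     # ~67% NaN
})

# Drop only the all-NaN column
print(df[df.columns[~df.isnull().all()]].columns.tolist())  # ['a', 'c']

# Drop columns that are more than 90% NaN ('b' again; 'c' survives)
cols_to_delete = df.columns[df.isnull().sum() / len(df) > 0.90]
print(df.drop(cols_to_delete, axis=1).columns.tolist())     # ['a', 'c']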

how to select rows with no null values (in any column) in SQL?

You need to explicitly list each column. I would recommend:

select t.*
from t
where col1 is not null and col2 is not null and . . .

Some people might prefer a more concise (but slower) method such as:

where concat(col1, col2, col3, . . . ) is not null

There is not actually a simple way to express this, although you can construct the query using a metadata table or a spreadsheet.
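
A hypothetical sketch of that construction step in Python (the column list is a placeholder you would pull from your metadata table or spreadsheet):

columns = ["col1", "col2", "col3"]
where = " and ".join(f"{c} is not null" for c in columns)
print(f"select t.* from t where {where};")
# select t.* from t where col1 is not null and col2 is not null and col3 is not null;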
