Delete Columns Where All Values Are 0

How do I delete a column that contains only zeros in Pandas?

df.loc[:, (df != 0).any(axis=0)]

Here is a break-down of how it works:

In [74]: import pandas as pd

In [75]: df = pd.DataFrame([[1,0,0,0], [0,0,1,0]])

In [76]: df
Out[76]:
   0  1  2  3
0  1  0  0  0
1  0  0  1  0

[2 rows x 4 columns]

df != 0 creates a boolean DataFrame which is True where df is nonzero:

In [77]: df != 0
Out[77]:
       0      1      2      3
0   True  False  False  False
1  False  False   True  False

[2 rows x 4 columns]

(df != 0).any(axis=0) returns a boolean Series indicating which columns have nonzero entries. (The any operation aggregates values along the 0-axis -- i.e. along the rows -- into a single boolean value. Hence the result is one boolean value for each column.)

In [78]: (df != 0).any(axis=0)
Out[78]:
0     True
1    False
2     True
3    False
dtype: bool

And df.loc can be used to select those columns:

In [79]: df.loc[:, (df != 0).any(axis=0)]
Out[79]:
   0  2
0  1  0
1  0  1

[2 rows x 2 columns]

To "delete" the zero-columns, reassign df:

df = df.loc[:, (df != 0).any(axis=0)]
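The steps above can be wrapped into a small helper. This is only a sketch: the function name drop_all_zero_columns is made up for illustration and is not part of pandas.

```python
import pandas as pd


def drop_all_zero_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `frame` without the columns whose values are all 0.

    Hypothetical helper, not a pandas function.
    """
    # True for every column that has at least one nonzero entry.
    keep = (frame != 0).any(axis=0)
    return frame.loc[:, keep].copy()


df = pd.DataFrame([[1, 0, 0, 0], [0, 0, 1, 0]])
result = drop_all_zero_columns(df)
print(result.columns.tolist())
```

Returning a copy leaves the original df untouched, so the caller decides whether to reassign it.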

How to delete R data.frame columns with only zero values?

One option using dplyr could be:

df %>%
  select(where(~ any(. != 0)))

1 0 2 2
2 2 3 5
3 5 0 1
4 7 0 2
5 2 1 3
6 3 0 4
7 0 4 5
8 3 0 6

Drop all columns where all values are zero

If it's a matter of 0s and not sum, use df.any:

In [291]: df.T[df.any()].T
Out[291]:
   b
0  0
1 -1
2  0
3  1

Alternatively:

In [296]: df.T[(df != 0).any()].T # or df.loc[:, (df != 0).any()]
Out[296]:
   b
0  0
1 -1
2  0
3  1
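The two variants agree on plain numeric data, but they can differ when a column contains missing values. A small sketch with a made-up frame (the column names "a" and "b" are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "a" holds only a zero and a missing value.
df = pd.DataFrame({"a": [0.0, np.nan], "b": [0.0, 1.0]})

# df.any() skips NaN by default (skipna=True), so "a" counts as all-zero.
by_truthiness = df.any()

# NaN != 0 evaluates to True, so "a" counts as having a nonzero entry.
by_comparison = (df != 0).any()

print(by_truthiness.tolist())
print(by_comparison.tolist())
```

Which behaviour is preferable depends on whether a missing value should count as "not zero" in your data.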

How do I delete columns that contain a zero value in Pandas?

Try:

df.loc[:,~df.eq(0).any()]

OR

as suggested by @sammywemmy

df.loc[:, df.ne(0).all()]

Other possible solutions:

df.mask(df.eq(0)).dropna(axis=1)
# OR
df.drop(df.columns[df.eq(0).any()], axis=1)

output of above code:

    Names  Henry  Jesscia
0  Robert     54        5
1     Dan     22       55
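The question's input frame is not shown here, so the following is only a sketch with an assumed input: the columns "Zed" and "Amy" are invented, each containing one zero. It checks that the variants keep the same columns.

```python
import pandas as pd

# Assumed input; "Zed" and "Amy" are hypothetical columns that contain a 0.
df = pd.DataFrame({
    "Names": ["Robert", "Dan"],
    "Henry": [54, 22],
    "Zed": [0, 7],
    "Jesscia": [5, 55],
    "Amy": [3, 0],
})

# Keep columns that have no zero anywhere.
kept_a = df.loc[:, ~df.eq(0).any()]
# Same idea, phrased as "every value differs from zero".
kept_b = df.loc[:, df.ne(0).all()]
# Drop by label instead of selecting by mask.
kept_c = df.drop(columns=df.columns[df.eq(0).any()])

print(kept_a.columns.tolist())
```

The mask/dropna variant is left out of the comparison because it also removes columns that already contained missing values.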

Remove all columns or rows with only zeros out of a data frame

Using colSums():

df[, colSums(abs(df)) > 0]

i.e. a column has only zeros if and only if the sum of the absolute values is zero.

Remove columns with zero values from a dataframe

You almost have it. Put those two together:

 SelectVar[, colSums(SelectVar != 0) > 0]

This works because the factor columns are evaluated as numerics that are >= 1.

Drop columns in Dataframe if more than 90% of the values in the column are 0's

First of all, next time please give an example dataset, not an image or copy of one. It's best to give a minimal example that reproduces your problem (it's also a good way to investigate your problem). This df, for example, will do the trick:

df = pd.DataFrame.from_dict({
    'a': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    'b': [1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0]})

Now, the previous answers help, but if you can avoid a loop, it's preferable. You can write something simpler and more concise that will do the trick:

df.drop(columns=df.columns[df.eq(0).mean()>0.9])

Let's go through it step by step:

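df.eq(0) returns True or False for each cell, depending on whether the value equals 0.

The .mean() method treats True as 1 and False as 0, so the mean of each column is the fraction of zeros in that column. Comparing that fraction to 0.9 is exactly the test you want.

Indexing df.columns[...] with the resulting boolean Series returns only the column labels for which >0.9 holds, and drop removes those columns.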
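Spelled out with intermediate variables, the one-liner looks roughly like this. The variable names is_zero, zero_share and to_drop are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "b": [1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0],
})

is_zero = df.eq(0)                       # boolean frame, True where the cell is 0
zero_share = is_zero.mean()              # fraction of zeros: a is 10/11, b is 7/11
to_drop = df.columns[zero_share > 0.9]   # labels of columns above the threshold
result = df.drop(columns=to_drop)

print(result.columns.tolist())
```

Only "a" exceeds the 90% threshold here, so only "b" remains.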

Remove Column if all values are either NA or 0 in r

Multiple ways to do this

df[colSums(is.na(df) | df == 0) != nrow(df)]

#   rv X1  X2 X4
# 1  M  0 110  1
# 2  J 70 200  3
# 3  J NA 115  4
# 4  M 65 110  9
# 5  J 70 200  3
# 6  J 64 115  8

Using apply

df[!apply(is.na(df) | df == 0, 2, all)]

Or using dplyr

library(dplyr)
df %>% select_if(~!all(is.na(.) | . == 0))
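For readers working in pandas rather than R, the same "all values are NA or 0" test can be sketched as follows. The frame and its column names x1, x2, x3 are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical data: only "x1" has a value that is neither missing nor zero.
df = pd.DataFrame({
    "x1": [0.0, 70.0, np.nan],
    "x2": [0.0, np.nan, 0.0],
    "x3": [np.nan, np.nan, np.nan],
})

# True for columns in which every cell is either missing or equal to 0.
empty = (df.isna() | df.eq(0)).all()
result = df.loc[:, ~empty]

print(result.columns.tolist())
```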

Best way to remove all columns and rows with zero sum from a pandas dataframe

here we go:

import numpy as np
import pandas as pd

ad = np.array([[1, 0, 1, 0, 1],
               [0, 0, 0, 0, 0],
               [1, 1, 1, 0, 1],
               [0, 1, 1, 0, 1],
               [1, 1, 0, 0, 0],
               [0, 0, 0, 0, 0],
               [0, 0, 1, 0, 1]])

df = pd.DataFrame(ad)

# Drop rows whose sum is zero, then columns whose sum is zero.
df.drop(df.loc[df.sum(axis=1)==0].index, inplace=True)
df.drop(columns=df.columns[df.sum()==0], inplace=True)

The code above drops a row or column when its sum is zero. This is achieved by calculating the sum along axis 1 for rows and axis 0 for columns, and then dropping every row or column with a sum of 0 (df.drop(...)).
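One caveat: a zero sum and "all values are zero" are the same thing only when the data has no negative entries. A small sketch with an invented frame shows the difference; it selects rows and columns in one step rather than dropping in place:

```python
import pandas as pd

# Hypothetical data: column "p" sums to zero without being all zeros.
df = pd.DataFrame({"p": [1, -1, 0], "q": [0, 0, 0], "r": [2, 0, 0]})

# Sum-based selection: keeps rows/columns whose sum is not zero.
by_sum = df.loc[df.sum(axis=1) != 0, df.sum(axis=0) != 0]

# Value-based selection: keeps rows/columns with at least one nonzero cell.
nonzero = df != 0
by_value = df.loc[nonzero.any(axis=1), nonzero.any(axis=0)]

print(by_sum.columns.tolist())
print(by_value.columns.tolist())
```

With counts or other non-negative data, both selections give the same result.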


