How do I delete a column that contains only zeros in Pandas?
df.loc[:, (df != 0).any(axis=0)]
Here is a breakdown of how it works:
In [74]: import pandas as pd
In [75]: df = pd.DataFrame([[1,0,0,0], [0,0,1,0]])
In [76]: df
Out[76]:
   0  1  2  3
0  1  0  0  0
1  0  0  1  0
[2 rows x 4 columns]
df != 0 creates a boolean DataFrame which is True where df is nonzero:
In [77]: df != 0
Out[77]:
       0      1      2      3
0   True  False  False  False
1  False  False   True  False
[2 rows x 4 columns]
(df != 0).any(axis=0) returns a boolean Series indicating which columns have nonzero entries. (The any operation aggregates values along the 0-axis, i.e. along the rows, into a single boolean value. Hence the result is one boolean value for each column.)
In [78]: (df != 0).any(axis=0)
Out[78]:
0     True
1    False
2     True
3    False
dtype: bool
And df.loc can be used to select those columns:
In [79]: df.loc[:, (df != 0).any(axis=0)]
Out[79]:
   0  2
0  1  0
1  0  1
[2 rows x 2 columns]
To "delete" the zero-columns, reassign df:
df = df.loc[:, (df != 0).any(axis=0)]
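The steps above can be wrapped in a small helper. This is only a sketch: the function name drop_all_zero_columns is made up for illustration and is not part of pandas. Note that NaN != 0 evaluates to True, so a column holding only NaN would be kept by this test.

```python
import pandas as pd


def drop_all_zero_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return `frame` without the columns whose values are all 0.

    Hypothetical helper name, used here for illustration only.
    """
    # True for every column that has at least one nonzero entry
    keep = (frame != 0).any(axis=0)
    return frame.loc[:, keep]


df = pd.DataFrame([[1, 0, 0, 0], [0, 0, 1, 0]])
result = drop_all_zero_columns(df)
print(result)
```

Because the helper returns a new DataFrame instead of modifying its argument, the original df stays available for comparison.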
How to delete R data.frame columns with only zero values?
One option using dplyr could be:
df %>%
select(where(~ any(. != 0)))
1 0 2 2
2 2 3 5
3 5 0 1
4 7 0 2
5 2 1 3
6 3 0 4
7 0 4 5
8 3 0 6
Drop all columns where all values are zero
If you want to drop columns that contain only 0s, rather than columns whose sum is 0, use df.any:
In [291]: df.T[df.any()].T
Out[291]:
   b
0  0
1 -1
2  0
3  1
Alternatively:
In [296]: df.T[(df != 0).any()].T # or df.loc[:, (df != 0).any()]
Out[296]:
   b
0  0
1 -1
2  0
3  1
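To see why the distinction between "sum is zero" and "all values are zero" matters, here is a small sketch. The input frame is an assumption (the original df is not shown); column b matches the output above, and the all-zero column a is invented for the comparison.

```python
import pandas as pd

# Assumed data: "a" is all zeros, "b" sums to zero but has nonzero values.
df = pd.DataFrame({"a": [0, 0, 0, 0], "b": [0, -1, 0, 1]})

# Filtering on the sum drops both columns, because -1 and 1 cancel out.
by_sum = df.loc[:, df.sum() != 0]

# Filtering on "any nonzero value" drops only the all-zero column.
by_any = df.loc[:, (df != 0).any()]

print(list(by_sum.columns))
print(list(by_any.columns))
```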
How do I delete columns that contain a zero value in Pandas?
Try:
df.loc[:,~df.eq(0).any()]
Or, as suggested by @sammywemmy:
df.loc[:, df.ne(0).all()]
Other possible solutions:
df.mask(df.eq(0)).dropna(axis=1)
#OR
df.drop(columns=df.columns[df.eq(0).any()])
Output of the above code:
    Names  Henry  Jesscia
0  Robert     54        5
1     Dan     22       55
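A self-contained variant is sketched below. The input frame is an assumption reconstructed from the output above, with a made-up column Mike that contains a zero. Unlike the one-liners, this version only tests numeric columns, which is a design choice of this sketch so that text columns such as Names are never compared with 0.

```python
import pandas as pd

# Assumed input; "Mike" is a hypothetical column added so there is
# something to drop.
df = pd.DataFrame({
    "Names": ["Robert", "Dan"],
    "Henry": [54, 22],
    "Mike": [0, 31],
    "Jesscia": [5, 55],
})

# Look for zeros in numeric columns only.
has_zero = df.select_dtypes("number").eq(0).any()

# has_zero[has_zero].index holds the labels of the columns with a zero.
result = df.drop(columns=has_zero[has_zero].index)
print(result)
```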
Remove all columns or rows with only zeros out of a data frame
Using colSums():
df[, colSums(abs(df)) > 0]
i.e. a column has only zeros if and only if the sum of the absolute values is zero.
Remove columns with zero values from a dataframe
You almost have it. Put those two together:
SelectVar[, colSums(SelectVar != 0) > 0]
This works because the factor columns are evaluated as numerics that are >= 1.
Drop columns in Dataframe if more than 90% of the values in the column are 0's
First of all, next time please give an example dataset, not an image or copy of one. It's best to give a minimal example that reproduces your problem (it's also a good way to investigate your problem). This df, for example, will do the trick:
df = pd.DataFrame.from_dict({
'a':[1,0,0,0,0,0,0,0,0,0,0],
'b':[1,1,1,0,1,0,0,0,0,0,0]})
Now, the previous answers help, but if you can avoid a loop, it's preferable. You can write something simpler and more concise that will do the trick:
df.drop(columns=df.columns[df.eq(0).mean()>0.9])
Let's go through it step by step:
df.eq(0) returns True or False in each cell.
The .mean() method treats True as 1 and False as 0, so the mean of each column is the fraction of its values that are zero. Comparing that mean to 0.9 is exactly the test you want.
df.columns[...] returns only the column labels where the > 0.9 condition holds, and drop just drops them.
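Put together with the example frame from this answer, the steps look like the sketch below. The intermediate name zero_share is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "b": [1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0],
})

# Fraction of zeros per column: 10/11 for "a", 7/11 for "b".
zero_share = df.eq(0).mean()

# Drop the columns where more than 90% of the values are zero.
result = df.drop(columns=df.columns[zero_share > 0.9])
print(list(result.columns))
```

Since drop returns a new DataFrame here, assign the result back to df if you want the change to stick.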
Remove Column if all values are either NA or 0 in r
Multiple ways to do this
df[colSums(is.na(df) | df == 0) != nrow(df)]
#  rv X1  X2 X4
#1  M  0 110  1
#2  J 70 200  3
#3  J NA 115  4
#4  M 65 110  9
#5  J 70 200  3
#6  J 64 115  8
Using apply
df[!apply(is.na(df) | df == 0, 2, all)]
Or using dplyr
library(dplyr)
df %>% select_if(~!all(is.na(.) | . == 0))
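For readers who need the same filter in pandas rather than R, a rough analogue might look like this. The data is made up for illustration and the column names are hypothetical; this is not a translation of the R answers above, just the same idea.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "x3" contains only NaN or 0, "x1" has one real value.
df = pd.DataFrame({
    "x1": [0, 70, np.nan],
    "x2": [110, 200, 115],
    "x3": [0, np.nan, 0],
})

# True for a column when every cell is either missing or zero.
all_na_or_zero = (df.isna() | df.eq(0)).all()

result = df.loc[:, ~all_na_or_zero]
print(list(result.columns))
```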
Best way to remove all columns and rows with zero sum from a pandas dataframe
Here we go:
import numpy as np
import pandas as pd

ad = np.array([[1, 0, 1, 0, 1],
               [0, 0, 0, 0, 0],
               [1, 1, 1, 0, 1],
               [0, 1, 1, 0, 1],
               [1, 1, 0, 0, 0],
               [0, 0, 0, 0, 0],
               [0, 0, 1, 0, 1]])
df = pd.DataFrame(ad)
# drop the rows whose sum is zero, then the columns whose sum is zero
df.drop(df.loc[df.sum(axis=1)==0].index, inplace=True)
df.drop(columns=df.columns[df.sum()==0], inplace=True)
The code above drops a row or column when its sum is zero. This is achieved by calculating the sum along axis 1 for rows and axis 0 for columns, and then dropping the rows and columns whose sum is 0 with df.drop(...).
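If you would rather not modify df in place, the same selection can probably be written as a single .loc call with two boolean masks. This is a sketch of an alternative, not the approach above. It uses the same array and, like the original, assumes the entries are nonnegative so that a zero sum really means all zeros.

```python
import numpy as np
import pandas as pd

ad = np.array([[1, 0, 1, 0, 1],
               [0, 0, 0, 0, 0],
               [1, 1, 1, 0, 1],
               [0, 1, 1, 0, 1],
               [1, 1, 0, 0, 0],
               [0, 0, 0, 0, 0],
               [0, 0, 1, 0, 1]])
df = pd.DataFrame(ad)

# Keep rows with a nonzero row sum and columns with a nonzero column sum.
result = df.loc[df.sum(axis=1) != 0, df.sum(axis=0) != 0]
print(result)
```

Both masks are computed from the original frame, so the row filter does not affect the column sums.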