How to Delete Rows Where All the Columns Are Zero

How to delete rows where all the columns are zero

You can use (1)

dat[as.logical(rowSums(dat != 0)), ]

This works for both positive and negative values.

Another, even faster, possibility for large datasets is (2)

dat[rowSums(!as.matrix(dat)) < ncol(dat), ]

A faster approach for short and long data frames is to use matrix multiplication (3):

dat[as.logical(abs(as.matrix(dat)) %*% rep(1L, ncol(dat))), ]

Some benchmarks:

# the original dataset
dat <- data.frame(a = c(0,0,2,3), b= c(1,0,0,0), c=c(0,0,1,3))

Codoremifa <- function() dat[rowSums(abs(dat)) != 0,]
Marco <- function() dat[!apply(dat, 1, function(x) all(x == 0)), ]
Sven <- function() dat[as.logical(rowSums(dat != 0)), ]
Sven_2 <- function() dat[rowSums(!as.matrix(dat)) < ncol(dat), ]
Sven_3 <- function() dat[as.logical(abs(as.matrix(dat)) %*% rep(1L,ncol(dat))), ]

library(microbenchmark)
microbenchmark(Codoremifa(), Marco(), Sven(), Sven_2(), Sven_3())
# Unit: microseconds
#          expr     min       lq   median       uq     max neval
#  Codoremifa() 267.772 273.2145 277.1015 284.0995 1190.197   100
#       Marco() 192.509 198.4190 201.2175 208.9925  265.594   100
#        Sven() 143.372 147.7260 150.0585 153.9455  227.031   100
#      Sven_2() 152.080 155.1900 156.9000 161.5650  214.591   100
#      Sven_3() 146.793 151.1460 153.3235 157.9885  187.845   100


# a data frame with 10.000 rows
set.seed(1)
dat <- dat[sample(nrow(dat), 10000, TRUE), ]
microbenchmark(Codoremifa(), Marco(), Sven(), Sven_2(), Sven_3())
# Unit: milliseconds
#          expr       min        lq    median        uq        max neval
#   Codoremifa()  2.426419  2.471204  3.488017  3.750189  84.268432   100
#        Marco() 36.268766 37.840246 39.406751 40.791321 119.233175   100
#         Sven()  2.145587  2.184150  2.205299  2.270764  83.055534   100
#       Sven_2()  2.007814  2.048711  2.077167  2.207942  84.944856   100
#       Sven_3()  1.814994  1.844229  1.861022  1.917779   4.452892   100

How to remove rows where all columns are zero using data.table

You can try with rowSums -

library(data.table)
setDT(dat)
dat[rowSums(dat != 0) != 0]

#   a b c
#1: 0 1 0
#2: 2 0 1
#3: 3 0 3

How to remove rows from a DataFrame where some columns only have zero values

try this,

df[~df[list('cdef')].eq(0).all(axis = 1)]

    a   b   c   d   e   f
0   1   2   3   4   5   6
1  11  22  33  44  55  66

Delete rows with all the zeros elements in all columns exceptionally leaving a single non zero column in pandas DF

Use iloc for select all columns without first, comapre for not equal by ne and test at least one True per rows by any for filter by boolean indexing, last reset_index:

df = df[df.iloc[:, 1:].ne(0).any(axis=1)].reset_index(drop=True)

Alternative with remove column Time:

df = df[df.drop('Time', axis=1).ne(0).any(axis=1)].reset_index(drop=True)

print (df)
   Time  a  b  c  d  e
0     2  1  2  0  0  0
1     4  5  0  0  0  0
2     6  7  0  0  0  0

How to remove rows where all columns are zero using dplyr pipe

Here's a dplyr option:

library(dplyr)
filter_all(dat, any_vars(. != 0))

#       A-XXX  fBM-XXX    P-XXX  vBM-XXX
#1 1.51653276 2.228752 1.733567 3.003979
#2 0.07703724 0.000000 0.000000 0.000000

Here we make use of the logic that if any variable is not equal to zero, we will keep it. It's the same as removing rows where all variables are equal to zero.

Regarding row.names:

library(tidyverse)
dat %>% rownames_to_column() %>% filter_at(vars(-rowname), any_vars(. != 0))
#         rowname      A-XXX  fBM-XXX    P-XXX  vBM-XXX
#1  BATF::JUN_AHR 1.51653276 2.228752 1.733567 3.003979
#2 BATF::JUN_CCR9 0.07703724 0.000000 0.000000 0.000000

Drop rows with all zeros in pandas data frame

It turns out this can be nicely expressed in a vectorized fashion:

> df = pd.DataFrame({'a':[0,0,1,1], 'b':[0,1,0,1]})
> df = df[(df.T != 0).any()]
> df
   a  b
1  0  1
2  1  0
3  1  1

How to delete rows where only specific columns are zero in a list in R?

Assume your dataframe is DF

DF <- DF[DF$RC + DF$medRC + DF$RC.norm + DF$medRC.norm != 0,]

Edit

For list of dataframe

# dummy data
listDF <- list(DF1 = data.frame(RC=c(0,1,2,3),
            medRC=c(0,3,4,0),
            RC.norm=c(0,3,3,3),
            medRC.norm=c(0,3,3,3)),
          DF2 = data.frame(RC=c(2,0,1,2,3),
            medRC=c(4,0,3,4,0),
            RC.norm=c(1,0,3,3,3),
            medRC.norm=c(0,0,3,3,3)))


filteredDF <- lapply(listDF, 
                     function(DF) DF[DF$RC + DF$medRC + DF$RC.norm + DF$medRC.norm != 0,])

filteredDF 

# $DF1
#   RC medRC RC.norm medRC.norm
# 2  1     3       3          3
# 3  2     4       3          3
# 4  3     0       3          3

# $DF2
#   RC medRC RC.norm medRC.norm
# 1  2     4       1          0
# 3  1     3       3          3
# 4  2     4       3          3
# 5  3     0       3          3

drop all rows when all cells are either zero or nan

Your solution should be changed by select all columns without first with DataFrame.all:

df1 = df.iloc[:, 1:]
df2 = df[(df1 != 0).all(axis=1) | (df1 != 'nan').all(axis=1)]

Working like DataFrame.isin:

df2 = df[~df.iloc[:, 1:].isin([0, 'nan']).all(axis=1)]

If NaN are missing values is necessary test them by DataFrame.notna:

df1 = df.iloc[:, 1:]
df2 = df[(df1 != 0).all(axis=1) | df1.notna().all(axis=1)]

Or:

df2 = df[~df.iloc[:, 1:].fillna('nan').isin([0, 'nan']).all(axis=1)]

How to Delete Rows Where All the Columns Are Zero