Remove the Columns with the Colsums=0

Remove columns with NA's and/or Zeros Only

One option is to build a logical vector with colSums that counts the NA or 0 elements in each column, then keep the columns where that count is less than the number of rows:

d[colSums(is.na(d) | d == 0) != nrow(d)]
# a c
#1 1 98
#2 5 67
#3 56 NA
#4 4 3
#5 9 7
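
The data frame d is not shown above. As a minimal sketch, assume a hypothetical d with an extra column b that holds only zeros and NA (the values of a and c are taken from the output; b is an assumption). Splitting the one-liner into steps makes the logical vector visible:

```r
# Hypothetical input: a and c match the output above; b is an assumed all-zero/NA column
d <- data.frame(a = c(1, 5, 56, 4, 9),
                b = c(0, 0, NA, 0, 0),
                c = c(98, 67, NA, 3, 7))

# Count, per column, the entries that are NA or 0
n_empty <- colSums(is.na(d) | d == 0)

# Keep a column unless every one of its entries is NA or 0
keep <- n_empty != nrow(d)
res <- d[keep]
print(res)
```

Column c survives even though it contains an NA, because only one of its five entries counts as empty.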

Another option is to replace all the 0s with NA and then apply is.na:

d[colSums(!is.na(replace(d, d == 0, NA))) > 0]

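Or more compactly with na_if from dplyr (this form relies on na_if accepting a whole data frame, which dplyr versions before 1.1.0 did):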

d[colSums(!is.na(na_if(d, 0))) > 0]

Excluding columns from a dataframe based on column sums

What about a simple subset? First, we create a simple data frame:

R> dd = data.frame(x = runif(5), y = 20*runif(5), z=20*runif(5))

Then select the columns whose sum is greater than 15:

R> dd1 = dd[,colSums(dd) > 15]
R> ncol(dd1)
[1] 2

In your data set, you only want to subset columns 6 onwards, so something like:

##Drop the first five columns, keep later columns whose sum exceeds 15
dd[, 5 + which(colSums(dd[, 6:ncol(dd)]) > 15)]

or

#Keep the first five columns, plus later columns whose sum exceeds 15
cols_to_keep = c(rep(TRUE, 5), colSums(dd[, 6:ncol(dd)]) > 15)
dd[, cols_to_keep]

should work.
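
A runnable sketch of the second variant, assuming a hypothetical data frame whose first five columns are identifiers and whose remaining columns are numeric (all names and values here are made up for illustration):

```r
# Hypothetical data: five identifier columns followed by three numeric columns
dd <- data.frame(id1 = 1:4, id2 = letters[1:4], id3 = 4:1,
                 id4 = LETTERS[1:4], id5 = rep("x", 4),
                 p = c(1, 2, 3, 4),     # sum = 10
                 q = c(10, 10, 0, 0),   # sum = 20
                 r = c(5, 5, 5, 5))     # sum = 20

# Test only columns 6 onwards, then pad with TRUE for the first five
big <- colSums(dd[, 6:ncol(dd)]) > 15
cols_to_keep <- c(rep(TRUE, 5), big)
res <- dd[, cols_to_keep]
names(res)
```

The padding step matters: the logical vector must have one entry per column of dd, otherwise the test results line up with the wrong columns.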


The key part to note is that in the square brackets, we want a vector of logicals, i.e. a vector of TRUE and FALSE. So if you wanted to subset using something a bit more complicated, then create a function that returns TRUE or FALSE and subset as usual.
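
As an illustration of that last point, here is a small sketch with a hypothetical predicate function, has_signal (the name and the test it performs are assumptions chosen for the example):

```r
# Hypothetical predicate: TRUE when a column is numeric and not entirely zero/NA
has_signal <- function(x) {
  is.numeric(x) && any(x != 0, na.rm = TRUE)
}

dd <- data.frame(x = c(0, 0, 0), y = c(0, 2, NA), z = c("a", "b", "c"))

# vapply returns one logical per column, which goes straight into the brackets
keep <- vapply(dd, has_signal, logical(1))
res <- dd[, keep, drop = FALSE]
names(res)
```

drop = FALSE keeps the result a data frame even when only one column passes the test.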

Remove all columns or rows with only zeros out of a data frame

Using colSums():

df[, colSums(abs(df)) > 0]

i.e. a column has only zeros if and only if the sum of the absolute values is zero.
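
A short sketch of why abs() matters, using a hypothetical data frame in which one column contains values that cancel out:

```r
# Hypothetical data: column b cancels to a sum of zero, column c is all zeros
df <- data.frame(a = c(1, 2, 3), b = c(-1, 0, 1), c = c(0, 0, 0))

plain <- colSums(df) > 0          # b looks empty because -1 + 0 + 1 == 0
with_abs <- colSums(abs(df)) > 0  # b is kept, only c is dropped

res <- df[, with_abs]
names(res)
```

If the data may contain NA, colSums(abs(df), na.rm = TRUE) avoids NA entries in the logical index.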

How to delete R data.frame columns with only zero values?

One option using dplyr could be:

df %>%
  select(where(~ any(. != 0)))

1 0 2 2
2 2 3 5
3 5 0 1
4 7 0 2
5 2 1 3
6 3 0 4
7 0 4 5
8 3 0 6
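
For comparison, a base R sketch of the same column test using Filter(), on a hypothetical data frame (the column names and values are assumptions, not the data from the question):

```r
# Hypothetical data: column b contains only zeros
df <- data.frame(a = c(0, 2, 5), b = c(0, 0, 0), c = c(2, 3, 0))

# Filter() keeps the columns for which the function returns TRUE
res <- Filter(function(col) any(col != 0), df)
names(res)
```

Because a data frame is a list of columns, Filter() applies the function column by column and returns a data frame.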

Remove columns with zero values from a dataframe

You almost have it. Put those two together:

 SelectVar[, colSums(SelectVar != 0) > 0]

This works with factor columns too: a factor is compared through its level labels, so every entry counts as non-zero unless a label is literally "0".

How to remove columns and rows that sum to 0 while preserving non-numeric columns

Try this:

# remove rows 
df <- df[rowSums(df[-(1:7)]) !=0, ]
# remove columns
df <- df[c(1:7,7 + which(colSums(df[-(1:7)]) !=0))]
# Site Date Mon Day Yr Szn SznYr B C D E F G
# 2 B0001 7/29/97 7 29 1997 Summer 1997-Summer 0 1 0 0 0 0
# 3 B0001 7/29/97 7 29 1997 Summer 1997-Summer 0 0 3 0 0 0
# 4 B0001 7/29/97 7 29 1997 Summer 1997-Summer 0 0 0 0 0 10
# 5 B0002 7/28/97 7 28 1997 Summer 1997-Summer 0 0 0 5 0 0
# 7 B0002 7/28/97 7 28 1997 Summer 1997-Summer 0 0 0 0 6 0
# 10 B0002 7/28/97 7 28 1997 Summer 1997-Summer 0 0 0 0 0 8
# 11 B0002 6/28/07 6 28 2007 Summer 2007-Summer 3 6 1 7 0 1

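You can do this in one step to get the same output as @dan-y. The output is the same in this specific case, but it can differ if your real data contain negative values, because here the column sums are computed on the original rows and a row that sums to zero may still hold non-zero values: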

df <- df[rowSums(df[-(1:7)]) != 0,
         c(1:7, 7 + which(colSums(df[-(1:7)]) != 0))]
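
To see how the two approaches can diverge, here is a minimal sketch on a hypothetical data frame with a single identifier column instead of seven (the variable id and all values are assumptions for illustration):

```r
# Hypothetical data: row 1 sums to zero only because 1 and -1 cancel
df <- data.frame(Site = c("s1", "s2", "s3"),
                 B = c(1, 0, 0), C = c(-1, 0, 0), D = c(0, 0, 2))
id <- 1  # number of leading non-numeric columns (7 in the answer above)

# Two steps: column sums are computed after the zero-sum rows are gone
two <- df[rowSums(df[-(1:id)]) != 0, ]
two <- two[c(1:id, id + which(colSums(two[-(1:id)]) != 0))]

# One step: row and column tests both use the original data
one <- df[rowSums(df[-(1:id)]) != 0,
          c(1:id, id + which(colSums(df[-(1:id)]) != 0))]

names(two)
names(one)
```

Here the two-step version drops B and C, while the one-step version keeps them, since their sums over the original rows are non-zero.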

