Counting number of zeros per row by Pandas DataFrame?
Use a boolean comparison, which produces a boolean DataFrame. Cast this to int so that True becomes 1 and False becomes 0, then call sum with axis=1 to count row-wise:
In [56]:
df = pd.DataFrame({'a':[1,0,0,1,3], 'b':[0,0,1,0,1], 'c':[0,0,0,0,0]})
df
Out[56]:
   a  b  c
0  1  0  0
1  0  0  0
2  0  1  0
3  1  0  0
4  3  1  0
In [64]:
(df == 0).astype(int).sum(axis=1)
Out[64]:
0 2
1 3
2 2
3 2
4 1
dtype: int64
Breaking the above down:
In [65]:
(df == 0)
Out[65]:
       a      b     c
0  False   True  True
1   True   True  True
2   True  False  True
3  False   True  True
4  False  False  True
In [66]:
(df == 0).astype(int)
Out[66]:
   a  b  c
0  0  1  1
1  1  1  1
2  1  0  1
3  0  1  1
4  0  0  1
EDIT
As pointed out by david, the astype to int is unnecessary because the Boolean values are upcast to int when calling sum, so this simplifies to:
(df == 0).sum(axis=1)
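As a quick check that the two forms agree, here is a minimal self-contained sketch that reuses the example frame from above. The names with_cast and without_cast are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 0, 0, 1, 3], 'b': [0, 0, 1, 0, 1], 'c': [0, 0, 0, 0, 0]})

# Explicit cast: True -> 1, False -> 0, then add up each row.
with_cast = (df == 0).astype(int).sum(axis=1)

# No cast: sum already treats True as 1 and False as 0.
without_cast = (df == 0).sum(axis=1)

print(without_cast.tolist())                  # [2, 3, 2, 2, 1]
print((with_cast == without_cast).all())      # True
```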
counting leading & trailing zeros for every row in a dataframe in R
We could use rowCumsums from matrixStats along with rowSums:
library(matrixStats)
cbind(df[1], total_zeros = rowSums(df[-1] == 0),
Leading_zeros = rowSums(!rowCumsums(df[-1] != 0)))
-output
key total_zeros Leading_zeros
1 10A 3 1
2 11xy 1 0
3 445pe 3 2
Or, in tidyverse, we may also use rowwise:
library(dplyr)
df %>%
mutate(total_zeros = rowSums(select(., starts_with("Obs")) == 0)) %>%
rowwise %>%
transmute(key, total_zeros,
Leading_zeros = sum(!cumsum(c_across(starts_with('Obs')) != 0))) %>%
ungroup
-output
# A tibble: 3 x 3
key total_zeros Leading_zeros
<chr> <dbl> <int>
1 10A 3 1
2 11xy 1 0
3 445pe 3 2
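For readers working in pandas rather than R, the same cumulative-sum idea can be sketched roughly as below. The frame, its keys, and its Obs1..Obs4 columns are made-up illustration data, not the data from the question. The trailing-zero count, obtained by reversing the column order, is an extension that the R answer above does not cover.

```python
import pandas as pd

# Hypothetical data: a key column followed by observation columns.
df = pd.DataFrame({
    'key': ['r1', 'r2', 'r3'],
    'Obs1': [0, 5, 0],
    'Obs2': [3, 0, 0],
    'Obs3': [0, 2, 7],
    'Obs4': [0, 4, 0],
})

obs = df.filter(like='Obs')
nonzero = obs.ne(0).astype(int)  # 1 where the cell is non-zero

# A cell is a leading zero while the running count of non-zeros is still 0.
leading = nonzero.cumsum(axis=1).eq(0).sum(axis=1)

# Reverse the column order so trailing zeros become leading zeros.
trailing = nonzero.iloc[:, ::-1].cumsum(axis=1).eq(0).sum(axis=1)

out = df[['key']].assign(total_zeros=obs.eq(0).sum(axis=1),
                         leading_zeros=leading,
                         trailing_zeros=trailing)
print(out)
```

Note that a row consisting only of zeros would count every cell as both leading and trailing under this definition.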
In Python, check for zeros in each row, if row has 3 or more zeros, remove the row. Current code does nothing to the file
Update
df = pd.read_csv('GiftYearTotal.csv', encoding='ISO-8859-1')
df = df.apply(lambda x: x.str.strip())
out = df[df.eq('$0.00').sum(1) <= 3]
Old answer
You can use:
out = df[df.eq('$0.00').sum(1) <= 3]
print(out)
# Output
Year 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022
1 Person_B $100.00 $150.00 $1.00 $50.00 $0.25 $100.00 $0.00 $50.00 $60.00 $50.00 $0.00 $0.00 $1000.00
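One plausible reason the stripping step in the update matters is stray whitespace around the values, which makes an exact comparison with '$0.00' miss them. The sketch below illustrates this on a small made-up frame standing in for GiftYearTotal.csv. The data and the names raw_counts, cleaned, and max_zeros are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the CSV; note the stray spaces in Person_A's row.
df = pd.DataFrame({
    'Year': ['Person_A', 'Person_B'],
    '2010': ['$0.00 ', '$100.00'],
    '2011': [' $0.00', '$0.00'],
    '2012': ['$0.00', '$1.00'],
    '2013': ['$0.00', '$50.00'],
})

# Without stripping, '$0.00 ' != '$0.00', so padded zeros go uncounted.
raw_counts = df.eq('$0.00').sum(axis=1)

# Strip whitespace from every (string) column, then count again.
cleaned = df.apply(lambda col: col.str.strip())
zero_counts = cleaned.eq('$0.00').sum(axis=1)

max_zeros = 3  # same threshold as in the answer above
out = cleaned[zero_counts <= max_zeros]

print(raw_counts.tolist())    # [2, 1]
print(zero_counts.tolist())   # [4, 1]
print(out['Year'].tolist())   # ['Person_B']
```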
Pandas: Counting the proportion of zeros in rows and columns of dataframe
Try this instead of the first function:
print(df[df == 0].count(axis=1)/len(df.columns))
UPDATE (correction):
print('rows')
print(df[df == 0].count(axis=1)/len(df.columns))
print('cols')
print(df[df == 0].count(axis=0)/len(df.index))
Input data (I've decided to add a few rows):
ID var1 var2
1 2 3
2 5 0
3 4 5
4 10 10
5 1 0
Output:
rows
ID
1 0.0
2 0.5
3 0.0
4 0.0
5 0.5
dtype: float64
cols
var1 0.0
var2 0.4
dtype: float64
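An alternative way to get the same proportions, sketched here as a suggestion rather than as part of the answer above, is to take the mean of the boolean mask. The mean of True/False values is the share of True values, so no explicit division is needed. The frame is rebuilt from the input data shown above, with ID assumed to be the index.

```python
import pandas as pd

df = pd.DataFrame({'var1': [2, 5, 4, 10, 1], 'var2': [3, 0, 5, 10, 0]},
                  index=pd.Index([1, 2, 3, 4, 5], name='ID'))

is_zero = df.eq(0)

# Mean of a boolean mask = fraction of True values.
row_share = is_zero.mean(axis=1)  # per row, across columns
col_share = is_zero.mean(axis=0)  # per column, across rows

print(row_share.tolist())  # [0.0, 0.5, 0.0, 0.0, 0.5]
print(col_share.tolist())  # [0.0, 0.4]
```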
Count number of zeros per row, and remove rows with more than n zeros
It's not only possible, but very easy:
DF[rowSums(DF == 0) <= 4, ]
You could also use apply:
DF[apply(DF == 0, 1, sum) <= 4, ]
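For comparison, a rough pandas counterpart of the rowSums filter might look like the sketch below. The helper keep_rows_with_at_most and the sample frame are hypothetical names and data introduced only for illustration.

```python
import pandas as pd

def keep_rows_with_at_most(frame: pd.DataFrame, max_zeros: int) -> pd.DataFrame:
    """Return the rows of frame that contain at most max_zeros zeros."""
    zeros_per_row = frame.eq(0).sum(axis=1)
    return frame[zeros_per_row <= max_zeros]

# Hypothetical data: row 0 has 5 zeros, row 1 has 1, row 2 has 4.
DF = pd.DataFrame({'x': [0, 1, 0], 'y': [0, 2, 0], 'z': [0, 0, 3],
                   'u': [0, 4, 0], 'v': [0, 5, 0]})

kept = keep_rows_with_at_most(DF, 4)
print(kept.index.tolist())  # [1, 2]
```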
pandas groupby count the number of zeros in a column
I believe you need DataFrameGroupBy.agg with a comparison against 0 and sum:
a) To count no. of zero values:
df1 = df.groupby('Date').agg(lambda x: x.eq(0).sum())
print (df1)
B C
Date
20.07.2018 0 1
21.07.2018 1 1
b) To count no. of non-zero values:
df2 = df.groupby('Date').agg(lambda x: x.ne(0).sum())
print (df2)
B C
Date
20.07.2018 2 1
21.07.2018 1 1
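Another idea to improve performance is to set Date as the index, compare the columns, and finally sum per index level. Older pandas versions accepted sum(level=0) for this; that argument was removed in pandas 2.0, so group by the index level instead:
df1 = df.set_index('Date').eq(0).groupby(level=0).sum()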
print (df1)
B C
Date
20.07.2018 0 1
21.07.2018 1 1
df2 = df.set_index('Date').ne(0).groupby(level=0).sum()
print (df2)
B C
Date
20.07.2018 2 1
21.07.2018 1 1
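To compare the two approaches side by side, here is a small self-contained sketch. The input frame is an assumption, chosen only so that it is consistent with the outputs printed above; the question's real data is not shown. The index-based variant uses groupby(level=0).sum(), which is the spelling that works on current pandas versions.

```python
import pandas as pd

# Assumed input, picked to match the counts shown in the outputs above.
df = pd.DataFrame({
    'Date': ['20.07.2018', '20.07.2018', '21.07.2018', '21.07.2018'],
    'B': [1, 2, 0, 3],
    'C': [0, 4, 5, 0],
})

# Approach 1: compare inside a per-group lambda.
via_agg = df.groupby('Date').agg(lambda x: x.eq(0).sum())

# Approach 2: compare the whole frame once, then sum per index level.
via_index = df.set_index('Date').eq(0).groupby(level=0).sum()

print(via_index)
print((via_agg == via_index).all().all())  # True
```

The second form does the comparison in one vectorized step instead of once per group, which is where any speed-up would be expected to come from.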