counting the number of values greater than 0 in R in multiple columns
We can use colSums on the logical comparison:
colSums(myDF[c("L2", "L3", "L4")] > 0)
Count rows which have value 0 for each column (in R)
Try this solution with base R. One feature of matrices is that they can be easily indexed, so you can apply functions at the row or column level, as in your question. Here is the code:
#Code
colSums(ma>0)
Output:
colSums(ma>0)
x y
2 4
Some data used:
#Data
ma <- structure(c(0, 0, 2, 0.3, 3, 0.1, 2, 1), .Dim = c(4L, 2L), .Dimnames = list(
c("a", "b", "c", "d"), c("x", "y")))
You can also save the output like this:
#Save
res <- colSums(ma>0)
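The call above works because ma > 0 yields a logical matrix and colSums treats TRUE as 1. If the goal is instead to count the entries equal to 0 in each column, as the question title could be read, a minimal sketch on the same ma might look like this (pos_counts and zero_counts are names introduced here for illustration):

```r
# Same matrix as in the answer above
ma <- structure(c(0, 0, 2, 0.3, 3, 0.1, 2, 1), .Dim = c(4L, 2L),
                .Dimnames = list(c("a", "b", "c", "d"), c("x", "y")))

# Entries greater than 0 per column (TRUE counts as 1)
pos_counts <- colSums(ma > 0)

# Entries exactly equal to 0 per column
zero_counts <- colSums(ma == 0)

pos_counts
zero_counts
```

Note that nrow(ma) - colSums(ma > 0) would give the same zero counts only when the matrix has no negative or NA entries.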
Number of column values greater than 0 for given row?
You just need to use the apply function:
## Example Data
dd = data.frame(col1 = c(0, .3, 0), col2=c(0, 1, 0),
col3=c(0.4, 0, 0.8))
apply(dd, 1, function(i) sum(i > 0))
So to add this to your existing data frame:
dd$col4 = apply(dd, 1, function(i) sum(i > 0))
Alternatively, we could convert the data frame to logical values and then use rowSums:
rowSums(dd > 0)
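Both approaches should give the same counts. One thing to watch: once the count is stored as a column, re-running either call on the whole data frame would include that new column in the count. A minimal sketch that sidesteps this by naming the columns to inspect (value_cols, by_apply and by_rowsums are names assumed here, not part of the original answer):

```r
dd <- data.frame(col1 = c(0, 0.3, 0), col2 = c(0, 1, 0),
                 col3 = c(0.4, 0, 0.8))

# Columns to inspect; naming them keeps later helper columns out of the count
value_cols <- c("col1", "col2", "col3")

by_apply <- apply(dd[value_cols], 1, function(i) sum(i > 0))
by_rowsums <- rowSums(dd[value_cols] > 0)

# Both approaches agree on this data
stopifnot(all(by_apply == by_rowsums))

# Store the result, then recount: col4 is not part of value_cols
dd$col4 <- by_rowsums
rowSums(dd[value_cols] > 0)
```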
Count number of non-NA values greater than 0 by group
We can use colSums with na.rm = TRUE so that NA entries are skipped:
colSums(df[c("L2", "L3", "L4")] > 0, na.rm = TRUE)
Or you may want a sum per person:
m <- rowsum((df[c("L2", "L3", "L4")] > 0) + 0, df[["Name"]], na.rm = TRUE)
# L2 L3 L4
#Carl 1 1 2
#Joe 1 2 1
There is something fun here. df[c("L2", "L3", "L4")] > 0 is a logical matrix (with NA):
- Although colSums can work with it without trouble, rowsum cannot. So a fix is to add 0 to this matrix to cast it to a 0-1 numerical matrix.
- When adding this 0, we must write (df[c("L2", "L3", "L4")] > 0) + 0, not df[c("L2", "L3", "L4")] > 0 + 0. Operator precedence in R means + is evaluated before >. Have a try on this toy example:
5 > 4 + 0 ## TRUE (parsed as 5 > (4 + 0), still logical)
(5 > 4) + 0 ## 1
So we want a bracket to evaluate > first, then +.
If you want the result to be a data frame, just cast the resulting matrix into a data frame by:
data.frame(m)
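The original df is not shown, so here is a self-contained sketch of the steps above on hypothetical data. The values are an assumption, chosen only so that the per-person table matches the one printed earlier; pos, total and err are names introduced for illustration.

```r
# Hypothetical data (not from the original question)
df <- data.frame(
  Name = c("Carl", "Carl", "Joe", "Joe"),
  L2 = c(1, NA, 0, 2),
  L3 = c(0, 3, 1, 1),
  L4 = c(2, 1, NA, 4)
)

# Logical matrix that still contains NA
pos <- df[c("L2", "L3", "L4")] > 0

# Overall counts per column, ignoring NA
total <- colSums(pos, na.rm = TRUE)

# As noted above, rowsum() does not accept a logical matrix;
# tryCatch keeps the script running if it signals an error
err <- tryCatch(rowsum(pos, df[["Name"]], na.rm = TRUE),
                error = function(e) e)

# Adding 0 converts TRUE/FALSE to 1/0, after which rowsum() can sum by group
m <- rowsum(pos + 0, df[["Name"]], na.rm = TRUE)
m
data.frame(m)
```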
Follow-up
People stop responding, because your specific question on getting a function is less interesting than getting the summary dataset.
Well, if you still take my approach, I would define such a function as:
extract <- function (person) {
m <- rowsum((df[c("L2", "L3", "L4")] > 0) + 0, df[["Name"]], na.rm = TRUE)
rowSums(m)[[person]]
}
Then you can call
extract("Joe")
# 4
extract("Carl")
# 4
Note, this is obviously not the most efficient way to write such a function: if you only want the sum for one person, there is no need to process all the data. We can do:
extract2 <- function (person) {
## subset data
sub <- subset(df, df$Name == person, select = c("L2", "L3", "L4"))
## get sum
sum(sub > 0, na.rm = TRUE)
}
Then you can call
extract2("Joe")
# 4
extract2("Carl")
# 4
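Both functions above read df from the enclosing environment. A possible variant, sketched here under the assumption that the data has a Name column plus the L2, L3 and L4 columns, passes the data frame and column names as arguments instead. The function name count_positive and the data values are hypothetical.

```r
# Hypothetical data (not from the original question)
df <- data.frame(
  Name = c("Carl", "Carl", "Joe", "Joe"),
  L2 = c(1, NA, 0, 2),
  L3 = c(0, 3, 1, 1),
  L4 = c(2, 1, NA, 4)
)

# Count non-NA values greater than 0 for one person across the given columns
count_positive <- function(data, person, cols = c("L2", "L3", "L4")) {
  # Rows belonging to the requested person (NA names never match)
  rows <- !is.na(data[["Name"]]) & data[["Name"]] == person
  sum(data[rows, cols, drop = FALSE] > 0, na.rm = TRUE)
}

joe <- count_positive(df, "Joe")
carl <- count_positive(df, "Carl")
```

Passing the data explicitly makes the function easier to reuse on another data frame and to test in isolation.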
Count occurrences of value in multiple columns with duplicates
You could just subset the to vector:
data.table(table(unlist(toy_data[,c(from,to[to!=from])])))
V1 N
1: A 3
2: B 1
3: C 2
4: D 1
5: E 2
6: F 1
R count values larger than zero in data frame columns
Here is one way:
> mat <- data.frame(A=c(12,10,0,14,0,60),B=c(0,0,0,0,13,65))
>
> keep <- (colSums(mat > 0) / nrow(mat)) > 0.5
> keep
A B
TRUE FALSE
>
> mat[, keep, drop = FALSE]
A
1 12
2 10
3 0
4 14
5 0
6 60
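Since colSums(mat > 0) / nrow(mat) is the share of positive entries per column, the same filter can be sketched with colMeans. This is an equivalent formulation offered as an alternative, not the original answer's code; share_pos and filtered are names introduced here.

```r
mat <- data.frame(A = c(12, 10, 0, 14, 0, 60), B = c(0, 0, 0, 0, 13, 65))

# Share of strictly positive entries in each column
share_pos <- colMeans(mat > 0)

# Keep columns where more than half of the entries are positive
keep <- share_pos > 0.5
filtered <- mat[, keep, drop = FALSE]
names(filtered)
```

drop = FALSE keeps the result a data frame even when only one column survives the filter.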
Counting values greater than 0 in a given area (specific Rows * Columns) - Python, Excel, Pandas
Use a boolean mask and sum it:
N = sum((df['Participant'] == 1) & (df['Condition'] == 1) & (df['RT'].notna()))
print(N)
# Output
1
Details:
m1 = df['Participant'] == 1
m2 = df['Condition'] == 1
m3 = df['RT'].notna()
df[['m1', 'm2', 'm3']] = pd.concat([m1, m2, m3], axis=1)
print(df)
# Output
Participant Condition RT m1 m2 m3
0 1 1 0.10 True True True # All True, N = 1
1 1 1 NaN True True False
2 1 2 0.48 True False True
3 2 1 1.20 False True True
4 2 2 NaN False False False
5 2 2 0.58 False False True