find if each row of a logical matrix has at least one TRUE
apply(df, 1, any)
# YAL001C YAL002W YAL003W YAL004W YAL005C YAL007C
# FALSE FALSE FALSE FALSE FALSE TRUE
Is there any logic to find if there is at least one true in a row
An option using base R with rowSums and rowsum. Create a logical matrix (df[-(1:2)] == 'True') based on the occurrence of 'True' values in the columns other than 'EmployeeID' and 'Created', and take rowSums to flag the rows that have at least one 'True'. Then do a group-by sum with rowsum, using 'EmployeeID' as the grouping on that flag vector, check which groups have a value greater than 0, and return the row names of the resulting matrix ('m1').
m1 <- rowsum(+(rowSums(df[-(1:2)] == 'True') > 0), df$EmployeeID) > 0
row.names(m1)[which(m1)]
#[1] "101" "106" "108"
rowsum is not needed if the 'EmployeeID' values are unique, i.e. there are no duplicates:
df$EmployeeID[(rowSums(df[-(1:2)] == 'True') > 0)]
#[1] 101 106 108
If we want to use tidyverse
library(dplyr)
df %>%
rowwise %>%
mutate(Flag = "True" %in% c_across(happy:energitic)) %>%
ungroup
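For readers following along in Python (the language used further down this page), the same "any 'True' per EmployeeID" logic can be sketched with the standard library only. The function name ids_with_any_true, the column names other than 'EmployeeID' and 'Created', and the records are all hypothetical, not the asker's data.

```python
from collections import defaultdict


def ids_with_any_true(rows, id_key="EmployeeID", skip=("EmployeeID", "Created")):
    """Return sorted IDs that have 'True' in at least one non-skipped column."""
    flagged = defaultdict(bool)
    for row in rows:
        # Row-level check, like rowSums(df[-(1:2)] == 'True') > 0
        has_true = any(v == "True" for k, v in row.items() if k not in skip)
        # Group-level check, like rowsum(...) > 0 per EmployeeID
        flagged[row[id_key]] = flagged[row[id_key]] or has_true
    return sorted(k for k, v in flagged.items() if v)


# Hypothetical records for illustration only.
rows = [
    {"EmployeeID": 101, "Created": "2020-01-01", "happy": "True", "sad": "False"},
    {"EmployeeID": 102, "Created": "2020-01-01", "happy": "False", "sad": "False"},
    {"EmployeeID": 101, "Created": "2020-01-02", "happy": "False", "sad": "False"},
    {"EmployeeID": 106, "Created": "2020-01-03", "happy": "False", "sad": "True"},
]
print(ids_with_any_true(rows))  # [101, 106]
```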
logical matrix how to find efficiently row/column with true value
Consider the following example matrix:
[0, 0, 0, 0,
0, 0, 0, 0,
0, 0, 1, 1,
1, 1, 1, 1]
and push zero 16 times.
The results will be False, True, True, True, False, True, True, True, False, True, True, True, False, False, False and False.
There is a cyclic pattern (False, True, True, True).
As long as the length of a run of consecutive ones stays fixed, there is no need to recalculate on every update.
When the matrix is updated, the lengths of the runs of ones at the top-left and bottom-right can change, and the cyclic memory may then need to be updated.
By maintaining the runs of consecutive ones, and the total count of the cyclic pattern contributed by those runs, the check for the rows takes O(1) per update.
For the columns, instead of shifting and pushing, set matrix[cur] = bit and cur = (cur+1) % (matrix_size*matrix_size), so that cur represents the actual upper-left of the matrix.
By maintaining col_sum for each column, and the total count of columns satisfying the all-ones condition, the check for the columns is also O(1) per update.
class Matrix:
    def __init__(self, n):
        self.mat = [0] * (n*n)
        self.seq_len = [0] * (n*n)
        self.col_total = [0] * n
        self.col_archive = 0
        self.row_cycle_cnt = [0] * n
        self.cur = 0
        self.continued_one = 0
        self.n = n

    def update(self, bit):
        prev_bit = self.mat[self.cur]
        self.mat[self.cur] = bit
        # update col total
        col = self.cur % self.n
        if self.col_total[col] == self.n:
            self.col_archive -= 1
        self.col_total[col] += bit - prev_bit
        if self.col_total[col] == self.n:
            self.col_archive += 1
        # update row index
        # process shift out
        if prev_bit == 1:
            prev_len = self.seq_len[self.cur]
            if prev_len > 1:
                self.seq_len[(self.cur + 1) % (self.n * self.n)] = prev_len-1
            if self.n <= prev_len and prev_len < self.n*2:
                self.row_cycle_cnt[self.cur % self.n] -= 1
        # process new bit
        if bit == 0:
            self.continued_one = 0
        else:
            self.continued_one = min(self.continued_one + 1, self.n*self.n)
            # write the length of continued_one at the head of sequence
            self.seq_len[self.cur+1 - self.continued_one] = self.continued_one
            if self.n <= self.continued_one and self.continued_one < self.n*2:
                self.row_cycle_cnt[(self.cur+1) % self.n] += 1
        # update cursor
        self.cur = (self.cur + 1) % (self.n * self.n)
        return (self.col_archive > 0) or (self.row_cycle_cnt[self.cur % self.n] > 0)

    def check2(self):
        # brute-force reference check: scan every row, then every column
        for y in range(self.n):
            cnt = 0
            for x in range(self.n):
                cnt += self.mat[(self.cur + y*self.n + x) % (self.n*self.n)]
            if cnt == self.n:
                return True
        for x in range(self.n):
            cnt = 0
            for y in range(self.n):
                cnt += self.mat[(self.cur + y*self.n + x) % (self.n*self.n)]
            if cnt == self.n:
                return True
        return False


if __name__ == "__main__":
    import random
    random.seed(123)
    m = Matrix(4)
    for i in range(100000):
        ans1 = m.update(random.randint(0, 1))
        ans2 = m.check2()
        assert ans1 == ans2
        print("epoch:{} mat={} ans={}".format(i, m.mat[m.cur:] + m.mat[:m.cur], ans1))
time complexity: check that all elements of at least one row or column in a matrix are equal
Considering your algorithm, you're looking at each cell twice:
- Once in the first loop (for each row)
- Once in the second loop (for each column)
So your complexity is O(2N), which is the same as O(N), where N is the number of cells.
So you're not really going to do better than this in the worst case. Additionally, you're already bailing out early in case of differences (with .all? and .any?), so your practical performance will depend on the data but should be pretty good.
I expect there's further room to optimize (like visiting each cell once instead of twice), but not from a big-O perspective.
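The "visit each cell once" idea could look roughly like the following Python sketch. It is not the asker's code (which is not shown here). The function name has_uniform_line is hypothetical, and the sketch assumes a rectangular list of lists. Each cell is compared against the first element of its row and the first element of its column in the same pass.

```python
def has_uniform_line(matrix):
    """Return True if some row or some column consists of equal elements."""
    if not matrix or not matrix[0]:
        return False
    col_ok = [True] * len(matrix[0])  # col_ok[j]: column j still uniform so far
    for row in matrix:
        row_ok = True
        for j, value in enumerate(row):
            if value != row[0]:
                row_ok = False
            if value != matrix[0][j]:
                col_ok[j] = False
        if row_ok:
            return True  # early exit: a uniform row was found
    return any(col_ok)


print(has_uniform_line([[1, 2], [3, 4]]))  # False
print(has_uniform_line([[1, 2], [1, 4]]))  # True (first column)
```

The asymptotic cost is still O(N) in the number of cells; only the constant factor changes.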
Determining Whether a Matrix Has At Least One Zero Element
The warning is generated because you're passing a vector of logicals to if, which expects a single value. any is a function that tells you whether any of the logical values are TRUE:
any(A==0)
## [1] TRUE
any(B==0)
## [1] FALSE
There's also a function all, which determines whether all of the values in a logical vector are TRUE.
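The same reduce-to-a-single-value idea exists in Python as the built-ins any and all. The sketch below uses hypothetical matrices A and B (the originals are not shown in this thread) and a hypothetical helper name has_zero.

```python
def has_zero(matrix):
    """Return True if at least one element of the nested list equals 0."""
    return any(value == 0 for row in matrix for value in row)


def all_zero(matrix):
    """Return True if every element of the nested list equals 0."""
    return all(value == 0 for row in matrix for value in row)


# Hypothetical example matrices.
A = [[1, 0], [2, 3]]
B = [[1, 2], [3, 4]]

print(has_zero(A))  # True
print(has_zero(B))  # False
print(all_zero(A))  # False
```

Either result is a single boolean, so it can be used directly as an if condition.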
Compute if each row has a TRUE or is all NA
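We can do this with two rowSums calls. The first, !rowSums(!is.na(x)), is TRUE only for rows that contain nothing but NAs; since NA^TRUE is NA and NA^FALSE is 1, multiplying by it creates the NA for those rows and leaves the others unchanged: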
(NA^!rowSums(!is.na(x)))*rowSums(x, na.rm = TRUE)>0
#[1] TRUE TRUE FALSE NA
Another approach uses pmax:
do.call(pmax, c(as.data.frame(x), na.rm = TRUE)) > 0
#[1] TRUE TRUE FALSE NA
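A Python sketch of the same three-valued result, using None in place of NA. The helper name row_any_or_none and the data in x are hypothetical, chosen only to reproduce the TRUE, TRUE, FALSE, NA pattern shown above.

```python
def row_any_or_none(row):
    """Return None if every entry is None, else whether any present entry is true."""
    present = [v for v in row if v is not None]
    if not present:
        return None  # all-missing row stays missing, like NA in R
    return any(present)


# Hypothetical logical matrix with missing values.
x = [
    [True, False, None],
    [None, True, True],
    [False, False, None],
    [None, None, None],
]
print([row_any_or_none(r) for r in x])  # [True, True, False, None]
```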
Find last true element of columns
This would do it
m(max(i.*reshape([1:numel(m)],size(m))))
Explanation
So we are generating an array of indices
reshape([1:numel(m)],size(m))
ans =
1 5 9 13
2 6 10 14
3 7 11 15
4 8 12 16
That represents the linear index of each value. Then we multiply that by i to keep only the indices of the values we are interested in
i.*reshape([1:numel(m)],size(m))
ans =
1 0 0 13
0 6 0 0
0 0 0 15
0 8 12 0
Then we take the max of that, since max works column by column on a matrix. This gives us the last index in each column.
max(i.*reshape([1:numel(m)],size(m)))
ans =
1 8 12 15
Then apply those indices to m to get the values
m(max(i.*reshape([1:numel(m)],size(m))))
ans =
16 14 15 12
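The same result can be sketched in Python with plain nested lists. The mask below is read off the nonzero positions in the output above. The contents of m are an assumption: the thread never shows m, so the 4x4 magic square is used because it is consistent with the four values printed. The function name last_true_per_column is hypothetical.

```python
def last_true_per_column(m, mask):
    """For each column, return m's value at the last row where mask is true.

    Columns with no true entry yield None.
    """
    result = []
    for j in range(len(m[0])):
        last = None
        for i in range(len(m)):
            if mask[i][j]:
                last = m[i][j]  # later rows overwrite earlier ones
        result.append(last)
    return result


# Assumed contents of m (4x4 magic square); not shown in the original answer.
m = [
    [16, 2, 3, 13],
    [5, 11, 10, 8],
    [9, 7, 6, 12],
    [4, 14, 15, 1],
]
# Logical mask taken from the nonzero pattern of i.*reshape(...) above.
mask = [
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 1, 1, 0],
]
print(last_true_per_column(m, mask))  # [16, 14, 15, 12]
```

Unlike the index-multiplication trick, this version returns None for a column with no true entry instead of producing an invalid index of 0.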
How to obtain a logical (T/F) result when I reach the last row of data in r?
You can try this:
1:nrow(df) == nrow(df)
To make it a function:
is.last <- function(data) {
1:nrow(data) == nrow(data)
}
is.last(df)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
Error in LDA(cdes, k = K, method = Gibbs, control = list(verbose = 25L, : Each row of the input matrix needs to contain at least one non-zero entry
It looks like some of your documents are empty, in the sense that they contain no counts of any feature.
You can remove them after trimming, once cdes exists and can be passed to ntoken:
cdes <- dfm_trim(df_des, min_docfreq = 2)
cdes <- dfm_subset(cdes, ntoken(cdes) > 0)