Find If Each Row of a Logical Matrix Has at Least One True

find if each row of a logical matrix has at least one TRUE

apply(df, 1, any)
# YAL001C YAL002W YAL003W YAL004W YAL005C YAL007C
# FALSE FALSE FALSE FALSE FALSE TRUE

Is there any logic to find if there is at least one true in a row

An option using base R with rowSums and rowsum: create a logical matrix (df[-(1:2)] == 'True') from the occurrence of 'True' values in the columns other than 'EmployeeID' and 'Created', take the rowSums, do a grouped sum by 'EmployeeID' with rowsum on the resulting logical vector, check which groups have a value greater than 0, and return the corresponding row names of the matrix ('m1').

m1 <- rowsum(+(rowSums(df[-(1:2)] == 'True') > 0), df$EmployeeID) > 0
row.names(m1)[which(m1)]
#[1] "101" "106" "108"

rowsum is not needed if the 'EmployeeID' values are unique, i.e. there are no duplicates:

df$EmployeeID[(rowSums(df[-(1:2)] == 'True') > 0)]
#[1] 101 106 108
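
The grouping logic described above can also be spelled out step by step. The following is a minimal Python sketch, not a translation of the asker's data: the records, the `sad` column, and the helper name `ids_with_any_true` are hypothetical and exist only for illustration.

```python
# Hypothetical records: the first two keys play the role of 'EmployeeID' and
# 'Created'; the remaining keys hold the strings 'True' / 'False'.
rows = [
    {"EmployeeID": 101, "Created": "2020-01-01", "happy": "True", "sad": "False"},
    {"EmployeeID": 101, "Created": "2020-01-02", "happy": "False", "sad": "False"},
    {"EmployeeID": 102, "Created": "2020-01-01", "happy": "False", "sad": "False"},
    {"EmployeeID": 106, "Created": "2020-01-03", "happy": "False", "sad": "True"},
]


def ids_with_any_true(rows, skip=("EmployeeID", "Created")):
    """Return the IDs that have at least one 'True' in any of their rows."""
    flagged = {}
    for row in rows:
        # Per-row check: the analogue of rowSums(... == 'True') > 0.
        row_has_true = any(v == "True" for k, v in row.items() if k not in skip)
        # Per-ID accumulation: the analogue of the grouped sum with rowsum.
        emp = row["EmployeeID"]
        flagged[emp] = flagged.get(emp, False) or row_has_true
    return [emp for emp, hit in flagged.items() if hit]


print(ids_with_any_true(rows))
```

With these made-up records the function returns the IDs 101 and 106; ID 102 is dropped because none of its rows contains 'True'.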

If we want to use tidyverse

library(dplyr)
df %>%
  rowwise %>%
  mutate(Flag = "True" %in% c_across(happy:energitic)) %>%
  ungroup

logical matrix how to find efficiently row/column with true value

Consider the following example matrix:

[0, 0, 0, 0,
0, 0, 0, 0,
0, 0, 1, 1,
1, 1, 1, 1]

and push a zero 16 times, checking after each push whether any row is all ones.

The results are then False, True, True, True, False, True, True, True, False, True, True, True, False, False, False, False.

There is cyclic behavior (False, True, True, True).
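
The sequence can be reproduced with a brute-force check. This is a minimal sketch that assumes "push" means dropping the first element of the row-major matrix and appending the new bit at the end; the helper names `push` and `any_full_row` are made up for this illustration.

```python
def any_full_row(flat, n):
    """Return True if any row of the n x n row-major matrix `flat` is all ones."""
    return any(all(flat[r * n:(r + 1) * n]) for r in range(n))


def push(flat, bit):
    """Drop the first element and append `bit`, shifting everything by one cell."""
    return flat[1:] + [bit]


flat = [0, 0, 0, 0,
        0, 0, 0, 0,
        0, 0, 1, 1,
        1, 1, 1, 1]

results = []
for _ in range(16):
    flat = push(flat, 0)
    results.append(any_full_row(flat, 4))

print(results)
```

The run of six ones slides through the matrix one cell at a time, so the answer repeats with period 4 (the row width) until the run starts falling off the front.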

If the length of a run of consecutive ones stays fixed, there is no need to recalculate on every update.

When the matrix is updated, the length of the runs of ones at the top-left and bottom-right can change, and the cyclic memory may then need to be updated.

By maintaining the runs of consecutive ones, and a total count of the cyclic behavior those runs produce, the row check costs O(1) per update.

For the columns, instead of shifting and pushing, set matrix[cur] = bit and cur = (cur+1) % (matrix_size*matrix_size), so that cur always marks the actual upper-left cell of the matrix.

By maintaining col_sum for each column, and a total count of the columns that satisfy the all-ones condition, the column check is also O(1) per update.

class Matrix:
    def __init__(self, n):
        self.mat = [0] * (n*n)
        self.seq_len = [0] * (n*n)
        self.col_total = [0] * n
        self.col_archive = 0
        self.row_cycle_cnt = [0] * n
        self.cur = 0
        self.continued_one = 0
        self.n = n

    def update(self, bit):
        prev_bit = self.mat[self.cur]
        self.mat[self.cur] = bit

        # update col total
        col = self.cur % self.n
        if self.col_total[col] == self.n:
            self.col_archive -= 1
        self.col_total[col] += bit - prev_bit
        if self.col_total[col] == self.n:
            self.col_archive += 1

        # update row index
        # process shift out
        if prev_bit == 1:
            prev_len = self.seq_len[self.cur]
            if prev_len > 1:
                self.seq_len[(self.cur + 1) % (self.n * self.n)] = prev_len-1
            if self.n <= prev_len and prev_len < self.n*2:
                self.row_cycle_cnt[self.cur % self.n] -= 1
        # process new bit
        if bit == 0:
            self.continued_one = 0
        else:
            self.continued_one = min(self.continued_one + 1, self.n*self.n)
            # write the length of continued_one at the head of sequence
            self.seq_len[self.cur+1 - self.continued_one] = self.continued_one
            if self.n <= self.continued_one and self.continued_one < self.n*2:
                self.row_cycle_cnt[(self.cur+1) % self.n] += 1

        # update cursor
        self.cur = (self.cur + 1) % (self.n * self.n)

        return (self.col_archive > 0) or (self.row_cycle_cnt[self.cur % self.n] > 0)

    def check2(self):
        for y in range(self.n):
            cnt = 0
            for x in range(self.n):
                cnt += self.mat[(self.cur + y*self.n + x) % (self.n*self.n)]
            if cnt == self.n:
                return True
        for x in range(self.n):
            cnt = 0
            for y in range(self.n):
                cnt += self.mat[(self.cur + y*self.n + x) % (self.n*self.n)]
            if cnt == self.n:
                return True
        return False

if __name__ == "__main__":
    import random
    random.seed(123)
    m = Matrix(4)
    for i in range(100000):
        ans1 = m.update(random.randint(0, 1))
        ans2 = m.check2()
        assert(ans1 == ans2)
        print("epoch:{} mat={} ans={}".format(i, m.mat[m.cur:] + m.mat[:m.cur], ans1))

time complexity: check that all elements of at least one row or column in a matrix are equal

Considering your algorithm, you're looking at each cell twice:

  1. Once in the first loop (for each row)
  2. Once in the second loop (for each column)

So your complexity is O(2N), where N is the number of cells, which simplifies to O(N).

So you're not really going to do better than this in a worst-case. Additionally, you're also already bailing out early in case of differences (with .all? and .any?), so your practical performance will depend on the data but should be pretty good.

I expect there's further room to optimize (like visiting each cell once instead of twice) but not from a big-O perspective.
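
To make the two passes concrete, here is a minimal Python sketch of the same check. The question's own code is Ruby (it uses .all? and .any?) and is not reproduced here; the function name `has_uniform_line` is hypothetical.

```python
def has_uniform_line(matrix):
    """Return True if at least one row or one column has all-equal elements."""
    def uniform(line):
        # all() stops at the first mismatch, so a differing line is abandoned early.
        return all(v == line[0] for v in line)

    columns = list(zip(*matrix))  # transpose: one tuple per column

    # Pass 1 visits every cell once by rows, pass 2 once by columns.
    # any() stops as soon as one uniform line is found.
    return any(uniform(r) for r in matrix) or any(uniform(c) for c in columns)


print(has_uniform_line([[1, 2], [3, 4]]))
print(has_uniform_line([[1, 2], [1, 3]]))
```

In the worst case (no uniform row or column, with mismatches only at the end of each line) both passes run to completion, which is the O(2N) bound discussed above.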

Determining Whether a Matrix Has At Least One Zero Element

The warning is generated because you're passing a logical vector to if, which expects a single value. any is a function that tells you whether any of the logical values are TRUE:

any(A==0)
## [1] TRUE
any(B==0)
## [1] FALSE

There's also a function all which determines if all of the values in a logical vector are TRUE.

Compute if each row has a TRUE or is all NA

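We can do this with two rowSums calls, producing NA for the rows that contain only NAs. Since NA^0 is 1 and NA^1 is NA in R, the first factor is NA for an all-NA row and 1 otherwise: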

(NA^!rowSums(!is.na(x)))*rowSums(x, na.rm = TRUE)>0
#[1] TRUE TRUE FALSE NA

Or another approach is with pmax

do.call(pmax, c(as.data.frame(x), na.rm = TRUE)) > 0
#[1] TRUE TRUE FALSE NA
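
The rule both expressions implement is: return NA when a row has no non-missing values, otherwise report whether any non-missing value is TRUE. A minimal Python sketch of that rule follows, using None in place of NA. The matrix `x` below is hypothetical (the original `x` is not shown) and was chosen only so that the four cases appear in the same order as in the output above.

```python
def row_flag(row):
    """Return None for an all-missing row, else whether any present value is true."""
    present = [v for v in row if v is not None]
    if not present:
        return None  # mirrors the NA produced for rows that contain only NAs
    return any(present)


# Hypothetical input: None stands in for NA.
x = [
    [True, None, False],
    [True, True, None],
    [False, False, None],
    [None, None, None],
]

flags = [row_flag(r) for r in x]
print(flags)
```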

Find last true element of columns

This would do it

m(max(i.*reshape([1:numel(m)],size(m))))

Explanation

So we are generating an array of indices

reshape([1:numel(m)],size(m))

ans =

1 5 9 13
2 6 10 14
3 7 11 15
4 8 12 16

That represents the linear index of each value. Then we multiply that element-wise by i to keep only the indices we are interested in

i.*reshape([1:numel(m)],size(m))
ans =

1 0 0 13
0 6 0 0
0 0 0 15
0 8 12 0

Then we do a max on that since max works on columns. This will give us the last index in each column.

max(i.*reshape([1:numel(m)],size(m)))
ans =

1 8 12 15

Then apply those indices on m to get the values

m(max(i.*reshape([1:numel(m)],size(m))))
ans =

16 14 15 12
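
For comparison, the same result can be obtained with an explicit scan. This is a minimal Python sketch under two assumptions: `m` is taken to be the 4x4 magic square, which is consistent with the values shown above but is not stated in the answer, and the mask `i` is read off the intermediate output. The function name `last_true_per_column` is hypothetical.

```python
def last_true_per_column(m, mask):
    """For each column, return the value of m at the last row where mask is true."""
    n_rows, n_cols = len(m), len(m[0])
    out = []
    for c in range(n_cols):
        value = None  # stays None if the column has no true entry
        for r in range(n_rows - 1, -1, -1):  # scan from the bottom row upward
            if mask[r][c]:
                value = m[r][c]
                break
        out.append(value)
    return out


# Assumed data: the 4x4 magic square and the mask implied by the output above.
m = [[16, 2, 3, 13],
     [5, 11, 10, 8],
     [9, 7, 6, 12],
     [4, 14, 15, 1]]
mask = [[1, 0, 0, 1],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 1, 1, 0]]

print(last_true_per_column(m, mask))
```

Unlike the one-liner, this version returns None for a column with no true entry; in the one-liner such a column would yield index 0, which is not a valid MATLAB index.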

How to obtain a logical (T/F) result when I reach the last row of data in r?

You can try this:

1:nrow(df) == nrow(df)

To make it a function:

is.last <- function(data) {
  1:nrow(data) == nrow(data)
}

is.last(df)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE

Error in LDA(cdes, k = K, method = Gibbs, control = list(verbose = 25L, : Each row of the input matrix needs to contain at least one non-zero entry

It looks like some of your documents are empty, in the sense that they contain no counts of any feature.

You can remove them with:

cdes <- dfm_trim(df_des, min_docfreq = 2)
cdes <- dfm_subset(cdes, ntoken(cdes) > 0)

