find indices of non zero elements in matrix
which(X != 0, arr.ind = TRUE)
     row col
[1,]   1   1
[2,]   2   1
[3,]   1   3
[4,]   2   3
If arr.ind == TRUE and X is an array, the result is a matrix in which each row holds the indices (here row and col) of one element of X that satisfies the condition.
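For readers working in Python, a rough NumPy analogue of this call might look like the sketch below. The matrix X is a made-up example chosen to reproduce the output shown (the original X is not given), and the transpose and the + 1 are there only to mimic R's column-major order and 1-based indexing.

```python
import numpy as np

# Hypothetical 2x3 matrix; the original X is not shown in the answer.
X = np.array([[1, 0, 3],
              [2, 0, 4]])

# np.argwhere walks the array in row-major order and is 0-based.
zero_based = np.argwhere(X != 0)

# To mimic R's which(..., arr.ind = TRUE): scan column by column
# (transpose), swap the columns back to (row, col), and shift to 1-based.
r_style = np.argwhere(X.T != 0)[:, ::-1] + 1

print(r_style)
```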
Fast way to find indexes of nonzero entries for every row in a CSC matrix in Python
Sure. You're pretty close to having an ideal solution, but you're allocating some unnecessary arrays. Here's a faster way:
from scipy import sparse
import numpy as np
def my_impl(csc):
    csr = csc.tocsr()
    return np.split(csr.indices, csr.indptr[1:-1])
def your_impl(input):
    return [
        np.nonzero(row)[1]
        for row in sparse.csr_matrix(input)
    ]
## Results
# demo data
csc = sparse.random(15000, 5000, format="csc")
your_result = your_impl(csc)
my_result = my_impl(csc)
## Tests for correctness
# Same result
assert all(np.array_equal(x, y) for x, y in zip(your_result, my_result))
# Right number of rows
assert len(my_result) == csc.shape[0]
## Speed
%timeit my_impl(csc)
# 31 ms ± 1.26 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit your_impl(csc)
# 1.49 s ± 19.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Side question: why are you transposing the matrix? Wouldn't you then be getting the non-zero entries of the columns? If that's what you want, you don't even need to convert to CSR and can just run:
np.split(csc.indices, csc.indptr[1:-1])
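To see why splitting indices at indptr[1:-1] works, here is a small sketch on a made-up 3x3 matrix (the matrix and the variable names are assumptions for illustration). In CSR, indices stores the column index of every stored entry row by row, and indptr marks where each row starts and ends; CSC is the same layout with rows and columns swapped.

```python
import numpy as np
from scipy import sparse

# Hypothetical example matrix, small enough to check by eye.
dense = np.array([[0, 5, 0],
                  [7, 0, 0],
                  [0, 8, 9]])

csr = sparse.csr_matrix(dense)
# csr.indices -> [1, 0, 1, 2]   column index of each stored entry
# csr.indptr  -> [0, 1, 2, 4]   row r owns indices[indptr[r]:indptr[r + 1]]
rows_nz = np.split(csr.indices, csr.indptr[1:-1])

csc = sparse.csc_matrix(dense)
# Same idea per column: csc.indices holds row indices instead.
cols_nz = np.split(csc.indices, csc.indptr[1:-1])

print([r.tolist() for r in rows_nz])  # [[1], [0], [1, 2]]
print([c.tolist() for c in cols_nz])  # [[1], [0, 2], [2]]
```

Dropping the first and last entries of indptr leaves exactly the interior cut points, so np.split returns one array per row (or per column for CSC), including empty arrays for empty rows.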
Find indices of non-zero elements in array fortran
I'm not 100% sure I understand what you want, but does this do it?
ian@eris:~/work/stack$ cat pack.f90
Program pack_index
  Implicit None
  Integer, Dimension( 1:4 ) :: my_array = [ 160, 0, 230, 0 ]
  Integer, Dimension( : ), Allocatable :: choices
  Integer, Dimension( : ), Allocatable :: indices
  Integer :: i
  indices = Merge( 0, [ ( i, i = 1, Size( my_array ) ) ], my_array == 0 )
  choices = Pack( indices, indices /= 0 )
  Write( *, * ) choices
End Program pack_index
ian@eris:~/work/stack$ gfortran-8 -std=f2008 -fcheck=all pack.f90
ian@eris:~/work/stack$ ./a.out
1 3
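For comparison, the same two-step idea (mask the positions, then keep the non-zero ones) can be sketched in NumPy. This is only an illustration of the logic, not part of the Fortran answer; np.where stands in for Merge and boolean indexing stands in for Pack.

```python
import numpy as np

my_array = np.array([160, 0, 230, 0])

# Merge step: 0 where the element is zero, otherwise its 1-based position.
positions = np.arange(1, my_array.size + 1)
indices = np.where(my_array == 0, 0, positions)

# Pack step: keep only the non-zero entries of indices.
choices = indices[indices != 0]

print(choices)  # [1 3]
```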
Numpy: given the nonzero indices of a matrix how to extract the elements into a submatrix
It seems like you're looking to find the smallest region of your matrix that contains all the nonzero elements. If that's true, here's a method:
import numpy as np
def submatrix(arr):
    x, y = np.nonzero(arr)
    # Using the smallest and largest x and y indices of nonzero elements,
    # we can find the desired rectangular bounds.
    # And don't forget to add 1 to the top bound to avoid the fencepost problem.
    return arr[x.min():x.max()+1, y.min():y.max()+1]
test = np.array([[0, 0, 0, 0, 0, 0],
                 [0, 1, 1, 1, 1, 0],
                 [0, 0, 1, 1, 0, 0]])
print(submatrix(test))
# Result:
# [[1 1 1 1]
# [0 1 1 0]]
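One edge case worth noting: if the array has no nonzero elements, x.min() is called on an empty array and raises a ValueError. A guarded variant might look like the sketch below; the name submatrix_safe and the choice to return an empty 0x0 slice are assumptions, not part of the original answer.

```python
import numpy as np

def submatrix_safe(arr):
    """Hypothetical variant of submatrix() that tolerates all-zero input."""
    rows, cols = np.nonzero(arr)
    if rows.size == 0:
        # No nonzero entries: return an empty 0x0 slice instead of raising.
        return arr[:0, :0]
    # Same bounding-box slice as the original, with +1 on the upper bounds.
    return arr[rows.min():rows.max() + 1, cols.min():cols.max() + 1]

test = np.array([[0, 0, 0, 0, 0, 0],
                 [0, 1, 1, 1, 1, 0],
                 [0, 0, 1, 1, 0, 0]])
print(submatrix_safe(test))
print(submatrix_safe(np.zeros((3, 4))).shape)  # (0, 0)
```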
Matlab - Find indices of nearest non-zero element for every zero matrix element
There's an efficient bwdist function in the Image Processing Toolbox (IPT) that computes the distance transform:
M = [
0 1 0 0 0
2 5 0 3 0
0 0 0 0 0
0 5 0 2 1
];
[D,IDX] = bwdist(M~=0)
The result:
D =
1.0000 0 1.0000 1.0000 1.4142
0 0 1.0000 0 1.0000
1.0000 1.0000 1.4142 1.0000 1.0000
1.0000 0 1.0000 0 0
IDX =
2 5 5 14 14
2 6 6 14 14
2 6 6 14 20
8 8 8 16 20
The returned IDX contains the linear indices of the nearest nonzero value in M. It returns only one index per element, even when several nonzero elements are equally close.
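For readers without MATLAB, a comparable computation can be sketched in Python with scipy.ndimage.distance_transform_edt. Note the inverted convention: SciPy measures the distance to the nearest zero of its input, so the mask passed in is M == 0 rather than M ~= 0. The conversion to MATLAB-style linear indices is added here only for comparison, and tie-breaking between equally close elements may differ from bwdist.

```python
import numpy as np
from scipy import ndimage

M = np.array([[0, 1, 0, 0, 0],
              [2, 5, 0, 3, 0],
              [0, 0, 0, 0, 0],
              [0, 5, 0, 2, 1]])

# distance_transform_edt measures the distance to the nearest zero of its
# input, so pass (M == 0): the nonzero entries of M become the "zeros".
D, (nearest_row, nearest_col) = ndimage.distance_transform_edt(
    M == 0, return_indices=True
)

# Convert 0-based (row, col) pairs to MATLAB-style linear indices
# (1-based, column-major) for comparison with IDX.
IDX = nearest_row + nearest_col * M.shape[0] + 1

print(np.round(D, 4))
print(IDX)
```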
Find indices of non-zero elements from [1,2,0,0,4,0] in Julia and create an Array with them
Here's one alternative:
function myfind(c)
    a = similar(c, Int)
    count = 1
    @inbounds for i in eachindex(c)
        a[count] = i
        count += (c[i] != zero(eltype(c)))
    end
    return resize!(a, count-1)
end
It actually outperformed find for all the cases I tested, though for the very small example vector you posted the difference was negligible. There is perhaps some performance advantage in avoiding the branch and in not having to grow the index array dynamically.
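The write-then-advance logic of that loop can be sketched in plain Python to make the trick easier to follow. This only illustrates the control flow (every index is written unconditionally, and the write position advances only when the element is non-zero); it says nothing about Python performance, where np.flatnonzero would be the idiomatic choice. Indices here are 0-based, unlike Julia's.

```python
def myfind(c):
    # Preallocate the output at full length so it never has to grow.
    a = [0] * len(c)
    count = 0
    for i, value in enumerate(c):
        # Always write the current index; it is overwritten on the next
        # iteration unless count moves forward.
        a[count] = i
        # Adding the boolean advances count only for non-zero elements,
        # so no if-branch is needed.
        count += (value != 0)
    # Trim the unused tail.
    return a[:count]

print(myfind([1, 2, 0, 0, 4, 0]))  # [0, 1, 4]
```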
How to create a vector where indices of non-zero elements follow a distribution
This is existing functionality in numpy (choice)
import numpy as np
from scipy import stats
N = 40
K = 11
Your vague description of the distribution you want is not adequate, so I'm just going to use a normal probability distribution with a mean of N/2 and a standard deviation of sqrt(N/2).
center = int(N / 2)
scale = np.sqrt(N / 2)
Create a probability vector from the probability density function for each possible index (0 through N - 1):
p = stats.norm(loc=center, scale=scale).pdf(np.arange(N))
Make sure it sums to 1:
p /= np.sum(p)
Initialize a random number generator and call .choice() on the possible indices, with the probability distribution p, setting replace to False:
rng = np.random.default_rng()
nz_indices = rng.choice(np.arange(N), size=K, p=p, replace=False)
>>> nz_indices
array([27, 20, 23, 19, 16, 24, 13, 25, 26, 22, 21])
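The answer stops at the index array. To finish the task in the title, the indices can be used to fill a zero vector, as in the sketch below. The fixed seed and the placeholder value 1.0 for the non-zero entries are assumptions; the question does not say what the non-zero values should be.

```python
import numpy as np
from scipy import stats

N, K = 40, 11

# Same probability vector as above: normal pdf evaluated at each index.
p = stats.norm(loc=N / 2, scale=np.sqrt(N / 2)).pdf(np.arange(N))
p /= p.sum()

rng = np.random.default_rng(0)  # seed is an assumption, for reproducibility
nz_indices = rng.choice(np.arange(N), size=K, p=p, replace=False)

# Start from all zeros and set the sampled positions to a non-zero value.
vec = np.zeros(N)
vec[nz_indices] = 1.0  # placeholder value (assumption)

print(np.flatnonzero(vec))
```

Because replace=False guarantees K distinct indices, the resulting vector has exactly K non-zero elements.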
How to find indices of non zero elements in large sparse matrix?
Since you have two dense matrices, the double for loop is the only option you have. You don't need a sparse matrix class at all, since you only want to know the list of indices (i,j) for which a[i,j] != b[i,j].
In languages like R and Python, the double for loop will perform poorly. I'd probably write the double for loop in native code and append the indices to a list object. But no doubt the wizards of interpreted languages (e.g. R, Python) know efficient ways to do it without resorting to native code.
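In Python, one such way is to let NumPy do the comparison, so that the double loop runs in compiled code rather than in the interpreter. A minimal sketch, with two made-up dense matrices a and b:

```python
import numpy as np

# Hypothetical dense inputs of the same shape.
a = np.array([[1, 2, 3],
              [4, 5, 6]])
b = np.array([[1, 0, 3],
              [4, 5, 9]])

# a != b is an elementwise boolean mask; argwhere lists the (i, j)
# positions where it is True, in row-major order (0-based).
mismatch = np.argwhere(a != b)

print(mismatch)
# [[0 1]
#  [1 2]]
```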