Selecting Specific Rows and Columns from Numpy Array - ITCodar

# Selecting Specific Rows and Columns from Numpy Array

## Selecting specific rows and columns from NumPy array

Fancy indexing pairs the index arrays element by element, so the arrays you pass for each dimension must have matching (or broadcastable) shapes. You are providing 3 indices for the first dimension and only 2 for the second, hence the error. You want to do something like this:

>>> a[[[0, 0], [1, 1], [3, 3]], [[0,2], [0,2], [0, 2]]]
array([[ 0,  2],
       [ 4,  6],
       [12, 14]])

That is of course a pain to write, so you can let broadcasting help you:

>>> a[[[0], [1], [3]], [0, 2]]
array([[ 0,  2],
       [ 4,  6],
       [12, 14]])

This is much simpler to do if you index with arrays, not lists:

>>> row_idx = np.array([0, 1, 3])
>>> col_idx = np.array([0, 2])
>>> a[row_idx[:, None], col_idx]
array([[ 0,  2],
       [ 4,  6],
       [12, 14]])
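The array `a` is not defined in the answer above; the outputs are consistent with a 5x4 `np.arange(20)` array, so the following sketch assumes that. It reproduces the shape-mismatch error and then the broadcasting fix.

```python
import numpy as np

# Assumed example array (not given in the text): 5 rows, 4 columns.
a = np.arange(20).reshape(5, 4)

rows = [0, 1, 3]
cols = [0, 2]

# Index shapes (3,) and (2,) cannot be broadcast together, so this fails.
try:
    a[rows, cols]
    mismatch_error = None
except IndexError as exc:
    mismatch_error = exc

# A (3, 1) column of row indices broadcasts against the (2,) column
# indices, giving a (3, 2) result with every row/column combination.
row_idx = np.array(rows)
col_idx = np.array(cols)
sub = a[row_idx[:, None], col_idx]

# Alternative: index rows first, then columns (creates an intermediate copy).
sub_chained = a[rows][:, cols]

print(sub)
```

Both `sub` and `sub_chained` hold the same values; the broadcasting form avoids the intermediate array.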

## Choosing specific rows and columns from numpy array

import numpy as np
a = np.arange(676).reshape((26,26))

First we need to define which rows we want:

index = np.arange(a.shape[0]) != 14 # all rows but the 15th row

We can use the same index for the columns, since we are selecting the same rows and columns and a is a square matrix.

Now we can use the np.ix_ function to express that we want every combination of the selected rows and columns.

a[np.ix_(index, index)]  # result has shape (25, 25)

Note that a[index, index] won't work: the two index arrays are paired element by element, so only the diagonal elements are selected and the result is a 1-D array of length 25, not a 25x25 matrix.
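The difference between the two forms is easier to see on a small array. This is a minimal sketch using a 4x4 array and a hypothetical mask name `keep` instead of the 26x26 example.

```python
import numpy as np

a = np.arange(16).reshape(4, 4)

# Boolean mask that drops the third row/column (index 2).
keep = np.arange(a.shape[0]) != 2

# np.ix_ builds an open mesh, so every kept row meets every kept column.
block = a[np.ix_(keep, keep)]

# Passing the mask twice pairs the indices element by element instead,
# which selects only a[0, 0], a[1, 1] and a[3, 3].
diag = a[keep, keep]

print(block.shape, diag.shape)
```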

## How to select specific rows and columns of a double array in numpy to form a submatrix?

When you index with two broadcastable index arrays, as in array[arr_1, arr_2], NumPy pairs each element of arr_1 with the corresponding element of arr_2 and selects the element of array at that (row, column) position. If you instead want all rows in arr_1 combined with all columns in arr_2, the most elegant way is np.ix_. The code would be:

ab[np.ix_(np.array([0,2]),np.array([1,3]))]

output:

[[2 4]
 [4 3]]

About np.ix_, from the NumPy docs: "This function takes N 1-D sequences and returns N outputs with N dimensions each, such that the shape is 1 in all but one dimension and the dimension with the non-unit shape value cycles through all N dimensions."

This means you can extend the approach to arrays of any dimension. For an N-dimensional array, calling np.ix_(arr_1, arr_2, ..., arr_N) creates N index arrays that broadcast against each other, so the result contains every combination of the positions in arr_i along dimension i.
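To illustrate the N-dimensional case, here is a small sketch on a 3-D array. The name `cube` and the chosen indices are assumptions for illustration only.

```python
import numpy as np

cube = np.arange(24).reshape(2, 3, 4)

# One index list per axis; np.ix_ reshapes them so they broadcast.
i, j, k = np.ix_([0, 1], [0, 2], [1, 3])
print(i.shape, j.shape, k.shape)  # each has size > 1 along one axis only

# The result has one entry for every (i, j, k) combination.
sub = cube[i, j, k]
print(sub.shape)
```

Each returned index array has length 1 along all axes but its own, which is what lets the three of them broadcast to a full 2x2x2 selection.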

## Make a numpy array selecting only certain rows and certain columns from another

Something like this:

import numpy as np

first_array = np.random.rand(500,1000)
row_factor = 10
row_start = 1
col_factor = 10
col_start = 1
second_array = first_array[row_start:-1:row_factor,col_start:-1:col_factor]
print(second_array.shape)

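This is plain slicing with a step: starting at row_start and col_start, it takes every row_factor-th row and every col_factor-th column. Note that a stop of -1 excludes the last row and column; leave the stop empty (row_start::row_factor) if the slice should run to the end.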
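The effect of the stop value can be checked on a small array. This sketch uses a hypothetical 5x6 array named `grid` rather than the 500x1000 random array.

```python
import numpy as np

grid = np.arange(30).reshape(5, 6)

# Stop of -1: the last row and the last column are never considered.
with_stop = grid[1:-1:2, 1:-1:2]

# Open-ended stop: the slice runs through the final row and column.
open_ended = grid[1::2, 1::2]

print(with_stop.shape, open_ended.shape)

# Basic slicing returns a view, so no data is copied.
print(np.shares_memory(open_ended, grid))
```

Because the result is a view, writing to it also changes the original array; call `.copy()` if an independent array is needed.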

## How to extract slices and specific columns of a numpy array with one command?

As ombk suggested, you can use np.r_.
It concatenates slice expressions and scalars into a single index array, which makes it a good fit here.

In your case:

A[:, np.r_[0:3, 4]]

retrieves the intended part of your array.

You can concatenate more slice expressions in the same way.
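As a sketch of combining several slices, assuming a hypothetical 3x8 array `A` (the original `A` is not shown):

```python
import numpy as np

A = np.arange(24).reshape(3, 8)

# Two slices and one scalar, concatenated into one index array.
cols = np.r_[0:3, 4, 6:8]
print(cols)

# All rows, only the selected columns.
picked = A[:, cols]
print(picked.shape)
```

Since `cols` is an integer array, this is fancy indexing and `picked` is a copy, not a view.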

## Extracting multiple sets of rows/ columns from a 2D numpy array

IIUC, you can use numpy.r_ to generate the indices from the slice:

img[np.r_[0,2:4][:,None],2]

output:

array([[ 2],
       [12],
       [17]])

intermediates:

np.r_[0,2:4]
# array([0, 2, 3])

np.r_[0,2:4][:,None] # variant: np.c_[np.r_[0,2:4]]
# array([[0],
#        [2],
#        [3]])
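The same idea extends to several sets of rows and several sets of columns at once by feeding two np.r_ results to np.ix_. The sketch below assumes `img` is a 5x5 `np.arange(25)` array, which is consistent with the output shown above but is not stated in the answer.

```python
import numpy as np

# Assumed input: 5x5 array with values 0..24.
img = np.arange(25).reshape(5, 5)

rows = np.r_[0, 2:4]   # rows 0, 2, 3
cols = np.r_[1:3, 4]   # columns 1, 2, 4

# Single column, as in the answer: a (3, 1) result.
single = img[rows[:, None], 2]

# Several columns: every selected row against every selected column.
patch = img[np.ix_(rows, cols)]

print(single)
print(patch)
```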

## How to select specific row column pairs in numpy array which have specific value?

Get the indices along the first two axes that match the criterion by applying np.nonzero (or np.where) to the comparison mask, then index with those integer arrays -

r,c = np.nonzero(x>0.3)
out = x[r,c]

If you are looking to get those indices as a list of tuples, zip them -

zip(r,c)

To get those starting from 1, add 1 and then zip -

zip(r+1,c+1)

On Python 3.x, zip returns an iterator, so you would need to wrap it with list(): list(zip(r,c)) and list(zip(r+1,c+1)). The sample run below shows Python 2 output.

Sample run -

In [9]: x
Out[9]:
array([[ 0.11874238,  0.71885484,  0.33656161],
       [ 0.69432263,  0.25234083,  0.66118676],
       [ 0.77542651,  0.71230397,  0.76212491]])

In [10]: r,c = np.nonzero(x>0.3)

In [14]: zip(r,c)
Out[14]: [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1), (2, 2)]

In [18]: zip(r+1,c+1)
Out[18]: [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2), (3, 3)]

In [13]: x[r,c]
Out[13]:
array([ 0.71885484,  0.33656161,  0.69432263,  0.66118676,  0.77542651,
        0.71230397,  0.76212491])
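For reference, the same steps as a Python 3 script, using the values from the sample run. The names `pairs` and `pairs_1based` are introduced here for illustration.

```python
import numpy as np

x = np.array([[0.11874238, 0.71885484, 0.33656161],
              [0.69432263, 0.25234083, 0.66118676],
              [0.77542651, 0.71230397, 0.76212491]])

# Row and column indices of every element above the threshold.
r, c = np.nonzero(x > 0.3)

# zip is lazy on Python 3; tolist() also converts NumPy ints to plain ints.
pairs = list(zip(r.tolist(), c.tolist()))
pairs_1based = [(i + 1, j + 1) for i, j in pairs]

# Matching values, in the same row-major order as the index pairs.
values = x[r, c]

print(pairs)
print(pairs_1based)
print(values)
```

If only the values are needed, `x[x > 0.3]` returns the same array without computing the indices first.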

Writing indices to file -

Use np.savetxt with int format, like so -

In [69]: np.savetxt("output.txt", np.argwhere(x>0.3), fmt="%d", comments='')

In [70]: !cat output.txt
0 1
0 2
1 0
1 2
2 0
2 1
2 2

For 1-based indexing, add 1 to the np.argwhere output -

In [71]: np.savetxt("output.txt", np.argwhere(x>0.3)+1, fmt="%d", comments='')

In [72]: !cat output.txt
1 2
1 3
2 1
2 3
3 1
3 2
3 3
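Outside IPython, where `!cat` is not available, the written file can be checked by reading it back. This is a minimal sketch that writes to a temporary directory; the path handling is an assumption and not part of the original answer.

```python
import os
import tempfile

import numpy as np

x = np.array([[0.11874238, 0.71885484, 0.33656161],
              [0.69432263, 0.25234083, 0.66118676],
              [0.77542651, 0.71230397, 0.76212491]])

# One (row, column) pair per line, shifted to 1-based indexing.
idx = np.argwhere(x > 0.3) + 1

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.txt")
    np.savetxt(path, idx, fmt="%d")
    # Read the file back to confirm the round trip.
    loaded = np.loadtxt(path, dtype=int)

print(loaded)
```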
