Python Array Slice with Comma

What's the comma (,) doing in the middle of Python list slicing?

NumPy arrays can have multiple dimensions, and the comma separates the index for each one. In your case that's a 2D slice: the part before the comma slices the first dimension (rows), and the part after the comma slices the second dimension (columns). This implies the data is 2D or greater, and indeed loadtxt() produces a 2D array when the file contains several rows and columns.
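As a rough illustration of the point above, here is a minimal sketch that loads a small table with loadtxt() and slices it along both dimensions. The three-column text and the variable names are made up for the example; an in-memory io.StringIO stands in for a real file.

```python
import io
import numpy as np

# Hypothetical three-column text data standing in for a file on disk.
text = io.StringIO("1 10 100\n2 20 200\n3 30 300\n")
data = np.loadtxt(text)      # 2D float array with 3 rows and 3 columns

first_col = data[:, 0]       # every row, column 0
top_right = data[:2, 1:]     # rows 0-1, columns 1-2

print(data.shape)
print(first_col)
print(top_right)
```

The part before the comma always applies to rows and the part after it to columns, regardless of how each part is written (a single integer, a slice, or a bare ':').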

Ref: https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.indexing.html

Slice Notation with Comma , Numpy - Python

Consider the numpy array

import numpy as np

myArr = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

For a general 2D array, arr[a:b, c:d] selects rows from index a to index b-1 and columns from index c to index d-1.

myArr[1:2, 1:2] therefore selects rows from index 1 to index 2-1 (i.e. 1) and columns from index 1 to index 2-1 (i.e. 1). In other words, we get the element at row 1 and column 1, which is 5 in our case. Because both indices are slices, it comes back as a 1x1 array, array([[5]]), rather than as a bare number.

Remember that both the row index and the column index start from 0.

In myArr[:,1:3], the ':' means that we take all rows, together with the columns from index 1 to (3-1), i.e. 2.

In this case our output will be

array([[2, 3],
       [5, 6],
       [8, 9]])

We got all the rows by using ':' in the first position, and we got the 2nd (index 1) and 3rd (index 2) columns.

Likewise, using ':' in the column position fetches all the columns.
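To make the cases above concrete, here is a small sketch on the same myArr. It contrasts a slice on both sides of the comma with plain integer indices, and shows ':' used on the column side. The names sub, elem and rows are just for illustration.

```python
import numpy as np

myArr = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

sub = myArr[1:2, 1:2]    # slices keep both dimensions: a 1x1 array holding 5
elem = myArr[1, 1]       # integer indices drop them: the scalar 5
rows = myArr[1:3, :]     # rows 1 and 2, all columns

print(sub.shape)
print(elem)
print(rows)
```

If you want the number itself rather than a tiny 2D array, use integer indices on both sides of the comma.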

Python Slice Notation with Comma/List

Assuming that the object is really a numpy array, this is known as advanced indexing, and picks out the specified columns:

>>> import numpy as np
>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> a[:, [1,2,3]]
array([[ 1,  2,  3],
       [ 5,  6,  7],
       [ 9, 10, 11]])
>>> a[:, [1,3]]
array([[ 1,  3],
       [ 5,  7],
       [ 9, 11]])

Note that this won't work with the standard Python list:

>>> a.tolist()
[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
>>> a.tolist()[:,[1,2,3]]
Traceback (most recent call last):
  File "<ipython-input-17-7d77de02047a>", line 1, in <module>
    a.tolist()[:,[1,2,3]]
TypeError: list indices must be integers, not tuple
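If you are stuck with a plain list of lists, one possible workaround (a sketch, not the only way) is a nested list comprehension that picks the wanted columns out of each row. The names rows, cols and picked are made up for the example.

```python
rows = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
cols = [1, 3]  # column indices to keep

# For every row, keep only the entries at the requested column positions.
picked = [[row[c] for c in cols] for row in rows]

print(picked)
```

This does the same job as a[:, [1,3]] on the array, just more verbosely and without NumPy's speed.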

Understanding slicing

The syntax is:

a[start:stop]  # items start through stop-1
a[start:]      # items start through the rest of the array
a[:stop]       # items from the beginning through stop-1
a[:]           # a copy of the whole array

There is also the step value, which can be used with any of the above:

a[start:stop:step] # start through not past stop, by step

The key point to remember is that the :stop value represents the first value that is not in the selected slice. So, the difference between stop and start is the number of elements selected (if step is 1, the default).
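A quick sketch of that rule on a plain list, with arbitrary values chosen for start and stop. For steps other than 1, one way to get the count is len(range(start, stop, step)), which follows the same arithmetic as the slice when the indices are non-negative and within the list.

```python
a = list(range(10))          # [0, 1, ..., 9]
start, stop = 2, 7

chunk = a[start:stop]        # items at indices 2 through 6
first_excluded = a[stop]     # the first item that is NOT in the slice

# With a step other than 1, range() with the same arguments gives the count.
stepped = a[start:stop:2]

print(chunk, len(chunk) == stop - start)
print(first_excluded)
print(stepped, len(stepped) == len(range(start, stop, 2)))
```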

The other feature is that start or stop may be a negative number, which means it counts from the end of the array instead of the beginning. So:

a[-1]    # last item in the array
a[-2:]   # last two items in the array
a[:-2]   # everything except the last two items

Similarly, step may be a negative number:

a[::-1]    # all items in the array, reversed
a[1::-1]   # the first two items, reversed
a[:-3:-1]  # the last two items, reversed
a[-3::-1]  # everything except the last two items, reversed

Python is kind to the programmer if there are fewer items than you ask for. For example, if you ask for a[:-2] and a only contains one element, you get an empty list instead of an error. Sometimes you would prefer the error, so you have to be aware that this may happen.
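Here is a small sketch of that forgiving behaviour, next to the stricter behaviour of a plain integer index. If you would rather have the error, an explicit length check or an integer index is one way to get it; the variable names are illustrative only.

```python
a = [42]

empty_1 = a[:-2]     # slice reaches past the start: empty list, no error
empty_2 = a[5:10]    # slice entirely out of range: also an empty list

# A plain integer index is strict and raises instead of returning nothing.
try:
    a[5]
    raised = False
except IndexError:
    raised = True

print(empty_1, empty_2, raised)
```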

Relationship with the slice object

A slice object can represent a slicing operation, i.e.:

a[start:stop:step]

is equivalent to:

a[slice(start, stop, step)]

Slice objects also behave slightly differently depending on the number of arguments, similarly to range(), i.e. both slice(stop) and slice(start, stop[, step]) are supported.
To skip specifying a given argument, one might use None, so that e.g. a[start:] is equivalent to a[slice(start, None)] or a[::-1] is equivalent to a[slice(None, None, -1)].

While the :-based notation is very helpful for simple slicing, the explicit use of slice() objects simplifies the programmatic generation of slicing.
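As one example of generating slices programmatically, the sketch below builds slice objects in a loop to cut a sequence into fixed-size blocks. The helper chunk_slices is hypothetical, written only for this illustration.

```python
def chunk_slices(n_items, size):
    """Yield slice objects that cover range(n_items) in blocks of `size`."""
    for start in range(0, n_items, size):
        yield slice(start, start + size)


data = list("abcdefg")

# Each slice object is used exactly like the a[start:stop] notation.
chunks = [data[s] for s in chunk_slices(len(data), 3)]

print(chunks)
```

The last block is shorter than the others because a slice that runs past the end is simply truncated.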

Numpy: What is the difference between slicing with brackets and with comma?

Completely non-technical explanation follows...

When you do:

gtfb_fft_hypercube[0]

you have kind of "looked up and extracted" the first element along the first axis of gtfb_fft_hypercube, and it is a new "thing/entity" with shape (42, 96, 1026).

When you then do:

gtfb_fft_hypercube[0][0]

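you are then taking the first element (by indexing) of that new thing. The second [0] is applied to the intermediate result, not to your original gtfb_fft_hypercube; it just indexes whatever list-like thing the first step returned. (With NumPy the intermediate result is usually a view that still shares memory with the original, but it is a separate array object.)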


It might be easier with words, but the principle is the same:

sentence = ["The","cat","sat","on","the","mat"]

# Get first word - by looking it up in sentence
firstWord = sentence[0] # firstWord = "The"

# Get first letter of first word
firstLetter = firstWord[0] # firstLetter = "T"

But now we can't refer to sentence or firstWord via firstLetter because it is a new, detached thing.
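For a slightly more technical view, here is a sketch on a small stand-in array (the shape (2, 3, 4) is arbitrary and much smaller than the hypercube in the question). With integer indices the two spellings return the same values, but as soon as a slice is involved, chained brackets and a comma select different things.

```python
import numpy as np

cube = np.arange(24).reshape(2, 3, 4)   # small stand-in for the hypercube

# Integer indices: two steps and one step give the same values.
same_values = np.array_equal(cube[0][0], cube[0, 0])

# With a slice in front the results differ.
two_step = cube[:][0]    # cube[:] is the whole array, then [0] takes block 0
one_step = cube[:, 0]    # row 0 of every block

# In NumPy the intermediate result is a view onto the same memory.
shares = np.shares_memory(cube[0], cube)

print(same_values, two_step.shape, one_step.shape, shares)
```

The comma form hands all the indices to NumPy in one call, which is why it is generally the form to prefer when indexing several dimensions at once.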

How is it possible for Numpy to use comma-separated subscripting with `:`?

Define a simple class with a __getitem__ indexing method:

In [128]: class Foo():
     ...:     def __getitem__(self, arg):
     ...:         print(type(arg), arg)
     ...:
In [129]: f = Foo()

And look at what different indexes produce:

In [130]: f[:]
<class 'slice'> slice(None, None, None)
In [131]: f[1:2:3]
<class 'slice'> slice(1, 2, 3)
In [132]: f[:, [1,2,3]]
<class 'tuple'> (slice(None, None, None), [1, 2, 3])
In [133]: f[:, :3]
<class 'tuple'> (slice(None, None, None), slice(None, 3, None))
In [134]: f[(slice(1,None),3)]
<class 'tuple'> (slice(1, None, None), 3)

For builtin classes like list, a tuple argument raises an error. But that's a class dependent issue, not a syntax one. numpy.ndarray accepts a tuple, as long as it's compatible with its shape.
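To show that accepting a tuple is purely up to the class, here is a minimal sketch of a hypothetical Grid wrapper around a list of lists that understands grid[row, col]. It only handles integers and slices, and is meant as an illustration of the mechanism, not as a model of how numpy.ndarray is implemented.

```python
class Grid:
    """Hypothetical 2D container that accepts grid[row_key, col_key]."""

    def __init__(self, rows):
        self.rows = rows

    def __getitem__(self, key):
        # A bare grid[i] arrives as a single object; treat it as grid[i, :].
        if not isinstance(key, tuple):
            key = (key, slice(None))
        row_key, col_key = key
        selected = self.rows[row_key]
        if isinstance(row_key, int):
            # One row was selected: apply the column key to it directly.
            return selected[col_key]
        # Several rows were selected: apply the column key to each of them.
        return [row[col_key] for row in selected]


g = Grid([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(g[1, 1])
print(g[:, 1:3])
print(g[1:, 0])
print(g[0])
```

The interpreter does nothing special here beyond packing the comma-separated parts into a tuple; everything else is ordinary code inside __getitem__.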

The syntax for a tuple index was added to Python to meet the needs of numpy. I don't think there are any builtin classes that use it.

The numpy.lib.index_tricks module (index_tricks.py) has several classes that take advantage of this behavior. Look at its code for more ideas.

In [137]: np.s_[3:]
Out[137]: slice(3, None, None)
In [139]: np.r_['0,2,1',[1,2,3],[4,5,6]]
Out[139]:
array([[1, 2, 3],
       [4, 5, 6]])
In [140]: np.c_[[1,2,3],[4,5,6]]
Out[140]:
array([[1, 4],
       [2, 5],
       [3, 6]])
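As a small follow-up on np.s_, the sketch below shows that it also works with a comma, in which case it returns the same tuple of slices that __getitem__ would receive. That tuple can be stored and reused as an index. The array and the names idx and picked are just for illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# np.s_ simply hands back the index expression it was given.
idx = np.s_[1:, ::2]     # a tuple of two slice objects
picked = a[idx]          # same result as a[1:, ::2]

print(idx)
print(picked)
```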

Other "indexing" examples:

In [141]: f[...]
<class 'ellipsis'> Ellipsis
In [142]: f[[1,2,3]]
<class 'list'> [1, 2, 3]
In [143]: f[10]
<class 'int'> 10
In [144]: f[{1:12}]
<class 'dict'> {1: 12}

I don't know of any class that makes use of a dict argument, but the syntax allows it.


