What's the comma (,) doing in the middle of Python list slicing?
NumPy supports multiple dimensions. In your case that's a 2D slice: the part before the comma slices the first dimension, and the part after the comma slices the second dimension. This implies the data is 2D or greater, and indeed loadtxt() does produce 2D arrays when the input has several rows and columns.
Ref: https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.indexing.html
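As a rough illustration of the point above, here is a minimal sketch that loads a small table and slices it along both dimensions. The in-memory text is a hypothetical stand-in for a real file, and the variable names are made up for this example.

```python
import io
import numpy as np

# Hypothetical sample data standing in for a text file on disk.
text = io.StringIO("1 2 3\n4 5 6\n")
data = np.loadtxt(text)       # 2 rows x 3 columns, float dtype

first_column = data[:, 0]     # every row, column 0 only
rest = data[:, 1:]            # every row, columns 1 onward

print(data.shape)             # (2, 3)
print(first_column.tolist())  # [1.0, 4.0]
print(rest.tolist())          # [[2.0, 3.0], [5.0, 6.0]]
```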
Slice Notation with Comma , Numpy - Python
Consider the numpy array
import numpy as np
myArr = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])
For any array, arr[a:b, c:d]
selects the rows from index a to index b-1 and the columns from index c to index d-1. For example,
myArr[1:2, 1:2]
selects the rows from index 1 to index 2-1 (i.e. 1) and the columns from index 1 to index 2-1 (i.e. 1). In other words, we get the element at row 1 and column 1, which is 5 in our case. Because both positions are slices, the result is the 1x1 array [[5]] rather than a bare number.
Remember that both the row index and the column index start from 0.
myArr[:,1:3]
Here ':' in the first position selects all rows, and 1:3 selects the columns from index 1 to index 3-1, i.e. 2.
In this case the output will be
array([[2, 3],
       [5, 6],
       [8, 9]])
We got all the rows by using ':' in the first position, together with the 2nd (index 1) and 3rd (index 2) columns.
Likewise, using ':' in the column position will fetch all the columns.
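To make the difference between slicing and plain integer indexing concrete, here is a small sketch using the same myArr. The names block, scalar and rows are only illustrative.

```python
import numpy as np

myArr = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

block = myArr[1:2, 1:2]   # slices keep both dimensions
scalar = myArr[1, 1]      # integer indices drop them
rows = myArr[1:3, :]      # rows 1 and 2, all columns

print(block.tolist())     # [[5]]
print(block.shape)        # (1, 1)
print(int(scalar))        # 5
print(rows.tolist())      # [[4, 5, 6], [7, 8, 9]]
```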
Python Slice Notation with Comma/List
Assuming that the object is really a numpy array, this is known as advanced indexing, and picks out the specified columns:
>>> import numpy as np
>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> a[:, [1,2,3]]
array([[ 1,  2,  3],
       [ 5,  6,  7],
       [ 9, 10, 11]])
>>> a[:, [1,3]]
array([[ 1,  3],
       [ 5,  7],
       [ 9, 11]])
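One practical consequence worth sketching: a plain slice returns a view onto the original data, while indexing with a list returns a copy. The snippet below checks this with np.shares_memory; the variable names are illustrative only.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

sliced = a[:, 1:3]      # basic slicing: a view onto a
picked = a[:, [1, 2]]   # advanced (list) indexing: a new array

# Same values either way...
print(np.array_equal(sliced, picked))   # True
# ...but only the slice shares memory with a.
print(np.shares_memory(a, sliced))      # True
print(np.shares_memory(a, picked))      # False
```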
Note that this won't work with the standard Python list:
>>> a.tolist()
[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
>>> a.tolist()[:,[1,2,3]]
Traceback (most recent call last):
  File "<ipython-input-17-7d77de02047a>", line 1, in <module>
    a.tolist()[:,[1,2,3]]
TypeError: list indices must be integers, not tuple
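If the data really is a list of lists, one possible workaround (a sketch, not the only option) is a nested list comprehension that picks the wanted columns from each row. The names rows, cols and picked are made up for this example.

```python
# Plain nested lists, as produced by a.tolist() above.
rows = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
cols = [1, 3]   # column positions to keep

# For every row, collect the items at the requested column positions.
picked = [[row[c] for c in cols] for row in rows]

print(picked)   # [[1, 3], [5, 7], [9, 11]]
```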
Understanding slicing
The syntax is:
a[start:stop] # items start through stop-1
a[start:] # items start through the rest of the array
a[:stop] # items from the beginning through stop-1
a[:] # a copy of the whole array
There is also the step value, which can be used with any of the above:
a[start:stop:step] # start through not past stop, by step
The key point to remember is that the :stop value represents the first value that is not in the selected slice. So, the difference between stop and start is the number of elements selected (if step is 1, the default).
The other feature is that start or stop may be a negative number, which means it counts from the end of the array instead of the beginning. So:
a[-1] # last item in the array
a[-2:] # last two items in the array
a[:-2] # everything except the last two items
Similarly, step may be a negative number:
a[::-1] # all items in the array, reversed
a[1::-1] # the first two items, reversed
a[:-3:-1] # the last two items, reversed
a[-3::-1] # everything except the last two items, reversed
Python is kind to the programmer if there are fewer items than you ask for. For example, if you ask for a[:-2] and a only contains one element, you get an empty list instead of an error. Sometimes you would prefer the error, so you have to be aware that this may happen.
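The rules above can be checked on a small list. This is just a worked example with made-up values.

```python
a = [10, 20, 30, 40, 50]

middle = a[1:4]           # indices 1, 2, 3; stop - start = 3 items
last_two = a[-2:]         # negative start counts from the end
every_other = a[::2]      # step of 2
reversed_all = a[::-1]    # negative step walks backwards
first_two_rev = a[1::-1]  # start at index 1, walk back to the beginning
tail_rev = a[-3::-1]      # everything except the last two, reversed

# Out-of-range bounds are clipped instead of raising an error.
clipped = a[2:100]
empty = [10][:-2]

print(middle)         # [20, 30, 40]
print(last_two)       # [40, 50]
print(every_other)    # [10, 30, 50]
print(reversed_all)   # [50, 40, 30, 20, 10]
print(first_two_rev)  # [20, 10]
print(tail_rev)       # [30, 20, 10]
print(clipped)        # [30, 40, 50]
print(empty)          # []
```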
Relationship with the slice object
A slice object can represent a slicing operation, i.e.:
a[start:stop:step]
is equivalent to:
a[slice(start, stop, step)]
Slice objects also behave slightly differently depending on the number of arguments, similarly to range(), i.e. both slice(stop) and slice(start, stop[, step]) are supported.
To skip specifying a given argument, one might use None, so that e.g. a[start:] is equivalent to a[slice(start, None)] or a[::-1] is equivalent to a[slice(None, None, -1)].
While the :-based notation is very helpful for simple slicing, the explicit use of slice() objects simplifies the programmatic generation of slicing.
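As one possible illustration of generating slices programmatically, the sketch below stores named slice objects in a dict and applies them to a fixed-width record. The record layout and the FIELDS name are hypothetical, chosen only for this example.

```python
# A hypothetical fixed-width log line.
record = "2024-01-15 INFO  started"

# Named slice objects; None means "until the end", as in record[17:].
FIELDS = {
    "date": slice(0, 10),
    "level": slice(11, 15),
    "message": slice(17, None),
}

# Apply each stored slice to the record.
parsed = {name: record[s] for name, s in FIELDS.items()}

print(parsed)
# {'date': '2024-01-15', 'level': 'INFO', 'message': 'started'}

# The explicit form matches the colon notation.
a = [1, 2, 3, 4]
print(a[slice(None, None, -1)] == a[::-1])   # True
```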
Numpy: What is the difference between slicing with brackets and with comma?
Completely non-technical explanation follows...
When you do:
gtfb_fft_hypercube[0]
you have kind of "looked up and extracted" the first entry along the first axis of gtfb_fft_hypercube and it is a new "thing/entity" with shape (42, 96, 1026).
When you then do:
gtfb_fft_hypercube[0][0]
you are then taking the first element (by indexing) of that new thing; as far as the second lookup is concerned, it's no longer part of your original gtfb_fft_hypercube, it is just a portion taken from a list-like thing.
It might be easier with words, but the principle is the same:
sentence = ["The","cat","sat","on","the","mat"]
# Get first word - by looking it up in sentence
firstWord = sentence[0] # firstWord = "The"
# Get first letter of first word
firstLetter = firstWord[0] # firstLetter = "T"
But now we can't refer to sentence or firstWord via firstLetter because it is a new, detached thing.
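In NumPy terms, the two spellings return the same data for plain integer indices, and the intermediate from a plain integer index is a view rather than a copy. The difference starts to matter when the first step uses a list index and you assign through it. The sketch below uses a small hypothetical stand-in array instead of the real hypercube.

```python
import numpy as np

cube = np.zeros((2, 3, 4))   # small hypothetical stand-in for the hypercube

# For plain integer indices both spellings select the same data.
same = np.array_equal(cube[0][0], cube[0, 0])
is_view = np.shares_memory(cube[0], cube)

# With a list index the first step makes a copy, so the chained
# assignment changes only that temporary copy.
cube[[0, 1]][0] = 1.0
after_chained = float(cube.sum())

# Comma indexing assigns into the original array in a single step:
# row 0 of both blocks, 2 x 4 = 8 elements.
cube[[0, 1], 0] = 1.0
after_comma = float(cube.sum())

print(same, is_view)   # True True
print(after_chained)   # 0.0
print(after_comma)     # 8.0
```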
How is it possible for Numpy to use comma-separated subscripting with `:`?
Define a simple class with a __getitem__ indexing method:
In [128]: class Foo():
     ...:     def __getitem__(self, arg):
     ...:         print(type(arg), arg)
     ...:
In [129]: f = Foo()
And look at what different indexes produce:
In [130]: f[:]
<class 'slice'> slice(None, None, None)
In [131]: f[1:2:3]
<class 'slice'> slice(1, 2, 3)
In [132]: f[:, [1,2,3]]
<class 'tuple'> (slice(None, None, None), [1, 2, 3])
In [133]: f[:, :3]
<class 'tuple'> (slice(None, None, None), slice(None, 3, None))
In [134]: f[(slice(1,None),3)]
<class 'tuple'> (slice(1, None, None), 3)
For builtin classes like list, a tuple argument raises an error. But that's a class dependent issue, not a syntax one. numpy.ndarray accepts a tuple, as long as it's compatible with its shape.
The syntax for a tuple index was added to Python to meet the needs of numpy. I don't think there are any builtin classes that use it.
The numpy.lib.index_tricks.py module has several classes that take advantage of this behavior. Look at its code for more ideas.
In [137]: np.s_[3:]
Out[137]: slice(3, None, None)
In [139]: np.r_['0,2,1',[1,2,3],[4,5,6]]
Out[139]:
array([[1, 2, 3],
       [4, 5, 6]])
In [140]: np.c_[[1,2,3],[4,5,6]]
Out[140]:
array([[1, 4],
       [2, 5],
       [3, 6]])
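For instance, np.s_ can build a multi-dimensional index once so that it can be stored and reused. A minimal sketch, with arr and idx as illustrative names:

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# np.s_ turns subscript syntax into the object __getitem__ would receive.
idx = np.s_[:, 1:3]
sub = arr[idx]          # same as arr[:, 1:3]

print(idx)              # (slice(None, None, None), slice(1, 3, None))
print(sub.tolist())     # [[1, 2], [5, 6], [9, 10]]
```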
Other "indexing" examples:
In [141]: f[...]
<class 'ellipsis'> Ellipsis
In [142]: f[[1,2,3]]
<class 'list'> [1, 2, 3]
In [143]: f[10]
<class 'int'> 10
In [144]: f[{1:12}]
<class 'dict'> {1: 12}
I don't know of any class that makes use of a dict argument, but the syntax allows it.
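To show how a class might make use of the tuple argument, here is a minimal sketch of a hypothetical Grid wrapper around a list of lists. It is not how numpy implements indexing; it only handles integers and slices in up to two positions.

```python
class Grid:
    """Hypothetical 2D wrapper around a list of lists (illustration only)."""

    def __init__(self, rows):
        self.rows = rows

    def __getitem__(self, key):
        # g[i] or g[a:b]: a single index goes straight to the outer list.
        if not isinstance(key, tuple):
            return self.rows[key]
        # g[r, c]: Python passes the comma-separated parts as a tuple.
        row_key, col_key = key
        selected = self.rows[row_key]
        if isinstance(row_key, slice):
            # Several rows were selected: apply the column key to each.
            return [row[col_key] for row in selected]
        # A single row was selected: apply the column key to it.
        return selected[col_key]


g = Grid([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(g[:, 1:3])   # [[2, 3], [5, 6], [8, 9]]
print(g[1, 1])     # 5
print(g[:, 1])     # [2, 5, 8]
print(g[0])        # [1, 2, 3]
```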