Python slice how-to: I know slice notation, but how do I use the built-in slice object?

You create a slice object by calling slice() with the same values you would use in [start:stop:step] notation:

sl = slice(0,4)

To use the slice, just pass it as if it were the index into a list or string:

>>> s = "ABCDEFGHIJKL"
>>> sl = slice(0,4)
>>> print(s[sl])
ABCD

Let's say you have a file of fixed-length text fields. You could define a list of slices to easily extract the values from each "record" in this file.

data = """\
0010GEORGE JETSON    12345 SPACESHIP ST   HOUSTON       TX
0020WILE E COYOTE    312 ACME BLVD        TUCSON        AZ
0030FRED FLINTSTONE  246 GRANITE LANE     BEDROCK       CA
0040JONNY QUEST      31416 SCIENCE AVE    PALO ALTO     CA""".splitlines()

fieldslices = [slice(*fielddef) for fielddef in [
(0,4), (4, 21), (21,42), (42,56), (56,58),
]]
fields = "id name address city state".split()

for rec in data:
    for field, sl in zip(fields, fieldslices):
        print("{} : {}".format(field, rec[sl]))
    print('')

# or this same code using itemgetter, to make a function that
# extracts all slices from a string into a tuple of values
import operator
rec_reader = operator.itemgetter(*fieldslices)
for rec in data:
    for field, field_value in zip(fields, rec_reader(rec)):
        print("{} : {}".format(field, field_value))
    print('')

Prints:

id : 0010
name : GEORGE JETSON
address : 12345 SPACESHIP ST
city : HOUSTON
state : TX

id : 0020
name : WILE E COYOTE
address : 312 ACME BLVD
city : TUCSON
state : AZ

id : 0030
name : FRED FLINTSTONE
address : 246 GRANITE LANE
city : BEDROCK
state : CA

id : 0040
name : JONNY QUEST
address : 31416 SCIENCE AVE
city : PALO ALTO
state : CA
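As a rough sketch of the same idea, the slices can be kept in a dict keyed by field name so that each record becomes a dict. The names FIELDS and parse_record are made up for this illustration, and the sample line is padded with ljust() so the column positions are explicit:

```python
# Hypothetical field layout: field name -> slice, using the same column
# positions as the fieldslices list above.
FIELDS = {
    "id": slice(0, 4),
    "name": slice(4, 21),
    "address": slice(21, 42),
    "city": slice(42, 56),
    "state": slice(56, 58),
}


def parse_record(line):
    """Return a dict of whitespace-stripped field values for one line."""
    return {name: line[sl].strip() for name, sl in FIELDS.items()}


# Build one padded sample line so every field starts at a known column.
sample = (
    "0010"
    + "GEORGE JETSON".ljust(17)
    + "12345 SPACESHIP ST".ljust(21)
    + "HOUSTON".ljust(14)
    + "TX"
)

record = parse_record(sample)
print(record["name"], "/", record["city"], "/", record["state"])
```

Stripping each value is a choice made here; keep the padding if trailing spaces are significant in your data.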

Understanding slicing

The syntax is:

a[start:stop]  # items start through stop-1
a[start:] # items start through the rest of the array
a[:stop] # items from the beginning through stop-1
a[:] # a copy of the whole array

There is also the step value, which can be used with any of the above:

a[start:stop:step] # start through not past stop, by step

The key point to remember is that the :stop value represents the first value that is not in the selected slice. So, the difference between stop and start is the number of elements selected (if step is 1, the default).
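A quick check of that rule, using a throwaway list of letters (the variable names are only for illustration):

```python
a = list("ABCDEFGHIJ")
start, stop = 2, 7

picked = a[start:stop]
print(picked)                        # ['C', 'D', 'E', 'F', 'G']
print(len(picked) == stop - start)   # True
print(a[stop])                       # H, the first item left out

# With a step other than 1, the count matches the equivalent range().
stepped = a[start:stop:2]
print(stepped)                       # ['C', 'E', 'G']
print(len(stepped) == len(range(start, stop, 2)))  # True
```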

The other feature is that start or stop may be a negative number, which means it counts from the end of the array instead of the beginning. So:

a[-1]    # last item in the array
a[-2:] # last two items in the array
a[:-2] # everything except the last two items

Similarly, step may be a negative number:

a[::-1]    # all items in the array, reversed
a[1::-1] # the first two items, reversed
a[:-3:-1] # the last two items, reversed
a[-3::-1] # everything except the last two items, reversed

Python is kind to the programmer if there are fewer items than you ask for. For example, if you ask for a[:-2] and a only contains one element, you get an empty list instead of an error. Sometimes you would prefer the error, so you have to be aware that this may happen.
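If you do want the error, one option is a small wrapper that checks the bounds before slicing. This is only a sketch: strict_slice is a hypothetical helper, and it assumes non-negative bounds and a step of 1.

```python
a = [42]
print(a[:-2])    # []  (no error, even though a has one item)
print(a[5:10])   # []


def strict_slice(seq, start, stop):
    """Slice seq[start:stop], but raise if the bounds fall outside seq."""
    if not 0 <= start <= stop <= len(seq):
        raise IndexError(
            f"slice {start}:{stop} is outside a sequence of length {len(seq)}"
        )
    return seq[start:stop]


print(strict_slice([1, 2, 3, 4], 1, 3))   # [2, 3]

try:
    strict_slice(a, 0, 3)
except IndexError as exc:
    print("IndexError:", exc)
```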

Relationship with the slice object

A slice object can represent a slicing operation, i.e.:

a[start:stop:step]

is equivalent to:

a[slice(start, stop, step)]

Slice objects also behave slightly differently depending on the number of arguments, similarly to range(), i.e. both slice(stop) and slice(start, stop[, step]) are supported.
To skip specifying a given argument, one might use None, so that e.g. a[start:] is equivalent to a[slice(start, None)] or a[::-1] is equivalent to a[slice(None, None, -1)].

While the :-based notation is very helpful for simple slicing, explicit slice() objects make it easier to generate slices programmatically.
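For instance, slices can be produced in a loop and applied later. The helper chunk_slices below is a made-up name for this sketch, not something from the standard library:

```python
def chunk_slices(length, size):
    """Yield slice objects that cover range(length) in blocks of `size`."""
    for start in range(0, length, size):
        yield slice(start, start + size)


data = list(range(10))
chunks = [data[sl] for sl in chunk_slices(len(data), 4)]
print(chunks)   # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

# None stands in for an omitted field, exactly as in the colon notation.
print(data[2:] == data[slice(2, None)])               # True
print(data[::-1] == data[slice(None, None, -1)])      # True
```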

Implementing slicing in __getitem__

The __getitem__() method will receive a slice object when the object is sliced. Simply look at the start, stop, and step members of the slice object in order to get the components for the slice.

>>> class C(object):
...     def __getitem__(self, val):
...         print(val)
...
>>> c = C()
>>> c[3]
3
>>> c[3:4]
slice(3, 4, None)
>>> c[3:4:-2]
slice(3, 4, -2)
>>> c[():1j:'a']
slice((), 1j, 'a')
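A slightly fuller sketch of a container that handles both integers and slices. The class Evens is invented for this example; it leans on slice.indices() to turn the slice into concrete positions:

```python
class Evens:
    """Hypothetical read-only sequence of the first n even numbers."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, key):
        if isinstance(key, slice):
            # indices() resolves None and negative values against our length.
            return [2 * i for i in range(*key.indices(self._n))]
        if isinstance(key, int):
            if key < 0:
                key += self._n
            if not 0 <= key < self._n:
                raise IndexError("Evens index out of range")
            return 2 * key
        raise TypeError(f"indices must be int or slice, not {type(key).__name__}")


evens = Evens(10)
print(evens[3])      # 6
print(evens[-1])     # 18
print(evens[2:5])    # [4, 6, 8]
print(evens[::-3])   # [18, 12, 6, 0]
```

Returning a plain list from a slice is a simplification; a real container would often return another instance of its own type.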

How do I access the elements of a slice object in python

Are you looking for the start, stop and step properties?

>>> s = slice(1, 2, 3)
>>> s.start
1
>>> s.stop
2
>>> s.step
3

slice.indices computes the start/stop/step for the indices that would be accessed for an iterable with the input length. So,

>>> s = slice(-1, None, None)
>>> s.indices(30)
(29, 30, 1)

This means that you would take only item 29 from the sequence. The result can be passed straight to range():

for item in range(*some_slice.indices(len(sequence))):
    print(sequence[item])

As a concrete example:

>>> a = range(30)
>>> for i in a[-2:]:
...     print(i)
...
28
29
>>> s = slice(-2, None, None)
>>> for ix in range(*s.indices(len(a))):
...     print(a[ix])
...
28
29
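indices() also clamps out-of-range bounds to the given length and fills in the defaults for a negative step. A small sketch that compares the range()-based walk with ordinary slicing for a few arbitrarily chosen slices:

```python
seq = list("ABCDE")

examples = (
    slice(2, 100),          # stop beyond the end is clamped to len(seq)
    slice(-100, 3),         # start before the beginning is clamped to 0
    slice(None, None, -1),  # defaults depend on the sign of step
    slice(-2, None),        # negative start counts from the end
)

resolved = {}
for s in examples:
    start, stop, step = s.indices(len(seq))
    via_range = [seq[i] for i in range(start, stop, step)]
    resolved[(s.start, s.stop, s.step)] = (start, stop, step)
    print(s, "->", (start, stop, step), via_range == seq[s])
```

Note that the reversed slice resolves to (4, -1, -1). That triple is meant for range(); feeding it back into seq[4:-1:-1] would not give the same result, because -1 as a slice bound means "last item".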

What is `a[start:stop, i]` in Python slicing?

You cannot combine : and , when indexing a list.

: is for direct slicing:

a[1:3:1]

, separates the arguments when you build the slice object explicitly:

a[slice(1,3,1)]

However with objects supporting it (like numpy arrays) you can slice in several dimensions:

import numpy as np
a = np.array([[0,1,3],[3,4,5]])
a[0:1,2]

output: array([3])
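What happens under the hood is that the comma builds a tuple, so __getitem__ receives a tuple of slices and integers. A minimal sketch without numpy, using a made-up class ShowKey that just returns the key it was given:

```python
class ShowKey:
    """Hypothetical helper: return whatever key the brackets produced."""

    def __getitem__(self, key):
        return key


k = ShowKey()
print(k[0:1, 2])       # (slice(0, 1, None), 2)
print(k[1:3, ::2])     # (slice(1, 3, None), slice(None, None, 2))

# A plain list does not accept a tuple as an index.
lst = [1, 2, 3]
try:
    lst[0:1, 2]
except TypeError as exc:
    print("TypeError:", exc)
```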

python create slice object from string

slice(*[{True: lambda n: None, False: int}[x == ''](x) for x in (mystring.split(':') + ['', '', ''])[:3]])
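The one-liner above is compact but hard to read. A more spelled-out sketch of the same idea follows; parse_slice is a hypothetical name, and unlike the one-liner it rejects strings with more than three fields instead of silently dropping the extras:

```python
def parse_slice(text):
    """Turn a string such as '1:10:2' into slice(1, 10, 2).

    Empty fields become None, so '::-1' gives slice(None, None, -1).
    """
    parts = text.split(":")
    if len(parts) > 3:
        raise ValueError(f"too many ':' separated fields in {text!r}")
    parts += [""] * (3 - len(parts))   # pad missing fields
    return slice(*(int(p) if p.strip() else None for p in parts))


print(parse_slice("1:10:2"))   # slice(1, 10, 2)
print(parse_slice("::-1"))     # slice(None, None, -1)
print(parse_slice(":4"))       # slice(None, 4, None)

letters = "ABCDEFGHIJKL"
print(letters[parse_slice("2:8:2")])   # CEG
```

As with the one-liner, a string without any colon, such as "5", becomes slice(5, None, None), i.e. the equivalent of [5:], not the single index 5.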

Why is floating point slicing (slice(0,1,0.1)) allowed in python, but calling the indices method (slice(0,1,0.1).indices) raises TypeError?

tl;dr slice objects are generated by the interpreter when we use them in square-bracket notation, and allowing them to be arbitrary allows us to design code that uses them. However, because the built-in list relies on indices() to behave properly, that method has to return list-compatible values (i.e. integers), and if it can't, it throws an error.


When you do

my_obj[1:3:2]

the interpreter essentially translates it* as

my_obj.__getitem__(slice(1, 3, 2))

This is most obvious when using lists, which have special behavior when slices are given, but the same mechanism is used by other datatypes in various popular libraries (e.g. numpy.array and pandas.DataFrame). These classes implement their own __getitem__() methods, which have their own special ways of handling slices.

Now, the built-in list presumably uses slice.indices() to decompose the entire slice into a set of individual indices that it can access and then group together and return. List indices can only be integers, and they don't want this functionality to break, so the most consistent way to go about it is to make slice.indices() throw an error when it can't produce a list of integers.

They can't restrict slice to having only those values, though, because it's an interpreter-generated object that other user-defined classes might want to use. If you design an object like this:

class myGenerator:
    def __getitem__(self, s):  # s is a slice
        def gen():
            i = s.start
            while i < s.stop:
                yield i
                i += s.step
        return list(gen())

h = myGenerator()
print(h[1:4:.25])
# [1, 1.25, 1.5, 1.75, 2.0, 2.25, 2.5, 2.75, 3.0, 3.25, 3.5, 3.75]
print(h[0:1:0.1])
# [0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6, 0.7, 0.7999999999999999, 0.8999999999999999, 0.9999999999999999]

then it can co-opt slice notation to work however it wants, so we can institute custom behavior for it. But if we were to change slice.indices() to use that instead, then it would break the built-in list - thus, python doesn't allow us to.
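The asymmetry from the question is easy to reproduce. In this short sketch, creating the slice succeeds, while both indices() and list indexing reject the float step:

```python
s = slice(0, 1, 0.1)
print(s.start, s.stop, s.step)   # 0 1 0.1  (creating it is fine)

errors = []

try:
    s.indices(10)
except TypeError as exc:
    errors.append(("indices", str(exc)))

try:
    list(range(10))[s]
except TypeError as exc:
    errors.append(("list", str(exc)))

for where, message in errors:
    print(where, "->", message)
```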


*Technically, for a lot of built-ins, the Python interpreter may take shortcuts and execute hardcoded routines instead of actually translating the notation into function calls and executing them. But for our purposes the analogy works well enough, since it does do that for user-defined objects.

Make an object that behaves like a slice

TLDR: It's impossible to make custom classes replace slice for built-in types such as list and tuple.


The __index__ method exists purely to provide an index, which is by definition an integer in python (see the Data Model). You cannot use it for resolving an object to a slice.

I'm afraid that slice seems to be handled specially by python. The interface requires an actual slice; providing its signature (which also includes the indices method) is not sufficient. As you've found out, you cannot inherit from it, so you cannot create new types of slices. Even Cython will not allow you to inherit from it.


So why is slice special? Glad you asked. Welcome to the innards of CPython. Please wash your hands after reading this.

So slice objects are described in slice.rst. Note these two guys:

.. c:var:: PyTypeObject PySlice_Type

   The type object for slice objects. This is the same as :class:`slice` in the
   Python layer.

.. c:function:: int PySlice_Check(PyObject *ob)

   Return true if ob is a slice object; ob must not be NULL.

Now, this is actually implemented in sliceobject.h as:

#define PySlice_Check(op) (Py_TYPE(op) == &PySlice_Type)

So only the slice type is allowed here. This check is actually used in list_subscript (and tuple subscript, ...) after attempting to use the index protocol (so having __index__ on a slice is a bad idea). A custom container class is free to override __getitem__ and use its own rules, but that's how list (and tuple, ...) does it.

Now, why is it not possible to subclass slice? Well, type actually has a flag indicating whether something can be subclassed. It is checked here and generates the error you have seen:

if (!PyType_HasFeature(base_i, Py_TPFLAGS_BASETYPE)) {
    PyErr_Format(PyExc_TypeError,
                 "type '%.100s' is not an acceptable base type",
                 base_i->tp_name);
    return NULL;
}

I haven't been able to track down how slice (un)sets this value, but the fact that one gets this error means it does. This means you cannot subclass it.
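Both restrictions can be observed from Python. In the sketch below, FakeSlice is a made-up duck-typed stand-in; the workaround at the end (converting to a real slice before indexing) is just one possible approach:

```python
# 1. Subclassing slice fails at class-creation time.
try:
    class MySlice(slice):
        pass
except TypeError as exc:
    subclass_error = str(exc)
print(subclass_error)


# 2. A look-alike with start/stop/step/indices is still rejected by list.
class FakeSlice:
    """Hypothetical stand-in exposing the same attributes as slice."""

    def __init__(self, start, stop, step=None):
        self.start, self.stop, self.step = start, stop, step

    def indices(self, length):
        return self.as_slice().indices(length)

    def as_slice(self):
        return slice(self.start, self.stop, self.step)


items = [10, 20, 30, 40]
fake = FakeSlice(0, 2)
try:
    items[fake]
except TypeError as exc:
    lookup_error = str(exc)
print(lookup_error)

# Converting to a real slice first works.
print(items[fake.as_slice()])   # [10, 20]
```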


Closing remarks: After remembering some long-forgotten C-(non)-skills, I'm fairly sure this is not about optimization in the strict sense. All existing checks and tricks would still work (at least those I've found).

After washing my hands and digging around in the internet, I've found a few references to similar "issues". Tim Peters has said all there is to say:

Nothing implemented in C is subclassable unless somebody volunteers the work
to make it subclassable; nobody volunteered the work to make the [insert name here]
type subclassable. It sure wasn't at the top of my list wink.

Also see this thread for a short discussion on non-subclass'able types.

Practically all alternative interpreters replicate the behavior to various degrees: Jython, Pyston, IronPython and PyPy (didn't find out how they do it, but they do).

Can I store slicers in a variable? (Pandas/Python)

Yes, but the colon notation is only valid inside square brackets, so you cannot assign it to a variable directly. You write slice('1900', '2000', None) instead.
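A small sketch in plain Python showing that a stored slice object is the same thing the brackets would have produced. KeyEcho is a hypothetical helper that returns its key; as far as I know, label-based indexers such as pandas' .loc accept a stored slice object in the same way, but pandas is not used here:

```python
class KeyEcho:
    """Hypothetical helper that returns whatever key it is indexed with."""

    def __getitem__(self, key):
        return key


# Stored like any other value; the bounds do not have to be integers.
century = slice("1900", "2000")
print(century == KeyEcho()["1900":"2000"])   # True

# With integer bounds, the stored slice can be reused on ordinary sequences.
years = ["1890", "1900", "1950", "2000", "2010"]
window = slice(1, 4)
print(years[window])            # ['1900', '1950', '2000']
print(tuple(years)[window])     # ('1900', '1950', '2000')
```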


