How Are Python In-Place Operator Functions Different Than the Standard Operator Functions

How are Python in-place operator functions different than the standard operator functions?

First, you need to understand the difference between __add__ and __iadd__.

An object's __add__ method is regular addition: it takes two parameters, returns their sum, and doesn't modify either parameter.

An object's __iadd__ method also takes two parameters, but makes the change in-place, modifying the contents of the first parameter. Because this requires object mutation, immutable types (like the standard number types) shouldn't have an __iadd__ method.

a + b uses __add__. a += b uses __iadd__ if it exists; if it doesn't, it emulates it via __add__, as in tmp = a + b; a = tmp. operator.add and operator.iadd differ in the same way.

To the other question: operator.iadd(x, y) isn't equivalent to z = x; z += y, because if no __iadd__ exists __add__ will be used instead. You need to assign the value to ensure that the result is stored in both cases: x = operator.iadd(x, y).

You can see this yourself easily enough:

import operator
a = 1
operator.iadd(a, 2)
# a is still 1, because ints don't have __iadd__; iadd returned 3

b = ['a']
operator.iadd(b, ['b'])
# lists do have __iadd__, so b is now ['a', 'b']
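If you want the result stored in both cases, assign it back, as described above:

a = 1
a = operator.iadd(a, 2)
# a is now 3 whether or not the type defines __iadd__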

Difference between operators and methods

If I understand the question correctly...

In a nutshell, nearly everything in Python is a method on an object. The "expression operators" are backed by Python's magic methods (the so-called dunder methods, like __add__ and __getitem__).

So why does Python have special syntax like [x:y], [x], +, and -? Because these notations are familiar to most developers, and even to people who don't program at all: mathematical operators like + and - catch the eye, and the reader immediately knows what happens. The same goes for indexing; it is common syntax in many languages.

But there is no equally familiar notation for methods like upper, replace, or strip, so there are no expression operators for them.

So, as for the difference between expression operators and methods: I'd say it is just the way they look.
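As a minimal sketch (the variable names are just illustrative), each operator spelling maps onto a magic method:

a = [1, 2]
b = [3]

a + b             # [1, 2, 3]
a.__add__(b)      # [1, 2, 3] -- the + operator is just this method call
a[0]              # 1
a.__getitem__(0)  # 1 -- indexing is a method call too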

Are the + and += operators different?

The docs explain it very well, I think:

__iadd__(), etc.

These methods are called to implement the augmented arithmetic assignments (+=, -=, *=, /=, //=, %=, **=, <<=, >>=, &=, ^=, |=). These methods should attempt to do the operation in-place (modifying self) and return the result (which could be, but does not have to be, self). If a specific method is not defined, the augmented assignment falls back to the normal methods. For instance, to execute the statement x += y, where x is an instance of a class that has an __iadd__() method, x.__iadd__(y) is called.
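Here is a minimal sketch of the protocol those docs describe; the Accumulator class is illustrative, not from the docs:

class Accumulator:
    def __init__(self, value):
        self.value = value
    def __iadd__(self, other):
        self.value += other   # attempt the operation in place, modifying self
        return self           # return the result (here, self)

acc = Accumulator(1)
acc += 2    # calls acc.__iadd__(2); acc is still the same object, now holding 3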

+= is designed to implement in-place modification. In the case of simple addition, a new object is created and bound to the already-used name (c).

Also, you'll notice that this behaviour of the += operator is only possible because of the mutable nature of lists. Integers, an immutable type, won't produce the same result:

>>> c = 3
>>> print(c, id(c))
3 505389080
>>> c += c
>>> print(c, id(c))
6 505389128
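For contrast, the list counterpart: lists are mutable and define __iadd__, so the same object survives the += (ids are compared rather than printed, since their values vary between runs):

>>> c = [3]
>>> before = id(c)
>>> c += c
>>> c
[3, 3]
>>> id(c) == before
True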

Difference between assignment and compound operators in Python

This is a difference between mutable and immutable objects. A mutable object can implement obj *= something by actually modifying the object in place; an immutable object can only return a new object with the updated value (in which case the result is identical to obj = obj * something). The compound assignment statements can handle either case, it's entirely up to the object's implementation.
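A quick sketch of both cases, using *= on a mutable list versus an immutable tuple:

>>> obj = [1, 2]
>>> before = id(obj)
>>> obj *= 2              # list defines __imul__: modified in place
>>> obj, id(obj) == before
([1, 2, 1, 2], True)
>>> obj = (1, 2)
>>> before = id(obj)
>>> obj *= 2              # tuple has no __imul__: falls back to obj = obj * 2
>>> obj, id(obj) == before
((1, 2, 1, 2), False)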

Which operator (+ vs +=) should be used for performance? (In-place Vs not-in-place)

x = x + 1 vs x += 1

Performance

It seems that you understand the semantic difference between x += 1 and x = x + 1.

For benchmarking, you can use timeit in IPython.

After defining those functions:

import numpy as np

def in_place(n):
    x = np.arange(n)
    x += 1

def not_in_place(n):
    x = np.arange(n)
    x = x + 1

def in_place_no_broadcast(n):
    x = np.arange(n)
    x += np.ones(n, dtype=int)  # np.int was removed in NumPy 1.24; plain int works

You can simply use the %timeit magic to compare performance:

%timeit in_place(10**7)
20.3 ms ± 81.4 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit not_in_place(10**7)
30.4 ms ± 253 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit in_place_no_broadcast(10**7)
35.4 ms ± 101 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

not_in_place is 50% slower than in_place.

Note that broadcasting also makes a huge difference: NumPy understands x += 1 as adding 1 to every single element of x, without having to create yet another array.

Warning

in_place should be the preferred function: it's faster and uses less memory. You might run into bugs if you use and mutate this object at different places in your code, though. The typical example would be:

x = np.arange(5)
y = [x, x]
y[0][0] = 10
print(y)
# [array([10, 1, 2, 3, 4]), array([10, 1, 2, 3, 4])]
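If you do need independent values, one way around this (a sketch using NumPy's standard copy() method) is to copy before storing:

x = np.arange(5)
y = [x.copy(), x.copy()]   # two independent arrays
y[0][0] = 10
print(y)
# [array([10, 1, 2, 3, 4]), array([0, 1, 2, 3, 4])]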

Sorting

Your understanding of the advantages of in-place sorting is correct. It can make a huge difference in memory requirements when sorting large data sets.

There are other desirable features for a sorting algorithm (stable, acceptable worst-case complexity, ...) and it looks like the standard Python algorithm (Timsort) has many of them.

Timsort is a hybrid algorithm. Some parts of it are in-place and some require extra memory, but it will never use more than n/2 extra space.

Difference between adding lists in Python with + and +=

p = p + test1 assigns a new value to the variable p, while p += test1 extends the list stored in p. And since the list in p is the same list as in test, appending to p also appends to test, while assigning a new value to the name p does not change the value assigned to test in any way.
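A minimal demonstration; the variable names are assumed from the question:

test = [1, 2]
test1 = [3]
p = test              # p and test refer to the same list object
p += test1            # extends the shared list in place
print(test)           # [1, 2, 3] -- test sees the change
p = p + test1         # builds a new list and rebinds p
print(test)           # [1, 2, 3] -- test is unaffected
print(p)              # [1, 2, 3, 3]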

Inplace functions in Python

You can't have a priori knowledge about the behavior of a given function. You need to either look at the source and deduce this information, or examine its docstring and hope the developer documented this behavior.

For example, in list.sort:

help(list.sort)
Help on method_descriptor:

sort(...)
L.sort(key=None, reverse=False) -> None -- stable sort *IN PLACE*

For functions operating on built-in types, the type's mutability generally tells you something about the operation. You can be certain, for example, that any method that transforms a string must return a new one, meaning it can't perform an in-place operation: strings in Python are immutable objects.
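For instance, a transforming string method like upper can only hand back a new string:

s = "abc"
t = s.upper()
print(s, t)   # abc ABC -- s is unchanged; upper() returned a new string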

The difference between x += y and x = x + y

The object "on the left" handles the operator (usually; see the reflected r-operator forms like __radd__); in this case it is an in-place operator.

10.3.2. Inplace Operators

Many operations have an “in-place” version. Listed below are functions providing a more primitive access to in-place operators than the usual syntax does; for example, the statement x += y is equivalent to x = operator.iadd(x, y).

The actual result is determined by the object x and whether it handles __iadd__ (e.g. mutated in place, as with lists) or just __add__ (e.g. a new result object, as with strings). The selection of which protocol to use, and what value to return for the assignment, is determined by operator.iadd itself.¹

So the shorthand x += y ~~ x = x + y is only true for some objects, notably those that are immutable and [only] implement __add__.
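You can observe which protocol applied by comparing ids before and after:

x = "ab"
before = id(x)
x += "c"                  # str has no __iadd__: falls back to __add__, new object
print(id(x) == before)    # False

x = ["a", "b"]
before = id(x)
x += ["c"]                # list implements __iadd__: same object, mutated
print(id(x) == before)    # True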

See How are Python in-place operator functions different than the standard operator functions?


¹ Semantically, the operator.iadd function works roughly like this:

def iadd(x, y):
    if hasattr(x, '__iadd__'):
        return x.__iadd__(y)  # side effect performed on x; returns the
                              # original-but-modified object (usually x)
    else:
        return x.__add__(y)   # returns a new object;
                              # __add__ should not have side effects

Is making in-place operations return the object a bad idea?

Yes, it is a bad idea. The reason is that if in-place and non-in-place operations have apparently identical output, then programmers will frequently mix up in-place and non-in-place operations (list.sort() vs. sorted()), and that results in hard-to-detect errors.
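The classic demonstration of this trap:

nums = [3, 1, 2]
result = nums.sort()    # sorts nums in place and returns None
print(result)           # None -- easy to mistake for the sorted list
print(sorted(nums))     # [1, 2, 3] -- sorted() returns a new list instead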

In-place operations returning themselves would allow you to perform "method chaining"; however, this is bad practice because you may accidentally bury functions with side effects in the middle of a chain.

To prevent errors like this, a method chain should have at most one method with side effects, and that method should be at the end of the chain. Functions earlier in the chain should transform the input without side effects (for instance, navigating a tree, slicing a string, etc.). If in-place operations returned themselves, a programmer would be bound to use one accidentally in place of an alternative function that returns a copy and therefore has no side effects (again, list.sort() vs. sorted()), which may result in an error that is difficult to debug.

This is the reason Python standard library functions always either return a copy, or return None and modify objects in place, but never modify objects in place and also return themselves. Other Python libraries like Django follow this practice as well (see this very similar question about Django).


