Why is 1000000000000000 in range(1000000000000001) so fast in Python 3?
The Python 3 range()
object doesn't produce numbers immediately; it is a smart sequence object that produces numbers on demand. All it contains is your start, stop and step values, then as you iterate over the object the next integer is calculated each iteration.
The object also implements the object.__contains__
hook, and calculates if your number is part of its range. Calculating is a (near) constant time operation *. There is never a need to scan through all possible integers in the range.
From the range()
object documentation:
The advantage of the
range
type over a regularlist
ortuple
is that a range object will always take the same (small) amount of memory, no matter the size of the range it represents (as it only stores thestart
,stop
andstep
values, calculating individual items and subranges as needed).
So at a minimum, your range()
object would do:
class my_range:
def __init__(self, start, stop=None, step=1, /):
if stop is None:
start, stop = 0, start
self.start, self.stop, self.step = start, stop, step
if step < 0:
lo, hi, step = stop, start, -step
else:
lo, hi = start, stop
self.length = 0 if lo > hi else ((hi - lo - 1) // step) + 1
def __iter__(self):
current = self.start
if self.step < 0:
while current > self.stop:
yield current
current += self.step
else:
while current < self.stop:
yield current
current += self.step
def __len__(self):
return self.length
def __getitem__(self, i):
if i < 0:
i += self.length
if 0 <= i < self.length:
return self.start + i * self.step
raise IndexError('my_range object index out of range')
def __contains__(self, num):
if self.step < 0:
if not (self.stop < num <= self.start):
return False
else:
if not (self.start <= num < self.stop):
return False
return (num - self.start) % self.step == 0
This is still missing several things that a real range()
supports (such as the .index()
or .count()
methods, hashing, equality testing, or slicing), but should give you an idea.
I also simplified the __contains__
implementation to only focus on integer tests; if you give a real range()
object a non-integer value (including subclasses of int
), a slow scan is initiated to see if there is a match, just as if you use a containment test against a list of all the contained values. This was done to continue to support other numeric types that just happen to support equality testing with integers but are not expected to support integer arithmetic as well. See the original Python issue that implemented the containment test.
* Near constant time because Python integers are unbounded and so math operations also grow in time as N grows, making this a O(log N) operation. Since it’s all executed in optimised C code and Python stores integer values in 30-bit chunks, you’d run out of memory before you saw any performance impact due to the size of the integers involved here.
Strange speeds of `x in range(...)` checks
In CPython, the implementation first checks the value against both endpoints:
if (cmp1 == 1) { /* positive steps: start <= ob < stop */
cmp2 = PyObject_RichCompareBool(r->start, ob, Py_LE);
cmp3 = PyObject_RichCompareBool(ob, r->stop, Py_LT);
}
If either is false, it can exit immediately:
if (cmp2 == 0 || cmp3 == 0) { /* ob outside of range */
result = 0;
goto end;
}
If both checks are true, though, it then has to check if the stride rejects the value. (For example, 2000 in range(1501, 2500)
is true, but 2000 in range(1501, 2500, 2)
is false.)
tmp1 = PyNumber_Subtract(ob, r->start);
if (tmp1 == NULL)
goto end;
tmp2 = PyNumber_Remainder(tmp1, r->step);
if (tmp2 == NULL)
goto end;
/* result = ((int(ob) - start) % step) == 0 */
result = PyObject_RichCompareBool(tmp2, zero, Py_EQ);
It's this last step that makes 2000 in range(1500, 2500)
slower than either out-of-bound check.
(See https://github.com/python/cpython/blob/main/Objects/rangeobject.c#L381 for the full implementation.)
Python: how to print range a-z?
>>> import string
>>> string.ascii_lowercase[:14]
'abcdefghijklmn'
>>> string.ascii_lowercase[:14:2]
'acegikm'
To do the urls, you could use something like this
[i + j for i, j in zip(list_of_urls, string.ascii_lowercase[:14])]
What is the meaning of int(a[::-1]) in Python?
Assuming a
is a string. The Slice notation in python has the syntax -
list[<start>:<stop>:<step>]
So, when you do a[::-1]
, it starts from the end towards the first taking each element. So it reverses a. This is applicable for lists/tuples as well.
Example -
>>> a = '1234'
>>> a[::-1]
'4321'
Then you convert it to int and then back to string (Though not sure why you do that) , that just gives you back the string.
Why does range(start, end) not include end?
Because it's more common to call range(0, 10)
which returns [0,1,2,3,4,5,6,7,8,9]
which contains 10 elements which equals len(range(0, 10))
. Remember that programmers prefer 0-based indexing.
Also, consider the following common code snippet:
for i in range(len(li)):
pass
Could you see that if range()
went up to exactly len(li)
that this would be problematic? The programmer would need to explicitly subtract 1. This also follows the common trend of programmers preferring for(int i = 0; i < 10; i++)
over for(int i = 0; i <= 9; i++)
.
If you are calling range with a start of 1 frequently, you might want to define your own function:
>>> def range1(start, end):
... return range(start, end+1)
...
>>> range1(1, 10)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Why is looping over range() in Python faster than using a while loop?
see the disassembly of python byte code, you may get a more concrete idea
use while loop:
1 0 LOAD_CONST 0 (0)
3 STORE_NAME 0 (i)
2 6 SETUP_LOOP 28 (to 37)
>> 9 LOAD_NAME 0 (i) # <-
12 LOAD_CONST 1 (100000000) # <-
15 COMPARE_OP 0 (<) # <-
18 JUMP_IF_FALSE 14 (to 35) # <-
21 POP_TOP # <-
3 22 LOAD_NAME 0 (i) # <-
25 LOAD_CONST 2 (1) # <-
28 INPLACE_ADD # <-
29 STORE_NAME 0 (i) # <-
32 JUMP_ABSOLUTE 9 # <-
>> 35 POP_TOP
36 POP_BLOCK
The loop body has 10 op
use range:
1 0 SETUP_LOOP 23 (to 26)
3 LOAD_NAME 0 (range)
6 LOAD_CONST 0 (0)
9 LOAD_CONST 1 (100000000)
12 CALL_FUNCTION 2
15 GET_ITER
>> 16 FOR_ITER 6 (to 25) # <-
19 STORE_NAME 1 (n) # <-
2 22 JUMP_ABSOLUTE 16 # <-
>> 25 POP_BLOCK
>> 26 LOAD_CONST 2 (None)
29 RETURN_VALUE
The loop body has 3 op
The time to run C code is much shorter than intepretor and can be ignored.
Check if something is (not) in a list in Python
The bug is probably somewhere else in your code, because it should work fine:
>>> 3 not in [2, 3, 4]
False
>>> 3 not in [4, 5, 6]
True
Or with tuples:
>>> (2, 3) not in [(2, 3), (5, 6), (9, 1)]
False
>>> (2, 3) not in [(2, 7), (7, 3), "hi"]
True
Related Topics
Is There Any Pythonic Way to Combine Two Dicts (Adding Values For Keys That Appear in Both)
How to Concatenate Text Files in Python
Find If 24 Hrs Have Passed Between Datetimes
How to Resize an Image Using Pil and Maintain Its Aspect Ratio
How to Make One Python File Run Another
How to Merge Dictionaries of Dictionaries
How to Open a Chrome Profile Through Python
Text Progress Bar in Terminal With Block Characters
What Is an Alternative to Execfile in Python 3
How to Convert an Integer to a String in Any Base
Are List-Comprehensions and Functional Functions Faster Than "For Loops"
How to Run a Python Script as a Service in Windows
How to Disable Python Warnings
Store Output of Subprocess.Popen Call in a String
How to Find Overlapping Matches With a Regexp
What Is the Syntax Rule For Having Trailing Commas in Tuple Definitions
Python 3: Unboundlocalerror: Local Variable Referenced Before Assignment