Create a list with initial capacity in Python
Warning: This answer is contested. See comments.
def doAppend(size=10000):
    result = []
    for i in range(size):
        message = "some unique object %d" % (i,)
        result.append(message)
    return result

def doAllocate(size=10000):
    result = size * [None]
    for i in range(size):
        message = "some unique object %d" % (i,)
        result[i] = message
    return result
Results. (evaluate each function 144 times and average the duration)
simple append 0.0102
pre-allocate 0.0098
Conclusion. It barely matters.
Premature optimization is the root of all evil.
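The timing harness itself is not shown above. A minimal sketch of one way to reproduce the comparison, assuming the standard-library timeit module and the same 144 repetitions; the names do_append, do_allocate and average_seconds are illustrative, and the absolute numbers will depend on the machine and interpreter:

```python
import timeit

def do_append(size=10000):
    result = []
    for i in range(size):
        result.append("some unique object %d" % (i,))
    return result

def do_allocate(size=10000):
    result = size * [None]
    for i in range(size):
        result[i] = "some unique object %d" % (i,)
    return result

def average_seconds(func, runs=144):
    # timeit.timeit returns the total time for `runs` calls; divide for the mean.
    return timeit.timeit(func, number=runs) / runs

for name, func in [("simple append", do_append), ("pre-allocate", do_allocate)]:
    print("%-14s %.4f" % (name, average_seconds(func)))
```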
Create an empty list with certain size in Python
You cannot assign to a list with xs[i] = value unless the list has already been initialized with at least i+1 elements. Instead, use xs.append(value) to add elements to the end of the list. (You could use the assignment notation if you were using a dictionary instead of a list.)
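A short sketch of the difference, using illustrative variable names: indexed assignment into an empty list raises IndexError, append grows the list, and a dictionary accepts assignment to a key that does not exist yet.

```python
xs = []
try:
    xs[0] = "a"                 # no element 0 exists yet
except IndexError as exc:
    error = str(exc)            # CPython reports an out-of-range assignment

xs.append("a")                  # append grows the list by one element
xs[0] = "b"                     # now index 0 exists, so assignment works

d = {}
d[3] = "c"                      # a dict creates the key on assignment
print(xs, d)
```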
Creating a list of ten None placeholders:
>>> xs = [None] * 10
>>> xs
[None, None, None, None, None, None, None, None, None, None]
Assigning a value to an existing element of the above list:
>>> xs[1] = 5
>>> xs
[None, 5, None, None, None, None, None, None, None, None]
Keep in mind that something like xs[15] = 5 would still fail, as our list has only 10 elements.
In Python 2, range(x) creates the list [0, 1, 2, ..., x-1]:
# 2.X only. Use list(range(10)) in 3.X.
>>> xs = range(10)
>>> xs
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
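In Python 3, range(10) returns a lazy range object rather than a list. A short sketch of the equivalent, assuming Python 3:

```python
r = range(10)      # a lazy sequence, not a list
xs = list(r)       # materialize it as a real list
xs[1] = 5          # a list supports item assignment; a range does not
print(xs)          # [0, 5, 2, 3, 4, 5, 6, 7, 8, 9]
```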
Using a function to create a list:
>>> def display():
...     xs = []
...     for i in range(9):  # This is just to show how to build a list.
...         xs.append(i)
...     return xs
...
>>> print(display())
[0, 1, 2, 3, 4, 5, 6, 7, 8]
List comprehension (using squares, because for a plain range you don't need any of this; you can just return range(0, 9)):
>>> def display():
...     return [x**2 for x in range(9)]
...
>>> print(display())
[0, 1, 4, 9, 16, 25, 36, 49, 64]
How to create a fix size list in python?
(tl;dr: The exact answer to your question is numpy.empty or numpy.empty_like, but you likely don't care and can get away with using myList = [None]*10000.)
Simple methods
You can initialize your list with the same element in every slot. Which element is up to you: a non-numeric value such as None will raise an error later if you use it by mistake (which is a good thing), while something like 0 is less usual but can make sense if you're writing a sparse matrix, or if the default value really should be 0 and you're not worried about bugs:
>>> [None for _ in range(10)]
[None, None, None, None, None, None, None, None, None, None]
(Here _ is just a variable name; you could have used i.)
You can also do so like this:
>>> [None]*10
[None, None, None, None, None, None, None, None, None, None]
You probably don't need to optimize this. You can also append to the list every time you need to:
>>> x = []
>>> for i in range(10):
...     x.append(i)
...
Performance comparison of simple methods
Which is best?
>>> def initAndWrite_test():
...     x = [None]*10000
...     for i in range(10000):
...         x[i] = i
...
>>> def initAndWrite2_test():
...     x = [None for _ in range(10000)]
...     for i in range(10000):
...         x[i] = i
...
>>> def appendWrite_test():
...     x = []
...     for i in range(10000):
...         x.append(i)
...
Results in Python 2.7:
>>> import timeit
>>> for f in [initAndWrite_test, initAndWrite2_test, appendWrite_test]:
...     print('{} takes {} usec/loop'.format(f.__name__, timeit.timeit(f, number=1000)*1000))
...
initAndWrite_test takes 714.596033096 usec/loop
initAndWrite2_test takes 981.526136398 usec/loop
appendWrite_test takes 908.597946167 usec/loop
Results in Python 3.2:
initAndWrite_test takes 641.3581371307373 usec/loop
initAndWrite2_test takes 1033.6499214172363 usec/loop
appendWrite_test takes 895.9040641784668 usec/loop
As we can see, the [None]*10000 idiom is likely the better choice in both Python 2 and Python 3. However, if one is doing anything more complicated than assignment (such as anything complicated to generate or process every element in the list), then the overhead becomes a meaninglessly small fraction of the cost. That is, such optimization is premature to worry about if you're doing anything reasonable with the elements of your list.
Uninitialized memory
All of these are somewhat inefficient, however, because they walk through memory and write something to every slot. C is different: an uninitialized array holds whatever garbage was already in that memory (sidenote: that memory may have been recycled from elsewhere on the system, which can be a security risk if you fail to mlock it and/or fail to clear it before the program exits). This is a deliberate design choice made for speed: the designers of C decided it was better not to initialize memory automatically, and that was the correct choice.
This is not an asymptotic speedup (filling the list is still O(N) overall), but you would not, for example, need to initialize the entire memory block before overwriting it with the values you actually care about. If it were possible, this would be equivalent to something like (pseudo-code) x = list(size=10000).
If you want something similar in Python, you can use the numpy numerical matrix/N-dimensional-array manipulation package, specifically numpy.empty or numpy.empty_like. That is the real answer to your question.
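A minimal sketch of that approach, assuming numpy is installed. numpy.empty allocates the block without writing to it, so the initial contents are arbitrary and every element should be assigned before it is read. Note that the result is a typed array (float64 here, chosen only for illustration), not a Python list:

```python
import numpy as np

n = 10000
x = np.empty(n, dtype=np.float64)   # allocated, but contents are arbitrary
for i in range(n):
    x[i] = i                        # overwrite every slot before reading it

y = np.empty_like(x)                # same shape and dtype as x, also uninitialized
y[:] = x * 2                        # fill it in one vectorized assignment
```

If the elements are arbitrary Python objects rather than numbers, a plain [None]*n list is usually the simpler choice.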
Initialise a list to a specific length in Python
If the "default value" you want is immutable, @eduffy's suggestion, e.g. [0]*10, is good enough.
But if you want, say, a list of ten dicts, do not use [{}]*10: that would give you a list with the same initially-empty dict ten times, not ten distinct ones. Rather, use [{} for i in range(10)] or a similar construct to build ten separate dicts to make up your list.
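A short sketch of the aliasing problem, using three elements instead of ten to keep the output readable (the names shared and distinct are illustrative):

```python
shared = [{}] * 3                    # three references to one dict
shared[0]["key"] = "value"
print(shared)    # [{'key': 'value'}, {'key': 'value'}, {'key': 'value'}]

distinct = [{} for _ in range(3)]    # three separate dicts
distinct[0]["key"] = "value"
print(distinct)  # [{'key': 'value'}, {}, {}]
```

The same applies to any mutable default, such as nested lists: [[]]*3 repeats one inner list three times.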
how to define a list with predefined length in Python
Try this one
values = [None]*1000
In place of 1000 use your desired number.
Initializing a list to a known number of elements in Python
The first thing that comes to mind for me is:
verts = [None]*1000
But do you really need to preinitialize it?
Is it better to pre-allocate array in python or use arr.append()?
In this simple timing test, [None] * n does indeed appear to be slightly quicker, but arguably not by enough to justify adopting it over the more usual idioms.
import time

def func1(size):
    # Pre-allocate, then assign by index.
    a = [None] * size
    for i in range(size):
        a[i] = i

def func2(size):
    # Append in a loop.
    a = []
    for i in range(size):
        a.append(i)

def func3(size):
    # List comprehension.
    a = [i for i in range(size)]

size = 1000000
repeat = 100

t0 = time.time()
for _ in range(repeat):
    func1(size)
t1 = time.time()
for _ in range(repeat):
    func2(size)
t2 = time.time()
for _ in range(repeat):
    func3(size)
t3 = time.time()
print(t1 - t0, t2 - t1, t3 - t2)
Results:
- [None] * size and then index: 4.82 seconds
- append in a loop: 6.37 seconds
- list comprehension: 6.34 seconds
Repeating the tests with size=1000 and repeat=100000 gives similar results:
- [None] * size and then index: 3.16 seconds
- append in a loop: 4.88 seconds
- list comprehension: 4.84 seconds
And again with size=10 and repeat=10000000:
- [None] * size and then index: 6.09 seconds
- append in a loop: 7.65 seconds
- list comprehension: 7.66 seconds
Initialising an array of fixed size in Python
You can use:
>>> lst = [None] * 5
>>> lst
[None, None, None, None, None]