Python string 'join' is faster (?) than '+', but what's wrong here?
I have pieced together the answer from the answers posted here by experts. Python string concatenation performance (and its timing measurement) depends on the following (as far as I've seen):
- Number of concatenations
- Average length of the strings
- Number of function calls
I have built new code that relates these. Thanks to Peter S Magnusson, sepp2k, hughdbrown, David Wolever and others for pointing out important things I had missed earlier. This code may still miss something, so I highly appreciate any replies pointing out errors, suggestions, criticisms, etc. After all, I am here to learn. Here is my new code:
from timeit import timeit

noc = 100
tocat = "a"

def f_call():
    pass

def loop_only():
    for i in range(noc):
        pass

def concat_method():
    s = ''
    for i in range(noc):
        s = s + tocat

def list_append():
    s = []
    for i in range(noc):
        s.append(tocat)
    ''.join(s)

def list_append_opt():
    s = []
    zap = s.append          # cache the bound method to skip the attribute lookup
    for i in range(noc):
        zap(tocat)
    ''.join(s)

def list_comp():
    ''.join(tocat for i in range(noc))

def concat_method_buildup():
    s = ''

def list_append_buildup():
    s = []

def list_append_opt_buildup():
    s = []
    zap = s.append

def function_time(f):
    return timeit(f, number=1000) * 1000

f_callt = function_time(f_call)

def measure(ftuple, n, tc):
    global noc, tocat
    noc = n
    tocat = tc
    loopt = function_time(loop_only) - f_callt
    buildup_time = function_time(ftuple[1]) - f_callt if ftuple[1] else 0
    total_time = function_time(ftuple[0])
    # ftuple[2] flags whether an explicit loop's overhead should be subtracted
    return total_time, total_time - f_callt - buildup_time - loopt * ftuple[2]

functions = {'Concat Method\t\t': (concat_method, concat_method_buildup, True),
             'List append\t\t\t': (list_append, list_append_buildup, True),
             'Optimized list append': (list_append_opt, list_append_opt_buildup, True),
             'List comp\t\t\t': (list_comp, 0, False)}

for i in range(5):
    print("\n\n%d concatenation\t\t\t\t1'a'\t\t\t\t 10'a'\t\t\t100'a'" % 10**i)
    print('-' * 80)
    for (name, ft) in functions.items():
        print(name, "\t|", end="\t")
        for j in range(3):
            t = measure(ft, 10**i, 'a' * 10**j)
            print("%.3f %.3f |" % t, end="\t")
        print()
And here is what I got. [In each time column, two (scaled) times are shown: the first is the total function execution time, and the second is the actual(?) concatenation time, i.e. the total minus the function-call time, the function buildup time (initialization), and the iteration time. Here I am considering a case where it can't be done without a loop (say, with more statements inside).]
1 concatenation 1'a' 10'a' 100'a'
------------------- ---------------------- ------------------- ----------------
List comp | 2.310 2.168 | 2.298 2.156 | 2.304 2.162
Optimized list append | 1.069 0.439 | 1.098 0.456 | 1.071 0.413
Concat Method | 0.552 0.034 | 0.541 0.025 | 0.565 0.048
List append | 1.099 0.557 | 1.099 0.552 | 1.094 0.552
10 concatenations 1'a' 10'a' 100'a'
------------------- ---------------------- ------------------- ----------------
List comp | 3.366 3.224 | 3.473 3.331 | 4.058 3.916
Optimized list append | 2.778 2.003 | 2.956 2.186 | 3.417 2.639
Concat Method | 1.602 0.943 | 1.910 1.259 | 3.381 2.724
List append | 3.290 2.612 | 3.378 2.699 | 3.959 3.282
100 concatenations 1'a' 10'a' 100'a'
------------------- ---------------------- ------------------- ----------------
List comp | 15.900 15.758 | 17.086 16.944 | 20.260 20.118
Optimized list append | 15.178 12.585 | 16.203 13.527 | 19.336 16.703
Concat Method | 10.937 8.482 | 25.731 23.263 | 29.390 26.934
List append | 20.515 18.031 | 21.599 19.115 | 24.487 22.003
1000 concatenations 1'a' 10'a' 100'a'
------------------- ---------------------- ------------------- ----------------
List comp | 134.507 134.365 | 143.913 143.771 | 201.062 200.920
Optimized list append | 112.018 77.525 | 121.487 87.419 | 151.063 117.059
Concat Method | 214.329 180.093 | 290.380 256.515 | 324.572 290.720
List append | 167.625 133.619 | 176.241 142.267 | 205.259 171.313
10000 concatenations 1'a' 10'a' 100'a'
------------------- ---------------------- ------------------- ----------------
List comp | 1309.702 1309.560 | 1404.191 1404.049 | 2912.483 2912.341
Optimized list append | 1042.271 668.696 | 1134.404 761.036 | 2628.882 2255.804
Concat Method | 2310.204 1941.096 | 2923.805 2550.803 | STUCK STUCK
List append | 1624.795 1251.589 | 1717.501 1345.137 | 3182.347 2809.233
To sum up, I have made these decisions for myself:
- If you have a list of strings available, the string 'join' method is best and fastest.
- If you can use a list comprehension, that's the easiest, and fast as well.
- If you need 1 to 10 concatenations (on average) with lengths 1 to 100, list append and '+' take almost the same time (note that the times are scaled).
- Optimized list append seems very good in most situations.
- When the number of concatenations or the string length rises, '+' starts to take significantly more and more time. Note that for 10000 concatenations with 100'a', my PC got stuck!
- If you always use list append and 'join', you are safe at all times (as pointed out by Alex Martelli).
- But in some situations, say where you need to take user input and print "Hello user's world!", it is simplest to use '+'. I think building a list and joining it in this case, like x = input("Enter user name:") and then x.join(["Hello ", "'s world!"]), is uglier than "Hello %s's world!" % x or "Hello " + x + "'s world".
- Python 3.1 has improved concatenation performance, but in some implementations, like Jython, '+' is less efficient.
- Premature optimization is the root of all evil (as the experts say). Most of the time you do not need optimization, so don't waste time striving for it (unless you are writing a big or computation-heavy project where every micro/millisecond counts).
- Use this information and write in whatever way you like, taking your circumstances into consideration.
- If you really need optimization, use a profiler, find the bottlenecks, and try to optimize those.
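The greeting case from the list above can be written all three ways; they build the same string, so readability can decide (a minimal sketch with a fixed name standing in for input()):

```python
x = "Guido"   # stands in for x = input("Enter user name:")

by_join = x.join(["Hello ", "'s world!"])    # note: x acts as the *separator* here
by_format = "Hello %s's world!" % x
by_concat = "Hello " + x + "'s world!"

assert by_join == by_format == by_concat
print(by_concat)   # Hello Guido's world!
```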
Finally, I am trying to learn Python more deeply, so it would not be unusual if there are mistakes (errors) in my observations. Please comment on this and correct me if I am taking a wrong route. Thanks to all for participating.
Why is ''.join() faster than += in Python?
Let's say you have this code to build up a string from three strings:
x = 'foo'
x += 'bar' # 'foobar'
x += 'baz' # 'foobarbaz'
In this case, Python first needs to allocate and create 'foobar' before it can allocate and create 'foobarbaz'.
So for each += that gets called, the entire contents of the string and whatever is getting added to it need to be copied into an entirely new memory buffer. In other words, if you have N strings to be joined, you need to allocate approximately N temporary strings: the first substring gets copied ~N times, the last substring only once, and on average each substring gets copied ~N/2 times.
With .join, Python can play a number of tricks, since the intermediate strings do not need to be created. CPython figures out how much memory it needs up front and then allocates a correctly-sized buffer. Finally, it copies each piece into the new buffer, which means that each piece is copied only once.
There are other viable approaches that could lead to better performance for += in some cases, e.g. if the internal string representation were actually a rope, or if the runtime were smart enough to figure out that the temporary strings are of no use to the program and optimize them away.
However, CPython certainly does not do these optimizations reliably (though it may for a few corner cases), and since it is the most common implementation in use, many best practices are based on what works well for CPython. Having a standardized set of norms also makes it easier for other implementations to focus their optimization efforts.
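The ~N/2 figure above can be checked without a stopwatch by counting the characters a naive += loop copies versus what join copies (a sketch; it models the copy-everything behaviour described above and deliberately ignores CPython's occasional in-place resize optimization):

```python
def chars_copied_concat(n, piece_len=1):
    # naive +=: each step copies the whole accumulated string plus the new piece
    copied, size = 0, 0
    for _ in range(n):
        copied += size + piece_len
        size += piece_len
    return copied

def chars_copied_join(n, piece_len=1):
    # join: each piece is copied exactly once into a pre-sized buffer
    return n * piece_len

print(chars_copied_concat(1000))  # 500500 -> quadratic in n
print(chars_copied_join(1000))    # 1000   -> linear in n
```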
Python string join performance
It is true that you should not use '+'. Your example is quite special; try the same code with:
s1 = '*' * 100000
s2 = '+' * 100000
Then the second version (str.join) is much faster.
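The quoted answer leaves the harness to the reader; one way to run the comparison might look like this (the loop of 100 pieces is an illustrative choice of my own, and absolute timings will vary by machine):

```python
import timeit

s1 = '*' * 100000
s2 = '+' * 100000

def with_plus():
    out = ''
    for _ in range(100):
        out += s1 + s2
    return out

def with_join():
    parts = []
    for _ in range(100):
        parts.append(s1)
        parts.append(s2)
    return ''.join(parts)

assert with_plus() == with_join()   # both build the same 20 MB string
print('+    :', timeit.timeit(with_plus, number=10))
print('join :', timeit.timeit(with_join, number=10))
```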
How slow is Python's string concatenation vs. str.join?
From: Efficient String Concatenation
Method 1:
def method1():
    out_str = ''
    for num in xrange(loop_count):
        out_str += 'num'
    return out_str
Method 4:
def method4():
    str_list = []
    for num in xrange(loop_count):
        str_list.append('num')
    return ''.join(str_list)
Now I realise they are not strictly representative, and the 4th method appends to a list before iterating through and joining each item, but it's a fair indication.
String join is significantly faster than concatenation.
Why? Strings are immutable and can't be changed in place. To alter one, a new representation needs to be created (a concatenation of the two).
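Immutability can be observed directly: a minimal sketch showing that += binds the name to a brand-new string object while the original survives unchanged (a second reference is kept so CPython cannot quietly resize the buffer in place):

```python
s = "abc"
alias = s              # keep a second reference to the original object
s += "def"             # builds a new string and rebinds s

assert s == "abcdef"
assert alias == "abc"  # the original string was never modified
assert s is not alias  # s now names a different object
```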
Is list join really faster than string concatenation in python?
From Efficient String Concatenation in Python
Method 1: 'a' + 'b' + 'c'
Method 6: a = ''.join(['a', 'b', 'c'])
20,000 integers were concatenated into a string 86 kB long:
Concatenations per second Process size (kB)
Method 1 3770 2424
Method 6 119,800 3000
Conclusion: yes, str.join() is significantly faster than typical concatenation (str1 + str2).
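The benchmark can be re-run in modern Python along these lines (a sketch: str(i) supplies the 20,000 integer pieces as in the quoted article, but absolute numbers will differ from the figures in the table above):

```python
import timeit

N = 20000  # 20,000 integers, as in the quoted benchmark

def method1():
    out = ''
    for i in range(N):
        out = out + str(i)            # repeated concatenation
    return out

def method6():
    return ''.join([str(i) for i in range(N)])

assert method1() == method6()
print('concat:', timeit.timeit(method1, number=5))
print('join  :', timeit.timeit(method6, number=5))
```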
Why is concatenating strings running faster than joining them?
The time difference you're seeing comes from creating the list to be passed to join. And while you can get a small speedup from using a tuple instead, it's still going to be slower than just concatenating with + when there are only a few short strings.
It would be different if you had an iterable of strings to start with, rather than an object with strings as attributes. Then you could call join directly on the iterable, rather than needing to build a new one for each call.
Here's some testing I did with the timeit module:
import timeit

short_strings = ["foo", "bar", "baz"]
long_strings = [s * 1000 for s in short_strings]

def concat(a, b, c):
    return a + b + c

def concat_from_list(lst):
    return lst[0] + lst[1] + lst[2]

def join(a, b, c):
    return "".join([a, b, c])

def join_tuple(a, b, c):
    return "".join((a, b, c))

def join_from_list(lst):
    return "".join(lst)

def test():
    print("Short strings")
    print("{:20}{}".format("concat:",
                           timeit.timeit(lambda: concat(*short_strings))))
    print("{:20}{}".format("concat_from_list:",
                           timeit.timeit(lambda: concat_from_list(short_strings))))
    print("{:20}{}".format("join:",
                           timeit.timeit(lambda: join(*short_strings))))
    print("{:20}{}".format("join_tuple:",
                           timeit.timeit(lambda: join_tuple(*short_strings))))
    print("{:20}{}\n".format("join_from_list:",
                             timeit.timeit(lambda: join_from_list(short_strings))))
    print("Long Strings")
    print("{:20}{}".format("concat:",
                           timeit.timeit(lambda: concat(*long_strings))))
    print("{:20}{}".format("concat_from_list:",
                           timeit.timeit(lambda: concat_from_list(long_strings))))
    print("{:20}{}".format("join:",
                           timeit.timeit(lambda: join(*long_strings))))
    print("{:20}{}".format("join_tuple:",
                           timeit.timeit(lambda: join_tuple(*long_strings))))
    print("{:20}{}".format("join_from_list:",
                           timeit.timeit(lambda: join_from_list(long_strings))))
Output:
Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:57:17) [MSC v.1600 64 bit (AMD64)] on win32
>>> test()
Short strings
concat: 0.5453461176251436
concat_from_list: 0.5185697357936024
join: 0.7099379456477868
join_tuple: 0.5900842397209949
join_from_list: 0.4177281794285359
Long Strings
concat: 2.002303591571888
concat_from_list: 1.8898819841869416
join: 1.5672863477837913
join_tuple: 1.4343144915087596
join_from_list: 1.231374639083505
So, joining from an already existing list is always fastest. Concatenating with + is faster for individual items if they are short, but for long strings it is always slowest. I suspect the difference shown between concat and concat_from_list comes from the unpacking of the lists in the function calls in the test code.
Why is 'join' faster than normal concatenation?
The code in a join function knows up front all the strings it's being asked to concatenate and how large those strings are, and hence it can calculate the final string length before beginning the operation.
It therefore needs to allocate memory for the final string only once, and can then place each source string (and delimiter) in the correct place in memory.
On the other hand, a single += operation on a string has no choice but to allocate just enough memory for the concatenation of two strings. Each subsequent += must do the same, allocating memory that the next += will discard; every time, the ever-growing string is copied from one place in memory to another.
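The strategy described above can be mimicked by hand with a pre-sized bytearray (an ASCII-only illustrative sketch of the idea, not what CPython literally does; its buffer management happens in C):

```python
def manual_join(sep, parts):
    """Join ASCII strings the way the text describes: size first, copy once."""
    if not parts:
        return ''
    # 1. The final length is known before any copying starts.
    total = len(sep) * (len(parts) - 1) + sum(len(p) for p in parts)
    # 2. Allocate one correctly-sized buffer.
    buf = bytearray(total)
    # 3. Copy each piece (and delimiter) exactly once into its final slot.
    pos = 0
    for i, p in enumerate(parts):
        if i:
            buf[pos:pos + len(sep)] = sep.encode('ascii')
            pos += len(sep)
        buf[pos:pos + len(p)] = p.encode('ascii')
        pos += len(p)
    return buf.decode('ascii')

print(manual_join('-', ['a', 'b', 'c']))   # a-b-c
```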
What is the most efficient string concatenation method in Python?
If you know all the components beforehand, use literal string interpolation, also known as f-strings or formatted string literals, introduced in Python 3.6.
Given the test case from mkoistinen's answer, having strings
domain = 'some_really_long_example.com'
lang = 'en'
path = 'some/really/long/path/'
The contenders, and their execution times on my computer (Python 3.6 on Linux, as timed by IPython and the timeit module), are:
f'http://{domain}/{lang}/{path}' - 0.151 µs
'http://%s/%s/%s' % (domain, lang, path) - 0.321 µs
'http://' + domain + '/' + lang + '/' + path - 0.356 µs
''.join(('http://', domain, '/', lang, '/', path)) - 0.249 µs (notice that building a constant-length tuple is slightly faster than building a constant-length list)
Thus the shortest and most beautiful code possible is also the fastest.
The speed can be contrasted with the fastest method for Python 2, which is + concatenation on my machine: that takes 0.203 µs with 8-bit strings, and 0.259 µs if the strings are all Unicode.
(In alpha versions of Python 3.6 the implementation of f'' strings was the slowest possible - the generated bytecode was pretty much equivalent to the ''.join() case, with unnecessary calls to str.__format__, which without arguments just returns self unchanged. These inefficiencies were addressed before 3.6 final.)
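A harness to re-run the four contenders on your own machine might look like this (the relative ordering can shift between Python versions and machines; only the equality of the results is checked):

```python
import timeit

domain = 'some_really_long_example.com'
lang = 'en'
path = 'some/really/long/path/'

candidates = {
    'f-string': lambda: f'http://{domain}/{lang}/{path}',
    '%-format': lambda: 'http://%s/%s/%s' % (domain, lang, path),
    'concat':   lambda: 'http://' + domain + '/' + lang + '/' + path,
    'join':     lambda: ''.join(('http://', domain, '/', lang, '/', path)),
}

expected = 'http://some_really_long_example.com/en/some/really/long/path/'
for name, fn in candidates.items():
    assert fn() == expected            # all four build the identical URL
    # number=100000 runs; total seconds * 10 gives microseconds per call
    print('{:8} {:.3f} µs'.format(name, timeit.timeit(fn, number=100000) * 10))
```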