Is the Time-Complexity of Iterative String Append Actually O(N^2), or O(N)?

Is the time-complexity of iterative string append actually O(n^2), or O(n)?

In CPython, the standard implementation of Python, there's an implementation detail that usually makes this O(n). It lives in the code the bytecode evaluation loop calls for + or += with two string operands: if Python detects that the left argument has no other references, it calls realloc to try to resize the string in place and avoid a copy. This is not something you should ever rely on, both because it's an implementation detail and because if realloc ends up needing to move the string frequently, performance degrades to O(n^2) anyway.
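A rough way to observe this for yourself is to time the same append loop with and without a second reference to the string (a minimal sketch; the function name and loop sizes are illustrative, and the speed-up is CPython-specific):

import time

def build(n, keep_alias=False):
    # Append one character n times. Keeping an extra reference to s
    # defeats CPython's in-place resize, forcing a full copy on each +=.
    s = ""
    for _ in range(n):
        if keep_alias:
            alias = s  # second reference: refcount > 1, so no in-place resize
        s += "x"
    return s

for keep in (False, True):
    start = time.perf_counter()
    build(50_000, keep_alias=keep)
    print(f"keep_alias={keep}: {time.perf_counter() - start:.3f}s")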

Without the weird implementation detail, the algorithm is O(n^2) due to the quadratic amount of copying involved. Code like this would only make sense in a language with mutable strings, like C++, and even in C++ you'd want to use +=.

Time complexity of string concatenation in Python

Yes, in your case*1 string concatenation requires all characters to be copied; this is an O(N+M) operation (where N and M are the sizes of the input strings). Repeating that append M times therefore trends towards O(M^2) time.

You can avoid this quadratic behaviour by using str.join():

word = ''.join(list_of_words)

which only takes O(N) (where N is the total length of the output). Or, if you are repeating a single character, you can use:

word = m * char

You are prepending characters; building a list first and then reversing it (or using a collections.deque() object to get O(1) prepending behaviour) would still be O(N) overall, easily beating the O(N^2) approach here, as sketched below.
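For example, a prepend-heavy build can stay linear by collecting the pieces first and producing the string only once (a minimal sketch; chars is a hypothetical input consumed left to right):

from collections import deque

chars = "abcde"  # hypothetical input

# Option 1: append to a list, then reverse once at the end -- O(N) total.
parts = []
for c in chars:
    parts.append(c)
word = ''.join(reversed(parts))

# Option 2: deque gives O(1) appendleft, so no reversal is needed -- O(N) total.
d = deque()
for c in chars:
    d.appendleft(c)
word = ''.join(d)

print(word)  # 'edcba'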


*1 As of Python 2.4, the CPython implementation avoids creating a new string object when using strA += strB or strA = strA + strB, but this optimisation is both fragile and not portable. Since you use strA = strB + strA (prepending) the optimisation doesn't apply.

Is the time complexity of string += a the same as string = string + a?

They have the same time complexity.

In the general case, as defined by the Python language standard: they both have the same time complexity of O(n), where n is the length of string s.

In practice, with the CPython implementation: They can in some cases both be O(1), because of an optimization that the interpreter can do when it detects that s is the only reference to the string in question.

Demo (using Python 3.10.1):

O(1) (optimization in play):

String of length 10⁹, using +=:

$ python -m timeit --setup='s = "s" * 10**9' 's += "a"'
5000000 loops, best of 5: 96.6 nsec per loop

String of length 10⁹, using +:

$ python -m timeit --setup='s = "s" * 10**9' 's = s + "a"'
5000000 loops, best of 5: 95.5 nsec per loop

String of length 1, using +=:

$ python -m timeit --setup='s = "s"' 's += "a"'
5000000 loops, best of 5: 97 nsec per loop

String of length 1, using +:

$ python -m timeit --setup='s = "s"' 's = s + "a"'
5000000 loops, best of 5: 97.9 nsec per loop

O(n) (optimization doesn't apply):

String of length 10⁹, optimization doesn't apply, using +=:

$ python -m timeit --setup='s = "s" * 10**9; b = s' 's += "a"'
1 loop, best of 5: 440 msec per loop

String of length 10⁹, optimization doesn't apply, using +:

$ python -m timeit --setup='s = "s" * 10**9; b = s' 's = s + "a"'
1 loop, best of 5: 445 msec per loop

String of length 1, optimization doesn't apply, using +=:

$ python -m timeit --setup='s = "s"; b = s' 's += "a"'
5000000 loops, best of 5: 85.8 nsec per loop

String of length 1, optimization doesn't apply, using +:

$ python -m timeit --setup='s = "s"; b = s' 's = s + "a"'
5000000 loops, best of 5: 84.8 nsec per loop

More info about the time complexity of string concatenation:
https://stackoverflow.com/a/37133870/9835872
https://stackoverflow.com/a/34008199/9835872

Big O for Palindrome Checker Python

About the first part: string[::-1] is O(n), not O(n²). I am not sure how exactly it is done internally, but it definitely does not repeatedly append each character to a sequence of growing immutable strings. Sometimes, things like this can easily be checked in Python itself by timing the execution:

>>> s1 = str(list(range(1000)))                                             
>>> s2 = str(list(range(10000)))
>>> len(s1), len(s2)
(4890, 58890)
>>> %timeit s1[::-1]
3.03 µs ± 10.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> %timeit s2[::-1]
34.4 µs ± 60.6 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

About the second part: this particular recursive algorithm should indeed be O(n²) in both time and space. The main problem is, of course, repeatedly creating ever-smaller slices of the string (which, according to this question, creates actual copies of the string and not just a "view" onto part of it). Stacked on top of each other, those slices would have the shape of a triangle, i.e. not quite "quadratic" but still O(n²). The case would be different, though, if instead of slicing you passed the start and end index (or just the offset from both ends) as additional parameters: then there is no slicing, and Python only has to store a reference to the original string on the stack.
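For illustration, here is a sketch of the index-based variant described above (the function name and signature are just an assumption); passing offsets instead of slices means each call does O(1) extra work and uses O(1) extra space apart from the recursion stack:

def is_palindrome(s, lo=0, hi=None):
    # Recursive palindrome check without slicing: O(n) time, O(n) stack depth.
    if hi is None:
        hi = len(s) - 1
    if lo >= hi:          # zero or one character left
        return True
    if s[lo] != s[hi]:    # mismatch at the current ends
        return False
    return is_palindrome(s, lo + 1, hi - 1)  # no copies, just two indices

print(is_palindrome("racecar"))  # True
print(is_palindrome("race"))     # False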

What is the complexity of str() function in Python3?

As can be seen from the source code, the implementation of int.__str__ has runtime complexity of O(m*n) where m is the number of binary digits and n is the number of decimal digits. Since for an integer i the number of digits in an arbitrary base b is given by log(i, base=b) and logarithms in different bases differ only by a constant, the runtime complexity is O(log(i)**2), i.e. quadratic in the number of digits.

This can be verified by running a performance test:

import perfplot

perfplot.show(
    setup=lambda n: 10**n,
    kernels=[str],
    n_range=range(1, 1001, 10),
    xlabel='number of digits',
)

performance plot
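If perfplot isn't available, a rough check using only the standard library points the same way (a sketch; digit counts are kept below CPython 3.11's default int-to-str limit of 4300 digits, and the exact numbers are illustrative):

import timeit

for digits in (1000, 2000, 4000):
    n = 10 ** (digits - 1)  # an integer with exactly `digits` decimal digits
    t = timeit.timeit(lambda: str(n), number=200)
    print(f"{digits:>5} digits: {t:.4f} s")
# Doubling the digit count roughly quadruples the time, as expected for O(d^2).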

The quadratic time complexity in the number of digits is also mentioned in the issue for CVE-2020-10735:

[...] A huge integer will always consume a near-quadratic amount of CPU time in conversion to or from a base 10 (decimal) string with a large number of digits. No efficient algorithm exists to do otherwise.

Algorithm, Big O notation: Is this function O(n^2)? Or O(n)?

During every iteration of the loop, the statement answer += c; must copy each and every character already in the string answer to a new string.

E.g. n = 5, c = '5'

  • First loop: answer is an empty string, but it must still create a new string. There is one operation to append the first '5', and answer is now "5".
  • Second loop: answer now points to a new string, with the first '5' copied over and another '5' appended, to make "55". Not only is a new String created; one character '5' is copied from the previous string and one more is appended, for two characters written in total.
  • "n"th loop: answer will now point to a new string, with n - 1 '5' characters copied to a new string, and an additional '5' character appended, to make a string with n 5s in it.

The number of characters written is 1 + 2 + ... + n = n(n + 1)/2. This is O(n^2).

The efficient way to construct strings like this in a loop in Java is to use a StringBuilder: a single mutable object that doesn't need to copy all the existing characters every time one is appended. Building the string with a StringBuilder costs O(n) overall.

What is the time complexity of this function that iterates through a list and creates a dictionary?

Your time and space complexity are both Theta(n). While sometimes it can be useful for clarity to include terms in the time or space complexity that don't change the asymptotic value (a classic example being string searching algorithms), it doesn't make as much sense here.

While your claim of O(n + m^2) time complexity is technically correct as Big-O notation is an upper bound, you can show that O(n) is also an upper bound, since the dictionary has size at most n and we iterate over each key exactly once: there are n items read from input, at most n loop iterations for the dictionary, and n items appended to the output list.

If you wanted, you could calculate the 'auxiliary' space required, i.e. the space needed not counting the input or output arrays. Here, that would be Theta(m). You should note, however, that such analyses are fairly uncommon: unless specified otherwise, space complexity analysis is assumed to include the size of the output.
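The function under discussion isn't reproduced here, so the following is a plausible reconstruction (the name and details are assumptions) that matches the walkthrough below and makes the Theta(n) argument concrete:

def group_duplicates(input_list):
    # Group equal values together while keeping first-seen order.
    d = {}
    output_list = []
    # First loop: append each first occurrence and record the number of
    # extra occurrences (count minus one) -- Theta(n) expected time.
    for x in input_list:
        if x in d:
            d[x] += 1
        else:
            d[x] = 0
            output_list.append(x)
    # Second loop: emit the remaining duplicates. The total number of inner
    # iterations equals the number of duplicates, so this stays Theta(n).
    for k, v in d.items():
        if v > 0:
            for i in range(v):
                output_list.append(k)
    return output_list

print(group_duplicates([1, 1, 1, 1, 1, 1, 1, 2, 2, 2]))
# [1, 2, 1, 1, 1, 1, 1, 1, 2, 2]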


To address a common confusion about why the second loop is still linear time with many duplicate values, let's look at an example.

The lines in question are:

for k, v in d.items():
    if v > 0:
        for i in range(v):
            output_list.append(k)

Suppose our input list was [1, 1, 1, 1, 1, 1, 1, 2, 2, 2] (ten elements total: seven '1's and three '2's).

Then d.items() (which holds the count of each element, minus one) would look like: [(key: 1, value: 6), (key: 2, value: 2)] (it's not really stored as a Python list of tuples internally, but these are the full contents of the items view object).

Let's walk through that second loop's operation, line by line:

for k, v in [(key: 1, value: 6), (key: 2, value: 2)]:
    # On our first iteration, k = 1 and v = 6.

    if 6 > 0:                      # 1 operation
        for i in range(6):         # 6 operations
            output_list.append(1)  # 6 operations

    # For the (k, v) pair of (1, 6), the inner loop has run 6 times.
    # Every time the inner loop runs, the length of output_list
    # increases by 1.

# Second iteration of the outer loop:
for k, v in [(key: 1, value: 6), (key: 2, value: 2)]:
    # On our second iteration, k = 2 and v = 2.

    if 2 > 0:                      # 1 operation
        for i in range(2):         # 2 operations
            output_list.append(2)  # 2 operations

    # For the (k, v) pair of (2, 2), the inner loop has run 2 times.
    # In total: 8 inner-loop iterations, and output_list has grown by 8.

In informal complexity analysis, a 'standard' rule of thumb is that the run-time of a double-nested loop is often quadratic. This is because we're counting the total number of inner-loop iterations, for example

for i in range(n):
    for j in range(n):
        ...

as

(n inner-loops per outer-loop) * (n outer-loops) = n^2 inner-loops

This assumption shouldn't be applied when the number of inner-loop iterations varies dramatically based on the state of the outer loop. In our example, the number of inner-loop iterations is v, which depends on the outer loop.

To find the runtime here, we need a different way to count how many inner-loop iterations occur. Luckily, we can do that: in each inner-loop iteration, we append an element to output_list.

Since the final length of output_list is n, we know that the inner loop has executed at most n times (technically, it has executed exactly n-m times, since output_list already has size m after the earlier dictionary-initializing loop has terminated). Rather than incorrectly multiplying this by m, the number of outer-loop iterations, we should add the inner- and outer-loop iteration counts, for a runtime of Theta(n+m), which is Theta(n).


Addendum: comments have correctly pointed out that Python dictionaries do not offer an O(1) amortized worst-case lookup/insert guarantee, so the first loop's worst case is Omega(m*n). While Python uses pseudo-random probing on an open-addressing table, this only ensures good 'average' performance. Thanks to Kelly Bundy for the highly informative discussion and corrections.

Unfortunately, while O(1) amortized worst-case lookup/insert hashing is possible, for example with Cuckoo hashing, in practice this is significantly slower on average than what's currently used in most standard libraries, and is unlikely to change in the near future.


