Most Pythonic Way to Interleave Two Strings

Most pythonic way to interleave two strings

For me, the most pythonic* way is the following which pretty much does the same thing but uses the + operator for concatenating the individual characters in each string:

res = "".join(i + j for i, j in zip(u, l))
print(res)
# 'AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz'

It is also faster than using two join() calls:

In [5]: l1 = 'A' * 1000000; l2 = 'a' * 1000000

In [6]: %timeit "".join("".join(item) for item in zip(l1, l2))
1 loops, best of 3: 442 ms per loop

In [7]: %timeit "".join(i + j for i, j in zip(l1, l2))
1 loops, best of 3: 360 ms per loop

Faster approaches exist, but they often obfuscate the code.

Note: If the two input strings are not the same length then the longer one will be truncated as zip stops iterating at the end of the shorter string. In this case instead of zip one should use zip_longest (izip_longest in Python 2) from the itertools module to ensure that both strings are fully exhausted.


*To take a quote from the Zen of Python: Readability counts.

Pythonic = readability for me; i + j is just visually parsed more easily, at least for my eyes.

All possible ways to interleave two strings

The Idea

Let the two strings you want to interleave be s and t. We will use recursion to generate all the possible ways to interleave these two strings.

If at any point of time we have interleaved the first i characters of s and the first j characters of t to create some string res, then we have two ways to interleave them for the next step-

  1. Append the i+1 th character of s to res
  2. Append the j+1 th character of t to res

We continue this recursion till all characters of both the strings have been used and then we store this result in a list of strings lis as in the code below.

The Code

def interleave(s, t, res, i, j, lis):
if i == len(s) and j == len(t):
lis.append(res)
return
if i < len(s):
interleave(s, t, res + s[i], i + 1, j, lis)
if j < len(t):
interleave(s, t, res + t[j], i, j + 1, lis)

l = []
s = "ab"
t = "cd"
interleave(s, t, "", 0, 0, l)
print l

Output

['abcd', 'acbd', 'acdb', 'cabd', 'cadb', 'cdab']

This implementation is as efficient as we can get (at least asymptotically) since we never generate the same string twice.

How to Interleave two strings in Python when their lengths are not equal?

Use zip_longest from itertools.

from itertools import zip_longest

u = 'Abcdefgh'
l = 'Wxyz'
res = "".join(i + j for i, j in zip_longest(u, l, fillvalue=''))
print(res)

Interleave two multiline strings in Python

Multiline-strings are denoted with ''' in Python.

>>> s1 = '''AAA1
...:AAA2
...:AAA3'''
>>>
>>> s2 = '''BBB1
...:BBB2
...:BBB3'''

You can split them with str.splitlines and interleave them with zip.

>>> strings = [s1, s2]
>>> pairs = zip(*(s.splitlines() for s in strings))
>>> result = '\n'.join(''.join(pair) for pair in pairs)
>>>
>>> print(result)
AAA1BBB1
AAA2BBB2
AAA3BBB3

A generic function that accepts any number of multiline strings using the *args syntax can be written as follows.

>>> def combine(*strings):
...: lines = zip(*(s.splitlines() for s in strings))
...: return '\n'.join(''.join(line) for line in lines)
>>>
>>> str1 = '''A
...:D
...:H'''
>>> str2 = '''B
...:E
...:I'''
>>> str3 = '''C
...:F
...:J'''
>>>
>>> print(combine(str1, str2, str3))
ABC
DEF
HIJ

Note that zip truncates the iterable it produces to the length of its shortest argument, i.e. the result has as many lines as the shortest multiline string passed to combine.

If you expect strings with different numbers of lines and need fill-values, you can play around with zip_longest from the itertools module.

How can I interleave or create unique permutations of two strings (without recursion)

Your problem can be reduced to that of creating all unique permutations of a particular list. Say A and B are the lengths of the strings arr1 and arr2, respectively. Then construct a list like this:

[0] * A + [1] * B

There exists a one-to-one correspondence (a bijection) from the unique permutations of this list to all the possible interleavings of the two strings arr1 and arr2. The idea is to let each value of the permutation specify which string to take the next character from. Here is an example implementation showing how to construct an interleaving from a permutation:

>>> def make_interleave(arr1, arr2, permutation):
... iters = [iter(arr1), iter(arr2)]
... return "".join(iters[i].next() for i in permutation)
...
>>> make_interleave("ab", "cde", [1, 0, 0, 1, 1])
'cabde'

I found this question in the python mailing list which asks how to solve this problem in an efficient manner. The answers suggest using an algorithm which is described in Knuth's The Art of Computer Programming, Volume 4, Fascicle 2: Generating All Permutations. I found an online pdf of the draft here. The algorithm is also described in this wikipedia article.

Here's my own annotated implementation of the next_permutation algorithm, as a python generator function.

def unique_permutations(seq):
    """
    Yield only unique permutations of seq in an efficient way.

    A python implementation of Knuth's "Algorithm L", also known from the 
    std::next_permutation function of C++, and as the permutation algorithm
of Narayana Pandita.
    """

    # Precalculate the indices we'll be iterating over for speed
    i_indices = list(range(len(seq) - 1, -1, -1))
    k_indices = i_indices[1:]

    # The algorithm specifies to start with a sorted version
    seq = sorted(seq)

    while True:
        yield seq

# Working backwards from the last-but-one index,          k
        # we find the index of the first decrease in value.  0 0 1 0 1 1 1 0
        for k in k_indices:
            if seq[k] < seq[k + 1]:
                break
        else:
# Introducing the slightly unknown python for-else syntax:
# else is executed only if the break statement was never reached.
# If this is the case, seq is weakly decreasing, and we're done.
            return

# Get item from sequence only once, for speed
        k_val = seq[k]

        # Working backwards starting with the last item,       k     i
        # find the first one greater than the one at k   0 0 1 0 1 1 1 0
        for i in i_indices:
            if k_val < seq[i]:
                break

        # Swap them in the most efficient way
        (seq[k], seq[i]) = (seq[i], seq[k])           #       k     i
                                          # 0 0 1 1 1 1 0 0

# Reverse the part after but not k
# including k, also efficiently. 0 0 1 1 0 0 1 1
seq[k + 1:] = seq[-1:k:-1]

Each yield of the algorithm has an amortized complexity of O(1), according to this question, but according to rici who commented below, this is only the case if all numbers are unique, which they are definately not in this case.

In any case, the number of yields provides a lower bound for the time complexity, and it is given by


(A + B)! / (A! * B!)

Then to find the real time complexity we need to sum the average complexity of each yield with the complexity of constructing the resulting string based on the permutation. If we multiply this sum with the above formula we get the total time complexity.

Checking interweaving strings in pythonic way

One approach of using the lru_cache with lambda to speed things up:

from functools import lru_cache
class Solution:
def isInterleave(self, s1: str, s2: str, s3: str) -> bool:
return (memo:=lru_cache()
( lambda i=0, j=0, k=0:
(i, j, k) == (len(s1), len(s2), len(s3))
or ( i < len(s1) and k < len(s3) and s1[i] == s3[k] and memo( i+1, j, k+1 ) )
or ( j < len(s2) and k < len(s3) and s2[j] == s3[k] and memo( i, j+1, k+1 ) ) )
)()

combine two strings in python

try

y = str(i) + str(x)

it should works.



Related Topics



Leave a reply



Submit