bit-dp
dp
bitmask
]
Leetcode 0943. Find the Shortest Superstring
Problem statement
https://leetcode.com/problems/find-the-shortest-superstring/
Solution1
This is actually Travelling salesman problem (TSP) problem: if we look at our strings as nodes, then we can evaluate distance between one string and another, for example for abcde
and cdefghij
distance is 5, because we need to use 5 more symbols fghij
to continue first string to get the second. Note, that this is not symmetric, so our graph is oriented.
Then our problem is equivalent to Traveling Salesman Problem (TSP): we need to find path with smallest weight, visiting all nodes. Here we have 2^n * n
possible states, where state consists of two elements:
mask
: binary mask with length n with already visited nodes.last
: last visited node.- In each
dp[mask][last]
we will keep length of built string + built string itself.
Complexity
There will be O(n)
possible transitions for each state: for example to the state (10011101, 7)
we will have possible steps from (00011101, 0), (00011101, 2), (00011101, 3), (00011101, 4)
. Finally, time complexity is O(2^n*n^2*M)
, where M
is the length of answer and space complexity is O(2^n*n*M)
as well.
class Solution:
def shortestSuperstring(self, A):
@lru_cache(None)
def connect(w1, w2):
return [w2[i:] for i in range(len(w1) + 1) if w1[-i:] == w2[:i] or not i][-1]
N = len(A)
dp = [[(float("inf"), "")] * N for _ in range(1<<N)]
for i in range(N): dp[1<<i][i] = (len(A[i]), A[i])
for mask in range(1<<N):
n_z_bits = [j for j in range(N) if mask & 1<<j]
for j, k in permutations(n_z_bits, 2):
cand = dp[mask ^ 1<<j][k][1] + connect(A[k], A[j])
dp[mask][j] = min(dp[mask][j], (len(cand), cand))
return min(dp[-1])[1]
Solution 2
Idea is exactly the same as in previous solutoin, but here we can use functionaliry of lru_cache
.
Complexity
is the same as Solution 1.
Code
class Solution:
def shortestSuperstring(self, A):
@lru_cache(None)
def suff(w1, w2):
return [w2[i:] for i in range(len(w1) + 1) if w1[-i:] == w2[:i] or not i][-1]
@lru_cache(None)
def dp(mask, l):
if mask + 1 == 1<<N: return ""
return min([suff(A[l], A[i]) + dp(mask | 1<<i, i) for i in range(N) if not mask & 1<<i], key = len)
N = len(A)
return min([A[i] + dp(1<<i, i) for i in range(N)], key=len)
Or you can write it very long, but oneliner
return (lambda f, c: min([A[i] + f(1<<i, i, c) for i in range(len(A))], key=len))((f := lambda m, l, c: c.get((m, l)) or c.setdefault((m,l), "" if m + 1 == 1<<len(A) else min([next(A[i][j:] for j in range(len(A[i]), -1, -1) if A[l].endswith(A[i][:j])) + f(m | 1<<i, i, c) for i in range(len(A)) if not m & 1<<i], key = len))), {})
Solution 3
Here is improvement, which potentially will work faster for big strings: instead of keeping built string for every state, we keep only its length and connection to previous node.
Complexity
Here time and space complexity is O(2^n*n^2)
, without constant M
. However in practice it works just a bit faster, because of testcases.
class Solution:
def shortestSuperstring(self, A):
@lru_cache(None)
def connect(w1, w2):
return [(w2[i:], len(w2) - i) for i in range(len(w1) + 1) if w1[-i:] == w2[:i] or not i][-1]
N = len(A)
dp = [[(float("inf"), -1)] * N for _ in range(1<<N)]
for i in range(N): dp[1<<i][i] = (len(A[i]), -1)
for mask in range(1<<N):
n_z_bits = [j for j in range(N) if mask & 1<<j]
for j, k in permutations(n_z_bits, 2):
dp[mask][j] = min(dp[mask][j], (dp[mask ^ 1<<j][k][0] + connect(A[k], A[j])[1], k))
mask = (1<<N) - 1
prev = min(dp[mask])
last = dp[mask].index(prev)
prev = prev[1]
ans = ""
while prev != -1:
ans = connect(A[prev], A[last])[0] + ans
mask -= (1<<last)
prev, last = dp[mask][prev][1], prev
return A[last] + ans
Remark
Let me know if you know how to write the solution 3 in shorter way, I think code is too long. Also see my collections of similar problems: https://leetcode.com/discuss/general-discussion/1125779/dynamic-programming-on-subsets-with-examples-explained