Leetcode 0940. Distinct Subsequences II

Problem statement

https://leetcode.com/problems/distinct-subsequences-ii/

Solution 1

First idea I have when I see this problem is to use similar idea as problem 0730 Count Different Palindromic Subsequences, here dfs(i) is answer for string S[i:] and we find first occurrence of letter a, b, ... z in this string.

Complexity

Time complexity is O(mn), where m is size of alphabet, space is O(n).

Code

class Solution:
    def distinctSubseqII(self, S):
        alphabet = "abcdefghijklmnopqrstuvwxyz"
        n, MOD = len(S), 10**9 + 7
        
        def letter_get(s, letter, dr):
            res, cur = [-1]*len(s), -1
            for i in range(len(s))[::dr]:
                if s[i] == letter: cur = i
                res[i] = cur
            return res
        
        corrs = defaultdict(list)
        for letter in alphabet:
            corrs[letter] = letter_get(S, letter, -1)
        
        @lru_cache(None)
        def dfs(i):
            if i > n - 1: return 1
            ans = 1
            for letter in alphabet:
                i0 = corrs[letter][i]
                if i <= i0 <= n - 1:
                    ans += dfs(i0 + 1)
            return ans % MOD
                    
        return dfs(0) - 1

Solution 2

Define by dp[i] is number of unique subsequences, which ends with symbol i. On each step we will keep [0]*26 array, because size of alphabet is 26 (in fact originally it is 2d dp table, but we always need only last line).

Imagine, that we have s1, ..., sk being unique subsequences when we reached symbol with index j. What happens when we go to index j + 1? Imagine we have new symbol x. Then we can say that number of unique subsequences which end with X is k + 1: s1 + X, s2 + X, ..., xk + X, X. Number of unique subsequences which ends on the other symbols stays the same.

Complexity

It is O(mn) for time, where m is size of alphabet and O(m) for space.

Code

class Solution:
    def solve(self, s):
        dp = [0] * 26
        mod = 10 ** 9 + 7
        for ch in s:
            dp[ord(ch) - ord("a")] = (sum(dp) + 1) % mod
        return sum(dp) % mod

Solution 3

There is very elegant and short solution with O(n) complexity as well! Let us add symbol by symbol and evaluate how many options we have. We need to multiply it by two and subtract all subsequences we meet twice: for this we need to find previous occurrence of letter we are adding. Imagine we have acbab, we found all subsequences for acba:

"", a, b, c, ac, ab, aa, cb, ca, ba, acb, aca, aba, cba, acba
New ones are b, ab, bb, cb, acb, abb, aab, cbb, cab, bab, acbb, acab, abab, cbab, acbab
Repetitions are b, ab, cb, acb, number of it equal to dp[2]

Complexity

It is O(n) for time and O(n) for space.

Code

class Solution:
    def distinctSubseqII(self, S):
        dp, last = [1], {}
        for i, x in enumerate(S):
            dp.append(dp[-1] * 2)
            if x in last:
                dp[-1] -= dp[last[x]]
            last[x] = i

        return (dp[-1] - 1) % (10**9 + 7)

Remark

See similar problem 1987. Number of Unique Good Subsequences