Problem statement

https://leetcode.com/problems/string-compression-ii/

Solution

Let dp(i, l, cl, k) be dymamic programming state, where:

  1. we alreadh traversed string s[:i]
  2. l is the last letter in created so-far compression
  3. lc is how many times we have the last letter repeated
  4. k is how many symbols we allowed to delete.

Then if we meet new symbol we can have two options:

  1. It is equal to our last symbol, in this case we always will take the symbol. Also we increase answer by 1 if we have a -> aa, a*9 -> a*10 and a*99 -> a*100 cases.
  2. If we have symbol different that we have in our built string so far, then we can have two options: take it or not.

Complexity

Complexity is O(n*n*k), because there is n options for i, k options for k and n options for l, lc pair. Space complexity is the same.

Code

class Solution:
    def getLengthOfOptimalCompression(self, s, k):
        @lru_cache(None)
        def dp(i, l, lc, k):
            if k < 0: return float('inf')
            if i >= n: return 0
            if s[i] == l:
                return dp(i+1, l, lc+1, k) + (lc in [1, 9, 99])
            else:
                return min(1 + dp(i+1, s[i], 1, k), dp(i+1, l, lc, k-1))
        
        n = len(s)
        return dp(0, "", 0, k)

Note that actually we did optimization here: if we have aaaaa, then what we do if we delete symbols from group of equal symbols, we do it only from beginning. Solution can be written in the following way as well:

class Solution:
    def getLengthOfOptimalCompression(self, s, k):
        @lru_cache(None)
        def dp(i, l, lc, k):
            if k < 0: return float('inf')
            if i >= n: return 0
            ans = dp(i+1, l, lc, k-1)
            if s[i] == l:
                return min((lc in [1, 9, 99]) + dp(i+1, l, lc+1, k), ans)
            else:
                return min(1 + dp(i+1, s[i], 1, k), ans)
        
        n = len(s)
        return dp(0, "", 0, k)