string
sort
trie
]
Leetcode 0527 Word Abbreviation
Problem statement
https://leetcode.com/problems/word-abbreviation/
Solution
We can have conflict, if we have equal first and last symbol and the same length. So, let us sort our dic
, using key (len(x), x[0], x[-1], x[1:-1])
. Now, let us traverse sorted list and for each pair of adjacent words, create abbreviations of this two words, updating comm
value. For example, if we have words abcdxxxz, abcexxxz, abcetyxz
, for word abcexxxz
it have common prefix of size 3
with previous word and common prefix of size 4
with next word, so we need to use abbreviation abcex2z
. In the end we traverse our list and check if abbreviation is built or not.
Complexity
Time complexity is O(n log n * m)$, where
n is number of words in
dic and
m is average size of word in
dic. Space complexity is
O(nm) to keep words in
d`.
Code
from os.path import commonprefix
class Solution:
def abbr(self, word, size):
if len(word) - size <= 3: return word
return word[:size + 1] + str(len(word) - size - 2) + word[-1]
def wordsAbbreviation(self, dic):
n = len(dic)
dic_sorted = sorted(dic, key = lambda x: (len(x), x[0], x[-1], x[1:-1]))
d = defaultdict(tuple)
for i in range(1, n):
w0 = dic_sorted[i-1]
w1 = dic_sorted[i]
if w1[0] == w0[0] and w1[-1] == w0[-1] and len(w1) == len(w0):
comm = len(commonprefix([w1, w0]))
d[w1] = max(d[w1], (comm, self.abbr(w1, comm)))
d[w0] = max(d[w0], (comm, self.abbr(w0, comm)))
return [d[w][1] if w in d else self.abbr(w,0) for w in dic]
Remark
Instead of sorting words, we can use tries. Let us split data into groups, where inside each group we have the same starting, ending symbols and length. Then what we need to do inside each group is to find for each word the length of longest common prefix between this word and all other words, which can be done, using Tries. Note, that is not Longest Common Prefix problem, where we need to find common prefix for all strings. Here for each string, we need to return maximum of all length of common prefix between this string and each other.
I think complexity will be O(mn)
, both time and space.