hash table
random
design
]
Leetcode 0535. Encode and Decode TinyURL
https://leetcode.com/problems/encode-and-decode-tinyurl
The idea of this problem is the following: let us keep 2
dictinoaries:
self.long_short
to keep links between original long address and encoded short address.self.short_long
to keep the opposit connections.
Why we need to keep both dictionaries? Because we want to do fast encoding and decoding.
1.encode(self, longUrl)
will work like this: let us try to generate random code
, say with 6
symbols, which consists of letters. If this code was not used for some other long link, we are happy: we put connections to our direct and inverse dictionaries. If it happen, that this code was used for some other long link, we are not happy, and we generate one more code and so on.
2.decode(self, shortUrl)
is pretty straightforward: we just look at our selfl.short_long
dictionary.
Complexity: time complexity of one encoding and decoding is O(n + m)
, where n
is length of original string and m
is length of encoded strigng, if we assume that probability of collision is small enough. So, what is probability of collision: we have 26^6
approx N = 3*10^8
different options, and if we use Birthday problem, than if you have T = 1000
different strings to encode we will have (1-1/N)*(1-2/N)*...(1-(T-1)/N)
probability to not have a collision, which is very close to 1
. In fact, we need to choose T
, such that T^2 << N
, and in this case probability can be approximated as exp(T^2/2*N)
, which in our case equal to exp(-1/300)
approx (1 - 1/300)
.
class Codec:
def __init__(self):
self.long_short = {}
self.short_long = {}
self.alphabet = "abcdefghijklmnopqrstuvwzyz"
def encode(self, longUrl):
while longUrl not in self.long_short:
code = "".join(choices(self.alphabet, k=6))
if code not in self.short_long:
self.short_long[code] = longUrl
self.long_short[longUrl] = code
return 'http://tinyurl.com/' + self.long_short[longUrl]
def decode(self, shortUrl):
return self.short_long[shortUrl[-6:]]
If you like the solution, you can upvote it on leetcode discussion section: Problem 0535