Problem statement

https://leetcode.com/problems/find-duplicate-file-in-system/

Solution

When you understand what is asked, it is actually very easy: all you need to do is to use hash-table (defaultdict), where keys are content of files and values are list of path names. What we do we iterate for path in paths, then we split each path, using space, first part is the path of folder and for the other parts we split it again, using (. Then we add item to defaultdict. Finally, we iterate over our ans and choose only those elements, for which we have at least 2 duplicates.

Complexity

Time and space complexity is O(T + S), where T is total sum of all length of full path names and S is total length of all content of files. So we can say, this solution is linear.

Code

class Solution:
    def findDuplicate(self, paths):
        ans = defaultdict(list)
        for path in paths:
            l = path.split()
            for i in range(1, len(l)):
                name, cont = l[i].split("(")
                ans[cont[:-1]].append(l[0] + "/" + name)
                
        return [i for i in ans.values() if len(i) > 1]