Problem statement

https://leetcode.com/problems/similar-string-groups/

Solution

Usual graph problem, where we need to find connected components. Let n be number of words and m be length of each word. Then we can create graph of connectedness in two different ways:

  1. Compare each pair of words in O(m), leading to O(n^2m) complexity.
  2. For each word find all candidates it can be connected with, for each word we need to check O(m^2) candidates, each of them having length m, so overall complexity will be O(m^3n). On leetcode this will work slower, so we do not really need it.

Then we just perform usual dfs on n nodes, where each of them can be connected with at most O(m^2).

Complexity

Time complexity is O(n^2m + m^2n). Space complexity is O(n^2).

Code

class Solution:
    def numSimilarGroups(self, strs):
        n, m, count = len(strs), len(strs[0]), 0
        set_strs, Graph = set(strs), defaultdict(set)
        visited = set()
        
        for i1, i2 in combinations(range(n), 2):
            if sum(i!=j for i,j in zip(strs[i1], strs[i2])) <= 2:
                Graph[i1].add(i2)
                Graph[i2].add(i1)
                    
        def dfs(i):
            if i in visited: return
            visited.add(i)
            for neib in Graph[i]:
                dfs(neib)
                
        for i in range(n):
            if i not in visited:
                count += 1
                dfs(i)
        
        return count

Remark

Also, we can optimize our code several times, if we use smarter comparison function: we do not need to check all m elements each time and we can break if we have more than do places, where strings are different.

def similar(word1, word2):
    diff = i = 0
    while i < WN and diff <= 2:
        if word1[i] != word2[i]: diff += 1
        i += 1
    return diff <= 2