Efficiently Grouping Anagrams in Python

Anagrams are words formed by rearranging the letters of another word, like "eat", "tea", and "ate". Grouping anagrams efficiently is a common coding challenge in interviews and competitive programming.

In this article, we’ll explore two approaches:

Brute Force Approach – Comparing every word with others.
Optimal Approach – Using frequency counting and hashing for efficiency.

Let’s dive into both methods and analyze their performance.

1️⃣ Brute Force Approach

Explanation:

Compare each word with every other word in the list.
Sort both words and check if they match.
If they are anagrams, group them together.
Mark words that have already been grouped to avoid duplicates.

Code Implementation:

pythonCopyEditfrom collections import defaultdict

str = ["eat", "tea", "tan", "ate", "nat", "bat"]

def groupAnagram(str):
    res = []
    for i in range(len(str)):
        if str[i] != "":
            temp = [str[i]]
            for j in range(i+1, len(str)):
                if sorted(str[i]) == sorted(str[j]):
                    temp.append(str[j])
                    str[j] = ""  # Marking word as used
            res.append(temp)
    return res

print(groupAnagram(str))

Output:

pythonCopyEdit[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]

Time Complexity Analysis:

Sorting a string takes O(k log k) (where k is the length of the string).
We compare each string with every other string O(n²) times.
The overall time complexity is O(n² k log k), making it inefficient for large datasets.

2️⃣ Optimal Approach (Using Hashing)

Explanation:

Instead of sorting words, use character frequency counting as a unique key.
Store anagrams in a dictionary using this key.
This avoids unnecessary sorting and reduces time complexity.

Code Implementation:

pythonCopyEditfrom collections import defaultdict

str = ["eat", "tea", "tan", "ate", "nat", "bat"]

def groupAnagram(str):
    res = defaultdict(list)
    for s in str:
        count = [0] * 26  # Array to store frequency of each letter
        for c in s:
            count[ord(c) - ord('a')] += 1  # Count occurrence of each character
        res[tuple(count)].append(s)  # Use tuple as a hashable key
    return list(res.values())

print(groupAnagram(str))

Output:

pythonCopyEdit[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]

Time Complexity Analysis:

Counting characters in a word takes O(k).
Inserting into the dictionary is O(1) on average.
Overall time complexity is O(nk), which is much faster than the brute force approach.

📌 Key Takeaways:

Approach	Time Complexity	Space Complexity	Efficiency
Brute Force	O(n² k log k)	O(n)	❌ Slow for large inputs
Optimal (Hashing)	O(nk)	O(n)	✅ Fast and scalable

🔹 The brute force method works for small datasets but becomes inefficient as input size grows.
🔹 The optimal method leverages hashing for an efficient O(nk) time complexity, making it ideal for larger datasets.

Conclusion

Grouping anagrams efficiently is a great example of how optimizing algorithms can drastically improve performance. The brute force approach is easier to understand, but the optimal approach using frequency counting is the way to go for real-world applications.

🚀 Want to master DSA? Try implementing this solution in different languages and optimizing further!

What do you think? Let me know in the comments!

Grouping Anagrams in Python: Brute Force vs Optimal Approach

Table of contents

1️⃣ Brute Force Approach

Explanation:

Code Implementation:

Output:

Time Complexity Analysis:

2️⃣ Optimal Approach (Using Hashing)

Explanation:

Code Implementation:

Output:

Time Complexity Analysis:

📌 Key Takeaways:

Conclusion