Anagrams are words formed by rearranging the letters of another word, like "eat", "tea", and "ate". Grouping anagrams efficiently is a common coding challenge in interviews and competitive programming.
In this article, we’ll explore two approaches:
Brute Force Approach – Comparing every word with others.
Optimal Approach – Using frequency counting and hashing for efficiency.
Let’s dive into both methods and analyze their performance.
1️⃣ Brute Force Approach
Explanation:
Compare each word with every other word in the list.
Sort both words and check if they match.
If they are anagrams, group them together.
Mark words that have already been grouped to avoid duplicates.
Code Implementation:
pythonCopyEditfrom collections import defaultdict
str = ["eat", "tea", "tan", "ate", "nat", "bat"]
def groupAnagram(str):
res = []
for i in range(len(str)):
if str[i] != "":
temp = [str[i]]
for j in range(i+1, len(str)):
if sorted(str[i]) == sorted(str[j]):
temp.append(str[j])
str[j] = "" # Marking word as used
res.append(temp)
return res
print(groupAnagram(str))
Output:
pythonCopyEdit[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
Time Complexity Analysis:
Sorting a string takes O(k log k) (where k is the length of the string).
We compare each string with every other string O(n²) times.
The overall time complexity is O(n² k log k), making it inefficient for large datasets.
2️⃣ Optimal Approach (Using Hashing)
Explanation:
Instead of sorting words, use character frequency counting as a unique key.
Store anagrams in a dictionary using this key.
This avoids unnecessary sorting and reduces time complexity.
Code Implementation:
pythonCopyEditfrom collections import defaultdict
str = ["eat", "tea", "tan", "ate", "nat", "bat"]
def groupAnagram(str):
res = defaultdict(list)
for s in str:
count = [0] * 26 # Array to store frequency of each letter
for c in s:
count[ord(c) - ord('a')] += 1 # Count occurrence of each character
res[tuple(count)].append(s) # Use tuple as a hashable key
return list(res.values())
print(groupAnagram(str))
Output:
pythonCopyEdit[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
Time Complexity Analysis:
Counting characters in a word takes O(k).
Inserting into the dictionary is O(1) on average.
Overall time complexity is O(nk), which is much faster than the brute force approach.
📌 Key Takeaways:
Approach | Time Complexity | Space Complexity | Efficiency |
Brute Force | O(n² k log k) | O(n) | ❌ Slow for large inputs |
Optimal (Hashing) | O(nk) | O(n) | ✅ Fast and scalable |
🔹 The brute force method works for small datasets but becomes inefficient as input size grows.
🔹 The optimal method leverages hashing for an efficient O(nk) time complexity, making it ideal for larger datasets.
Conclusion
Grouping anagrams efficiently is a great example of how optimizing algorithms can drastically improve performance. The brute force approach is easier to understand, but the optimal approach using frequency counting is the way to go for real-world applications.
🚀 Want to master DSA? Try implementing this solution in different languages and optimizing further!
What do you think? Let me know in the comments!