Anagrams In Python Using Lists

April 08, 2023 Post a Comment

Imagine we have following list of strings: Input: ['eat', 'tea', 'tan', 'ate', 'nat', 'bat'] The output of our program should group each set of anagram and return them all togethe

Solution 1:

you can do it using defaultdict of python in-built collections library and sorted :

In [1]: l = ["eat", "tea", "tan", "ate", "nat", "bat"]

In [2]: from collections import defaultdict

In [3]: d = defaultdict(list)

In [4]: for x in l:
   ...:     d[str(sorted(x))].append(x)

In [5]: d.values()
Out[5]: dict_values([['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']])

to fix your the solution you need add the variable to check is allready added, for exanmple(and the while walk through the strs i use enumerate for little performance in the search of the anagrams):

class Solution(object):
    def groupAnagrams(self, strs):
        allResults = []
        added = set([])
        temp=''
        for i, s in enumerate(strs):
            results = []
            unique_s = "".join(sorted(s))
            if unique_s in added:
                continue
            else:
                added.add(unique_s)
            for x in strs[i:]:
              if unique_s=="".join(sorted(x)):
                results.append(strs[i])
            allResults.append(results)

        print(added)

        return allResults

Solution 2:

Use itertools.groupby

>>> lst =  ["eat", "tea", "tan", "ate", "nat", "bat"]
>>> 
>>> from itertools import groupby
>>> f = lambda w: sorted(w)
>>> [list(v) for k,v in groupby(sorted(lst, key=f), f)]
[['bat'], ['eat', 'tea', 'ate'], ['tan', 'nat']]

Solution 3:

Using only lists, as requested in the title of the question:

The second line s_words takes all the letters of each word in words, sorts them, and recreates a string composed of the sorted letters of the word; it creates a list of all the these sorted letters strings, in the same order as the original sequence of words --> this will be used to compare the possible anagrams (the letters of anagrams produce the same string when sorted)

The 3rd line indices hold True or False values, to indicate if the corresponding word has been extracted already, and avoid duplicates.

The following code is a double loop that for each s_word, determines which other s_word is identical, and uses its index to retrieve the corresponding word in the original list of words; it also updates the truth value of the indices.

words = ["eat", "tea", "tan", "ate", "nat", "bat"]
s_words = [''.join(sorted(list(word))) for word in words]
indices = [False for _ in range(len(words))]
anagrams = []
for idx, s_word in enumerate(s_words):
    if indices[idx]:
        continue
    ana = [words[idx]]
    for jdx, word in enumerate(words):
        if idx != jdx and not indices[jdx] and s_word == s_words[jdx]:
            ana.append(words[jdx])
            indices[jdx] = True
    anagrams.append(ana)

print(anagrams)

output:

[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]

Solution 4:

The way you implemented your function, you are only looking at rotations of the strings (that is you shift a letter from the beginning to the end, e.g. a-t-e -> t-e-a -> e-a-t). What your algorithm cannot detect is single permutations were you only switch two letters (n-a-t -> t-a-n). In mathematical language you only consider the even permutations of the three letter strings and not the odd permutations.

A modification of your code could for example be:

def get_list_of_permutations(input_string):
  list_out = []
  if len(input_string) > 1:
    first_char = input_string[0]
    remaining_string = input_string[1:]
    remaining_string_permutations = get_list_of_permutations(remaining_string)
    for i in range(len(remaining_string)+1):
      for permutation in remaining_string_permutations:
        list_out.append(permutation[0:i]+first_char+permutation[i:])
  else:
    return [input_string]
  return list_out

def groupAnagrams(strs):
  allResults=[]
  for s in strs:  
    results = []
    list_of_permutations = get_list_of_permutations(s)
    for i in range(0,len(strs)):
      if strs[i] in list_of_permutations:
        results.append(strs[i])
    if results not in allResults:
      allResults.append(results)     
  return allResults

The output is

Out[218]: [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]

Edit: modified the code to work with all lengths of strings.

Solution 5:

https://docs.python.org/3/library/itertools.html#itertools.permutations

from itertools import permutations

word_list = ["eat", "tea", "tan", "ate", "nat", "bat"]
anagram_group_list = []

for word in word_list:

    if word == None:
        pass
    else:
        anagram_group_list.append([])

        for anagram in permutations(word):
            anagram = ''.join(anagram)

            try:
                idx = word_list.index(anagram)
                word_list[idx] = None 

                anagram_group_list[-1].append(anagram)

            except ValueError:
                pass # this anagram is not present in word_list

print(anagram_group_list)
# [['eat', 'ate', 'tea'], ['tan', 'nat'], ['bat']]

after refactoring code and stopping it from producing redundant result your code still doesn't give expected result as logic for producing anagram is not completely correct

def groupAnagrams(word_list):
    allResults=[]
    results=[]

    for idx,s in enumerate(word_list):
        if s == None:
            pass
        else:
            results = [s] # word s is added to anagram list

            # you were generating only 1 anagram like for tan --> ant but in word_list only nat was present
            for i in range(1,len(s),1):
                temp = s[i:]+s[:i] #anagram 
                    # for s = 'tan' it generates only 'ant and 'nta'
                    # when it should generate all six tna ant nta _nat_ atn tan

                if temp in word_list:
                  results.append(temp)
                  word_list[word_list.index(temp)] = None

            allResults.append(results) 

    return allResults

print(groupAnagrams(["eat", "tea", "tan", "ate", "nat", "bat"]))
# [['eat', 'ate', 'tea'], ['tan'], ['nat'], ['bat']]

Learn Python Tutorials