Alien dictionary#

Levels: level-5
Data structures: graph

Description#

  • There is a new alien language which uses the latin alphabet.
  • However, the order among letters are unknown to you.
  • You receive a list of words from the dictionary, where words are sorted lexicographically by the rules of this new language.
  • Derive the order of letters in this language.

Note:

  • You may assume all letters are in lowercase.
  • If the order is invalid, return an empty string.
  • There may be multiple valid order of letters, return any one of them is fine.

Examples#

 1Given the following words in dictionary,
 2[
 3    "wrt",
 4    "wrf",
 5    "er",
 6    "ett",
 7    "rftt"
 8]
 9
10The correct order is: "wertf".
1Input:
2[
3  "z",
4  "x"
5]
6Output: "zx"

Example 3:

1Input:
2[
3  "z",
4  "x",
5  "z"
6]
7Output: ""
8Explanation: The order is invalid, so return "".

Python Solution#

 1from typing import List
 2from collections import defaultdict, deque
 3
 4
 5class Solution(object):
 6    def alienOrder(self, words: List[str]) -> str:
 7        G = self.construct_graph(words)
 8        visited = defaultdict(int)  # 0 not visited, 1 visiting, 2 visted
 9        ret = deque()
10        for u in G.keys():
11            if visited[u] == 0:
12                if not self.topo_dfs(G, u, visited, ret):
13                    return ""
14
15        return "".join(ret)
16
17    def construct_graph(self, words):
18        G = defaultdict(list)
19        # need to initialize, consider test case ["z", "z"]
20        for w in words:  # error
21            for c in w:
22                G[c]
23
24        for i in range(len(words) - 1):  # compare word_i and word_{i+1}
25            for c1, c2 in zip(words[i], words[i + 1]):
26                if c1 != c2:  # lexical order
27                    G[c1].append(c2)
28                    break  # need to break for lexical order
29
30        return G
31
32    def topo_dfs(self, G, u, visited, ret):
33        """
34        Topological sort
35        G = defaultdict(list)
36        visited = defaultdict(int)  # 0 not visited, 1 visiteding, 2 visted
37        pre-condition: u is not visited (0)
38        """
39        visited[u] = 1
40        for nbr in G[u]:
41            if visited[nbr] == 1:
42                return False
43            if visited[nbr] == 0:
44                if not self.topo_dfs(G, nbr, visited, ret):
45                    return False
46
47        visited[u] = 2
48        ret.appendleft(u)  # visit larger first
49        return True
50
51
52if __name__ == "__main__":
53    lst = [
54        "ze", "yf", "xd", "wd", "vd", "ua", "tt", "sz", "rd", "qd", "pz", "op",
55        "nw", "mt", "ln", "ko", "jm", "il", "ho", "gk", "fa", "ed", "dg", "ct",
56        "bb", "ba"
57    ]
58    assert Solution().alienOrder(lst) == "zyxwvutsrqponmlkjihgfedcba"