This is a popular hard problem where you are provided a group of letters (like dictionaries) and you have to determine the order of the characters the dictionary’s words are sorted by.
In one sentence, we need to sort characters given a group of words. So first we need to detect which characters will come first. Of course the first letter in the first word will come first in lexicographic order. But after that, we will need to keep a relationship between character of who comes before who depending on the order of the words.
So, keeping a relationship between objects tells us that this problem could be solved by graph. And we need to make an order of characters where each character comes first when no other character is above that one in the words list. In other words, a character will be picked only when there are no other characters which depend on it. This is the perfect problem to be solved by topological sorting.
Step 2: Build the graph
To solve the problem with topological sorting, we need a graph first. In this graph, the nodes will be characters and the edges will be the relationship stating “this character must come before the other character”.
While building the graph, we will use a built-in datatype called defaultdict. In python, defaultdict is a dictionary where each non-existent key access will return a default value without raising an exception.
We will use a commonly used built-in generator function named zip. What zip does is it zips two container data structures so that one can easily loop through both of that at the same time. Without zip we would have to do something like this:
But with the help of zip, we can just loop over these without taking extra index variable word_ind:
1
forword1,word2inzip(words,words[1:]):
So first we create a graph of defaultdict type, loop through the array or words, then loop through the characters of that word, and create an edge for each consecutive words.
While building the graph, an optimization can be done for a corner case. What if multiple words are prefixes of one another. Then there will be no way of sorting them in lexicographic order. We can check then only if the char1 != char2 is always False. And then, the for loop will never hit the break statement. We could use a flag to check if the break is hit or not.
Or, we can use another cool python feature of python named for/else. Yes, else block can be used after a for block in python and it does exactly what is written above. After checking this with else, we can return null string if we cannot sort the characters. So the code will be:
1
2
3
4
# Check that second word isn't a prefix of first word.else:iflen(word2)<len(word1):return""
Step 3: Build an in_degree dict for topological sorting
For a vertex, the number of head ends adjacent to a vertex is called the indegree of the vertex
Now we need to get to know the in_degree of each node in the graph. Because topological sorting starts (and uses) with the nodes which has in-degrees’ of zero.
First we initialize in_degree with all the characters of words with 0 (this should be done before building the graph so that we can use it while creating the graph)
1
2
3
4
5
# initialize in degree of graphin_degree={}forwordinwords:forcharinword:in_degree[char]=0
Finally, we increment a character’s in_degree when we create an edge while building the graph.
1
2
3
ifchar2notingraph[char1]:graph[char1].add(char2)in_degree[char2]+=1# new line for in_degree
The codes might be looking out of order to you and difficult to understand, so please see the full program given at the end to make sure you understand the order of code snippets used above.
Step 4: Topological Sorting
Topological Sorting is like BFS (Breadth First Search) or DFS (Depth First Search) where instead of starting from a source node, we start from all the nodes that have in_degree[node] == 0. For bfs, we need to use a queue data structure. Python has built-in deque which can be used as a queue with same time and space complexity. Then we push all the nodes that has in_degree[node] == 0 to the queue.
1
2
3
4
5
# Identify vertices that has in_degree = 0queue=deque()forcharinin_degree:ifin_degree[char]==0:queue.append(char)
After that, we keep getting nodes from the queue and update all it’s neighboring nodes in_degree. At the same time when popping nodes from the queue, we will append that with out resulting string aliendict.
1
2
3
4
5
6
7
8
9
10
# Do the topological sortaliendict=""whilequeue:node=queue.popleft()aliendict+=node# Loop through neighbors setforcharingraph[node]:in_degree[char]-=1ifin_degree[char]==0:queue.append(char)
Finally we will need to check if all the characters are inside the aliendict. If they are not inside the aliendict, then we are sure that at some stage, there was a cycle in topological sorting. So there is no single way of sorting the characters given the words. At that time, we will return a null string.
This is the final solution of the program. Topological sorting takes O(|V| + |E|) time and O(|V|) space complexity where V is the list of vertices (characters for word in words in this problem), and E is the list of edges in the graph (|words| * (characters in each word) in this problem). The complete solution is given below:
fromcollectionsimportdefaultdict,dequeclassSolution:defalienOrder(self,words:[str])->str:# initialize in degree of graphin_degree={}forwordinwords:forcharinword:in_degree[char]=0# Build graphgraph=defaultdict(set)forword1,word2inzip(words,words[1:]):forchar1,char2inzip(word1,word2):ifchar1!=char2:ifchar2notingraph[char1]:graph[char1].add(char2)in_degree[char2]+=1break# Check that second word isn't a prefix of first word.else:iflen(word2)<len(word1):return""# Identify vertices that has in_degree = 0queue=deque()forcharinin_degree:ifin_degree[char]==0:queue.append(char)# Do the topological sortaliendict=""whilequeue:node=queue.popleft()aliendict+=node# Loop through neighbors setforcharingraph[node]:in_degree[char]-=1ifin_degree[char]==0:queue.append(char)iflen(aliendict)<len(in_degree):return""returnaliendict
classSolution{public:stringalienOrder(vector<string>&words){// initialize in degree of graph
unordered_map<char,int>inDegree;for(autoword:words)for(autocharacter:word)inDegree[character]=0;// Build graph
unordered_map<char,vector<char>>graph;for(intword_ind=0;word_ind<words.size()-1;word_ind++){stringword1=words[word_ind];stringword2=words[word_ind+1];boolloopBroken=false;for(intchar_ind=0;char_ind<min(word1.length(),word2.length());char_ind++){charchar1=word1[char_ind];charchar2=word2[char_ind];if(char1!=char2){if(graph.find(char1)==graph.end()){graph[char1].push_back(char2);inDegree[char2]++;}loopBroken=true;break;}}// Check that second word isn't a prefix of first word
if(!loopBroken&&word2.length()<word1.length())return"";}// Identify vertices that has inDegree = 0
queue<char>q;for(autocharDegreePair:inDegree)if(charDegreePair.second==0)q.push(charDegreePair.first);// Do the topological sort
stringalienDict;while(!q.empty()){autonode=q.front();q.pop();alienDict.push_back(node);// Loop through neighbors set
for(autocharacter:graph[node]){inDegree[character]--;if(inDegree[character]==0)q.push(character);}}if(alienDict.length()<inDegree.size())return"";returnalienDict;}};
classSolution{publicStringalienOrder(String[]words){// initialize in degree of graphMap<Character,Integer>inDegree=newHashMap<Character,Integer>();for(Stringword:words)for(Charactercharacter:word.toCharArray())inDegree.put(character,0);// Build graphMap<Character,List<Character>>graph=newHashMap<Character,List<Character>>();for(intword_ind=0;word_ind<words.length-1;word_ind++){Stringword1=words[word_ind];Stringword2=words[word_ind+1];booleanloopBroken=false;for(intchar_ind=0;char_ind<Math.min(word1.length(),word2.length());char_ind++){Characterchar1=word1.charAt(char_ind);Characterchar2=word2.charAt(char_ind);if(char1!=char2){if(!graph.getOrDefault(char1,newArrayList<Character>()).contains(char2)){List<Character>prevList=graph.getOrDefault(char1,newArrayList<Character>());prevList.add(char2);graph.put(char1,prevList);inDegree.put(char2,inDegree.get(char2)+1);}loopBroken=true;break;}}// Check that second word isn't a prefix of first wordif(!loopBroken)if(word2.length()<word1.length())return"";}// Identify vertices that has inDegree = 0Queue<Character>queue=newLinkedList<>();inDegree.forEach((character,charInDegree)->{if(charInDegree==0)queue.add(character);});// Do the topological sortStringBuilderalienDict=newStringBuilder();while(!queue.isEmpty()){Characternode=queue.remove();alienDict.append(node);// Loop through neighbors setfor(Charactercharacter:graph.getOrDefault(node,newArrayList<>())){inDegree.put(character,inDegree.get(character)-1);if(inDegree.get(character)==0)queue.add(character);}}if(alienDict.length()<inDegree.size())return"";returnalienDict.toString();}}