LeetCode
LeetCode 14: Longest Common Prefix Solution in Python
This algorithm employs a Trie to find the longest common prefix among strings by inserting each and traversing the Trie for the shared start sequence.
The LeetCode 14 problem, Longest Common Prefix, requires us to develop a function that identifies the longest shared starting substring across a list of strings, returning an empty string if no such prefix exists.
This task fundamentally relates to the Trie data structure. It can be addressed by creating a TrieNode and a Trie class, followed by executing a search for the prefix. This approach leverages the Trie’s ability to efficiently manage and search for prefixes within a collection of strings. In this tutorial article, I am demonstrating my Python solution for this LeetCode problem.
class TrieNode:
def __init__(self):
self.children = {}
self.isWord = False
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word):
# Start from the root
node = self.root
# Iterate through characters
for char in word:
# Is the character a child of the current node?
if char not in node.children:
node.children[char] = TrieNode()
# Move to the child node
node = node.children[char]
# Mark the end of the processed word
node.isWord = True
class Solution:
def longestCommonPrefix(self, strs: List[str]) -> str:
trie = Trie()
prefix = ""
node = trie.root
# Add every word into the Trie
for word in strs:
trie.insert(word)
while True:
if len(node.children) == 1 and not node.isWord:
# The key is the character, the value is the TrieNode object.
char, next_node = next(iter(node.children.items()))
# Update the prefix
prefix += char
node = next_node
else:
break
return prefix
Python Solution with Trie Data Structure
As you can see looking at the full Python code snippet above, the provided solution employs a Trie data structure to solve the problem of finding the longest common prefix among an array of strings.
We start by defining a TrieNode
class representing each node in the Trie. Each node contains a dictionary named children
to store its child nodes, keyed by characters. The isWord
boolean flag indicates whether the node marks the completion of a word in the Trie.
class TrieNode:
def __init__(self):
self.children = {}
self.isWord = False
We proceed by creating a Trie
class, which implements the Trie data structure, starting with a root node of type TrieNode
.
The insert(word)
method adds a word to the Trie. It iterates through each character in the input word, checking if the character exists as a child of the current node. If not, it creates a new TrieNode
and links it to the current node as a child keyed by the character. After processing all characters, it marks the last node as the end of a word (isWord = True
).
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word):
# Start from the root
node = self.root
# Iterate through characters
for char in word:
# Is the character a child of the current node?
if char not in node.children:
node.children[char] = TrieNode()
# Move to the child node
node = node.children[char]
# Mark the end of the processed word
node.isWord = True
The Solution
class contains the longestCommonPrefix(strs)
method, which utilizes the Trie to find the longest common prefix among the given strings strs
. It first creates a Trie and inserts all the strings from strs
into the Trie data structure using the insert
method. The problem’s solution function then initializes an empty string prefix
to accumulate the common prefix characters. It iterates through the Trie starting from the root, following the path where there is exactly one child node and the current node does not mark the end of a word (isWord
is False
). Note that a prefix may be the shortest string in the input string collection, assuming that all other longer strings in the collection are based on the shortest string (e.g. the words ‘apply’ and ‘apple’ are based on the word ‘app’ character-wise, which is the shorter string of the three).
The iteration continues as long as the above condition is met, appending each traversed character to the prefix
string variable. This process effectively accumulates the longest common prefix. The loop breaks when either a node has more than one child (indicating a divergence in the strings) or the current node marks the end of a word (isWord
is True
), meaning the common prefix ends. Finally, the accumulated prefix
is returned as the solution.
class Solution:
def longestCommonPrefix(self, strs: List[str]) -> str:
trie = Trie()
prefix = ""
node = trie.root
# Add every word into the Trie
for word in strs:
trie.insert(word)
while True:
if len(node.children) == 1 and not node.isWord:
# The key is the character, the value is the TrieNode object.
char, next_node = next(iter(node.children.items()))
# Update the prefix
prefix += char
node = next_node
else:
break
Longest Common Prefix Complexity Analysis
This proposed Python solution is efficient for finding the longest common prefix because it exploits the Trie’s structure to quickly navigate through shared initial segments of the strings, stopping when a divergence occurs, or a complete word is encountered before any divergence.
Time Complexity
For each word, the algorithm iterates through its characters to insert it into the Trie data structure. If `n` is the number of strings and `m` represents the length of the longest string, the time complexity for building the Trie is O(n*m). This is because each character of every string is processed once when inserting.
The search for the longest common prefix traverses down the Trie from the root, following the single-child nodes until it cannot proceed (either due to encountering a node with multiple children or a node that marks the end of a word). Under normal circumstances, the maximum possible depth of this traversal is m
(the length of the longest string). However, this part of the algorithm is bounded by the length of the shortest string in the worst case, since the common prefix cannot be longer than the shortest string. Let s
be the length of the shortest string in the array. The time complexity for finding the longest common prefix is then O(s).
We can thus say that the total time complexity is O(nm + s). However, since s
is less than or equal to m
, and considering that inserting words is the more dominant factor, the overall time complexity for the Longest Common Prefix LeetCode problem can be approximated as O(nm).
Space Complexity
The Trie data structure’s space complexity depends on the total number of nodes and the size of the character set. In the worst case, if all strings are unique and have no common prefixes, the Trie would need to store all characters of all strings, leading to a space complexity of O(n*m).
The space required for storing the prefix is O(s), where s
is the length of the shortest string, since the longest possible common prefix cannot exceed the shortest string in the array. Therefore, the total space complexity of the Longest Common Prefix Python algorithm for the LeetCode 14 problem is dominated by the Trie storage, resulting in O(n*m) as the overall space complexity.