LeetCode

LeetCode 14: Longest Common Prefix Solution in Python

This algorithm employs a Trie to find the longest common prefix among strings by inserting each and traversing the Trie for the shared start sequence.

Published

2 months ago

February 18, 2024

Isaac Asante

The LeetCode 14 problem, Longest Common Prefix, requires us to develop a function that identifies the longest shared starting substring across a list of strings, returning an empty string if no such prefix exists.

This task fundamentally relates to the Trie data structure. It can be addressed by creating a TrieNode and a Trie class, followed by executing a search for the prefix. This approach leverages the Trie’s ability to efficiently manage and search for prefixes within a collection of strings. In this tutorial article, I am demonstrating my Python solution for this LeetCode problem.

class TrieNode:
  def __init__(self):
    self.children = {}
    self.isWord = False

class Trie:
  def __init__(self):
    self.root = TrieNode()
  
  def insert(self, word):
    # Start from the root
    node = self.root
  
    # Iterate through characters
    for char in word:
      # Is the character a child of the current node?
      if char not in node.children:
        node.children[char] = TrieNode()
      # Move to the child node
      node = node.children[char]
    # Mark the end of the processed word
    node.isWord = True

class Solution:
    def longestCommonPrefix(self, strs: List[str]) -> str:
      trie = Trie()
      prefix = ""
      node = trie.root

      # Add every word into the Trie
      for word in strs:
        trie.insert(word)
      
      while True:
        if len(node.children) == 1 and not node.isWord:
          # The key is the character, the value is the TrieNode object.
          char, next_node = next(iter(node.children.items()))
          # Update the prefix
          prefix += char
          node = next_node
        else:
          break
      
      return prefix

Python Solution with Trie Data Structure

As you can see looking at the full Python code snippet above, the provided solution employs a Trie data structure to solve the problem of finding the longest common prefix among an array of strings.

We start by defining a TrieNode class representing each node in the Trie. Each node contains a dictionary named children to store its child nodes, keyed by characters. The isWord boolean flag indicates whether the node marks the completion of a word in the Trie.

class TrieNode:
  def __init__(self):
    self.children = {}
    self.isWord = False

We proceed by creating a Trie class, which implements the Trie data structure, starting with a root node of type TrieNode.

The insert(word) method adds a word to the Trie. It iterates through each character in the input word, checking if the character exists as a child of the current node. If not, it creates a new TrieNode and links it to the current node as a child keyed by the character. After processing all characters, it marks the last node as the end of a word (isWord = True).

class Trie:
  def __init__(self):
    self.root = TrieNode()
  
  def insert(self, word):
    # Start from the root
    node = self.root
  
    # Iterate through characters
    for char in word:
      # Is the character a child of the current node?
      if char not in node.children:
        node.children[char] = TrieNode()
      # Move to the child node
      node = node.children[char]
    # Mark the end of the processed word
    node.isWord = True

The Solution class contains the longestCommonPrefix(strs) method, which utilizes the Trie to find the longest common prefix among the given strings strs. It first creates a Trie and inserts all the strings from strs into the Trie data structure using the insert method. The problem’s solution function then initializes an empty string prefix to accumulate the common prefix characters. It iterates through the Trie starting from the root, following the path where there is exactly one child node and the current node does not mark the end of a word (isWord is False). Note that a prefix may be the shortest string in the input string collection, assuming that all other longer strings in the collection are based on the shortest string (e.g. the words ‘apply’ and ‘apple’ are based on the word ‘app’ character-wise, which is the shorter string of the three).

The iteration continues as long as the above condition is met, appending each traversed character to the prefix string variable. This process effectively accumulates the longest common prefix. The loop breaks when either a node has more than one child (indicating a divergence in the strings) or the current node marks the end of a word (isWord is True), meaning the common prefix ends. Finally, the accumulated prefix is returned as the solution.

class Solution:
    def longestCommonPrefix(self, strs: List[str]) -> str:
      trie = Trie()
      prefix = ""
      node = trie.root

      # Add every word into the Trie
      for word in strs:
        trie.insert(word)
      
      while True:
        if len(node.children) == 1 and not node.isWord:
          # The key is the character, the value is the TrieNode object.
          char, next_node = next(iter(node.children.items()))
          # Update the prefix
          prefix += char
          node = next_node
        else:
          break

Screenshot of my *Longest Common Prefix* Python algorithm submission on LeetCode, which successfully beat 83% of other Python 3 submissions in runtime, and 98% in memory space.

Longest Common Prefix Complexity Analysis

This proposed Python solution is efficient for finding the longest common prefix because it exploits the Trie’s structure to quickly navigate through shared initial segments of the strings, stopping when a divergence occurs, or a complete word is encountered before any divergence.

Time Complexity

For each word, the algorithm iterates through its characters to insert it into the Trie data structure. If `n` is the number of strings and `m` represents the length of the longest string, the time complexity for building the Trie is O(n*m). This is because each character of every string is processed once when inserting.

The search for the longest common prefix traverses down the Trie from the root, following the single-child nodes until it cannot proceed (either due to encountering a node with multiple children or a node that marks the end of a word). Under normal circumstances, the maximum possible depth of this traversal is m (the length of the longest string). However, this part of the algorithm is bounded by the length of the shortest string in the worst case, since the common prefix cannot be longer than the shortest string. Let s be the length of the shortest string in the array. The time complexity for finding the longest common prefix is then O(s).

We can thus say that the total time complexity is O(nm + s). However, since s is less than or equal to m, and considering that inserting words is the more dominant factor, the overall time complexity for the Longest Common Prefix LeetCode problem can be approximated as O(nm).

Space Complexity

The Trie data structure’s space complexity depends on the total number of nodes and the size of the character set. In the worst case, if all strings are unique and have no common prefixes, the Trie would need to store all characters of all strings, leading to a space complexity of O(n*m).

The space required for storing the prefix is O(s), where s is the length of the shortest string, since the longest possible common prefix cannot exceed the shortest string in the array. Therefore, the total space complexity of the Longest Common Prefix Python algorithm for the LeetCode 14 problem is dominated by the Trie storage, resulting in O(n*m) as the overall space complexity.