When storing strings for prefix matching, autocomplete, or spellchecking, standard data structures like Hash Tables fall short. A Hash Table can easily check if a word exists, but finding all words starting with a prefix requires a costly linear scan. π§©
The solution? The Trie (pronounced "try" or "Prefix Tree"). A Trie is a tree-like data structure where characters are stored level by level along paths, allowing us to find matching prefixes in time proportional only to the prefix length! π
π§© The Problem
Implement a trie (prefix tree) with
insert,search, andstartsWithmethods.
Example
Input: ["Trie", "insert", "search", "search", "startsWith", "insert", "search"]
Args: [[], ["apple"], ["apple"], ["app"], ["app"], ["app"], ["app"]]
Output: [null, null, true, false, true, null, true]
Explanation:
trie = Trie()
trie.insert("apple")
trie.search("apple") # return True
trie.search("app") # return False (path exists, but not marked as word end)
trie.startsWith("app") # return True (path exists)
trie.insert("app")
trie.search("app") # return True (now marked as word end)π Visualizing Trie Operations
Let's look at how the Trie structure grows after we insert "apple" and "app".
(root)
βββ a
βββ p
βββ p* <-- "app" ends here (is_end = True)
βββ l
βββ e* <-- "apple" ends here (is_end = True)| Operation | Word/Prefix | Path Traversed | End State Check | Result |
|---|---|---|---|---|
insert |
"apple" |
a -> p -> p -> l -> e |
Marks e as end |
Created path |
search |
"apple" |
a -> p -> p -> l -> e |
Checks e.is_end |
True β
|
search |
"app" |
a -> p -> p |
Checks p.is_end |
False (before inserting "app") |
startsWith |
"app" |
a -> p -> p |
Path exists? | True β
|
π οΈ Approach 1: The Standard Object-Oriented Trie
The textbook way to implement a Trie is to create a custom node class (TrieNode) which holds a dictionary of children and a boolean flag (is_end).
class TrieNode(object):
def __init__(self):
# A dictionary mapping characters to TrieNode instances
self.children = {}
self.is_end = False
class Trie(object):
def __init__(self):
self.root = TrieNode()
def insert(self, word):
"""
:type word: str
:rtype: None
"""
node = self.root
for ch in word:
if ch not in node.children:
node.children[ch] = TrieNode()
node = node.children[ch]
node.is_end = True
def search(self, word):
"""
:type word: str
:rtype: bool
"""
node = self.root
for ch in word:
if ch not in node.children:
return False
node = node.children[ch]
return node.is_end
def startsWith(self, prefix):
"""
:type prefix: str
:rtype: bool
"""
node = self.root
for ch in prefix:
if ch not in node.children:
return False
node = node.children[ch]
return TrueThis is highly readable, modular, and standard. It gets 77.8% runtime beats on LeetCode. But can we make it faster? π
π Approach 2: The Ultra-Optimized Python Dict Trie (Beating 95%+)
If you want to absolutely dominate performance on Python Trie problems, you want to eliminate custom classes entirely.
Why Nested Dicts Win π
Plain dictionaries are heavily optimized in CPython. By removing the custom TrieNode wrapper, we avoid class instantiation overhead and memory dereferencing delays. We can represent our entire Trie using nested dictionaries and a special marker (like '$') to represent the end of a word.
class Trie(object):
def __init__(self):
# The entire tree is just a plain, nested dictionary!
self.trie = {}
def insert(self, word):
"""
:type word: str
:rtype: None
"""
node = self.trie
for ch in word:
if ch not in node:
node[ch] = {}
node = node[ch]
node['$'] = True # '$' functions as the is_end marker
def search(self, word):
"""
:type word: str
:rtype: bool
"""
node = self.trie
for ch in word:
if ch not in node:
return False
node = node[ch]
return '$' in node
def startsWith(self, prefix):
"""
:type prefix: str
:rtype: bool
"""
node = self.trie
for ch in prefix:
if ch not in node:
return False
node = node[ch]
return TrueThis simple shift slashes runtime dramatically, frequently hitting 95%+ on LeetCode.
β±οΈ Complexity Analysis
For both implementations, the complexities are identical, but the constant factors in the nested-dictionary approach are significantly smaller!
| Operation | Time Complexity | Space Complexity | Rating | Description |
|---|---|---|---|---|
insert |
π Optimal | is the word length. We may need to allocate up to new nodes. | ||
search |
π Optimal | We traverse the tree at most times. | ||
startsWith |
π Optimal | is the prefix length. We only follow links. |
π§ How to Recognize This Pattern
Keep an eye out for these requirements in coding interviews:
- Autocomplete or Typeahead Search: Checking for strings matching a prefix dynamic query.
- Spell Checking / Dictionary matching: Checking if subsegments form valid dictionary words.
- Prefix/Suffix Operations: Performing match queries starting from either end of a string.
This pattern appears in:
- Design Add and Search Words Data Structure (LeetCode 211)
- Word Search II (LeetCode 212)
- Replace Words (LeetCode 648)
π― Interview Tips
- Trade-offs of Trie vs Hash Table: Be prepared to explain why you chose a Trie. While a Hash Table has search, it does not support prefix matching (
startsWith) in time, nor does it share prefixes to save memory when storing many similar strings. - CPython Optimization: Discussing how avoiding object instantiation by using raw nested dictionaries speeds up Python execution demonstrates a deep, language-specific runtime knowledge.
- Character Array vs Dictionary: You can also implement Trie nodes using a size-26 array (e.g.
[None] * 26) since inputs are constrained to lowercase English letters. Explain that a dictionary is often preferred in production because it supports any character set (Unicode) and saves space when nodes have few children!
