Removing duplicates from a Python list is a key step in data cleaning, especially when you’re dealing with user input, API responses, or datasets with overlapping entries. Duplicate data can skew analytics, lead to redundant storage, or cause unexpected behavior in your program.
Python offers several ways to remove duplicates, ranging from loops to set-based methods. However, when you need to preserve the original order of items while cleaning the data, OrderedDict.fromkeys() stands out as a clean, readable, and elegant solution.
2. What Is OrderedDict.fromkeys()?
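OrderedDict is a class from Python’s built-in collections module. Unlike regular dictionaries before Python 3.7, OrderedDict maintains the insertion order of keys, making it ideal for tasks that require both uniqueness and order. From Python 3.7 onward, a plain dict preserves insertion order as well, so OrderedDict matters most when the code must also run on older interpreters or when you want the ordering intent to be explicit.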
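As a quick illustration of the ordering guarantee, the sketch below runs the same deduplication through OrderedDict and through a plain dict. The list name items is made up for the example, and the plain-dict result assumes Python 3.7 or newer, where insertion order is part of the language guarantee.

```python
from collections import OrderedDict

# Hypothetical sample data, used only for illustration.
items = ["b", "a", "b", "c", "a"]

# OrderedDict keeps keys in first-seen order.
via_ordered = list(OrderedDict.fromkeys(items))

# Assumes Python 3.7+: a plain dict keeps insertion order too.
via_dict = list(dict.fromkeys(items))

print(via_ordered)  # ['b', 'a', 'c']
print(via_dict)     # ['b', 'a', 'c']
```

On older interpreters only the OrderedDict version is safe to rely on.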
The fromkeys() class method is a convenient way to create an OrderedDict where each item in the list becomes a key (each value defaults to None). Since dictionaries do not allow duplicate keys, this naturally removes any repeated items while keeping the first occurrence of each.
Syntax:
```python
from collections import OrderedDict

OrderedDict.fromkeys(list_name)
```
This method returns an OrderedDict object. You can convert it back to a regular list using list().
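To make the two steps concrete, here is a minimal sketch that inspects the intermediate OrderedDict before converting it to a list. The variable names numbers and unique are placeholders chosen for this example.

```python
from collections import OrderedDict

# Hypothetical input list with repeated values.
numbers = [3, 1, 3, 2, 1]

# Step 1: each distinct item becomes a key; every value is None.
unique = OrderedDict.fromkeys(numbers)
print(list(unique.items()))  # [(3, None), (1, None), (2, None)]

# Step 2: iterating a dictionary yields its keys, so list() recovers the items.
print(list(unique))  # [3, 1, 2]
```

The None values are never used; only the keys carry information here.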
3. Why Use OrderedDict.fromkeys() to Remove Duplicates?
- ✅ Preserves original order: Unlike sets, which don’t guarantee element order, OrderedDict keeps items exactly where they first appeared.
- ✅ Clean one-liner code: No need for verbose loops or conditionals.
- ✅ No manual tracking needed: Avoids the need to check if an element is already added.
- ✅ Pythonic and readable: Great for beginners and pros alike looking for elegant code.
- ✅ Efficient for small-to-medium datasets: Each item is hashed once, so the pass runs in roughly linear time on average, which suits most typical data-cleaning scenarios.
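The points above can be compared side by side. The sketch below, using a made-up emails list, contrasts a manual loop that tracks seen items, the OrderedDict.fromkeys() one-liner, and a plain set().

```python
from collections import OrderedDict

# Hypothetical sample data, used only for illustration.
emails = [
    "c@example.com",
    "a@example.com",
    "c@example.com",
    "b@example.com",
    "a@example.com",
]

# Manual approach: track what has been seen and build the result by hand.
seen = set()
manual = []
for email in emails:
    if email not in seen:
        seen.add(email)
        manual.append(email)

# One-liner: same result, no explicit tracking.
one_liner = list(OrderedDict.fromkeys(emails))

# set() also removes duplicates, but its iteration order is not guaranteed.
unordered = set(emails)

print(manual == one_liner)  # True
print(one_liner)  # ['c@example.com', 'a@example.com', 'b@example.com']
print(len(unordered))  # 3
```

Both ordered approaches produce the same list; the set holds the same three values but should not be relied on for their order.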
Python Program to Remove Duplicate Elements From a List Using collections.OrderedDict.fromkeys()
```python
# Remove duplicate elements from a list using collections.OrderedDict.fromkeys()
from collections import OrderedDict

# initializing list
test_list = [1, 5, 3, 6, 3, 5, 6, 7, 1]
print("The original list is : " + str(test_list))

# using collections.OrderedDict.fromkeys() to remove duplicates from the list
res = list(OrderedDict.fromkeys(test_list))

# printing list after removal
print("The list after removing duplicates : " + str(res))
```
Output:
The original list is : [1, 5, 3, 6, 3, 5, 6, 7, 1]
The list after removing duplicates : [1, 5, 3, 6, 7]
Real-World Scenario: Removing Duplicates from a Python List Using OrderedDict
Let’s say you’re fetching product IDs or tags from a database, and due to repeated entries or joins, you end up with duplicate values. Maintaining the original order of appearance is crucial—especially if the order reflects user preferences or product ranking.
Here’s how you can clean the list using OrderedDict.fromkeys():
```python
from collections import OrderedDict

product_tags = ["laptop", "gaming", "laptop", "accessories", "gaming", "electronics"]

# Keep the first occurrence of each tag, in order of appearance
unique_tags = list(OrderedDict.fromkeys(product_tags))
print(unique_tags)
```
Output:
['laptop', 'gaming', 'accessories', 'electronics']
🛠 Explanation:
- OrderedDict.fromkeys() removes the duplicates and keeps the first occurrence of each item.
- list() converts the result back into a standard Python list.
- This is highly useful in e-commerce platforms or product filtering systems.
Limitations of Removing Duplicates from a Python List Using OrderedDict
While OrderedDict.fromkeys() is a great solution, it has a few limitations you should be aware of:
- ❗ Requires importing collections: One extra import compared to built-ins like set() or, on Python 3.7+, dict.fromkeys().
- ❗ Doesn’t support unhashable items: It raises a TypeError if the list contains other lists, dictionaries, or sets (e.g., [[1, 2], [1, 2]]), since dictionary keys must be hashable.
- ❗ Not optimal for large, real-time data streams: It consumes the whole input and builds the full dictionary in memory before returning anything. For high-performance scenarios, consider set() when order does not matter, or external libraries optimized for speed.
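The last two limitations can be worked around in simple cases. The sketch below is one possible approach, not the only one: the helper names dedupe_rows and iter_unique are hypothetical, dedupe_rows assumes each inner list is flat and holds only hashable values, and iter_unique swaps in a generator with a seen set rather than using OrderedDict.

```python
from collections import OrderedDict


def dedupe_rows(rows):
    """Remove duplicate lists while keeping first-seen order.

    Assumes every row is a flat list of hashable values, so it can be
    converted to a tuple and used as a dictionary key.
    """
    unique_keys = OrderedDict.fromkeys(tuple(row) for row in rows)
    return [list(key) for key in unique_keys]


def iter_unique(items):
    """Lazily yield each hashable item the first time it appears."""
    seen = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item


# Lists are unhashable, so using them directly as keys fails.
try:
    OrderedDict.fromkeys([[1, 2], [1, 2]])
except TypeError as exc:
    print("fromkeys failed:", exc)

print(dedupe_rows([[1, 2], [3, 4], [1, 2]]))  # [[1, 2], [3, 4]]

# The generator yields results as items arrive instead of waiting for the end.
print(list(iter_unique(iter([1, 5, 3, 5, 1, 7]))))  # [1, 5, 3, 7]
```

The generator still stores every distinct item it has seen, so it reduces latency for streamed input but not peak memory.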