Python List Remove Duplicates: Explanations and Examples

In this post, we'll cover six topics related to removing duplicates from a Python list:

Defining a function to remove all duplicates from a list
Removing duplicates in a nested list
How to remove duplicates when randomly sampling a list
Keeping order while removing duplicates
Removing duplicates and sorting a list
How to deduplicate two lists

Each topic is briefly explained and illustrated with an example.

1. Defining a function to remove all duplicates from a list

To remove all duplicates from a Python list, we can use the set() function. set is one of Python's built-in data types, and it is special in that it cannot contain duplicate items.

💡

If you're interested in learning more, check out the post Lists vs Tuples vs Dictionaries vs Sets.

The set() function converts an iterable, such as a list, into a set, which automatically removes duplicates. You can use it to write a function that removes all duplicates from a list. Let's look at the sample code:

def remove_duplicates(lst):
    return list(set(lst))
 
my_list = [1, 2, 3, 3, 4, 2, 1]
result = remove_duplicates(my_list)
 
print(result)
 
# Output
[1, 2, 3, 4]

The code above uses set() to remove duplicates, then converts the result back to a list with list() and returns it.

One thing to note is that the original order of the list is not guaranteed to survive. Because Python sets are unordered, the items may come back in a different order when the set is converted back to a list.

If you want to preserve order when removing duplicates in lists, see Section 4.
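As a quick check of the ordering caveat, here is a minimal sketch that applies the same list(set(...)) approach to strings. The sample data is made up for illustration. The result is sorted before printing because the order of a set of strings can differ from one run to the next.

```python
def remove_duplicates(lst):
    # set() drops repeated values; list() turns the result back into a list.
    return list(set(lst))

fruits = ["banana", "apple", "cherry", "apple", "banana"]
unique_fruits = remove_duplicates(fruits)

# The same three values always come back, but their order is not guaranteed,
# so sort them before comparing or printing.
print(len(unique_fruits))     # 3
print(sorted(unique_fruits))  # ['apple', 'banana', 'cherry']
```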

2. Removing duplicates in a nested list

Removing duplicates from a nested list in Python is a little different. Lists are unhashable, so they cannot be placed in a set, and calling set() on a nested list raises a TypeError. Instead, you can use a loop to remove the duplicates.
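To see why set() does not work here, the short sketch below tries it on a nested list and catches the resulting error. Set members must be hashable, and lists are not. The variable names are illustrative only.

```python
nested = [[1, 2, 3], [3, 4, 5], [1, 2, 3]]

try:
    set(nested)
    raised_type_error = False
except TypeError as exc:
    # Lists are mutable and therefore unhashable, so set() rejects them.
    raised_type_error = True
    error_message = str(exc)

print(raised_type_error)  # True
```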

The following is sample code for a function to remove duplicates from a nested list:

def remove_duplicates_nested(lst):
    result = []
    for sublist in lst:
        if sublist not in result:
            result.append(sublist)
    return result
 
my_list = [[1, 2, 3], [3, 4, 5], [1, 2, 3], [6, 7, 8]]
result = remove_duplicates_nested(my_list)
print(result)
 
# Output
[[1, 2, 3], [3, 4, 5], [6, 7, 8]]

The code above creates an empty list, result, and iterates over the nested list lst, checking whether each sublist already exists in result. A sublist is appended to result only if it is not already there.

This removes duplicate sublists and returns a list in which each sublist appears only once.
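The loop above checks each sublist against the whole result list, which can get slow for long inputs. One possible alternative, sketched below, tracks what has been seen in a set of tuples. It assumes the sublists are only one level deep and contain hashable items such as numbers or strings. The function name remove_duplicates_nested_fast is hypothetical.

```python
def remove_duplicates_nested_fast(lst):
    seen = set()
    result = []
    for sublist in lst:
        # A tuple of hashable items is itself hashable, so it can go in a set.
        key = tuple(sublist)
        if key not in seen:
            seen.add(key)
            result.append(sublist)
    return result

my_list = [[1, 2, 3], [3, 4, 5], [1, 2, 3], [6, 7, 8]]
print(remove_duplicates_nested_fast(my_list))
# [[1, 2, 3], [3, 4, 5], [6, 7, 8]]
```

Like the loop version, this keeps the first occurrence of each sublist and preserves the original order.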

3. How to remove duplicates when randomly sampling a list

You can randomly extract items from a Python list while removing duplicates by following the steps below.

Create a new list with duplicates removed.
Randomly extract items from the new list.

Here is some sample code that implements this:

import random
 
def get_random_elements(lst, num_elements):
    unique_elements = list(set(lst))
    random_elements = random.sample(unique_elements, num_elements)
    return random_elements
 
my_list = [1, 2, 3, 3, 4, 2, 1]
result = get_random_elements(my_list, 3)
print(result)
 
# Output (example only; the result varies from run to run)
[1, 4, 2]

The code above uses set() to remove duplicates, then converts the set back to a list with list() to create a new list, unique_elements.

We then use random.sample() to pick num_elements items at random from unique_elements and store them in random_elements, which the function returns.

💡

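The random.sample() function selects the requested number of items without replacement, so no position in the list is picked twice. Because unique_elements contains no repeated values, the result contains none either.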
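One edge case worth knowing: random.sample() raises a ValueError when asked for more items than the list holds, which can happen once duplicates are removed. The sketch below caps the request at the number of unique values. The function name get_random_unique_elements and the capping behavior are assumptions made for illustration; raising an error instead may suit some programs better.

```python
import random

def get_random_unique_elements(lst, num_elements):
    unique_elements = list(set(lst))
    # random.sample() raises ValueError if asked for more items than exist,
    # so cap the request at the number of unique values.
    count = min(num_elements, len(unique_elements))
    return random.sample(unique_elements, count)

my_list = [1, 2, 3, 3, 4, 2, 1]
picked = get_random_unique_elements(my_list, 10)

# Only four unique values exist, so all four are returned in random order.
print(sorted(picked))  # [1, 2, 3, 4]
```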

4. Keeping order while removing duplicates

The deduplication method using the set() function ignores the original order of the list. Let's see how to avoid this and preserve the order when deduplicating.

In Python, you can use the OrderedDict class from the collections module together with list() to remove duplicates from a list while preserving the order. Here's a code example that uses them:

from collections import OrderedDict
 
def remove_duplicates(lst):
    return list(OrderedDict.fromkeys(lst))
 
my_list = [1, 2, 3, 3, 4, 2, 1]
result = remove_duplicates(my_list)
print(result)
 
# Output
[1, 2, 3, 4]

In the code above, the class method OrderedDict.fromkeys() creates an ordered dictionary whose keys are the list items. Dictionary keys are unique, so duplicates are dropped, and the result is then converted back to a list.

An OrderedDict is a dictionary subclass that remembers the order in which keys were first inserted. So when fromkeys() builds the dictionary, each item keeps the position of its first occurrence. We then use the list() function to convert the dictionary's keys back to a list.
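Since Python 3.7, the built-in dict also preserves insertion order, so the same idea works without an import. The sketch below is a variation on the example above, not a replacement for it. The function name remove_duplicates_keep_order is hypothetical.

```python
def remove_duplicates_keep_order(lst):
    # dict keys are unique and, since Python 3.7, keep insertion order,
    # so each value stays at the position of its first occurrence.
    return list(dict.fromkeys(lst))

my_list = [3, 1, 3, 2, 1, 4]
print(remove_duplicates_keep_order(my_list))  # [3, 1, 2, 4]
```

As with set(), this only works when the list items are hashable.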

5. Removing duplicates and sorting a list

To deduplicate and sort a list in Python, you can do the following:

Create a new list with the duplicates removed.
Sort the newly created list.

Here is some sample code to implement this.

def remove_duplicates_and_sort(lst):
    unique_elements = list(set(lst))
    sorted_elements = sorted(unique_elements)
    return sorted_elements
 
my_list = [3, 2, 1, 4, 3, 2, 1]
result = remove_duplicates_and_sort(my_list)
print(result)
 
# Output
[1, 2, 3, 4]

In the code above, we use set() to remove duplicates, then convert the set with list() to create a new list, unique_elements.

We then use the sorted() function to create sorted_elements, which holds the items of unique_elements in ascending order, and return it.

💡

The sorted() function returns a new sorted list and leaves the original unchanged. For more information, see the post Sorting lists.
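Because sorted() accepts any iterable, the intermediate list() call can be skipped. The sketch below shows a shorter variant that also supports descending order through sorted()'s reverse argument. The descending parameter is a hypothetical addition for illustration.

```python
def remove_duplicates_and_sort(lst, descending=False):
    # sorted() accepts any iterable, so the set can be passed in directly.
    return sorted(set(lst), reverse=descending)

my_list = [3, 2, 1, 4, 3, 2, 1]
print(remove_duplicates_and_sort(my_list))                   # [1, 2, 3, 4]
print(remove_duplicates_and_sort(my_list, descending=True))  # [4, 3, 2, 1]
```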

6. How to deduplicate two lists

How to remove duplicates from two lists is covered in Section 4 of the post Join lists.

Please read that section.
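For a rough idea of what this can look like, here is one possible approach. It is only a sketch and may differ from what the linked section describes. It joins the two lists and then removes duplicates while keeping first-occurrence order. The function name merge_without_duplicates is hypothetical.

```python
def merge_without_duplicates(first, second):
    # Join the lists, then let dict keys drop repeats while keeping order.
    return list(dict.fromkeys(first + second))

list_a = [1, 2, 3, 4]
list_b = [3, 4, 5, 6]
print(merge_without_duplicates(list_a, list_b))  # [1, 2, 3, 4, 5, 6]
```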

Conclusion

In this post, I have tried to answer some of your questions about removing duplicates from a Python list.

Hopefully it will help you in your practical work.
