Introduction
In Python programming, we often encounter duplicate elements in lists. For accurate data processing and analysis, these duplicates usually need to be removed. This article introduces several ways to delete duplicate elements from a list in Python and compares their pros and cons to help you choose the most suitable solution.
Method 1: Use a set
A set is an unordered data structure that contains no repeated elements. We can take advantage of this by converting the list to a set and then back to a list, which removes duplicates easily.
# Example list
my_list = [1, 2, 2, 3, 4, 4, 5]

# Use a set to deduplicate
unique_list = list(set(my_list))

# Output result
print(unique_list)  # Output: [1, 2, 3, 4, 5]
Advantages:
The code is concise and easy to understand.
High execution efficiency, especially suitable for processing large amounts of data.
Disadvantages:
Changes the original order of the elements in the list (see the example below).
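As a small illustration of this drawback, the hypothetical example below (not from the original article) uses strings, where the set order rarely matches the input; it also shows one common workaround if you want the set approach but still need the original order:

# Hypothetical example: set order generally does not match the input order
words = ["banana", "apple", "apple", "cherry", "banana"]
print(list(set(words)))  # e.g. ['cherry', 'banana', 'apple'] (order not guaranteed)

# Workaround: sort the set by each element's first position in the original list
ordered = sorted(set(words), key=words.index)
print(ordered)  # Output: ['banana', 'apple', 'cherry']

Note that set-based deduplication also requires all elements to be hashable.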
Method 2: Use a list comprehension
List comprehensions provide a concise way to create lists. We can use one to traverse the original list and keep only the elements that have not appeared earlier.
# Example list
my_list = [1, 2, 2, 3, 4, 4, 5]

# Use a list comprehension to deduplicate while keeping the original order:
# keep x only if it has not appeared earlier in the list
unique_list = [x for i, x in enumerate(my_list) if x not in my_list[:i]]

# Output result
print(unique_list)  # Output: [1, 2, 3, 4, 5]
Advantages:
The code is concise and has good readability.
You can keep the original order of elements in the list.
Disadvantages:
For large-scale data it can be slower than the set method, because every element is checked against the elements already kept (a common workaround is sketched below).
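If that efficiency concern matters, a common order-preserving alternative is a plain loop with an auxiliary set. This is not one of the four methods in this article, just a sketch:

# Sketch: order-preserving deduplication with an auxiliary set
my_list = [1, 2, 2, 3, 4, 4, 5]

seen = set()
unique_list = []
for x in my_list:
    if x not in seen:  # O(1) membership test against the set
        seen.add(x)
        unique_list.append(x)

# Output result
print(unique_list)  # Output: [1, 2, 3, 4, 5]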
Method 3: Use OrderedDict (before Python 3.7)
Prior to Python 3.7, the key order of a dictionary (dict) was not guaranteed. To deduplicate while keeping the original order, we can use OrderedDict.fromkeys().
from collections import OrderedDict

# Example list
my_list = [1, 2, 2, 3, 4, 4, 5]

# Use OrderedDict.fromkeys to deduplicate while keeping the original order
unique_list = list(OrderedDict.fromkeys(my_list))

# Output result
print(unique_list)  # Output: [1, 2, 3, 4, 5]
Advantages:
You can keep the original order of elements in the list.
Disadvantages:
The code is relatively complex.
Since Python 3.7, the built-in dict preserves insertion order, so this method is no longer necessary (see the sketch below).
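On Python 3.7 and later, the same order-preserving deduplication can therefore be written with a plain dict; a minimal sketch:

# Python 3.7+: built-in dicts keep insertion order, so OrderedDict is not needed
my_list = [1, 2, 2, 3, 4, 4, 5]

unique_list = list(dict.fromkeys(my_list))

print(unique_list)  # Output: [1, 2, 3, 4, 5]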
Method 4: Use itertools.groupby
itertools.groupby groups consecutive elements of an iterable, optionally according to a key function. We can sort the list first, group it, and then take the first element of each group.
from itertools import groupby

# Example list
my_list = [1, 2, 2, 3, 4, 4, 5]

# Sort first, then use groupby to deduplicate: keep one element per group
unique_list = [x for x, _ in groupby(sorted(my_list))]

# Output result
print(unique_list)  # Output: [1, 2, 3, 4, 5]
Advantages:
Produces a deterministic, sorted result and supports deduplication by a custom key function.
Disadvantages:
The code is relatively complex.
The list must be sorted first, which changes the original order and may affect efficiency (a key-based variant is sketched below).
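What groupby does offer, at the cost of that sorting step, is deduplication by a custom key. The case-insensitive example below is hypothetical (not from the original article) and only a sketch:

from itertools import groupby

# Hypothetical example: case-insensitive deduplication with a key function.
# Sort and group by the same key, then keep the first element of each group.
names = ["Alice", "alice", "Bob", "BOB", "Carol"]
unique_names = [next(group) for _, group in groupby(sorted(names, key=str.lower), key=str.lower)]

print(unique_names)  # Output: ['Alice', 'Bob', 'Carol']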
Summary
All of the above methods can effectively delete duplicate elements from a list. Which one to choose depends on your specific needs:
If you need to keep the original order, use a list comprehension, OrderedDict (before Python 3.7), or dict.fromkeys() (Python 3.7+).
If you do not need to keep the order and want simplicity and efficiency, use a set.
For more complex requirements, such as deduplication based on a specific key, itertools.groupby with a key function can be used.
This concludes the article on several methods to delete duplicate elements from a list in Python. For more on this topic, see my previous articles or continue browsing the related articles below. I hope you find them helpful!