How Python de-duplicates a list
De-duplication with custom code
Approach:
- 1. Identify the sequence to de-duplicate
- 2. Create an empty list to receive the de-duplicated elements
- 3. Iterate through the sequence and skip any element that has already been collected
- 4. Print the de-duplicated data
l = [1,1,3,2,2,3,4,2,5]
new = []
for i in l:
    if i not in new:
        new.append(i)
print(new)
Output results:
[1, 3, 2, 4, 5]
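The steps above can also be wrapped in a reusable function. The following is only a sketch, not code from the original article: the name `dedupe_keep_order` is hypothetical, and it assumes every element is hashable so that a `set` can track what has already been seen instead of rescanning the result list on each iteration.

```python
def dedupe_keep_order(items):
    """Return a new list with duplicates removed, keeping first occurrences."""
    seen = set()   # values already collected (assumes hashable elements)
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(dedupe_keep_order([1, 1, 3, 2, 2, 3, 4, 2, 5]))  # [1, 3, 2, 4, 5]
```

For short lists the plain loop is just as good; the set only starts to matter when the input is large.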
Built-in function de-duplication
l = [1,1,3,2,2,3,4,2,5]
b = list(set(l))
print(b)
Output results:
[1, 2, 3, 4, 5]
As you can see, de-duplicating with set() does not preserve the order of the sequence, so afterwards you need to sort the result by each element's index in the original list to restore the original order.
The code is as follows:
l = [1,1,3,2,2,3,4,2,5]
a = list(set(l))
a.sort(key=l.index)
print(a)
Output results:
[1, 3, 2, 4, 5]
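Sorting by index means one `index()` lookup per unique element, and each lookup scans the original list from the start. As a hedged alternative for longer lists, the sketch below records each value's first position in one pass and sorts by that. The name `first_pos` is introduced here purely for illustration.

```python
l = [1, 1, 3, 2, 2, 3, 4, 2, 5]

# Record the first index of each value in a single pass.
first_pos = {}
for index, value in enumerate(l):
    first_pos.setdefault(value, index)  # keeps only the earliest index

# Sort the unique values by where they first appeared.
a = sorted(set(l), key=first_pos.__getitem__)
print(a)  # [1, 3, 2, 4, 5]
```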
5 common ways to de-duplicate Python lists
List de-duplication comes up constantly in everyday Python work and is basic, essential knowledge.
The following summarizes 5 common list de-duplication methods.
I. Use a for loop to de-duplicate a list
- This method keeps the original order.
# De-duplicate a list with a for loop
list1 = ['a', 'b', 1, 3, 9, 9, 'a']
list2 = []
for l1 in list1:
    if l1 not in list2:
        list2.append(l1)
print(list2)
Result: ['a', 'b', 1, 3, 9]
II. Use a list comprehension to de-duplicate a list
- This method keeps the original order.
# De-duplicate a list with a list comprehension
list1 = ['a', 'b', 1, 3, 9, 9, 'a']
res = []
[res.append(i) for i in list1 if i not in res]
print(res)
Result: ['a', 'b', 1, 3, 9]
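The comprehension above is run only for its side effect of appending to `res`; the list it builds itself is thrown away. A variant that some people prefer makes the comprehension's own value the result. The sketch below assumes the elements are hashable and relies on `set.add()` returning `None`; the name `seen` is introduced here for illustration.

```python
list1 = ['a', 'b', 1, 3, 9, 9, 'a']

seen = set()
# seen.add(i) returns None (falsy), so for a first-time value the `or`
# records i in `seen` and the filter still lets it through.
res = [i for i in list1 if not (i in seen or seen.add(i))]
print(res)  # ['a', 'b', 1, 3, 9]
```

This is compact but a little cryptic, so the explicit for loop may still be the more readable choice.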
III. Use the set() conversion function to de-duplicate a list
- Principle: a set cannot contain duplicate elements
# De-duplicate a list with set()
list1 = ['a', 'b', 1, 3, 9, 9, 'a']
list2 = list(set(list1))
print(list2)
Result: [1, 3, 9, 'b', 'a']
Problem: a set is unordered, so after de-duplicating with set() the order of the original list is lost (the exact order printed can even differ between runs)
There are 2 solutions:
- The first method, using the sort() method
# The first method: sort()
list1 = ['a', 'b', 1, 3, 9, 9, 'a']
list2 = list(set(list1))
list2.sort(key=list1.index)
print(list2)
Result: ['a', 'b', 1, 3, 9]
Note: The sort() method sorts the list in place and returns None.
- The second method, using the sorted() function
# The second method: sorted()
list1 = ['a', 'b', 1, 3, 9, 9, 'a']
list2 = sorted(list(set(list1)), key=list1.index)
print(list2)
Result: ['a', 'b', 1, 3, 9]
Note: Python's built-in sorted() function returns a new list and does not modify the original list.
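The difference between the two notes is easiest to see side by side. The short sketch below applies both calls to the same de-duplicated list; the variable names `unique`, `new_list` and `ret` are chosen here for illustration only.

```python
list1 = ['a', 'b', 1, 3, 9, 9, 'a']

unique = list(set(list1))                   # order not guaranteed at this point
new_list = sorted(unique, key=list1.index)  # builds and returns a new list
ret = unique.sort(key=list1.index)          # reorders `unique` itself, returns None

print(new_list)            # ['a', 'b', 1, 3, 9]
print(ret)                 # None
print(unique == new_list)  # True
```

Because sort() returns None, writing `list2 = list2.sort(...)` would silently replace the list with None, which is a common slip.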
IV. Use a dictionary to de-duplicate a list
- Principle: dictionary keys cannot be duplicated.
- This method keeps the original order (dictionaries preserve insertion order in Python 3.7 and later).
# De-duplicate a list using a new dictionary
list1 = ['a', 'b', 1, 3, 9, 9, 'a']
dic = dict.fromkeys(list1)
print(list(dic))
Result: ['a', 'b', 1, 3, 9]
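To see why the dictionary approach works, it can help to look at the intermediate dictionary. The sketch below assumes Python 3.7 or later, where dictionaries keep insertion order, and uses `dict.fromkeys()`, which maps every key to None by default.

```python
list1 = ['a', 'b', 1, 3, 9, 9, 'a']

dic = dict.fromkeys(list1)  # repeated keys collapse; every value is None
print(dic)        # {'a': None, 'b': None, 1: None, 3: None, 9: None}
print(list(dic))  # ['a', 'b', 1, 3, 9]
```

Iterating over a dictionary yields its keys, so `list(dic)` is enough and a separate `.keys()` call is optional.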
V. Remove every value that has duplicates
- The four methods above all keep one copy of each value and remove the rest
- The following method instead drops every value that occurs more than once
# Remove values that have duplicates; keep none of them
list1 = ['a', 'b', 1, 3, 9, 9, 'a']
list2 = [i for i in list1 if list1.count(i) == 1]
print(list2)
Result: ['b', 1, 3]
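Counting inside the comprehension rescans the whole list for every element. As a possible alternative for larger inputs, the sketch below counts everything once with `collections.Counter` from the standard library; it assumes the elements are hashable, and the name `counts` is introduced here for illustration.

```python
from collections import Counter

list1 = ['a', 'b', 1, 3, 9, 9, 'a']

counts = Counter(list1)  # number of occurrences of each value
list2 = [i for i in list1 if counts[i] == 1]
print(list2)  # ['b', 1, 3]
```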
That covers the 5 list de-duplication methods; choose whichever one fits your needs.