In Python,list
、tuple
、set
anddict
They are four very commonly used data structures. Their underlying implementations have their own characteristics, and these implementations determine their performance and usage scenarios. The following is a detailed explanation of the underlying implementation of these four data structures:
1. list (list)
list
It is one of the most commonly used data structures in Python.Dynamic array, used to store an ordered set of elements.
Underlying implementation
-
Storage method:
list
It's oneContinuous blocks of memory, used to store references (pointers) of elements. This meanslist
The elements in it can be of any type because it is stored as a reference to the object, not the object itself. -
Dynamic expansion:
list
The size is dynamic. When the list space is insufficient, Python allocates a new larger memory block and copies the old data into the new memory block. Usually, the new memory block size is the current size1.125 times(Specific multiples may vary by Python version). -
Memory management:because
list
is continuous memory, so it supports fast random access (the time complexity of accessing elements through index is O(1)). But insertion and deletion operations (especially non-tail operations) may require moving a large number of elements, with a time complexity of O(n).
Performance Features
advantage:
- Supports fast random access (via index).
- The internal implementation is simple and suitable for storing ordered data.
shortcoming:
- Insert and delete operations (especially non-tail operations) are inefficient.
- Memory footprint may be high, as additional space needs to be reserved to support dynamic expansion.
2. tuple (tuple)
tuple
It's oneImmutable ordered collection, used to store a fixed set of data.
Underlying implementation
-
Storage method:and
list
similar,tuple
It's also throughContinuous blocks of memoryStores references to elements. The difference is,tuple
is immutable, which means its size and content cannot be modified after creation. -
Memory allocation:because
tuple
Immutable, Python is creatingtuple
When sufficient memory is allocated at once to store all elements without considering dynamic expansion. -
Performance optimization: Due to immutability,
tuple
In some cases,list
More efficient, especially when used as keys or elements of a dictionary, because their hash values are fixed.
Performance Features
advantage:
- Immutability ensures the security of the data.
- In some scenarios (such as keys as hash tables)
list
More efficient.
shortcoming:
- Dynamic modification is not supported, and the flexibility is poor.
3. set (set)
set
It's oneUnordered collection, used to store unique elements. It supports efficient member lookup, insertion and deletion operations.
Underlying implementation
-
Storage method:
set
The underlying implementation is based onHash tableof. Each element is mapped to a unique hash value through a hash function and stored in the hash table. - Conflict resolution: Python usesLink ListorOpen addressing methodto resolve conflicts. The implementation depends on the version of Python.
-
Dynamic expansion:and
list
similar,set
The size is also dynamic. When the load factor of the hash table (the ratio of the number of elements to the hash table size) exceeds a certain threshold, the hash table automatically expands, reassigns larger memory space and rehashes all elements.
Performance Features
advantage:
- The time complexity of member search, insertion, and deletion operations is O(1).
- Automatic deduplication, suitable for storing unique elements.
shortcoming:
- Duplicate elements are not supported.
- Orderly access is not supported.
4. dict (Dictionary)
dict
It's oneCollection of key-value pairs, used to store mapping relationships. It is one of the most powerful data structures in Python.
Underlying implementation
-
Storage method:
dict
The underlying implementation is also based onHash tableof. Each key is mapped to a unique hash value through a hash function and stored in the hash table. Values are stored in association with keys. -
Conflict resolution:and
set
similar,dict
Also use linked lists or open addressing to resolve hash conflicts. -
Dynamic expansion: When the load factor of the hash table exceeds a certain threshold,
dict
It will automatically expand, reassign larger memory space and rehash all key-value pairs. -
Uniqueness of keys:
dict
The keys must be of immutable type (such as integers, strings, tuples, etc.) because they need to support hashing operations.
Performance Features
advantage:
- The time complexity of key-value pair search, insert, and delete operations is O(1).
- Supports flexible key-value mapping relationships.
shortcoming:
- The key must be of immutable type.
- Orderly access is not supported (in Python 3.7+
dict
The insertion order is maintained, but this is not achieved through a hash table, but through an additional mechanism).
Summarize
Data structure | Underlying implementation | advantage | shortcoming |
---|---|---|---|
list |
Dynamic array | Fast random access, supports dynamic expansion | Insert and deletion efficiency, and may have high memory usage |
tuple |
Static array | Immutable, suitable for hashing | Dynamic modification is not supported |
set |
Hash table | Member search, insert and delete quickly, automatically de-repeat | Repeated elements are not supported, ordered access is not supported |
dict |
Hash table | Key-value pairs are fast to find, insert and delete | The key must be of immutable type and does not support orderly access |
Extended reading
- Python official documentation:Data Model
-
Python source code analysis: Can refer toPython source codeimplementation details, especially
、
、
and
document.
- Performance optimization: Understanding the underlying implementation of these data structures can help you better choose the right data structure to optimize code performance.
If you have more specific questions about the underlying implementation of a certain data structure, please continue to ask questions!
This is the article about the understanding of the underlying implementation of list, tuple, set, and dict in Python. For more related Python list, tuple, set, and dict, please search for my previous articles or continue browsing the related articles below. I hope everyone will support me in the future!