1. Overview of the glob module
The glob module finds files and directories whose paths match a specified pattern, and it supports fuzzy matching with wildcards. Pattern matching of file paths is done mainly through two functions, glob and iglob. The module handles simple file name patterns such as *.txt, which matches all files with the .txt extension.
2. Wildcard rules
(I) The * wildcard
- Function: Matches any number of characters, including zero.
- Example:

    import glob

    # Match all files with the .txt extension in the current directory
    txt_files = glob.glob('*.txt')
    for file in txt_files:
        print(file)

In the code above, *.txt matches every entry in the current directory whose name ends in .txt, whatever comes before the extension.
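One detail of the * wildcard is easy to miss: by default it does not match names that begin with a dot. The sketch below illustrates this in a throwaway directory; the file names and the list_txt helper are hypothetical, chosen only for the demonstration.

```python
import glob
import os
import tempfile


def list_txt(directory):
    """Return the sorted base names matched by *.txt inside `directory`."""
    # glob.escape keeps special characters in the directory path literal.
    pattern = os.path.join(glob.escape(directory), '*.txt')
    return sorted(os.path.basename(p) for p in glob.glob(pattern))


# Hypothetical file names, created in a temporary directory for the demo.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('notes.txt', '.txt', '.hidden.txt', 'report.md'):
        open(os.path.join(tmp, name), 'w').close()
    visible = list_txt(tmp)

print(visible)  # ['notes.txt']
```

To pick up dot-files as well, a pattern that itself starts with a dot (for example .*.txt) can be matched separately.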
(II) The ? wildcard
- Function: Matches exactly one arbitrary character.
- Example:

    import glob

    # Match .py files in the current directory whose name is a single character
    single_char_py_files = glob.glob('?.py')
    for file in single_char_py_files:
        print(file)

Here, ?.py matches files whose name consists of exactly one character followed by the .py extension.
(III) The [] wildcard
- Function: Matches any single character listed inside the square brackets.
- Example:

    import glob

    # Match .txt files in the current directory whose names start with a or b
    ab_txt_files = glob.glob('[ab]*.txt')
    for file in ab_txt_files:
        print(file)

[ab]*.txt matches files whose name starts with a or b and ends with the .txt extension.
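Square brackets also accept character ranges such as [0-9] and negation with a leading !. The sketch below is a minimal illustration that uses fnmatch.fnmatchcase as a stand-in, since it applies the same wildcard syntax to plain strings and so needs no files on disk; the names list and the select helper are hypothetical.

```python
import fnmatch

# Hypothetical file names used only to exercise the patterns.
names = ['a1.txt', 'b2.txt', 'c3.txt', 'ab.txt', 'data.csv']


def select(pattern):
    """Return the names that match `pattern` (case-sensitive)."""
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]


starts_a_or_b = select('[ab]*.txt')      # first character is a or b
letter_digit = select('[a-c][0-9].txt')  # one letter a-c, then one digit
not_a_or_b = select('[!ab]*.txt')        # first character is neither a nor b

print(starts_a_or_b)  # ['a1.txt', 'b2.txt', 'ab.txt']
print(letter_digit)   # ['a1.txt', 'b2.txt', 'c3.txt']
print(not_a_or_b)     # ['c3.txt']
```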
(IV) The ** wildcard (recursive matching)
- Function: When used in a path, ** matches directories recursively. Supported in Python 3.5 and above.
- Example:

    import glob

    # Recursively match all .txt files under the current directory and its subdirectories
    all_txt_files = glob.glob('**/*.txt', recursive=True)
    for file in all_txt_files:
        print(file)

With recursive=True, **/*.txt searches the current directory and all of its subdirectories for .txt files.
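The effect of the recursive flag is easiest to see side by side. In the sketch below, which assumes a small hypothetical directory tree built in a temporary folder, the same pattern is run with and without recursive=True; without the flag, ** behaves like an ordinary * and matches exactly one directory level.

```python
import glob
import os
import tempfile


def relative_matches(root, pattern, recursive):
    """Match `pattern` under `root`; return sorted paths relative to `root`."""
    full_pattern = os.path.join(glob.escape(root), pattern)
    found = glob.glob(full_pattern, recursive=recursive)
    return sorted(os.path.relpath(p, root).replace(os.sep, '/') for p in found)


# Hypothetical layout: one .txt file at each of three directory levels.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'sub', 'deep'))
    for rel in ('top.txt', 'sub/mid.txt', 'sub/deep/low.txt'):
        open(os.path.join(tmp, *rel.split('/')), 'w').close()

    recursive_hits = relative_matches(tmp, '**/*.txt', recursive=True)
    plain_hits = relative_matches(tmp, '**/*.txt', recursive=False)

print(recursive_hits)  # ['sub/deep/low.txt', 'sub/mid.txt', 'top.txt']
print(plain_hits)      # ['sub/mid.txt']
```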
3. glob function
(a) glob.glob(pathname, *, recursive=False)
- Function: Returns a list of the file and directory paths that match the specified pattern.
- Parameters:
  - pathname: The path pattern to match.
  - recursive: Whether to match recursively; the default is False. When set to True, the ** wildcard is searched recursively.
- Example:

    import glob

    # Match all files and directories whose names start with "test"
    # under the current directory and its subdirectories
    test_files = glob.glob('**/test*', recursive=True)
    for file in test_files:
        print(file)
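Two practical points follow from the description above: the returned list can contain directories as well as files, and whether it comes back sorted depends on the file system. A minimal sketch of one way to handle both, assuming a hypothetical find_test_files helper and a temporary directory layout invented for the demo:

```python
import glob
import os
import tempfile


def find_test_files(root):
    """Return sorted regular files under `root` whose names start with 'test'."""
    pattern = os.path.join(glob.escape(root), '**', 'test*')
    matches = glob.glob(pattern, recursive=True)
    # The pattern also matches directories such as 'tests', so keep files only.
    return sorted(p for p in matches if os.path.isfile(p))


# Hypothetical layout: a 'tests' directory plus matching and non-matching files.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'tests'))
    for rel in ('test_a.py', 'tests/test_b.py', 'tests/helper.py'):
        open(os.path.join(tmp, *rel.split('/')), 'w').close()
    found = [os.path.relpath(p, tmp).replace(os.sep, '/')
             for p in find_test_files(tmp)]

print(found)  # ['test_a.py', 'tests/test_b.py']
```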
4. iglob function
(a) glob.iglob(pathname, *, recursive=False)
- Function: Returns an iterator that yields, one at a time, the file and directory paths matching the specified pattern.
- Parameters: Same as glob.glob.
- Example:

    import glob

    # Use iglob to iterate over all .py files in the current directory
    py_files_iter = glob.iglob('*.py')
    for file in py_files_iter:
        print(file)

iglob is suited to large numbers of matches: it does not build the full result list at once but yields paths one by one, which saves memory.
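Because iglob returns an iterator, the caller can stop consuming it as soon as enough results have arrived. The sketch below shows one way to do that with itertools.islice; the first_matches helper and the generated file names are hypothetical.

```python
import glob
import itertools
import os
import tempfile


def first_matches(pattern, limit):
    """Return at most `limit` paths matching `pattern`, then stop iterating."""
    return list(itertools.islice(glob.iglob(pattern), limit))


# Hypothetical data: ten .py files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(10):
        open(os.path.join(tmp, f'module_{i}.py'), 'w').close()
    sample = first_matches(os.path.join(glob.escape(tmp), '*.py'), 3)

print(len(sample))  # 3
```

Which three paths come back is not guaranteed, since the iteration order depends on the file system.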
5. Comparison between glob and other file search methods
| Search method | Advantages | Disadvantages | Applicable scenarios |
|---|---|---|---|
| glob module | Simple to use, supports wildcard matching, and quickly finds files and directories that match a pattern. | Pattern matching is relatively basic and does not support complex regular expressions. | Simple file and directory searches, such as by extension or file name prefix. |
| os.walk function | Traverses the directory tree recursively and gives fine-grained control over the traversal. | File filtering has to be coded by hand, which makes it more cumbersome to use. | Scenarios that need deep traversal and complex filtering of the directory tree. |
| re module (regular expressions) | Supports complex, powerful pattern matching. | Steeper learning curve and more complex code. | Scenarios that need complex file name pattern matching. |
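To make the comparison concrete, here is a rough sketch of the same search written three ways. It assumes the second table row refers to os.walk; the helper names, the .log file names, and the regular expression are all hypothetical.

```python
import fnmatch
import glob
import os
import re
import tempfile


def _relative(root, paths):
    """Normalize paths to sorted, '/'-separated paths relative to `root`."""
    return sorted(os.path.relpath(p, root).replace(os.sep, '/') for p in paths)


def find_with_glob(root):
    # One pattern does both the traversal and the filtering.
    pattern = os.path.join(glob.escape(root), '**', '*.log')
    return _relative(root, glob.glob(pattern, recursive=True))


def find_with_walk(root):
    # Traversal is explicit; filtering is written by hand with fnmatch.
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, '*.log'):
            hits.append(os.path.join(dirpath, name))
    return _relative(root, hits)


def find_with_regex(root, regex):
    # A regular expression can express rules that wildcards cannot.
    compiled = re.compile(regex)
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if compiled.fullmatch(name):
                hits.append(os.path.join(dirpath, name))
    return _relative(root, hits)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'sub'))
    for rel in ('app.log', 'sub/app_2024.log', 'sub/notes.txt'):
        open(os.path.join(tmp, *rel.split('/')), 'w').close()

    glob_hits = find_with_glob(tmp)
    walk_hits = find_with_walk(tmp)
    regex_hits = find_with_regex(tmp, r'app_\d{4}\.log')

print(glob_hits)   # ['app.log', 'sub/app_2024.log']
print(walk_hits)   # ['app.log', 'sub/app_2024.log']
print(regex_hits)  # ['sub/app_2024.log']
```

The glob version is the shortest, while the os.walk and re versions trade brevity for control over traversal and matching.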
6. Application scenarios
(I) Batch file processing
The glob module can find the files that match a specific pattern so that they can be processed in bulk, for example batch renaming or batch reading of file contents.

    import glob
    import os

    # Batch rename all .txt files in the current directory
    txt_files = glob.glob('*.txt')
    for file in txt_files:
        new_name = file.replace('.txt', '_new.txt')
        os.rename(file, new_name)
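A plain string replacement of '.txt' changes every occurrence in the path, not just the extension, so a name like b.txt.txt would be altered twice. The following sketch shows a slightly more defensive variant built on os.path.splitext; the add_suffix and rename_all helpers, and the choice to skip existing targets, are assumptions for illustration rather than part of the original example.

```python
import glob
import os
import tempfile


def add_suffix(path, suffix='_new'):
    """Return `path` with `suffix` inserted just before the extension."""
    stem, ext = os.path.splitext(path)
    return f'{stem}{suffix}{ext}'


def rename_all(directory, pattern='*.txt'):
    """Rename every match of `pattern` in `directory`; return new base names."""
    renamed = []
    matches = sorted(glob.glob(os.path.join(glob.escape(directory), pattern)))
    for old in matches:
        new = add_suffix(old)
        if os.path.exists(new):
            continue  # assumption: never overwrite an existing file
        os.rename(old, new)
        renamed.append(os.path.basename(new))
    return renamed


# Hypothetical files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('a.txt', 'b.txt.txt'):
        open(os.path.join(tmp, name), 'w').close()
    result = rename_all(tmp)

print(result)  # ['a_new.txt', 'b.txt_new.txt']
```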
(II) Data collection
When collecting data, you may need to read from multiple files. The glob module can locate the relevant files, which are then read one by one.

    import glob

    # Read all .csv files in the current directory and its subdirectories
    csv_files = glob.glob('**/*.csv', recursive=True)
    for file in csv_files:
        with open(file, 'r') as f:
            data = f.read()
        print(f"Data from {file}: {data[:100]}...")
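When the collected files are real CSV data, parsing them into rows is usually more useful than reading raw text. A minimal sketch, assuming UTF-8 files with a header line; the load_rows helper, the added source field, and the sample data are hypothetical.

```python
import csv
import glob
import os
import tempfile


def load_rows(root):
    """Read every .csv under `root` (recursively) into a list of dict rows."""
    rows = []
    pattern = os.path.join(glob.escape(root), '**', '*.csv')
    for path in sorted(glob.glob(pattern, recursive=True)):
        with open(path, newline='', encoding='utf-8') as f:
            for row in csv.DictReader(f):
                row['source'] = os.path.basename(path)  # remember the origin
                rows.append(row)
    return rows


# Hypothetical sample data in a temporary directory tree.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, '2024'))
    with open(os.path.join(tmp, 'jan.csv'), 'w', encoding='utf-8') as f:
        f.write('city,temp\nOslo,-3\n')
    with open(os.path.join(tmp, '2024', 'feb.csv'), 'w', encoding='utf-8') as f:
        f.write('city,temp\nRome,12\n')
    rows = load_rows(tmp)

print(len(rows))  # 2
```

All values arrive as strings, so numeric columns such as temp still need converting before use.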
Article summary
The glob module gives Python developers a convenient way to find files and directories. Using Unix-style wildcard rules, it quickly locates the files and directories that match a given pattern. The glob.glob function returns a list of matches, while the glob.iglob function returns an iterator, so the two suit different scenarios. For simple file searches and batch-processing tasks, the glob module is a very practical tool.