Preface
In daily office, we often encounter.doc
and.docx
Format Word files. although.doc
It is the format used by the old version of Word, but for compatibility and functional integrity, modern office needs are more inclined to use.docx
Format. This article will introduce how to use Python to automatically batch.doc
Convert the file in format to.docx
Format, which facilitates us to quickly convert large amounts of files.
1. Environmental preparation
1. Install Python and necessary libraries
First, we need a Python environment and install itcomtypes
Library for interacting with COM components of Windows (such as Word):
pip install comtypes
2. Make sure Microsoft Word is installed
becausecomtypes
The COM interface of Word is called, so you need to make sure that Microsoft Word (for Windows systems) is installed on the system.
2. Code implementation
1. Import the required library
We first importos
and, used for file path operation and interaction with Word, respectively:
import os import
2. Specify the folder path
In the code, we specified include.doc
Folder path to the file. In the example, set the path toD:\1
, please replace the path you actually store the file as needed:
folder_path = r'D:\1'
3. Write conversion functions
Functionconvert_doc_to_docx(input_doc_path)
Will single.doc
Convert file to.docx
document:
- use
('')
Create a Word application instance. - Open
.doc
File and specify to save as.docx
Format. - Close documents and Word apps and free up resources.
def convert_doc_to_docx(input_doc_path): word = ('') = False doc = (input_doc_path) output_docx_path = input_doc_path.replace('.doc', '.docx') # Generate output path (output_docx_path, FileFormat=16) #16 means docx format () () return output_docx_path # Return the newly generated .docx file path
4. Batch conversion of files
Next, we traverse the folder.doc
File, callconvert_doc_to_docx
Functions are converted one by one. After each conversion is completed, the file name and path that was successfully converted will be output.
# Iterate through all .doc files in the folder and convert them to .docxfor filename in (folder_path): if ('.doc'): doc_path = (folder_path, filename) docx_path = convert_doc_to_docx(doc_path) # Use different variable names to receive the return value print(f"{filename} Converted successfully to {docx_path}") print("All files have been converted!")
3. Complete code
The following is to.doc
Batch conversion of files to.docx
The complete code of including comments for easy understanding:
import os import # Specify the folder pathfolder_path = r'D:\1' # Function converts .doc to .docxdef convert_doc_to_docx(input_doc_path): word = ('') = False # Set Word invisible doc = (input_doc_path) # Open the .doc file output_docx_path = input_doc_path.replace('.doc', '.docx') # Set the output file name (output_docx_path, FileFormat=16) # Save as .docx format () # Close the document () # Exit Word app return output_docx_path # Return the newly generated .docx path # Iterate through all .doc files in the folder and convert them to .docxfor filename in (folder_path): if ('.doc'): doc_path = (folder_path, filename) docx_path = convert_doc_to_docx(doc_path) # Call the conversion function print(f"{filename} Converted successfully to {docx_path}") print("All files have been converted!")
4. Code description
-
File traversal and judgment:
-
(folder_path)
Travel all files in the folder. -
if ('.doc')
Make sure to handle only.doc
Files, avoid mishandling other file formats.
-
-
Word app is not visible:
-
= False
Hide Word window to avoid popping up and affecting user operations.
-
-
File path replacement:
-
output_docx_path = input_doc_path.replace('.doc', '.docx')
Replace the extension of the original file path from.doc
Change to.docx
, generate a new save path.
-
-
File format settings:
-
FileFormat=16
Specify to save as.docx
Format.FileFormat Parameter value16
Corresponding.docx
File type.
-
-
Resource release:
-
()
Close the current document,()
Exit the Word application to ensure that resources are not occupied.
-
-
Output result:
- After each successful conversion, use
print()
Displays the file name that has been converted to facilitate tracking progress.
- After each successful conversion, use
5. Things to note
- Make sure Microsoft Word is installed on the system: The code depends on Word's COM components, and Word needs to be installed in the system to run normally.
-
File format and path: Please confirm
.doc
The path to the file to avoid specifying the wrong folder. -
Resource Management: When converting a large number of files, make sure
()
Being executed to prevent the Word process from occupying system resources. - Running environment: This code is suitable for Windows systems because it relies on COM components to interact with Word.
6. Summary
By using Python withcomtypes
library, we can implement batch.doc
Convert file to.docx
Document requirements. This method not only saves time, but also effectively solves the inefficiency and error rate problems caused by manual operation. I hope this article can help you in your daily office work!
This is the article about converting Word documents (.doc) into .docx format in Python batch conversion. For more related content related to Python batch conversion docx format, please search for my previous articles or continue browsing the related articles below. I hope everyone will support me in the future!