Common ways to easily read Word document content in Java
Preface
In Java development, we sometimes need to read the contents of Word documents, which is particularly useful when handling contracts, reports and similar files. Depending on the format of the Word document, we can use different libraries to read it. The sections below describe in detail how to read the two common formats, .doc and .docx.
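Because the reading method depends on the format, it can help to confirm which format a file really is before choosing one, since a file extension may be wrong. The following is a minimal sketch using only the JDK (Java 11+), not part of the original article: it checks the leading bytes for the OLE2 signature used by legacy .doc files and the ZIP signature used by .docx packages. The class name WordContainerSniffer and the Container enum are hypothetical. Note that these signatures identify only the container: other OLE2 files (such as .xls) and any ZIP archive will match as well.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of Apache POI.
public class WordContainerSniffer {

    public enum Container { OLE2, ZIP, UNKNOWN }

    // OLE2 compound file signature (container used by legacy .doc files)
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature (.docx files are ZIP packages)
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    /** Classifies a file by its leading bytes. */
    public static Container detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Container.OLE2;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Container.ZIP;
        }
        return Container.UNKNOWN;
    }

    /** Reads the first 8 bytes of the file and classifies them. */
    public static Container detect(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return detect(in.readNBytes(8));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java WordContainerSniffer <file>");
            return;
        }
        System.out.println(detect(Paths.get(args[0])));
    }
}
```

A result of OLE2 suggests the .doc approach and ZIP suggests the .docx approach; UNKNOWN means the file is probably not a Word document in either format.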
1. Read Word documents in .doc format
Add the dependency
If you use Maven to manage your project, add the Apache POI dependency to pom.xml:
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <version>5.2.3</version>
</dependency>
Code Example
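import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;

import java.io.FileInputStream;
import java.io.IOException;

public class ReadDocFile {
    public static void main(String[] args) {
        // "example.doc" is a placeholder path; replace it with your own .doc file
        try (FileInputStream fis = new FileInputStream("example.doc")) {
            // Create an HWPFDocument object to represent the .doc document
            HWPFDocument document = new HWPFDocument(fis);
            // Create a WordExtractor object to extract the document content
            WordExtractor extractor = new WordExtractor(document);
            // Get the text content of the document
            String content = extractor.getText();
            System.out.println(content);
        } catch (IOException e) {
            e.printStackTrace();
            System.err.println("Reading .doc file failed: " + e.getMessage());
        }
    }
}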
Code explanation
FileInputStream fis = new FileInputStream(...): creates a file input stream for reading the .doc file.
HWPFDocument document = new HWPFDocument(fis): uses the HWPFDocument class to create a document object that can handle documents in .doc format.
WordExtractor extractor = new WordExtractor(document): creates a WordExtractor object, which can extract text content from the document object.
String content = extractor.getText(): calls the getText() method to obtain all the text content of the document, which is then printed.
2. Read Word documents in .docx format
Add the dependency
Likewise, add the Apache POI dependency to pom.xml:
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>5.2.3</version>
</dependency>
Code Example
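import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

import java.io.FileInputStream;
import java.io.IOException;

public class ReadDocxFile {
    public static void main(String[] args) {
        // "example.docx" is a placeholder path; replace it with your own .docx file
        try (FileInputStream fis = new FileInputStream("example.docx")) {
            // Create an XWPFDocument object to represent the .docx document
            XWPFDocument document = new XWPFDocument(fis);
            StringBuilder content = new StringBuilder();
            // Traverse each paragraph in the document
            for (XWPFParagraph paragraph : document.getParagraphs()) {
                // Iterate through each text run in the paragraph
                for (XWPFRun run : paragraph.getRuns()) {
                    // getText(0) returns null for runs without text (e.g. images)
                    String text = run.getText(0);
                    if (text != null) {
                        content.append(text);
                    }
                }
                content.append("\n");
            }
            System.out.println(content.toString());
        } catch (IOException e) {
            e.printStackTrace();
            System.err.println("Reading .docx file failed: " + e.getMessage());
        }
    }
}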
Code explanation
FileInputStream fis = new FileInputStream(...): creates a file input stream for reading the .docx file.
XWPFDocument document = new XWPFDocument(fis): uses the XWPFDocument class to create a document object, which is designed specifically for documents in .docx format.
Two nested loops then do the extraction: the outer loop traverses each paragraph in the document, and the inner loop traverses each text run in the paragraph and appends its text to the StringBuilder. Finally, the accumulated content is printed.
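To make the paragraph and run structure more concrete, here is a minimal sketch that does not use Apache POI at all. It relies on the fact that a .docx file is a ZIP package whose main text lives in word/document.xml, where paragraphs are w:p elements and run text sits in w:t elements. The class DocxStructureSketch and the helper buildMinimalDocx are hypothetical, and the helper produces only the single part this sketch needs, not a complete document that Word would open. It is meant to illustrate what the two loops above traverse, not to replace XWPFDocument.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical illustration class; uses only the JDK (Java 9+).
public class DocxStructureSketch {

    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the body text: one line per top-level paragraph, runs concatenated. */
    public static String extractText(byte[] docxBytes) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(docxBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("word/document.xml".equals(entry.getName())) {
                    return textOf(zip.readAllBytes());
                }
            }
            throw new IllegalArgumentException("word/document.xml not found");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static String textOf(byte[] xml) throws IOException {
        Document dom;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Reject DOCTYPE declarations as a precaution against XXE
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            dom = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
        } catch (ParserConfigurationException | SAXException e) {
            throw new IOException("Cannot parse word/document.xml", e);
        }
        StringBuilder content = new StringBuilder();
        Node body = dom.getElementsByTagNameNS(W_NS, "body").item(0);
        if (body == null) {
            return "";
        }
        // Outer loop: top-level paragraphs (w:p) in the document body
        for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
            boolean isParagraph = n instanceof Element
                    && "p".equals(n.getLocalName())
                    && W_NS.equals(n.getNamespaceURI());
            if (!isParagraph) {
                continue;
            }
            // Inner loop: text elements (w:t) of the runs inside this paragraph
            NodeList texts = ((Element) n).getElementsByTagNameNS(W_NS, "t");
            for (int i = 0; i < texts.getLength(); i++) {
                content.append(texts.item(i).getTextContent());
            }
            content.append('\n');
        }
        return content.toString();
    }

    /** Builds a ZIP containing only word/document.xml; run texts must not contain XML special characters. */
    public static byte[] buildMinimalDocx(String[][] paragraphs) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?>")
                .append("<w:document xmlns:w=\"").append(W_NS).append("\"><w:body>");
        for (String[] runs : paragraphs) {
            xml.append("<w:p>");
            for (String run : runs) {
                xml.append("<w:r><w:t>").append(run).append("</w:t></w:r>");
            }
            xml.append("</w:p>");
        }
        xml.append("</w:body></w:document>");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write(xml.toString().getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // First paragraph has two runs, second has one
        byte[] docx = buildMinimalDocx(new String[][] {{"Hel", "lo"}, {"World"}});
        System.out.print(extractText(docx));
    }
}
```

In the demo, the two runs of the first paragraph are joined into one line, which is why a single sentence in Word can arrive as several runs and must be concatenated. Like the loop over paragraphs in the POI example, this sketch looks only at top-level body paragraphs, so text inside tables is not included.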
Hey, friends! With the above method, we can easily read Word document content in different formats using Java. Try it now so that your program can also "communicate" with Word documents!