Preface
Commonly used Java libraries for extracting text from PDF files include pdfbox and itextpdf, among others.
itextpdf
Import the Maven dependency for itextpdf
<!-- dependency -->
<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>itextpdf</artifactId>
    <version>5.5.13.3</version>
</dependency>
Text extraction code
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;

import java.io.*;

public class Main_itextPdf {
    public static void main(String[] args) throws Exception {
        System.out.println("------------------------------------------------------------");

        // 2. Load the PDF file
        File file = new File("C:/Users/Administrator/Desktop/Luo Kaiwei's resume.pdf");
        PdfReader reader = new PdfReader(file.getPath());

        // 3. Parse the PDF and get the text of one page
        int page = 1; // page numbers start at 1, so this is the first page
        String text = PdfTextExtractor.getTextFromPage(reader, page);
        System.out.println(text);

        // 4. Close the PdfReader
        reader.close();
        System.out.println("------------------------------------------------------------");
    }
}
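The listing above reads only the first page. A minimal sketch of extending it to every page follows. To keep the sketch runnable without the library, the per-page call is hidden behind a hypothetical `PageTextSource` interface (not part of itextpdf); the comment in `main` shows how the itextpdf calls `reader.getNumberOfPages()` and `PdfTextExtractor.getTextFromPage(reader, p)` would plug in.

```java
import java.io.IOException;

public class AllPagesText {

    /** Hypothetical abstraction: returns the text of one page (1-based). */
    @FunctionalInterface
    public interface PageTextSource {
        String textOf(int pageNumber) throws IOException;
    }

    /** Concatenates the text of pages 1..pageCount, one line break after each page. */
    public static String extractAll(int pageCount, PageTextSource source) throws IOException {
        StringBuilder all = new StringBuilder();
        for (int page = 1; page <= pageCount; page++) { // page numbers start at 1
            all.append(source.textOf(page)).append(System.lineSeparator());
        }
        return all.toString();
    }

    public static void main(String[] args) throws IOException {
        // With itextpdf on the classpath, the call would look like:
        //   extractAll(reader.getNumberOfPages(),
        //              p -> PdfTextExtractor.getTextFromPage(reader, p));
        // A stand-in source is used here so the sketch runs on its own.
        String text = extractAll(3, p -> "text of page " + p);
        System.out.print(text);
    }
}
```

Building the result in a `StringBuilder` avoids repeated string concatenation when the document has many pages.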
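PDF-to-image conversion code
Both a local PDF file path and the URL of an online PDF are supported. Note that the rendering itself uses the pdfbox classes PDDocument and PDFRenderer, so the pdfbox dependency must also be on the classpath.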
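Choosing between the two kinds of source can be sketched as follows. This is only an illustration using the standard library: the class `PdfSource` and its methods `isRemote` and `open` are hypothetical names, and the check assumes that anything starting with `http://` or `https://` is an online file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Locale;

public class PdfSource {

    /** True when the location looks like an http(s) URL rather than a file path. */
    public static boolean isRemote(String location) {
        String lower = location.toLowerCase(Locale.ROOT);
        return lower.startsWith("http://") || lower.startsWith("https://");
    }

    /** Opens the PDF as a stream; the caller is responsible for closing it. */
    public static InputStream open(String location) throws IOException {
        if (isRemote(location)) {
            return new URL(location).openStream();
        }
        return Files.newInputStream(Paths.get(location));
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: PdfSource <path-or-url>");
            return;
        }
        // try-with-resources closes the stream even if reading fails
        try (InputStream in = open(args[0])) {
            System.out.println("first byte: " + in.read());
        }
    }
}
```

Either branch returns a plain `InputStream`, so the rest of the conversion code does not need to know where the PDF came from.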
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.*;
import java.net.URL;
import java.net.URLConnection;

public class Main_itextPdf {
    public static void main(String[] args) throws Exception {
        System.out.println("------------------------------------------------------------");

        // InputStream inputStream = readPdfFromUrl("/"); // online PDF file
        InputStream inputStream = new FileInputStream("C:/Users/Administrator/Desktop/Luo Kaiwei's resume.pdf");
        byte[] bytes = streamToByte(inputStream);
        inputStream.close();
        InputStream newStream = new ByteArrayInputStream(bytes);

        // Convert the PDF stream to a PNG image stream
        InputStream imgStream = pdfToImg(newStream);

        // Save the image to the desktop.
        // Note: the path must end with the output file name; as written it names only a directory.
        ImageIO.write(ImageIO.read(imgStream), "png", new File("C:/Users/Administrator/Desktop/"));
        System.out.println("------------------------------------------------------------");
    }

    // Supports the URL of an online PDF file
    public static InputStream readPdfFromUrl(String pdfUrl) throws IOException {
        URL url = new URL(pdfUrl);
        URLConnection connection = url.openConnection();
        return new BufferedInputStream(connection.getInputStream());
    }

    public static ByteArrayInputStream pdfToImg(InputStream pdfStream) throws Exception {
        // Convert the InputStream to a PDDocument.
        // PDDocument.load is the PDFBox 2.x API; PDFBox 3.x replaced it with the Loader class.
        try (PDDocument document = PDDocument.load(pdfStream)) {
            // Create the PDFRenderer object
            PDFRenderer pdfRenderer = new PDFRenderer(document);

            // Render the first page only; change this to loop over all pages if needed.
            // 0 is the first page; 300 DPI gives a high-quality image.
            BufferedImage bufferedImage = pdfRenderer.renderImageWithDPI(0, 300);

            // Convert the BufferedImage to an InputStream
            ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
            ImageIO.write(bufferedImage, "PNG", byteArrayOutputStream);

            // Return a new InputStream over the PNG bytes
            return new ByteArrayInputStream(byteArrayOutputStream.toByteArray());
        }
    }

    // Reads the whole stream into a byte array
    public static byte[] streamToByte(InputStream inputStream) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        int nRead;
        byte[] data = new byte[1024];
        while ((nRead = inputStream.read(data, 0, data.length)) != -1) {
            buffer.write(data, 0, nRead);
        }
        buffer.flush();
        return buffer.toByteArray();
    }
}
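Because `pdfToImg` already returns PNG-encoded bytes, the save step does not have to decode and re-encode the image with `ImageIO`. A minimal sketch of writing such a stream straight to a file is shown here, assuming only the standard library; the class `SavePng`, the method `save`, and the file name `page-1.png` are hypothetical and chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class SavePng {

    /** Copies an already-encoded image stream to dir/fileName and returns the target path. */
    public static Path save(InputStream imageStream, Path dir, String fileName) throws IOException {
        Files.createDirectories(dir);              // no-op if the directory already exists
        Path target = dir.resolve(fileName);       // the target must be a file, not a directory
        Files.copy(imageStream, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in bytes; in the article's code this would be the stream returned by pdfToImg.
        InputStream fake = new ByteArrayInputStream(new byte[] {1, 2, 3});
        Path dir = Files.createTempDirectory("pdf-img");
        Path out = save(fake, dir, "page-1.png"); // hypothetical file name
        System.out.println("wrote " + Files.size(out) + " bytes to " + out);
    }
}
```

Resolving the file name against the directory makes it harder to pass a bare directory as the output target by mistake.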