SoFunction
Updated on 2024-10-30

Adding a watermark to PDF files with Python

Create the desired watermark template

Create the watermark document in WPS

Export it as a PDF

The resulting watermark PDF

Implementation steps

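Install dependencies

The code below uses the old PdfFileReader / PdfFileWriter API, which was removed in PyPDF2 3.0, so install a release older than 3.0:

pip install "PyPDF2<3.0"

Code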

import os
from PyPDF2 import PdfFileReader as pr
from PyPDF2 import PdfFileWriter as pw


def write_watermark(watermark_pdf_path: str, target_pdf_path: str):
    result_pdf = pw()
    pdf_file_name = os.path.basename(target_pdf_path)
    f_target = open(target_pdf_path, 'rb')
    f_watermark = open(watermark_pdf_path, 'rb')
    target_pdf = pr(f_target)
    watermark_page = pr(f_watermark).getPage(0)
    for page in range(target_pdf.getNumPages()):
        try:  # Merging can fail on some pages (blank pages, for example); this bug took me a whole day to solve.
            target_pdf.getPage(page).mergePage(watermark_page)
            result_pdf.addPage(target_pdf.getPage(page))
        except Exception as e:
            result_pdf.addPage(watermark_page)
    if not os.path.exists("output"):
        os.mkdir("output")
    with open("output/watermark added_" + pdf_file_name, 'wb') as f_out:
        result_pdf.write(f_out)
    f_target.close()
    f_watermark.close()


def folder_pdf_files(folder: str) -> list[str]:  # Collect the pdf files inside a folder
    file_list = []
    for a, b, c in os.walk(folder):
        if b == []:  # only look at folders that have no subfolders
            for filename in c:
                if filename[-3:].lower() == 'pdf':
                    file_path = os.path.join(a, filename)
                    file_list.append(file_path)
    print(folder, "contains", len(file_list), "pdf file(s).")
    return file_list


def group_write_watermark(path_array: list[str], watermark_pdf_path: str):  # Add a watermark to a group of pdf files
    for pdf_path in path_array:
        print(pdf_path, "Adding watermark...")
        write_watermark(watermark_pdf_path, pdf_path)
    print("Finished.")


if __name__ == '__main__':
    watermark_pdf_path = "Watermark file.pdf"
    folder_pdf = "Catalog"  # Directory of the pdf to be watermarked
    pdf_list = folder_pdf_files(folder_pdf)
    group_write_watermark(pdf_list, watermark_pdf_path)
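
For readers on PyPDF2 2.x or 3.x, the same steps can be written with the newer PdfReader / PdfWriter names. The following is only a sketch under a few assumptions, not the author's script: the helper name find_pdfs, the output_dir parameter and the "watermarked_" prefix are made up for illustration, the folder search includes every subfolder (the script above only looks at folders without subfolders), and the try/except fallback for pages that fail to merge is left out.

```python
from pathlib import Path


def find_pdfs(folder):
    """Return every .pdf file under folder (any depth), sorted by path."""
    return sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def write_watermark(watermark_pdf_path, target_pdf_path, output_dir="output"):
    """Stamp the first page of the watermark PDF onto every page of the target."""
    # Imported here so find_pdfs can be used without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    watermark_page = PdfReader(str(watermark_pdf_path)).pages[0]
    writer = PdfWriter()
    for page in PdfReader(str(target_pdf_path)).pages:
        page.merge_page(watermark_page)  # draw the watermark over the page
        writer.add_page(page)

    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / ("watermarked_" + Path(target_pdf_path).name)
    with open(out_path, "wb") as f_out:
        writer.write(f_out)
    return out_path


if __name__ == "__main__":
    # Same placeholder names as in the script above.
    for pdf_path in find_pdfs("Catalog"):
        print(pdf_path, "->", write_watermark("Watermark file.pdf", pdf_path))
```

Using pathlib keeps the suffix check case-insensitive without slicing the file name, and mkdir(exist_ok=True) replaces the separate existence check.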

Notes

UnicodeEncodeError: 'latin-1' codec can't encode characters in position 8-9: ordinal not in range(256)

If this error occurs, you can refer to the following.

Encoding problems with PyPDF2

Error message

'latin-1' codec can't encode characters in position 8-11: ordinal not in range(256)

Usually this is a Chinese character encoding problem.
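
The message can be reproduced without PyPDF2. Latin-1 only covers code points 0 to 255, so any string that contains Chinese characters cannot be encoded with it. A minimal sketch (the helper can_encode and the sample strings are made up for illustration):

```python
def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode("watermark.pdf", "latin-1"))  # plain ASCII fits into latin-1
print(can_encode("水印.pdf", "latin-1"))        # Chinese characters do not
print(can_encode("水印.pdf", "utf-8"))          # utf-8 can encode any str
```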

The following error was raised when copying a PDF with PyPDF2:

//Error message
<ipython-input-1-4f7e1b354328> in <module>()
     14      (p)
     15 with open('D:\\Program Files\\', 'wb') as f:
---> 16     (f)

D:\Program Files (x86)\anaconda\lib\site-packages\PyPDF2\ in write(self, stream)
    499                 md5_hash = md5(key).digest()
    500                 key = md5_hash[:min(16, len(self._encrypt_key) + 5)]
--> 501             (stream, key)
    502             (b_("\nendobj\n"))
    503 

D:\Program Files (x86)\anaconda\lib\site-packages\PyPDF2\ in writeToStream(self, stream, encryption_key)
    547             (stream, encryption_key)
    548             (b_(" "))
--> 549             (stream, encryption_key)
    550             (b_("\n"))
    551         (b_(">>"))

D:\Program Files (x86)\anaconda\lib\site-packages\PyPDF2\ in writeToStream(self, stream, encryption_key)
    470 
    471     def writeToStream(self, stream, encryption_key):
--> 472         (b_(self))
    473 
    474     def readFromStream(stream, pdf):

D:\Program Files (x86)\anaconda\lib\site-packages\PyPDF2\ in b_(s)
    236             return s
    237         else:
--> 238             r = ('latin-1')
    239             if len(s) < 2:
    240                 bc[s] = r

UnicodeEncodeError: 'latin-1' codec can't encode characters in position 8-11: ordinal not in range(256)

Solution

1. Modify a file in the PyPDF2 package

Since I am using anaconda, the path is anaconda\Lib\site-packages\PyPDF2\

Original code at line 488 of the file:

try:
    return NameObject(name.decode('utf-8'))
except (UnicodeEncodeError, UnicodeDecodeError) as e:
    # Name objects should represent irregular characters
    # with a '#' followed by the symbol's hex number
    if not pdf.strict:
        warnings.warn("Illegal character in Name Object", utils.PdfReadWarning)
        return NameObject(name)
    else:
        raise utils.PdfReadError("Illegal character in Name Object")

Change it to:

try:
    return NameObject(name.decode('utf-8'))
except (UnicodeEncodeError, UnicodeDecodeError) as e:
    try:
        return NameObject(name.decode('gbk'))
    except (UnicodeEncodeError, UnicodeDecodeError) as e:
        # Name objects should represent irregular characters
        # with a '#' followed by the symbol's hex number
        if not pdf.strict:
            warnings.warn("Illegal character in Name Object", utils.PdfReadWarning)
            return NameObject(name)
        else:
            raise utils.PdfReadError("Illegal character in Name Object")
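
The idea behind this change, trying utf-8 first and falling back to gbk, can be tried out on its own. This is a standalone sketch of the fallback, not PyPDF2 code; the function name decode_name and the sample bytes are assumptions for illustration. Note that the fallback only runs when utf-8 decoding fails, and some GBK byte sequences also happen to be valid utf-8, so it cannot catch every case.

```python
def decode_name(raw, encodings=("utf-8", "gbk")):
    """Decode raw bytes with the first codec in encodings that accepts them."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next codec
    raise ValueError("could not decode %r with any of %r" % (raw, encodings))


gbk_bytes = "中文".encode("gbk")   # these bytes are not valid utf-8
print(decode_name(gbk_bytes))      # decoded by the gbk fallback
print(decode_name(b"Font"))        # plain ASCII is decoded as utf-8
```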

2. Modify utils.py in the PyPDF2 package

Original code at line 238 of utils.py:

r = s.encode('latin-1')
if len(s) < 2:
    bc[s] = r
return r

Change it to:

try:
    r = s.encode('latin-1')
    if len(s) < 2:
        bc[s] = r
    return r
except Exception as e:
    print(s)
    r = s.encode('utf-8')
    if len(s) < 2:
        bc[s] = r
    return r

Problem solved.

Takeaways

What is new in this code is that it walks through a folder and adds the watermark to every PDF in it.

In fact, the folder traversal is nothing special. What I am most proud of is the try/except block in the write_watermark function, which handles the error raised by blank pages in a PDF. It took me a whole day to solve and I could not find a solution online, so I had to feel my way forward.

One remaining problem is that this code does not work well for image PDFs. Their pages are usually somewhat larger than those of a regular text PDF, which makes it hard to control where the watermark lands. The approach I have in mind is to enlarge the page size when creating the watermark PDF.
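
An alternative to re-creating the watermark PDF at a larger size would be to compute how much the watermark page has to be scaled and shifted to fit each target page. The sketch below only does that arithmetic; the function name fit_watermark is hypothetical, the page sizes are example values in PDF points, and applying the result to a page is left to whatever transformation support the installed PDF library offers.

```python
def fit_watermark(wm_width, wm_height, page_width, page_height):
    """Return (scale, offset_x, offset_y) that fits the watermark page
    inside the target page, keeps its aspect ratio and centers it."""
    scale = min(page_width / wm_width, page_height / wm_height)
    offset_x = (page_width - wm_width * scale) / 2
    offset_y = (page_height - wm_height * scale) / 2
    return scale, offset_x, offset_y


# An A4-sized watermark (595 x 842 pt) on a page twice as large.
print(fit_watermark(595, 842, 1190, 1684))  # (2.0, 0.0, 0.0)
```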
