SoFunction
Updated on 2024-10-29

Automating Excel and Word with Python for Office Automation

Today I'm sharing some Python office automation techniques. Feel free to bookmark this article, leave a like, and join the discussion.

Openpyxl

Openpyxl is arguably one of the most versatile modules in Python, and it makes interacting with Excel a walk in the park. With it, you can read and write the current Excel formats such as xlsx and xlsm; the legacy xls format is not supported.

Openpyxl allows filling rows and columns, writing formulas, creating 2D and 3D charts, labeling axes and titles, and plenty of other features that can come in handy. Most importantly, this package lets you iterate through any number of rows and columns in Excel, saving you from tedious manual number crunching and plotting.

Python-docx

The Python-docx package is to Word what Openpyxl is to Excel. If you haven't looked into its documentation yet, you probably should. It's no exaggeration to say that Python-docx is one of the easiest and most self-explanatory toolkits I've used since I started using Python.

It allows you to generate documents automatically by inserting text, filling in tables and rendering images into reports, with very little overhead.

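Let's create our own automation pipeline. Go ahead and start Anaconda and install the following packages (docxtpl, pywin32 and Pillow are needed by the chart-extraction and template steps later on):

pip install openpyxl python-docx
pip install docxtpl pywin32 Pillow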

Microsoft Excel Automation

We start by loading an existing Excel workbook:

import openpyxl as xl

workbook = xl.load_workbook('')  # pass the path to your .xlsx workbook
sheet_1 = workbook['Sheet1']
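
The workbook itself is not reproduced here, so for experimenting a stand-in can be built with openpyxl. This is a minimal sketch, assuming the layout implied by the loop that follows: a header row, Power in column A (initially empty), Current in column B and Voltage in column C. The readings and the in-memory buffer are illustrative assumptions, not the author's data.

```python
from io import BytesIO

import openpyxl as xl

# Hypothetical sample readings as (current, voltage) pairs; not from the article.
readings = [(1.5, 12.0), (2.0, 12.0), (2.5, 11.5)]

workbook = xl.Workbook()
sheet = workbook.active
sheet.title = 'Sheet1'

# Row 1 holds the headers; column A (Power) is left empty for now.
sheet.append(['Power', 'Current', 'Voltage'])
for current, voltage in readings:
    sheet.append([None, current, voltage])

# Save to an in-memory buffer here; pass a file path instead to write to disk.
buffer = BytesIO()
workbook.save(buffer)
buffer.seek(0)

# Reload to confirm the layout survives a save/load round trip.
reloaded = xl.load_workbook(buffer)
sheet_1 = reloaded['Sheet1']
print(sheet_1.max_row, sheet_1.cell(2, 2).value, sheet_1.cell(2, 3).value)
```

Passing a file name to `workbook.save()` instead of the buffer produces a file that `xl.load_workbook()` can open in the steps below.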


We will iterate through all the rows in the spreadsheet, calculate the power by multiplying the current by the voltage, and write the result into the first column:

for row in range(2, sheet_1.max_row + 1):
    current = sheet_1.cell(row, 2)
    voltage = sheet_1.cell(row, 3)
    power = float(current.value) * float(voltage.value)
    power_cell = sheet_1.cell(row, 1)
    power_cell.value = power
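
The calculation assumes every Current and Voltage cell holds a number. `float()` raises a `TypeError` for an empty cell (openpyxl returns `None`) and a `ValueError` for non-numeric text. A minimal sketch of a more forgiving calculation follows; the helper name `compute_power` and the choice to return `None` for unusable rows are assumptions made for illustration.

```python
def compute_power(current, voltage):
    """Return current * voltage as a float, or None if either value is unusable."""
    try:
        return float(current) * float(voltage)
    except (TypeError, ValueError):
        # TypeError: the cell was empty (None); ValueError: non-numeric text.
        return None


# Stand-ins for cell values read from a worksheet (hypothetical data).
rows = [(1.5, 12.0), ('2', '12'), (None, 12.0), ('n/a', 5)]
powers = [compute_power(current, voltage) for current, voltage in rows]
print(powers)  # [18.0, 24.0, None, None]
```

Inside the loop this would be called as `compute_power(current.value, voltage.value)`.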

Upon completion, we will use the calculated power values to generate a line graph that will be inserted into the specified cell as shown below:

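from openpyxl.chart import LineChart, Reference

values = Reference(sheet_1, min_row=2, max_row=sheet_1.max_row, min_col=1, max_col=1)
chart = LineChart()
chart.y_axis.title = 'Power'
chart.x_axis.title = 'Index'
chart.add_data(values)
sheet_1.add_chart(chart, 'E2')  # anchor the chart's top-left corner at cell E2
workbook.save('')  # pass the path to save the workbook to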
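
Two optional refinements to the chart are sketched here under assumptions. Including the header cell in the reference and passing `titles_from_data=True` labels the series in the legend. Setting `delete = False` on each axis is a commonly used workaround when axes do not appear in charts written by some recent openpyxl releases. The sample values, the chart title and the in-memory save target are illustrative.

```python
from io import BytesIO

import openpyxl as xl
from openpyxl.chart import LineChart, Reference

workbook = xl.Workbook()
sheet = workbook.active
sheet.append(['Power'])              # header row, used as the series title
for value in (18.0, 24.0, 28.75):    # hypothetical power values
    sheet.append([value])

# Start the reference at row 1 so the header is read as the series title.
values = Reference(sheet, min_col=1, max_col=1, min_row=1, max_row=sheet.max_row)

chart = LineChart()
chart.title = 'Power per reading'    # hypothetical chart title
chart.y_axis.title = 'Power'
chart.x_axis.title = 'Index'
chart.add_data(values, titles_from_data=True)

# Explicitly keep both axes visible.
chart.x_axis.delete = False
chart.y_axis.delete = False

sheet.add_chart(chart, 'E2')

buffer = BytesIO()                   # in-memory target; use a file path to write to disk
workbook.save(buffer)
```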


Extract Chart

Now that we have generated the chart, we need to extract it as an image so we can use it in our Word report.

First, we will declare the exact location of the Excel file and where the output chart image should be saved:

input_file = "C:/Users/.../"
output_image = "C:/Users/.../"

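Then open the spreadsheet through Excel's COM interface (this requires Windows with Excel installed, plus the pywin32 package):

import win32com.client

operation = win32com.client.Dispatch("Excel.Application")
operation.Visible = 0
operation.DisplayAlerts = 0
workbook_2 = operation.Workbooks.Open(input_file)
sheet_2 = operation.Sheets(1)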

You can then iterate through all the chart objects in the spreadsheet and save them in the specified location, as shown below:

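from PIL import ImageGrab

for x, chart in enumerate(sheet_2.Shapes):
    chart.Copy()                       # copy the chart to the clipboard
    image = ImageGrab.grabclipboard()  # read it back as a PIL image
    image.save(output_image, 'png')
workbook_2.Close(True)
operation.Quit()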
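
Note that the loop writes every chart to the same `output_image` path, so a sheet with several charts keeps only the last one. Below is a sketch of one way around that, assuming a hypothetical helper `chart_image_path` that derives a numbered file name per chart. The hypothetical `export_charts` function wraps the COM and clipboard approach of this section and only runs on Windows with Excel, pywin32 and Pillow available. It also skips a shape when `ImageGrab.grabclipboard()` does not return an image, which can happen when the clipboard holds no image data.

```python
from pathlib import Path


def chart_image_path(output_dir, index, stem='chart'):
    """Build a distinct PNG path for the chart at position `index` (0-based)."""
    return Path(output_dir) / f'{stem}_{index + 1}.png'


def export_charts(input_file, output_dir):
    """Copy each shape on the first sheet to the clipboard and save it as PNG.

    Windows only: requires Excel, pywin32 and Pillow.
    """
    # Imported lazily so chart_image_path stays usable on any platform.
    import win32com.client
    from PIL import Image, ImageGrab

    excel = win32com.client.Dispatch('Excel.Application')
    excel.Visible = 0
    excel.DisplayAlerts = 0
    saved = []
    try:
        workbook = excel.Workbooks.Open(input_file)
        sheet = workbook.Sheets(1)
        for index, shape in enumerate(sheet.Shapes):
            shape.Copy()
            image = ImageGrab.grabclipboard()
            if not isinstance(image, Image.Image):
                continue  # the clipboard did not contain an image
            target = chart_image_path(output_dir, index)
            image.save(target, 'png')
            saved.append(target)
        workbook.Close(False)  # close without saving changes
    finally:
        excel.Quit()
    return saved


print(chart_image_path('reports', 0).name)  # chart_1.png
```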

Microsoft Word Automation

Now that we have generated our chart image, we must create a template document, which is a plain Microsoft Word document (.docx) formatted exactly as we want the report to look, including fonts, font sizes, formatting, and page structure.

All we then need to do is create placeholders for our automated content, i.e. table values and images, and declare them with variable names using the notation described below.


Any automated content, including text and images, is declared inside a pair of double curly braces: {{variable_name}}. For tables, create a template row containing all the columns, then add one row above it and one row below it with the following notation:

First row:

{%tr for item in variable_name %}

Last row:

{%tr endfor %}

In this template, the variable names are:

  • table_contents: the Python list of dictionaries that stores the tabular data
  • Index: the dictionary key for the row number (first column)
  • Power, Current and Voltage: the dictionary keys for the values in the second, third and fourth columns

We then import our template document into Python and create a list of dictionaries to store the values of our table:

from docxtpl import DocxTemplate

template = DocxTemplate('')  # pass the path to your .docx template
table_contents = []
for i in range(2, sheet_1.max_row + 1):
    table_contents.append({
        'Index': i-1,
        'Power': sheet_1.cell(i, 1).value,
        'Current': sheet_1.cell(i, 2).value,
        'Voltage': sheet_1.cell(i, 3).value
        })
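
The same list can be built without per-cell lookups. `sheet_1.iter_rows(min_row=2, max_col=3, values_only=True)` yields one `(power, current, voltage)` tuple per row, and a function consuming such tuples might look like the sketch below. The function name `build_table_contents` and the sample rows are assumptions for illustration; the dictionary keys match those used in this section.

```python
def build_table_contents(rows):
    """Turn (power, current, voltage) tuples into the list of dicts the template loops over."""
    return [
        {'Index': index, 'Power': power, 'Current': current, 'Voltage': voltage}
        for index, (power, current, voltage) in enumerate(rows, start=1)
    ]


# Hypothetical rows, shaped like the output of
# sheet_1.iter_rows(min_row=2, max_col=3, values_only=True)
sample_rows = [(18.0, 1.5, 12.0), (24.0, 2.0, 12.0)]
table_contents = build_table_contents(sample_rows)
print(table_contents[0])
```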

Next, we will import the chart image previously generated by Excel and will create another dictionary to instantiate all the placeholder variables declared in the template document:

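import datetime

from docx.shared import Cm
from docxtpl import InlineImage

# output_image is the chart image path defined in the extraction step
image = InlineImage(template, output_image, Cm(10))
context = {
    'title': 'Automated Report',
    'day': datetime.datetime.now().strftime('%d'),
    'month': datetime.datetime.now().strftime('%b'),
    'year': datetime.datetime.now().strftime('%Y'),
    'table_contents': table_contents,
    'image': image
    }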
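
The date fields are best derived from a single timestamp, since separate calls to `datetime.datetime.now()` could in principle straddle midnight and disagree with one another. A small sketch that captures the timestamp once follows; the helper name `date_fields` is hypothetical and the fixed date is an arbitrary example.

```python
import datetime


def date_fields(moment=None):
    """Return the day, month and year strings used by the report context."""
    moment = moment or datetime.datetime.now()
    return {
        'day': moment.strftime('%d'),
        'month': moment.strftime('%b'),   # abbreviated month name, locale-dependent
        'year': moment.strftime('%Y'),
    }


fields = date_fields(datetime.datetime(2024, 10, 29))
print(fields)
```

The result can be merged into the context with `{'title': 'Automated Report', **date_fields(), ...}`.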

Finally, we render the report with our table of values and chart image, and save it:

template.render(context)
template.save('Automated_report.docx')

Summary

Just like that, the automatically generated Microsoft Word report contains figures and charts created in Microsoft Excel. With this, you have a fully automated pipeline for creating as many tables, charts, and documents as you may need.


The complete source code is as follows:

import datetime

import openpyxl as xl
from openpyxl.chart import LineChart, Reference

import win32com.client
from PIL import ImageGrab

from docx.shared import Cm
from docxtpl import DocxTemplate, InlineImage


######## Generate automated excel workbook ########

workbook = xl.load_workbook('')  # pass the path to your .xlsx workbook
sheet_1 = workbook['Sheet1']

for row in range(2, sheet_1.max_row + 1):
    current = sheet_1.cell(row, 2)
    voltage = sheet_1.cell(row, 3)
    power = float(current.value) * float(voltage.value)
    power_cell = sheet_1.cell(row, 1)
    power_cell.value = power

values = Reference(sheet_1, min_row = 2, max_row = sheet_1.max_row, min_col = 1, max_col = 1)
chart = LineChart()
chart.y_axis.title = 'Power'
chart.x_axis.title = 'Index'
chart.add_data(values)
sheet_1.add_chart(chart, 'E2')

workbook.save('')  # pass the path to save the workbook to


######## Extract chart image from Excel workbook ########

input_file = "C:/Users/.../"
output_image = "C:/Users/.../"

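operation = win32com.client.Dispatch("Excel.Application")
operation.Visible = 0
operation.DisplayAlerts = 0

workbook_2 = operation.Workbooks.Open(input_file)
sheet_2 = operation.Sheets(1)

for x, chart in enumerate(sheet_2.Shapes):
    chart.Copy()                       # copy the chart to the clipboard
    image = ImageGrab.grabclipboard()  # read it back as a PIL image
    image.save(output_image, 'png')

workbook_2.Close(True)
operation.Quit()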


######## Generating automated word document ########

template = DocxTemplate('')  # pass the path to your .docx template

# Build the table rows from the worksheet values
table_contents = []

for i in range(2, sheet_1.max_row + 1):
    
    table_contents.append({
        'Index': i-1,
        'Power': sheet_1.cell(i, 1).value,
        'Current': sheet_1.cell(i, 2).value,
        'Voltage': sheet_1.cell(i, 3).value
        })

# Import the saved chart image (output_image is defined above)
image = InlineImage(template, output_image, Cm(10))

#Declare template variables
context = {
    'title': 'Automated Report',
    'day': datetime.datetime.now().strftime('%d'),
    'month': datetime.datetime.now().strftime('%b'),
    'year': datetime.datetime.now().strftime('%Y'),
    'table_contents': table_contents,
    'image': image
    }

# Render and save the automated report
template.render(context)
template.save('Automated_report.docx')
