SoFunction
Updated on 2024-10-29

Implementing an automatic invoice-checking WeChat robot in Python

Motivation:

  • Staff in the field had invoices issued to the company, only to find that the information was wrong and the invoice could not be reimbursed;
  • The company's administration and finance staff are often asked for the company's invoicing information during the workday, which interrupts their work and affects their mood;
  • Introducing a specialized app to handle invoices is too costly for an average company;
  • I saw that a friend (Meng) had written a script to solve this problem, but because company scenarios differ it could not be reused, so I wrote a new one.

This code uses simple encapsulation and fairly thorough comments, in the hope that it will give Python beginners some inspiration and let people with a practical need modify and use it quickly.

Source code address:/yc2code/WechatInvoiceParser

Note: the tool is based on the web version of WeChat. Because WeChat officially restricts web login for some accounts, a newly created account may not work and will report KeyError: 'pass_ticket'.

So the tool can only be used with accounts that were registered earlier.

Invoice auto-checking WeChat robot: code section

1. Tools - Utils
Includes three parts: the invoice recognition class Invoice, the data parsing class DataParser, and the push class Pushover.

  • Invoice calls the Baidu API to upload the image and get the parsed data;
  • DataParser organizes the parsed data into the information to be sent to the user;
  • Pushover pushes the relevant information to the maintainer's device as soon as a call fails.
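The listing that follows mentions a few upload limits in its comments (supported image formats and a size cap after encoding). A minimal sketch of checking them before calling the API might look like this. The helper name check_image, the reading of "4M" as 4 MiB, and the suffix set are assumptions made for illustration and are not part of the original code; the pixel-size limits are not checked here.

```python
import base64
import os
from urllib.parse import quote

# Limits taken from the comments in the listing; treat them as assumptions
# and confirm them against the current Baidu documentation.
MAX_ENCODED_BYTES = 4 * 1024 * 1024
ALLOWED_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def check_image(file_name, image_binary):
    """Return (ok, reason) for an image about to be uploaded (hypothetical helper)."""
    suffix = os.path.splitext(file_name)[1].lower()
    if suffix not in ALLOWED_SUFFIXES:
        return False, f"unsupported format: {suffix or 'none'}"
    # Measure the size after base64 encoding and URL encoding, as the API requires
    encoded = quote(base64.b64encode(image_binary), safe="")
    if len(encoded) > MAX_ENCODED_BYTES:
        return False, "image too large after base64 and URL encoding"
    return True, "ok"
```

Rejecting an image locally saves an API call and lets the bot tell the sender why the picture was not processed.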
# -*- coding: utf-8 -*-
# 
import base64
import csv
import os
import time
import requests
from Config import config
class Invoice:
 """
 Invoice Recognition Class
 Use Baidu Invoice Recognition API for free
 Official address /docs#/OCR-API/5099e085
 For other functions and configurations, please go to the official website
 """
 @staticmethod
 def get_pic_content(image_path):
  """
  Methods - Opening Images
  Open in binary format
  """
  with open(image_path, 'rb') as pic:
   return pic.read()
 @staticmethod
 def parse_invoice(image_binary):
  """
  Methods - Recognizing Images
  Call the Baidu interface to return the recognized invoice data
  The following is written according to the requirements of the API call, so there is no need to dwell on the details
  The various error codes are listed in the documentation on the official website
  Baidu API registration and use tutorial: /forum/topic/show/867951
  """
  # Recognition accuracy: either "high" or "normal"
  # normal (the default) uses the normal-accuracy model, which is faster and matches the high model on the accuracy of the four key elements.
  # high uses the high-accuracy model; latency increases, as does the number of failures due to timeouts (error code 282000)
  access_token = "Your access_token."
  api_url = f"/rest/2.0/ocr/v1/vat_invoice?access_token={access_token}"
  quality = "high"
  header = {"Content-Type": "application/x-www-form-urlencoded"}
  # image data, base64 encoded and then urlencoded, the size after base64 encoding and urlencoding is required to be no more than 4M.
  # The shortest side is at least 15px, the longest side is up to 4096px, support jpg/jpeg/png/bmp format.
  image_data = base64.b64encode(image_binary)
  try:
   data = {"accuracy": quality, "image": image_data}
   response = requests.post(api_url, data=data, headers=header)
   if response.status_code != 200:
    print(time.ctime()[:-5], "Failed to get info")
    return None
   else:
    result = response.json()["words_result"]
    invoice_data = {
     "Date of search": '-'.join(time.ctime().split()[1:3]),
     "Invoice code": result['InvoiceCode'],
     "Invoice number": result['InvoiceNum'],
     "Invoicing date": result['InvoiceDate'],
     "Total amount": result['TotalAmount'],
     "Total price and tax": result['AmountInFiguers'],
     "Seller's name": result['SellerName'],
     "Seller's Tax ID": result['SellerRegisterNum'],
     "Name of purchaser": result['PurchaserName'],
     "Purchaser's Tax ID": result['PurchaserRegisterNum'],
     "Invoice type": result["InvoiceType"]
    }
    return invoice_data
  except Exception:
   # Any failure (network error, missing field, bad JSON) is reported to the maintainer
   message = "An error occurred in the invoice recognition API call."
   Pushover.push_message(message)
   return None
  finally:
   print(time.ctime()[:-5], "Produced a call.")
 @staticmethod
 def save_to_csv(invoice_data):
  """
  Method - Saving the log
  Write the recognition log to the work_log.csv file in the current folder
  If the file does not exist, it is created automatically and the table header is written first
  """
  not_found = "work_log.csv" not in os.listdir()
  # newline='' prevents blank lines between rows on Windows
  with open('./work_log.csv', 'a+', newline='') as file:
   writer = csv.writer(file)
   if not_found:
    writer.writerow(invoice_data.keys())
   writer.writerow(invoice_data.values())
 @staticmethod
 def run(image_path):
  """
  Main method
  Returns a message when parsing is complete, otherwise returns None
  """
  image_binary = Invoice.get_pic_content(image_path)
  invoice_data = Invoice.parse_invoice(image_binary)
  if invoice_data:
   Invoice.save_to_csv(invoice_data)
   return invoice_data
  return None
class DataParser:
 """
 Data parsing class
 Organizes the data returned by recognition and compares it with the default information to see whether there are any errors.
 Only organizing the information and checking the name and tax code are implemented here; interested readers can add richer methods
 """
 def __init__(self, invoice_data):
  self.invoice_data = invoice_data
 def get_detail_message(self):
  """
  Organize the format of the resulting invoice information
  :return: Returns the organized invoice information
  """
  # Relies on the insertion order of invoice_data; index 0 (date of search) is skipped
  values = list(self.invoice_data.values())
  detail_mess = "The complete information is:" \
   f"\nInvoice code: {values[1]}\nInvoice number: {values[2]}\nInvoicing date: {values[3]}" \
   f"\nTotal amount: {values[4]}\nTotal price and tax: {values[5]}\nName of seller: {values[6]}" \
   f"\nSeller's tax code: {values[7]}\nName of purchaser: {values[8]}\nPurchaser's tax code: {values[9]}"
  return detail_mess
 def get_brief_message(self):
  """
  Compare the name and tax code in the message with the default values
  Only a right-or-wrong judgment is made; readers can extend it to point out where the information is wrong
  :return: return the information of the judgment
  """
  if self.invoice_data["Name of purchaser"] == config["company_name"]:
   brief_mess = "The name of the purchaser is correct"
  else:
   brief_mess = "! Wrong Purchaser Name!"
  if self.invoice_data["Purchaser's Tax ID"] == config["company_tax_number"]:
   brief_mess += "\n Purchaser's tax code is correct."
  else:
   brief_mess += "\n! Purchaser tax code error!"
  return brief_mess
 def parse(self):
  brief_mess = self.get_brief_message()
  detail_mess = self.get_detail_message()
  return brief_mess, detail_mess
class Pushover:
 """
 Pushover
 This time we use Pushover as push messaging software (30 RMB, permanent, recommended)
 Official website /
 You can push messages to different devices like WeChat.
 If you don't need it, you can comment out the code.
 """
 @staticmethod
 def push_message(message):
  message += ">>> from Python Invoice Verification"
  try:
   requests.post("/1/", data={
    "token": "Your Token.",
    "user": "Your User.",
    "message": message
   })
  except Exception as e:
   print(time.ctime()[:-5], "Pushover failed", e, sep="\n>>>>>>>>>>\n")
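The DataParser docstring suggests extending the check so that it points out which information is wrong. A minimal sketch of that idea, assuming the field names shown here and two hypothetical helpers (normalize and find_mismatches) that are not part of the original code:

```python
def normalize(text):
    """Remove whitespace and unify full-width parentheses before comparing."""
    table = str.maketrans({"（": "(", "）": ")"})
    return "".join(str(text).split()).translate(table)


def find_mismatches(invoice_data, expected):
    """Return (field, found, expected) for every expected field that differs."""
    mismatches = []
    for field, want in expected.items():
        got = invoice_data.get(field, "")
        if normalize(got) != normalize(want):
            mismatches.append((field, got, want))
    return mismatches


# Illustrative values only
expected = {
    "Name of purchaser": "Example Co. (Beijing)",
    "Purchaser's Tax ID": "XXX00000000000XXX",
}
recognized = {
    "Name of purchaser": "Example Co.（Beijing）",
    "Purchaser's Tax ID": "XXX00000000000XXY",
}
for field, got, want in find_mismatches(recognized, expected):
    print(f"{field}: found {got!r}, expected {want!r}")
```

Normalizing first avoids false alarms when OCR returns full-width brackets or stray spaces, while a genuinely different tax ID is still reported together with the expected value.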

2. WeChat robot file - Wechat
Contains one part: the WeChat processing class Wechat
Its role is to initialize the bot and to process, analyze and respond to messages from WeChat.

# -*- coding: utf-8 -*-
# 
import os
from wxpy import *
class Wechat:
 """
 WeChat processing class
 Processing, analyzing and responding to WeChat messages
 """
 def __init__(self, group_name, admin_name):
  self.bot = Bot() # Instantiate the robot when the class is instantiated.
  self.group_name = group_name # Specify the group chat name
  self.admin_name = admin_name # Administrator's WeChat name
  self.received_mess_list = [] # Filtered message list
  self.order_list = [] # Manage command list
  self.pic_list = [] # List of absolute paths of images to be parsed
 def get_group_mess(self):
  """
  Methods - Getting Messages
  Get all normal messages, filter them and store them in the message list
  """
  # Call this method to clear the data stored in the list from the last time it was called.
  self.received_mess_list = []
  # Iterate over a copy so that removing messages does not skip any
  for message in list(self.bot.messages):
   # If the message comes from the specified group chat or the administrator, store it in received_mess_list
   sender = message.sender.name
   # >>> One thing to keep in mind: if the same WeChat account is used as both the bot and the administrator, <<<
   # >>> and that account sends a message in the group chat, the message sender will point to the account itself instead of the group chat <<<
   # >>> It is recommended to use a separate WeChat account as the bot <<<
   if sender == self.group_name or sender == self.admin_name:
    self.received_mess_list.append(message)
   # Remove the message from the bot's cache once examined; other messages are filtered out
   self.bot.messages.remove(message)
  return None
 def parse_mess(self):
  """
  Method - Processing group chat messages
  Filters the messages acquired from the specified group chat
  Collects the absolute paths of all new group chat images and the text commands sent in the group chat
  """
  # Call this method to clear the data stored in the list from the last time it was called.
  self.pic_list = []
  self.order_list = []
  # self.group_order = []
  for message in self.received_mess_list:
   # If the message type is picture (and not a gif), save the picture and add it to the picture list
   if message.type == 'Picture' and message.file_name.split('.')[-1] != 'gif':
    self.pic_list.append(Wechat.save_file(message))
   # If the message type is text, it is considered a command and saved to the command list
   if message.type == 'Text':
    self.order_list.append(message)
  return None
 @staticmethod
 def save_file(image):
  """
  Method - Storing pictures
  A static method is used here because this method has no internal interaction with the class, and a static method is easy for other programs to call
  Parse the name, set the absolute path, store the file
  :param image: The received image (it can be seen as an image object generated by wxpy, with its own methods and attributes)
  :return: Returns the absolute path of the image
  """
  path = os.getcwd()
  # If there is no Pictures folder under the path, create it to hold the received pictures to be recognized
  if "Pictures" not in os.listdir():
   os.mkdir("Pictures")
  # Set a default image format suffix
  file_postfix = "png"
  try:
   # Try to split the name of the image at the last dot to get the name and suffix separately
   file_name, file_postfix = image.file_name.rsplit('.', 1)
  except ValueError:
   # Sometimes the name has no suffix to split off, so just keep the default suffix
   file_name = image.file_name
  # Assign absolute paths
  file_path = path + '/Pictures/' + file_name + '.' + file_postfix
  # Store the image under the specified path
  image.get_file(file_path)
  return file_path
 def send_group_mess(self, message):
  """
  Methods - sending a group message
  :param message: what needs to be sent
  """
  try:
   # If the group chat has been renamed, the search finds nothing and the message is not sent.
   group = self.bot.groups().search(self.group_name)[0]
   group.send(message)
  except IndexError:
   print("The specified group chat could not be found. Message delivery failed.")
   return None
 def send_parse_log(self):
  """
  Methods - sending query logs
  Sending a query log to a group chat
  """
  try:
   # If the group chat has been renamed, the search finds nothing and the log is not sent.
   group = self.bot.groups().search(self.group_name)[0]
  except IndexError:
   print("The specified group chat could not be found and the query log failed to be sent.")
   return None
  try:
   group.send_file("./work_log.csv")
  except Exception:
   group.send("Oops, no log yet")
  return None
 def send_system_log(self):
  """
  Methods - sending system logs
  Sending the system log to a group chat
  """
  try:
   # If the group chat has been renamed, the search finds nothing and the log is not sent.
   group = self.bot.groups().search(self.group_name)[0]
  except IndexError:
   print("The specified group chat could not be found. System log delivery failed.")
   return None
  try:
   group.send_file("./system_log.text")
  except Exception:
   group.send("System log not found")
  return None
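File names received from chat do not always split cleanly into a name and a suffix. As an alternative to splitting on dots by hand, the path handling in save_file could be sketched with os.path, as below. The helper name build_picture_path and its parameters are assumptions for illustration; it only builds the path and leaves the actual download to wxpy.

```python
import os


def build_picture_path(file_name, base_dir, default_suffix=".png"):
    """Return the path under base_dir/Pictures for a received file name."""
    pictures_dir = os.path.join(base_dir, "Pictures")
    # Create the folder if needed; no error if it already exists
    os.makedirs(pictures_dir, exist_ok=True)
    # basename() guards against names that carry path separators
    stem, suffix = os.path.splitext(os.path.basename(file_name))
    # Fall back to the default suffix when the name has none
    return os.path.join(pictures_dir, stem + (suffix or default_suffix))
```

os.path.splitext only splits at the last dot, so a name such as invoice.2024.jpg keeps its full stem, and os.path.join avoids hard-coding the path separator.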

3. Main file - Main
Contains a main function, one part of which recognizes and processes invoices, and another part of which reacts to instructions.
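The instruction-handling part can be thought of as a keyword lookup: the first keyword found in the message text decides what happens. A minimal sketch of that idea follows, assuming a hypothetical pick_command helper and stand-in handlers that just return strings; in the real script the handlers would be the Wechat methods.

```python
def pick_command(text, commands):
    """Return the handler for the first keyword found in text, else None."""
    for keyword, handler in commands:
        if keyword in text:
            return handler
    return None


# Order matters whenever one keyword contains another; listing the longer
# keywords first is a safe habit even though none of these overlap.
commands = [
    ("SEND SYSTEM LOG", lambda: "system log"),
    ("SEND LOG", lambda: "parse log"),
    ("Invoicing information", lambda: "company info"),
]

handler = pick_command("please SEND LOG", commands)
if handler is not None:
    print(handler())
```

Keeping the keywords in one table makes it easy to add a command without growing the if/elif chain.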

# -*- coding: utf-8 -*-
# 
import sys
import time
from Utils import Invoice, DataParser
from Config import config
from Wechat import *
# Author : Dash
# Email : @
def main():
 """
 Main method
 One part recognizes and processes invoices, the other part reacts to instructions.
 """
 # Output redirection: write all print output to the system log file
 file = open("./system_log.text", "a+")
 sys.stdout = file
 # Instantiate a WeChat bot, passing in the group chat name and administrator name
 wechat = Wechat(config["group_name"], config["admin_name"])
 while True:
  time.sleep(1)
  wechat.get_group_mess()
  wechat.parse_mess()
  # If the group chat has images to be processed, iteratively parse them
  if wechat.pic_list:
   for pic in wechat.pic_list:
    invoice_data = Invoice.run(pic)
    if invoice_data:
     data_parser = DataParser(invoice_data)
     brief_mess, detail_mess = data_parser.parse()
     wechat.send_group_mess(detail_mess) # Send the invoice recognition details first
     time.sleep(0.5)
     wechat.send_group_mess(brief_mess) # Then send whether the name and tax code are correct
    else:
     wechat.send_group_mess("The request was unsuccessful, please try again or contact an administrator")
  # Respond accordingly if there is a relevant command
  if wechat.order_list:
   for order in wechat.order_list:
    if "Invoicing information" in order.text:
     wechat.send_group_mess(config["company_name"])
     time.sleep(0.5)
     wechat.send_group_mess(config["company_tax_number"])
    elif "SEND LOG" in order.text:
     wechat.send_parse_log()
    elif "SEND SYSTEM LOG" in order.text:
     wechat.send_system_log()
    elif "BREAK" in order.text:
     wechat.send_group_mess("Shutdown command received. Shutdown in progress.")
     # Log the bot out before leaving the loop
     wechat.bot.logout()
     return None
if __name__ == "__main__":
 main()
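The main function sends every print call to system_log.text by replacing the standard output stream. One alternative is the standard logging module, which adds timestamps and levels without touching the output stream. The sketch below is an assumption about how that could look, not the author's method; make_logger and the logger name invoice_bot are hypothetical.

```python
import logging


def make_logger(log_path):
    """Create a logger that appends timestamped lines to log_path."""
    logger = logging.getLogger("invoice_bot")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding a second handler on repeated calls
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

With this in place, a call such as make_logger("./system_log.text").info("Produced a call.") would take the place of a print statement, and the existing "SEND SYSTEM LOG" command could keep sending the same file.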

4. Configuration file - Config

Contains the WeChat and company configuration information

config = {
 "group_name": "ASAP for invoice validation", # Name of the group chat used for checking; the code assumes no other group chat shares this name, so a distinctive value is recommended
 "admin_name": "Dash.", # Administrator's WeChat name (not the remark name)
 "company_name": "Code Network Technologies Unlimited, Inc.", # Name of default purchaser
 "company_tax_number": "XXX00000000000XXX" # Default Purchaser Tax ID
}
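Since the bot only discovers a missing or empty setting when a message arrives, it can help to check the configuration once at startup. A minimal sketch, assuming a hypothetical validate_config helper that is not part of the original code:

```python
REQUIRED_KEYS = ("group_name", "admin_name", "company_name", "company_tax_number")


def validate_config(config):
    """Return a list of problems; an empty list means every key is a non-empty string."""
    problems = []
    for key in REQUIRED_KEYS:
        value = config.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty: {key}")
    return problems
```

Calling validate_config(config) before instantiating the bot and stopping if the list is non-empty keeps configuration mistakes out of the group chat.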


In addition, the code creates a Pictures folder in the same folder to store the images to be parsed, a work_log.csv file to store the log of recognized information, and a system_log.text file that receives the program's printed output.
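The work_log.csv behavior (write the header once, then append one row per invoice) can also be sketched with csv.DictWriter. The helper name append_row is an assumption for illustration and not part of the original code.

```python
import csv
import os


def append_row(csv_path, row):
    """Append one dict as a CSV row, writing the header only for a new or empty file."""
    is_new = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    # newline="" stops the csv module from emitting blank lines on Windows
    with open(csv_path, "a", newline="", encoding="utf-8") as file:
        writer = csv.DictWriter(file, fieldnames=list(row.keys()))
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

Checking the file itself rather than listing the current directory means the helper also works when the log lives in another folder.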

Because my own needs are modest, the code above is fairly thin and is meant only as a small auxiliary script. If you want to refine it, the wxpy library provides many rich features, and you can build on this basis a more complete WeChat robot that fits your own needs.
