Python + OpenCV: splitting a table image along its frame lines and recognizing the text in each cell
This short program uses Python and OpenCV to split a picture of a table into sub-images along the table's frame lines, and then recognizes the text in each sub-image with Tesseract (through pytesseract). I hope it helps anyone who needs it. The full code follows.
# -*- coding: utf-8 -*-
"""
Created on Tue May 28 19:23:19 2019

Split an image into sub-images at the intersections of its table frame lines
(pass in the image path).

@author: hx
"""
import cv2
import numpy as np
import pytesseract

# NOTE: the image file name is missing from this path in the source; append your own.
image = cv2.imread('C:/Users/Administrator/Desktop/', 1)
# Convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Binarize; the gray image is inverted so lines and text become white on black
binary = cv2.adaptiveThreshold(~gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 35, -5)
# ret, binary = cv2.threshold(~gray, 127, 255, cv2.THRESH_BINARY)
cv2.imshow("Binarized image:", binary)  # show the image
cv2.waitKey(0)

rows, cols = binary.shape

# Identify the horizontal lines
scale = 40
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (cols // scale, 1))
eroded = cv2.erode(binary, kernel, iterations=1)
# cv2.imshow("Eroded Image", eroded)
dilatedcol = cv2.dilate(eroded, kernel, iterations=1)
cv2.imshow("Table horizontal lines:", dilatedcol)
cv2.waitKey(0)

# Identify the vertical lines
scale = 20
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, rows // scale))
eroded = cv2.erode(binary, kernel, iterations=1)
dilatedrow = cv2.dilate(eroded, kernel, iterations=1)
cv2.imshow("Table vertical lines:", dilatedrow)
cv2.waitKey(0)

# Mark the intersections
bitwiseAnd = cv2.bitwise_and(dilatedcol, dilatedrow)
cv2.imshow("Table intersections:", bitwiseAnd)
cv2.waitKey(0)
# cv2.imwrite("", bitwiseAnd)  # save the binary intersection pixels as an image

# Mark the whole table grid
merge = cv2.add(dilatedcol, dilatedrow)
cv2.imshow("Whole table grid:", merge)
cv2.waitKey(0)

# Subtract the grid from the binary image to remove the table frame lines
merge2 = cv2.subtract(binary, merge)
cv2.imshow("Image with table frame lines removed:", merge2)
cv2.waitKey(0)

# Find the white intersection pixels and take their row (y) and column (x) coordinates
ys, xs = np.where(bitwiseAnd > 0)

mylisty = []  # y coordinates
mylistx = []  # x coordinates

# Sort the coordinates and keep the values where a jump occurs. Each intersection
# covers many pixels with nearly identical coordinates, so only the last value of
# each group of similar values is kept.
# The jump threshold of 10 is not fixed and needs fine-tuning per image; it relates
# to the cell height (y-coordinate jump) and cell width (x-coordinate jump).
myxs = np.sort(xs)
for i in range(len(myxs) - 1):
    if myxs[i + 1] - myxs[i] > 10:
        mylistx.append(myxs[i])
mylistx.append(myxs[-1])  # add the last point as well

myys = np.sort(ys)
# print(np.sort(ys))
for i in range(len(myys) - 1):
    if myys[i + 1] - myys[i] > 10:
        mylisty.append(myys[i])
mylisty.append(myys[-1])  # add the last point as well

print('mylisty', mylisty)
print('mylistx', mylistx)

# NOTE: this path is cut off in the source (the executable name is missing).
pytesseract.pytesseract.tesseract_cmd = 'E:/Tesseract-OCR/'

# Loop over the y and x coordinates to split the table into cells
for i in range(len(mylisty) - 1):
    for j in range(len(mylistx) - 1):
        # When slicing, the first index is the y range and the second is the x range.
        # The 3-pixel offsets shrink the ROI.
        ROI = image[mylisty[i] + 3:mylisty[i + 1] - 3, mylistx[j]:mylistx[j + 1] - 3]
        cv2.imshow("Segmented sub-image:", ROI)
        cv2.waitKey(0)

        # special_char_list = '`~!@#$%^&*()-_=+[]{}|\\;:‘',。《》/?ˇ'
        text1 = pytesseract.image_to_string(ROI)  # read the text; the default language is English
        # text2 = ''.join([char for char in text2 if char not in special_char_list])
        print('Text recognized in the sub-image: ' + text1)
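The coordinate-grouping step is the easiest part of the script to get wrong, so here is a minimal pure-Python sketch of it. The function name `last_of_each_run` and the sample coordinates are assumptions made for illustration; the logic mirrors the script's loops: sort the values, keep a value whenever the next one is more than `gap` away, then append the final value.

```python
def last_of_each_run(values, gap=10):
    """Sort values and keep the last element of each run of nearby values.

    Two neighbours belong to the same run when they differ by at most `gap`.
    Hypothetical helper mirroring the two sorting loops in the script.
    """
    ordered = sorted(values)
    if not ordered:
        return []  # no intersections found; the script's loops would fail here
    picked = [a for a, b in zip(ordered, ordered[1:]) if b - a > gap]
    picked.append(ordered[-1])  # the last run has no following jump
    return picked


# Assumed sample: x coordinates of intersection pixels on three vertical lines.
sample_xs = [20, 21, 20, 21, 150, 151, 280, 281, 150, 281]
print(last_of_each_run(sample_xs))  # [21, 151, 281]
```

For this to separate lines correctly, `gap` presumably has to be larger than the line thickness and smaller than the narrowest cell, which matches the script's remark that the value 10 needs tuning per image.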
That concludes this article.