SoFunction
Updated on 2024-10-29

Connecting Python to a database and drawing bar charts with matplotlib

I. Introduction to bar charts

(1) Introduction

A bar chart (also called a bar graph) is a statistical chart in which the length of each rectangle represents the value of a variable. It consists of a series of vertical bars of varying heights and is used to compare two or more values of a single variable (for example, at different times or under different conditions). It is usually applied to smaller datasets. Bar charts can also be drawn horizontally or extended to multiple dimensions.

(2) Advantages and disadvantages

Pros:

  • ① Helps the user understand the data and how the values relate to each other.
  • ② Lets the user read the underlying data more quickly and intuitively through visual symbols.

Drawbacks:

Bar charts are only suitable for small to medium-sized datasets.

(3) Scope of application

Bar charts suit two-dimensional datasets, where they are used to compare how the data changes over a period of time.

II. Presentation of data

(1) Data composition

The data for this bar chart comes from the order table (ORDER) in the database. The ORDER table contains twenty-one columns, including the order number (ORDER_ID), order date (ORDER_DATE), and store name (SITE).

(2) Data Selection

Following the definition and scope of the bar chart, the data chosen for this chart must be aggregated counts or sums that can be compared, so this time we use the sales manager (MANAGER) and the order profit (PROFIT).

Run the following SQL statement in Navicat to compute the total sales profit earned by each sales manager in 2019:

SELECT MANAGER, SUM(PROFIT) AS TotalProfit FROM orders WHERE FY='2019' GROUP BY MANAGER
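
To see what this aggregate query returns without access to the original database, here is a minimal sketch that runs the same statement against an in-memory SQLite table. SQLite stands in for MySQL here, and the manager names and profit figures are made up for illustration; only the column names MANAGER, PROFIT, and FY come from the query above. An ORDER BY clause is added so that the output order is predictable.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (MANAGER TEXT, PROFIT REAL, FY TEXT)")

# Made-up sample rows: (manager, profit, fiscal year).
sample_rows = [
    ("Alice", 100.0, "2019"),
    ("Alice", 50.0, "2019"),
    ("Bob", 70.0, "2019"),
    ("Bob", 999.0, "2018"),  # excluded by the FY='2019' filter
]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", sample_rows)

# Same aggregation as the article's query, plus ORDER BY for a stable order.
sql = (
    "SELECT MANAGER, SUM(PROFIT) AS TotalProfit "
    "FROM orders WHERE FY='2019' GROUP BY MANAGER ORDER BY MANAGER"
)
totals = conn.execute(sql).fetchall()
conn.close()

print(totals)  # [('Alice', 150.0), ('Bob', 70.0)]
```

Each row of the result is one manager with the sum of that manager's 2019 profits, which is exactly the pair of values a bar chart needs.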

III. Python database connection configuration and data extraction settings

(1) Importing libraries and connection syntax

If you don't have the pymysql library, install it by running pip install pymysql.

import pymysql
import pandas as pd  # Used to import data (pd.read_sql_query() executes an SQL statement and returns the result as a DataFrame)
import matplotlib.pyplot as plt  # Used to draw charts (plt.plot() line charts, plt.bar() bar charts, ...)
# 1. Connect to the MySQL database: create a database connection
# Replace the placeholder values with your own settings
conn = pymysql.connect(host='ip', port=3306, user='username', password='user password', db='database name')

(2) Explanation of the connection parameters

When the connection is created with pymysql.connect(), the parameters are as follows:

  • host: hostname or IP address of the database server
  • port: database port number (an integer); MySQL usually uses 3306
  • user: database user name
  • password: user password
  • db: database name
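
One way to keep these parameters out of the source code is to read them from environment variables. The sketch below is only an illustration: the helper names and the MYSQL_* variable names are hypothetical, and the defaults simply mirror the values used in this article. The pymysql import is placed inside open_connection() so that the configuration helper can be tried without a database.

```python
import os


def build_mysql_config(env=os.environ):
    """Collect pymysql.connect() keyword arguments from a mapping.

    The MYSQL_* names are hypothetical; adapt them to your environment.
    """
    return {
        "host": env.get("MYSQL_HOST", "localhost"),
        "port": int(env.get("MYSQL_PORT", "3306")),  # port must be an integer
        "user": env.get("MYSQL_USER", "root"),
        "password": env.get("MYSQL_PASSWORD", ""),
        "db": env.get("MYSQL_DB", "mydb"),
    }


def open_connection(config):
    """Open a MySQL connection from the dictionary built above."""
    import pymysql  # imported lazily; only needed when actually connecting

    return pymysql.connect(**config)


# Try the helper with an explicit mapping instead of the real environment.
config = build_mysql_config({"MYSQL_HOST": "127.0.0.1", "MYSQL_PORT": "3307"})
print(config["host"], config["port"])  # 127.0.0.1 3307
```

Calling open_connection(build_mysql_config()) would then create the connection, assuming a reachable MySQL server with those credentials.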

(3) Data extraction settings

Extracting data after connecting requires an SQL query against the database; this step also shows a simple way to work with a database from Python.

# 2. Create the SQL statement
# -- Total profit of each sales manager for 2019
sql = r"SELECT MANAGER, SUM(PROFIT) AS TotalProfit FROM orders WHERE FY='2019' GROUP BY MANAGER"
# 3. Execute the SQL statement to get the query result as a DataFrame
df = pd.read_sql_query(sql, conn)
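
If the fiscal year should be a variable rather than a fixed string, pd.read_sql_query() accepts a params argument. The following sketch assumes an in-memory SQLite database as a stand-in for MySQL, with made-up rows. Note that the placeholder syntax depends on the driver: SQLite uses ?, while pymysql uses %s.

```python
import sqlite3

import pandas as pd

# Stand-in database with made-up rows (assumption, not the article's data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (MANAGER TEXT, PROFIT REAL, FY TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Alice", 100.0, "2019"),
        ("Alice", 50.0, "2019"),
        ("Bob", 70.0, "2019"),
        ("Bob", 999.0, "2018"),
    ],
)

# '?' is SQLite's placeholder; with pymysql the placeholder would be '%s'.
sql = (
    "SELECT MANAGER, SUM(PROFIT) AS TotalProfit "
    "FROM orders WHERE FY = ? GROUP BY MANAGER ORDER BY MANAGER"
)
df = pd.read_sql_query(sql, conn, params=("2019",))
conn.close()  # the DataFrame stays usable after the connection is closed

print(df)
```

Passing the value through params lets the driver handle quoting, which is generally safer than building the SQL string by hand.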

IV. Global Variable Configuration

(1) Font and canvas configuration

When drawing with matplotlib, these font and canvas settings can be placed right after the library imports as fixed settings. Their parameters were covered in detail in the previous article on the plot() function.

plt.rcParams['font.sans-serif'] = 'SimHei'  # Set a Chinese font so that Chinese text displays correctly
plt.rcParams['axes.unicode_minus'] = False  # Display the minus sign '-' correctly when a Chinese font is used

# Figure resolution 600x400
plt.rcParams['figure.figsize'] = (6, 4)  # 6x4 inches
plt.rcParams['figure.dpi'] = 100  # 100 dots per inch
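
As a quick check of how the figure size in inches and the dpi combine into a pixel size, the sketch below creates a figure with a 6x4 inch canvas at 100 dots per inch and computes the resulting resolution. It uses plt.rc_context() so the settings apply only inside the with block, and the non-interactive Agg backend so that no window is needed; both are choices made for this illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no display required
import matplotlib.pyplot as plt

# Apply the settings temporarily instead of changing them globally.
with plt.rc_context({"figure.figsize": (6, 4), "figure.dpi": 100}):
    fig = plt.figure()
    width_in, height_in = fig.get_size_inches()
    # Pixel size = size in inches * dots per inch.
    width_px = round(width_in * fig.dpi)
    height_px = round(height_in * fig.dpi)

plt.close(fig)
print(width_px, height_px)  # 600 400
```

An 800x600 pixel figure at the same dpi would therefore need a figure size of (8, 6).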

(2) Title and label settings

title() sets the chart title, ylabel() and xlabel() set the labels for the y-axis and x-axis, and grid() configures the gridlines.

# Title and label settings
plt.title("Total profit per sales manager for 2019")
plt.ylabel("Amount of profit")
plt.xlabel('Manager')
# Gridline settings
plt.grid(axis='y')
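
The same settings can also be applied through an Axes object, which is convenient when a script contains more than one chart. This is a sketch of the equivalent object-oriented calls, not the style used in this article; the Agg backend is chosen only so the example runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no display required
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.set_title("Total profit per sales manager for 2019")  # like plt.title()
ax.set_ylabel("Amount of profit")                         # like plt.ylabel()
ax.set_xlabel("Manager")                                  # like plt.xlabel()
ax.grid(axis="y")                                         # horizontal gridlines only

# Read the settings back to confirm they were stored on the Axes.
title = ax.get_title()
y_label = ax.get_ylabel()
x_label = ax.get_xlabel()
plt.close(fig)

print(title, "|", y_label, "|", x_label)
```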

Gridline parameters:

plt.grid()  # With no arguments, toggles the gridlines (shows them if they are currently hidden)
plt.grid(1)  # Display gridlines (1 = True = show; 0 = False = hide)
plt.grid(True)  # Display gridlines
plt.grid(visible=True)  # Display gridlines (this keyword was named b before matplotlib 3.5)
plt.grid(visible=1)  # Display gridlines
plt.grid(visible=True, axis='x')  # Show only x-axis gridlines
plt.grid(visible=True, axis='y')  # Show only y-axis gridlines
plt.grid(visible=1, which='major')  # which defaults to 'major'; 'both' applies to major and minor ticks; with 'minor' nothing is drawn unless minor ticks are enabled, because minor ticks are off by default
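
The behavior of which='minor' is easier to see in a small experiment. The sketch below, which uses made-up data points, counts the minor tick positions on the x-axis before and after calling minorticks_on(): minor gridlines follow the minor ticks, so there is nothing to draw until the ticks exist. The visibility flag is passed positionally so that the example does not depend on the keyword name.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no display required
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])  # made-up data for illustration

# Minor ticks are off by default, so there are no minor positions yet.
minor_before = len(ax.xaxis.get_minorticklocs())

ax.minorticks_on()  # enable minor ticks on both axes
minor_after = len(ax.xaxis.get_minorticklocs())

# Style major and minor gridlines separately.
ax.grid(True, which="major", linewidth=0.8)
ax.grid(True, which="minor", linestyle=":", linewidth=0.5)

plt.close(fig)
print(minor_before, minor_after > 0)  # 0 True
```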

V. Plotting the database data

(1) Calling the plotting functions and drawing the chart

A for loop writes the value for each manager onto the chart:

# Display the value above each bar
for index, value in df['TotalProfit'].items():
    plt.text(index, value, round(value), ha='center', va='bottom', color='k')
# Pass the x and y values from the query result above
plt.bar(df['MANAGER'], df['TotalProfit'])

The resulting chart:

[Figure: bar chart of total profit per sales manager for 2019]
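
As an alternative to the plt.text() loop, matplotlib 3.4 and later provide Axes.bar_label(), which places a label on every bar of a bar container. The sketch below is a swapped-in technique rather than the method used in this article, and the DataFrame is a hand-built stand-in with made-up manager names and profits in place of the query result.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no display required
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for the query result (made-up values).
df = pd.DataFrame(
    {
        "MANAGER": ["Alice", "Bob", "Carol"],
        "TotalProfit": [1200.4, 860.7, 1510.2],
    }
)

fig, ax = plt.subplots()
bars = ax.bar(df["MANAGER"], df["TotalProfit"])

# Label each bar with its value rounded to a whole number.
annotations = ax.bar_label(bars, fmt="%.0f", color="k")
labels = [a.get_text() for a in annotations]

fig.savefig("profit_by_manager.png")  # hypothetical output file name
plt.close(fig)

print(labels)  # ['1200', '861', '1510']
```

bar_label() centers each label above its bar, so the ha and va arguments of the manual loop are not needed.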

(2) Full code

import pymysql
import pandas as pd  # Used to import data (pd.read_sql_query() executes an SQL statement and returns the result as a DataFrame)
import matplotlib.pyplot as plt  # Used to draw charts (plt.plot() line charts, plt.bar() bar charts, ...)
plt.rcParams['font.sans-serif'] = 'SimHei'  # Set a Chinese font so that Chinese text displays correctly
plt.rcParams['axes.unicode_minus'] = False  # Display the minus sign '-' correctly when a Chinese font is used
# Figure resolution 600x400
plt.rcParams['figure.figsize'] = (6, 4)  # 6x4 inches
plt.rcParams['figure.dpi'] = 100  # 100 dots per inch
# Establish the connection
conn = pymysql.connect(host='localhost', port=3306, user='root', password='9812yang', db='mydb')
# Set up the query statement
sql = r"SELECT MANAGER, SUM(PROFIT) AS TotalProfit FROM orders WHERE FY='2019' GROUP BY MANAGER"
# Execute the SQL statement and store the query result in df
df = pd.read_sql_query(sql, conn)
# Close the connection once the data has been read
conn.close()
# Draw the bar chart
plt.bar(df['MANAGER'], df['TotalProfit'])
# Show gridlines for the y-axis
plt.grid(axis='y')
# Set the title
plt.title("Total profit per sales manager for 2019")
# y-axis label
plt.ylabel("Amount of profit")
# x-axis label
plt.xlabel("Name of Manager")
# Write the corresponding value above each bar
for index, value in df['TotalProfit'].items():
    plt.text(index, value, round(value), ha='center', va='bottom', color='k')
# Display the chart
plt.show()
