I. Download and use of third-party modules
1. What is a third-party module
- Third-party modules are modules written by other developers rather than shipped with Python
- They often provide more powerful features than the built-in modules
2. How to install third-party modules
Way 1: pip tool
Installation steps:
1. Open the installation directory of the Python interpreter and find the Scripts directory. It contains the pip program, which is used to install third-party modules.
2. Add the Scripts directory of the Python version you use to the system PATH environment variable.
3. Open a cmd command prompt window and run the command that downloads the third-party module.
Commands for downloading third-party modules:
Download a third-party module:
pip install module_name
Download a module from a different repository for this one command:
pip install module_name -i repository_address
Download a specific version of a module (the latest version is installed if none is specified):
pip install module_name==version_number -i repository_address
Caution:
A computer may have several versions of the Python interpreter installed, and each version has its own pip. A module must be installed with the pip that belongs to the interpreter you actually run; otherwise that interpreter cannot import the module.
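One way to avoid mixing up interpreters is to call pip through the interpreter itself (python -m pip) instead of a bare pip command. The sketch below builds such a command for whichever interpreter runs the script. The helper name build_pip_command, its parameters and the version number shown are assumptions made for illustration, not part of pip.

```python
import sys


def build_pip_command(module_name, version=None, index_url=None):
    """Build a pip install command tied to the running interpreter.

    build_pip_command is a hypothetical helper for illustration only.
    """
    # "module==version" pins a version; a bare name lets pip pick the latest.
    target = f"{module_name}=={version}" if version else module_name
    # sys.executable is the full path of the interpreter running this script,
    # so "-m pip" always uses the pip that belongs to that interpreter.
    command = [sys.executable, "-m", "pip", "install", target]
    if index_url:
        command += ["-i", index_url]  # same meaning as "pip install ... -i"
    return command


print(build_pip_command("requests"))
print(build_pip_command("openpyxl", version="3.1.2"))
```

To actually run the installation, the returned list could be passed to subprocess.run(command, check=True).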
Way 2: Download in PyCharm
Installation steps:
1. In PyCharm, click File in the upper left corner.
2. Find Settings in the menu that opens.
3. Find Project and open Python Interpreter.
4. Click the '+' sign and enter the name of the module you want to download in the search box at the top of the window.
5. Click Install Package and wait for the download to complete.
Caveats:
1. Tick Specify version on the right side of the installation window to choose the version of the module to download.
2. Manage Repositories lets you configure the repository address.
3. Cautions
1. Error report with a warning message
WARNING: You are using pip version 20.2.1;
- The cause is that the pip version is too old. Copy the command that follows the warning and run it to update pip.
d:\python38\ -m pip install --upgrade pip
- After the update completes, run the command to download the third-party module again.
2. Error report containing a keyword
Timeout
- This keyword indicates that the network is currently unstable. Switch to another network, or wait for the network to stabilize and download again.
3. Error report without a recognizable keyword
Search for the error online
- Copy the error message into a search engine such as Baidu.
- Usually some prerequisites must be set up in advance before the module can be downloaded successfully.
4. Slow download speed
The default download address of pip is an overseas server, so switching the download address can speed things up.
- The way to switch download addresses is explained above (the -i option).
- Commonly used download addresses
Tsinghua University: /simple/
Aliyun: /pypi/simple/
University of Science and Technology of China: /simple/
Huazhong University of Science and Technology: /
Douban source: /simple/
Tencent source: /pypi/simple
Huawei mirror source: /repository/pypi/simple/
II. The requests module for web crawling
1. Introduction
- The requests module is a third-party module and must be installed separately.
- It is a module for making web requests, mainly used to send HTTP requests to a web server in the way a browser does
2. Usage
Import the module:
import requests
1. Keywords: get( )
Function: Sends a GET request to the given URL and returns a response object
Code Usage:
url = ""
res = requests.get(url)
2. Keywords: encoding
Function: Specifies the encoding used to decode the page. On some old websites the fetched text may be garbled unless the encoding is set explicitly. If it is not set, requests guesses the encoding from the response headers.
Code Usage:
res.encoding = 'utf8'
3. Keywords: content
Function: Returns the raw response body as bytes
Code Usage:
print(res.content)  # The raw response body, of type bytes
4. Keywords: text
Function: Returns the page data as a string, decoded with res.encoding
Code Usage:
print(res.text)  # The page data as a string
5. Keywords: url
Function: Returns the URL that was requested
Code Usage:
print(res.url)
6. Keywords: status_code
Function: Returns the HTTP status code of the response
Code Usage:
print(res.status_code)
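The relationship between content, encoding and text can be tried without a network connection. The sketch below uses only the standard library: the byte string stands in for res.content from an old page encoded as GBK, and decoding it stands in for what res.text does with res.encoding. The sample text and the choice of GBK are assumptions made for illustration.

```python
# Stand-in for res.content: the raw bytes a legacy GBK-encoded page might send.
raw_bytes = "二手房".encode("gbk")

# Decoding with the wrong encoding gives garbled text, like res.text on a
# page whose encoding was guessed incorrectly.
garbled = raw_bytes.decode("utf8", errors="replace")

# Decoding with the right encoding recovers the text, like setting
# res.encoding = 'gbk' before reading res.text.
readable = raw_bytes.decode("gbk")

print(type(raw_bytes))
print(garbled == "二手房")
print(readable == "二手房")
```

This is why setting res.encoding before reading res.text matters on pages that do not declare their encoding.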
III. Web crawler practice
1. Crawl housing information from the Lianjia website
import re
import requests

# Open the output file once so that every page is written to the same file
with open(r'sh_.txt', 'w', encoding='utf8') as f:
    for i in range(1, 101):
        url = f"/ershoufang/pg{i}/"  # Page i of the second-hand housing list
        print(url)
        res = requests.get(url)
        url_data = res.text
        # Each findall returns one list entry per listing on the page
        home_biaoqian = re.findall("""data-is_focus="" data-sl="">(.*?)</a>""", url_data)
        home_xiaoqu_name = re.findall("""<a href="https:.*?" rel="external nofollow" target="_blank" data-log_index=".*?" data-el=".*?">(.*?)</a>""", url_data)
        home_xiaoqu_dir = re.findall("""<a href="/ershoufang/.*?/" rel="external nofollow" target="_blank">(.*?)</a>""", url_data)
        home_jutixinxi = re.findall("""<div class="houseInfo"><span class="houseIcon"></span>(.*?)</div>""", url_data)
        home_guanzhudu = re.findall("""<div class="followInfo"><span class="starIcon"></span>(.*?)</div>""", url_data)
        home_zongjia = re.findall("""<span class="">(.*?)</span>""", url_data)
        home_danjia = re.findall("""<span>(.*?)</span>""", url_data)
        # Combine the fields of each listing into one tuple
        home_data = zip(home_xiaoqu_name, home_xiaoqu_dir, home_biaoqian, home_jutixinxi, home_guanzhudu, home_zongjia, home_danjia)
        for home in home_data:
            f.write('''
Community name: %s
Community address: %s
Listing label: %s
Details: %s
Attention: %s
Total price: %s
Unit price: %s
''' % home)
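The extraction step of the crawler can be tried offline. The sketch below runs re.findall and zip on a small hand-written HTML fragment; the fragment, its class names and the values in it are invented for illustration and are not the real page markup.

```python
import re

# Hypothetical HTML fragment standing in for res.text.
sample_html = """
<div class="title"><a href="#">Bright two-bedroom flat</a></div>
<div class="houseInfo"><span class="houseIcon"></span>2 rooms | 89 sqm</div>
<div class="title"><a href="#">Riverside studio</a></div>
<div class="houseInfo"><span class="houseIcon"></span>1 room | 45 sqm</div>
"""

# (.*?) is a non-greedy group: findall returns only the captured text,
# one list entry per match, in page order.
titles = re.findall(r'<div class="title"><a href="#">(.*?)</a></div>', sample_html)
details = re.findall(
    r'<div class="houseInfo"><span class="houseIcon"></span>(.*?)</div>',
    sample_html,
)

# zip pairs the n-th title with the n-th detail, one tuple per listing.
for title, detail in zip(titles, details):
    print("Title: %s\nDetails: %s\n" % (title, detail))
```

zip stops at the shortest list, so if one pattern misses a listing the fields of later listings shift silently. Comparing the lengths of the lists before zipping is a cheap guard.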
IV. The openpyxl module
1. Introduction
- openpyxl is a third-party Python module for working with Excel files. Two other well-known modules are xlrd and xlwt, which handle reading and writing Excel files respectively and work with the older .xls format. openpyxl may not be well compatible with files from Excel 2003 and earlier, but it is more powerful.
Caution:
Version and suffix of Excel files
Up to and including Excel 2003, Excel files use the .xls suffix.
Later versions use the .xlsx suffix; Excel can also open .csv files.
2. Creating files
2.1. Create an Excel file
Keywords: Workbook
Role:
- Use openpyxl to create the file. Simply import the Workbook class and instantiate it
Code Usage:
from openpyxl import Workbook
wb = Workbook()  # An Excel file object is created in memory
2.2. Create a worksheet
Keywords: create_sheet( )
Role:
- create_sheet creates a worksheet with a custom name. The first parameter is the name of the worksheet; the optional second parameter is its position among the worksheets, and if it is omitted the worksheet is added at the end.
- Multiple worksheets can be created, and they are arranged in sequence
Code Usage:
from openpyxl import Workbook
wb = Workbook()
ws1 = wb.create_sheet('Worksheet name', 0)
A default worksheet named 'Sheet' is generated automatically when the Workbook is created
2.3. Modify the name of a worksheet
Keywords: title
Role:
- Changes the name of a worksheet
- Assign the new name to the title attribute of the worksheet object
Code Usage:
from openpyxl import Workbook
wb = Workbook()
ws1 = wb.create_sheet('User information table', 0)
ws1.title = 'user_infor'
2.4. Modify the tab color of a worksheet
Keywords: sheet_properties.tabColor
Role:
- Modifies the color of the worksheet tab; the color is given as an RRGGBB hexadecimal code
Code Usage:
from openpyxl import Workbook
wb = Workbook()
ws1 = wb.create_sheet('User information table', 0)
ws1.sheet_properties.tabColor = 'FF6666'  # Set the tab color to the given RRGGBB color code
2.5. View all worksheets of the file
Keywords: sheetnames
Role:
- Returns a list with the names of all worksheets in an Excel file
Code Usage:
from openpyxl import Workbook
wb = Workbook()
ws1 = wb.create_sheet('user_zhangzhang')
ws2 = wb.create_sheet('user_kangkang')
print(wb.sheetnames)  # ['Sheet', 'user_zhangzhang', 'user_kangkang']
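The worksheet operations in this section can be combined into one short run. The sketch below shows how the position argument of create_sheet affects the order reported by sheetnames; the sheet names 'users' and 'orders' are assumptions chosen for illustration.

```python
from openpyxl import Workbook

wb = Workbook()                      # starts with one default sheet, "Sheet"
first = wb.create_sheet("users", 0)  # index 0: placed before "Sheet"
last = wb.create_sheet("orders")     # no index: added at the end

first.title = "user_infor"                  # rename through the title attribute
first.sheet_properties.tabColor = "FF6666"  # RRGGBB tab color

print(wb.sheetnames)  # ['user_infor', 'Sheet', 'orders']
```

Nothing is written to disk here; the workbook only exists in memory until save() is called.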
3. Writing content
3.1. Write/modify content
Keywords: ws[]
Role:
- Index the worksheet with a cell coordinate in square brackets and assign a value to write or modify that cell
Code Usage:
from openpyxl import Workbook
wb = Workbook()
ws1 = wb.create_sheet('user_zhangzhang')
ws1['A1'] = 'kangkang'  # ws1['A1'] is cell A1 of the worksheet; data can be assigned to it directly
Keywords: cell( )
Role:
- Call cell() on the worksheet and pass the position and content as parameters
- row: row number
- column: column number
- value: value (data for the corresponding position)
Code Usage:
from openpyxl import Workbook
wb = Workbook()
ws1 = wb.create_sheet('user_zhangzhang')
ws1.cell(row=1, column=1, value='kangkang')  # Write 'kangkang' into the first row, first column
3.2. Write multiple values at once
Keywords: append()
Role:
- Writes a whole row at once. The values are passed as a list, and each call adds one row below the existing data
Code Usage:
from openpyxl import Workbook
wb = Workbook()
ws = wb.create_sheet('user_name', 0)
ws.append(['Serial number', 'Name', 'Gender', 'Age'])
ws.append(['1', 'kangkang', 'Male', '18'])
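The three ways of writing can be mixed freely on one worksheet. The sketch below is a minimal example under assumed sample data; it uses wb.active, the openpyxl property that returns the currently active worksheet (here the default one), and prints the rows afterwards to show the result.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active  # the default worksheet

ws.append(["Serial number", "Name", "Gender", "Age"])  # row 1
ws.append([1, "kangkang", "Male", 18])                 # row 2
ws["B2"] = "lili"                  # overwrite one cell by coordinate
ws.cell(row=3, column=1, value=2)  # write row 3, column A

for row in ws.values:
    print(row)
```

Cells that were never written, such as B3 to D3 here, are reported as None.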
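4. Reading content
4.1. Access multiple cells
Usage: Use slicing directly on the worksheet
Code Usage:
x = ws['A1':'C2']  # Cells in the rectangle from A1 to C2
x1 = ws['C']  # All cells in column C
x2 = ws['C:D']  # Columns C through D
x3 = ws[10]  # All cells in row 10
x4 = ws[5:10]  # Rows 5 through 10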
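The slicing forms return tuples of cell objects rather than plain values, so the values still have to be read from each cell. The sketch below shows the shape of three of these results on a small worksheet; the sample rows are assumptions made for illustration.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(["Serial number", "Name", "Gender", "Age"])
ws.append([1, "kangkang", "Male", 18])
ws.append([2, "lili", "Female", 19])

block = ws["A1":"C2"]  # tuple of rows, each row a tuple of cells
column_b = ws["B"]     # tuple of the cells in column B
row_2 = ws[2]          # tuple of the cells in row 2

print([[cell.value for cell in row] for row in block])
print([cell.value for cell in column_b])
print([cell.value for cell in row_2])
```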
4.2. Get the values of cells
Keywords: values
Role:
- Iterate over ws.values with a for loop to get the values of one whole row at a time
Code Usage:
from openpyxl import Workbook
wb = Workbook()
ws1 = wb.create_sheet('user_name', 0)
ws1.append(['name', 'age', 'hobby'])
ws1.append(['kangkang', 18, 'read'])
for i in ws1.values:
    print(i)
5. Saving files
- When we are done editing, we need to save the file and give it a name
Keywords: save( )
Role:
- Pass the path where the file will be saved, including the file name, as the parameter.
Code Usage:
wb = Workbook()
wb.save('user_infor.xlsx')  # The argument is the save path, including the file name
# Changes only take effect on disk after saving.
6. Open a file
The previous sections described how to create a file and edit it.
What follows is how to open an existing file and work with it
Keywords: load_workbook
Module import:
from openpyxl import load_workbook
6.1. Reading worksheet data
Way one:
from openpyxl import load_workbook
wb = load_workbook('ex_a.xlsx', read_only=True, data_only=True)
ws = wb['User information table']  # Get a worksheet by name
print(ws['A1'].value)  # Fetch the value of cell A1 in the worksheet
Way two:
from openpyxl import load_workbook
wb = load_workbook('ex_a.xlsx', read_only=True, data_only=True)
ws = wb['User information table']
print(ws.cell(row=2, column=1).value)  # Fetch a value by row and column
Way three (row-by-row reading):
from openpyxl import load_workbook
wb = load_workbook('ex_a.xlsx', read_only=True, data_only=True)
ws = wb['User information table']
for row in ws.rows:  # Get each row
    for data in row:  # Get each cell in the row
        print(data.value)  # Print the cell value
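Saving and loading can be checked together with a round trip. The sketch below writes a small workbook into a temporary folder and reads it back row by row; the temporary folder and the sample rows are assumptions for illustration, and iter_rows(values_only=True) is used here as an alternative to reading .value from each cell.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "ex_a.xlsx")  # temporary file for the demo

    # Create and save a small workbook.
    wb = Workbook()
    ws = wb.active
    ws.title = "User information table"
    ws.append(["name", "age", "hobby"])
    ws.append(["kangkang", 18, "read"])
    wb.save(path)

    # Open it again and read every row as a tuple of plain values.
    loaded = load_workbook(path, read_only=True, data_only=True)
    sheet = loaded["User information table"]
    rows = list(sheet.iter_rows(values_only=True))
    loaded.close()  # read-only workbooks keep the file open until closed

print(rows)
```

With read_only=True the file cannot be modified through the loaded object; omit that argument when the goal is to edit the file and save it again.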