SoFunction
Updated on 2025-03-02

Converting Chinese text to URL encoding in Python

This article describes how to convert Chinese text to URL (percent) encoding in Python. It is shared here for reference:

Today I was working with Baidu Tieba and wanted to keep a list of keywords, appending to it whenever a new one is needed. A Chinese keyword such as '丽江' (Lijiang) cannot go into a URL as-is, though: in the URL it appears percent-encoded as '%E4%B8%BD%E6%B1%9F', so a conversion is needed. The urllib module handles this. (The interactive sessions below are Python 2.)

>>> import urllib
>>> data = '丽江'
>>> print data
丽江
>>> data
'\xe4\xb8\xbd\xe6\xb1\x9f'
>>> urllib.quote(data)
'%E4%B8%BD%E6%B1%9F'

And how do we convert it back?

>>> urllib.unquote('%E4%B8%BD%E6%B1%9F')
'\xe4\xb8\xbd\xe6\xb1\x9f'
>>> print urllib.unquote('%E4%B8%BD%E6%B1%9F')
丽江
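
The sessions above use Python 2 syntax. As a minimal sketch of the same round trip in Python 3, assuming the standard library's urllib.parse module (where quote and unquote now live), it might look like this; the variable names are only illustrative:

```python
from urllib.parse import quote, unquote

keyword = '丽江'

# quote() encodes a str as UTF-8 by default, then percent-escapes the bytes
encoded = quote(keyword)
print(encoded)    # %E4%B8%BD%E6%B1%9F

# unquote() reverses the process, decoding the bytes as UTF-8 by default
decoded = unquote(encoded)
print(decoded)    # 丽江
```

In Python 3 the input and output are both text strings, so the raw byte string seen in the Python 2 session never appears.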

Careful readers will notice that Tieba URLs contain %C0%F6%BD%AD rather than '%E4%B8%BD%E6%B1%9F'. This is an encoding difference: Baidu uses GBK, while most other sites, such as Google, use UTF-8. The percent-encoded result depends on which encoding the string is converted to before quoting, so both forms can be produced with the following statements.

>>> import sys,urllib
>>> s = '丽江'
>>> urllib.quote(s.decode(sys.stdin.encoding).encode('gbk'))
'%C0%F6%BD%AD'
>>> urllib.quote(s.decode(sys.stdin.encoding).encode('utf8'))
'%E4%B8%BD%E6%B1%9F'
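
For comparison, here is a hedged Python 3 sketch of the same GBK versus UTF-8 distinction. It assumes urllib.parse.quote and unquote with their encoding parameter, which removes the need for the manual decode/encode step; the variable names are illustrative only:

```python
from urllib.parse import quote, unquote

keyword = '丽江'

# Choose the target encoding explicitly before percent-escaping
gbk_encoded = quote(keyword, encoding='gbk')
utf8_encoded = quote(keyword, encoding='utf-8')
print(gbk_encoded)     # %C0%F6%BD%AD
print(utf8_encoded)    # %E4%B8%BD%E6%B1%9F

# Decoding must use the same encoding that produced the escaped bytes
print(unquote(gbk_encoded, encoding='gbk'))    # 丽江
```

Decoding a GBK-escaped string with the default UTF-8 setting does not raise an error, because unquote substitutes replacement characters for invalid bytes, so a mismatch shows up as garbled text rather than an exception.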

I hope this article will be helpful to everyone's Python programming.