This article shows, by example, how to use chardet in Python to determine the character encoding of data. It is shared for your reference; the details are as follows.
Using Python chardet to detect the encoding of strings and files
1. Download and install chardet
Download: /pypi/chardet
After downloading chardet, unzip the archive and put the chardet folder directly in your application's directory; you can then start using it with import chardet. Alternatively, copy the chardet folder into the Python system directory so that all of your Python programs can simply use import chardet.
You can also install it by running the setup script from the unpacked directory:
python setup.py install
2. Examples
In use, chardet.detect() returns a dictionary, where confidence is how confident the detector is in its result (a value between 0 and 1) and encoding is the name of the detected encoding.
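Because the confidence value can be low, and chardet reports an encoding of None when it cannot make a guess at all, it can help to check the dictionary before trusting it. The following is a minimal sketch of one way to do that. The helper name choose_encoding, the 0.5 threshold, and the utf-8 fallback are illustrative assumptions, not part of chardet.

```python
def choose_encoding(result, min_confidence=0.5, fallback="utf-8"):
    """Return the detected encoding, or `fallback` if the guess is weak.

    `result` is a dictionary shaped like the one chardet.detect() returns.
    `min_confidence` and `fallback` are illustrative defaults chosen for
    this sketch; they are not chardet settings.
    """
    encoding = result.get("encoding")
    confidence = result.get("confidence") or 0.0
    # chardet reports encoding=None when it cannot make a guess.
    if encoding is None or confidence < min_confidence:
        return fallback
    return encoding


# A high-confidence result is accepted as-is.
print(choose_encoding({"confidence": 0.99, "encoding": "GB2312"}))  # GB2312
# A failed detection falls back to the default.
print(choose_encoding({"confidence": 0.0, "encoding": None}))       # utf-8
```

A suitable threshold depends on the data; short inputs generally give the detector less to work with.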
(1) Detecting the encoding of a web page:
>>> import urllib
>>> rawdata = urllib.urlopen('/').read()
>>> import chardet
>>> chardet.detect(rawdata)
{'confidence': 0.98999999999999999, 'encoding': 'GB2312'}
(2) Detecting the encoding of a file:
import chardet
tt = open('c:\\', 'rb')
# read(5) also works here, but readlines() raises an error,
# because it returns a list of lines rather than a single byte string
ff = tt.read()
enc = chardet.detect(ff)
print(enc['encoding'])
tt.close()
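The readlines() error mentioned in the comment above comes from passing a list to chardet.detect(), which expects one byte string. If the data is already available as a list of lines, joining the lines first avoids the problem. The sketch below is written in Python 3 syntax and is only one possible approach. The helper names detect_lines and detect_file and the optional detect argument are hypothetical, introduced here so that the logic can be exercised without chardet installed.

```python
def detect_lines(lines, detect=None):
    """Run encoding detection on a list of byte-string lines.

    `detect` is any function that behaves like chardet.detect();
    if it is omitted, chardet itself is imported and used.
    """
    if detect is None:
        import chardet  # requires chardet to be installed
        detect = chardet.detect
    # Join the lines into the single byte string that detect() expects.
    data = b"".join(lines)
    return detect(data)


def detect_file(path, detect=None):
    """Open `path` in binary mode and detect its encoding line by line."""
    # The with statement closes the file even if detection fails.
    with open(path, "rb") as f:
        return detect_lines(f.readlines(), detect)
```

The file is opened in binary mode ('rb'), as in the example above, because detection has to work on the raw bytes before any decoding takes place.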
I hope this article helps you with your Python programming.