SoFunction
Updated on 2024-10-29

Fixing file format errors when reading files in Jupyter

Reading an XML file with pandas raises the following error:

"Unsupported format, or corrupt file: Expected BOF record; found b'<?xml ve'"

Solution:

Convert the file format: open the XML file in Excel, select File -> Save As, and choose the target format in the dialog box that pops up.

After saving, read the file with pandas again, using the reader that matches the new format.
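Before converting, it can help to confirm what the file really contains. The following is a minimal sketch, not part of the original solution: the helper name sniff_spreadsheet_format is hypothetical, and it only inspects the first bytes of the file. It assumes that a legacy .xls file starts with the OLE2 signature and that an .xlsx file starts with the ZIP signature.

```python
# Leading bytes ("magic numbers") of formats that often sit behind an .xls name.
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy binary .xls container
ZIP_MAGIC = b"PK\x03\x04"                          # ZIP archive, e.g. .xlsx
UTF8_BOM = b"\xef\xbb\xbf"                         # optional marker before XML text


def sniff_spreadsheet_format(path):
    """Guess the real format of a file from its first bytes (hypothetical helper)."""
    with open(path, "rb") as f:
        head = f.read(64)
    if head.startswith(OLE2_MAGIC):
        return "xls"
    if head.startswith(ZIP_MAGIC):
        # Any ZIP archive matches here; "xlsx" is only the most likely guess.
        return "xlsx"
    if head.startswith(UTF8_BOM):
        head = head[len(UTF8_BOM):]
    if head.lstrip().startswith(b"<?xml"):
        return "xml"
    return "unknown"
```

A result of "xml" corresponds to the b'<?xml ve' shown in the error message: the file is XML text, so it needs to be re-saved from Excel as described above before a reader such as pd.read_excel() can open it.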

Supplement:

Fixing "'utf-8' codec can't decode byte 0xd5 in position 0: invalid continuation byte" when reading a CSV file in Jupyter

Import pandas: import pandas as pd

The following error occurs when reading a csv file using pd.read_csv():

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 0: invalid continuation byte

Cause:

The CSV file is encoded in GBK, not UTF-8. pd.read_csv() decodes the file as UTF-8 by default, so the GBK bytes cannot be decoded.
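The mismatch can be reproduced without pandas. The sketch below is illustrative only: the two sample bytes and the helper try_decode are assumptions chosen so that the first byte is 0xd5, as in the error message. In UTF-8, 0xd5 announces a two-byte sequence whose second byte must lie in the range 0x80 to 0xbf; 0xc5 does not, which is why the decoder reports an invalid continuation byte.

```python
# Two bytes as they might appear at the start of a GBK-encoded file (sample data).
raw = b"\xd5\xc5"


def try_decode(data, encoding):
    """Return (text, None) on success or (None, error) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


text_utf8, err_utf8 = try_decode(raw, "utf-8")  # fails: 0xc5 is not a continuation byte
text_gbk, err_gbk = try_decode(raw, "gbk")      # succeeds: one GBK character

print(err_utf8)
print(err_gbk is None)
```

The same bytes decode cleanly as GBK, so the data is not corrupt; only the assumed encoding is wrong.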

There are two ways to solve this:

The first:

1. Locate the CSV file, right-click it, choose "Open with", and select Notepad.

2. In Notepad, select "File" -> "Save As". The default encoding shown is ANSI; select UTF-8 and save a copy. Reading that copy with pd.read_csv() no longer raises the error.
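The Notepad steps above can also be sketched in code. This is a minimal sketch, not the original method: the function name reencode_to_utf8 is hypothetical, and it assumes the source file really is GBK (on Simplified Chinese Windows, Notepad's "ANSI" corresponds to code page 936, which is GBK).

```python
def reencode_to_utf8(src_path, dst_path, src_encoding="gbk"):
    """Write a UTF-8 copy of a text file stored in src_encoding (hypothetical helper)."""
    # newline="" keeps the original line endings untouched in both directions.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
```

The copy written to dst_path can then be read with pd.read_csv() without an encoding argument. The original file is left unchanged, which mirrors saving a copy from Notepad.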

The second:

Specify the encoding when reading the CSV file with pd.read_csv():

pd.read_csv(filename, encoding='gbk')

For example:
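A minimal sketch of the second approach follows. The file name sample_gbk.csv and its contents are made up so that the example is self-contained; with a real file, only the pd.read_csv() call is needed.

```python
import os
import tempfile

import pandas as pd

# Build a small GBK-encoded CSV (hypothetical sample data) in a temporary folder.
sample_path = os.path.join(tempfile.mkdtemp(), "sample_gbk.csv")
with open(sample_path, "w", encoding="gbk", newline="") as f:
    f.write("name,score\n张三,90\n")

# Tell pandas which encoding the file uses instead of relying on the UTF-8 default.
df = pd.read_csv(sample_path, encoding="gbk")
print(df)
```

If the file turns out not to be GBK, the same call raises an error or returns garbled text, so the encoding argument should match however the file was actually saved.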

The above is my personal experience. I hope it serves as a useful reference, and I appreciate your support. If there are any mistakes or anything I have not fully considered, please feel free to point them out.