SoFunction
Updated on 2025-04-11

How to read CSV files in Python without specifying the file encoding by hand

1. Background

CSV files come in many encodings; the most common are UTF-8, GBK, and ANSI (on Windows, the system's legacy code page). Normally we pass an encoding="xx" argument to open() when reading such a file, which means we have to know the encoding in advance. To avoid guessing, we can use chardet.detect() to detect the file's encoding before opening it.

  • Encoding detection: Detect the file's encoding automatically with chardet so the content is decoded correctly.
  • Exception handling: Catch decoding errors and fall back to reading the file as UTF-8.
  • CSV processing: Use the standard library csv module to read and print the contents of the CSV file, including the header row and the data rows.
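
To make the problem concrete, here is a minimal sketch of what happens when a file is opened with the wrong encoding. The sample rows, the temporary file name, and the helper names write_sample and read_rows are assumptions made for illustration; they are not part of the original script.

```python
import csv
import os
import tempfile

# Hypothetical sample rows containing non-ASCII text.
rows = [["name", "city"], ["张三", "北京"]]


def write_sample(path, encoding):
    """Write the sample rows to `path` using the given encoding."""
    with open(path, "w", encoding=encoding, newline="") as f:
        csv.writer(f).writerows(rows)


def read_rows(path, encoding):
    """Read every row of a CSV file with an explicit encoding."""
    with open(path, "r", encoding=encoding, newline="") as f:
        return list(csv.reader(f))


tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample_gbk.csv")
write_sample(sample_path, "gbk")

# Reading GBK bytes as UTF-8 fails as soon as the Chinese text is decoded.
try:
    read_rows(sample_path, "utf-8")
    wrong_encoding_failed = False
except UnicodeDecodeError:
    wrong_encoding_failed = True

# Reading with the encoding the file was written in gives the rows back.
right_rows = read_rows(sample_path, "gbk")
print(wrong_encoding_failed, right_rows)
```

This is the failure that encoding detection is meant to avoid: the same bytes decode cleanly once the right encoding name is supplied.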

2. Installing the libraries

Library    Use                              Installation
csv        Reading and writing CSV files    Built-in; no installation required
chardet    Detecting file encoding          pip install chardet

3. Core code

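①: Detect the file's encoding

def detect_encoding(file_path):
    # Read the raw bytes so chardet can analyse them
    with open(file_path, 'rb') as f:
        raw_data = f.read()
    # chardet.detect returns a dict that includes an 'encoding' key
    result = chardet.detect(raw_data)
    return result['encoding']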
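
As a cheap complement to statistical detection, a file that starts with a byte-order mark (BOM) can be identified from its first few bytes alone. This is common for CSV files exported from spreadsheet software as "UTF-8 with BOM". The sketch below uses only the standard library codecs module; the helper name sniff_bom and the sample files are hypothetical and are not part of the original script.

```python
import codecs
import os
import tempfile

# Byte-order marks and the Python codec names that consume them.
# The 4-byte UTF-32 marks come first so they are not mistaken for UTF-16.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(file_path):
    """Return a codec name if the file starts with a known BOM, else None."""
    with open(file_path, "rb") as f:
        head = f.read(4)  # the longest BOM is 4 bytes
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None


# Hypothetical sample files: one written with a BOM, one without.
tmp_dir = tempfile.mkdtemp()
bom_path = os.path.join(tmp_dir, "with_bom.csv")
plain_path = os.path.join(tmp_dir, "plain.csv")
with open(bom_path, "w", encoding="utf-8-sig") as f:
    f.write("a,b\n1,2\n")
with open(plain_path, "w", encoding="ascii") as f:
    f.write("a,b\n1,2\n")

bom_result = sniff_bom(bom_path)
plain_result = sniff_bom(plain_path)
print(bom_result, plain_result)
```

When sniff_bom returns None, the file has no BOM and a detector such as detect_encoding is still needed; GBK and BOM-less UTF-8 files fall into this case.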

②: Call detect_encoding to get the file's encoding, and fall back to UTF-8 if reading with it fails

def main():
    file_path = 'Create a new XLSX worksheet.csv'
    encoding = detect_encoding(file_path)

    try:
        read_csv(file_path, encoding)
    except UnicodeDecodeError:
        # If reading with the detected encoding fails, retry with UTF-8
        try:
            read_csv(file_path, 'utf-8')
        except Exception as e:
            print(f"An error occurred while reading a file: {e}")
    except Exception as e:
        print(f"An error occurred while reading a file: {e}")
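
The single UTF-8 fallback above can be generalised to a short list of candidate encodings tried in order. The following is a sketch of that alternative, not the article's method: the function name read_csv_with_fallback, the candidate list, and the sample file are all assumptions. UTF-8 is tried first because it is strict, whereas gb18030 (a superset of GBK) accepts many byte sequences and would often decode UTF-8 data into garbage without raising an error.

```python
import csv
import os
import tempfile

# Assumed candidate list, ordered from strictest to most permissive.
CANDIDATE_ENCODINGS = ["utf-8-sig", "gb18030"]


def read_csv_with_fallback(file_path, encodings=CANDIDATE_ENCODINGS):
    """Try each encoding in turn; return (encoding, rows) for the first success."""
    for encoding in encodings:
        try:
            with open(file_path, "r", encoding=encoding, newline="") as f:
                return encoding, list(csv.reader(f))
        except UnicodeDecodeError:
            continue  # this candidate cannot decode the file, try the next one
    # Last resort: decode as UTF-8 and replace undecodable bytes with U+FFFD.
    with open(file_path, "r", encoding="utf-8", errors="replace", newline="") as f:
        return "utf-8 (lossy)", list(csv.reader(f))


# Hypothetical sample data written once as GBK and once as UTF-8.
rows = [["name", "city"], ["张三", "北京"]]
tmp_dir = tempfile.mkdtemp()
gbk_path = os.path.join(tmp_dir, "sample_gbk.csv")
utf8_path = os.path.join(tmp_dir, "sample_utf8.csv")
for path, enc in [(gbk_path, "gbk"), (utf8_path, "utf-8")]:
    with open(path, "w", encoding=enc, newline="") as f:
        csv.writer(f).writerows(rows)

gbk_encoding, gbk_rows = read_csv_with_fallback(gbk_path)
utf8_encoding, utf8_rows = read_csv_with_fallback(utf8_path)
print(gbk_encoding, utf8_encoding)
```

The lossy last step is the only place where the encoding is truly ignored: errors="replace" never raises, but any byte it cannot decode is replaced, so the result should be treated as approximate.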
    

4. Complete code

# -*- coding: UTF-8 -*-
'''
 @Project : Test
 @File: test2_read_csv.py
 @IDE: PyCharm
 @Author: A while Xiaotianhuan (278865463@)
 @Date: 2025/3/1 21:40
 '''

import csv
import chardet


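def detect_encoding(file_path):
    # Read the raw bytes so chardet can analyse them
    with open(file_path, 'rb') as f:
        raw_data = f.read()
    # chardet.detect returns a dict that includes an 'encoding' key
    result = chardet.detect(raw_data)
    return result['encoding']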


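def read_csv(file_path, encoding):
    # newline='' lets the csv module handle line endings itself
    with open(file_path, 'r', encoding=encoding, newline='') as f:
        reader = csv.reader(f)
        head = next(reader)  # the first row is the header
        print("Header:", head)
        for row in reader:
            print(row)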


def main():
    file_path = 'Create a new XLSX worksheet.csv'
    encoding = detect_encoding(file_path)

    try:
        read_csv(file_path, encoding)
    except UnicodeDecodeError:
        # If reading with the detected encoding fails, retry with UTF-8
        try:
            read_csv(file_path, 'utf-8')
        except Exception as e:
            print(f"An error occurred while reading a file: {e}")
    except Exception as e:
        print(f"An error occurred while reading a file: {e}")


if __name__ == "__main__":
    main()
