
Reading very large files in Python with the read(size) method

Reading files in Python is normally convenient, but a huge file is painful to handle, especially when the whole file is a single line, because readline() would pull that entire line into memory. Fortunately, the read(size) method reads at most size characters into memory on each call and returns an empty string at the end of the file.
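As a quick illustration of that behavior, the loop below processes a file chunk by chunk; the filename data.txt is only a placeholder:

# Minimal sketch of read(size); 'data.txt' is a hypothetical file
with open('data.txt', encoding='utf-8') as f:
    while True:
        chunk = f.read(4096)  # read at most 4096 characters
        if not chunk:         # an empty string means end of file
            break
        print(len(chunk))     # stand-in for real per-chunk processing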

Here is a full example that uses read(size) to split the stream on a custom separator:

def readlines(f, separator):
    '''
    Read a large file record by record.
    :param f: file handle
    :param separator: the separator between records
    :return: yields one record at a time
    '''
    buf = ''
    while True:
        while separator in buf:
            position = buf.index(separator)        # position of the separator
            yield buf[:position]                   # slice from the start up to the separator
            buf = buf[position + len(separator):]  # drop the yielded record, keep the rest
        chunk = f.read(4096)  # read up to 4096 characters into memory at once
        if not chunk:         # nothing left to read
            yield buf         # return whatever is still in buf
            break             # finished
        buf += chunk          # append the newly read data to buf

with open('test.txt', encoding='utf-8') as f:  # 'test.txt' is a placeholder filename
    for line in readlines(f, '|||'):
        # readlines can be driven by a for loop because it contains the
        # yield keyword, which makes it a generator function
        print(line)
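For comparison, the same chunked reading can be expressed with the two-argument form of the built-in iter(), which calls f.read(chunk_size) repeatedly until it returns the empty-string sentinel. This is only an equivalent sketch, not the article's code:

def readlines_iter(f, separator, chunk_size=4096):
    # iter(callable, sentinel) keeps calling f.read(chunk_size)
    # and stops as soon as it returns '' (end of file)
    buf = ''
    for chunk in iter(lambda: f.read(chunk_size), ''):
        buf += chunk
        while separator in buf:
            record, buf = buf.split(separator, 1)  # split off the first record
            yield record
    yield buf  # whatever remains after the last separator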

Test file

fgshfsljflsjfls|||fyhdiyfdfhn|||fudofdb Qin hardcore jdlfdl|||tedstthfdskfdk
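To reproduce the run, first write that content to the file; 'test.txt' matches the placeholder name used above:

# Create the test file; 'test.txt' is a placeholder name
with open('test.txt', 'w', encoding='utf-8') as f:
    f.write('fgshfsljflsjfls|||fyhdiyfdfhn|||fudofdb Qin hardcore jdlfdl|||tedstthfdskfdk')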

Printed output

fgshfsljflsjfls
fyhdiyfdfhn
fudofdb Qin hardcore jdlfdl
tedstthfdskfdk

That is all for this article. I hope it is helpful for your study, and thanks for your support.