SoFunction
Updated on 2025-04-12

Detailed explanation of how to download very large files with JavaScript

This article approaches the problem from the front end and implements large-file downloading in the browser. Transfer interruptions caused by network failures, closing the page, and so on are not considered. Chunks are downloaded serially here (parallel downloading requires computing a hash for each chunk, comparing hashes, retransmitting lost chunks, merging chunks in order, and so on, which is quite troublesome; if transfer speed matters and bandwidth allows, parallel chunk downloading can be done).

Testing showed that after saving roughly one or two GB of data to IndexedDB, the browser's memory usage does indeed climb too high and the tab exits (I tested with Chrome 103).

Implementation steps

  • Chunked download: split the large file into multiple small pieces and download them, which reduces memory usage and the risk of network interruption. This avoids the performance problems of downloading the entire large file in one go.
  • Resumable download: implement breakpoint resume, i.e. if the download is interrupted partway, it can continue from the parts already downloaded instead of re-downloading the entire file.
  • Progress bar display: show the download progress on the page so users can clearly see how far along the download is. If the file is fetched in a single request, progress can be read directly from the response stream (very fine-grained). With chunked downloading, progress is still computed against the total file size, but it advances a chunk at a time (coarser).
  • Cancel and pause: provide buttons to cancel or pause the download, so users can abort or suspend the process as needed.
  • Merge files: after the download completes, merge all chunks into one complete file.
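The chunk-splitting step above can be sketched as a small pure helper that turns a file size into the inclusive byte spans used in HTTP Range headers (the function name is illustrative, not from the article's code):

```javascript
// Compute the inclusive byte ranges for each chunk of a file.
// fileSize and chunkSize are in bytes; the last chunk may be shorter.
function computeChunkRanges(fileSize, chunkSize) {
  const ranges = [];
  for (let start = 0; start < fileSize; start += chunkSize) {
    const end = Math.min(start + chunkSize, fileSize) - 1; // Range ends are inclusive
    ranges.push({ start, end });
  }
  return ranges;
}

// Example: a 5 MB file in 2 MB chunks yields 3 ranges
const ranges = computeChunkRanges(5 * 1024 * 1024, 2 * 1024 * 1024);
console.log(ranges.length); // 3
console.log(ranges[2]);     // { start: 4194304, end: 5242879 }
```

Each entry maps directly onto a `Range: bytes=start-end` request header.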

The following is a basic front-end implementation of large-file downloading:

You can add a callback function to the class to update external state; the example only invokes the callback after the download finishes.

class FileDownloader {
  constructor({ url, fileName, chunkSize = 2 * 1024 * 1024, cb }) {
    this.url = url;
    this.fileName = fileName;
    this.chunkSize = chunkSize;
    this.fileSize = 0;
    this.totalChunks = 0;
    this.currentChunk = 0;
    this.downloadedSize = 0;
    this.chunks = [];
    this.abortController = new AbortController();
    this.paused = false;
    this.cb = cb;
  }

  // Read the file size from the Content-Length header; only headers are needed
  async getFileSize() {
    const response = await fetch(this.url, { method: "HEAD", signal: this.abortController.signal });
    const contentLength = response.headers.get("content-length");
    this.fileSize = parseInt(contentLength, 10);
    this.totalChunks = Math.ceil(this.fileSize / this.chunkSize);
  }

  async downloadChunk(chunkIndex) {
    const start = chunkIndex * this.chunkSize;
    const end = Math.min(this.fileSize - 1, (chunkIndex + 1) * this.chunkSize - 1);

    const response = await fetch(this.url, {
      headers: { Range: `bytes=${start}-${end}` },
      signal: this.abortController.signal
    });

    const blob = await response.blob();
    this.chunks[chunkIndex] = blob;
    this.downloadedSize += blob.size;

    if (chunkIndex < this.totalChunks - 1) {
      // Advance first, so that resuming does not re-download this chunk
      this.currentChunk = chunkIndex + 1;
      if (!this.paused) {
        this.downloadChunk(this.currentChunk);
      }
    } else {
      this.mergeChunks();
    }
  }

  async startDownload() {
    if (this.fileSize === 0) {
      await this.getFileSize();
    }
    this.downloadChunk(this.currentChunk);
  }

  pauseDownload() {
    this.paused = true;
  }

  resumeDownload() {
    this.paused = false;
    this.downloadChunk(this.currentChunk);
  }

  cancelDownload() {
    this.abortController.abort();
    this.reset();
  }

  async mergeChunks() {
    const blob = new Blob(this.chunks, { type: "application/octet-stream" });
    const url = URL.createObjectURL(blob);
    const a = document.createElement("a");
    a.href = url;
    a.download = this.fileName;
    document.body.appendChild(a);
    a.click();
    setTimeout(() => {
      this.cb && this.cb({
        downState: 1
      });
      this.reset();
      document.body.removeChild(a);
      URL.revokeObjectURL(url);
    }, 0);
  }

  reset() {
    this.chunks = [];
    this.fileName = '';
    this.fileSize = 0;
    this.totalChunks = 0;
    this.currentChunk = 0;
    this.downloadedSize = 0;
  }
}


// Usage example
const url = "/";
const fileName = "";

const downloader = new FileDownloader({ url, fileName, cb: updateData });

// Callback to update external state
function updateData(res) {
  const { downState } = res;
  // e.g. write downState into component state to reflect completion
}

// Start downloading
downloader.startDownload();

// Pause download
// downloader.pauseDownload();

// Resume download
// downloader.resumeDownload();

// Cancel download
// downloader.cancelDownload();
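For the progress bar, a small helper (hypothetical, not part of the class above) can turn the accumulated downloadedSize into the chunk-by-chunk percentage described earlier:

```javascript
// Convert downloaded bytes into a whole-number percentage, clamped to 0-100.
// Returns 0 while the total file size is still unknown.
function downloadProgress(downloadedSize, fileSize) {
  if (!fileSize) return 0;
  return Math.min(100, Math.floor((downloadedSize / fileSize) * 100));
}

console.log(downloadProgress(0, 1000));    // 0
console.log(downloadProgress(512, 1000));  // 51
console.log(downloadProgress(1000, 1000)); // 100
```

Calling this after each chunk completes gives the coarser, chunk-granular progress; a single-request download would instead read fine-grained progress from the response stream.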

How can resumable (breakpoint) downloads be achieved with chunked downloading, and where are the downloaded chunks stored?

The browser's security policy prohibits web pages (JS) from directly accessing and manipulating the file system on the user's computer.

During a chunked download, each downloaded chunk needs to be cached or stored on the client. This is what makes resumable downloads possible, and it also makes it easy to later merge the chunks into the complete file. The chunks can be held temporarily in memory or kept in the client's local storage (such as IndexedDB or LocalStorage).

Generally speaking, to avoid taking up too much memory, it is advisable to keep the chunks in the client's local storage. This ensures that downloading a large file does not cause performance problems through excessive memory usage.

In the example code above, the chunks are held temporarily in an array and finally merged into the complete file in the mergeChunks() method. If you want to keep the chunks in local storage instead, you can modify the code to save them to IndexedDB or LocalStorage.
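With chunks persisted, resuming amounts to checking which chunk indices are already stored and fetching only the missing ones. A minimal sketch (the index-set representation is an assumption, not taken from the article's code):

```javascript
// Given the total chunk count and the indices already saved (e.g. read back
// from IndexedDB keys), return the indices that still need downloading.
function missingChunks(totalChunks, savedIndices) {
  const saved = new Set(savedIndices);
  const missing = [];
  for (let i = 0; i < totalChunks; i++) {
    if (!saved.has(i)) missing.push(i);
  }
  return missing;
}

// Example: 6 chunks, chunks 0-2 and 4 already stored, so fetch 3 and 5
console.log(missingChunks(6, [0, 1, 2, 4])); // [ 3, 5 ]
```

On resume, the downloader would iterate over the returned indices instead of starting again from chunk 0.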

IndexedDB local storage

IndexedDB documentation: IndexedDB_API

IndexedDB browser storage restrictions and cleaning standards

Incognito (private) mode is a privacy-protection feature of the browser. After the user closes the browser window, it automatically clears all browsing data, including data in LocalStorage, IndexedDB and other storage mechanisms.

IndexedDB data is actually stored in the browser's file system, in one of the browser's private directories; different browsers may use different storage locations. Ordinary users cannot directly access and manually delete these files because they are protected by the browser's security restrictions. You can use the deleteDatabase method to delete an entire database, or the deleteObjectStore method to delete a specific object store.

The native IndexedDB API is cumbersome to use, and all sorts of problems arise if you are not careful, so it is convenient to wrap it up for later use.

This class encapsulates the common IndexedDB operations: opening a database, adding data, getting data by id, getting all data, updating data, deleting data, and deleting an object store.
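The core of such a wrapper is turning IndexedDB's event-based request objects into Promises. A generic helper along these lines (illustrative; the class that follows inlines this pattern in each method) works for any IDBRequest-shaped object:

```javascript
// Wrap an IDBRequest-like object (anything with onsuccess/onerror handlers
// and result/error properties) in a Promise.
function promisifyRequest(request) {
  return new Promise((resolve, reject) => {
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// Simulated request object, so the pattern can be shown outside a browser;
// in a real page, IndexedDB itself fires the onsuccess event.
const fakeRequest = { result: 42, error: null };
const pending = promisifyRequest(fakeRequest);
fakeRequest.onsuccess();
pending.then((value) => console.log(value)); // 42
```

Using one shared helper keeps each wrapper method down to "open a transaction, issue the request, promisify it".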

Encapsulate indexedDB class

class IndexedDBWrapper {
  constructor(dbName, storeName) {
    this.dbName = dbName;
    this.storeName = storeName;
    this.db = null;
  }

  openDatabase() {
    return new Promise((resolve, reject) => {
      const request = indexedDB.open(this.dbName);

      request.onerror = () => {
        console.error("Failed to open database");
        reject(request.error);
      };

      request.onsuccess = () => {
        this.db = request.result;
        resolve();
      };

      request.onupgradeneeded = () => {
        this.db = request.result;

        if (!this.db.objectStoreNames.contains(this.storeName)) {
          this.db.createObjectStore(this.storeName, { keyPath: "id" });
        }
      };
    });
  }

  addData(data) {
    return new Promise((resolve, reject) => {
      const transaction = this.db.transaction([this.storeName], "readwrite");
      const objectStore = transaction.objectStore(this.storeName);
      const request = objectStore.add(data);

      request.onsuccess = () => {
        resolve();
      };

      request.onerror = () => {
        console.error("Failed to add data");
        reject(request.error);
      };
    });
  }

  getDataById(id) {
    return new Promise((resolve, reject) => {
      const transaction = this.db.transaction([this.storeName], "readonly");
      const objectStore = transaction.objectStore(this.storeName);
      const request = objectStore.get(id);

      request.onsuccess = () => {
        resolve(request.result);
      };

      request.onerror = () => {
        console.error(`Failed to get data with id: ${id}`);
        reject(request.error);
      };
    });
  }

  getAllData() {
    return new Promise((resolve, reject) => {
      const transaction = this.db.transaction([this.storeName], "readonly");
      const objectStore = transaction.objectStore(this.storeName);
      const request = objectStore.getAll();

      request.onsuccess = () => {
        resolve(request.result);
      };

      request.onerror = () => {
        console.error("Failed to get all data");
        reject(request.error);
      };
    });
  }

  updateData(data) {
    return new Promise((resolve, reject) => {
      const transaction = this.db.transaction([this.storeName], "readwrite");
      const objectStore = transaction.objectStore(this.storeName);
      const request = objectStore.put(data);

      request.onsuccess = () => {
        resolve();
      };

      request.onerror = () => {
        console.error("Failed to update data");
        reject(request.error);
      };
    });
  }

  deleteDataById(id) {
    return new Promise((resolve, reject) => {
      const transaction = this.db.transaction([this.storeName], "readwrite");
      const objectStore = transaction.objectStore(this.storeName);
      const request = objectStore.delete(id);

      request.onsuccess = () => {
        resolve();
      };

      request.onerror = () => {
        console.error(`Failed to delete data with id: ${id}`);
        reject(request.error);
      };
    });
  }

  deleteStore() {
    return new Promise((resolve, reject) => {
      // Deleting an object store requires an upgrade, i.e. a higher version
      const version = this.db.version + 1;
      this.db.close();

      const request = indexedDB.open(this.dbName, version);

      request.onupgradeneeded = () => {
        this.db = request.result;
        this.db.deleteObjectStore(this.storeName);
        resolve();
      };

      request.onsuccess = () => {
        resolve();
      };

      request.onerror = () => {
        console.error("Failed to delete object store");
        reject(request.error);
      };
    });
  }
}

Example using indexedDB class

const dbName = "myDatabase";
const storeName = "myStore";

const dbWrapper = new IndexedDBWrapper(dbName, storeName);

dbWrapper.openDatabase().then(() => {
  const data = { id: 1, name: "John Doe", age: 30 };

  dbWrapper.addData(data).then(() => {
    console.log("Data added successfully");

    dbWrapper.getDataById(1).then((result) => {
      console.log("Data retrieved:", result);

      const updatedData = { id: 1, name: "Jane Smith", age: 35 };
      dbWrapper.updateData(updatedData).then(() => {
        console.log("Data updated successfully");

        dbWrapper.getDataById(1).then((updatedResult) => {
          console.log("Updated data retrieved:", updatedResult);

          dbWrapper.deleteDataById(1).then(() => {
            console.log("Data deleted successfully");

            dbWrapper.getAllData().then((allData) => {
              console.log("All data:", allData);

              dbWrapper.deleteStore().then(() => {
                console.log("Object store deleted successfully");
              });
            });
          });
        });
      });
    });
  });
});

IndexedDB usage library - localforage

This library wraps several browser local-storage mechanisms and downgrades automatically. Its IndexedDB support has limits, though: you cannot add indexes. Still, the operations are indeed much more convenient.

Documentation: localforage

The following shows using LocalForage's IndexedDB storage engine together with async/await for asynchronous operations:

const localforage = require('localforage');

// Configure LocalForage
localforage.config({
  driver: localforage.INDEXEDDB, // use the IndexedDB storage engine
  name: 'myApp',                 // database name
  version: 1.0,                  // database version
  storeName: 'myData'            // storage table (object store) name
});

// Use async/await for asynchronous operations
(async () => {
  try {
    // Store data
    await localforage.setItem('key', 'value');
    console.log('Data saved successfully');

    // Get data
    const value = await localforage.getItem('key');
    console.log('The data obtained is:', value);

    // Remove data
    await localforage.removeItem('key');
    console.log('Data removal was successful');

    // LocalForage manages the underlying IndexedDB connection itself,
    // so no explicit close call is needed here.
  } catch (err) {
    console.error('Operation failed', err);
  }
})();

Modern browsers manage the lifecycle of IndexedDB connections automatically, including closing a connection when the page is closed, so in most cases there is no need to open or close connections explicitly.

If you have special needs or stricter performance requirements, you can call the native db.close() method (on the IDBDatabase object) to close a connection.

Use LocalForage to delete all data in IndexedDB

import localforage from 'localforage';

// Use the clear() method to delete all data
localforage.clear()
  .then(() => {
    console.log('All data in IndexedDB has been deleted');
  })
  .catch((error) => {
    console.error('An error occurred while deleting IndexedDB data:', error);
  });

The problem of temporarily high memory usage with IndexedDB

Using IndexedDB can increase the browser's memory usage for many reasons; here are some possibilities:

  • Data volume too large: if you store a lot of data in IndexedDB, the browser may need more memory to manage and process it. Memory usage rises noticeably when large amounts of data are read or written.
  • Unclosed connections: if database connections are not closed correctly after use, memory leaks can result. Make sure connections are closed properly when IndexedDB is no longer needed, so the memory they occupy is freed.
  • Indexes and queries: creating a large number of indexes or running complex queries increases the browser's memory footprint, especially on large data sets.
  • Caching: the browser may cache IndexedDB data to improve access speed, which can increase the memory footprint, especially after large-scale data operations.
  • Browser implementation: IndexedDB implementations differ between browsers, and some may use more memory when processing IndexedDB data.

To reduce memory usage, consider optimizing the data storage structure, using indexes judiciously, and avoiding keeping large data sets in memory for long periods. In addition, memory analysis with the browser's developer tools can help you pinpoint the specific cause of the growth and take targeted optimization measures.
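One practical mitigation from the list above, not keeping large data sets in memory, can be sketched as a simple buffering policy: accumulate chunks in memory and flush them to IndexedDB once a byte threshold is crossed. The class and its 64 MB default are illustrative assumptions, not from the article's code:

```javascript
// Accumulate chunk sizes and report when the in-memory buffer should be
// flushed to persistent storage (e.g. IndexedDB).
class ChunkBuffer {
  constructor(flushThreshold = 64 * 1024 * 1024) {
    this.flushThreshold = flushThreshold;
    this.bufferedBytes = 0;
    this.chunks = [];
  }

  // Returns true when the caller should flush the buffer now.
  add(chunk) {
    this.chunks.push(chunk);
    this.bufferedBytes += chunk.byteLength;
    return this.bufferedBytes >= this.flushThreshold;
  }

  // Hand the buffered chunks to the caller (to write out) and reset.
  drain() {
    const out = this.chunks;
    this.chunks = [];
    this.bufferedBytes = 0;
    return out;
  }
}

const buf = new ChunkBuffer(10); // tiny threshold just for demonstration
console.log(buf.add(new Uint8Array(4))); // false: 4 bytes buffered
console.log(buf.add(new Uint8Array(8))); // true: 12 bytes >= threshold
console.log(buf.drain().length);         // 2
```

The downloader would call drain() whenever add() returns true and write the drained chunks to IndexedDB in one transaction, keeping peak memory bounded by the threshold.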

The above is a detailed explanation of downloading very large files with JavaScript. For more on this topic, please see my other related articles!