SoFunction
Updated on 2025-03-03

Detailed explanation of file compression and decompression in Java component development

origin

Server-side projects have to deal with many kinds of data. To make uploads and downloads easier for users, data is sometimes packaged and transferred as a compressed archive, so sooner or later you run into compression and decompression. With so many open source components available, compression itself is not complicated: a few suitable Maven libraries will do the job. So why build a common component to handle it? Consider some typical requirements:

  • An archive contains many different files, and we only need to decompress certain types of them;
  • Only the files in a directory that meet certain conditions should be added to the archive;
  • There are many compression formats, and each one needs its own adaptation;
  • ...

Analyze these requirements and you will find that they are all general-purpose functions. Once they are encapsulated, business teams only need to make a simple call, which saves a lot of duplicated development.

design

The analysis above shows that the requirements fall roughly into two categories:

  • File filtering;
  • Compatibility with different compressed file types;

OK, now we can implement the component according to these needs.

How should file filtering be handled? A public component is not smart enough to understand the needs of every business team, and it does not have to be, because requirements change endlessly. How could one component cover them all? So the component can only expose the filtering decision to the caller, then take the filtered result and process it accordingly. Well... perfect! How do we implement that? You have probably thought of the file traversal API in Java 8. Yes, we had the same idea. We can follow that style and let the caller pass in a FileVisitor implementation, so anyone who already knows that API can use the component with almost nothing new to learn. That will also make the component easier to promote later.

Here is the code:

/**
 * Compression example, all files in the directory are compressed without filtering
 */
(new File(parent), new File(parent, ""));
/**
 * Decompression example, decompress all, no filtering is required
 */
(new File(parent,""), new File(parent,"test"), true);
/**
 * Compression example, filter by demand
 */
(new File(parent), new File(parent, ""), new FileVisitor<Path>() {
    @Override
    public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
        if (().toString().equals("testq")) {
            return FileVisitResult.SKIP_SUBTREE;
        }
        return ;
    }

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
        if (().toString().endsWith("zip")) {
            return ;
        }
        return ;
    }

    @Override
    public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
        return ;
    }

    @Override
    public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
        return ;
    }
});
/**
 * Decompression example, filter by demand
 */
(new File(path), new SimpleFileVisitor<>() {
                @Override
                public FileVisitResult preVisitDirectory(FileEntry dir, BasicFileAttributes attrs) throws IOException {
                    File file = new File(parent, ());
                    if (!()) {
                        ();
                    }
                    return (dir, attrs);
                }

                @Override
                public FileVisitResult visitFile(FileEntry fileEntry, BasicFileAttributes attrs) throws IOException {
                    ("visitFile:" + ());
                    if (().endsWith("jpg")){
                        File file = new File(parent, ());
                        (file);
                    }
                    return (fileEntry, attrs);
                }

                @Override
                public FileVisitResult visitFileFailed(FileEntry<?> file, IOException exc) throws IOException {
                    // Read the full path, inside the compressed package, of the entry that failed
                    String path2 = ();
                    return ;
                }
            });

How about that? Convenient and flexible to use, isn't it?
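To make the filtering idea concrete, here is a minimal sketch of how a component could hand the decision to a caller-supplied FileVisitor while it zips a directory. It uses only the JDK (Files.walkFileTree and ZipOutputStream). The class name FilteredZipSketch and its compress method are hypothetical and are not the component's real API. The sketch also assumes one convention of its own: a visitor that returns SKIP_SUBTREE from visitFile means "leave this file out and keep walking".

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.FileVisitor;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical name for illustration; not the component's real class.
class FilteredZipSketch {

    /**
     * Zips sourceDir into zipFile. The caller's visitor is consulted for every
     * directory and file, and its return value decides what gets archived.
     */
    static void compress(Path sourceDir, Path zipFile, FileVisitor<Path> filter) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(Files.newOutputStream(zipFile))) {
            Files.walkFileTree(sourceDir, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
                    // SKIP_SUBTREE from the caller makes walkFileTree skip the whole directory.
                    return filter.preVisitDirectory(dir, attrs);
                }

                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                    FileVisitResult decision = filter.visitFile(file, attrs);
                    if (decision == FileVisitResult.CONTINUE) {
                        // Entry names are relative to the source directory and always use '/'.
                        String name = sourceDir.relativize(file).toString().replace('\\', '/');
                        zip.putNextEntry(new ZipEntry(name));
                        Files.copy(file, zip);
                        zip.closeEntry();
                        return FileVisitResult.CONTINUE;
                    }
                    // Assumed convention: SKIP_SUBTREE on a file means "leave it out, keep walking".
                    return decision == FileVisitResult.SKIP_SUBTREE ? FileVisitResult.CONTINUE : decision;
                }

                @Override
                public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
                    return filter.visitFileFailed(file, exc);
                }

                @Override
                public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
                    return filter.postVisitDirectory(dir, exc);
                }
            });
        }
    }
}
```

A caller would pass a SimpleFileVisitor that returns SKIP_SUBTREE for a directory named "testq" and for files ending in "zip", and CONTINUE for everything else. If the archive is written inside the directory being compressed, the visitor has to exclude it, otherwise the walk would try to add the archive to itself.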

The file filtering problem is solved, so let's go a step further and support the more commonly used compressed file types. Writing the codecs ourselves is out of the question; don't forget that we are "API-calling engineers". To support common formats such as zip/7z/rar/jar/tar, we bring in the following open source libraries:

  • (zip/7z/rar/jar/tar);
  • commons-io ()

A short digression. The first library listed above deserves special praise for its significant performance advantage when filtering files during decompression. Where does the advantage come from? It lies in how each library handles the file streams that are to be skipped. The first library really skips them, much like the seek function of RandomAccessFile, whereas commons-io reads the skipped streams in full and simply does not process them. Which performs better? You are smart, so you already have the answer. I did consider changing the stream reading in commons-io into a skip. It is technically possible, but I am lazy... The rough idea is to use RandomAccessFile or SeekableByteChannel to implement the seek. RandomAccessFile is probably the easier route, since it is the same API the first library uses.
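The difference between seeking and reading through can be illustrated with the JDK's own zip classes. This is only an analogy under stated assumptions: java.util.zip.ZipFile and ZipInputStream stand in for the two libraries, and the class name SelectiveUnzipSketch and its methods are hypothetical. ZipFile looks entries up in the archive's central directory and opens only the ones you ask for, while ZipInputStream is sequential, so moving to the next entry means reading past the data of the current one.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.function.Predicate;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

// Hypothetical name for illustration; the JDK zip classes stand in for the real libraries.
class SelectiveUnzipSketch {

    /** Random access: rejected entries are never opened, so their data is not read. */
    static int extractBySeeking(Path archive, Path targetDir, Predicate<String> accept) throws IOException {
        int extracted = 0;
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory() || !accept.test(entry.getName())) {
                    continue;
                }
                try (InputStream in = zip.getInputStream(entry)) {
                    copyTo(in, targetDir, entry.getName());
                }
                extracted++;
            }
        }
        return extracted;
    }

    /** Sequential: getNextEntry() must read past the data of every rejected entry. */
    static int extractByStreaming(Path archive, Path targetDir, Predicate<String> accept) throws IOException {
        int extracted = 0;
        try (ZipInputStream in = new ZipInputStream(Files.newInputStream(archive))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                if (entry.isDirectory() || !accept.test(entry.getName())) {
                    continue;
                }
                // Files.copy stops at the end of the current entry and leaves the stream open.
                copyTo(in, targetDir, entry.getName());
                extracted++;
            }
        }
        return extracted;
    }

    private static void copyTo(InputStream in, Path targetDir, String entryName) throws IOException {
        Path target = targetDir.resolve(entryName).normalize();
        // Refuse entry names such as "../x" that would land outside the target directory.
        if (!target.startsWith(targetDir.normalize())) {
            throw new IOException("Entry escapes target directory: " + entryName);
        }
        Files.createDirectories(target.getParent());
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Both methods produce the same files for a filter such as name -> name.endsWith("jpg"). The difference is how much of the archive has to be read to get there, which matters most when a large archive contains only a few wanted entries.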

OK, we now support several compression formats, so the component can automatically pick the matching handler class for each file type, and the caller does not have to care at all, which is very nice! Admittedly I was lazy here too: the match is made on the file extension. A better way is to match on the file's own metadata, such as its magic number. But I'm just lazy, and the current approach is good enough for now, haha.
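As a rough sketch of the magic-number approach mentioned above, the following class inspects the first bytes of a file instead of its extension. The class and enum names are hypothetical. The signatures used are the well-known ones: "PK\3\4" for zip (jar files are zip archives and share it), "7z" followed by BC AF 27 1C for 7z, "Rar!" followed by 1A 07 for rar, and "ustar" at offset 257 for POSIX tar. Very old tar files without the "ustar" marker and empty zip archives are not recognized by this sketch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical name for illustration; not part of the component described in this article.
class ArchiveTypeSketch {

    enum ArchiveType { ZIP, SEVEN_Z, RAR, TAR, UNKNOWN }

    // Signature of a zip local file header; jar files start the same way.
    private static final byte[] ZIP = {'P', 'K', 3, 4};
    private static final byte[] SEVEN_Z = {'7', 'z', (byte) 0xBC, (byte) 0xAF, 0x27, 0x1C};
    // Common prefix of the RAR 4 and RAR 5 signatures.
    private static final byte[] RAR = {'R', 'a', 'r', '!', 0x1A, 0x07};
    // POSIX tar stores "ustar" at byte offset 257 of the first header block.
    private static final byte[] TAR = "ustar".getBytes(StandardCharsets.US_ASCII);
    private static final int TAR_MAGIC_OFFSET = 257;

    /** Detects the archive type from the leading bytes of a file. */
    static ArchiveType detect(byte[] header) {
        if (matchesAt(header, 0, ZIP)) return ArchiveType.ZIP;
        if (matchesAt(header, 0, SEVEN_Z)) return ArchiveType.SEVEN_Z;
        if (matchesAt(header, 0, RAR)) return ArchiveType.RAR;
        if (matchesAt(header, TAR_MAGIC_OFFSET, TAR)) return ArchiveType.TAR;
        return ArchiveType.UNKNOWN;
    }

    /** Reads just enough of the file to cover the tar marker, then detects the type. */
    static ArchiveType detect(Path file) throws IOException {
        byte[] header = new byte[TAR_MAGIC_OFFSET + TAR.length];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while (filled < header.length
                    && (n = in.read(header, filled, header.length - filled)) > 0) {
                filled += n;
            }
        }
        // Trim to the bytes actually read so short files cannot match stale zeros.
        return detect(Arrays.copyOf(header, filled));
    }

    private static boolean matchesAt(byte[] data, int offset, byte[] magic) {
        if (data.length < offset + magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[offset + i] != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A dispatcher could call detect(path) first and fall back to the file extension only when the result is UNKNOWN, so a renamed file still reaches the right handler.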
