How to use Zlib compression and decompression in C#

Introduction

Recently, while developing a save-file editing tool for a game in C#, I needed to decompress data that had been compressed with the standard Zlib Deflate algorithm. After looking around online, I found that the most commonly used options in C# are the DeflateStream class provided by Microsoft, zlib.net, and SharpZipLib provided by ICSharpCode. I briefly tested and wrapped each of them, and I will share the results and my personal opinions here.

DeflateStream

Generally speaking, when developing in C#, it is best to use the tools provided by Microsoft wherever possible. First, they tend to have fewer bugs and more stable maintenance. Second, the official solutions are often better optimized than third-party ones.

Since .NET Framework 4.5, DeflateStream has also used zlib internally to compress and decompress the Deflate format, yet in my tests its output differed from that of other zlib libraries.
Look closely and you will find that data compressed with DeflateStream is missing the first two bytes and the last four bytes that zlib's output has: the two-byte zlib header and the four-byte Adler-32 checksum. This bare output format is called Raw Deflate.
In my tests, the DeflateStream provided by C# could only compress or decompress Raw Deflate and could not handle the standard zlib-wrapped Deflate format (although the wrapper can reportedly be generated by hand); conversely, zlib itself can both read and produce Raw Deflate without the header and trailer.
Of course, you can also add the header and trailer manually; see the reference materials at the end of the article for details. It was not particularly important for my purposes, so I did not bother.
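As a rough illustration of adding the header and trailer by hand, the sketch below frames DeflateStream output in the zlib container described in RFC 1950: a two-byte header, the Raw Deflate data, and a big-endian Adler-32 checksum of the uncompressed data. The names ZlibWrap, Compress, Decompress and Adler32 are hypothetical, and the fixed header bytes 0x78 0x9C are an assumption (one commonly seen valid pair). The decompression side also assumes the stream has no preset dictionary, so the header is exactly two bytes long.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class ZlibWrap // hypothetical helper, not part of any library
{
    // Adler-32 checksum as defined in RFC 1950.
    public static uint Adler32(byte[] data)
    {
        const uint Mod = 65521;
        uint a = 1, b = 0;
        foreach (byte value in data)
        {
            a = (a + value) % Mod;
            b = (b + a) % Mod;
        }
        return (b << 16) | a;
    }

    // Raw Deflate from DeflateStream, framed with a zlib header and trailer.
    public static byte[] Compress(byte[] data)
    {
        using (MemoryStream output = new MemoryStream())
        {
            output.WriteByte(0x78); // CMF: Deflate method, 32 KB window
            output.WriteByte(0x9C); // FLG: assumed value; makes the 16-bit header a multiple of 31
            // leaveOpen = true so the trailer can still be appended afterwards
            using (DeflateStream deflate = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflate.Write(data, 0, data.Length);
            }
            uint checksum = Adler32(data);
            output.WriteByte((byte)((checksum >> 24) & 0xFF)); // trailer is big-endian
            output.WriteByte((byte)((checksum >> 16) & 0xFF));
            output.WriteByte((byte)((checksum >> 8) & 0xFF));
            output.WriteByte((byte)(checksum & 0xFF));
            return output.ToArray();
        }
    }

    // Skips the 2-byte header, inflates the middle, then verifies the 4-byte trailer.
    public static byte[] Decompress(byte[] zlibData)
    {
        int n = zlibData.Length;
        using (MemoryStream input = new MemoryStream(zlibData, 2, n - 6))
        using (DeflateStream deflate = new DeflateStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            deflate.CopyTo(output);
            byte[] result = output.ToArray();
            uint expected = ((uint)zlibData[n - 4] << 24) | ((uint)zlibData[n - 3] << 16)
                          | ((uint)zlibData[n - 2] << 8) | (uint)zlibData[n - 1];
            if (Adler32(result) != expected)
                throw new InvalidDataException("Adler-32 checksum mismatch.");
            return result;
        }
    }
}
```

With a wrapper along these lines, the speed of DeflateStream is kept while the output should be readable by other zlib implementations; the checksum pass over the data is the only extra work.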

Here are my simple wrappers for compressing and decompressing data with this method:

using System.IO;
using System.IO.Compression;

// Compress using Deflate
public static byte[] MicrosoftCompress(byte[] data)
{
    MemoryStream uncompressed = new MemoryStream(data); // This example uses data in memory; to compress a file, use FileStream
    MemoryStream compressed = new MemoryStream();
    DeflateStream deflateStream = new DeflateStream(compressed, CompressionMode.Compress); // Note: the first parameter is where the compressed data is written
    uncompressed.CopyTo(deflateStream); // CopyTo feeds in all the data at once; you can also use Write for partial input
    deflateStream.Close();  // Close flushes the remaining data and finishes the stream
    byte[] result = compressed.ToArray();
    return result;
}

// Decompress using Deflate
public static byte[] MicrosoftDecompress(byte[] data)
{
    MemoryStream compressed = new MemoryStream(data);
    MemoryStream decompressed = new MemoryStream();
    DeflateStream deflateStream = new DeflateStream(compressed, CompressionMode.Decompress); // Note: the first parameter is again the compressed data, but this time it is the input
    deflateStream.CopyTo(decompressed);
    byte[] result = decompressed.ToArray();
    return result;
}

zlib.net

zlib.net is a very small open-source third-party library. From my limited research, it feels more like a half-finished product, and many of its features are incomplete. Its advantages are that it is very lightweight and produces the same results as boost::iostreams::zlib on the C++ side.

The following code compresses data using the ZOutputStream class it provides:

public static byte[] ZLibDotnetCompress(byte[] data)
{
    MemoryStream compressed = new MemoryStream();
    ZOutputStream outputStream = new ZOutputStream(compressed, 2); 
    outputStream.Write(data, 0, data.Length); // Write is used here to pass in the data to compress; you can also use CopyTo as above
    outputStream.Close();
    byte[] result = compressed.ToArray();
    return result;
}

The following code decompresses data using the ZInputStream class it provides:

public static byte[] ZLibDotnetDecompress(byte[] data, int size)
{
    MemoryStream compressed = new MemoryStream(data);
    ZInputStream inputStream = new ZInputStream(compressed);
    byte[] result = new byte[size];   // Since ZInputStream inherits BinaryReader instead of Stream, you have to prepare the output buffer in advance and then use read to obtain fixed-length data
    inputStream.read(result, 0, result.Length); // Note that read starts with a lowercase letter here
    return result;
}

With this library, decompressed data has to be obtained through read, and you must supply an output buffer in advance when calling it. Choosing the size of this buffer is a problem.
If you plan to use this library, I recommend storing the size of the data before compression alongside the compressed data, in a location that is not itself compressed.
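One way to carry the original size is to put it in front of the compressed bytes. The sketch below assumes a layout of a 4-byte little-endian length followed by the compressed data; the class LengthPrefix and its Pack and Unpack methods are hypothetical names, and the layout is my own choice for illustration, not something the library defines.

```csharp
using System;
using System.IO;

public static class LengthPrefix // hypothetical helper
{
    // Assumed layout: [4-byte little-endian original length][compressed bytes]
    public static byte[] Pack(int originalLength, byte[] compressed)
    {
        using (MemoryStream output = new MemoryStream())
        using (BinaryWriter writer = new BinaryWriter(output))
        {
            writer.Write(originalLength); // stored as-is, outside the compressed data
            writer.Write(compressed);
            writer.Flush();
            return output.ToArray();
        }
    }

    // Returns the compressed bytes and reports the original length via the out parameter.
    public static byte[] Unpack(byte[] packed, out int originalLength)
    {
        using (MemoryStream input = new MemoryStream(packed))
        using (BinaryReader reader = new BinaryReader(input))
        {
            originalLength = reader.ReadInt32();
            return reader.ReadBytes(packed.Length - 4);
        }
    }
}
```

On the reading side, the length returned by Unpack can then be passed as the size argument of ZLibDotnetDecompress, so the output buffer is exactly as large as it needs to be.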

But overall, I personally do not recommend using this tool.

/zyborg/
/zlib_.

SharpZipLib

In the end I chose SharpZipLib. (Edit: I had not run any speed tests at that point, the files I needed to decompress were not large, and speed did not matter much. Otherwise I would not recommend this option...)

As you would expect from ICSharpCode, the team that developed ILSpy, SharpZipLib is powerful and very convenient to use. To stay on topic, only compressing data streams in the Deflate format is covered here.

Simply put, you compress through DeflaterOutputStream and decompress through InflaterInputStream. Apart from compression and decompression being split into two different classes, the usage can be exactly the same as with DeflateStream.
Moreover, the compression and decompression results are exactly the same as when using the official Zlib library directly. When developing tools that work alongside other programs, you do not have to worry about the header and trailer, which is very convenient.

Here are my simple wrappers for this approach:

public static byte[] SharpZipLibCompress(byte[] data)
{
    MemoryStream compressed = new MemoryStream();
    DeflaterOutputStream outputStream = new DeflaterOutputStream(compressed);
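    outputStream.Write(data, 0, data.Length);
    outputStream.Close(); // Close finishes the stream so that all compressed data is written
    return compressed.ToArray();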
}

public static byte[] SharpZipLibDecompress(byte[] data)
{
    MemoryStream compressed = new MemoryStream(data);
    MemoryStream decompressed = new MemoryStream();
    InflaterInputStream inputStream = new InflaterInputStream(compressed);
    inputStream.CopyTo(decompressed);
    return decompressed.ToArray();
}

Speed comparison

To compare the compression and decompression speed of the three methods, I prepared two sets of data and ran a simple test.

The first set is short data: the simple string "this is just a string for testing, see how this compression thing works."
The second set is long data: the English text of "A Song of Ice and Fire: A Game of Thrones" as a txt file downloaded online, about 1.7 MB in size.

I used each method to compress and decompress short data 1000 times and long data 100 times respectively. The final result is as follows:
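A test like this can be timed with Stopwatch. The sketch below is not the harness that produced the numbers in this article; it is a minimal illustration, and the class RoundTripTimer and its Measure method are hypothetical names. It assumes each method is exposed as a byte[] to byte[] function, as the wrappers above are.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class RoundTripTimer // hypothetical helper
{
    // Runs compress followed by decompress the given number of times
    // and returns the total elapsed time in milliseconds.
    public static long Measure(
        Func<byte[], byte[]> compress,
        Func<byte[], byte[]> decompress,
        byte[] data,
        int iterations)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            byte[] restored = decompress(compress(data));
            // Cheap sanity check so a broken round trip does not go unnoticed
            if (restored.Length != data.Length)
                throw new InvalidDataException("Round trip changed the data length.");
        }
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }
}
```

For example, RoundTripTimer.Measure(MicrosoftCompress, MicrosoftDecompress, data, 1000) would time 1000 round trips through the DeflateStream wrappers. Running each method once before measuring helps keep JIT compilation out of the reported time.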

Length of Short Data: 144
Length of Long Data: 1685502

============================================
Compress and decompress with Microsoft Zlib Compression (1000 times): 54
Compress and decompress with Microsoft Zlib Compression (long data 100 times): 7924

============================================
Compress and decompress with zlib.net Compression (1000 times): 254
Compress and decompress with zlib.net Compression (long data 100 times): 9924

============================================
Compress and decompress with SharpZipLib Compression (1000 times): 442
Compress and decompress with SharpZipLib Compression (long data 100 times): 26782

Clearly, the method provided by Microsoft is faster than the other two for both long and short data.

zlib.net's speed disadvantage is not large, whereas SharpZipLib, using the same algorithm, takes two to three times as long.

Summary

Finally, as expected, the method provided by Microsoft has a clear advantage in speed. It does not produce the Deflate header and trailer, but you can generate them yourself, so this disadvantage can basically be ignored. zlib.net performs reasonably well in terms of speed and does generate the header and trailer, but its rough packaging makes it relatively inconvenient to use. SharpZipLib is a pity: it is convenient in every other respect, but its speed deficit is quite serious. It is only worth using as a shortcut when you need zlib-wrapped Deflate rather than Raw Deflate, or when the .NET Framework version in use is earlier than 4.5 (and running time does not matter).

References and further reading

About Zlib

About Deflate and Raw Deflate

/questions/37845440/net-deflatestream-vs-linux-zlib-difference
/rfc/
/rfc/

About CSharp

/en-us/dotnet/api/?view=net-5.0

Answers from Mark Adler, one of zlib's developers

The difference between deflate and compress functions

/questions/10166122/zlib-differences-between-the-deflate-and-compress-functions/10168441#10168441

How to manually add header and trailer
/questions/39939869/data-format-for-system-io-compression-deflatestream
