archive_types

Differences

This shows you the differences between two versions of the page.

archive_types [2024/04/14 16:24] – [What's the difference between the ZSTD levels?] johnsanc
archive_types [2024/04/27 09:15] (current) – [Table] johnsanc
Line 1: Line 1:
 ====== Archive Types ====== ====== Archive Types ======
-RomVault supports several different archive options for your ROM sets. Unlike most other ROM managers, RomVault focuses on consistency and normalizes archives. Everyone's archive for a particular set of files will be 100% hash equal if the same archive options are used.
+RomVault supports several different archive options for your ROM sets. Unlike most other ROM managers, RomVault focuses on consistency and normalizes archives. All archives that RomVault creates are deterministic. Everyone's archive for a particular set of files will be 100% hash equal if the same archive options are used.
  
 ^ Icon                                                      ^ Type                  ^ Compression  ^ Date Normalization   ^ Pros                                                                                                    ^ Cons                                                                                      ^ Recommended for...                                                                                                       ^ ^ Icon                                                      ^ Type                  ^ Compression  ^ Date Normalization   ^ Pros                                                                                                    ^ Cons                                                                                      ^ Recommended for...                                                                                                       ^
Line 7: Line 7:
 | {{ :set:ziptdc.png?50&nolink |tdc zip}}                   | TDC Zip               | Deflate (9)  | Specified by DAT     | A new normalized standard that preserves dates                                                          | Archives may not match existing sources                                                   | Total DOS Collection. Automatically used with DOSCenter DATs.                                                            | | {{ :set:ziptdc.png?50&nolink |tdc zip}}                   | TDC Zip               | Deflate (9)  | Specified by DAT     | A new normalized standard that preserves dates                                                          | Archives may not match existing sources                                                   | Total DOS Collection. Automatically used with DOSCenter DATs.                                                            |
 | {{ :set:zipzstd.png?50&nolink |zstd zip}}                 | ZSTD Zip              | ZSTD (19)    | Null                 | Compression ratio between Deflate and LZMA, very fast decompression, multithreaded                      | Not widely supported yet, but works with WinRAR and 7-zip                                 | People who want a better compression ratio and faster decompression than Deflate.                                        | | {{ :set:zipzstd.png?50&nolink |zstd zip}}                 | ZSTD Zip              | ZSTD (19)    | Null                 | Compression ratio between Deflate and LZMA, very fast decompression, multithreaded                      | Not widely supported yet, but works with WinRAR and 7-zip                                 | People who want a better compression ratio and faster decompression than Deflate.                                        |
-| {{ :set:sevenzipnlzma.png?50&nolink |non-solid lzma 7z}}  | Non-Solid LZMA 7z     | LZMA (9)     | Null                 | Excellent compression, raw copy supported                                                               | Extremely slow, single threaded                                                           | People who want to sacrifice some compression ratio over solid LZMA for the ability to raw copy files between archives.  |
+| {{ :set:sevenzipnlzma.png?50&nolink |non-solid lzma 7z}}  | Non-Solid LZMA 7z     | LZMA (9)     | Null                 | Excellent compression ratio, raw copy supported                                                         | Extremely slow, single threaded                                                           | People who want to sacrifice some compression ratio over solid LZMA for the ability to raw copy files between archives.  |
-| {{ :set:sevenzipslzma.png?50&nolink |solid lzma 7z}}      | Solid LZMA 7z (RV7z)  | LZMA (9)     | Null                 | Excellent compression, especially for many files packed together                                        | Extremely slow, single threaded, must unpack completely for any fix                       | People who care about compression ratio only and don't care how slow LZMA is.                                            |
+| {{ :set:sevenzipslzma.png?50&nolink |solid lzma 7z}}      | Solid LZMA 7z (RV7z)  | LZMA (9)     | Null                 | Excellent compression ratio, especially for many files packed together                                  | Extremely slow, single threaded, must unpack completely for any fix                       | People who care about compression ratio only and don't care how slow LZMA is.                                            |
 | {{ :set:sevenzipnzstd.png?50&nolink |non-solid zstd 7z}}  | Non-Solid ZSTD 7z     | ZSTD (19)    | Null                 | Compression ratio between Deflate and LZMA, very fast decompression, multithreaded, raw copy supported  | Poorly supported, only works with zstd fork of 7-zip                                      | Default 7z structure for archives that need to be repacked in ToSort directories.                                        | | {{ :set:sevenzipnzstd.png?50&nolink |non-solid zstd 7z}}  | Non-Solid ZSTD 7z     | ZSTD (19)    | Null                 | Compression ratio between Deflate and LZMA, very fast decompression, multithreaded, raw copy supported  | Poorly supported, only works with zstd fork of 7-zip                                      | Default 7z structure for archives that need to be repacked in ToSort directories.                                        |
 | {{ :set:sevenzipszstd.png?50&nolink |solid zstd 7z}}      | Solid ZSTD 7z         | ZSTD (19)    | Null                 | Compression ratio between Deflate and LZMA, very fast decompression, multithreaded                      | Poorly supported, only works with zstd fork of 7-zip, must unpack completely for any fix  | Archives with many small files that likely won't change often, and you want fast decompression speed.                    | | {{ :set:sevenzipszstd.png?50&nolink |solid zstd 7z}}      | Solid ZSTD 7z         | ZSTD (19)    | Null                 | Compression ratio between Deflate and LZMA, very fast decompression, multithreaded                      | Poorly supported, only works with zstd fork of 7-zip, must unpack completely for any fix  | Archives with many small files that likely won't change often, and you want fast decompression speed.                    |
Line 14: Line 14:
  
 ==== Which archive type should I use? ==== ==== Which archive type should I use? ====
-It depends. Each archive type has pros and cons, but you don't need to choose just one. Choose an archive type that best fits your use case for a particular collection. For example, if you plan to extract discs from archives to load into an emulator, you may prefer ZSTD Zip due to its fast decompression speed. That being said, ZSTD Zip is essentially a modernized TrrntZip and seems to be the most popular choice since it was introduced with RomVault 3.7.0.
+It depends. Each archive type has pros and cons, but you don't need to choose just one. Choose an archive type that best fits your use case for a particular collection. For example, if you plan to extract discs from archives to load into an emulator, you may prefer ZSTD Zip due to its fast decompression speed. That being said, ZSTD Zip is essentially a modernized TrrntZip and seems to be the next most popular choice since it was introduced with RomVault 3.7.0.
  
  
Line 26: Line 26:
   * It's designed to be a highly tunable replacement for Deflate   * It's designed to be a highly tunable replacement for Deflate
   * It's now well established and used for filesystem compression with ZFS and BTRFS   * It's now well established and used for filesystem compression with ZFS and BTRFS
-  * It's gaining popularity, and support for ZSTD Zip archives was added to MAME v0.262 in January 2024
+  * It's gaining popularity, and support for ZSTD Zip archives and CHDs was added to MAME v0.262 in January 2024
   * It supports multi-threaded compression   * It supports multi-threaded compression
-  * It's designed to be seekable with block sizes up to 128 KiB (compared to Deflate's 32 KiB)
   * Its standard levels are 1-19, and ultra levels are 20-22 which require significantly more memory   * Its standard levels are 1-19, and ultra levels are 20-22 which require significantly more memory
 +  * It's updated periodically, but RomVault uses ZSTD version 1.5.5, which will not change for the foreseeable future in order to ensure deterministic behavior
  
  
 ==== How does the compression ratio of ZSTD compare to Deflate and LZMA? ==== ==== How does the compression ratio of ZSTD compare to Deflate and LZMA? ====
-ZSTD level 19 has a compression ratio right in between Deflate level 9 and LZMA level 9. However, since ZSTD is multithreaded, it's much faster to compress compared to LZMA and often times faster than Deflate. ZSTD also decompresses much faster than both LZMA and Deflate.
+ZSTD level 19 has a compression ratio right in between Deflate level 9 and LZMA level 9. However, since ZSTD is multi-threaded, it's much faster to compress compared to LZMA and often faster than Deflate. ZSTD also decompresses much faster than both LZMA and Deflate. Example statistics for the redump.org PlayStation collection:
  
 +^ Format         ^  Size (GiB)  ^  Compression Ratio  ^  Space Savings  ^
 +| File           |  4,640.41    |                     |                 |
 +| TrrntZip       |  3,025.91    |  1.5336             |  34.79%         |
 +| ZSTD Zip       |  2,716.79    |  1.7080             |  41.45%         |
 +| Solid LZMA 7z  |  2,513.54    |  1.8462             |  45.83%         |
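
The ratio and savings columns can be derived from the sizes alone: the ratio is the original size divided by the archived size, and the space savings is one minus the inverse of that ratio. The following is a minimal Python sketch using the sizes from the table above; the function names are illustrative only and are not part of RomVault.

```python
def compression_ratio(original_gib: float, archived_gib: float) -> float:
    """Original size divided by archived size (higher means smaller archives)."""
    return original_gib / archived_gib


def space_savings(original_gib: float, archived_gib: float) -> float:
    """Fraction of the original size that the archive saves."""
    return 1.0 - archived_gib / original_gib


# Sizes in GiB, copied from the table above.
ORIGINAL_GIB = 4640.41
ARCHIVES_GIB = {
    "TrrntZip": 3025.91,
    "ZSTD Zip": 2716.79,
    "Solid LZMA 7z": 2513.54,
}

for name, size in ARCHIVES_GIB.items():
    ratio = compression_ratio(ORIGINAL_GIB, size)
    savings = space_savings(ORIGINAL_GIB, size)
    print(f"{name:<14} ratio={ratio:.4f} savings={savings:.2%}")
```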
  
 +The diagram below shows the relationship between single-threaded compression speed and compression ratio. Most ZSTD use cases would involve multiple workers (threads) on modern hardware and compress much faster. The three callouts in the diagram represent the compression levels that RomVault uses for each method.
 +{{:diagrams:zstd.png?nolink&1000|}}
 ==== Why is ZSTD slow when compressing archives with many smaller files? ==== ==== Why is ZSTD slow when compressing archives with many smaller files? ====
 RomVault's implementation of ZSTD uses compression level 19 without any additional parameters or long-distance matching. ZSTD is multi-threaded, but each thread or worker is assigned a job of a specific size. ZSTD level 19 uses a job size of 32 MiB. This means any compression stream under 32 MiB will only use a single worker. Conversely, a stream of 500 MiB will use up to 16 workers simultaneously. RomVault's implementation of ZSTD uses compression level 19 without any additional parameters or long-distance matching. ZSTD is multi-threaded, but each thread or worker is assigned a job of a specific size. ZSTD level 19 uses a job size of 32 MiB. This means any compression stream under 32 MiB will only use a single worker. Conversely, a stream of 500 MiB will use up to 16 workers simultaneously.
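
The worker arithmetic above can be sketched as a ceiling division of the stream size by the job size, capped by the number of workers available. This is only a rough model of the explanation, not RomVault's or ZSTD's actual scheduling code; the function name and the default of 16 available workers are assumptions made for illustration.

```python
import math

MIB = 1024 * 1024
LEVEL_19_JOB_SIZE = 32 * MIB  # job size stated above for ZSTD level 19


def workers_used(stream_bytes: int,
                 job_size: int = LEVEL_19_JOB_SIZE,
                 available_workers: int = 16) -> int:
    """Estimate how many workers a single compression stream can keep busy.

    A stream is split into jobs of `job_size` bytes; each job goes to one
    worker, so a stream never uses more workers than it has jobs.
    `available_workers` is a hypothetical cap chosen for this example.
    """
    jobs = max(1, math.ceil(stream_bytes / job_size))
    return min(jobs, available_workers)


# A small file fits in one job, so only one worker is used.
print(workers_used(10 * MIB))
# A 500 MiB stream splits into ceil(500 / 32) jobs.
print(workers_used(500 * MIB))
```

Under this model, an archive made of many small files produces many streams that each fit in a single job, which is why such archives gain little from multi-threading.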
  
  
-==== Why does RomVault use ZSTD level 19? Why not level 22? Why can't I choose my own? ====
+==== Why does RomVault use ZSTD level 19 instead of 22? Why can't I choose my own? ====
 There are several reasons: There are several reasons:
   * RomVault has always focused on consistency and normalization, and purposely provides a limited set of options that best accommodate the use cases of the community.   * RomVault has always focused on consistency and normalization, and purposely provides a limited set of options that best accommodate the use cases of the community.
-  * Levels 20-22 are considered "ultra" and have significantly higher memory requirements which may not work for many users.
-  * The space savings above level 19 are marginal for most use cases.
-  * Levels above 19 will often have significantly slower compression since the job size per worker is much larger.
+  * Space savings are marginal above level 19 for most use cases.
+  * Levels 20-22 are considered "ultra" and have much higher memory requirements for compression which may not work for many users. (~3 GiB for a //single// worker with level 22)
+  * Levels above 19 compress significantly slower due to the increased window size.
+  * Levels above 19 are not as performant in most multithreaded scenarios since the job size per worker is much larger. (128 MiB to 512 MiB)
  
  
Line 53: Line 61:
   * Window Log: The window size represented as an exponent, 2^X bytes.   * Window Log: The window size represented as an exponent, 2^X bytes.
   * Window Size: The size of the sliding window the algorithm uses. Larger window sizes can match patterns further away, but at the cost of higher memory utilization and lower speed.   * Window Size: The size of the sliding window the algorithm uses. Larger window sizes can match patterns further away, but at the cost of higher memory utilization and lower speed.
-  * Job Size: The amount of data allocated to a single worker (thread).
+  * Job Size: The amount of data allocated to a single worker (thread).
   * Strategy: The specific compression approach used.   * Strategy: The specific compression approach used.
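
Since the window log is just the exponent of the window size, converting between the two is a single bit shift. The sketch below shows that conversion in Python; the helper names are hypothetical, and the window log values are picked only for illustration rather than taken from any particular ZSTD level.

```python
def window_size_bytes(window_log: int) -> int:
    """Window size in bytes for a given window log, i.e. 2^window_log."""
    return 1 << window_log


def format_mib(size_bytes: int) -> str:
    """Render a byte count in MiB for easier reading."""
    return f"{size_bytes / (1024 * 1024):g} MiB"


# Example window logs, chosen only to illustrate the conversion.
for window_log in (20, 23, 27):
    print(window_log, format_mib(window_size_bytes(window_log)))
```

Each step of 1 in the window log doubles the window size, which is why memory use grows quickly at the higher settings.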
  
archive_types.1713137046.txt.gz · Last modified: 2024/04/14 16:24 by johnsanc