rvzstd

Differences

This shows you the differences between two versions of the page.

rvzstd [2025/10/12 02:01] – created johnsanc rvzstd [2025/10/12 03:29] (current) johnsanc
Line 1: Line 1:
 ~~NOTOC~~ ~~NOTOC~~
-====== RVZstd (WORK IN PROGRESS) ======+====== RVZstd ======
  
-This specification is intended to define the implementation of zip as used by the RVZstd standard. This standard is RomVault's implementation of deterministic ZIP archives compressed with zstd. +This specification defines the implementation of ZIP archives as used by the RVZstd standard. This standard is RomVault's implementation of deterministic ZIP archives compressed with zstd. 
  
 ===== Archive format ===== ===== Archive format =====
-==== General format of an RVZstd .zip file with n files: ====+==== General format of an RVZstd .zip archive with n files: ====
  
 ^Archive Start^ ^Archive Start^
Line 27: Line 27:
 ==== Local file header x: (Showing RVZstd default values): ==== ==== Local file header x: (Showing RVZstd default values): ====
 ^Type ^Attribute ^Value ^Description ^  ^Type ^Attribute ^Value ^Description ^ 
-|UInt32|Local file header signature|(0x04034b50)| | +|UInt32|Local file header signature|04034B50|Static signature value|
-|UInt16|Version needed to extract|20 or 45|20 = File is compressed using Deflate compression / 45 if this record contains zip64 information| +|UInt16|Version needed to extract|3F00|63: Used for modern compression algorithms like LZMA and zstd|
-|UInt16|General purpose bit flag|2|Maximum compression option was used, bit 11 (0x800) is set for unicode filename| +|UInt16|General purpose bit flag| |Bit 2: (0x200) Maximum compression option was used. Bit 11 (0x800) is set for unicode filename| 
-|UInt16|Compression method|8|The file is Deflated| +|UInt16|Compression method|5D00|93: The file is compressed with zstd|
-|UInt16|Last mod file time|48128|11:32 PM| +|UInt16|Last mod file time|0000|zeroed time, 12:00:00 AM|
-|UInt16|Last mod file date|8600|12/24/1996| +|UInt16|Last mod file date|0000|zeroed date, 1/1/0001|
 |UInt32|CRC32| |File CRC32| |UInt32|CRC32| |File CRC32|
 |UInt32|Compressed size| |File Compressed Size| |UInt32|Compressed size| |File Compressed Size|
Line 40: Line 40:
 |Byte[]|Filename (variable size)| |Byte array of filename| |Byte[]|Filename (variable size)| |Byte array of filename|
  
-<WRAP info>The default values are required to have consistent torrentzipped files. Default time/date of 11:32pm 12/24/1996 is the date of the first ever MAME release.</WRAP>+<WRAP info>The default values are required to ensure consistent RVZstd archives. Unlike torrentzip, RVZstd uses zeroed values for date and time instead of the date/time of the first MAME release.</WRAP>
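
As a rough illustration of the local file header in the new revision, the sketch below packs the fixed 30-byte part with Python's ''struct'' module. The function name ''pack_local_header'' and its parameters are hypothetical. The field order follows the standard ZIP local file header (the diff elides the uncompressed size and length rows), zip64 is ignored, and the general purpose flag is left to the caller because the table gives no default value. It also assumes that the UInt16 entries in the Value column (''3F00'', ''5D00'') are the little-endian bytes as stored on disk.

```python
import struct

LOCAL_SIG = 0x04034B50   # local file header signature
VERSION_NEEDED = 63      # stored on disk as bytes 3F 00
METHOD_ZSTD = 93         # stored on disk as bytes 5D 00


def pack_local_header(name_bytes, crc32, comp_size, uncomp_size, flags=0):
    """Pack a non-zip64 local file header followed by the filename bytes."""
    fixed = struct.pack(
        "<IHHHHHIIIHH",
        LOCAL_SIG,
        VERSION_NEEDED,
        flags,            # assumption: the caller decides, e.g. 0x800 for a UTF-8 name
        METHOD_ZSTD,
        0,                # last mod file time, zeroed
        0,                # last mod file date, zeroed
        crc32,
        comp_size,
        uncomp_size,
        len(name_bytes),
        0,                # extra field length (no zip64 extra field in this sketch)
    )
    return fixed + name_bytes


header = pack_local_header(b"set1/test1.rom", 0x12345678, 100, 200)
print(header[:4])                   # signature bytes as stored on disk
print(header[4:6].hex().upper())    # version needed to extract
print(header[8:10].hex().upper())   # compression method
```

Packing with the ''<'' prefix keeps every multi-byte field little-endian, which is what the ZIP format requires.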
  
 ==== File data x: ==== ==== File data x: ====
-The data compression must be exactly as ZLib version 1.1.using maximum compression level 9. All files and empty directories must be compressed with the Deflate method, which is method 8. +The data compression must be exactly zstd version 1.5.using level 19 without long distance matching, training, or any other parameters altered. All files and empty directories must be compressed with the zstd method, which is method 93.
  
  
-==== Central Directory file x: (Showing torrentzipped default values): ====+==== Central Directory file x: (Showing RVZstd default values): ====
  
 ^Type ^Attribute ^Value ^Description ^  ^Type ^Attribute ^Value ^Description ^ 
-|UInt32|Central file header signature|(0x02014b50)| | +|UInt32|Central file header signature|02014B50|Static signature value|
-|UInt16|Version made by|0|MS_DOS and OS/2 (FAT/FAT32 file systems)| +|UInt16|Version made by|0000|MS_DOS and OS/2 (FAT/FAT32 file systems)| 
-|UInt16|Version needed to extract|20 or 45|20 = File is compressed using Deflate compression / 45 if this record contains zip64 information| +|UInt16|Version needed to extract|3F00|63: Used for modern compression algorithms like LZMA and zstd|
-|UInt16|General purpose bit flag|2|Maximum compression option was used, bit 11 (0x800) is set for unicode filename| +|UInt16|General purpose bit flag| |Bit 2: (0x200) Maximum compression option was used. Bit 11 (0x800) is set for unicode filename| 
-|UInt16|Compression method|8|The file is Deflated| +|UInt16|Compression method|5D00|93: The file is compressed with zstd|
-|UInt16|Last mod file time|48128|11:32 PM| +|UInt16|Last mod file time|0000|zeroed time, 12:00:00 AM|
-|UInt16|Last mod file date|8600|12/24/1996| +|UInt16|Last mod file date|0000|zeroed date, 1/1/0001|
 |UInt32|CRC32| |File CRC32| |UInt32|CRC32| |File CRC32|
 |UInt32|Compressed size| |File Compressed Size| |UInt32|Compressed size| |File Compressed Size|
Line 61: Line 61:
 |UInt16|File name length| |Filename length| |UInt16|File name length| |Filename length|
 |UInt16|Extra field length| |Normally 0, Length of Extra field data if zip64 extra field information is included| |UInt16|Extra field length| |Normally 0, Length of Extra field data if zip64 extra field information is included|
-|UInt16|File comment length|0|No file comment| +|UInt16|File comment length|0000|No file comment| 
-|UInt16|Disk number start|0|Multi disk storage not used so set to disk 0| +|UInt16|Disk number start|0000|Multi disk storage not used so set to disk 0| 
-|UInt16|Internal file attributes|0|No internal attributes| +|UInt16|Internal file attributes|0000|No internal attributes| 
-|UInt32|External file attributes|0|No external attributes|+|UInt32|External file attributes|0000|No external attributes|
 |UInt32|Relative offset of local header| |File offset of this files Local Header| |UInt32|Relative offset of local header| |File offset of this files Local Header|
 |Byte[]|File name (variable size)| |Byte array of filename| |Byte[]|File name (variable size)| |Byte array of filename|
Line 70: Line 70:
 ==== End of Central Directory: ==== ==== End of Central Directory: ====
 ^Type ^Attribute ^Value ^Description ^  ^Type ^Attribute ^Value ^Description ^ 
-|UInt32|End of central dir signature|(0x06054b50)| | +|UInt32|End of central dir signature|06054B50|Static signature value|
-|UInt16|Number of this disk|0|Multi disk storage not used so set to disk 0| +|UInt16|Number of this disk|0000|Multi disk storage not used so set to disk 0| 
-|UInt16|Number of the disk with the start of the central directory|0|Multi disk storage not used so set to disk 0|+|UInt16|Number of the disk with the start of the central directory|0000|Multi disk storage not used so set to disk 0|
 |UInt16|Total number of entries in the central directory on this disk|n|Total number of files| |UInt16|Total number of entries in the central directory on this disk|n|Total number of files|
 |UInt16|Total number of entries in the central directory|n|Total number of files| |UInt16|Total number of entries in the central directory|n|Total number of files|
 |UInt32|Size of the central directory|EOCD-SOCD|length of the central directories| |UInt32|Size of the central directory|EOCD-SOCD|length of the central directories|
 |UInt32|Offset of start of central directory with respect to the starting disk number|SOCD|Start of central directory| |UInt32|Offset of start of central directory with respect to the starting disk number|SOCD|Start of central directory|
-|UInt16|.ZIP file comment length|22|torrentzipped comment| +|UInt16|.ZIP file comment length|0F00|15: RVZstd comment| 
-|Byte[22]|.ZIP file comment|TORRENTZIPPED-XXXXXXXX| +|Byte[15]|.ZIP file comment|52565A5354442D...|RVZSTD-XXXXXXXX|
  
-<WRAP info>See above 'General format of a torrentzipped .zip file with n files' for SOCD & EOCD</WRAP>+<WRAP info>See above 'General format of an RVZstd .zip archive with n files' for SOCD & EOCD</WRAP>
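
The End of Central Directory table in the new revision can be sketched the same way. This is a minimal, non-zip64 illustration; ''pack_eocd'' and its parameter names are hypothetical, and ''socd'' / ''eocd'' are taken to be the file offsets of the start and end of the central directory as named in the table.

```python
import struct

EOCD_SIG = 0x06054B50  # end of central dir signature


def pack_eocd(entry_count, socd, eocd, comment):
    """Pack a non-zip64 End of Central Directory record plus the archive comment."""
    fixed = struct.pack(
        "<IHHHHIIH",
        EOCD_SIG,
        0,               # number of this disk
        0,               # disk with the start of the central directory
        entry_count,     # entries in the central directory on this disk
        entry_count,     # entries in the central directory in total
        eocd - socd,     # size of the central directory
        socd,            # offset of the start of the central directory
        len(comment),    # 15 for an RVZSTD-XXXXXXXX comment
    )
    return fixed + comment


record = pack_eocd(2, socd=1000, eocd=1120, comment=b"RVZSTD-00000000")
print(len(record))  # 22 fixed bytes plus the 15-byte comment
```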
  
  
-===== The TorrentZipped files comment ===== +===== The RVZstd archive comment ===== 
-The .ZIP file comment in the End of Central Directory is used to check the validity of the torrentzipped file.+The ZIP file comment in the End of Central Directory is used to check the validity of the RVZstd file.
  
-The comment must be formatted as the 22 bytes of ''TORRENTZIPPED-XXXXXXXX''. The ''XXXXXXXX'' is the CRC32 of the central directory records stored as hexadecimal upper case text. (The CRC32 of the bytes in the file between SOCD & EOCD)+The comment must be formatted as the 15 bytes of ''RVZSTD-XXXXXXXX''. The ''XXXXXXXX'' is the CRC32 of the central directory records stored as hexadecimal upper case text. (The CRC32 of the bytes in the file between SOCD & EOCD)
  
-This comment ensures that if any change is made to the files within the zip this checksum will no longer match the byte data in the central directory, and in this way we can check the validity of a torrentzip file.+This comment ensures that if any change is made to the files within the zip this checksum will no longer match the byte data in the central directory, and in this way we can check the validity of an RVZstd file.
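
A minimal sketch of the comment check described in the new revision, assuming the CRC32 is the standard ZIP/zlib CRC32 and that ''central_directory'' holds exactly the bytes between SOCD and EOCD. The helper names ''make_comment'' and ''comment_is_valid'' are hypothetical.

```python
import zlib

PREFIX = b"RVZSTD-"


def make_comment(central_directory):
    """Build the 15-byte archive comment from the central directory bytes."""
    crc = zlib.crc32(central_directory) & 0xFFFFFFFF
    # Eight upper-case hexadecimal digits, zero-padded.
    return PREFIX + f"{crc:08X}".encode("ascii")


def comment_is_valid(central_directory, comment):
    """Return True when the stored comment matches the central directory."""
    return comment == make_comment(central_directory)


cd_bytes = b"123456789"  # stand-in for real central directory bytes
comment = make_comment(cd_bytes)
print(comment, len(comment))
print(comment_is_valid(cd_bytes, comment))
print(comment_is_valid(cd_bytes + b"x", comment))
```

Any change to a central directory record changes the CRC32, so the stored comment no longer matches and the archive fails validation.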
 ===== Filename Encoding ===== ===== Filename Encoding =====
-The filenames of the compressed files in a zip file are stored in the local header and the central directory as byte arrays. Zip was original build on early IBM PCs, and as such uses [[https://en.wikipedia.org/wiki/Code_page_437|code page 437]] to convert a string to a byte array to store the filenames. With the arrival of unicode multiple different methods where added to the official zip format to permit unicode filenames to be stored in a zip file. Trrntzip format uses the general purpose bit 11 method. So to store a filename in a trrntzip zip file you must first see if the filename can be stored using code page 437, if not then UTF8 encoding should be used in the byte arrays, this is then indicated by setting bit 11 of the General Purpose Bit Flags both in the local header and central directory.  +The filenames of the compressed files in a zip archive are stored in the local header and the central directory as byte arrays. Zip was originally built on early IBM PCs, and as such uses [[https://en.wikipedia.org/wiki/Code_page_437|code page 437]] to convert a string to a byte array to store the filenames. With the arrival of Unicode, multiple different methods were added to the official zip format to permit Unicode filenames to be stored in a zip file. RVZstd format uses the general purpose bit 11 method. So to store a filename in an RVZstd zip file you must first see if the filename can be stored using code page 437, if not then UTF-8 encoding should be used in the byte arrays, this is then indicated by setting bit 11 of the General Purpose Bit Flags both in the local header and central directory.  
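
The encoding rule can be sketched as follows. This assumes Python's built-in ''cp437'' codec is an acceptable stand-in for the code page 437 table used by the real implementation; ''encode_filename'' is a hypothetical helper name.

```python
UTF8_FLAG = 0x0800  # general purpose bit 11


def encode_filename(name):
    """Return (filename bytes, general purpose flag bits) for one entry."""
    try:
        # First choice: the name fits in code page 437, so bit 11 stays clear.
        return name.encode("cp437"), 0
    except UnicodeEncodeError:
        # Fallback: store UTF-8 and signal it with bit 11.
        return name.encode("utf-8"), UTF8_FLAG


print(encode_filename("test1.rom"))
print(encode_filename("caf\u00e9.rom"))
print(encode_filename("\u65e5\u672c.rom"))
```

The same bytes and the same flag value must be written to both the local header and the central directory entry.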
  
  
-===== File order with a TorrentZip ===== +===== File order with an RVZstd archive ===== 
-For the creation of consistent torrentzipped files, the file order is also very important. Files must be sorted by filename using a lower case sort.+For the creation of consistent RVZstd files, the file order is also very important. Files must be sorted by filename using a lower case sort.
  
 ===== Directory separator character ===== ===== Directory separator character =====
 As zips only store files (not directories), files in directories are represented by storing a relative path to the filename. For example file ''test1.rom'' in directory ''set1'' would be stored with a filename of ''set1/test1.rom''. Some zipping programs will store this as ''set1\test1.rom''. As zips only store files (not directories), files in directories are represented by storing a relative path to the filename. For example file ''test1.rom'' in directory ''set1'' would be stored with a filename of ''set1/test1.rom''. Some zipping programs will store this as ''set1\test1.rom''.
  
-This leads to a possible naming inconsistency. The zip file format states: //"All slashes should be forward slashes ‘/’ as opposed to backwards slashes ‘\’"//. So Torrentzip will change all ''\'' characters to ''/''. This must be done before sorting, to ensure that the sort is performed correctly.+This leads to a possible naming inconsistency. The zip file format states: //"All slashes should be forward slashes ‘/’ as opposed to backwards slashes ‘\’"//. So RomVault will change all ''\'' characters to ''/''. This must be done before sorting, to ensure that the sort is performed correctly.
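
A small sketch of why the separator fix has to come before the sort. It assumes the "lower case sort" can be approximated with Python's ''str.lower'' as the sort key; the exact comparison and tie-breaking rules used by RomVault are not stated here, and ''normalize'' and ''rvzstd_order'' are hypothetical names.

```python
def normalize(name):
    """Replace backslashes with forward slashes."""
    return name.replace("\\", "/")


def rvzstd_order(names):
    """Normalize separators first, then sort by lower-cased name."""
    return sorted((normalize(n) for n in names), key=str.lower)


names = ["a0", "a\\b"]
print(sorted(names, key=str.lower))   # sorted before normalizing
print(rvzstd_order(names))            # normalized, then sorted
```

In ASCII, ''\'' (0x5C) sorts after ''0'' (0x30) while ''/'' (0x2F) sorts before it, so sorting before the replacement can leave entries in a different order than sorting after it.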
  
 ===== Directory entries and empty directories ===== ===== Directory entries and empty directories =====
Line 114: Line 114:
  
 ===== Repeat files ===== ===== Repeat files =====
-Another test that could be performed is checking for repeat file entries inside the zip, most zip programs have a hard time handling this and will just ignore this repeat giving the user no way of knowing there is a repeat filename problem. So it would fix another possible inconsistency if torrentzip scanning at least warned about repeat filename being found inside a zip.+Another test that could be performed is checking for repeat file entries inside the zip, most zip programs have a hard time handling this and will just ignore this repeat giving the user no way of knowing there is a repeat filename problem. So it would fix another possible inconsistency if RVZstd scanning at least warned about repeat filename being found inside a zip.
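
A scanner could flag repeat entries along these lines. This is only a sketch: it compares names exactly as given, and whether names differing only in case should also count as repeats is an open assumption. ''find_repeats'' is a hypothetical helper name.

```python
from collections import Counter


def find_repeats(names):
    """Return the filenames that occur more than once, in sorted order."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


entries = ["set1/test1.rom", "set1/test2.rom", "set1/test1.rom"]
for name in find_repeats(entries):
    print(f"warning: repeat filename in archive: {name}")
```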
  
 ===== Reference ===== ===== Reference =====
   * [[http://www.pkware.com/documents/casestudies/APPNOTE.TXT|Official ZIP file specification]]    * [[http://www.pkware.com/documents/casestudies/APPNOTE.TXT|Official ZIP file specification]] 
-  * [[http://sourceforge.net/projects/trrntzip/|Original TorrentZip source code]] +  * [[https://github.com/facebook/zstd|zstd compression]]
-  * [[http://zlib.net/|zlib compression]]+
rvzstd.1760234471.txt.gz · Last modified: 2025/10/12 02:01 by johnsanc