I have tried a lot of different things, but at the end of the day the setComment() method on JarOutputStream (which extends ZipOutputStream) writes a "circumflex a" (i.e., an 'Â') before the extended ASCII characters when I view the archive comment in WinZip, PKZIP, 7-Zip, or any other archive tool I have tried.
I have tried converting to Unicode; but, since the setComment() implementation only writes single bytes, I get a literal '\u00A9' string in the comment.
Does anyone have a solution for this? Or know how to write the comments to a zip file without using the setComment() method (appending the comment directly to the end of the file)? I have tried the latter, but I am somehow corrupting the archive when doing so.
I know that I could simply use '(R)' and '(c)' instead, but I would rather use the extended ASCII characters, as they look better. I also know that this can be done via WinZip's command line utility; but I would like to use Java so I don't have to buy a zip license just to add an archive comment to a jar file.
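The stray 'Â' described above is what you would expect if the comment bytes were written as UTF-8 and the archive tool then read them one byte per character. Here is a minimal sketch of that effect. It assumes setComment() encodes the comment as UTF-8 and that the viewer decodes it as ISO-8859-1; both are assumptions on my part, and the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

final class CommentEncodingDemo {

    /**
     * Encodes the text as UTF-8 and decodes the resulting bytes as
     * ISO-8859-1, mimicking a viewer that assumes one byte per character.
     */
    public static String utf8ReadAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Formats bytes as space-separated upper-case hex pairs. */
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String copyright = "\u00A9"; // the copyright sign

        // UTF-8 needs two bytes for U+00A9; the first one is 0xC2.
        System.out.println(toHex(copyright.getBytes(StandardCharsets.UTF_8))); // C2 A9

        // Read back byte by byte, 0xC2 shows up as U+00C2 ('Â') in front of the sign.
        System.out.println(utf8ReadAsLatin1(copyright).equals("\u00C2\u00A9")); // true
    }
}
```

If that is what is happening, the fix is to get a single 0xA9 byte into the comment rather than the two-byte UTF-8 form.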
The java.util.zip package is not particularly Unicode-savvy (see this old bug, due to be fixed in Java 7). Unless you can wait for that, check out the Apache Commons Compress library (which includes a UnicodeCommentExtraField class that looks promising in this context).
Yeah, I saw that upcoming Charset arg in the JarFile constructor; however, the majority of my customers are still on 1.5 and it is unlikely that they will shift to 1.7 anytime soon.
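For anyone reading this who can target Java 7 or later, here is a rough sketch of the Charset route. It uses the ZipOutputStream(OutputStream, Charset) constructor, since as far as I know JarOutputStream does not expose an equivalent one. The class name, entry name, and choice of ISO-8859-1 are assumptions for illustration, and the charset also applies to entry names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class Latin1CommentZip {

    /** Builds a small in-memory zip whose archive comment is encoded as ISO-8859-1. */
    public static byte[] zipWithComment(String comment) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // The charset passed here is used for entry names and for the comment.
        try (ZipOutputStream zip = new ZipOutputStream(buffer, StandardCharsets.ISO_8859_1)) {
            zip.setComment(comment);
            zip.putNextEntry(new ZipEntry("hello.txt")); // placeholder entry
            zip.write("hello".getBytes(StandardCharsets.US_ASCII));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] zip = zipWithComment("\u00A9 Example");
        System.out.println("archive size in bytes: " + zip.length);
    }
}
```

With a single-byte charset the copyright sign ends up as one 0xA9 byte at the end of the archive, which is what the tools mentioned above seem to expect.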
I am currently investigating how to manually write/replace the comment in the zip file (without corrupting it). This is probably the only way to fix it without patching a whole lot of java.util.jar and java.util.zip classes.
I will post the solution when I figure it out. Anyone else who already knows how to manually add/replace an archive comment in the zip file can post it first.
The Commons distro is fine; however, I wanted to find the solution for my own edification and as an intellectual exercise.
The solution is really very simple (in hindsight, of course). After studying the zip file format, I found that the comment length and content are stored at the very end of the zip file. You must first find the "end of central directory" record, which follows the file entries and the central directory and starts with the "magic" byte sequence 0x50, 0x4b, 0x05, 0x06. Once you have found this byte sequence, you can read or write the comment: the two-byte comment length sits 20 bytes after the start of the sequence, and the comment bytes follow it. If there is no comment yet, the length is zero and nothing follows it, so you can simply overwrite the length and append the comment string to the end of the zip file.
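To make the lookup concrete, here is a minimal sketch of finding that record in a zip held in memory and reading the existing comment. The class and method names are hypothetical, the fixed offsets (length at byte 20, comment at byte 22 of the record) come from the zip format as I understand it, and the charset is left to the caller because the format does not pin one down for comments.

```java
import java.nio.charset.Charset;

final class ZipCommentReader {

    private static final int END_RECORD_MIN_SIZE = 22;   // record size with an empty comment
    private static final int COMMENT_LENGTH_OFFSET = 20; // offset of the two-byte length field

    /** Reads an unsigned 16-bit little-endian value starting at the given offset. */
    private static int readUnsignedShortLE(byte[] data, int offset) {
        return (data[offset] & 0xFF) | ((data[offset + 1] & 0xFF) << 8);
    }

    /**
     * Scans backwards for the signature 0x50 0x4b 0x05 0x06 and returns its
     * index, or -1 if none is found. A hit only counts when its comment
     * length runs exactly to the end of the data, which guards against the
     * same four bytes appearing inside the comment itself.
     */
    public static int findEndRecord(byte[] zip) {
        for (int i = zip.length - END_RECORD_MIN_SIZE; i >= 0; i--) {
            if (zip[i] == 0x50 && zip[i + 1] == 0x4b
                    && zip[i + 2] == 0x05 && zip[i + 3] == 0x06) {
                int length = readUnsignedShortLE(zip, i + COMMENT_LENGTH_OFFSET);
                if (i + END_RECORD_MIN_SIZE + length == zip.length) {
                    return i;
                }
            }
        }
        return -1;
    }

    /** Returns the archive comment decoded with the given charset ("" if empty). */
    public static String readComment(byte[] zip, Charset charset) {
        int end = findEndRecord(zip);
        if (end < 0) {
            throw new IllegalArgumentException("no end of central directory record found");
        }
        int length = readUnsignedShortLE(zip, end + COMMENT_LENGTH_OFFSET);
        return new String(zip, end + END_RECORD_MIN_SIZE, length, charset);
    }
}
```

Loading the whole archive into a byte array keeps the sketch short; for large jars you would probably want to seek with a RandomAccessFile and scan only the tail of the file instead.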
There is one small caveat about the length that was causing my corruption error: the comment length is written as a two-byte little-endian sequence. So, you need to write the length as such: