Using Apache Commons TarArchiveInputStream results in corrupt un-archived files

 
Marshal
I have a web service which consumes a TAR archive of binary files and passes them as a List of files to another subsystem for processing. I am using the Apache Commons Compress library (version 1.11) to work with the TAR formatted input.

I am finding that when I use TarArchiveEntry#getSize to determine the size, allocate storage for the contents, and then use TarArchiveInputStream#read, the resulting file may differ from the original file included in the uploaded archive. If I instead read from the TarArchiveInputStream in chunks, the resulting files are fine. I noticed that the smallest of the 3 files in the archive (594 bytes) was the same with both implementations.

This is my first time working with the library so I am probably missing something. Any ideas or suggestions?


Working Code
Console output:
Contents MD5: df194ba4f2fe114be709c5605839930f (9627051 bytes)
Contents MD5: 3996f04fc6a830520c336825ef5afc1b (508571 bytes)
Contents MD5: 1cf5fca3f6209042fac634f718d30d43 (594 bytes)


Problematic Code
Console output:
Contents MD5: 3ee34d1e3ad7761303107cf9c3a5f6ad (9627051 bytes)
Contents MD5: c5c5dd952977fa6068d717586e57d9a8 (508571 bytes)
Contents MD5: 1cf5fca3f6209042fac634f718d30d43 (594 bytes)
 
Bartender
I've not used TarArchiveInputStream, but if its read() method conforms to the standard InputStream read() contract (which it almost certainly does, as it extends the InputStream class), then the problem you are facing is that read() does not guarantee to fill the buffer you pass in. The API docs state: "Reads up to len bytes of data from the input stream into an array of bytes. An attempt is made to read as many as len bytes, but a smaller number may be read. The number of bytes actually read is returned as an integer." So you need to check the returned value and, if it is less than len, read again and again until the total number of bytes read equals len.
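A minimal sketch of that "read again until len bytes have arrived" loop, using only java.io. The class name ReadFully, the helper readFully, and the 1000-byte "trickle" stream are made up for illustration; with the real library the stream would be the TarArchiveInputStream and len would come from TarArchiveEntry#getSize (a long, so this sketch assumes it fits in an int).

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class ReadFully {

    /** Reads exactly len bytes from in, looping over short reads. */
    static byte[] readFully(InputStream in, int len) throws IOException {
        byte[] contents = new byte[len];
        int offset = 0;
        while (offset < len) {
            // read() may deliver fewer bytes than requested, so ask only for the remainder
            int n = in.read(contents, offset, len - offset);
            if (n == -1) {
                throw new EOFException("Stream ended after " + offset + " of " + len + " bytes");
            }
            offset += n;
        }
        return contents;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = new byte[10_000];
        for (int i = 0; i < source.length; i++) {
            source[i] = (byte) i;
        }

        // Hypothetical stand-in for the archive stream: never returns more than 1000 bytes per call
        InputStream trickle = new ByteArrayInputStream(source) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 1000));
            }
        };

        byte[] copy = readFully(trickle, source.length);
        System.out.println("Identical: " + Arrays.equals(source, copy));
    }
}
```

A single read(buffer) call against the trickle stream would have filled only the first 1000 bytes and left the rest as zeros, which would explain why only the small 594-byte file came out intact. The JDK already offers equivalents of this loop: DataInputStream#readFully, and InputStream#readNBytes on Java 9 and later.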
 
Ron McLeod
Marshal
Thanks Tony - I'm sure that's what it is.
 
Sheriff
Which points out another (potential) bug in your code:

Ron McLeod wrote:Working Code


0 is a perfectly fine result for InputStream.read(byte[]) and does not indicate that the stream is exhausted; only -1 does. You should use >= 0, > -1 or != -1.
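As a rough illustration of the loop condition, here is a chunked copy that stops only on -1. The class name CopyUntilEof, the helper readAll, and the "stuttering" stream that returns 0 on every other call are assumptions made up for this sketch; the original code from the thread is not reproduced here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class CopyUntilEof {

    /** Copies the whole stream in chunks; only -1 is treated as end of stream. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            out.write(buffer, 0, n); // n may be 0, in which case nothing is written
        }
        return out.toByteArray();
    }

    /** Hypothetical stream that returns 0 on every other call, for demonstration only. */
    static InputStream stuttering(byte[] data) {
        return new ByteArrayInputStream(data) {
            private boolean stall = true;

            @Override
            public synchronized int read(byte[] b, int off, int len) {
                stall = !stall;
                return stall ? 0 : super.read(b, off, len);
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] source = new byte[10_000];
        for (int i = 0; i < source.length; i++) {
            source[i] = (byte) i;
        }
        byte[] copy = readAll(stuttering(source));
        System.out.println("Identical: " + Arrays.equals(source, copy));
    }
}
```

With a condition of > 0 the same loop would stop at the first stalled call, after only 4096 of the 10000 bytes had been copied.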
 
Ron McLeod
Marshal
Yikes -- I'm getting sloppy. Thanks for pointing that out.
 
Rob Spoor
Sheriff
You're welcome.
 