serializing large object (hashmap)

 
Tom Griffith
Ranch Hand
Posts: 275
Hello. If somebody has a minute, I seem to be going in circles on this one. I have a HashMap (referenced as data_map) that holds binary file contents, and I am serializing it. However, when I load test it with larger files, I run out of Java heap space...

ByteArrayOutputStream bos = new ByteArrayOutputStream();
ObjectOutputStream out = new ObjectOutputStream(bos);
out.writeObject(data_map); // the entire serialized map ends up buffered in memory here

I've been trying to chunk the hashmap (similar to, say, chunking an output stream for an HTTP connection) into the ObjectOutputStream, but it all comes back to the inability to read/write the object to any stream and/or convert it to bytes. Thank you for any input. I'll keep going at it.
[ August 22, 2007: Message edited by: Tom Griffith ]
 
Greenhorn
Posts: 26
What you are doing is essentially creating a duplicate of your large map in memory. And it is even worse than that, because the serialized version of the map is a lot larger than the map's representation on the heap.

What are you doing with the ByteArrayOutputStream after you are done? Writing it to a blob? A file? Sending it over the network? If so, I would skip the middleman of the ByteArrayOutputStream and send the object directly to the destination OutputStream.
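
For instance, a rough sketch (assuming the map is a HashMap<String, byte[]> and the destination is a socket's OutputStream; the names here are just placeholders for your real setup):

import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.net.Socket;
import java.util.HashMap;

public class DirectSend {
    // Serialize the map straight to the destination stream instead of
    // buffering the whole thing in a byte array first.
    static void send(HashMap<String, byte[]> dataMap, Socket socket) throws IOException {
        ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(socket.getOutputStream()));
        out.writeObject(dataMap);
        out.flush();
    }
}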



 
Tom Griffith
Ranch Hand
Posts: 275
Hi... thank you for reading my post. Yeah, what I am doing is using the ByteArrayOutputStream to convert the map to a byte array and then writing the resulting byte array in chunks over the network (to avoid a heap problem on that end). I guess the middleman's purpose is to bridge the HashMap object and a byte array...

ByteArrayOutputStream bos = new ByteArrayOutputStream();
ObjectOutputStream out = new ObjectOutputStream(bos);
out.writeObject(data_map);      // serialized map buffered inside bos
byte[] buf = bos.toByteArray(); // second full copy of the serialized map

Is there another way I can convert the HashMap ~object~ directly to a byte array without going through the ByteArrayOutputStream (and ObjectOutputStream)? Thank you again...
[ August 22, 2007: Message edited by: Tom Griffith ]
 
Tom Griffith
Ranch Hand
Posts: 275
Hello. What I was able to get working, although the performance is not great, is to use writeObject on the HashMap and stream it to a local temporary .dat file. This replaces the ByteArrayOutputStream middleman in memory by offloading the bytes to disk... then I open an InputStream on the .dat file and chunk the bytes across the network. I really don't see another way. Thank you again for your valuable input; it made me look at this with an eye toward eliminating the redundant stream. Any additional input on doing this more efficiently than a temp .dat file would be appreciated, but I think I've exhausted all avenues. Thank you for reading this, everybody.
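
In case it helps anybody, this is roughly the shape of what I ended up with (just a sketch; the chunk size and the networkOut stream are placeholders for my actual setup):

import java.io.*;

public class TempFileTransfer {
    // Sketch of the temp-file approach: serialize the map to disk first,
    // then stream the file across the network in fixed-size chunks.
    static void transfer(Object dataMap, OutputStream networkOut) throws IOException {
        File temp = File.createTempFile("data_map", ".dat");
        try {
            // 1) offload the serialized map to disk instead of holding it in memory
            ObjectOutputStream oos = new ObjectOutputStream(
                    new BufferedOutputStream(new FileOutputStream(temp)));
            oos.writeObject(dataMap);
            oos.close();

            // 2) chunk the bytes across the network
            InputStream in = new BufferedInputStream(new FileInputStream(temp));
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                networkOut.write(buf, 0, n);
            }
            in.close();
            networkOut.flush();
        } finally {
            temp.delete();
        }
    }
}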
 
Greenhorn
Posts: 8
Well, you can do something about the objects you have stored in the map: use custom serialization, i.e. write only those properties which are useful to you rather than writing every field of the object to the object stream, or implement Externalizable.

I could understand the situation better if I had the following information:

Which application server are you using, if any? Are you using Java Message Service queues?
Where does this map with the file contents in it come from?

regards,
 
Ranch Hand
Posts: 1970
Even if you do want to write out all fields, you can still make significant savings in serialised size by using custom serialisation. This is because default serialisation writes out stuff like field names, Java class names, etc. to the stream. That stuff can easily take up more space than the actual data! With custom serialisation, you can write just the data. Take care, though, because this reduces the chance of successfully reading data written by an old build into a new build.
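
For example, something along these lines (FileEntry is just a made-up class for illustration, not anything from your actual code):

import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

// An Externalizable class writes only the raw data it chooses to,
// so the stream skips the per-field metadata that default serialization adds.
public class FileEntry implements Externalizable {
    private String name;
    private byte[] contents;

    public FileEntry() { } // public no-arg constructor is required by Externalizable

    public FileEntry(String name, byte[] contents) {
        this.name = name;
        this.contents = contents;
    }

    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(name);
        out.writeInt(contents.length);
        out.write(contents);
    }

    public void readExternal(ObjectInput in) throws IOException {
        name = in.readUTF();
        contents = new byte[in.readInt()];
        in.readFully(contents);
    }
}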
 
Jim Yingst
Wanderer
Posts: 18671
What are the keys and values in your HashMap? It sounds like the values might be large arrays of bytes from binary files - is that the case?
 
Tom Griffith
Ranch Hand
Posts: 275
Hello. Thank you for reading, everybody. Yeah, the map uses file names as keys and the respective binary arrays as values... then at the HTTP destination, each value object (byte array) is ultimately streamed to a new file using the key value (file name). I wanted to use the map in order to allow more than one file to be transferred in a single call (although the large files, i.e. large maps, will require a chunking loop to stream to the HTTP destination)...

I'm going to look at custom serialization, but I'm not so sure it applies, because I need both the key and the value from the map...
[ August 23, 2007: Message edited by: Tom Griffith ]
 
Jim Yingst
Wanderer
Posts: 18671
Hm, I think chances are good that most of the time is being spent transmitting large arrays of bytes. Custom serialization will probably not help much in this case, since a byte array is pretty easy to serialize. It's just the volume of bytes that's the problem.

I think the reason you need to break this into chunks is because of the way ObjectOutputStream and ObjectInputStream work. They both keep internal maps of all the objects that have been written through them, which is necessary so they can detect objects that have already been written and represent them as back-references rather than serializing new copies. But your client doesn't need or want to have all your Map's contents in memory at once. So you've had to break the map into chunks. Does that sound right?

From what you've described, I think you might be best off not using serialization at all, and instead use a simple protocol with DataOutputStream and DataInputStream. E.g. the server could do something like this:
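
A rough sketch of the idea (MapSender and networkOut are placeholder names, and the count/name/length/bytes layout is just one way to do it):

import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Map;

public class MapSender {
    // Server side: write a count, then name/length/bytes for each map entry.
    static void send(Map<String, byte[]> dataMap, OutputStream networkOut) throws IOException {
        DataOutputStream dos = new DataOutputStream(new BufferedOutputStream(networkOut));
        dos.writeInt(dataMap.size());                  // how many files follow
        for (Map.Entry<String, byte[]> entry : dataMap.entrySet()) {
            dos.writeUTF(entry.getKey());              // file name (the map key)
            byte[] contents = entry.getValue();
            dos.writeInt(contents.length);             // length of this file
            dos.write(contents);                       // the raw bytes (the map value)
        }
        dos.flush();
    }
}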

And the client could do something like this:
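
Again just a sketch (MapReceiver and doSomethingWith() are placeholders for however you actually handle each file):

import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class MapReceiver {
    // Client side: read each entry back in the same order the server wrote it.
    static void receive(InputStream networkIn) throws IOException {
        DataInputStream dis = new DataInputStream(new BufferedInputStream(networkIn));
        int fileCount = dis.readInt();                 // how many files the server is sending
        for (int i = 0; i < fileCount; i++) {
            String fileName = dis.readUTF();           // the map key
            byte[] contents = new byte[dis.readInt()]; // the map value
            dis.readFully(contents);
            doSomethingWith(fileName, contents);       // process it, then let it be collected
        }
    }

    // Placeholder: write the bytes to disk, forward them, etc. Just don't keep a reference.
    static void doSomethingWith(String fileName, byte[] contents) {
        // ...
    }
}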

Here it's important that, whatever doSomethingWith() does, it avoids saving any reference to the byte[] array. That way each one can be garbage collected when you're done with it, and the required memory is only a little bigger than the largest single file you transfer.
 
Tom Griffith
Ranch Hand
Posts: 275
Hi Jim. It still seems to pose the same problem of how to set the inputstream (and subsequently, the DataInputStream) on the map object. I still think I would have to convert the map to bytes first in order to stream it. Is that right?...
[ August 24, 2007: Message edited by: Tom Griffith ]
 
Jim Yingst
Wanderer
Posts: 18671
I don't think I understand the question. That code does convert all the data in the map into bytes. The call to dos.writeUTF() converts the file names to bytes, and the byte arrays are already in bytes. I don't know what "how to set the inputstream (and subsequently, the DataInputStream) on the map object" means.
 
Tom Griffith
Ranch Hand
Posts: 275
Hi Jim. I think I'm the one that's kind of confused. I'll really dig into integrating data streams into this today and see how it pans out. I've used them before (in more primitive times) to force XML into services. Thank you.
[ August 27, 2007: Message edited by: Tom Griffith ]