Can I hold 6 GB of data in a String? How much can a String hold? My hands are tied: I have no option right now other than passing the data as a String, because I am dealing with vendor code.
Ernest Friedman-Hill
author and iconoclast
Marshal
Many 32-bit Java implementations limit the Java heap to under 2 GB, so to even get that much data in memory you'd need a 64-bit machine and a 64-bit JVM. Unfortunately, even then, because String is implemented using a char[], and Java array indices are ints, I believe Integer.MAX_VALUE is the greatest number of characters a String can store, i.e., about 2 billion characters (or 4 GB of data, since a char is two bytes).
I believe that there is an earlier limit. Not sure if this has been fixed in the latest JVM, but there is a 2 gig limit per object, so a char[] can only be 2 gig in size.
Either way though, Integer.MAX_VALUE is the largest number of elements any array can have, and that's definite. So 4 GB is the largest any char[] could ever be, regardless of possible bug fixes that may or may not affect a lower limit. A 6 GB String will never be possible.
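To put numbers on that, here is a minimal sketch of the arithmetic, assuming the char[]-backed String described above. The class and method names (StringLimit, maxChars, maxBytes, fitsInOneString) are made up for illustration and are not part of any library.

```java
public class StringLimit {
    // Upper bound on array length: array indices are ints.
    static long maxChars() {
        return Integer.MAX_VALUE;
    }

    // Bytes needed for that many chars, at two bytes per char.
    static long maxBytes() {
        return maxChars() * Character.BYTES;
    }

    // True if a String of charCount characters is even theoretically possible.
    static boolean fitsInOneString(long charCount) {
        return charCount <= maxChars();
    }

    public static void main(String[] args) {
        long sixBillionChars = 6L * 1024 * 1024 * 1024; // 6 GB of single-byte text
        System.out.println("max chars: " + maxChars());
        System.out.println("max bytes: " + maxBytes());
        System.out.println("6 GB of chars fits: " + fitsInOneString(sixBillionChars));
    }
}
```

This only gives the theoretical ceiling. Whether the JVM can actually allocate an array that large depends on the heap settings and the implementation.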
1.5 GB for JDK 1.3, but still limited to 2.0 GB on Windows unless you run Server 2003 or later, last time I checked.
I still don't think it changes the fact that you'll have trouble getting 6 GB into memory. You could store it as a huge compressed byte array, but is this really an improvement?
Yes, you have 12 GB of RAM, but you won't be able to get a single Java process to see all of that at once. I'm not sure where 'half the size' comes from, but if it isn't stored as a String (e.g., as Huffman codes) it can still be streamed, but you won't be able to do simple String operations on it. This is why I was asking whether simply getting it into memory is an improvement.
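If you want to see how much heap your Java process can actually use, something like the following rough sketch may help. Runtime.maxMemory() is a standard call; the class and method names (HeapCheck, maxHeapBytes, couldHold) are hypothetical, and the check is only a coarse upper bound since it ignores everything else already on the heap.

```java
public class HeapCheck {
    // Maximum heap the JVM will attempt to use, in bytes.
    // Returns Long.MAX_VALUE if there is no inherent limit.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Coarse check: could a payload of this size fit under the heap ceiling at all?
    static boolean couldHold(long payloadBytes) {
        return payloadBytes < maxHeapBytes();
    }

    public static void main(String[] args) {
        long sixGb = 6L * 1024 * 1024 * 1024;
        System.out.println("max heap bytes: " + maxHeapBytes());
        System.out.println("room for 6 GB: " + couldHold(sixGb));
    }
}
```

The ceiling is normally raised with the -Xmx option when starting the JVM, subject to the platform limits discussed above.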
Adam Teg
Ranch Hand
Joined: Jul 10, 2007
Posts: 58
Thanks to all. But to answer Pat's question, I can't read the file in chunks because I need to pass the entire file to the vendor code so that it can do some manipulations on it. But 2 GB should be fine. I was just running some tests and wondered how much data it would take to blow things up.
Also, keep in mind that your vendor code probably wasn't written with a 2 gig String in mind. A few operations like concats, upcases, and trims are all it takes to fill up your heap.
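The reason is that Strings are immutable, so each of those operations hands back a new String instead of changing the original in place. A small sketch of the effect on a tiny input follows. The class name CopyDemo and the helper peakChars are made up for illustration, and what the vendor code really does is unknown.

```java
public class CopyDemo {
    // Rough count of chars alive at once if the original and
    // liveCopies full-size derived Strings are all still referenced.
    static long peakChars(long originalChars, int liveCopies) {
        return originalChars * (1L + liveCopies);
    }

    public static void main(String[] args) {
        String original = "  some data  ";
        String upper = original.toUpperCase();   // new String
        String trimmed = original.trim();        // new String
        String joined = original.concat("more"); // new String

        // None of the results is the same object as the original.
        System.out.println(upper != original);
        System.out.println(trimmed != original);
        System.out.println(joined != original);

        // A 2-billion-char String plus two full-size copies:
        System.out.println(peakChars(2_000_000_000L, 2) + " chars live at once");
    }
}
```

With a String near the 2 billion character limit, even one extra full-size copy roughly doubles the memory needed.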