6GB data as String

Adam Teg
Ranch Hand

Joined: Jul 10, 2007
Posts: 58
Can I hold 6 GB of data in a String? How much can a String hold? My hands are tied: I am dealing with vendor code, and for now I have no option other than passing the data as a String.
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24187
    

Many 32-bit Java implementations limit the Java heap to under 2GB, so to even get that much data in memory, you'd need a 64-bit machine and a 64-bit JVM. Unfortunately, even then, because String is implemented using a char[], and Java array indices are ints, I believe Integer.MAX_VALUE is the greatest number of characters a String can store -- i.e., about 2 billion characters (or 4GB of data, since a char is two bytes).
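
To make the arithmetic concrete, here is a minimal sketch. It assumes the char[]-backed String described above, at two bytes per char; the class and method names are made up for illustration, and real JVMs may cap array length slightly below Integer.MAX_VALUE.

```java
// Back-of-the-envelope limits for a char[]-backed String.
// StringLimits and its methods are hypothetical names, not a JDK API.
final class StringLimits {
    // Array indices are ints, so no array can have more elements than this.
    static final long MAX_CHARS = Integer.MAX_VALUE;

    /** Bytes needed to hold the given number of 2-byte chars. */
    static long bytesForChars(long chars) {
        return chars * Character.BYTES;
    }

    /** True if that many elements could be indexed by a single Java array. */
    static boolean fitsInOneArray(long elements) {
        return elements >= 0 && elements <= MAX_CHARS;
    }

    public static void main(String[] args) {
        long sixGb = 6L * 1024 * 1024 * 1024;
        System.out.println("Max chars in one array: " + MAX_CHARS);
        System.out.println("Bytes for that many chars: " + bytesForChars(MAX_CHARS));
        // Even at one char per byte of input, 6GB is more elements than an int can index.
        System.out.println("6GB fits in one array: " + fitsInOneArray(sixGb));
    }
}
```

The second line printed is 4294967294, i.e. just under 4GB, which is where the "4GB of data" figure comes from.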


David O'Meara
Rancher

Joined: Mar 06, 2001
Posts: 13459

Ernest++
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18896
    

I believe that there is an earlier limit. I'm not sure whether this has been fixed in the latest JVM, but there is a 2 gig limit per object -- so a char[] can only be 2 gig in size.

Henry


Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Either way though, Integer.MAX_VALUE is the largest number of elements any array can have, and that's definite. So 4GB is the largest any char[] could ever be, regardless of possible bug fixes that may or may not affect a lower limit. A 6 GB string will never be possible.


"I'm not back." - Bill Harding, Twister
David O'Meara
Rancher

Joined: Mar 06, 2001
Posts: 13459

1.5GB for JDK 1.3, but still limited to 2.0GB on Windows unless you run Server 2003 or later, last time I checked.

I still don't think it changes the fact that you'll have trouble getting 6GB into memory. You could store it as a huge compressed byte array, but is this really an improvement?
Pat Farrell
Rancher

Joined: Aug 11, 2007
Posts: 4659
    

What do you mean by "hands are tied"?
In general 6GB of anything in memory is a bad idea.

I just paid $6000 for a server with 12GB of RAM, but even on that, I would not try to load 6GB into memory.

May I ask why you want it in a String? If it's raw binary data, a byte[] would make more sense, and be only half the size.

But a better design would be to read a portion of it, say 100KB and process the chunks.
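
A minimal sketch of what chunked reading could look like, assuming the data lives in a file and can be handled one piece at a time. ChunkedReader, processInChunks and handleChunk are hypothetical names, and handleChunk is only a placeholder for the real per-chunk work.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Reads a stream in fixed-size chunks so only one chunk is in memory at a time.
final class ChunkedReader {
    static final int CHUNK_SIZE = 100 * 1024; // the 100KB suggested above

    /** Feeds the stream to handleChunk piece by piece; returns total bytes read. */
    static long processInChunks(InputStream in, int chunkSize) throws IOException {
        byte[] buffer = new byte[chunkSize];
        long total = 0;
        int n;
        // read() may return fewer bytes than the buffer holds; -1 means end of stream.
        while ((n = in.read(buffer)) != -1) {
            handleChunk(buffer, n);
            total += n;
        }
        return total;
    }

    /** Placeholder (assumption): real code would parse or transform buffer[0..length). */
    static void handleChunk(byte[] buffer, int length) {
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path of the file to process.
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println("Processed " + processInChunks(in, CHUNK_SIZE) + " bytes");
        }
    }
}
```

Memory use stays at roughly one buffer regardless of file size, which only helps if the consumer of the data can actually work on pieces.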
David O'Meara
Rancher

Joined: Mar 06, 2001
Posts: 13459

Yes, you have 12GB of RAM, but you won't be able to get a single Java process to see all of that at once. I'm not sure where 'half the size' comes from, but if it isn't stored as a String (e.g. as Huffman codes) it can still be streamed, though you won't be able to do simple String operations on it. This is why I was asking whether simply getting it into memory is an improvement.
Adam Teg
Ranch Hand

Joined: Jul 10, 2007
Posts: 58
Thanks to all. But to answer Pat's question, I can't read the file in chunks because I need to pass the entire file to the vendor code so that it can do some manipulations on it. But 2 GB should be fine. I was just running some tests and wondered how much it would take to blow things up.

-Thanks again
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18896
    

Also, keep in mind that your vendor code likely wasn't written with a 2 gig string in mind. A few operations like concats, upcases, trims, etc. are all it takes to fill up your heap.
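
As a rough illustration of why: Strings are immutable, so an operation that changes the contents has to produce a new String while the original is still alive. The sketch below assumes two bytes per char and uses made-up names (CopyCost, peakBytes); it estimates the footprint rather than measuring it.

```java
// Rough estimate of heap needed when String operations produce full copies.
// CopyCost and peakBytes are hypothetical names for illustration only.
final class CopyCost {
    /** Approximate bytes held when liveCopies strings of the given length coexist. */
    static long peakBytes(long chars, int liveCopies) {
        return chars * Character.BYTES * liveCopies;
    }

    public static void main(String[] args) {
        String original = "vendor payload";
        String upper = original.toUpperCase(); // changed contents, so a new String
        String joined = original.concat("!");  // concat with non-empty text also copies

        // The original is untouched; each result is a separate object.
        System.out.println(original + " / " + upper + " / " + joined);

        long twoGigOfChars = 1L << 30; // 2^30 chars at 2 bytes each is 2GB
        System.out.println("One upcase of a 2GB string needs about "
                + peakBytes(twoGigOfChars, 2) + " bytes live at once");
    }
}
```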

Henry
 