At first I thought it was because all String objects are referenced from the String Constant Pool and so a lot of memory would be used up if you had lots of String manipulation to do. But after reading this article on Strings and String literals I understand that it is only String Literals that are referenced from the String Constant Pool and that this is all done at compile time, whereas if a string is created at runtime and the reference to it is lost, then it will be eligible for Garbage Collection just like any other object.
So is it just that if you did loads of String manipulation work then there would be so many discarded objects that memory would be used up and Garbage Collection would be forced to run?
I'm also a bit confused by a paragraph in K&B SCJP 6 that says:
String objects are immutable, so if you chose to do a lot of manipulations with String objects, you will end up with a lot of abandoned String objects in the String pool. (Even in these days of gigabytes of RAM, it's not a good idea to waste precious memory on discarded String pool objects.) On the other hand, objects of type StringBuffer and StringBuilder can be modified over and over again without leaving behind a great effluence of discarded String objects.
It is the bit that says "you will end up with a lot of abandoned String objects in the String pool" that is confusing me. I thought that only literals were referenced from the String Pool (or is that different from String Literal Pool?). I read this post on Garbage Collection for Strings and Joe Ess says that
The literal pool cannot get "full". The compiler creates it by pulling the literals out of your code, so it is a set size when you start your program. It neither grows nor shrinks.
So as I understand it, any Strings created at runtime are not referenced from the String Literal Pool, but instead are discarded and eligible for GC if they are abandoned. So the only advantage of using the StringBuilder/Buffer classes is that you save the GC from having to clean up memory, because the StringBuilder/Buffer objects use the same memory over and over again and you don't end up with so much wasted memory that needs GC to free it. So it is more efficient to use StringBuilder/Buffer because GC doesn't have to run as often. But it is *not* true that the String Literal Pool will get bigger and bigger, because the String Literal Pool references are all created at compile time.
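To make the literal-versus-runtime distinction concrete, here is a small sketch. The class and method names are made up for illustration. It relies only on documented behaviour: identical literals refer to the same object, `new String(...)` always creates a separate object, and `intern()` returns the pooled instance.

```java
class StringPoolDemo {
    // Two identical literals refer to the same pooled object.
    static boolean literalsAreSameObject() {
        String a = "hello";
        String b = "hello";
        return a == b;
    }

    // A String created at runtime is a separate object, even with equal content.
    static boolean runtimeStringIsSameObject() {
        String a = "hello";
        String c = new String("hello");
        return a == c;
    }

    // intern() hands back the pooled instance for that content.
    static boolean internedIsSameObject() {
        String a = "hello";
        String c = new String("hello");
        return a == c.intern();
    }

    public static void main(String[] args) {
        System.out.println(literalsAreSameObject());     // true
        System.out.println(runtimeStringIsSameObject()); // false
        System.out.println(internedIsSameObject());      // true
    }
}
```

The runtime-created String here is an ordinary object: once nothing refers to it, it can be garbage collected like any other.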
It used to be that string concatenation was a relatively expensive process in that for each concatenation a new string was created. String concatenation now uses StringBuilder internally, so it's nowhere near as big of an issue as it used to be, although I haven't actually benchmarked it in a long time.
Especially if you concatenate strings in a loop, it's more efficient to use StringBuilder than to concatenate String objects with +. Look at the following code:
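The snippet itself is not shown in this copy of the thread, so here is a minimal sketch of the kind of loop meant. The class name, the method name and the choice of 10 iterations (the count used later in the thread) are assumptions for illustration.

```java
class ConcatInLoop {
    // Builds a string by concatenating with += on every iteration.
    static String build(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result += i; // creates a new String each time round
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(build(10)); // prints 0123456789
    }
}
```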
The Java compiler translates this code to something like this:
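As a rough sketch only (the exact code the compiler generates varies by compiler and version, and the names here are hypothetical), a loop that does `result += i` behaves as if it had been written like this:

```java
class ConcatInLoopExpanded {
    // Hand-written equivalent of "result += i" inside a loop.
    static String build(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            // A fresh StringBuilder per iteration, seeded with a copy of result.
            StringBuilder tmp = new StringBuilder(result);
            tmp.append(i);
            // The builder's content is copied again into a new String.
            result = tmp.toString();
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(build(10)); // prints 0123456789
    }
}
```

Compilers from Java 9 onwards emit an invokedynamic call instead of explicit StringBuilder code, so treat this purely as an illustration of the copying involved.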
Note what happens inside the loop: for every iteration, a new StringBuilder object is created, and each time it is initialized with the content of result at that moment. That means that in each iteration, the content of result is copied into the new StringBuilder. At the end of each iteration, the content of that StringBuilder is copied again into a new String object, which is then assigned to result. There's a lot of unnecessary copying going on.
If you write it using a StringBuilder outside the loop yourself, you can make it much more efficient:
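A minimal sketch of that version, again assuming a simple loop that appends the numbers 0 to n-1 (class and method names are made up for illustration):

```java
class BuilderOutsideLoop {
    // One StringBuilder is created up front and reused for every append.
    static String build(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i); // appends in place, no intermediate String
        }
        return sb.toString(); // single copy into a String at the end
    }

    public static void main(String[] args) {
        System.out.println(build(10)); // prints 0123456789
    }
}
```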
Now, only one StringBuilder object needs to be created, and there is no copying of the whole content of the StringBuilder at all in the loop.
David Newton wrote: String concatenation now uses StringBuilder internally, so it's nowhere near as big of an issue as it used to be, although I haven't actually benchmarked it in a long time.
It has always worked like that (except that previously, StringBuffer was used instead of StringBuilder). But this in itself doesn't solve the inefficiency problem when concatenating strings in a loop.
Because String objects are immutable, a new String object has to be created whenever you want to change the string value. Say you want the 'x' gone from a String object with the value 'rrrrrrrrxrrrrrr': you have to read the r's before it and the r's after it, concatenate them, and store the result in a new String object. String's methods that appear to change the string do not; they return a new String object with the result. The original string value is still the same. If you use a StringBuilder, you can actually remove the x from the value, because the value is not immutable. That saves time and new String objects.
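A small sketch of the difference, using the 'rrrrrrrrxrrrrrr' value from above. The class and method names are hypothetical; `substring`, `indexOf` and `deleteCharAt` are standard methods of String and StringBuilder.

```java
class RemoveCharDemo {
    // With String: the pieces around the character are copied into a new String.
    static String removeWithString(String s, char ch) {
        int i = s.indexOf(ch);
        if (i < 0) {
            return s; // character not present, nothing to remove
        }
        return s.substring(0, i) + s.substring(i + 1);
    }

    // With StringBuilder: the character is deleted from the existing buffer.
    static void removeInPlace(StringBuilder sb, char ch) {
        int i = sb.indexOf(String.valueOf(ch));
        if (i >= 0) {
            sb.deleteCharAt(i);
        }
    }

    public static void main(String[] args) {
        String original = "rrrrrrrrxrrrrrr";
        String changed = removeWithString(original, 'x');
        System.out.println(original.contains("x")); // true: original is untouched
        System.out.println(changed.contains("x"));  // false: a new String was returned

        StringBuilder sb = new StringBuilder(original);
        removeInPlace(sb, 'x');
        System.out.println(sb.indexOf("x")); // -1: the same builder was modified
    }
}
```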
Great. Thanks very much to everyone for your answers.
Jesper, can I just ask: when you say that appending to a StringBuilder is more efficient in your examples, do you mean that it is more efficient because it takes more time to create new String objects than it does to append to the StringBuilder, or because less memory is being used, or is it both?
Joe Lemmer wrote: Jesper, can I just ask: when you say that appending to a StringBuilder is more efficient in your examples, do you mean that it is more efficient because it takes more time to create new String objects than it does to append to the StringBuilder, or because less memory is being used, or is it both?
Look carefully at the second and third code snippets in my post.
In the second, a new StringBuilder is created for every iteration of the loop. The content of the string has to be copied into that StringBuilder, and at the end of each iteration the content of the StringBuilder is copied into a new string. So it's copying the whole content of the string twice per iteration. With 10 iterations, it's creating 10 StringBuilders and copying the content 20 times.
In the third code snippet, all of that copying is eliminated. It only needs to create one StringBuilder object, and no back-and-forth copying is necessary in the loop at all; only at the end does it copy the StringBuilder content to the String.