String concatenation: concat() vs "+" operator
Question
Assuming String a and b:
a += b
a = a.concat(b)
Under the hood, are they the same thing?
Here is concat decompiled as reference. I'd like to be able to decompile the +
operator as well to see what that does.
public String concat(String s) {
    int i = s.length();
    if (i == 0) {
        return this;
    }
    else {
        char ac[] = new char[count + i];
        getChars(0, count, ac, 0);
        s.getChars(0, i, ac, count);
        return new String(0, count + i, ac);
    }
}
Accepted Answer
No, not quite.
Firstly, there's a slight difference in semantics. If a is null, then a.concat(b) throws a NullPointerException, but a += b treats the original value of a as if it were the string "null". Furthermore, the concat() method only accepts String values, while the + operator silently converts its argument to a String (using the toString() method for objects). So the concat() method is stricter in what it accepts.
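A minimal sketch of the two semantic differences described above. The class and method names (ConcatSemantics, plus, viaConcat) are made up for illustration and appear nowhere in the original question.

```java
// Illustrative only: ConcatSemantics, plus and viaConcat are hypothetical names.
class ConcatSemantics {

    // The + operator: a null reference is rendered as the text "null",
    // and non-String operands are converted to their string form.
    static String plus(String a, Object b) {
        return a + b;
    }

    // String.concat: throws NullPointerException if either side is null,
    // and only accepts a String argument.
    static String viaConcat(String a, String b) {
        return a.concat(b);
    }

    public static void main(String[] args) {
        String a = null;
        System.out.println(plus(a, "b"));   // prints "nullb"
        System.out.println(plus("n=", 42)); // prints "n=42"; "n=".concat(42) would not compile
        try {
            viaConcat(a, "b");
        } catch (NullPointerException e) {
            System.out.println("concat threw NullPointerException");
        }
    }
}
```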
To look under the hood, write a simple class with a += b;
public class Concat {
    String cat(String a, String b) {
        a += b;
        return a;
    }
}
Now disassemble with javap -c (included in the Sun JDK). You should see a listing including:
java.lang.String cat(java.lang.String, java.lang.String);
  Code:
   0: new #2; //class java/lang/StringBuilder
   3: dup
   4: invokespecial #3; //Method java/lang/StringBuilder."<init>":()V
   7: aload_1
   8: invokevirtual #4; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   11: aload_2
   12: invokevirtual #4; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   15: invokevirtual #5; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
   18: astore_1
   19: aload_1
   20: areturn
So, a += b is the equivalent of
a = new StringBuilder()
    .append(a)
    .append(b)
    .toString();
For a single pair of strings, the concat method should be faster. However, when more strings are joined, the StringBuilder approach wins, at least in terms of performance.
The source code of String and StringBuilder (and its package-private base class) is available in src.zip of the Sun JDK. You can see that you are building up a char array (resizing as necessary) and then throwing it away when you create the final String. In practice memory allocation is surprisingly fast.
Update: As Pawel Adamski notes, performance has changed in more recent HotSpot. javac still produces exactly the same code, but the bytecode compiler (HotSpot's JIT) cheats. Simple testing fails entirely because the whole body of code is thrown away. Summing System.identityHashCode (not String.hashCode) shows the StringBuffer code has a slight advantage. Subject to change when the next update is released, or if you use a different JVM. From @lukaseder, a list of HotSpot JVM intrinsics.
Niyaz is correct, but it's also worth noting that the special + operator can be converted into something more efficient by the Java compiler. Java has a StringBuilder class which represents a non-thread-safe, mutable String. When performing a bunch of String concatenations, the Java compiler silently converts
String a = b + c + d;
into
String a = new StringBuilder(b).append(c).append(d).toString();
which for large strings is significantly more efficient. As far as I know, this does not happen when you use the concat method.
However, the concat method is more efficient when concatenating an empty String onto an existing String. In this case, the JVM does not need to create a new String object and can simply return the existing one. See the concat documentation to confirm this.
So if you're super-concerned about efficiency then you should use the concat method when concatenating possibly-empty Strings, and use + otherwise. However, the performance difference should be negligible and you probably shouldn't ever worry about this.
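The empty-string case above can be checked directly, since the String.concat documentation states that the receiver itself is returned when the argument has length 0. A small sketch (EmptyConcat and returnsSameInstance are hypothetical names used only for this illustration):

```java
// Illustrative only: EmptyConcat and returnsSameInstance are hypothetical names.
class EmptyConcat {

    // True when concat("") hands back the very same object rather than a copy.
    static boolean returnsSameInstance(String s) {
        return s.concat("") == s; // reference comparison is intentional here
    }

    public static void main(String[] args) {
        // new String(...) guarantees a distinct, non-interned instance to test with.
        String s = new String("abc");
        System.out.println(returnsSameInstance(s)); // prints "true"
    }
}
```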
I ran a similar test as @marcio but with the following loop instead:
String c = a;
for (long i = 0; i < 100000L; i++) {
    c = c.concat(b); // make sure javac cannot skip the loop
    // using c += b for the alternative
}
Just for good measure, I threw in StringBuilder.append() as well. Each test was run 10 times, with 100k reps for each run. Here are the results:
- StringBuilder wins hands down. The clock time result was 0 for most of the runs, and the longest took 16ms.
- a += b takes about 40000ms (40s) for each run.
- concat only requires 10000ms (10s) per run.
I haven't decompiled the class to see the internals or run it through a profiler yet, but I suspect a += b spends much of the time creating new StringBuilder objects and then converting them back to String.
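To make that suspicion concrete, here is a rough sketch of the two loop shapes, assuming the StringBuilder translation of += that the javap listing earlier in the thread shows. The names LoopConcat, plusEquals and singleBuilder are invented for this example, and it illustrates the amount of copying rather than serving as a benchmark.

```java
// Illustrative only: LoopConcat, plusEquals and singleBuilder are hypothetical names.
class LoopConcat {

    // What "c += b" inside a loop amounts to under the StringBuilder translation:
    // a fresh builder, a full copy of c, and a new String on every pass.
    static String plusEquals(String a, String b, int reps) {
        String c = a;
        for (int i = 0; i < reps; i++) {
            c = new StringBuilder().append(c).append(b).toString();
        }
        return c;
    }

    // One builder for the whole loop: each pass appends only b,
    // and the String is created once at the end.
    static String singleBuilder(String a, String b, int reps) {
        StringBuilder sb = new StringBuilder(a);
        for (int i = 0; i < reps; i++) {
            sb.append(b);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String x = plusEquals("a", "b", 1000);
        String y = singleBuilder("a", "b", 1000);
        System.out.println(x.equals(y)); // prints "true"
        System.out.println(x.length());  // prints "1001"
    }
}
```

Both methods return the same text; the first copies the ever-growing string on each iteration, which is why its cost grows much faster with the number of repetitions.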
Most answers here are from 2008. It looks like things have changed over time. My latest benchmarks, made with JMH, show that on Java 8 + is around two times faster than concat.
My benchmark:
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;

@Warmup(iterations = 5, time = 200, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 5, time = 200, timeUnit = TimeUnit.MILLISECONDS)
public class StringConcatenation {

    @org.openjdk.jmh.annotations.State(Scope.Thread)
    public static class State2 {
        public String a = "abc";
        public String b = "xyz";
    }

    @org.openjdk.jmh.annotations.State(Scope.Thread)
    public static class State3 {
        public String a = "abc";
        public String b = "xyz";
        public String c = "123";
    }

    @org.openjdk.jmh.annotations.State(Scope.Thread)
    public static class State4 {
        public String a = "abc";
        public String b = "xyz";
        public String c = "123";
        public String d = "[email protected]#";
    }

    @Benchmark
    public void plus_2(State2 state, Blackhole blackhole) {
        blackhole.consume(state.a+state.b);
    }

    @Benchmark
    public void plus_3(State3 state, Blackhole blackhole) {
        blackhole.consume(state.a+state.b+state.c);
    }

    @Benchmark
    public void plus_4(State4 state, Blackhole blackhole) {
        blackhole.consume(state.a+state.b+state.c+state.d);
    }

    @Benchmark
    public void stringbuilder_2(State2 state, Blackhole blackhole) {
        blackhole.consume(new StringBuilder().append(state.a).append(state.b).toString());
    }

    @Benchmark
    public void stringbuilder_3(State3 state, Blackhole blackhole) {
        blackhole.consume(new StringBuilder().append(state.a).append(state.b).append(state.c).toString());
    }

    @Benchmark
    public void stringbuilder_4(State4 state, Blackhole blackhole) {
        blackhole.consume(new StringBuilder().append(state.a).append(state.b).append(state.c).append(state.d).toString());
    }

    @Benchmark
    public void concat_2(State2 state, Blackhole blackhole) {
        blackhole.consume(state.a.concat(state.b));
    }

    @Benchmark
    public void concat_3(State3 state, Blackhole blackhole) {
        blackhole.consume(state.a.concat(state.b.concat(state.c)));
    }

    @Benchmark
    public void concat_4(State4 state, Blackhole blackhole) {
        blackhole.consume(state.a.concat(state.b.concat(state.c.concat(state.d))));
    }
}
Results:
Benchmark                             Mode  Cnt         Score         Error  Units
StringConcatenation.concat_2         thrpt   50  24908871.258 ± 1011269.986  ops/s
StringConcatenation.concat_3         thrpt   50  14228193.918 ±  466892.616  ops/s
StringConcatenation.concat_4         thrpt   50   9845069.776 ±  350532.591  ops/s
StringConcatenation.plus_2           thrpt   50  38999662.292 ± 8107397.316  ops/s
StringConcatenation.plus_3           thrpt   50  34985722.222 ± 5442660.250  ops/s
StringConcatenation.plus_4           thrpt   50  31910376.337 ± 2861001.162  ops/s
StringConcatenation.stringbuilder_2  thrpt   50  40472888.230 ± 9011210.632  ops/s
StringConcatenation.stringbuilder_3  thrpt   50  33902151.616 ± 5449026.680  ops/s
StringConcatenation.stringbuilder_4  thrpt   50  29220479.267 ± 3435315.681  ops/s
Tom is correct in describing exactly what the + operator does. It creates a temporary StringBuilder, appends the parts, and finishes with toString().
However, all of the answers so far are ignoring the effects of HotSpot runtime optimizations. Specifically, these temporary operations are recognized as a common pattern and are replaced with more efficient machine code at run-time.
@marcio: You've created a micro-benchmark; with modern JVMs this is not a valid way to profile code.
The reason run-time optimization matters is that many of these differences in code -- even including object-creation -- are completely different once HotSpot gets going. The only way to know for sure is profiling your code in situ.
Finally, all of these methods are in fact incredibly fast. This might be a case of premature optimization. If you have code that concatenates strings a lot, the way to get maximum speed probably has nothing to do with which operators you choose and everything to do with the algorithm you're using!
How about some simple testing? Used the code below:
long start = System.currentTimeMillis();
String a = "a";
String b = "b";
for (int i = 0; i < 10000000; i++) { // ten million times
    String c = a.concat(b);
}
long end = System.currentTimeMillis();
System.out.println(end - start);
- The "a + b" version executed in 2500ms.
- The a.concat(b) version executed in 1200ms.
Tested several times. The concat() version took half the time on average.
This result surprised me because the concat() method creates a new string for every non-empty argument (it returns a "new String(result)"). It's well known that:
String a = new String("a"); // more than 20 times slower than String a = "a"
Why wasn't the compiler capable of optimizing the string creation in the "a + b" code, knowing that it always results in the same string? It could avoid creating a new string. If you don't believe the statement above, test it for yourself.
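One likely part of the explanation: javac only folds a + b at compile time when it is a constant expression in the Java Language Specification's sense, meaning literals or final variables initialized with constants. The a and b in the test above are ordinary local variables, so the concatenation happens at run time. The sketch below illustrates the distinction using reference comparison against the interned literal "ab"; ConstantFolding and its method names are made up for this example.

```java
// Illustrative only: ConstantFolding and its methods are hypothetical names.
class ConstantFolding {

    // Two literals: folded by javac into the single constant "ab".
    static boolean literalsFolded() {
        return ("a" + "b") == "ab"; // reference comparison is intentional
    }

    // final locals initialized with literals are constant variables,
    // so a + b is still a compile-time constant.
    static boolean finalLocalsFolded() {
        final String a = "a";
        final String b = "b";
        return (a + b) == "ab";
    }

    // Non-final locals: not a constant expression, so the string is
    // built at run time and is a different object from the literal.
    static boolean nonFinalLocalsFolded() {
        String a = "a";
        String b = "b";
        return (a + b) == "ab";
    }

    public static void main(String[] args) {
        System.out.println(literalsFolded());       // prints "true"
        System.out.println(finalLocalsFolded());    // prints "true"
        System.out.println(nonFinalLocalsFolded()); // prints "false"
    }
}
```

Declaring a and b as final in the timing loop would therefore change what is being measured, since no concatenation would be left to perform at run time.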