Gccgo in 2019
Faster, but still yielding (much) slower code than the standard compiler
Back in 2013, when Go 1.2 was on the cusp of release, Dave Cheney benchmarked gccgo against the standard Go compiler, gc. The results were rather disappointing. Code produced by gccgo was much slower than code produced by gc for all but the most CPU-bound of workloads.
It’s been five years, and much has changed. It’s high time for these benchmarks to be updated for the Go 1.11 era. Gccgo has learned some new tricks, like escape analysis, but gc has seen a continual stream of improvements, from both the Go team and the community. (Gccgo sees activity from just a handful of folks.)
The present, in summary
To enable comparison with Dave’s results, I ran the same “go1” benchmark suite[1] under Go 1.11. The results are below.
There’s a lot more red on the board this time around. Gccgo used to eke out a
win on a half dozen of the CPU-intensive benchmarks, but in the last five years
gc has closed the gap. The only remaining benchmark where gccgo has the upper
hand is Fannkuch11, and it’s a very small margin at that.
What’s happened is that, while both compilers are improving, gc is improving faster than gccgo, and so gccgo looks worse in comparison. For proof, we can compare gccgo against itself. Here’s today’s gccgo compared to gccgo 4.9:
This figure paints a much rosier picture. Gccgo is, indeed, improving. Today’s version of gccgo results in double-digit improvements over 2013’s gccgo for most go1 benchmarks. That’s actually quite the achievement, given how resource-constrained gccgo development appears to be.
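For readers who want to reproduce this kind of comparison, the percent change between two benchmark runs can be computed from their ns/op values. The following is a minimal sketch; the function name `delta` and the timings in `main` are hypothetical and are not taken from the data in this post.

```go
package main

import "fmt"

// delta returns the percent change in ns/op from the old run to the new run.
// Negative values mean the new build is faster; positive values mean slower.
// (Hypothetical helper for illustration only.)
func delta(oldNs, newNs float64) float64 {
	return (newNs - oldNs) / oldNs * 100
}

func main() {
	// Made-up timings, not measurements from this post.
	fmt.Printf("%+.1f%%\n", delta(200, 150)) // a speedup
	fmt.Printf("%+.1f%%\n", delta(100, 150)) // a slowdown
}
```

A benchmark that drops from 200 ns/op to 150 ns/op shows up as a 25% improvement, while one that climbs from 100 ns/op to 150 ns/op shows up as a 50% regression.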
The present, in detail
But what’s up with those six benchmarks that have gotten slower? It’s not
immediately clear whether gccgo is to blame. Since we’re comparing benchmark
results across Go versions, we’re not just measuring changes to the compiler;
we’re also measuring changes to the runtime and standard library. The
performance degradation in HTTPClientServer, for example, could just as easily
be the result of a change to the net/http package as a change to the gccgo
compiler.
In fact, it’s nearly impossible to isolate just the compiler improvements, as each Go compiler is tightly coupled to its contemporaneous runtime and standard library. But we can extract at least a fuller picture by comparing the evolution of gccgo performance to the evolution of gc performance. I want to take a look at two representative examples.
It turns out that the first benchmark,
HTTPClientServer, has gotten slower
with gc, too. As unfortunate as this is, it’s reassuring evidence that
there is nothing particularly wrong with gccgo. I suspect that a bug, or a
series of bugs, was discovered in the runtime or
net/http whose solution(s)
forced a performance regression.[2] Performance regressions in the standard Go
toolchain do not go unnoticed for long, so it is likely that this regression
The results for the
BinaryTree17 benchmark, on the other hand, are downright
strange. Gc managed a nearly 25% speedup on this workload, while gccgo yielded a
50% slowdown over the same time frame. There is clearly something in gccgo to
investigate here, especially considering that the benchmark makes use of no
standard library features (proof). Since the benchmark does
depend heavily on the garbage collector and memory allocator, I suspect that
something’s gone amiss in gccgo’s runtime.
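To give a feel for why a workload like this stresses the allocator and garbage collector rather than the standard library, here is a minimal sketch of a binary-tree allocation loop. It is not the source of the go1 BinaryTree17 benchmark; the names `node`, `build`, and `count`, as well as the depth and iteration count, are assumptions chosen for illustration.

```go
package main

import "fmt"

// node is a binary tree node. Each node returned by build is a separate
// heap allocation, so building a tree exercises the memory allocator.
type node struct{ left, right *node }

// build allocates a complete binary tree of the given depth.
func build(depth int) *node {
	if depth == 0 {
		return &node{}
	}
	return &node{left: build(depth - 1), right: build(depth - 1)}
}

// count walks the tree and returns the number of nodes it contains.
func (n *node) count() int {
	if n == nil {
		return 0
	}
	return 1 + n.left.count() + n.right.count()
}

func main() {
	total := 0
	for i := 0; i < 100; i++ {
		// Each tree becomes garbage as soon as it has been counted,
		// which keeps the garbage collector busy.
		total += build(10).count()
	}
	fmt.Println(total)
}
```

Nearly all of the time in a program like this goes to allocating small objects and collecting them again, so a slowdown is more likely to point at the runtime than at code generation.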
I’ve begun gathering a list of the known performance bottlenecks as a starting point for investigation of these performance problems. If you know of additional bottlenecks, or know of code that behaves particularly pathologically with gccgo, please chime in!
In 2019, performance is still a sore spot for gccgo. Gc yields faster code than gccgo on nearly every workload. Unless you’re compiling for an esoteric platform that gc doesn’t support, or you need faster interop between Go and C than cgo provides, there is little reason (yet!) to choose gccgo over the standard Go toolchain.
Raw benchmarking data for all the figures in this post, as well as reproduction instructions, are available as a GitHub Gist.
[1] I’m not entirely clear on what the Go team uses the go1 benchmark suite for, but the commit that introduced the suite (6e88755) claims the “intent is to have mostly end-to-end benchmarks timing real world operations,” which is exactly what we’re after.
[2] For a good example of how bug fixes can force performance regressions, take a look at golang/go#18964. Note that this particular issue could only be responsible for about 1% of the total regression observed in the