Golang: Having fun with Inlining and Cgo
Inlining is an optimization made by the Go compiler: it decides which function calls to keep as real calls and which to inline (‘embed’) into the caller, avoiding the cost of a function call (setting up a stack frame, jumping to the function’s address, etc.). How does this work?
Take the following program:
package main
func main() {
	PrintSomething()
}
func PrintSomething() {
	println("Hello, World!")
}
When compiled, the PrintSomething function will probably be inlined: its body is effectively ‘copied’ into the caller instead of being called. It is compiled as if it were written as follows:
func main() {
	println("Hello, World!")
}
The result is the same, but the performance is (marginally, in this case) better. Side note: modern Java environments do the same, but at runtime instead of compile time, using the JIT (Just In Time) compiler.
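To get a feel for how marginal the difference is, here is a minimal sketch that times the same tiny function twice: once left to the compiler (which will most likely inline it, though that is not guaranteed) and once with the //go:noinline directive, which forces a real call. The names square, squareNoInline, sink, benchInlined and benchNotInlined are made up for this example, and the numbers will vary per machine and Go version.

```go
package main

import (
	"fmt"
	"testing"
)

// square is small enough that the compiler will most likely inline it.
func square(x int) int { return x * x }

// squareNoInline has the same body, but the directive below tells the
// compiler to always emit a real function call.
//
//go:noinline
func squareNoInline(x int) int { return x * x }

// sink keeps the results alive so the loops are not optimized away.
var sink int

func benchInlined(b *testing.B) {
	total := 0
	for i := 0; i < b.N; i++ {
		total += square(i)
	}
	sink = total
}

func benchNotInlined(b *testing.B) {
	total := 0
	for i := 0; i < b.N; i++ {
		total += squareNoInline(i)
	}
	sink = total
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	fmt.Println("inlined:    ", testing.Benchmark(benchInlined))
	fmt.Println("not inlined:", testing.Benchmark(benchNotInlined))
}
```

Both functions return the same values; only the cost per call should differ, and for a function this small the call overhead is most of the work being measured.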
Having fun with nm
How do you find out what the compiler inlines and what it doesn’t? One way would be to use a disassembler, but alternatively you can inspect the symbol table with nm. The symbol table lists all names or symbols in a program or library, including our functions. When you run nm on a native program or library (in this case a Go program), it’ll output something like this:
...
000000000403c8b0 t runtime.step
000000000401fac0 t runtime.stkbucket
0000000004027d60 t runtime.stopTheWorldWithSema
0000000004029e90 t runtime.stoplockedm
0000000004029740 t runtime.stopm
0000000004004250 t runtime.strequal
0000000004003a30 t runtime.strhash
00000000040dc2b1 s runtime.support_erms
00000000040dc2b2 s runtime.support_popcnt
...
Fun fact: Go links its runtime statically, as nm’s output above shows. Go’s runtime is actually included in the compiled program!
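Each row of nm’s output has up to three columns: an address, a one-letter symbol type (t or T marks code in the text section) and the symbol name. As a rough sketch, the rows can be picked apart and filtered in a few lines of Go. The Symbol type and the parseNm function are hypothetical names for this example, and the parsing assumes that symbol names contain no spaces, which does not hold for every Go symbol.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Symbol is one row of nm output (hypothetical type for this sketch).
type Symbol struct {
	Addr string // hexadecimal address; empty for undefined symbols
	Kind string // type letter, e.g. "t" or "T" for code
	Name string
}

// parseNm reads nm-style output and returns the symbols whose name
// contains substr. Rows that do not have two or three
// whitespace-separated fields are skipped.
func parseNm(output, substr string) []Symbol {
	var syms []Symbol
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		var s Symbol
		switch len(fields) {
		case 3: // address, type, name
			s = Symbol{Addr: fields[0], Kind: fields[1], Name: fields[2]}
		case 2: // undefined symbols have no address column
			s = Symbol{Kind: fields[0], Name: fields[1]}
		default:
			continue
		}
		if strings.Contains(s.Name, substr) {
			syms = append(syms, s)
		}
	}
	return syms
}

func main() {
	out := "0000000004029740 t runtime.stopm\n" +
		"00000000040510a0 t main.NotInlinedExample\n"
	for _, s := range parseNm(out, "Example") {
		fmt.Println(s.Kind, s.Name)
	}
}
```

This does the same job as piping nm through grep, but leaves the columns available as separate fields.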
Inlining with Cgo
However, when you mix in C calls (which are handled by Cgo), inlining appears to work differently. Take the following program:
package main
import "C"
func main() {
	InlinedOrProbablySkippedExample()
	InlinedExample()
	NotInlinedExample()
}
func InlinedOrProbablySkippedExample() {
	// Empty function
}
func InlinedExample() {
	println("Hello, World!")
}
func NotInlinedExample() {
	println("Hello, Universe!")
	// Allocates a C string; it is never freed, which is fine for this demo only.
	C.CString("test")
}
Now compile this and run nm, looking for our *Example functions:
$ go build main.go
$ nm main | grep Example
00000000040510a0 t main.NotInlinedExample
It appears that the plain Go functions were inlined: they don’t appear in the symbol table. But the interesting part is that the one containing the call to a C-function wasn’t inlined!
Why is this important?
Well, maybe it isn’t that important. Since inlining is a performance optimization, you probably won’t notice the difference, unless you have a Go application that makes heavy use of a C library and call performance for those functions is critical. That is an edge case, though: when performance matters that much, you probably shouldn’t be mixing languages anyway.
The examples in this post can be found on GitHub.