Because you might still be using that memory ...
We all know that Go is extremely fast and very easy to develop with but, as with any managed language, it’s easy to inadvertently generate large quantities of garbage. No, I’m not talking about poorly written code (yeah, I’m guilty of that!) but garbage as in “memory that has to be reclaimed”. Reclaiming it is the job of the Garbage Collector. The GC in Go keeps getting faster with each release, but you still want to give it less work when you can.
One common culprit is creating temporary buffers for things like rendering, encoding and compression, so an easy fix is to re-use these buffers instead of creating new ones each time. There’s a sync.Pool in the standard library that can be used for this, but there is a subtle “gotcha” which I’ll explain.
Here’s a typical implementation of a bytes.Buffer pool in Go using the standard library sync.Pool:
package engine

import (
	"bytes"
	"sync"
)

// buffer pool to reduce GC
var buffers = sync.Pool{
	// New is called when a new instance is needed
	New: func() interface{} {
		return new(bytes.Buffer)
	},
}

// GetBuffer fetches a buffer from the pool
func GetBuffer() *bytes.Buffer {
	return buffers.Get().(*bytes.Buffer)
}

// PutBuffer resets a buffer and returns it to the pool
func PutBuffer(buf *bytes.Buffer) {
	buf.Reset()
	buffers.Put(buf)
}
Very simple and hopefully fairly common.
Here’s a simple example of getting a buffer from the pool and using defer to add it back when the function completes. In this case, simply rendering a template into a byte slice:
func render(id string) []byte {
	data := storage.Get(id)
	w := GetBuffer()
	defer PutBuffer(w)
	template.Execute(w, data)
	return w.Bytes()
}
Great. We get a boost by re-using the buffers and reduce some of the GC pressure in our app - that saves not only memory but some CPU as well.
Did you notice the issue? No, not that I didn’t handle the error that the template Execute method might return. It’s more subtle: returning the result from w.Bytes().
Why is this bad and why is it a gotcha?
The problem is that w.Bytes() returns a slice that points into the buffer’s own backing array rather than a copy, and the deferred PutBuffer hands that same buffer back to the pool as the function returns. Reset only sets the length to zero and keeps the memory, so the next caller to get this buffer from the pool (possibly in another goroutine) can write over the bytes we just returned before we’ve finished with them.
It can lead to subtle issues like content from one item suddenly appearing half-way through the rendered content of something else. If you have code that loops through items processing and writing one after the other then it may not be noticeable but if you change the code to write all the items at the end in a batch then it will be more likely to trigger (as will more load from concurrent requests on the system).
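The overwrite can be reproduced without a pool or any concurrency. The sketch below is only an illustration: aliasDemo is a hypothetical helper, and it replays on a single bytes.Buffer the same sequence that PutBuffer and the next GetBuffer caller would perform.

```go
package main

import (
	"bytes"
	"fmt"
)

// aliasDemo (hypothetical helper) keeps a slice from Bytes(), then
// resets and re-uses the buffer the way a pool would.
func aliasDemo() string {
	var buf bytes.Buffer
	buf.WriteString("first item")

	b := buf.Bytes() // shares memory with buf, it is not a copy

	buf.Reset()               // what PutBuffer does before Put
	buf.WriteString("SECOND") // what the next user of the buffer does

	// b still has its original length but now sees the new bytes.
	return string(b)
}

func main() {
	fmt.Println(aliasDemo())
}
```

With the current bytes.Buffer implementation this prints SECONDitem rather than first item: the second write landed in the memory the first slice still refers to. The documentation for Bytes only promises the slice is valid until the next modification of the buffer, so the exact corrupted content should not be relied on.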
We might be tempted to solve the problem by making a copy of the bytes from the buffer before we return:
b := make([]byte, w.Len())
copy(b, w.Bytes())
return b
While this does appear to solve the problem, it’s only really addressing the symptom and kind of defeats the purpose of creating the buffer pool in the first place - we’re back to creating temporary slices that will have to be Garbage Collected in addition to the buffers + pool we added. Oh dear.
What we really need to do to solve the issue is to make sure the scope of the buffer matches our use of it. We do this by moving the buffer creation and cleanup outside of the method.
As an aside, we don’t want to pass in the buffer as a buffer, we should instead make use of interfaces - there is no reason to tie our method to the implementation of what we’re using to write to, just that we want to write something. So instead our method should just expect an io.Writer that it can use. This makes our function much more useful because we may want to write to other writers in future.
Now the method looks like this:
func render(w io.Writer, id string) {
	data := storage.Get(id)
	template.Execute(w, data)
}
We still make use of the buffer pool but it’s moved outside of this method:
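for _, id := range ids {
	w := GetBuffer()
	render(w, id)
	// do something with w.Bytes()

	// Return the buffer once we're finished with its bytes. A defer here
	// would not run until the enclosing function returns, so every
	// iteration would hold on to its own buffer until then.
	PutBuffer(w)
}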
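Putting the pieces together, here is a minimal runnable sketch of the pattern. It makes several assumptions that are not in the code above: the items map and tmpl template stand in for the article's storage and template, render returns the error from Execute, and renderAll and renderToString are hypothetical helpers. Each buffer is returned explicitly at the end of the iteration, because a defer inside a loop does not run until the enclosing function returns.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
	"text/template"
)

// Hypothetical stand-ins for the article's storage and template.
var items = map[string]string{"a": "Alpha", "b": "Beta"}
var tmpl = template.Must(template.New("item").Parse("<p>{{.}}</p>"))

// buffers is the same kind of pool as in the article.
var buffers = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// render writes one item to any io.Writer and reports template errors.
func render(w io.Writer, id string) error {
	return tmpl.Execute(w, items[id])
}

// renderAll renders each id into a pooled buffer and writes the result
// to out before the buffer goes back to the pool.
func renderAll(out io.Writer, ids []string) error {
	for _, id := range ids {
		w := buffers.Get().(*bytes.Buffer)
		err := render(w, id)
		if err == nil {
			// Last use of w.Bytes(): out.Write copies or consumes it here.
			_, err = out.Write(w.Bytes())
		}
		w.Reset()
		buffers.Put(w)
		if err != nil {
			return err
		}
	}
	return nil
}

// renderToString is a small convenience wrapper for the example.
func renderToString(ids []string) (string, error) {
	var out bytes.Buffer
	err := renderAll(&out, ids)
	return out.String(), err
}

func main() {
	s, err := renderToString([]string{"a", "b"})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(s) // <p>Alpha</p><p>Beta</p>
}
```

The key point is that nothing outside the loop body ever holds a slice from the pooled buffer: the bytes are handed to out.Write while the buffer is still ours, and only then is it reset and put back.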
Of course this example is contrived and over-simplified but hopefully, if you ever have buffers being corrupted, it might help explain what is happening and some possible solutions.