Non-blocking I/O in Go
18 Feb 2019
Whether you know it or not, if you are using Go you are probably using non-blocking I/O. This post digs into that a little, then goes further into how you can actually take more control of the I/O handling in Go. This is especially nice as go1.11 and go1.12 add some very interesting interfaces to help with this. This ended up with a slightly different conclusion than I’d expected, but 🤷, stuff happens.
What is non-blocking I/O? A simple explanation: it allows you to read() and write() to a file descriptor (that is, any type of open file, be it a socket, pipe, a file on disk, whatever) without having these calls block just because the file is not ready. How does this work? A little something like this:
fd, _ := syscall.Open("/foo", syscall.O_CREAT|syscall.O_RDWR|syscall.O_CLOEXEC|syscall.O_NONBLOCK, 0644)
This is instructing the system to:
- Open the file at /foo, creating it if it does not exist (O_CREAT)
- Close the file descriptor when executing a new process (O_CLOEXEC)… this is important so file descriptors are not copied between processes unexpectedly
- Open the file with both read and write access (O_RDWR)
- Use non-blocking mode (O_NONBLOCK)
- Set the permissions on new files (0644)
That’s cool, but what does it actually mean? Well, for a regular file, not much, because regular files are always readable and always writable… but for other types of files, such as a pipe, this gets very interesting. So instead we can do this before we open the file:
syscall.Mkfifo("/foo", 0644)
This will create a named pipe (FIFO), which has a fixed-size buffer. Without the O_NONBLOCK flag, when a read() is performed, the caller will block until there is data to read. Likewise, when a write() is performed the caller will block if the pipe is full. Here we are using O_NONBLOCK, and so get slightly different semantics. Instead of blocking, a call to read() on an empty pipe or write() to a full pipe will return an EAGAIN error. This is a nice way of saying the pipe is not ready for that action (the error message might look like “resource temporarily unavailable”). EAGAIN really means “try again”; there is an alias for this error called EWOULDBLOCK.
From here you might want to use a polling mechanism such as epoll to be notified of when the pipe is ready for read or write (depending on what you need).
So, Go does all this for you. When you call os.Open(...), Go opens the file with the non-blocking flag, sets up watches for the file descriptor to know when it’s ready for read/write/is closed, and then provides a blocking API on top of non-blocking I/O for a natural flow like so:
buf := make([]byte, 32)
n, _ := f.Read(buf)
fmt.Println(string(buf[:n]))
So, the fmt.Println doesn’t happen until Read has completed. If the file is not ready for read(), the runtime pauses the goroutine and allows other goroutines to run while it waits for the file to be ready, then wakes up our goroutine when it is ready so it can continue. This is really nice and simple: you don’t have to think about callbacks, or polling APIs, or any low-level details, and you still get all the benefit of asynchronous I/O.
The trouble is, a blocking API isn’t always what you want. Sometimes you actually need lower-level control than what you might see in a typical Go program. A relatively simple example:
buf := make([]byte, 32)
go func() { f2.Close() }()
for {
	n, err := f1.Read(buf)
	if err != nil {
		break
	}
	f2.Write(buf[:n])
}
This looks harmless, but what if f1 just blocks because there is never any data (or it is just not closed… for good reason)? The goroutine running this will run forever, blocked on the call to Read… this happens even if f2 is closed.
note: this may actually be more pervasive in the Go ecosystem than is realized, and is especially a problem in Docker’s code base… the above code is essentially what io.Copy does.
Other cases where one might need this level of control include implementing custom read/write semantics, perhaps using zero-copy techniques such as splice().
So, what’s the alternative? Bypass the Go runtime and do our own file polling and switching? Oh no. This would be horribly annoying. Before go1.11, though, this would be precisely what one needed to do, except for a few cases where you could get access to the underlying file descriptor.
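As of go1.11 and go1.12, there are two interfaces that make this much more practical: syscall.Conn and syscall.RawConn. (The interfaces themselves date back to go1.9; os.File only gained its SyscallConn method in go1.12.)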
These interfaces essentially allow Go to expose the raw file descriptor without the runtime sacrificing its ability to do weird things (such as swapping out file descriptors from beneath you), AND allow the caller to still utilize the built-in runtime poller so you don’t have to deal with these semantics yourself. The Read and Write methods on syscall.RawConn take a function which gets called in a loop whenever the file descriptor is ready for the operation, and it’s up to you to determine when to hand off back to Go; normally you’d do this when you receive EAGAIN.
This was utilized in go1.11 to support transparently copying (even via io.Copy) between two TCP connections using zero-copy techniques (splice() on Linux).
note: this is implemented in the ReadFrom method of net.TCPConn
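From the caller’s side nothing special is needed, which is the whole point. Here is a sketch of a relay between two loopback TCP connections; the tcpPair and relay helpers are invented for the example, and whether splice() actually kicks in depends on the platform and Go version.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// Compile-time check that *net.TCPConn has a ReadFrom method, which io.Copy
// can use instead of shuttling data through a userspace buffer.
var _ io.ReaderFrom = (*net.TCPConn)(nil)

// tcpPair (hypothetical helper) returns both ends of one loopback connection.
func tcpPair(ln net.Listener) (client, server net.Conn, err error) {
	client, err = net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return nil, nil, err
	}
	server, err = ln.Accept()
	if err != nil {
		client.Close()
		return nil, nil, err
	}
	return client, server, nil
}

// relay (hypothetical helper) sends payload into one TCP connection, copies
// it to a second one with io.Copy, and returns what came out the far end.
// The payload is assumed small enough to fit in the socket buffers.
func relay(payload string) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	sender, src, err := tcpPair(ln)
	if err != nil {
		return "", err
	}
	defer sender.Close()
	defer src.Close()

	dst, receiver, err := tcpPair(ln)
	if err != nil {
		return "", err
	}
	defer dst.Close()
	defer receiver.Close()

	if _, err := io.WriteString(sender, payload); err != nil {
		return "", err
	}
	sender.Close() // EOF on src ends the copy

	// Both ends are *net.TCPConn, so this is the TCP-to-TCP case.
	if _, err := io.Copy(dst, src); err != nil {
		return "", err
	}
	dst.Close() // EOF for the receiver

	out, err := io.ReadAll(receiver)
	return string(out), err
}

func main() {
	got, err := relay("hello over tcp")
	fmt.Println(got, err)
}
```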
I recently added support for syscall.Conn to containerd’s fifo package (go1.12 only), which is used for buffering stdio from containers. I’ve also been working on a ReadFrom implementation similar to the one above for this package, to get the same sort of zero-copy behavior. It shows extremely promising results when copying between two pipes (very common in container-land)… however, I’m still trying to decide if this complexity is worth it in the library vs just having some supporting library deal with this… where we’d want to use tee() in addition to splice() to copy I/O to multiple destinations anyway. The benefit here is getting zero-copy performance without setting up my own poller.
So… back to the problem stated above… what does this look like with our above example? Honestly, still not all that simple, because we’d need to set up our own epoll on f2 to know when the file has been closed and then go ahead and clean up f1.
To do that we’d use syscall.RawConn's Control(func(uintptr)) method to get access to the file descriptor. This is because there is no means of getting a close notification from the runtime poller… writing this, I think I may just open a feature request 😃, but at least it is sort of possible without such a feature.