Linux Asynchronous Communication epoll
Linux provides several alternatives to the traditional blocking and non-blocking I/O models.
One of them is the epoll API.
The epoll API
epoll is a Linux kernel API that provides an efficient way to monitor multiple file descriptors to check whether I/O is possible on any of them. It was introduced in Linux 2.6. It serves the same purpose as the select() and poll() system calls, but it scales much better when monitoring large numbers of file descriptors.
Due to its performance and flexibility, epoll is commonly used in modern network and server applications to manage large numbers of concurrent connections with low latency.
An epoll instance maintains two lists:
Interest list. Contains a list of file descriptors to be monitored.
Ready list. Contains a list of file descriptors that are ready for I/O.
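With this design, epoll can scale to thousands of file descriptors: the kernel keeps the interest list between calls, so the cost of epoll_wait() depends on the number of ready descriptors rather than on the total number being monitored.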
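The difference between the two lists can be seen in a small experiment. The sketch below is only an illustration: the function name ready_of_two() is hypothetical, pipes stand in for real connections, and epoll_create1(0) (available since Linux 2.6.27) is used in place of epoll_create(). Two descriptors go into the interest list, but only the one with pending data comes back from epoll_wait().

```cpp
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: registers the read ends of two pipes, makes only the
 * first one readable, and returns how many descriptors epoll_wait() reports
 * (-1 if setup failed). *ready_fd receives the first reported descriptor and
 * *expected_fd the descriptor that was actually written to. */
int ready_of_two(int *ready_fd, int *expected_fd)
{
    assert(ready_fd != NULL && expected_fd != NULL);

    int a[2], b[2];
    if (pipe(a) == -1)
    {
        return -1;
    }
    if (pipe(b) == -1)
    {
        close(a[0]);
        close(a[1]);
        return -1;
    }

    int n = -1;
    int epfd = epoll_create1(0);
    if (epfd != -1)
    {
        /* Interest list: both read ends are monitored for input. */
        int fds[2] = {a[0], b[0]};
        int added = 0;
        for (int i = 0; i < 2; i++)
        {
            struct epoll_event ev;
            ev.events = EPOLLIN;
            ev.data.fd = fds[i];
            if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev) == 0)
            {
                added++;
            }
        }

        /* Only pipe a receives data, so only a[0] becomes ready. */
        if (added == 2 && write(a[1], "x", 1) == 1)
        {
            struct epoll_event out[2];
            n = epoll_wait(epfd, out, 2, 0); /* timeout 0: do not block */
            if (n > 0)
            {
                *ready_fd = out[0].data.fd;
            }
            *expected_fd = a[0];
        }
        close(epfd);
    }

    close(a[0]);
    close(a[1]);
    close(b[0]);
    close(b[1]);
    return n;
}
```

Calling ready_of_two() should return 1 with both output parameters holding the same descriptor: the idle pipe stays in the interest list but never reaches the ready list.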
Notification Modes
epoll supports two notification modes:
Level-Triggered (LT). This is the default mode: as long as a file descriptor is ready for I/O, every call to epoll_wait() continues to report it.
Edge-Triggered (ET). In this mode, epoll_wait() reports a file descriptor only when its state changes, for example when new data arrives. Subsequent calls to epoll_wait() do not report it again, and may block, even though there’s still unread data available on the file descriptor.
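A short experiment makes the difference visible. This is a minimal sketch, not a benchmark: the helper name count_reports() is hypothetical, a pipe stands in for a socket, and epoll_create1(0) is used to create the instance. One byte is written and deliberately left unread, then epoll_wait() is called twice with a zero timeout.

```cpp
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: writes one byte into a pipe, never reads it, and
 * polls the read end twice without blocking. Returns how many of the two
 * polls reported the descriptor, or -1 if setup failed.
 * Pass 0 for level-triggered or EPOLLET for edge-triggered. */
int count_reports(uint32_t mode_flag)
{
    assert(mode_flag == 0 || mode_flag == EPOLLET);

    int p[2];
    if (pipe(p) == -1)
    {
        return -1;
    }

    int reports = -1;
    int epfd = epoll_create1(0);

    struct epoll_event ev;
    ev.events = EPOLLIN | mode_flag;
    ev.data.fd = p[0];

    if (epfd != -1 &&
        epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "x", 1) == 1)
    {
        reports = 0;
        for (int i = 0; i < 2; i++)
        {
            struct epoll_event out;
            int n = epoll_wait(epfd, &out, 1, 0); /* timeout 0: do not block */
            if (n > 0)
            {
                reports += n;
            }
        }
    }

    if (epfd != -1)
    {
        close(epfd);
    }
    close(p[0]);
    close(p[1]);
    return reports;
}
```

With these assumptions, count_reports(0) should return 2, because the unread byte keeps the descriptor ready in level-triggered mode, while count_reports(EPOLLET) should return 1, because no new data arrives after the first report.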
How It Works
This API consists of three system calls:
epoll_create(). Creates a new epoll instance and returns a file descriptor referring to it.
epoll_ctl(). Adds, modifies, or removes a file descriptor in the interest list of an epoll instance.
epoll_wait(). Waits until I/O is possible on at least one of the file descriptors in the interest list.
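To make this concrete, we are going to implement a simple example that uses the epoll API to monitor the files named on its command line. The arguments should name FIFOs or terminals, because epoll_ctl() fails with EPERM on regular files.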
#include <sys/epoll.h>
#include <fcntl.h>

extern "C"
{
#include "lib/error_functions.h"
#include "lib/tlpi_hdr.h"
}

#define MAX_BUF 1000 // Maximum bytes fetched by a single read()
#define MAX_EVENTS 5 // Maximum number of events returned by a single epoll_wait() call

int main(int argc, char *argv[])
{
    printf("Run %d\n", argc);

    int epfd, ready, fd, s, j, numOpenFds;
    struct epoll_event ev;
    struct epoll_event evlist[MAX_EVENTS];
    char buf[MAX_BUF];

    // At least one file is required; epoll_create() also rejects a size of 0
    if (argc < 2)
    {
        usageErr("%s file...\n", argv[0]);
    }

    // The size argument is only a hint (ignored since Linux 2.6.8),
    // but it must be greater than zero
    epfd = epoll_create(argc - 1);
    if (epfd == -1)
    {
        errExit("epoll_create");
    }

    // Open each file and add it to the "interest list"
    for (j = 1; j < argc; j++)
    {
        printf("Opening file descriptor for \"%s\"\n", argv[j]);
        fd = open(argv[j], O_RDONLY);
        if (fd == -1)
        {
            errExit("open");
        }
        printf("Opened \"%s\" on fd %d\n", argv[j], fd);

        ev.events = EPOLLIN; // Only interested in input events
        ev.data.fd = fd;
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1)
        {
            errExit("epoll_ctl");
        }
    }

    numOpenFds = argc - 1;

    while (numOpenFds > 0)
    {
        // Fetch up to MAX_EVENTS items from the ready list
        printf("About to epoll_wait()\n");
        ready = epoll_wait(epfd, evlist, MAX_EVENTS, -1);
        if (ready == -1)
        {
            if (errno == EINTR)
            {
                continue; // Restart if interrupted by a signal
            }
            else
            {
                errExit("epoll_wait");
            }
        }
        printf("Ready %d\n", ready);

        // Deal with the returned list of events
        for (j = 0; j < ready; j++)
        {
            printf(" fd=%d; events: %s%s%s\n", evlist[j].data.fd,
                   (evlist[j].events & EPOLLIN) ? "EPOLLIN " : "",
                   (evlist[j].events & EPOLLHUP) ? "EPOLLHUP " : "",
                   (evlist[j].events & EPOLLERR) ? "EPOLLERR " : "");

            if (evlist[j].events & EPOLLIN)
            {
                s = read(evlist[j].data.fd, buf, MAX_BUF);
                if (s == -1)
                {
                    errExit("read");
                }
                printf(" read %d bytes: %.*s\n", s, s, buf);
            }
            else if (evlist[j].events & (EPOLLHUP | EPOLLERR))
            {
                // If EPOLLIN and EPOLLHUP were both set, the branch above reads
                // first; the descriptor is closed only once no input remains
                printf("Closing fd %d\n", evlist[j].data.fd);
                if (close(evlist[j].data.fd) == -1)
                {
                    errExit("close");
                }
                numOpenFds--;
            }
        }
    }

    printf("All file descriptors closed; bye\n");
    exit(EXIT_SUCCESS);
}
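The program above relies on level-triggered mode: it reads at most MAX_BUF bytes per event and counts on epoll_wait() to report the descriptor again if more data remains. If the descriptors were registered with EPOLLET instead, each notification would have to be followed by reading until nothing is left. The sketch below shows one common way to do that; set_nonblocking() and drain_fd() are hypothetical helper names, and CHUNK simply mirrors MAX_BUF.

```cpp
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#define CHUNK 1000 /* Bytes requested per read(), mirroring MAX_BUF */

/* Hypothetical helper: switches fd to non-blocking mode so that read()
 * fails with EAGAIN instead of blocking. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
    {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: reads from a non-blocking descriptor until it would
 * block or reaches end-of-file. Returns the total number of bytes read,
 * or -1 on any other error. The data itself is discarded in this sketch. */
long drain_fd(int fd)
{
    assert(fd >= 0);

    char buf[CHUNK];
    long total = 0;

    for (;;)
    {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0)
        {
            total += n; /* More data may be pending, keep going */
        }
        else if (n == 0)
        {
            break; /* End-of-file */
        }
        else if (errno == EINTR)
        {
            continue; /* Interrupted by a signal, retry */
        }
        else if (errno == EAGAIN || errno == EWOULDBLOCK)
        {
            break; /* Drained: wait for the next edge */
        }
        else
        {
            return -1;
        }
    }
    return total;
}
```

In an edge-triggered version of the loop, a call such as drain_fd(evlist[j].data.fd) would take the place of the single read(), after set_nonblocking() has been applied to each descriptor when it is opened.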