Linux provides several alternatives to the traditional blocking and non-blocking I/O models.

One of them is the epoll API.

The epoll API

epoll is a Linux-specific API (a small family of system calls, not a single call) that provides an efficient way to monitor multiple file descriptors and check whether I/O is possible on any of them. It has been available since Linux 2.6. It serves the same purpose as select() and poll(), but it scales much better when monitoring large numbers of file descriptors.

Due to its performance and flexibility, epoll is commonly used in network and server applications that have to manage a large number of concurrent connections with low overhead.

An epoll instance maintains two lists:

Interest list. The set of file descriptors that the process has asked to monitor.
Ready list. The subset of the interest list whose file descriptors are currently ready for I/O.

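With this design, epoll can scale to thousands of file descriptors. The kernel updates the ready list as events occur, so the cost of a call to epoll_wait() depends on the number of ready descriptors rather than on the size of the interest list.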
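The difference between the two lists can be observed with a pipe. The following is a minimal sketch, not part of the original example: the function names count_ready() and interest_vs_ready_demo() are made up for illustration, and it uses epoll_create1(0), the newer variant of epoll_create(). The read end of the pipe is on the interest list the whole time, but it only shows up on the ready list once there is data to read.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Returns how many descriptors are on the ready list right now
 * (a timeout of 0 means "do not block"), or -1 on error. */
static int count_ready(int epfd)
{
    struct epoll_event events[8];
    return epoll_wait(epfd, events, 8, 0);
}

/* Hypothetical demo: registers the read end of a pipe, then samples the
 * ready list before and after one byte is written to the pipe.
 * Stores the two counts in *before and *after; returns 0 on success. */
int interest_vs_ready_demo(int *before, int *after)
{
    int pipefd[2];
    int rc = -1;

    if (pipe(pipefd) == -1)
        return -1;

    int epfd = epoll_create1(0);
    if (epfd != -1) {
        struct epoll_event ev = {0};
        ev.events = EPOLLIN;
        ev.data.fd = pipefd[0];

        if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == 0) {
            *before = count_ready(epfd);      /* monitored, but no data yet */
            if (write(pipefd[1], "x", 1) == 1) {
                *after = count_ready(epfd);   /* now readable */
                rc = 0;
            }
        }
        close(epfd);
    }
    close(pipefd[0]);
    close(pipefd[1]);
    return rc;
}
```

Calling interest_vs_ready_demo(&before, &after) should leave before at 0 and after at 1.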

Notification Modes

epoll supports two notification modes:

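Level-Triggered (LT). This is the default mode. As long as a file descriptor is ready for I/O, every call to epoll_wait() continues to report it.
Edge-Triggered (ET). In this mode, epoll_wait() reports a file descriptor only when new I/O activity has occurred on it since it was last reported. Until that happens, subsequent calls to epoll_wait() will block even if unread data is still available on the file descriptor. For this reason, ET is normally combined with non-blocking file descriptors that are read (or written) until the call fails with EAGAIN.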
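The two modes can be compared directly. The sketch below is an illustration only, and count_reports() is a hypothetical helper name: it writes one byte into a pipe and then calls epoll_wait() twice without reading the byte. The only thing that changes between the two runs is whether EPOLLET is set when the descriptor is registered.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Registers the read end of a pipe with the given trigger flags
 * (0 for level-triggered, EPOLLET for edge-triggered), writes one byte,
 * and calls epoll_wait() twice without reading the byte.
 * Returns how many of the two calls reported the descriptor, or -1 on error. */
int count_reports(uint32_t trigger_flags)
{
    int pipefd[2];
    int reports = -1;

    if (pipe(pipefd) == -1)
        return -1;

    int epfd = epoll_create1(0);
    if (epfd != -1) {
        struct epoll_event ev = {0};
        ev.events = EPOLLIN | trigger_flags;
        ev.data.fd = pipefd[0];

        if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == 0 &&
            write(pipefd[1], "x", 1) == 1) {
            struct epoll_event out;
            reports = 0;
            for (int i = 0; i < 2; i++) {
                int n = epoll_wait(epfd, &out, 1, 0);   /* timeout 0: poll only */
                if (n == -1) {
                    reports = -1;
                    break;
                }
                reports += n;
            }
        }
        close(epfd);
    }
    close(pipefd[0]);
    close(pipefd[1]);
    return reports;
}
```

count_reports(0) should return 2, because the unread byte keeps the descriptor ready. count_reports(EPOLLET) should return 1, because no new data arrives between the two calls.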

How It Works

This API consists of three system calls:

epoll_create(). Creates a new epoll instance and returns a file descriptor referring to it.
epoll_ctl(). Adds a file descriptor to the interest list of an epoll instance, removes it from the list, or modifies its settings.
epoll_wait(). Blocks until I/O is possible on at least one file descriptor in the interest list (or until a timeout expires) and returns items from the ready list.
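The three epoll_ctl() operations can be sketched on a single descriptor. This is an illustrative example under a few assumptions: run_steps() and epoll_ctl_walkthrough() are hypothetical names, epoll_create1(0) is used in place of epoll_create(), and the data.u32 field is used as a tag so that the effect of EPOLL_CTL_MOD is visible in what epoll_wait() returns.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Walks rfd (which must already be readable) through ADD, MOD, and DEL.
 * Returns 0 if every step behaved as expected, else the failing step number. */
static int run_steps(int epfd, int rfd)
{
    struct epoll_event ev = {0}, out = {0};

    /* Step 1: EPOLL_CTL_ADD puts rfd on the interest list, tagged with 1. */
    ev.events = EPOLLIN;
    ev.data.u32 = 1;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, rfd, &ev) == -1)
        return 1;
    if (epoll_wait(epfd, &out, 1, 0) != 1 || out.data.u32 != 1)
        return 1;

    /* Step 2: EPOLL_CTL_MOD replaces the settings; here only the tag changes. */
    ev.data.u32 = 2;
    if (epoll_ctl(epfd, EPOLL_CTL_MOD, rfd, &ev) == -1)
        return 2;
    if (epoll_wait(epfd, &out, 1, 0) != 1 || out.data.u32 != 2)
        return 2;

    /* Step 3: EPOLL_CTL_DEL removes rfd, so it is no longer reported. */
    if (epoll_ctl(epfd, EPOLL_CTL_DEL, rfd, NULL) == -1)
        return 3;
    if (epoll_wait(epfd, &out, 1, 0) != 0)
        return 3;

    /* Step 4: deleting a descriptor that is not registered fails with ENOENT. */
    if (epoll_ctl(epfd, EPOLL_CTL_DEL, rfd, NULL) != -1 || errno != ENOENT)
        return 4;

    return 0;
}

/* Sets up a pipe holding one byte and runs the steps on its read end. */
int epoll_ctl_walkthrough(void)
{
    int pipefd[2];
    int result = -1;

    if (pipe(pipefd) == -1)
        return -1;

    int epfd = epoll_create1(0);
    if (epfd != -1) {
        if (write(pipefd[1], "x", 1) == 1)
            result = run_steps(epfd, pipefd[0]);
        close(epfd);
    }
    close(pipefd[0]);
    close(pipefd[1]);
    return result;
}
```

The data field is a union, so a program normally stores either the file descriptor or its own tag or pointer there, not both.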

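To make this concrete, here is a simple example that uses the epoll API. It relies on the helper headers from the source code of The Linux Programming Interface (lib/tlpi_hdr.h and lib/error_functions.h), which provide errExit() and the standard includes.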

#include <sys/epoll.h>
#include <fcntl.h>

extern "C"
{
#include "lib/error_functions.h"
#include "lib/tlpi_hdr.h"
}

#define MAX_BUF 1000 // Maximum bytes fetched by a single read()
#define MAX_EVENTS 5 // Maximum number of events to be returned from a single epoll_wait call

int main(int argc, char *argv[])
{
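    // At least one file is required: epoll_create() fails with EINVAL if size is 0
    if (argc < 2)
    {
        usageErr("%s file...\n", argv[0]);
    }

    printf("Run %d\n", argc);
    int epfd, ready, fd, s, j, numOpenFds;
    struct epoll_event ev;
    struct epoll_event evlist[MAX_EVENTS];
    char buf[MAX_BUF];

    // The size argument is only a hint (ignored since Linux 2.6.8) but must be > 0
    epfd = epoll_create(argc - 1);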
    if (epfd == -1)
    {
        errExit("epoll_create");
    }

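    // Open each file named on the command line and add it to the "interest list".
    // epoll does not support regular files or directories (epoll_ctl() fails
    // with EPERM), so pass FIFOs, terminals, or similar.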
    for (j = 1; j < argc; j++)
    {
        printf("Opening file descriptor for \"%s\"\n", argv[j]);
        fd = open(argv[j], O_RDONLY);
        if (fd == -1)
        {
            errExit("open");
        }
        printf("Opened \"%s\" on fd %d\n", argv[j], fd);

        ev.events = EPOLLIN;
        ev.data.fd = fd;
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1)
        {
            errExit("epoll_ctl");
        }
    }

    numOpenFds = argc - 1;

    while (numOpenFds > 0)
    {
        // Fetch up to MAX_EVENTS items from the ready list

        printf("About to epoll_wait()\n");
        ready = epoll_wait(epfd, evlist, MAX_EVENTS, -1);
        if (ready == -1)
        {
            if (errno == EINTR)
            {
                continue;
            }
            else
            {
                errExit("epoll_wait");
            }
        }
        printf("Ready %d\n", ready);

        // Deal with returned list of events
        for (j = 0; j < ready; j++)
        {
            printf(" fd=%d; events: %s%s%s\n", evlist[j].data.fd,
                   (evlist[j].events & EPOLLIN) ? "EPOLLIN " : "",
                   (evlist[j].events & EPOLLHUP) ? "EPOLLHUP " : "",
                   (evlist[j].events & EPOLLERR) ? "EPOLLERR " : "");

            if (evlist[j].events & EPOLLIN)
            {

                s = read(evlist[j].data.fd, buf, MAX_BUF);
                if (s == -1)
                {
                    errExit("read");
                }
                printf("  read %d bytes: %.*s\n", s, s, buf);
            }
            else if (evlist[j].events & (EPOLLHUP | EPOLLERR))
            {
                printf("Closing fd %d\n", evlist[j].data.fd);
                if (close(evlist[j].data.fd) == -1)
                {
                    errExit("close");
                }
                numOpenFds--;
            }
        }
    }

    printf("All file descriptors closed; bye\n");
    exit(EXIT_SUCCESS);
}
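The program above registers its descriptors in the default level-triggered mode, so a single read() per event is enough: any leftover data is reported again by the next epoll_wait(). If the descriptors were registered with EPOLLET instead, the read step would typically need to drain the descriptor. The following is a minimal sketch of that pattern, not part of the original program; set_nonblocking() and drain_fd() are hypothetical helper names.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Sets O_NONBLOCK on fd. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Reads from a non-blocking fd until no more data is available right now
 * (EAGAIN) or end-of-file is reached.
 * Returns the total number of bytes read, or -1 on error. */
long drain_fd(int fd)
{
    char buf[1000];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;          /* a real program would process buf here */
            continue;
        }
        if (n == 0)
            return total;        /* end-of-file */
        if (errno == EINTR)
            continue;            /* interrupted by a signal: retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;        /* drained: wait for the next event */
        return -1;               /* genuine error */
    }
}
```

In an edge-triggered version of the loop, each descriptor would be passed to set_nonblocking() right after open(), and drain_fd() would take the place of the single read() call.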
