The Problem

Simplest Case: One Source

  • Read from a file descriptor until end-of-file

  • Terminate

  • File descriptor might refer to anything that lets me read from it
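
The read-until-EOF loop described above can be sketched as follows. This is a minimal illustration, not code from the course: the helper name `read_all` and the 4096-byte chunk size are assumptions. The function does not care what the descriptor refers to, as long as `read()` works on it.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <system_error>
#include <unistd.h>

// Hypothetical helper: read everything from fd until end-of-file.
std::string read_all(int fd)
{
    assert(fd >= 0);

    std::string result;
    char buf[4096];                                    // arbitrary chunk size
    for (;;) {
        ssize_t nread = read(fd, buf, sizeof(buf));    // blocks until data or EOF
        if (nread == -1) {
            if (errno == EINTR)                        // interrupted by a signal: retry
                continue;
            throw std::system_error(errno, std::generic_category(), "read");
        }
        if (nread == 0)                                // EOF: writer closed, or ^D on a TTY
            break;
        result.append(buf, nread);
    }
    return result;
}
```

Calling `read_all(STDIN_FILENO)` would behave the same whether standard input is a file, a pipe, or a terminal.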

Pipe, TTY, …

$ cat /etc/passwd|grep jfasch
jfasch:x:1000:1000:Joerg Faschingbauer:/home/jfasch:/bin/bash
[Figure: simple-pipe.svg]
$ grep jfasch
pattern not on this line
pattern (jfasch) on this line
pattern (jfasch) on this line
^D
[Figure: simple-tty.svg]

A “Real Life” Example

  • Read records (id, firstname, lastname) from standard input

  • Insert into database

  • On EOF, commit and terminate

[Figure: db-simple.svg]

A “Real Life” Example: Code

// database.h
#pragma once

#include <string>
#include <regex>
#include <print>

struct Record
{
public:
    Record() = default;
    Record(int id, const std::string& firstname, const std::string& lastname)
    : id(id), firstname(firstname), lastname(lastname) {}

    explicit operator bool() const { return id != -1; }   // false for the "invalid" default record

    const int id = -1;
    const std::string firstname;
    const std::string lastname;
};

class Database
{
public:
    void insert(const Record& r) {
        std::println("insert id={}, firstname={}, lastname={}", r.id, r.firstname, r.lastname);
    }
    void commit() {
        std::println("commit");
    }
    void rollback() {
        std::println("rollback");
    }
};

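// inline: the function is defined in a header and may be included from
// several source files
inline Record split_line(const std::string& line)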
{
    static const std::regex re_line("^(\\d+)\\s+(\\w+)\\s+(\\w+)\\s*$");

    std::smatch match;
    if (std::regex_search(line, match, re_line)) {
        return Record(std::stoi(match[1].str()), match[2].str(), match[3].str());
    }
    else
        return Record();
}
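// main program: a separate source file that includes the header above
#include "database.h"

#include <cstdio>                                      // perror(), stderr
#include <unistd.h>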

int main()
{
    Database db;

    bool quit = false;
    while (!quit) {
        char line[64];
        ssize_t nread = read(STDIN_FILENO,             // <-- blocking read from fd 0
                             line, sizeof(line)-1);
        if (nread == -1) {
            perror("read");
            return 1;
        }
        if (nread == 0) {                              // <-- graceful shutdown on eof
            quit = true;
            continue;
        }

        std::string sline(line, nread);                // <-- explicit length: read() does not zero-terminate
        if (Record r = split_line(sline))
            db.insert(r);
        else
            std::println(stderr, "invalid line: \"{}\"", sline);
    }

    db.commit();
    return 0;
}
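
One caveat about the loop above: it treats the result of each `read()` as exactly one line. That usually holds for a terminal in canonical mode, but a pipe may deliver several lines at once, or only part of one. The following is a hedged sketch of how bytes could be accumulated and cut into complete lines before they are passed to `split_line()`. The class name `LineBuffer` and its methods are hypothetical and not part of the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper: collects raw bytes and hands out complete lines.
class LineBuffer
{
public:
    // Append whatever read() returned.
    void feed(const char* data, std::size_t n)
    {
        assert(data != nullptr);
        pending_.append(data, n);
    }

    // Return the next complete line (without the '\n'), or nothing if
    // only a partial line is buffered so far.
    std::optional<std::string> next_line()
    {
        auto pos = pending_.find('\n');
        if (pos == std::string::npos)
            return std::nullopt;
        std::string line = pending_.substr(0, pos);
        pending_.erase(0, pos + 1);                    // drop the line and its '\n'
        return line;
    }

private:
    std::string pending_;                              // bytes not yet returned as a line
};
```

In the read loop, one would call `feed(line, nread)` after each successful `read()` and then drain `next_line()` until it returns nothing.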

And Multiple Sources?

[Figure: db-two-sources.svg]
  • Performing I/O on just one file descriptor at a time is fine

  • How would we use two input sources?

  • Two loops, each with a blocking read in the middle?

  • Multithreading is not an option

    • Thread safety of business code (Database) is not always clear

    • Programmers usually don’t quite understand the nature of race conditions (the mother of all race conditions being the load-modify-store conflict)

  • Non-blocking I/O?

    • Set file descriptors to non-blocking

    • In a tight loop, check whether any of them has data available

    • No! That is busy waiting: the loop burns CPU time even while no source has anything to read
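
To make the rejected non-blocking idea concrete, here is a minimal sketch. All names (`set_nonblocking`, `try_read`, `spin_until_ready`, `ReadResult`) are hypothetical. With `O_NONBLOCK` set, `read()` returns -1 with `errno` set to `EAGAIN` instead of blocking, so the outer loop keeps spinning until something arrives.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <string>
#include <unistd.h>
#include <vector>

// Hypothetical helper: switch fd to non-blocking mode. Returns false on error.
bool set_nonblocking(int fd)
{
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

enum class ReadResult { Data, NothingYet, Eof, Error };

// Hypothetical helper: one non-blocking read attempt.
ReadResult try_read(int fd, std::string& out)
{
    char buf[64];
    ssize_t nread = read(fd, buf, sizeof(buf));
    if (nread > 0) {
        out.assign(buf, nread);
        return ReadResult::Data;
    }
    if (nread == 0)
        return ReadResult::Eof;
    if (errno == EAGAIN || errno == EWOULDBLOCK)       // no data right now
        return ReadResult::NothingYet;
    return ReadResult::Error;
}

// The rejected approach: ask every descriptor over and over until one of
// them has something (data, EOF, or an error). Returns its index in fds.
std::size_t spin_until_ready(const std::vector<int>& fds, std::string& out,
                             unsigned long& futile_attempts)
{
    for (;;) {
        for (std::size_t i = 0; i < fds.size(); ++i) {
            if (try_read(fds[i], out) != ReadResult::NothingYet)
                return i;
            ++futile_attempts;                         // CPU time spent on nothing
        }
    }
}
```

While all sources are idle, `futile_attempts` grows as fast as the CPU allows, which is why this approach is dismissed.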

The Problem

  • Two file descriptors (STDIN_FILENO and a UDP socket)

  • … and only one loop

#include "database.h"

#include <cstdio>                                      // perror(), stderr
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main()
{
    Database db;


    // <setup UDP socket>
    int sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (sock == -1) {
        perror("socket");
        return 1;
    }
    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port = htons(1234),
        .sin_addr = { .s_addr = htonl(INADDR_ANY) },   // sin_addr is a struct in_addr
    };
    int error = bind(sock, (struct sockaddr *)&addr, sizeof(addr));
    if (error == -1) {
        perror("bind");
        return 1;
    }
    // </setup UDP socket>


    bool quit = false;
    while (!quit) {
        char line[64];
        ssize_t nread = read(STDIN_FILENO,             // <-- and now? how do we read from the socket?
                             line, sizeof(line)-1);
        if (nread == -1) {
            perror("read");
            return 1;
        }
        if (nread == 0) {
            quit = true;
            continue;
        }

        std::string sline(line, nread);
        if (Record r = split_line(sline))
            db.insert(r);
        else
            std::println(stderr, "invalid line: \"{}\"", sline);
    }

    db.commit();
    return 0;
}

Spoiler: The Solution

  • If I knew which of the input sources has data …

  • … then I could do I/O on it without blocking
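
One way to find out which source has data is the POSIX `poll()` system call. The sketch below assumes that mechanism; the text above does not name one, and the helper names `wait_for_input` and `read_from_any` are hypothetical. The only blocking call is `poll()` itself, which sleeps until at least one descriptor is ready.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <poll.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: block until at least one of fds can be read
// without blocking (data, EOF/hangup, or a pending error), and return
// those descriptors. Returns an empty vector if poll() itself fails.
std::vector<int> wait_for_input(const std::vector<int>& fds)
{
    assert(!fds.empty());

    std::vector<pollfd> watched;
    for (int fd : fds) {
        pollfd p{};
        p.fd = fd;
        p.events = POLLIN;                             // interested in "readable"
        watched.push_back(p);
    }

    int nready;
    do {
        nready = poll(watched.data(), watched.size(), -1);   // -1: no timeout
    } while (nready == -1 && errno == EINTR);
    if (nready == -1)
        return {};

    std::vector<int> ready;
    for (const pollfd& p : watched)
        if (p.revents & (POLLIN | POLLHUP | POLLERR))
            ready.push_back(p.fd);
    return ready;
}

// Hypothetical helper: wait, then read from the first ready descriptor.
// Returns that descriptor (or -1 on error); nread receives read()'s result.
int read_from_any(const std::vector<int>& fds, char* buf, std::size_t size,
                  ssize_t& nread)
{
    std::vector<int> ready = wait_for_input(fds);
    if (ready.empty())
        return -1;
    nread = read(ready.front(), buf, size);            // will not block
    return ready.front();
}
```

With something like this, the single loop from the previous section could pass both `STDIN_FILENO` and the UDP socket to `read_from_any()` and handle whichever one delivers a record first.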