Non-blocking connect and select with Linux and MS-Windows

Author: Andre Adrian
Version: 29may2022

Source code download

You can find the source code and the executables here. The license for this source code is the 3-Clause BSD License. Copyright 2022 Andre Adrian.

For Linux, there is a Makefile in the Zip file. The Zip file also contains 64-bit binary versions of the programs. My GNU C compiler is version 7.4.1, and the Linux kernel is 4.12.14. Both versions are neither old nor new.

For MS-Windows I suggest that you use GCC and MinGW from WinLibs as development environment. I use GCC 11.3.0 + MinGW-w64 10.0.0 (UCRT) - release 3 without LLVM/Clang/LLD/LLDB. My MS-Windows is Windows 10, version 21H1. A little make.bat batch file in the Zip file compiles the software. You can use Code::Blocks or Visual Studio to compile the software, too.

Introduction

A chat application is the "hello world" of network programming. The chat application is a very primitive WhatsApp. The users have a TCP client program that allows them to enter text and see what other users have entered. The system operator has a TCP server that receives the entered text as TCP messages and sends copies of this text out to the TCP clients.

This series of articles describes nine different versions ranging from "FORTRAN style" C to C++14 using smart pointers, subclasses, and more. All versions can handle IPv4 and IPv6. Most versions can handle the situation of a TCP client that starts before the TCP server.

The basic idea is I/O-multiplexing. This is a single-thread method for doing things pseudo-parallel. Like all solutions, it has its pros and cons. Because you have only one thread, you need no thread synchronization (message queue, semaphore, monitor). Because you have only one thread, you cannot utilize the power of a multi-core computer. I/O-multiplexing is very good for tightly coupled services. Non-coupled services can use a multi-process/multi-thread approach as the Apache HTTP server does.

First, we look at the chat application from the user and system operator point of view. TCP client operation and TCP server operation are done with the same program. The program netcat can also be started as a server or as a client.

Let us start the program and look at the built-in documentation:

> ./liomux1
usage server: liomux1 s port [hostname]
usage client: liomux1 c hostname port
example server IPv4: ./liomux1 s 60000
example client IPv4: ./liomux1 c 127.0.0.1 60000
example server IPv6: ./liomux1 s 60000 ::1
example client IPv6: ./liomux1 c ::1 60000

I use the ">" character as a symbol for the prompt. The examples use "localhost" communication, that is all programs execute on the same computer. Because of TCP and the internet, you can use different computers at different locations.

Now we start version 1 TCP server with IPv6:

> ./liomux1 s 60000 ::1

In another console window, we start the first TCP client. The server and the client have to use the same version of Internet Protocol.

> ./liomux1 c ::1 60000

In a third console, we start a second TCP client with the same command. We shall now see in the consoles:

> ./liomux1 s 60000 ::1
II server_open: port=60000 hostname=::1
II server_open: listen on socket 3
II server_handle: new connection ::1 on socket 4
II server_handle: new connection ::1 on socket 5
> ./liomux1 c ::1 60000
II client_open: port=60000 hostname=::1
II client_open1: connect try to ::1 (::1) port 60000 socket 3
II client_connect: connect success to ::1 port 60000 socket 3
> ./liomux1 c ::1 60000
II client_open: port=60000 hostname=::1
II client_open1: connect try to ::1 (::1) port 60000 socket 3
II client_connect: connect success to ::1 port 60000 socket 3

We see some logging information. The "II" tag tells us these are "normal" logging messages. The TCP server opened a listen port and connected this listen port to the file descriptor (socket) number 3. Later the TCP server used file descriptors 4 and 5 for communication with the two TCP clients.

The TCP clients use file descriptor number 3 for communication with the TCP server. Every program has its own file descriptor table, and the first free file descriptor is number 3. By the way, file descriptor 0 is standard in, 1 is standard out, and 2 is standard error.

If we now enter text in the first TCP client and press the enter key, the entered text appears in the second TCP client. We can start more TCP clients and can verify that we have a chat application: one user writes something, and all other users get a copy of this writing.

More interesting for the system operator is what happens if one TCP client leaves the chat. We use CTRL-C to terminate the first TCP client. The TCP server logs:

II server_handle: connection closed on socket 4

After we start the TCP client again, we see:

II server_handle: new connection ::1 on socket 4

And chat communication is again possible between the TCP clients.

If we terminate the TCP server with CTRL-C, we get the following logging on both TCP clients:

WW client_handle: connect fail to ::1 port 60000 socket 3 rv 0: Success
II client_open1: connect try to ::1 (::1) port 60000 socket 3
WW client_connect: connect fail to ::1 port 60000 socket 3: Connection refused
II client_open1: connect try to ::1 (::1) port 60000 socket 3
WW client_connect: connect fail to ::1 port 60000 socket 3: Connection refused

The "WW" tag tells that the following message is a warning. The last two logging messages "connect try" and "connect fail" repeat every 5 seconds. The first "connect fail" message is different from the others.

If we start the TCP server again, the TCP server logs:

> ./liomux1 s 60000 ::1
II server_open: port=60000 hostname=::1
II server_open: listen on socket 3
II server_handle: new connection ::1 on socket 4
II server_handle: new connection ::1 on socket 5

The last two logging messages from the TCP clients are:

II client_open1: connect try to ::1 (::1) port 60000 socket 3
II client_connect: connect success to ::1 port 60000 socket 3

With many network applications, you have to start the TCP server before the TCP client. Our chat application does not need a start order. If you start the TCP client before the TCP server, the TCP client will try every 5 seconds to connect to the TCP server. Version 0 of the chat application has no "server polling".

Design goals

The design goal is a framework around the select() function. This framework is less than 500 lines of source code. The framework provides a TCP client, a TCP server, and a one-shot timer. Because I/O-multiplexing works with a single thread, all functions have to be non-blocking. We use non-blocking connect, and after() is the non-blocking alternative to sleep().

There are 9 Linux C and C++ solutions, liomux0.c to liomux8.cpp, and 9 MS-Windows C and C++ solutions, wiomux0.c to wiomux8.cpp. I go from "FORTRAN style" C programming to object-oriented framework (plugin) style programming in C++. I don't use templates.

Version 0: C11 solution (FORTRAN style, like file descriptor open()); uses an array of CONN objects; uses the array element index; uses no class hierarchy (type switch); uses no business logic plugin; uses no timer events (after).
Version 1: C11; uses timer events (after).
Version 2: C11; uses an array element pointer instead of an array element index.
Version 3: C11 solution (C framework/library style); uses a business logic plugin via callbacks.
Version 4: C++11 solution (OO style); uses the object-oriented hidden pointer; uses no business logic plugin.
Version 5: C++14; uses an array of CONN object pointers instead of an array of CONN objects.
Version 6: C++14; uses a class hierarchy (vtable "switch") instead of no class hierarchy (type switch).
Version 7: C++14 solution (better OO framework/library style); uses a business logic plugin via derived classes.
Version 8: C++14 solution (better OO framework/library style); uses a business logic plugin via callbacks.

Solutions 3, 7, and 8 are framework solutions. The other solutions show how you can change your source code in small steps.

The framework can support multiple TCP servers, multiple TCP clients, and multiple timers in one program. The select() function needs information from all these objects. Therefore I use one array of CONN objects to keep the information about TCP clients and servers in one place, and one array of TIMER objects to keep the timer information in one place. Using fixed-size C arrays is "old school". You, the reader, can use a deque or something else. For me, as the author, this data structure was not important. A simple array and a factory function to handle this array are fine.

The last design goal is to make the different source code versions easy to compare. A nice MS-Windows "diff" tool is WinMerge. On Linux I use mgdiff. Microsoft Visual Studio Code has a nice built-in diff.

Linux Business logic

As a programmer, you know the situation: your customer wants some features, but you have to program a lot more than just the customer's requirements. This is the business logic of version 1:

/** @brief callback client read available
 * @retval >0 okay
 * @retval <=0 close or error
 */
int cb_client_read(CONN_ID id, int fd) {
  assert(id >= 0 && id < CONNMAX);
  assert(fd >= 0 && fd < FDMAX);
  // Compute where to write data. If we're stdin (0),
  // we'll write to the sockfd. If we're the sockfd, we'll
  // write to stdout (1).
  int outfd = (STDIN_FILENO == fd)? conn[id].sockfd: STDOUT_FILENO;

  // We use read() and write() in here since those work on
  // all fds, not just sockets. send() and recv() would
  // fail on stdin and stdout since they're not sockets.
  char buf[BUFSIZE];
  int readbytes = read(fd, buf, sizeof buf);
  if (readbytes > 0) {
    // Write all data out
    int writebytes = write(outfd, buf, readbytes);
    assert(writebytes == readbytes && "write");
  }
  return readbytes;
}

/** @brief callback server read available
 * @retval >0 okay
 * @retval <=0 close or error
 */
int cb_server_read(CONN_ID id, int fd) {
  assert(id >= 0 && id < CONNMAX);
  assert(fd >= 0 && fd < FDMAX);
  char buf[BUFSIZE];    // buffer for client data
  int readbytes = read(fd, buf, sizeof buf);
  if (readbytes > 0) {
    // we got some data from a client
    for (int i = 0; i < FDMAX; ++i) {
      // send to everyone!
      if (FD_ISSET(i, &conn[id].fds)) {
        // except the listener and ourselves
        if (i != conn[id].sockfd && i != fd) {
          int writebytes = write(i, buf, readbytes);
          if (writebytes != readbytes) {
            perror("WW send");
          }
        }
      }
    }
  }
  return readbytes;
}

int main(int argc, char* argv[]) {
  if (argc < 3) {
    char* name = basename(argv[0]);
    fprintf(stderr,"usage server: %s s port [hostname]\n", name);
    fprintf(stderr,"usage client: %s c hostname port\n", name);
    fprintf(stderr,"example server IPv4: ./%s s 60000\n", name);
    fprintf(stderr,"example client IPv4: ./%s c 127.0.0.1 60000\n", name);
    fprintf(stderr,"example server IPv6: ./%s s 60000 ::1\n", name);
    fprintf(stderr,"example client IPv6: ./%s c ::1 60000\n", name);
    exit(EXIT_FAILURE);
  }

  switch(argv[1][0]) {
  case 'c': {
    CONN_ID id = conn_factory();
    client_open(id, argv[3], argv[2]);
    conn_add_fd(id, STDIN_FILENO);
  }
  break;
  case 's': {
    CONN_ID id = conn_factory();
    server_open(id, argv[2], argv[3]);
  }
  break;
  default:
    fprintf(stderr,"EE %s: unexpected argument %s\n", FUNCTION, argv[1]);
    exit(EXIT_FAILURE);
  }

  conn_event_loop();  // start inversion of control
  return 0;
}

The main() function implements the built-in documentation, the start of a TCP client after the "case 'c':" line, and the start of a TCP server after the "case 's':" line.

The argc and argv variables in the main() function forward the parameters from the console input to the program. The console input "./liomux1 s 60000 ::1" transforms into argc=4, argv[1]="s", argv[2]="60000" and argv[3]="::1". The variable argv[0] contains the program name with full path. The function basename() strips the path.

The function cb_client_read() reads console input from standard in and forwards that data to the network, or reads data from the network and forwards it to standard out. The clever idea of setting outfd depending on the source of the data is from Beej's Guide to Network Programming.

Function cb_server_read() reads data from one network connection and copies this data to all other network connections except the listen port and the originator. The "list of all network connections" is a fd_set (file descriptor set) data structure. The constant FD_SETSIZE tells the maximum number of file descriptors in a fd_set. This number is 1024 for Linux and GNU C library.
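
As a reminder, these are the four standard fd_set macros (semantics as defined by POSIX):

#include <sys/select.h>

void fd_set_demo() {
  fd_set set;
  FD_ZERO(&set);            // empty the set
  FD_SET(5, &set);          // add file descriptor 5 to the set
  if (FD_ISSET(5, &set)) {  // test membership
    FD_CLR(5, &set);        // remove file descriptor 5 again
  }
}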

The chat application uses an "event programming" or "inversion of control" design. Some call it the "Hollywood principle": "Don't call us, we'll call you". The program control disappears in the function conn_event_loop() and re-appears in the functions cb_client_read() and cb_server_read(). The "cb_" prefix is a convention to remind me that these are "callback" functions (from Hollywood).

Before Hollywood can call us back, we have to give Hollywood some information. This "register the callback" is done with the functions client_open(), conn_add_fd() and server_open().

Before we can register something, we have to create something. This is the purpose of the conn_factory() function. We use the "factory pattern" in our chat application.
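
The conn_factory() implementation itself is not shown in this article. A minimal sketch, assuming that a typ value of 0 marks a free CONN array element (the real implementation in the Zip file may differ):

/* minimal sketch, not the original implementation */
CONN_ID conn_factory() {
  for (CONN_ID id = 0; id < CONNMAX; ++id) {
    if (0 == conn[id].typ) {                  // found a free element
      memset(&conn[id], 0, sizeof conn[id]);  // fresh, empty object
      conn[id].sockfd = -1;                   // no socket yet
      return id;
    }
  }
  assert(0 && "conn array full");
  return -1;  // not reached
}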

MS-Windows Business logic

As a programmer, you expect that the low-level functions of your application are more operating system dependent than the high-level functions. Linux and MS-Windows differ in critical details for our chat application. With Linux, we can use a file descriptor for the console, the file system, and the network. For MS-Windows, this is not possible. We can use a file pointer for the console and the file system, but for the network, we have to use the type SOCKET. MS-Windows has a very special way to signal "read data available" for console input.

One difference is that Linux uses a signed integer for the file descriptor, while the MS-Windows SOCKET type is an unsigned 64-bit integer (unsigned long long). Somebody at Microsoft was thinking big! Because the SOCKET type is unsigned, the MS-Windows system uses the "magic constants" INVALID_SOCKET and SOCKET_ERROR to return the error state.
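
A short sketch of the consequence for error checking (standard POSIX and Winsock semantics, error handling elided):

// Linux: the file descriptor is a signed int, -1 signals an error
int fd = socket(AF_INET6, SOCK_STREAM, 0);
if (-1 == fd) { /* handle error */ }

// MS-Windows: SOCKET is unsigned, compare against INVALID_SOCKET
SOCKET s = socket(AF_INET6, SOCK_STREAM, 0);
if (INVALID_SOCKET == s) { /* handle error */ }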

/** @brief callback client read available
 * @retval >0 okay
 * @retval <=0 close or error
 */
int cb_client_read(CONN_ID id, SOCKET fd) {
  assert(id >= 0 && id < CONNMAX);
  assert(fd < FDMAX);
  char buf[BUFSIZE];
  int readbytes = recv(fd, buf, sizeof buf, 0);
  if (readbytes > 0 && readbytes != SOCKET_ERROR) {
    // Write all data out
    int writebytes = fwrite(buf, 1, readbytes, stdout);
    assert(writebytes == readbytes && "fwrite");
  }
  return readbytes;
}

/** @brief callback server read available
 * @retval >0 okay
 * @retval <=0 close or error
 */
int cb_server_read(CONN_ID id, SOCKET fd) {
  assert(id >= 0 && id < CONNMAX);
  assert(fd < FDMAX);
  char buf[BUFSIZE];    // buffer for client data
  int readbytes = recv(fd, buf, sizeof buf, 0);
  if (readbytes > 0 && readbytes != SOCKET_ERROR) {
    // we got some data from a client
    for (SOCKET i = 0; i < FDMAX; ++i) {
      // send to everyone!
      if (FD_ISSET(i, &conn[id].fds)) {
        // except the listener and ourselves
        if (i != conn[id].sockfd && i != fd) {
          int writebytes = send(i, buf, readbytes, 0);
          if (writebytes != readbytes) {
            perror("WW send");
          }
        }
      }
    }
  }
  return readbytes;
}

void poll_keyboard(CONN_ID id) {
  assert(id >= 0 && id < CONNMAX);
  if (_kbhit()) {   // very MS-DOS
    char buf[BUFSIZE];
    char* rv = fgets(buf, sizeof buf, stdin);
    assert(rv != NULL);
    send(conn[id].sockfd, buf, strlen(buf), 0);
  }
  after(TIMEOUT, (cb_timer_t)poll_keyboard, id);
}

int main(int argc, char* argv[]) {
  WSADATA wsaData;
  int rv = WSAStartup(MAKEWORD(2, 2), &wsaData);
  assert(0 == rv && "WSAStartup");

  if (argc < 3) {
    char* name = basename(argv[0]);
    fprintf(stderr,"usage server: %s s port hostname\n", name);
    fprintf(stderr,"usage client: %s c hostname port\n", name);
    fprintf(stderr,"example server IPv4: %s s 60000 127.0.0.1\n", name);
    fprintf(stderr,"example client IPv4: %s c 127.0.0.1 60000\n", name);
    fprintf(stderr,"example server IPv6: %s s 60000 ::1\n", name);
    fprintf(stderr,"example client IPv6: %s c ::1 60000\n", name);
    exit(EXIT_FAILURE);
  }

  switch(argv[1][0]) {
  case 'c': {
    CONN_ID id = conn_factory();
    client_open(id, argv[3], argv[2]);
    after(TIMEOUT, (cb_timer_t)poll_keyboard, id);
  }
  break;
  case 's': {
    CONN_ID id = conn_factory();
    server_open(id, argv[2], argv[3]);
  }
  break;
  default:
    fprintf(stderr,"EE %s: unexpected argument %s\n", FUNCTION, argv[1]);
    exit(EXIT_FAILURE);
  }

  conn_event_loop();  // start inversion of control

  WSACleanup();
  return 0;
}

The MS-Windows main() function has the same general structure as the Linux main(). MS-Windows specific is the WSAStartup() function that starts the Winsock2 subsystem, version 2.2. The built-in documentation is close to the Linux version. Linux allows omitting the hostname for the IPv4 TCP server, but MS-Windows needs a hostname for the getaddrinfo() function.

The console input "./wiomux1 c ::1 60000" transforms into argc=4, argv[1]="c", argv[2]="::1" and argv[3]="60000". The variable argv[0] contains the program name with full path. The function basename() puts away the path. The switch statement in the main() function uses the "case 'c':" path. The program executes the functions conn_factory(), client_open() and after(). The "case 's':" path calls the functions conn_factory() and server_open(). All these functions are discussed below.

The program control disappears in the function conn_event_loop() and re-appears again in the functions cb_client_read(), poll_keyboard(), and cb_server_read(). This is typical for a framework or a library.

The MS-Windows cb_client_read() reads data from the network and forwards it to standard out. The function poll_keyboard() reads console input and forwards that data to the network. The _kbhit() function clearly shows the MS-DOS legacy of MS-Windows. I still remember MS-DOS interrupt 21H and function number 0BH for Check Keyboard Status. If the operating system does not provide a console input event, you have to use the ancient method of device polling.

The MS-Windows cb_server_read() works like the Linux version, but you cannot use read() or write(); you have to use recv() and send().

The function WSACleanup() terminates the Winsock2 subsystem. We have this function at the end of main(), but the program never executes this line. The program terminates with either an exit() call, an assert() macro, or a signal, which in most cases is created by CTRL-C. I assume that program termination automatically performs the Winsock subsystem termination.

FORTRAN style programming

Version 0 and version 1 of the chat application use "FORTRAN style" programming. This kind of programming is used, e.g., for the UNIX (Linux) functions open(), read(), write() and close(), the basic UNIX input/output to console, file system, network and more. All these functions use a "file descriptor" as a resource identifier. This "file descriptor" is just an index into an array of structs. This array is maintained by the C runtime library. Today the array typically has 1024 elements. This is the maximum number of open file descriptors for one program.

"FORTRAN style" programming is programming with an array index (early FORTRAN had no pointers). To make the chat application array of struct index prominent, I gave this index its type CONN_ID. CONN_ID is a typedef to int, that is you can write int instead of CONN_ID at every location. But this syntax sugar helps me a little.

The variable id in function main() is set by the conn_factory() factory function and is used by the following functions in the "object index" position.

Definition of CONN object

For this article I use the following definition for an object, "object pointer" and "object index": First of all, an object has some data. In the programming language C, an object is a struct; in C++, an object is a class. Second, an object has some functions. C has no "instance functions", but we can fake them by convention: the first argument of a C "instance function" is the "object pointer" or "object index".
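
A minimal sketch of this convention, with a hypothetical POINT object that is not part of the chat application:

typedef int POINT_ID;   // the "object index" type

typedef struct {        // the object data
  int x, y;
} POINT;

static POINT point[16]; // the object array

// C "instance function": the object index is the first argument
void point_move(POINT_ID id, int dx, int dy) {
  point[id].x += dx;
  point[id].y += dy;
}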

The chat application Linux version 1 object definition is:

typedef int CONN_ID;

typedef struct {
  fd_set fds;       // read file descriptor set
  int sockfd;       // network port for client, listen port for server
  int isConnecting; // non-blocking connect started, but not finished
  int typ;          // tells client or server object
  char hostname[STRMAX];
  char port[STRMAX];
} CONN;

CONN_ID conn_factory();
void conn_event_loop();

void conn_add_fd(CONN_ID id, int fd);

void client_open(CONN_ID id, const char* port, const char* hostname);
void client_open1(CONN_ID id);
void client_reopen(CONN_ID id);
void client_connect(CONN_ID id);
void client_handle(CONN_ID id, int fd);
int cb_client_read(CONN_ID id, int fd);

void server_open(CONN_ID id, const char* port, const char* hostname);
void server_handle(CONN_ID id, int fd);
int cb_server_read(CONN_ID id, int fd);

The chat application design has an object tree with base object CONN (connection) and child objects CLIENT and SERVER. The version 1 implementation has no object tree - but the function names are clustered into conn_, client_, and server_.

The design has class functions and instance functions, too. A class function has no "object index" as the first argument; an instance function does.

With C++ we can express the design more directly. By the way, the design and implementation of all versions formed a feedback loop - several iterations were done, mostly with refactoring. In the beginning, the function client_open() was named open_client(). A naming convention of writing the struct name first and the function name second is not necessary for the C implementation, but it makes the transition to a C++ implementation easier.

Linux TCP client open

File system input/output is easy: you have an open() function that gets information about a resource and returns a file descriptor to that resource. The TCP library or socket library has no open() function. You have to write your own open() function.

void client_open(CONN_ID id, const char* port, const char* hostname) {
  assert(id >= 0 && id < CONNMAX);
  assert(port != NULL);
  assert(hostname != NULL);
  printf("II %s: port=%s hostname=%s\n", FUNCTION, port, hostname);

  FD_ZERO(&conn[id].fds);
  conn[id].typ = CONN_CLIENT;

  strcpy_s(conn[id].port, sizeof conn[id].port, port);
  strcpy_s(conn[id].hostname, sizeof conn[id].hostname, hostname);
  void client_open1(CONN_ID id);
  client_open1(id);
}

void client_open1(CONN_ID id) {
  assert(id >= 0 && id < CONNMAX);
  struct addrinfo hints;
  memset(&hints, 0, sizeof hints);
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;

  struct addrinfo* res;
  int rv = getaddrinfo(conn[id].hostname, conn[id].port, &hints, &res);
  if (rv != 0) {
    fprintf(stderr, "EE %s getaddrinfo: %s\n", FUNCTION, gai_strerror(rv));
    exit(EXIT_FAILURE);
  }

  // loop through all the results and connect to the first we can
  struct addrinfo* p;
  for (p = res; p != NULL; p = p->ai_next) {
    conn[id].sockfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
    if (-1 == conn[id].sockfd) {
      perror("WW socket");
      continue;
    }

    // UNPv2 ch. 15.4, 16.2 non-blocking connect
    int val = 1;
    rv = ioctl(conn[id].sockfd, FIONBIO, &val);
    if (rv != 0) {
      perror("WW ioctl FIONBIO ON");
      close(conn[id].sockfd);
      continue;
    }

    rv = connect(conn[id].sockfd, p->ai_addr, p->ai_addrlen);
    if (rv != 0) {
      if (EINPROGRESS == errno) {
        conn[id].isConnecting = 1;
      } else {
        perror("WW connect");
        close(conn[id].sockfd);
        continue;
      }
    }

    break;  // exit loop after socket and connect were successful
  }

  assert(p != NULL && "connect try");

  char dst[NI_MAXHOST];
  rv = getnameinfo(p->ai_addr, p->ai_addrlen, dst, sizeof dst, NULL, 0, 0);
  assert(0 == rv && "getnameinfo");

  freeaddrinfo(res);

  // don't add sockfd to the fd_set, client_connect() will do

  printf("II %s: connect try to %s (%s) port %s socket %d\n", FUNCTION,
         conn[id].hostname, dst, conn[id].port, conn[id].sockfd);
}

Our client_open() function returns nothing. The function changes variables in the CONN struct, or it aborts the program with exit() or assert(). I use this "fail fast" programming style in my production code, too. There is always a sentinel program around my production program. The sentinel program restarts the production program after a failure. And no, there is no sentinel program for the sentinel program. The sentinel program is very simple and very robust. Joe Armstrong made the "Fail Fast and Noisily, Fail Politely" style prominent in the programming language Erlang.
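
A minimal sentinel sketch (hypothetical, not part of the Zip file), assuming the production program is ./liomux1 started as an IPv6 server:

#include <unistd.h>
#include <sys/wait.h>

int main() {
  for (;;) {
    pid_t pid = fork();
    if (0 == pid) {  // child: run the production program
      execl("./liomux1", "liomux1", "s", "60000", "::1", (char*)NULL);
      _exit(127);    // only reached if exec failed
    }
    waitpid(pid, NULL, 0);  // parent: wait for termination
    sleep(1);               // short pause, then restart
  }
}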

The TCP client open function is split into two parts. The first part, client_open(), checks the parameters with assert() and copies the hostname and port C-strings into the CONN object. We need this information for "TCP server polling". The first part calls the second part, client_open1(), which does the real work.

The function getaddrinfo() is the core of the IPv6 functionality in our chat application. This function transforms IPv4 and IPv6 hostname and port/service information, provided as C-strings, into the internal representation. The HTTP service has port number 80. Our chat application works on port number 60000. The internal representation needs 32 bits for an IPv4 address and 128 bits for an IPv6 address. A port always needs 16 bits. Depending on the computer configuration, getaddrinfo() can return multiple internal representations or none. The for (p = res; p != NULL; p = p->ai_next) loop iterates through the possibilities. Our chat application uses the first working possibility. The TCP client "open" needs the functions socket() and connect(). I recommend reading Beej's Guide to Network Programming or the UNP bible: "Unix Network Programming, Volume 1: The Sockets Networking API (3rd Edition)" by W. Richard Stevens et al.

For non-blocking connect we need an fcntl() or an ioctl() call. I use the ioctl() function. The fcntl() solution needs two calls: one to get the old state and a second to set the new state. The ioctl() solution changes the state with only one call. MS-Windows has the ioctl() function as ioctlsocket(), but no fcntlsocket() function.
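
For comparison, the standard two-call fcntl() alternative (POSIX; assuming sockfd is an open socket):

#include <fcntl.h>

int flags = fcntl(sockfd, F_GETFL, 0);  // first call: get the old state
assert(flags != -1);
flags = fcntl(sockfd, F_SETFL, flags | O_NONBLOCK);  // second call: set the new state
assert(flags != -1);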

In the case of non-blocking connect, connect() returns a "good error": a return value different from zero is an error, but the error EINPROGRESS tells us that the kernel has accepted the further processing of the non-blocking connect; it is in progress.

The function getnameinfo() transforms from the internal representation to a user-readable representation as a C-string. One interesting detail is that the hostname "localhost" comes back as "::1" after it was processed by getaddrinfo() and getnameinfo(). The string "::1" is the IPv6 localhost address. I expected that "localhost" would come back as "127.0.0.1", the IPv4 localhost address.

The function freeaddrinfo() frees the memory that getaddrinfo() allocated.

MS-Windows TCP client open

The client_open() function transforms hostname and port/service into a SOCKET descriptor. The Winsock2 library has no open() function, so we have to build our own with the functions getaddrinfo(), socket(), ioctlsocket(), connect(), inet_ntop() and freeaddrinfo().

void client_open(CONN_ID id, const char* port, const char* hostname) {
  assert(id >= 0 && id < CONNMAX);
  assert(port != NULL);
  assert(hostname != NULL);
  printf("II %s: port=%s hostname=%s\n", FUNCTION, port, hostname);

  FD_ZERO(&conn[id].fds);
  conn[id].typ = CONN_CLIENT;

  strcpy_s(conn[id].port, sizeof conn[id].port, port);
  strcpy_s(conn[id].hostname, sizeof conn[id].hostname, hostname);
  void client_open1(CONN_ID id);
  client_open1(id);
}

void client_open1(CONN_ID id) {
  assert(id >= 0 && id < CONNMAX);
  struct addrinfo hints;
  memset(&hints, 0, sizeof hints);
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;

  struct addrinfo* res;
  int rv = getaddrinfo(conn[id].hostname, conn[id].port, &hints, &res);
  if (rv != 0) {
    fprintf(stderr, "EE %s getaddrinfo: %d\n", FUNCTION, rv);
    exit(EXIT_FAILURE);
  }

  // loop through all the results and connect to the first we can
  struct addrinfo* p;
  for (p = res; p != NULL; p = p->ai_next) {
    conn[id].sockfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
    if (INVALID_SOCKET == conn[id].sockfd) {
      perror("WW socket");
      continue;
    }

    // UNPv2 ch. 15.4, 16.2 non-blocking connect
    unsigned long val = 1;
    rv = ioctlsocket(conn[id].sockfd, FIONBIO, &val);
    if (rv != 0) {
      perror("WW ioctlsocket FIONBIO ON");
      closesocket(conn[id].sockfd);
      continue;
    }

    rv = connect(conn[id].sockfd, p->ai_addr, p->ai_addrlen);
    if (rv != 0) {
      if (WSAEWOULDBLOCK == WSAGetLastError()) {
        conn[id].isConnecting = 1;
      } else {
        perror("WW connect");
        closesocket(conn[id].sockfd);
        continue;
      }
    }

    break;  // exit loop after socket and connect were successful
  }

  assert(p != NULL && "connect try");
  void* src = get_in_addr((struct sockaddr*)p->ai_addr);
  char dst[INET6_ADDRSTRLEN];
  char* rv2 = inet_ntop(p->ai_family, src, dst, sizeof dst);
  assert(rv2 != NULL && "inet_ntop");

  freeaddrinfo(res);

  FD_SET(conn[id].sockfd, &conn[id].fds);

  printf("II %s: connect try to %s (%s) port %s socket %llu\n", FUNCTION,
         conn[id].hostname, dst, conn[id].port, conn[id].sockfd);
}

The client_open() function prepares the CONN object (Connection object). The FD_ZERO() macro initializes the read fd_set (file descriptor set). The fd_set contains all SOCKET descriptors that are part of one connection. A TCP client connection has only one SOCKET descriptor. A TCP server connection has one listen-to port SOCKET descriptor and one SOCKET descriptor for every connected TCP client.

The client_open1() function is called from client_open() and does the real work. The getaddrinfo() function transforms the hostname and port C-string information into the internal representation. See the chapters "Linux TCP client open" and "Linux miscellaneous" for details about the internal representation. The for loop after getaddrinfo() iterates over the "connection candidates" that getaddrinfo() found. We use the first "hit" of the candidates.

The socket() function uses the internal representation and returns a SOCKET descriptor.

The function ioctlsocket() with option FIONBIO and value 1 enables non-blocking connect. I/O-multiplexing works best if every function runs only for some microseconds. Slow system calls like blocking connect are not compatible with I/O-multiplexing.

This is another annoying difference between Linux and the MS-Windows system: some socket functions have an additional socket in the name, while others do not.

As in the Linux source code, non-blocking connect starts with the connect() function that returns immediately. The connect success or connect fail decision is done later in the function select(), see chapter "MS-Windows TCP non-blocking connect select" below.

The function inet_ntop() transforms from the internal representation to the C-string representation. The functions getaddrinfo() and inet_ntop() can handle IPv4 and IPv6 hostnames. The MS-Windows system has the getnameinfo() function, but I was not successful in getting a working TCP connection using this function.

The function freeaddrinfo() frees the memory that getaddrinfo() has allocated.

The function client_open1() does not return a value. The "return" of this function is changing the CONN object. In case of error, the program is terminated by exit() or assert(). I use this "fail fast" programming style for my production code, too. Every program is started from a sentinel program. After the "work" program terminates, the "sentinel" program restarts it.

Linux TCP non-blocking connect select

Please fasten your seat belts, there is a complicated topic ahead! The function conn_event_loop() performs two difficult jobs.

void conn_event_loop() {
  for(;;) {
    // virtualization pattern: join all read fds into one
    fd_set read_fds = conn[0].fds;
    for (int iconn = 1; iconn < CONNMAX; ++iconn) {
      for (int fd = 0; fd < FDMAX; ++fd) {
        if (FD_ISSET(fd, &conn[iconn].fds)) {
          FD_SET(fd, &read_fds);
        }
      }
    }
    // virtualization pattern: join all connect pending into one
    fd_set write_fds;
    FD_ZERO(&write_fds);
    for (int iconn = 0; iconn < CONNMAX; ++iconn) {
      if (conn[iconn].isConnecting) {
        FD_SET(conn[iconn].sockfd, &write_fds);
      }
    }

    struct timeval tv = {0, TIMEOUT * 1000};
    int rv = select(FDMAX, &read_fds, &write_fds, NULL, &tv);
    if (-1 == rv && EINTR != errno) {
      perror("EE select");
      exit(EXIT_FAILURE);
    }

    if (rv > 0) {
      // looking for data to read available
      for (int fd = 0; fd < FDMAX; ++fd) {
        if (FD_ISSET(fd, &read_fds)) {
          for (int iconn = 0; iconn < CONNMAX; ++iconn) {
            if (FD_ISSET(fd, &conn[iconn].fds)) {
              switch (conn[iconn].typ) {
              case CONN_CLIENT:
                client_handle(iconn, fd);
                break;
              case CONN_SERVER:
                server_handle(iconn, fd);
                break;
              }
            }
          }
        }
      }
      // looking for connect pending success or fail
      for (int iconn = 0; iconn < CONNMAX; ++iconn) {
        if (FD_ISSET(conn[iconn].sockfd, &write_fds)) {
          client_connect(iconn);
        }
      }
    }
    timer_walk();
  }
}

First, it implements the "virtualization pattern". In our program, we have only one select() function as the interface between the operating system and our application. But in our event programming application, we support the illusion that there are many different CONN objects. That is, our application can have 3 TCP servers listening on 3 different ports and 2 TCP clients running pseudo-parallel on top of this single select(). The definition of virtualization for this article is: multiply "something" at the same abstraction level. The real select() has one read fd_set. Every CONN object has its own read fd_set. All these per-object read fd_sets get merged into one temporary read_fds variable.

After the select() function returns, the read_fds variable is changed. Only the file descriptors that have read data available are still set. Now the "virtualization pattern" works the opposite way: from the one read_fds variable to the many CONN objects and their read callback functions. At the moment the read callback functions are still "hard-wired". But chat application versions 3, 7, and 8 show different callback function plugin solutions.

void client_handle(CONN_ID id, int fd) {
  assert(id >= 0 && id < CONNMAX);
  assert(fd >= 0 && fd < FDMAX);
  int cb_client_read(CONN_ID id, int fd);
  int rv = cb_client_read(id, fd);
  if (rv < 1) {
    int optval = 0;
    socklen_t optlen = sizeof optval;
    int rv = getsockopt(conn[id].sockfd, SOL_SOCKET, SO_ERROR, &optval, &optlen);
    assert(0 == rv && "getsockopt SOL_SOCKET SO_ERROR");
    fprintf(stderr, "WW %s: connect fail to %s port %s socket %d rv %d: %s\n",
            FUNCTION, conn[id].hostname, conn[id].port, conn[id].sockfd, rv,
            strerror(optval));
    client_reopen(id);
  }
}

The client_handle() function calls the TCP client business logic, the part of the framework that the application programmer provides. As in all production code, the highlights are hidden in a lot of checking and logging. The important part is the call to cb_client_read() and the reaction to its return value. A cb_client_read() return value of 0 tells us that the TCP connection was closed. A negative return value shows an error. Do you remember this TCP client logging line from the chapter Introduction:

WW client_handle: connect fail to ::1 port 60000 socket 3 rv 0: Success

The strerror() function translated the errno value into "Success", but the message was "connect fail". The reason for the connect fail was a read() with return value (rv) zero. That read() was successful, but a socket read() of 0 bytes tells us that the connection is closed. Every detail in this logging line makes perfect sense, but together it is a little puzzler. For the function client_reopen(), see below in this chapter.
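
As a short reminder of the read() return value semantics on a socket:

int n = read(fd, buf, sizeof buf);
if (n > 0) {
  // n bytes of data arrived
} else if (0 == n) {
  // orderly shutdown: the peer closed the connection
} else {
  // n is -1: an error occurred, see errno
}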

The second big job of conn_event_loop() is to handle the non-blocking connect in progress. The Linux select() uses the write fd_set for this purpose, and the MS-Windows select() uses the except fd_set. Here, I talk about the Linux select: As long as the non-blocking connect is in progress, the file descriptor for this connection in the write fd_set is cleared by select().

As long as the isConnecting flag is set in the CONN object, we set the file descriptor in the write_fds variable before select(). The non-blocking connect file descriptor SHALL NOT be set in the read_fds variable. We don't want to mix "read data is available" with "connect is pending". The simple rule is: a file descriptor with the value X can be set in the read_fds variable, in the write_fds variable, or in neither. The flag isConnecting controls the setting.

For many select() calls, the non-blocking connect remains pending and select() clears that entry in the write_fds variable. The client_handle() or server_handle() functions are not called.

After some seconds in a larger network, the operating system can say whether the connect() was a success or a failure. The next select() call will not clear the relevant file descriptor in the write_fds variable, and the function client_connect() is called.

void client_connect(CONN_ID id) {
  assert(id >= 0 && id < CONNMAX);
  conn[id].isConnecting = 0;
  int optval = 0;
  socklen_t optlen = sizeof optval;
  int rv = getsockopt(conn[id].sockfd, SOL_SOCKET, SO_ERROR, &optval, &optlen);
  assert(0 == rv && "getsockopt SOL_SOCKET SO_ERROR");
  if (0 == optval) {
    FD_SET(conn[id].sockfd, &conn[id].fds);   // now we read on this socket
    printf("II %s: connect success to %s port %s socket %d\n", FUNCTION,
           conn[id].hostname, conn[id].port, conn[id].sockfd);
  } else {
    fprintf(stderr, "WW %s: connect fail to %s port %s socket %d: %s\n", FUNCTION,
            conn[id].hostname, conn[id].port, conn[id].sockfd, strerror(optval));
    client_reopen(id);
  }
}


The job of client_connect() is easy. Clear the isConnecting flag. Get the success/failure information from the operating system with the function getsockopt(), option SO_ERROR. If the connect was a success, add the file descriptor sockfd to the CONN object fd_set; else call client_reopen() for "server polling".

void client_reopen(CONN_ID id) {
  assert(id >= 0 && id < CONNMAX);
  close(conn[id].sockfd);
  FD_CLR(conn[id].sockfd, &conn[id].fds);
  conn[id].sockfd = -1;
  after(5000, (cb_timer_t)client_open1, id);
  // ugly bug with after(0, ...
}


The function client_reopen() closes and reopens the client connection. For some reason unknown to me, there must be some time between close() and open(). I use the function after() to sleep for 5000 milliseconds in an I/O-multiplexing-friendly way. After 5 seconds, the function client_open1() is called with the parameter id. The function pointer and the object index together are a closure. I implemented the Timer object with its class function after() on top of select(). For details, see below.

MS-Windows TCP non-blocking connect select

The MS-Windows function select() uses the except fd_set (SOCKET descriptor set) to return "connect fail" for non-blocking connect.

void conn_event_loop() {
  for(;;) {
    // virtualization pattern: join all read fds into one
    fd_set read_fds = conn[0].fds;
    for (int iconn = 1; iconn < CONNMAX; ++iconn) {
      for (SOCKET fd = 0; fd < FDMAX; ++fd) {
        if (FD_ISSET(fd, &conn[iconn].fds)) {
          FD_SET(fd, &read_fds);
        }
      }
    }
    // virtualization pattern: join all connect pending into one
    fd_set except_fds;
    FD_ZERO(&except_fds);
    for (int iconn = 0; iconn < CONNMAX; ++iconn) {
      if (conn[iconn].isConnecting) {
        FD_SET(conn[iconn].sockfd, &except_fds);
      }
    }

    struct timeval tv = {0, TIMEOUT * 1000};
    int rv = select(FDMAX, &read_fds, NULL, &except_fds, &tv);
    if (SOCKET_ERROR == rv && WSAGetLastError() != WSAEINTR) {
      perror("EE select");
      exit(EXIT_FAILURE);
    }

    if (rv > 0) {
      // looking for data to read available
      for (SOCKET fd = 0; fd < FDMAX; ++fd) {
        if (FD_ISSET(fd, &read_fds)) {
          for (int iconn = 0; iconn < CONNMAX; ++iconn) {
            if (FD_ISSET(fd, &conn[iconn].fds)) {
              switch (conn[iconn].typ) {
              case CONN_CLIENT:
                client_handle(iconn, fd);
                break;
              case CONN_SERVER:
                server_handle(iconn, fd);
                break;
              }
            }
          }
        }
      }
      // looking for connect pending fail
      for (int iconn = 0; iconn < CONNMAX; ++iconn) {
        if (FD_ISSET(conn[iconn].sockfd, &except_fds)) {
          client_reopen(iconn);
        }
      }
    }
    timer_walk();
  }
}

The function conn_event_loop() is the heart of our I/O-multiplexing framework because the function select() is the core of I/O-multiplexing. Before we call select(), we collect the fds fd_set of every CONN object into one read_fds fd_set. I call this the virtualization pattern. After the select() function returns, we investigate the read_fds fd_set, check for SOCKET descriptors that have read data available, and find the fitting business logic callback function.

The second job of the function conn_event_loop() is handling non-blocking connect. We collect the SOCKET descriptors of all pending connects into the except fd_set variable except_fds. Linux uses the except fd_set only for "out of band" data available, see Conditions for a Ready Descriptor. MS-Windows documentation says "If the client uses the select function, success is reported in the writefds set, and failure is reported in the exceptfds set", see section Remarks. My MS-Windows solution only checks for the connect() fail. After the select() returns, we investigate the except fd_set, check for SOCKET descriptors that tell us "connect fail" and call the function client_reopen() for the fitting CONN object.

The third job is done in the function timer_walk(). The TIMER object implements a one-shot timer with millisecond resolution as an I/O-multiplexing compatible alternative to sleep() or other "time wait" functions. See the chapter "Linux Timer object" below. The TIMER object source code is identical in the Linux and MS-Windows implementations.

After the select() function returns, the read_fds variable is changed. Only the file descriptors that have read data available are still set. Now the "virtualization pattern" works the opposite way: from the one read_fds variable to the many CONN objects and their read callback functions. At the moment the read callback functions are still "hard-wired". But chat application versions 3, 7, and 8 show different callback function plugin solutions.

void client_handle(CONN_ID id, SOCKET fd) {
  assert(id >= 0 && id < CONNMAX);
  assert(fd < FDMAX);
  int cb_client_read(CONN_ID id, SOCKET fd);
  int rv = cb_client_read(id, fd);
  if (rv < 1) {
    // documentation conflict between
    // https://docs.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-getsockopt
    // https://docs.microsoft.com/en-us/windows/win32/winsock/sol-socket-socket-options
    unsigned long optval;
    socklen_t optlen = sizeof optval;
    int rv = getsockopt(conn[id].sockfd, SOL_SOCKET, SO_ERROR, (char*)&optval,
                        &optlen);
    assert(0 == rv && "getsockopt SOL_SOCKET SO_ERROR");
    fprintf(stderr, "WW %s: connect fail to %s port %s socket %llu rv %d: %s\n",
            FUNCTION, conn[id].hostname, conn[id].port, conn[id].sockfd, rv,
            strerror(optval));
    client_reopen(id);
  } else {
    conn[id].isConnecting = 0;  // hack: connect successful after first good read()
  }
}

The client_handle() function calls the TCP client business logic, the part of the framework that the application programmer provides. As in all production code, the algorithm is hidden in a lot of checking and logging. The important part is the call to cb_client_read() and the reaction to its return value. A cb_client_read() return value of 0 tells us that the TCP connection was closed. A negative return value shows an error. Do you remember this TCP client logging line from the chapter Introduction:

WW client_handle: connect fail to ::1 port 60000 socket 208 rv 0: No error

The strerror() function translated the errno value into "No error", but the message was "connect fail". The reason for the connect fail was a recv() with return value (rv) zero. That recv() was successful, but a socket recv() of 0 bytes tells us that the connection is closed. Every detail in this logging line makes perfect sense, but together it is a little puzzler. For the function client_reopen(), see below in this chapter.

A cb_client_read() return value of 1 or larger tells us that the business logic received data. We use this information to end the isConnecting state. This is a hack, that is, a bad solution, because the MS-Windows system does not provide an event for "connect success" - or I, as a programmer, did not understand the Microsoft documentation.

void client_reopen(CONN_ID id) {
  assert(id >= 0 && id < CONNMAX);
  closesocket(conn[id].sockfd);
  FD_CLR(conn[id].sockfd, &conn[id].fds);
  conn[id].sockfd = -1;
  client_open1(id);
}


The function client_reopen() closes and reopens the client connection. For some reason unknown to me, Linux needs some time between close() and open(). MS-Windows does not need this delay; waiting is even forbidden, because the MS-Windows select() always needs at least one SOCKET descriptor in its fd_set arguments.

Linux TCP server open

Again, we have to build our own TCP server open function, server_open().

void server_open(CONN_ID id, const char* port, const char* hostname) {
  assert(id >= 0 && id < CONNMAX);
  assert(port != NULL);
  // no assert hostname
  printf("II %s: port=%s hostname=%s\n", FUNCTION, port, hostname);

  FD_ZERO(&conn[id].fds);
  conn[id].typ = CONN_SERVER;

  struct addrinfo hints;
  memset(&hints, 0, sizeof hints);
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_flags = AI_PASSIVE;

  struct addrinfo* res;
  int rv = getaddrinfo(hostname, port, &hints, &res);
  if (rv != 0) {
    fprintf(stderr, "EE %s getaddrinfo: %s\n", FUNCTION, gai_strerror(rv));
    exit(EXIT_FAILURE);
  }

  struct addrinfo* p;
  for(p = res; p != NULL; p = p->ai_next) {
    conn[id].sockfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
    if (-1 == conn[id].sockfd) {
      perror("WW socket");
      continue;
    }

    int yes = 1;
    rv = setsockopt(conn[id].sockfd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));
    if (rv != 0) {
      perror("WW setsockopt SO_REUSEADDR");
      close(conn[id].sockfd);
      continue;
    }

    rv = bind(conn[id].sockfd, p->ai_addr, p->ai_addrlen);
    if (rv != 0) {
      perror("WW bind");
      close(conn[id].sockfd);
      continue;
    }

    break;  // exit loop after socket and bind were successful
  }

  freeaddrinfo(res);

  assert(p != NULL && "bind");

  rv = listen(conn[id].sockfd, 10);
  assert(0 == rv && "listen");

  // add the listener to the fd_set
  FD_SET(conn[id].sockfd, &conn[id].fds);

  printf("II %s: listen on socket %d\n", FUNCTION, conn[id].sockfd);
}


The TCP server source code is easier because the server_open() can be successful without any running TCP client program. There is no connect and therefore no non-blocking connect in the TCP server.

The function getaddrinfo() again transforms the port/service and optional hostname information into the internal representation. We need the socket library functions socket(), bind() and listen() to create a listen port. Every TCP client connects to the listen port. The TCP server then creates a new connection to the TCP client; data exchange is done through this data transfer connection, and the listen port is free for the next customer, another TCP client.

The function setsockopt() with option SO_REUSEADDR helps with the TIME_WAIT problem. After a TCP server terminates, the TCP stack in the operating system has an inhibit time of typically 60 seconds before another TCP server program can use the same listen port. With SO_REUSEADDR we disable this timeout.

The listen port file descriptor is stored in sockfd and is one of the file descriptors in the read fd_set. We need this information to see whether incoming data is from a new TCP client that wants to connect or from an already connected TCP client. In the first case, we have read data available at the listen port file descriptor.

void server_handle(CONN_ID id, int fd) {
  assert(id >= 0 && id < CONNMAX);
  assert(fd >= 0 && fd < FDMAX);
  if (fd == conn[id].sockfd) {
    // handle new connections
    struct sockaddr_storage remoteaddr; // client address
    socklen_t addrlen = sizeof remoteaddr;
    // newly accept()ed socket descriptor
    int newfd = accept(conn[id].sockfd, (struct sockaddr*)&remoteaddr, &addrlen);
    if (-1 == newfd) {
      perror("WW accept");
    } else {
      FD_SET(newfd, &conn[id].fds);
      char dst[NI_MAXHOST];
      int rv = getnameinfo((struct sockaddr*)&remoteaddr, sizeof remoteaddr, dst,
                           sizeof dst, NULL, 0, 0);
      assert(0 == rv && "getnameinfo");
      printf("II %s: new connection %s on socket %d\n", FUNCTION, dst, newfd);
    }
  } else {
    int cb_server_read(CONN_ID id, int fd);
    int rv = cb_server_read(id, fd);
    if (rv < 1) {
      printf("II %s: connection closed on socket %d\n", FUNCTION, fd);
      close(fd);
      FD_CLR(fd, &conn[id].fds);
    }
  }
}


The function conn_event_loop(), see chapter "Linux TCP non-blocking connect select", calls the function server_handle() if there is read data available for a CONN server object. The "handle a new connection from a new TCP client" part is done in the framework. The "handle read data available for existing connections" part is done by the business logic function cb_server_read(), see chapter "Linux Business logic". Because a remote close produces a socket read() with return value zero, the function cb_server_read() has to return the read() return value. If necessary, the function server_handle() closes the connection to a TCP client. In our chat application, a TCP server close() is always a reaction to a TCP client close(), but a TCP server can close a TCP connection for its own reasons, for example, to force the TCP clients to re-connect to another TCP server.

For a new TCP connection, the server calls the function accept() to get a file descriptor for the data exchange. The function getnameinfo() transforms the internal representation of the foreign/remote IP address of the newly connected TCP client into a user-readable representation as a C-string.

MS-Windows TCP server open

The MS-Windows TCP server open needs the same socket library functions: getaddrinfo(), socket(), setsockopt(), bind(), freeaddrinfo() and listen().

void server_open(CONN_ID id, const char* port, const char* hostname) {
  assert(id >= 0 && id < CONNMAX);
  assert(port != NULL);
  assert(hostname != NULL);
  printf("II %s: port=%s hostname=%s\n", FUNCTION, port, hostname);

  FD_ZERO(&conn[id].fds);
  conn[id].typ = CONN_SERVER;

  struct addrinfo hints;
  memset(&hints, 0, sizeof hints);
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_flags = AI_PASSIVE;

  struct addrinfo* res;
  int rv = getaddrinfo(hostname, port, &hints, &res);
  if (rv != 0) {
    fprintf(stderr, "EE %s getaddrinfo: %d\n", FUNCTION, rv);
    exit(EXIT_FAILURE);
  }

  struct addrinfo* p;
  for(p = res; p != NULL; p = p->ai_next) {
    conn[id].sockfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
    if (INVALID_SOCKET == conn[id].sockfd) {
      perror("WW socket");
      continue;
    }

    // documentation conflict between
    // https://docs.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-setsockopt
    // https://docs.microsoft.com/en-us/windows/win32/winsock/sol-socket-socket-options
    const unsigned long yes = 1;
    rv = setsockopt(conn[id].sockfd, SOL_SOCKET, SO_REUSEADDR, (const char*)&yes,
                    sizeof(yes));
    if (rv != 0) {
      perror("WW setsockopt SO_REUSEADDR");
      closesocket(conn[id].sockfd);
      continue;
    }

    rv = bind(conn[id].sockfd, p->ai_addr, p->ai_addrlen);
    if (rv != 0) {
      perror("WW bind");
      closesocket(conn[id].sockfd);
      continue;
    }

    break;  // exit loop after socket and bind were successful
  }

  freeaddrinfo(res);

  assert(p != NULL && "bind");

  rv = listen(conn[id].sockfd, 10);
  assert(0 == rv && "listen");

  // add the listener to the fd_set
  FD_SET(conn[id].sockfd, &conn[id].fds);

  printf("II %s: listen on socket %llu\n", FUNCTION, conn[id].sockfd);
}


The TCP server source code is easier because the server_open() can be successful without any running TCP client program. There is no connect and therefore no non-blocking connect in the TCP server.

The function getaddrinfo() again transforms the port/service and optional hostname information into the internal representation. We need the socket library functions socket(), bind() and listen() to create a listen port. Every TCP client connects to the listen port. The TCP server then creates a new connection to the TCP client; data exchange is done through this data transfer connection, and the listen port is free for the next customer, another TCP client.

The function setsockopt() with option SO_REUSEADDR helps with the TIME_WAIT problem. After a TCP server terminates, the TCP stack in the operating system has an inhibit time of typically 60 seconds before another TCP server program can use the same listen port. With SO_REUSEADDR we disable this timeout. The MS-Windows setsockopt() function needs a cast. The fourth and fifth parameters are a pointer to, and the size of, an opaque piece of data. The MS-Windows header file uses const char* for the fourth parameter. The real type depends on the option. The SO_REUSEADDR option needs a DWORD or unsigned long parameter.

The listen port SOCKET descriptor is stored in sockfd and is one of the descriptors in the read fd_set. We need this information to see whether incoming data is from a new TCP client that wants to connect or from an already connected TCP client. In the first case, we have read data available at the listen port SOCKET descriptor.

void server_handle(CONN_ID id, SOCKET fd) {
  assert(id >= 0 && id < CONNMAX);
  assert(fd < FDMAX);
  if (fd == conn[id].sockfd) {
    // handle new connections
    struct sockaddr_storage remoteaddr; // client address
    socklen_t addrlen = sizeof remoteaddr;
    // newly accept()ed socket descriptor
    SOCKET newfd = accept(conn[id].sockfd, (struct sockaddr*)&remoteaddr, &addrlen);
    if (INVALID_SOCKET == newfd) {
      perror("WW accept");
    } else {
      FD_SET(newfd, &conn[id].fds);
      void* src = get_in_addr((struct sockaddr*)&remoteaddr);
      char dst[INET6_ADDRSTRLEN];
      char* rv = inet_ntop(remoteaddr.ss_family, src, dst, sizeof dst);
      assert(rv != NULL && "inet_ntop");
      printf("II %s: new connection %s on socket %llu\n", FUNCTION, dst, newfd);
    }
  } else {
    int cb_server_read(CONN_ID id, SOCKET fd);
    int rv = cb_server_read(id, fd);
    if (rv < 1 || SOCKET_ERROR == rv) {
      printf("II %s: connection closed on socket %llu\n", FUNCTION, fd);
      closesocket(fd);
      FD_CLR(fd, &conn[id].fds); // remove from fd_set
    }
  }
}


The function conn_event_loop(), see chapter "MS-Windows TCP non-blocking connect select", calls the function server_handle() if there is read data available for a CONN server object. The "handle a new connection from a new TCP client" part is done in the framework. The "handle read data available for existing connections" part is done by the business logic function cb_server_read(), see chapter "MS-Windows Business logic". Because a remote close produces a socket recv() with return value zero, the function cb_server_read() has to return the recv() return value. If necessary, the function server_handle() closes the connection to a TCP client. In our chat application, a TCP server close is always a reaction to a TCP client close, but a TCP server can close a TCP connection for its own reasons, for example, to force the TCP clients to re-connect to another TCP server.

For a new TCP connection, the server calls the function accept() to get a file descriptor for the data exchange. The function inet_ntop() transforms the internal representation of the foreign/remote IP address of the newly connected TCP client into a user-readable C-string representation.

Linux Timer object

In I/O-multiplexing programs, slow system calls are forbidden. Every function SHALL only run for microseconds. The function sleep() is a very slow function. We replace it with the callback register function after(). There is an after command in the programming language Tcl/Tk. The Tk part of Tcl/Tk lives on today as Tkinter in the programming language Python, see this Python after() example.

typedef void (*cb_timer_t)(int id);

typedef struct  {
  cb_timer_t cb_timer;  // cb_timer and arg are a closure
  int arg;
  struct timespec ts;   // expire time
} TIMER;

static TIMER timer[TIMERMAX];  // Timer objects array

int after(int interval, cb_timer_t cb_timer, int arg) {
  assert(interval >= 0);
  assert(cb_timer != NULL);
  // no assert arg

  int id;
  for (id = 0; id < TIMERMAX; ++id) {
    if (NULL == timer[id].cb_timer) {
      break;  // found a free entry
    }
  }
  assert (id < TIMERMAX && "timer array full");

  // convert interval in milliseconds to timespec
  struct timespec dts;
  dts.tv_nsec = (interval % 1000) * 1000000;
  dts.tv_sec = interval / 1000;
  struct timespec now;
  clock_gettime(CLOCK_MONOTONIC, &now);
  timer[id].cb_timer = cb_timer;
  timer[id].arg = arg;
  timer[id].ts.tv_nsec = (now.tv_nsec + dts.tv_nsec) % 1000000000;
  timer[id].ts.tv_sec = (now.tv_nsec + dts.tv_nsec) / 1000000000;
  timer[id].ts.tv_sec += (now.tv_sec + dts.tv_sec);
  /*
  printf("II %s now=%ld,%ld dt=%ld,%ld ts=%ld,%ld\n", FUNCTION,
    now.tv_sec, now.tv_nsec, dts.tv_sec, dts.tv_nsec,
    timer[id].ts.tv_sec, timer[id].ts.tv_nsec);
  */
  return id;
}

void timer_walk() {
  // looking for expired timers
  for (int i = 0; i < TIMERMAX; ++i) {
    if (timer[i].cb_timer != NULL) {
      struct timespec ts;
      clock_gettime(CLOCK_MONOTONIC, &ts);
      if ((ts.tv_sec > timer[i].ts.tv_sec) || (ts.tv_sec == timer[i].ts.tv_sec
          && ts.tv_nsec >= timer[i].ts.tv_nsec)) {
        TIMER tmp = timer[i];
        // erase array entry because called function can overwrite this entry
        memset(&timer[i], 0, sizeof timer[i]);
        assert(tmp.cb_timer != NULL);
        (*tmp.cb_timer)(tmp.arg);
      }
    }
  }
}

The Timer object implements a one-shot timer. The TIMER struct has the variables cb_timer and arg to store a closure and the variable ts to store the expiration time. At that time or later, the callback function is executed.

The function after() checks its parameters with assert() and then has a for loop that implements the factory pattern. We have a fixed array of structs, timer[TIMERMAX], and the for loop searches for a free element in this array.

The next part calculates the expiration time as the sum of a time difference, the after() parameter interval, and the current system time. We get the current system time with the function clock_gettime(). The calculation is tricky because the system time has an integer part tv_sec and a fraction part tv_nsec. During the addition, we can have an overflow or carry from the fraction part to the integer part, for example if we add 1.6 + 0.5. Another little detail is the units of the variables. The parameter interval has the unit milliseconds, tv_nsec has the unit nanoseconds and tv_sec has the unit seconds. One second is 1000 milliseconds or 1000000000 nanoseconds. If you want to see the details of the calculation, uncomment the printf() statement.
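
A short worked example of the carry, assuming after() is called with interval 1600 while the current system time is 5.5 seconds:

// interval = 1600 ms  ->  dts = { tv_sec = 1, tv_nsec = 600000000 }
// now = { tv_sec = 5, tv_nsec = 500000000 }
// tv_nsec: (500000000 + 600000000) % 1000000000 = 100000000
// carry:   (500000000 + 600000000) / 1000000000 = 1
// tv_sec:  1 + (5 + 1) = 7
// result:  ts = { tv_sec = 7, tv_nsec = 100000000 }, that is 5.5 + 1.6 = 7.1 seconds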

The timer_walk() function is called after select() in the function conn_event_loop(), see chapter "Linux TCP non-blocking connect select" above. This function uses a for loop over the array of structs timer[TIMERMAX]. Again, the calculation is tricky. If the integer part of the current system time is larger than the integer part of the expiration time, the fraction part is not important. For example, the current system time is 2.0 seconds and the expiration time is 1.9 seconds. But if both integer parts are equal, we have to check the fraction parts.

A last tricky part is the call of the callback function. We copy the array element to a temporary variable tmp, erase the array element with the function memset(), and then call the callback function. There is a high probability that the callback function will contain an after() function call. And there is a high probability that the array element we just erased will get the values of the new after(). That is, we SHALL NOT tamper with the array of structs timer[TIMERMAX] after we have done the callback function call.
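
As an illustration, a periodic timer can be built from the one-shot timer by re-registering inside the callback. The function name tick() is my example, not part of the article's source; it works exactly because timer_walk() erases the TIMER entry before the callback runs.

// Example: periodic one second tick built on the one-shot after()
void tick(int arg) {
  printf("II tick arg=%d\n", arg);
  after(1000, tick, arg);  // re-register, may re-use the entry that just expired
}

// during initialization, before the event loop:
// after(1000, tick, 42);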

By the way, I have been writing I/O-multiplexing programs for more than 20 years now, and my programs have been in operational use for the same time. I learned the necessary tricks the hard way, and I hope that you understand why this cunning is necessary.

MS-Windows Timer object

The MS-Windows timer object is the same as the Linux timer object, see chapter "Linux Timer object" above. The timer object is a good example of C/C++ "write once". The core of every re-use of source code is "write once". A big part of the success of the UNIX operating system was that it was written in the programming language C and that you could re-use more than 90% of the UNIX operating system source code on the next computer. The IBM OS/360 operating system was written in assembler. There was nearly no source code re-use.

Linux miscellaneous

There are some parts of the source code we have not discussed.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <libgen.h>     // basename()
#include <time.h>

// Linux
#include <errno.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/ioctl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <arpa/inet.h>

#define FUNCTION __func__

enum {
  BUFSIZE = 1460,   // Ethernet packet size minus IPv4 TCP header size
  CONNMAX = 10,     // maximum number of Connection objects
  FDMAX = 64,       // maximum number of open file descriptors
  CONN_SERVER = 1,  // TCP server connection object label
  CONN_CLIENT = 2,  // TCP client connection object label
  TIMEOUT = 40,     // select() timeout in milliseconds
  STRMAX = 80,      // maximum length of a C-string
  TIMERMAX = 10,    // maximum number of Timer objects
};

// Copies a string with security enhancements
// (this simple version always NUL-terminates and silently truncates)
void strcpy_s(char* dest, size_t n, const char* src) {
  strncpy(dest, src, n);
  dest[n - 1] = '\0';
}

We need some header files. Please look at the man page of each function to see which header file is necessary for which function. Please remember that POSIX and other standards bodies like to move functions from one header file to another. This is necessary work, but comes as a surprise sometimes.

The only #define in our source code needs some explanation. The compiler constant __func__ is replaced by a C-string that tells the function name. We use FUNCTION, or rather its macro preprocessor replacement __func__, in printf() statements. I used a #define because the C++ versions of the program will use __PRETTY_FUNCTION__ instead of __func__. Some very old compilers need __FUNC__ or __FUNCTION__ instead of __func__, because __func__ became part of the standards only with C99 and C++11.
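
One possible way to keep FUNCTION portable across the C and C++ versions is a conditional definition. This exact form is my assumption, not taken from the Zip file:

#if defined(__cplusplus) && defined(__GNUC__)
#define FUNCTION __PRETTY_FUNCTION__  // GNU C++: full function signature
#else
#define FUNCTION __func__             // C99 and C++11 standard
#endif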

Instead of #define name constant, I use enum if I want to define integer constants. If you want to define floating-point constants or C-string constants, you still need #define in C. You can use constexpr since C++11. By the way, enum is part of the C language and #define is not. You see the difference in the compiler error messages. Please use enum as much as possible. It is good C programming style today.

The function strcpy_s() is part of the MS-Windows C library. I like robust programs and therefore I like the _s() C-string functions. For every function that changes a C-string, you SHALL give the C-string pointer and the C-string sizeof as two parameters. Then you can always range check your C-strings. The GNU C library does not have the _s() C-string functions, but it is easy to write them yourself.
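
A short usage example of the range-checked copy; the buffer name is hypothetical:

char name[STRMAX];
// always pass the pointer and the sizeof of the destination buffer
strcpy_s(name, sizeof name, "a C-string that is truncated if it is too long");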

static CONN conn[CONNMAX];  // Connection objects array

CONN_ID conn_factory() {
  for (CONN_ID id = 0; id < CONNMAX; ++id) {
    if (0 == conn[id].typ) {
      return id; // found a free entry
    }
  }
  assert(0 && "conn array full");
  return CONNMAX;
}


The array of structs conn[CONNMAX] and the function conn_factory() implement the factory pattern. See chapter "Linux Timer object" for the basic idea.

void conn_add_fd(CONN_ID id, int fd) {
  assert(id >= 0 && id < CONNMAX);
  assert(fd >= 0 && fd < FDMAX);
  FD_SET(fd, &conn[id].fds);
}

The function conn_add_fd() is necessary for the file descriptor trick in the function cb_client_read(), see chapter "Linux Business logic". The client CONN object uses the fd_set for this trick. The standard input file descriptor STDIN_FILENO is set in the client CONN object. Therefore the function cb_client_read() is called if there is read data available on the network socket or if there is read data available on standard input.
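
The real function is in the referenced chapter; the following sketch only shows the dispatch that the fd_set trick enables. The function signature follows the server callback, the bodies are my assumptions.

int cb_client_read(CONN_ID id, int fd) {
  char buf[BUFSIZE];
  if (STDIN_FILENO == fd) {
    // keyboard: forward the entered line to the TCP server
    int nbytes = read(fd, buf, sizeof buf);
    if (nbytes > 0) {
      send(conn[id].sockfd, buf, nbytes, 0);
    }
    return nbytes;  // 0 means end of file on standard input
  }
  // network socket: print what the TCP server sent
  int nbytes = read(fd, buf, sizeof buf);
  if (nbytes > 0) {
    fwrite(buf, 1, nbytes, stdout);
  }
  return nbytes;  // 0 means the TCP server closed the connection
}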

This ends the discussion of Linux version 1 of the chat application. But there are 8 more Linux versions and 9 more MS-Windows versions.

MS-Windows miscellaneous

There are some parts of the source code we have not discussed.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <libgen.h>     // basename()
#include <time.h>

// MS-Windows
#include <conio.h>      // _kbhit()
#include <winsock2.h>
#include <ws2tcpip.h>   // getaddrinfo()

#define FUNCTION __func__

enum {
  BUFSIZE = 1460,   // Ethernet packet size minus IPv4 TCP header size
  CONNMAX = 10,     // maximum number of Connection objects
  FDMAX = 256,      // maximum number of open file descriptors
  CONN_SERVER = 1,  // TCP server connection object label
  CONN_CLIENT = 2,  // TCP client connection object label
  TIMEOUT = 40,     // select() timeout in milliseconds
  STRMAX = 80,      // maximum length of a C-string
  TIMERMAX = 10,    // maximum number of Timer objects
};

// get sockaddr, IPv4 or IPv6:
void* get_in_addr(struct sockaddr* sa) {
  assert(sa != NULL);
  if (AF_INET == sa->sa_family) {
    return &(((struct sockaddr_in*)sa)->sin_addr);
  } else {
    return &(((struct sockaddr_in6*)sa)->sin6_addr);
  }
}

We need some header files. Please look at the man page of each function to see which header file is necessary for which function. Please remember that POSIX and other standards bodies like to move functions from one header file to another. This is necessary work, but comes as a surprise sometimes.

The one and only #define in our source code needs some explanation. The compiler constant __func__ is replaced by a C-string that tells the function name. We use FUNCTION, or rather its macro preprocessor replacement __func__, in printf() statements. I used a #define because the C++ versions of the program will use __PRETTY_FUNCTION__ instead of __func__. Some very old compilers need __FUNC__ or __FUNCTION__ instead of __func__, because __func__ became part of the standards only with C99 and C++11.

Instead of #define name constant, I use enum if I want to define integer constants. If you want to define floating-point constants or C-string constants, you still need #define in C. You can use constexpr since C++11. By the way, enum is part of the C language and #define is not. You see the difference in the compiler error messages. Please use enum as much as possible. It is good C programming style today.

You have seen that the functions getaddrinfo(), freeaddrinfo() and inet_ntop() gave our program its IPv6 functionality. The helper function get_in_addr() cares about the difference between IPv4 and IPv6. The internal representation of the IP end-point struct differs between IPv4 and IPv6. The end-point struct contains the IP address, the port number, and other information. The IPv4 end-point struct is struct sockaddr_in, the IPv6 end-point struct is struct sockaddr_in6. Both have the IP address information in different struct elements, sin_addr and sin6_addr. The function get_in_addr() returns a pointer to the IP address from a generic struct sockaddr pointer, depending on the IP version in use. The return value of function get_in_addr() is void*, the most dangerous type in C and C++. The only reason for void* is that the function inet_ntop() expects a void* argument. Nobody defined a nice pointer type covering both the sin_addr pointer and the sin6_addr pointer.

static CONN conn[CONNMAX];  // Connection objects array

CONN_ID conn_factory() {
  for (CONN_ID id = 0; id < CONNMAX; ++id) {
    if (0 == conn[id].typ) {
      return id; // found a free entry
    }
  }
  assert(0 && "conn array full");
  return CONNMAX;
}


The array of structs conn[CONNMAX] and the function conn_factory() implement the factory pattern. See chapter "Linux Timer object" for the basic idea.

This ends the discussion of MS-Windows version 1 of the chat application. But there are 8 more Linux versions and 8 more MS-Windows versions.

To be continued ...

Contact

You can contact the author via e-mail: