Web Client Application (epwget)
===============================

Introduction
------------

The ``epwget`` program is a sample event-driven HTTP client that sends HTTP
requests and receives web pages in the HTTP responses. ``epwget`` uses the
``epoll`` (event poll) interface to detect when an mTCP socket is ready for
read or write operations.

Code Walkthrough
----------------

The following sections explain the main components of the ``epwget`` code.
All mOS library functions used in the sample code are prefixed with ``mtcp_``
and are explained in detail in the
`Programmer's Guide - mOS Programming API`_. Error handling is omitted from
the code snippets for brevity.

(1) The main() Function
~~~~~~~~~~~~~~~~~~~~~~~

The ``main()`` function performs initialization and starts one execution
thread per CPU core.

The first task is to initialize mOS from the mOS configuration file.
``fname`` holds the path to the ``mos.conf`` file, which is passed to
``mtcp_init()``. The ``mtcp_getconf()`` function then retrieves the current
configuration settings from the mOS core.

.. code-block:: c

    /* parse mos configuration file */
    ret = mtcp_init(fname);

    mtcp_getconf(&g_mcfg);
    core_limit = g_mcfg.num_cores;

The next step is global parameter initialization using the ``GlbInitWget()``
function, which is described in the next section.

The last step is to create and run the per-core mTCP threads. For each CPU
core, ``main()`` creates a new thread whose entry point is ``RunMTCP()``.

.. code-block:: c

    for (i = 0; i < core_limit; i++)
        pthread_create(&mtcp_thread[i], NULL, RunMTCP, (void *)&cores[i]);

(2) The Global Parameter Initialization Function
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``GlbInitWget()`` function loads the ``epwget`` application-specific
configuration from the ``epwget.conf`` file.
The following code block shows an example ``epwget.conf``. The ``url``
parameter sets the URL of the file to download, and ``dest_port`` specifies
the port number of the web server to connect to. ``total_flows`` is the total
number of flows (in other words, the total number of downloads), and
``total_concurrency`` is the number of flows allowed to run at the same time.
Setting the ``core_limit`` parameter overrides the number of CPU cores the
application uses.

.. code-block:: text

    url = 10.0.0.3/64K
    dest_port = 80
    total_flows = 100000
    total_concurrency = 4000
    core_limit = 8

``GlbInitWget()`` reads the configuration file and saves the parameters in
global variables. Note that this ``epwget`` implementation assumes that the
maximum number of file descriptors an mTCP thread can create is three times
the user-defined number of concurrent flows. ``epwget`` overrides the
``max_concurrency`` and ``max_num_buffers`` parameters of the mOS
configuration using the ``mtcp_getconf()`` and ``mtcp_setconf()`` functions:

.. code-block:: c

    /* set the max number of fds to 3x the concurrency */
    max_fds = concurrency * 3;

    mtcp_getconf(&mcfg);
    mcfg.max_concurrency = max_fds;
    mcfg.max_num_buffers = max_fds;
    mtcp_setconf(&mcfg);

(3) The RunMTCP() Function
~~~~~~~~~~~~~~~~~~~~~~~~~~

Each thread executes the ``RunMTCP()`` function. It first pins the thread to
a CPU core and creates an mTCP context. Next, it calls ``RunApplication()``,
which uses sockets to create connections, send HTTP requests, and receive
HTTP responses.

.. code-block:: c

    /* affinitize the mTCP thread to a core */
    mtcp_core_affinitize(core);

    /* mTCP initialization */
    mctx = mtcp_create_context(core);

    RunApplication(mctx);

``RunApplication()`` consists of two functions, ``InitWget()`` and
``RunWget()``.
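
Returning briefly to the configuration step in section (2), the sketch below
shows one way a ``key = value`` file such as ``epwget.conf`` could be parsed
and the 3x descriptor budget derived from it. It is not the actual
``GlbInitWget()`` code: ``struct wget_conf``, ``parse_conf_line()``, and
``max_fds_for()`` are hypothetical names introduced for illustration, and the
sketch uses only the standard C library. The ``concurrency`` argument is
whatever flow count the caller chooses to budget for.

.. code-block:: c

    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical container for the epwget.conf parameters. */
    struct wget_conf {
        char url[256];
        int dest_port;
        int total_flows;
        int total_concurrency;
        int core_limit;
    };

    /*
     * Parse one "key = value" line into conf.
     * Returns 1 if a known key was stored, 0 otherwise.
     */
    static int parse_conf_line(const char *line, struct wget_conf *conf)
    {
        char key[64], val[256];

        assert(line != NULL && conf != NULL);

        /* key: up to '=' or whitespace; value: next whitespace-free token */
        if (sscanf(line, " %63[^= \t] = %255s", key, val) != 2)
            return 0;

        if (strcmp(key, "url") == 0)
            snprintf(conf->url, sizeof(conf->url), "%s", val);
        else if (strcmp(key, "dest_port") == 0)
            conf->dest_port = atoi(val);
        else if (strcmp(key, "total_flows") == 0)
            conf->total_flows = atoi(val);
        else if (strcmp(key, "total_concurrency") == 0)
            conf->total_concurrency = atoi(val);
        else if (strcmp(key, "core_limit") == 0)
            conf->core_limit = atoi(val);
        else
            return 0; /* unknown key: ignored in this sketch */

        return 1;
    }

    /* The 3x rule from the text: three descriptors per concurrent flow. */
    static int max_fds_for(int concurrency)
    {
        return concurrency * 3;
    }

For example, feeding the line ``total_concurrency = 4000`` to
``parse_conf_line()`` and passing the stored value to ``max_fds_for()``
yields 12000, which would then be written to ``max_concurrency`` and
``max_num_buffers`` as in the snippet in section (2).
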
``InitWget()`` creates a thread context that holds thread-specific metadata,
including epoll-related variables and per-status flow statistics (e.g.,
started, pending, done, errors, and incompletes). An important role of
``InitWget()`` is to initialize the RSS (receive-side scaling) setup, which
involves deriving the source port number from the remaining three fields of
the TCP connection 4-tuple (source network address, destination network
address, and destination port number).

.. code-block:: c

    mtcp_init_rss(mctx, saddr, IP_RANGE, daddr, dport);

Afterwards, ``epwget`` creates the epoll instance used to receive read and
write availability events (the code is simplified for readability):

.. code-block:: c

    ep = mtcp_epoll_create(mctx, ctx->maxevents);

``RunWget()`` is the core of the program. Using the ``epoll`` event API, it
creates new connections and sends or receives data.

.. code-block:: c

    while (!done) {
        /* until it reaches the maximum number of concurrent connections, */
        while (mtcp_get_connection_cnt(ctx->mctx) < concurrency) {
            /* create a new connection */
            CreateConnection(ctx);
        }

        /* wait inside the epoll_wait call until there's any event */
        nevents = mtcp_epoll_wait(mctx, ctx->ep, ctx->events, ...);

        for (i = 0; i < nevents; i++) {
            if (ctx->events[i].events & MOS_EPOLLERR) {
                /* print an error message and close the connection */
                ...
            } else if (ctx->events[i].events & MOS_EPOLLIN) {
                /* read the data arrived at the socket buffer */
                HandleReadEvent(ctx, ctx->events[i].data.sock, ...);
            } else if (ctx->events[i].events == MOS_EPOLLOUT) {
                /* write HTTP request to the socket send buffer */
                SendHTTPRequest(ctx, ctx->events[i].data.sock, wv);
            }
        }
        ...
    }

Here are more detailed explanations of each helper function in the code
above:

* ``CreateConnection()`` creates a new mTCP socket, sets it to non-blocking
  mode, connects to the target web server, and adds the socket to the epoll
  event queue.

  .. code-block:: c

      sockid = mtcp_socket(mctx, AF_INET, SOCK_STREAM, 0);
      ...
      mtcp_setsock_nonblock(mctx, sockid);
      ...
      mtcp_connect(mctx, sockid, (struct sockaddr *)&addr, sizeof(struct sockaddr_in));
      ...
      mtcp_epoll_ctl(mctx, ctx->ep, MOS_EPOLL_CTL_ADD, sockid, &ev);

* ``SendHTTPRequest()`` builds the outgoing HTTP request header, writes it to
  the socket, and opens a file to store the response data.

  .. code-block:: c

      snprintf(request, HTTP_HEADER_LEN, "GET %s HTTP/1.0\r\n", ...);
      len = strlen(request);
      wr = mtcp_write(ctx->mctx, sockid, request, len);
      ...
      wv->fd = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);

* ``HandleReadEvent()`` reads the payload from the socket and stores the data
  in the file.

  .. code-block:: c

      rd = mtcp_read(mctx, sockid, buf, BUF_SIZE);
      /* parse the http header */
      ...
      if (writable) {
          /* store the data to the file */
          write(wv->fd, pbuf + wr, rd - wr);
      }

(4) Multi-process Version (DPDK-only)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can also run ``epwget`` in multi-process (single-threaded) mode. This
mode works only with the Intel DPDK driver. The ``epwget-mp`` program is
located in the same directory as ``epwget``. The overall design of
``epwget-mp`` is similar to that of ``epwget``, except that it does not use
``pthreads``. You can run ``epwget-mp`` on a 4-core machine using the
following script:

.. code-block:: bash

    #!/bin/bash
    ./epwget-mp -f config/mos-master.conf -c 0 &
    sleep 5
    for i in {1..3}
    do
        ./epwget-mp -f config/mos-slave.conf -c $i &
    done

The ``-c`` switch binds the process to a specific CPU core. Under DPDK
settings, the master process (core 0 in the example above) is responsible for
initializing the underlying DPDK-specific NIC resources once.
The slave processes (cores 1-3) share those initialized resources with the
master process. The master process relies on the ``mos-master.conf`` file for
configuration. It has only one new keyword, ``multiprocess = 0 master``,
where 0 is the CPU core ID. The ``mos-slave.conf`` configuration file has an
additional line, ``multiprocess = slave``, which (as the line suggests) sets
the process as a DPDK secondary (slave) instance. The script waits between
launching the master and the slave processes. This wait is mandatory: it
avoids potential race conditions on the shared resources that the master and
slave processes update.

.. _`Programmer's Guide - mOS Programming API`: ../programmer/04_mos_api.html