How things work#1 - Understanding web servers and sockets

Over the past couple of months I've been digging into how everything started. We live in a time when everything is abstracted, and it's easy to forget how it all works underneath. In this article I'll show you the very basics of how a server is created and how it accepts connections, and I'll discuss the most popular server design patterns. So without further ado, let's get started.

Sockets

When a client sends a request to a server over a TCP connection, the server has a predefined endpoint made of an IP address and port combination (for example 127.0.0.1:3000). The client has its own IP address and port combination, but we rarely see the client's port. Why? Because we don't really need to: every time the client opens a new TCP connection to the server, the operating system assigns it a port from a range called ephemeral ports. IANA suggests 49152-65535 for this range, although the exact range depends on the operating system. These ports are temporary, and a new one is picked for each new TCP connection. The unique combination of the client's IP + port and the server's IP + port is what identifies a TCP connection. A socket exists on each side of the connection: the client's socket is the initiator and the server's socket is the listener.
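To make the IP + port pairing concrete, here is a minimal sketch that opens a few connections to a throwaway local server and prints both ends of each one. The helper name `connection_ports` and the use of port 0 (which asks the OS for any free port) are assumptions for illustration, not something the rest of the article depends on.

```ruby
require 'socket'

# Hypothetical helper: opens `count` connections to one local server and
# returns a [client_port, server_port] pair for each connection.
def connection_ports(count)
  server = TCPServer.new('127.0.0.1', 0)   # port 0: let the OS pick a free port
  server_port = server.local_address.ip_port
  clients = Array.new(count) { TCPSocket.new('127.0.0.1', server_port) }
  clients.map { |c| [c.local_address.ip_port, c.remote_address.ip_port] }
ensure
  clients&.each(&:close)
  server&.close
end

connection_ports(2).each do |client_port, server_port|
  puts "127.0.0.1:#{client_port} -> 127.0.0.1:#{server_port}"
end
```

Every line printed shares the same server port, while each client port is different, which is what keeps the connections distinguishable.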

To define a socket in Ruby all we need to do is:

require 'socket'

socket = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM)

This creates a new instance of the Socket class with two arguments. The first, AF_INET, selects the IPv4 address family. The second, SOCK_STREAM, selects a stream socket, which means TCP in this case; if we wanted UDP we would pass SOCK_DGRAM instead.

Defining a server

After instantiating a socket, we need to bind it to a port of our own so that the client knows the server's IP address + port combination and can connect to us. It's best to choose the port from a specific range, roughly 1024-49151: ports below 1024 are the well-known ports, which are reserved for system services and usually need elevated privileges to bind, and ports above that range are the ephemeral ports discussed above.

Before building our first server we need to decide which IP address to bind it to. A system can have multiple network interfaces. One of them is the loopback interface, represented by localhost or 127.0.0.1; this special interface routes all outgoing requests back to the machine itself, hence the name loopback. You can also have other interfaces with different IP addresses. If you bind your server to 127.0.0.1, it only accepts connections arriving on the loopback interface, but if you bind it to 0.0.0.0 it accepts connections on all interfaces. This is very useful with containers, for example Docker, where you want the server inside the container to listen on all interfaces so that we can connect to it from outside.

The last step is to make the server actually listen for connections. A listening socket keeps a queue of pending connections, called the listen queue. Once the queue is full, new connections are dropped or refused, and the client may see a connection refused error. You set the maximum size of the queue by passing a number to listen in Ruby. The constant Socket::SOMAXCONN holds the conventional maximum for your system; print it to see the value on your machine.
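The binding and listening steps described above can be sketched as follows. The helper `listening_socket` is hypothetical, and port 0 is used only so the example runs without clashing with anything already on the machine; a real server would bind a fixed port.

```ruby
require 'socket'

# Hypothetical helper: binds a TCP socket to `host` and starts listening.
# Port 0 asks the OS for any free port.
def listening_socket(host, port = 0, backlog = Socket::SOMAXCONN)
  socket = Socket.new(:INET, :STREAM)
  socket.bind(Socket.pack_sockaddr_in(port, host))
  socket.listen(backlog)   # backlog = maximum size of the listen queue
  socket
end

loopback_only  = listening_socket('127.0.0.1')   # reachable only from this machine
all_interfaces = listening_socket('0.0.0.0')     # reachable through every interface

puts "Socket::SOMAXCONN is #{Socket::SOMAXCONN}"
[loopback_only, all_interfaces].each do |socket|
  addr = socket.local_address
  puts "listening on #{addr.ip_address}:#{addr.ip_port}"
  socket.close
end
```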

Once the server starts listening, we can start accepting connections with the accept call, which blocks until there is a connection to accept. We can create a socket bound to an IP address and port as follows:

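require 'socket'
local_socket = Socket.new(:INET, :STREAM)                 # IPv4, TCP
local_addr = Socket.pack_sockaddr_in(3000, '127.0.0.1')   # port 3000 on the loopback interface
local_socket.bind(local_addr)
local_socket.listen(Socket::SOMAXCONN)                    # start queueing incoming connections
connection, _ = local_socket.accept                       # blocks until a client connects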

Accepting a connection returns two things: the connection itself, which is a new Socket instance, and the address of the client that initiated the connection (its IP address + port combination, along with other info). Internally, when a connection is accepted the new socket is added to the process's file descriptors, so the process knows about the socket. You can learn more about file descriptors from a previous article I wrote here. This socket listens on localhost:3000, which means we can use the netcat command to connect to our server and check whether it's running.

nc localhost 3000

If it succeeds, you'll see the server script finish: accept was blocking, and it returns once the connection arrives. This was just an introduction to how servers are built. Let's get into the patterns!
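Before moving on, here is a small sketch of what accept hands back, written so it runs on its own. The helper `inspect_accept` is hypothetical, port 0 stands in for 3000 so the example cannot clash with a running server, and a TCPSocket plays the role of nc.

```ruby
require 'socket'

# Hypothetical helper: accepts one connection on a throwaway local server
# and reports what `accept` returned.
def inspect_accept
  server = Socket.new(:INET, :STREAM)
  server.bind(Socket.pack_sockaddr_in(0, '127.0.0.1'))   # port 0: any free port
  server.listen(Socket::SOMAXCONN)
  port = server.local_address.ip_port

  client = TCPSocket.new('127.0.0.1', port)    # plays the role of `nc`
  connection, client_addr = server.accept      # a client is already queued, so no wait

  {
    listener_fd:   server.fileno,              # file descriptor of the listening socket
    connection_fd: connection.fileno,          # a separate descriptor for this connection
    client_port:   client_addr.ip_port,
    same_client:   client_addr.ip_port == client.local_address.ip_port
  }
ensure
  [client, connection, server].compact.each(&:close)
end

p inspect_accept
```

The listening socket and the accepted connection show up as two different file descriptors in the same process, and the address returned by accept matches the client's own end of the connection.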

Different network architecture patterns

Before diving in, I want to quickly talk about the different network architecture patterns that exist. We rely on these patterns every day, whether we spin up a web server or visit a website, and we take them for granted. I'll briefly explain each one, and then we'll get to the server implementation.

  1. Serial Pattern
  2. Process Per Communication
  3. Thread Per Communication
  4. Preforking
  5. Thread Pool
  6. Evented (Reactor)
  7. Hybrid

Serial Pattern

With this pattern all connections are handled serially, with no concurrency. Every client must wait in line until the client ahead of it finishes. The advantage is simplicity: with no concurrency, you avoid the many headaches that come with it. The disadvantage is performance. It is slow even in the best case, so imagine what happens when one client makes a slow request: everyone behind it waits.
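A minimal sketch of the serial pattern might look like this. The handler `handle_client`, the `max_connections` limit, and the upper-casing "protocol" are all assumptions added so the example is small and terminates; a real server would loop forever.

```ruby
require 'socket'

# Hypothetical request handler: read one line, reply in upper case.
def handle_client(connection)
  line = connection.gets.to_s
  connection.puts(line.upcase)
ensure
  connection.close
end

# Serial pattern: accept one connection, handle it fully, then accept the next.
def serve_serially(server, max_connections)
  max_connections.times do
    connection = server.accept   # blocks until a client is waiting
    handle_client(connection)    # no other client is served until this returns
  end
end

server = TCPServer.new('127.0.0.1', 0)   # port 0: let the OS pick a free port
port = server.local_address.ip_port

# Both clients connect up front; the second one waits in the listen queue.
clients = %w[first second].map do |word|
  TCPSocket.new('127.0.0.1', port).tap { |client| client.puts(word) }
end

serve_serially(server, clients.size)
clients.each { |client| puts client.gets; client.close }
server.close
```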

Process Per Communication

This architecture creates an entirely new process (via forking) just to handle one client's request, and the process dies once that request finishes. The server can keep accepting incoming connections while the child processes handle clients' requests, but the overhead of one process per request is considerable. The pattern still has the advantage of simplicity, and it achieves parallelism and/or concurrency, depending on the machine of course. The main disadvantage is that the number of processes grows linearly with the number of requests, which can overload the machine and make it unusable.
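As a rough sketch, and assuming a Unix-like system where fork is available, the pattern could be written like this. The handler, the `max_connections` limit, and the PID prefix in the reply are illustrative assumptions.

```ruby
require 'socket'

# Hypothetical handler: reply with the serving process's PID and the upper-cased line.
def handle_client(connection)
  line = connection.gets.to_s
  connection.puts("#{Process.pid}: #{line.upcase}")
ensure
  connection.close
end

# One forked child per accepted connection.
def serve_with_processes(server, max_connections)
  max_connections.times do
    connection = server.accept
    pid = fork do
      server.close               # the child only talks to this one client
      handle_client(connection)
      exit!(0)                   # exit without running hooks inherited from the parent
    end
    connection.close             # the child holds its own copy of the socket
    Process.detach(pid)          # reap the child when it exits, avoiding zombies
  end
end

server = TCPServer.new('127.0.0.1', 0)   # port 0: let the OS pick a free port
port = server.local_address.ip_port
clients = %w[first second].map do |word|
  TCPSocket.new('127.0.0.1', port).tap { |client| client.puts(word) }
end

serve_with_processes(server, clients.size)
clients.each { |client| puts client.gets; client.close }
server.close
```

Each reply carries a different PID, and none of them is the PID of the parent that called accept.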

Thread Per Communication

This is similar to the approach above but lighter, since we create threads rather than processes, and threads are more lightweight than processes. But because all threads share the same memory, we might need synchronization and locking between them to prevent race conditions. Another disadvantage is that as the number of threads increases, so does the overhead of the kernel context switching between them. It also shares the disadvantage of the approach above: as the number of requests grows, the number of threads grows with it, which can overwhelm the system and make it unusable.
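Here is a minimal sketch of one thread per connection, including a tiny piece of shared state to show where locking comes in. The handler, the `max_connections` limit, and the `handled` counter are assumptions for illustration.

```ruby
require 'socket'

# Hypothetical request handler: read one line, reply in upper case.
def handle_client(connection)
  line = connection.gets.to_s
  connection.puts(line.upcase)
ensure
  connection.close
end

# One new thread per accepted connection. Returns how many were handled.
def serve_with_threads(server, max_connections)
  handled = 0
  lock = Mutex.new   # guards `handled`, which every thread updates

  threads = Array.new(max_connections) do
    connection = server.accept
    Thread.new(connection) do |conn|
      handle_client(conn)
      lock.synchronize { handled += 1 }
    end
  end

  threads.each(&:join)
  handled
end

server = TCPServer.new('127.0.0.1', 0)   # port 0: let the OS pick a free port
port = server.local_address.ip_port
clients = %w[first second third].map do |word|
  TCPSocket.new('127.0.0.1', port).tap { |client| client.puts(word) }
end

puts "handled #{serve_with_threads(server, clients.size)} connections"
clients.each { |client| puts client.gets; client.close }
server.close
```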

Preforking

This approach improves on process per communication. A main server process forks a predefined number of child processes up front, for example 10. The children inherit the parent's file descriptors and therefore the server socket. The kernel load balances incoming connections across all the processes accepting on that socket. The main process has to keep an eye on the children and respawn one if it dies unexpectedly. This pattern has the advantage of keeping everything isolated, because each process has its own memory. However, forking even 10 processes can be expensive: since each process gets its own memory, if one process takes 100 MB then about 1 GB of memory goes to the workers alone. That figure ignores copy-on-write semantics, which can save memory where available.
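A stripped-down sketch of preforking, again assuming a Unix-like system with fork. The `connections_per_worker` limit is an assumption so the demo ends; real workers would loop forever, and the parent would respawn any that die rather than simply waiting for them.

```ruby
require 'socket'

# Hypothetical handler: reply with the worker's PID so we can see who served us.
def handle_client(connection)
  line = connection.gets.to_s
  connection.puts("#{Process.pid}: #{line.upcase}")
ensure
  connection.close
end

# Forks `workers` children up front and returns their PIDs.
def prefork(server, workers:, connections_per_worker:)
  Array.new(workers) do
    fork do
      connections_per_worker.times do
        handle_client(server.accept)   # all children accept on the inherited socket
      end
      exit!(0)                         # exit without running hooks inherited from the parent
    end
  end
end

server = TCPServer.new('127.0.0.1', 0)   # port 0: let the OS pick a free port
port = server.local_address.ip_port
clients = %w[first second].map do |word|
  TCPSocket.new('127.0.0.1', port).tap { |client| client.puts(word) }
end

pids = prefork(server, workers: 2, connections_per_worker: 1)
clients.each { |client| puts client.gets; client.close }
pids.each { |pid| Process.wait(pid) }    # the parent reaps its children here
server.close
```

Note that the parent never calls accept itself; it only creates the socket and supervises the workers.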

Thread Pooling

Similar to preforking, this pattern spawns a predefined number of threads and hands each connection to an available thread. As with preforking, the kernel makes sure each connection goes to a single thread. The advantage is that we can spawn more workers than in the preforking pattern above, because threads are more lightweight than processes. The main thread keeps monitoring its worker threads while each one accepts connections and handles them. This approach works well for concurrent processing without being a heavy burden on the system.
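A thread pool can be sketched along the same lines. The worker numbering in the reply and the `connections_per_thread` limit are assumptions so that the demo shows which thread served each client and then stops.

```ruby
require 'socket'

# Hypothetical handler: reply with the pool thread's number and the upper-cased line.
def handle_client(connection, worker)
  line = connection.gets.to_s
  connection.puts("worker #{worker}: #{line.upcase}")
ensure
  connection.close
end

# Spawns a fixed pool of threads that all accept on the same listening socket.
def serve_with_thread_pool(server, pool_size:, connections_per_thread:)
  pool = Array.new(pool_size) do |worker|
    Thread.new(worker) do |id|
      connections_per_thread.times do
        connection = server.accept   # each connection is handed to exactly one thread
        handle_client(connection, id)
      end
    end
  end
  pool.each(&:join)
end

server = TCPServer.new('127.0.0.1', 0)   # port 0: let the OS pick a free port
port = server.local_address.ip_port
clients = %w[one two three four].map do |word|
  TCPSocket.new('127.0.0.1', port).tap { |client| client.puts(word) }
end

serve_with_thread_pool(server, pool_size: 2, connections_per_thread: 2)
clients.each { |client| puts client.gets; client.close }
server.close
```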

Evented (Reactor)

This pattern has gained a lot of popularity in the past few years. It uses a single thread in a single process, yet it achieves a very high level of concurrency, on par with the other patterns.

How it works

  1. The server monitors the listening socket for incoming connections.
  2. Upon receiving a new connection it adds it to the list of sockets to monitor.
  3. The server now monitors the active connection as well as the listening socket.
  4. Upon being notified that the active connection is readable the server reads a chunk of data from that connection and dispatches the relevant callback.
  5. Upon being notified that the active connection is still readable the server reads another chunk and dispatches the callback again.
  6. The server receives another new connection; it adds that to the list of sockets to monitor.
  7. The server is notified that the first connection is ready for writing, so the response is written out on that connection.

This is built on a Unix system call such as select(2), which is less commonly used now that there are better options such as epoll(7). I encourage you to check these out. You give them a set of sockets to watch for reading and writing, and whenever a socket becomes ready they return it so it can be processed.
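The steps above can be sketched with Ruby's IO.select, which wraps this kind of system call. This is a simplified illustration rather than a production reactor: the `max_connections` limit, the one-line upper-casing "protocol", and the blocking write for small replies are all assumptions.

```ruby
require 'socket'

# Minimal single-threaded reactor sketch built on IO.select.
def serve_evented(server, max_connections)
  reading  = [server]                          # sockets monitored for readability
  buffers  = Hash.new { |h, io| h[io] = +'' }  # partial input per connection
  pending  = {}                                # connection => reply waiting to be written
  finished = 0

  while finished < max_connections
    readable, writable, = IO.select(reading, pending.keys)

    readable.each do |io|
      if io == server
        reading << server.accept               # new connection: start monitoring it
        next
      end

      chunk = io.read_nonblock(4096, exception: false)
      case chunk
      when :wait_readable then next            # nothing to read after all
      when nil                                 # client hung up before sending a line
        reading.delete(io)
        buffers.delete(io)
        io.close
        finished += 1
      else
        buffers[io] << chunk
        if buffers[io].include?("\n")          # a full request has arrived
          reading.delete(io)
          pending[io] = buffers.delete(io).upcase
        end
      end
    end

    writable.each do |io|
      io.write(pending.delete(io))             # small reply; a fuller reactor would use write_nonblock
      io.close
      finished += 1
    end
  end
end

server = TCPServer.new('127.0.0.1', 0)   # port 0: let the OS pick a free port
port = server.local_address.ip_port
clients = %w[first second].map do |word|
  TCPSocket.new('127.0.0.1', port).tap { |client| client.puts(word) }
end

serve_evented(server, clients.size)
clients.each { |client| puts client.gets; client.close }
server.close
```

The loop never blocks on any single connection: it only acts on sockets that IO.select reports as ready, which is how one thread can keep many connections moving.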

Hybrid

This is not a specific pattern but a combination of two or more of the patterns discussed above. For example nginx, a popular web server, combines the preforking pattern with the reactor pattern to serve millions of concurrent requests. This makes the most of the server's resources.

Summary

It's important to understand how things work from the bottom up because, in my opinion, it can boost your creativity and help you come up with new, even undiscovered patterns! This concludes part 1 of this 2-part article. In the next one we'll actually build a server from scratch, step by step. Till we meet again!

References

Working with Ruby by Jesse Storimer. This book is amazing and I encourage everyone to read it at least once; it can really change the way you think and help boost your creativity.
