Story of a connection

April 16, 2019 ☼ Interview

You type mysite.com and…

Ah, one of the classic interview questions. I really like this one.

Can’t really say the same about “What’s the output of the following statement in JS?”

0.1 + 0.2 === 0.3

Oh well, back to our question. Shall we? Here we go…

Our user has pressed the Enter key. Before it can connect to the website, the computer first needs to look up the IP (Internet Protocol) address behind the hostname (mysite.com). Every machine attached to a network has an IP address, and that address is how other machines find it.

An IP address looks like this: 127.0.0.1

There are actually two versions of the IP protocol: IPv4 and IPv6

The latter was created because we were running out of IPv4 addresses. Oops…
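To get a feel for the difference between the two versions, here is a small sketch using Python's standard `ipaddress` module. The `describe` helper and the two constants are names I made up for illustration.

```python
import ipaddress


def describe(address: str) -> str:
    """Return a short description of an IP address literal."""
    ip = ipaddress.ip_address(address)  # raises ValueError on invalid input
    return f"{ip} is IPv{ip.version} ({ip.max_prefixlen}-bit)"


# Size of each address space: 2 to the power of the address length in bits.
IPV4_ADDRESSES = 2 ** 32
IPV6_ADDRESSES = 2 ** 128

print(describe("127.0.0.1"))
print(describe("::1"))
print(f"IPv4 address space: {IPV4_ADDRESSES:,}")
print(f"IPv6 address space: {IPV6_ADDRESSES:,}")
```

IPv4 addresses are 32 bits long, which gives roughly 4.3 billion possible values. IPv6 addresses are 128 bits long, so running out again is not a realistic concern.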

By the way, for a given hostname, you could have one or more addresses.
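If you want to see the lookup from code, a minimal sketch could ask the operating system's resolver through Python's `socket.getaddrinfo`. The `resolve` function name is my own, and what comes back depends on your network and resolver.

```python
import socket
from typing import List


def resolve(hostname: str) -> List[str]:
    """Return the unique IP addresses the OS resolver reports for a hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses: List[str] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        address = sockaddr[0]  # sockaddr is (address, port, ...) for IPv4 and IPv6
        if address not in addresses:
            addresses.append(address)
    return addresses
```

Calling `resolve("mysite.com")` needs a working network connection and may return one address or several, both IPv4 and IPv6.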

There are two ways for a client (e.g. your PC) to resolve a domain name: a recursive query or a non-recursive query

Recursive query

PC      DNS Server
| |---> |  |
|_|<--- |__|
 
 Do you know the answer?
 Here is the answer (or an error). No referral
 

Non-recursive query

PC      DNS Server
| |---> |  |
|_|<--- |__|
 
 Do you know the answer?
 Nope. Try this other server (referral)
 

The term referral indicates a response to a query which does not contain an answer section (it is empty) but which contains one or more authoritative name servers (in the Domain Authority section) that are closer to the required query question.

Source

There are a lot of computers on the internet, and most likely your ISP’s DNS server doesn’t know the address of your website (unless you are Google, Facebook, Amazon etc.). So a request goes to another DNS server, and then another one, until a server finally knows the answer and kindly gets back to you.

This process of “jumping” across multiple DNS servers to resolve a domain name is called DNS traversal.
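The hopping can be sketched with a toy model in which each server either knows the answer or refers you to a server closer to it. Everything here is invented for illustration: the `SERVERS` table, the server names and the address (taken from a range reserved for documentation). No real DNS traffic is involved.

```python
# Toy model: each "server" either holds answers or refers to another server.
# Server names and the address are made up for illustration.
SERVERS = {
    "root": {"referrals": {"com": "com-tld"}},
    "com-tld": {"referrals": {"mysite.com": "ns.mysite.com"}},
    "ns.mysite.com": {"answers": {"mysite.com": "203.0.113.10"}},
}


def resolve_iteratively(name, start="root", max_hops=10):
    """Follow referrals until some server answers; return (address, path)."""
    server = start
    path = []
    for _ in range(max_hops):
        path.append(server)
        zone = SERVERS[server]
        answers = zone.get("answers", {})
        if name in answers:
            return answers[name], path
        # No answer here: look for a referral covering this name.
        next_server = None
        for suffix, target in zone.get("referrals", {}).items():
            if name == suffix or name.endswith("." + suffix):
                next_server = target
        if next_server is None:
            raise LookupError(f"{server} has no answer or referral for {name}")
        server = next_server
    raise LookupError(f"gave up on {name} after {max_hops} hops")


address, path = resolve_iteratively("mysite.com")
print(address, "via", " -> ".join(path))
```

In this toy setup the lookup for mysite.com visits three servers before it gets an answer, which is the same shape of journey a real trace shows.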

You can try and see it for yourself by running this command in your terminal: dig google.com +trace

As you can see, this process can take quite a bit of time. It doesn’t matter that your website runs on a super beefy, super fast server if it takes 2 seconds to find it.

Luckily, DNS clients and servers can rely on caching to speed up this process, and if you host your website on a decent provider you should be fine.
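The idea behind the cache is simple: remember an answer for a limited time (its TTL, time to live) and only ask again once it expires. Here is a minimal sketch of that idea. The `DnsCache` class is hypothetical and not how any particular resolver implements it.

```python
import time


class DnsCache:
    """Remember answers until their time to live (TTL) runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so the cache is easy to test
        self._entries = {}    # name -> (address, expires_at)

    def put(self, name, address, ttl_seconds):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        """Return the cached address, or None if it is missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: force a fresh lookup
            return None
        return address
```

A caller would check `get(name)` first and only do the slow lookup, followed by `put(...)`, when the result is `None`.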

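DNS can also be used as a simple load balancer: a hostname can map to several addresses, and the DNS server can rotate the order in which it returns them.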
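One common way to do this is round-robin DNS, where the list of addresses is rotated on every answer so clients that pick the first entry get spread across servers. A rough sketch of that rotation follows. The `RoundRobinRecord` class and the addresses are assumptions for illustration.

```python
from collections import deque


class RoundRobinRecord:
    """Hand out the same addresses in a rotated order on every query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        current = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the back
        return current


record = RoundRobinRecord(["203.0.113.10", "203.0.113.11", "203.0.113.12"])
for _ in range(3):
    print(record.answer())
```

Keep in mind this is a very naive form of load balancing: it knows nothing about server health or load, and caches along the way can skew the distribution.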

Let’s connect!

Now that we have the address we can start talking to the server. This conversation happens through a protocol called TCP.

You can find more about TCP here: https://en.wikipedia.org/wiki/Transmission_Control_Protocol

Long story short, TCP takes messages from an application and splits them into numbered packets, which can then be forwarded to the destination. The good thing about this protocol is that lost packets are retransmitted, so the data arrives at the destination complete and in the correct order.
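The numbering is what makes the ordering possible. Here is a toy sketch of the idea, with each chunk labeled by its byte offset in the message. The function names are mine, and real TCP does a lot more (acknowledgements, retransmission, flow control) that is left out here.

```python
import random


def split_into_segments(message: bytes, size: int):
    """Cut a message into (sequence_number, payload) pairs."""
    return [
        (offset, message[offset:offset + size])
        for offset in range(0, len(message), size)
    ]


def reassemble(segments):
    """Put the payloads back together in sequence-number order."""
    ordered = sorted(segments, key=lambda segment: segment[0])
    return b"".join(payload for _sequence_number, payload in ordered)


message = b"Hello, mysite.com!"
segments = split_into_segments(message, size=4)
random.shuffle(segments)  # the network may deliver them in any order
print(reassemble(segments) == message)
```

However the chunks get shuffled on the way, sorting by sequence number restores the original message.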

You might be more familiar with HTTP though. HTTP uses TCP to transfer its information between computers.
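To make the relationship concrete: an HTTP/1.1 request is just text written to a TCP connection. The sketch below builds such a request by hand and sends it over a plain socket. `build_request` and `fetch` are hypothetical helper names, the default port 80 assumes plain HTTP without TLS, and this is for illustration only (use a proper HTTP client in real code).

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close when it is done
        "",
        "",                   # blank line marks the end of the headers
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, port: int = 80, path: str = "/", timeout: float = 5.0) -> bytes:
    """Open a TCP connection, send the request, and read the whole response."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(host, path))
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # empty read: the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

Calling `fetch("mysite.com")` would return the raw response, status line and headers included, provided the host is reachable over plain HTTP.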

At this point the server takes over to handle the request. There are many web servers and operating systems to choose from, each with different use cases and trade-offs.

Depending on your application architecture, the web server might respond with a static page or, if a dynamic page has been requested, start calling a database or other services.

After the request has been processed, the server sends a response back to the client. The browser, in this case, will render the page for the user.
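The static versus dynamic split can be sketched as a tiny routing function. Everything in it is made up for illustration: the paths, the `STATIC_PAGES` table and the dictionary standing in for a database.

```python
# Pages that can be returned as-is, without any extra work.
STATIC_PAGES = {"/": "<h1>Welcome to mysite.com</h1>"}

# Stand-in for a real database (hypothetical data).
FAKE_DATABASE = {"1": "Ada", "2": "Grace"}


def handle_request(path: str):
    """Return (status_code, body) for a requested path."""
    if path in STATIC_PAGES:
        return 200, STATIC_PAGES[path]  # static page: no lookup needed
    if path.startswith("/users/"):
        user_id = path[len("/users/"):]
        name = FAKE_DATABASE.get(user_id)  # dynamic page: ask the "database"
        if name is not None:
            return 200, f"<h1>Hello, {name}</h1>"
    return 404, "<h1>Not found</h1>"


print(handle_request("/"))
print(handle_request("/users/1"))
print(handle_request("/missing"))
```

A real web server does the same kind of dispatching, only with templates, real data stores and other services behind it.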

That’s it. Or not?

I actually omitted a lot of things for the sake of brevity and because the topic is too broad for this article.

I hope this will, at least, tickle your curiosity to dive deeper into those often overlooked aspects of a (not-that) simple action that we perform pretty much every single day: surfing the web 🏄‍♀️


If you have any suggestions, questions, corrections or if you want to add anything please DM or tweet me: @zanonnicola
