What happens when you type google.com in your browser and press Enter?

Iheb Yahyaoui
5 min read · Jul 13, 2022

Well, this question looks easy, but it’s a trap, and it is frequently asked in interviews for many types of engineering positions. So let’s dig in and understand how it really works.

Concept

Typing a URL in your browser and pressing Enter loads a webpage in the browser. But first of all, what’s a webpage?
A webpage is basically a text file formatted in a certain way so that your browser (e.g. Chrome, Firefox, Safari) can understand it; this format is called HyperText Markup Language (HTML). These files are located on computers that provide the service of storing them and waiting for a user’s request so they can be delivered. These computers are called servers because they serve the content they hold to whoever needs it.

Servers come in several classes. Application servers hold an application’s code base, which is then used to interact with web browsers or other applications. Database servers hold a database that can be updated and queried when needed. Finally, there are web servers, which are the ones we will be talking about in the main portion of this article.

These servers, much like courier services, need to have an address so that the person in need of content (a web page) can request a delivery. On the other side, the person requesting the content needs an address where the server can deliver the web page. These addresses are called IP (Internet Protocol) addresses. In the common IPv4 form, an IP address is a set of 4 numbers that range from 0 to 255, separated by periods (e.g. the famous 127.0.0.1).
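To make the "4 numbers from 0 to 255" rule concrete, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `is_valid_ipv4` is made up for this illustration.

```python
import ipaddress


def is_valid_ipv4(text):
    """Return True if text is a well-formed dotted IPv4 address."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        # Raised when a part is out of the 0-255 range or parts are missing.
        return False


loopback = ipaddress.IPv4Address("127.0.0.1")
print(is_valid_ipv4("127.0.0.1"))   # a valid address
print(is_valid_ipv4("256.0.0.1"))   # 256 is outside 0-255
print(loopback.is_loopback)         # 127.0.0.1 points back at your own machine
```

The four numbers are really just a readable way of writing one 32-bit number, which is why each part cannot go above 255.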

For this delivery to succeed, there are two main ways for the content of a server to be delivered:
TCP (Transmission Control Protocol), which is used to deliver websites, to download files to your computer, and for email services. It accomplishes this by sending the data in small packets and having the receiver acknowledge the packets it gets, so the sender always knows what has arrived and can resend anything that was lost along the way. That’s why a brief hiccup in your internet connection during a download does not leave you with a corrupted file: the missing packets are simply sent again (resuming after a long outage depends on the application running on top of TCP). The downside of TCP is that setting up the connection, confirming delivery, and resending lost packets all add overhead, so it tends to be slower.
UDP (User Datagram Protocol), on the other hand, is usually used for live video or online video games. Since UDP does not check whether the information was received, it has less overhead and lower latency than TCP. That is why, if you’ve ever watched a live video and either your internet connection or the host’s drops, you simply stop seeing the content; when the connection comes back up you only see the current moment of the broadcast, and what was missed is lost for good. The same is true for online video games (if you’ve played them, you know exactly what this means).
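The difference shows up directly in code. The sketch below uses Python's standard `socket` module and talks only to the local machine (127.0.0.1), so nothing leaves your computer. The message contents are arbitrary examples, and real programs would add error handling.

```python
import socket

# UDP: no connection step. Each datagram is sent on its own and the
# sender gets no confirmation that it arrived.
udp_receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_receiver.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
udp_receiver.settimeout(2.0)
udp_sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_sender.sendto(b"video-frame-1", udp_receiver.getsockname())
udp_data, _ = udp_receiver.recvfrom(1024)

# TCP: a connection is established first, then bytes flow over it and
# the operating system handles acknowledgements and retransmission.
tcp_listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_listener.bind(("127.0.0.1", 0))
tcp_listener.listen(1)
tcp_client = socket.create_connection(tcp_listener.getsockname(), timeout=2.0)
tcp_server_side, _ = tcp_listener.accept()
tcp_server_side.settimeout(2.0)
tcp_client.sendall(b"hello over tcp")
tcp_data = tcp_server_side.recv(1024)

print(udp_data, tcp_data)

for sock in (udp_receiver, udp_sender, tcp_client, tcp_server_side, tcp_listener):
    sock.close()
```

Notice that the UDP side never calls `connect` or `accept`: it just fires a datagram at an address. The TCP side has to set up a connection before any data moves, which is part of the extra cost that buys reliability.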

Digging deeper

The domain name in a Uniform Resource Locator (URL) is read from right to left. The top-level domain portion is “.com”, which together with the name “google” makes up the fully qualified domain name “google.com”.

https://webhostinggeeks.com/guides/dns/
Figure 1: A diagram explaining the domain architecture

Many website URLs we encounter today contain a third-level domain, a second-level domain, and a top-level domain. Each of these levels has its own name servers, which are queried during the DNS lookup process.
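As a small illustration of reading a domain from right to left, the sketch below splits a hostname into its levels. The function name `domain_levels` is made up for this example, and `www.google.com` is used only as sample input.

```python
def domain_levels(hostname):
    """Return the labels of a hostname, top-level domain first."""
    labels = hostname.rstrip(".").lower().split(".")
    return list(reversed(labels))


levels = domain_levels("www.google.com")
print(levels)  # top-level, then second-level, then third-level
```

Here "com" is the top-level domain, "google" the second-level domain, and "www" the third-level domain, which is the same order in which the name servers are consulted.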

When we type the domain google.com into the browser, this initiates a domain name resolution query. First the browser checks whether it has the address in its local DNS cache, which is basically a lookup table that maps domain names to IP addresses. Checking the local cache is much faster and more efficient than making a remote request, but the tradeoff is that the local information can fall out of sync with the remote records and become stale.
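A cache like this can be pictured as a dictionary whose entries expire. The sketch below is only an illustration, not how any particular browser implements it: the class name `DnsCache` is invented, and the expiry time stands in for the time-to-live (TTL) that DNS records carry to limit how long a cached answer may be reused. The name `example.com` and the address 192.0.2.10 are reserved documentation values.

```python
import time


class DnsCache:
    """A tiny domain-name-to-IP table whose entries expire after a TTL."""

    def __init__(self, clock=time.monotonic):
        self._entries = {}   # name -> (ip, expiry time)
        self._clock = clock  # injectable so the example can fake time

    def put(self, name, ip, ttl_seconds):
        self._entries[name] = (ip, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None                  # never seen: a remote lookup is needed
        ip, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]      # stale: drop it and look up again
            return None
        return ip


fake_now = [0.0]
cache = DnsCache(clock=lambda: fake_now[0])
cache.put("example.com", "192.0.2.10", ttl_seconds=60)
fresh = cache.get("example.com")   # still within the 60 seconds
fake_now[0] = 61.0
stale = cache.get("example.com")   # expired, so the cache answers None
print(fresh, stale)
```

Expiring entries is the usual answer to the staleness tradeoff: a short lifetime keeps answers fresh, a long one saves more remote requests.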

If the browser does not know the address, it asks the operating system, which checks its own cache. If that fails, a request is sent to a resolver server, which is usually run by the Internet Service Provider. The resolver then queries the root DNS servers, which know the location of all the top-level domain (TLD) servers. The TLD server for “.com” knows the authoritative name server for google.com, and that authoritative name server finally knows the actual IP address associated with the domain. The resolver returns this address to the client, and the browser can now use it.
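The lookup chain can be sketched as a toy simulation. Everything here is invented for illustration: the dictionaries stand in for real servers, the server names are placeholders, and `example.com` with 192.0.2.10 are reserved documentation values. Real resolvers speak the DNS protocol over the network instead.

```python
# Toy stand-ins for the servers involved (all names are hypothetical).
ROOT_SERVERS = {"com": "tld-com"}
TLD_SERVERS = {"tld-com": {"example.com": "ns-example"}}
AUTHORITATIVE = {"ns-example": {"example.com": "192.0.2.10"}}


def resolve(name, browser_cache, os_cache):
    """Return (ip, steps), where steps lists every place that was asked."""
    steps = []
    for label, cache in (("browser cache", browser_cache), ("OS cache", os_cache)):
        steps.append(label)
        if name in cache:
            return cache[name], steps

    steps.append("resolver")
    tld = name.rsplit(".", 1)[-1]              # "com" for "example.com"
    steps.append("root server")
    tld_server = ROOT_SERVERS[tld]             # root knows the TLD servers
    steps.append("TLD server")
    name_server = TLD_SERVERS[tld_server][name]  # TLD knows the authoritative server
    steps.append("authoritative server")
    ip = AUTHORITATIVE[name_server][name]      # which knows the actual address

    # Remember the answer so the next lookup is answered locally.
    os_cache[name] = ip
    browser_cache[name] = ip
    return ip, steps


browser_cache, os_cache = {}, {}
first_ip, first_steps = resolve("example.com", browser_cache, os_cache)
second_ip, second_steps = resolve("example.com", browser_cache, os_cache)
print(first_steps)
print(second_steps)
```

The second lookup stops at the browser cache, which is the whole point of caching. In a real Python program the entire chain is hidden behind a single standard-library call such as `socket.getaddrinfo("google.com", 443)`.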

Once the IP of the server in question is known, the browser connects to it using the Transmission Control Protocol. The first point of contact with the remote server will be its firewall; one possible implementation is Uncomplicated Firewall (UFW). There are two basic kinds of firewalls: network and host-based. Network firewalls sit between networks and filter the traffic crossing from one to the other, whereas host-based firewalls only protect the host machine they run on. The first-line firewall can be considered network-based in a way, since it separates the internal server network from the external internet, but it is actually host-based, running on the Load Balancer server. Firewalls allow or block traffic for specified network protocols on particular ports. In this case the protocol is HTTPS, which connects on port 443 by default.
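At its core, this kind of port filtering is a rule lookup. The sketch below is a simplified model and not how UFW is implemented: the `Rule` type, the rule list, and the `decide` function are all made up, and the policy of denying anything without a matching rule is an assumption chosen for the example.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    protocol: str   # transport protocol, e.g. "tcp" or "udp"
    port: int
    action: str     # "allow" or "deny"


# Hypothetical rule set for a web-facing host: HTTPS and HTTP only.
RULES = [
    Rule("tcp", 443, "allow"),
    Rule("tcp", 80, "allow"),
]


def decide(protocol, port, rules=RULES, default="deny"):
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        if rule.protocol == protocol and rule.port == port:
            return rule.action
    return default


print(decide("tcp", 443))  # HTTPS traffic is let through
print(decide("tcp", 22))   # no rule for this port, so the default applies
```

With these rules, an incoming HTTPS connection on port 443 passes, while a connection attempt on any port without a rule is dropped.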

HTTPS is an extension of HTTP and stands for Hypertext Transfer Protocol Secure. It secures the connection with Transport Layer Security (TLS), the successor of Secure Sockets Layer (SSL), which encrypts all requests and replies between the client and the server. The remote server needs a certificate (still commonly called an SSL certificate), and the encrypted connection is terminated at the point of contact.
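On the client side, the encrypted connection can be sketched with Python's standard `ssl` module, which implements TLS, the modern successor of the SSL encryption mentioned above. The helper `open_tls_connection` is a made-up name and is only defined here, not called, so the snippet runs without network access. Requiring at least TLS 1.2 is a choice made for this example.

```python
import socket
import ssl

# The default context verifies the server's certificate and checks that
# it was issued for the hostname we asked for.
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2  # example policy choice


def open_tls_connection(host, port=443):
    """Open a TCP connection to host:port and wrap it in TLS."""
    raw_socket = socket.create_connection((host, port), timeout=5)
    return context.wrap_socket(raw_socket, server_hostname=host)


print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

Calling `open_tls_connection("google.com")` would perform the TCP connection on port 443 and then the TLS handshake, after which everything written to the returned socket is encrypted in transit.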

Here comes the Load Balancer, which is the server that makes the first contact with client requests. Its purpose is to distribute frequent and numerous client requests across two or more web servers, which actually hold the software and data to serve those requests. This reduces the chance of overloading any single web server. In this HTTPS setup the SSL certificate is terminated at the Load Balancer, which means that load-balancing software like HAProxy (High Availability Proxy) is configured to listen for HTTPS requests on port 443. Upon receiving them, it forwards them to an internal network of web servers according to some algorithm, such as Round Robin.
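Round Robin simply hands each new request to the next server in the list and wraps around at the end. Here is a minimal sketch of the idea; the class name and the backend names `web-01` and `web-02` are placeholders, and this is not how HAProxy itself is written.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backends in a fixed rotating order."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._rotation = itertools.cycle(backends)

    def pick(self):
        """Return the backend that should receive the next request."""
        return next(self._rotation)


balancer = RoundRobinBalancer(["web-01", "web-02"])
assignments = [balancer.pick() for _ in range(5)]
print(assignments)
```

Five requests end up alternating between the two servers, so neither receives more than one request above the other. Real load balancers often add health checks and weights on top of this basic rotation.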

The Web Server, such as Nginx or Apache, is where static web content is served from, as we have already mentioned. For more complex application or business logic, such as code written in Python or PHP, the web server relays the request to an Application Server. The Application Server can implement clustering, fail-over, and load balancing, and it handles connections to the Database Server. Each of these can run as its own service in order to decouple and optimize each component based on the business needs. The Database Server may run on its own or as part of the Application Server, and it contains the data and the database software, such as MySQL or MongoDB.
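The division of labor between the three servers can be sketched with plain functions. This is a toy model only: the paths, the `/app/users/` prefix, the sample page, and the dictionary standing in for a database are all invented for the illustration.

```python
# Stand-ins for files on the web server and rows in the database.
STATIC_FILES = {"/index.html": "<h1>Welcome</h1>"}
DATABASE = {"ada": {"name": "Ada"}}


def application_server(path, database):
    """Run the business logic: look the user up and build a response."""
    username = path[len("/app/users/"):]
    record = database.get(username)   # the application server talks to the database
    if record is None:
        return 404, "No such user"
    return 200, "Hello, " + record["name"] + "!"


def web_server(path, database):
    """Serve static content directly and relay dynamic requests."""
    if path in STATIC_FILES:
        return 200, STATIC_FILES[path]
    if path.startswith("/app/users/"):
        return application_server(path, database)
    return 404, "Not Found"


print(web_server("/index.html", DATABASE))
print(web_server("/app/users/ada", DATABASE))
```

The web server answers the static page by itself, while the user page travels one hop further to the application logic and from there to the data store.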

Conclusion

It is amazing how easy it is to reach any website just by typing its URL, once you know how complex this act really is.
Yet it all happens so fast that few would even begin to fathom the process that takes place.

Thanks to the engineers who helped make this process easy for us. All we can do is learn how everything works and appreciate their hard work in making our lives easier.

Iheb Yahyaoui

Software Developer / Holberton School Student / Blogger / Bibliophile! LinkedIn: https://www.linkedin.com/in/ihebyh/