How Web Browsers Work
A really nice article by Tapajyoti Bose (published on dev.to)
Browsers are now a part of everyday life, but have you ever wondered how they work under the hood?
This article will take a closer look at the magic behind the scenes of web browsers.
Let’s get started! 🚀
1. Navigation
Navigation is the first step of loading a web page. It happens when the user enters a URL in the address bar or clicks on a link.
1.1. DNS lookup
The first step is to find the IP address where the resources are located. This is done by a DNS lookup.
A Domain Name System (DNS) server matches website hostnames (like www.example.com) to their corresponding Internet Protocol (IP) addresses. It holds a database of public IP addresses and the domain names that map to them.
For example, if you visit www.example.com, the DNS server returns its corresponding IP address, such as 93.184.216.34.
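The lookup can be sketched with Python's standard `socket` module, which asks the operating system's resolver for the addresses behind a hostname. The helper name `resolve` is an assumption made for this illustration; a browser uses its own resolver and caches on top of this.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the distinct IP addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses: list[str] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # sockaddr is (ip, port) for IPv4 and (ip, port, ...) for IPv6
        if ip not in addresses:
            addresses.append(ip)
    return addresses


# Example (needs network access): resolve("www.example.com")
```

The addresses returned depend on the resolver and on the moment the lookup is made, so no fixed output is shown here.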
1.2. 3-way TCP Handshake
The next step is to establish a TCP connection with the server. This is done by a 3-way TCP handshake.
First, the client sends a request to open up a connection to the server with a SYN packet.
The server then responds with a SYN-ACK packet, which acknowledges the request and asks the client to open a connection in return.
Finally, the client sends an ACK packet to the server acknowledging the request.
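The three packets can be modeled as a small simulation. This is a sketch that does not touch the network: the `Segment` type and the `three_way_handshake` function are hypothetical names, and the initial sequence numbers are passed in instead of being chosen randomly as a real TCP stack would do.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Segment:
    sender: str
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(client_isn: int, server_isn: int) -> list[Segment]:
    """Return the SYN, SYN-ACK and ACK segments for the given initial sequence numbers."""
    # 1. The client asks to open a connection.
    syn = Segment("client", "SYN", seq=client_isn)
    # 2. The server acknowledges the client's SYN (ack = client seq + 1) and sends its own SYN.
    syn_ack = Segment("server", "SYN-ACK", seq=server_isn, ack=syn.seq + 1)
    # 3. The client acknowledges the server's SYN (ack = server seq + 1).
    ack = Segment("client", "ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


if __name__ == "__main__":
    for segment in three_way_handshake(client_isn=100, server_isn=300):
        print(segment)
```

Each acknowledgement number is the other side's sequence number plus one, which is how both sides confirm that they saw the other's SYN.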
1.3. TLS handshake
If the website uses HTTPS (encrypted HTTP protocol), the next step is to establish a TLS connection via a TLS handshake.
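During this step, some more messages are exchanged between the browser and the server. The steps below follow the classic RSA-based handshake used up to TLS 1.2; TLS 1.3 shortens the exchange and uses a different key agreement.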
- Client says hello: The browser sends the server a message that includes which TLS version and cipher suite it supports and a string of random bytes known as the client random.
- Server hello message and certificate: The server sends a message back containing the server’s SSL certificate, the server’s chosen cipher suite, and the server random (a random string of bytes that’s generated by the server).
- Authentication: The browser checks that the server’s SSL certificate was signed by a certificate authority it trusts. This way the browser can be sure that the server is who it says it is.
- The premaster secret: The browser sends one more random string of bytes called the premaster secret, which is encrypted with a public key that the browser takes from the SSL certificate from the server. The premaster secret can only be decrypted with the private key by the server.
- Private key used: The server decrypts the premaster secret.
- Session keys created: The browser and server generate session keys from the client random, the server random, and the premaster secret.
- Client finished: The browser sends a message to the server saying it has finished.
- Server finished: The server sends a message to the browser saying it has also finished.
- Secure symmetric encryption achieved: The handshake is completed and communication can continue using the session keys.
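To illustrate the "session keys created" step, here is a minimal sketch of how both sides can derive the same secret from the client random, the server random, and the premaster secret. It is modeled on the TLS 1.2 pseudorandom function (HMAC-SHA256); the article does not name the derivation, so treat that choice as an assumption. The function names are hypothetical, and a real implementation goes on to expand this master secret into separate encryption and MAC keys.

```python
import hashlib
import hmac


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """Expand secret and seed into `length` bytes with chained HMAC-SHA256."""
    output = b""
    a = seed  # A(0) = seed
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i) = HMAC(secret, A(i-1))
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]


def derive_master_secret(premaster: bytes, client_random: bytes, server_random: bytes) -> bytes:
    """Derive a 48-byte master secret from the three values exchanged in the handshake."""
    seed = b"master secret" + client_random + server_random
    return p_sha256(premaster, seed, 48)


if __name__ == "__main__":
    # Fixed demo values; real handshakes use fresh random bytes every time.
    premaster = bytes(range(48))
    client_random = b"\x01" * 32
    server_random = b"\x02" * 32
    browser_side = derive_master_secret(premaster, client_random, server_random)
    server_side = derive_master_secret(premaster, client_random, server_random)
    print(browser_side == server_side)
```

Because the derivation is deterministic, the browser and the server arrive at identical keys without ever sending those keys over the wire.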
Now requesting and receiving data from the server can begin.
2. Fetching resources
After the connection is established (TCP, plus TLS for HTTPS sites), the browser can start fetching resources from the server.
2.1. HTTP Request
If you have any experience with web development, you will have encountered the concept of HTTP requests.
HTTP requests are used to fetch resources from the server. Each request needs a URL and a method (such as GET, POST, PUT, or DELETE). The browser also adds headers to the request to provide additional context.
The first request sent to a server is usually a GET request to fetch an HTML file.
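As a rough illustration, this is what such a first GET request looks like on the wire in HTTP/1.1. The helper `build_get_request` and the particular headers are assumptions chosen for the sketch; real browsers send many more headers.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request for the given host and path."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, target, protocol version
        f"Host: {host}",         # the Host header is mandatory in HTTP/1.1
        "Accept: text/html",
        "Connection: close",
        "",                      # an empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


if __name__ == "__main__":
    print(build_get_request("www.example.com").decode("ascii"))
```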
2.2. HTTP Response
The server then responds with an appropriate HTTP response for the given request.
The response contains the status code, the headers & the body.
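The three parts of a response can be pulled apart with a few lines of Python. This is a simplified sketch: `parse_response` is a hypothetical helper, and it ignores details such as chunked transfer encoding and repeated headers.

```python
def parse_response(raw: bytes) -> tuple[int, dict[str, str], bytes]:
    """Split a raw HTTP/1.1 response into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")  # a blank line separates headers from body
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    status_code = int(status_line.split(" ", 2)[1])  # e.g. "HTTP/1.1 200 OK" -> 200
    headers: dict[str, str] = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return status_code, headers, body


if __name__ == "__main__":
    sample = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<p>Hello World!</p>"
    print(parse_response(sample))
```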
3. Parsing HTML
Now comes the main section. After the browser has received the HTML file, it parses it to generate the DOM (Document Object Model) tree.
This is done by the browser engine, which is the core of the browser (e.g. Gecko for Firefox, WebKit for Safari, Blink for Chrome).
Here is an example HTML file:
<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>
<p>Hello World!</p>
</body>
</html>
3.1. Tokenization
The first step for displaying the web page is to tokenize the HTML file. Tokenization is the process of breaking up a string of characters into meaningful chunks for the browser, called tokens.
Tokens are the basic building blocks of the DOM tree.
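Tokenization can be approximated with Python's built-in `html.parser` module. The token names below (`StartTag`, `EndTag`, `Character`, `Doctype`) and the `tokenize` helper are labels chosen for this sketch; a real browser tokenizer follows the much more detailed state machine of the HTML specification.

```python
from html.parser import HTMLParser


class Tokenizer(HTMLParser):
    """Collect a flat list of (kind, value) tokens from an HTML string."""

    def __init__(self) -> None:
        super().__init__()
        self.tokens: list[tuple[str, str]] = []

    def handle_decl(self, decl: str) -> None:
        self.tokens.append(("Doctype", decl))

    def handle_starttag(self, tag: str, attrs: list) -> None:
        self.tokens.append(("StartTag", tag))

    def handle_endtag(self, tag: str) -> None:
        self.tokens.append(("EndTag", tag))

    def handle_data(self, data: str) -> None:
        if data.strip():  # skip whitespace-only text between tags
            self.tokens.append(("Character", data.strip()))


def tokenize(html: str) -> list[tuple[str, str]]:
    tokenizer = Tokenizer()
    tokenizer.feed(html)
    tokenizer.close()
    return tokenizer.tokens


if __name__ == "__main__":
    for token in tokenize("<p>Hello World!</p>"):
        print(token)
```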
3.2. DOM Tree construction
Tree construction is the process of converting the sequence of tokens into a tree structure called the DOM tree.
The DOM tree is a tree data structure that represents the nodes in the HTML document.
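A common way to turn tokens into a tree is to keep a stack of open elements: a start tag pushes a new node, an end tag pops it. The sketch below does this on top of `html.parser`. The `Node` class and `build_dom` function are hypothetical, and the code assumes well-formed markup (it does not handle void elements such as `<br>` or the error recovery real browsers perform).

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class Node:
    name: str
    text: str = ""
    children: list = field(default_factory=list)


class TreeBuilder(HTMLParser):
    """Build a tree of Node objects using a stack of currently open elements."""

    def __init__(self) -> None:
        super().__init__()
        self.root = Node("#document")
        self.open_elements = [self.root]

    def handle_starttag(self, tag: str, attrs: list) -> None:
        node = Node(tag)
        self.open_elements[-1].children.append(node)  # attach to the current parent
        self.open_elements.append(node)               # the new node becomes the parent

    def handle_endtag(self, tag: str) -> None:
        if len(self.open_elements) > 1 and self.open_elements[-1].name == tag:
            self.open_elements.pop()

    def handle_data(self, data: str) -> None:
        if data.strip():
            self.open_elements[-1].children.append(Node("#text", data.strip()))


def build_dom(html: str) -> Node:
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def outline(node: Node, depth: int = 0) -> list[str]:
    """Return an indented text outline of the tree, one line per node."""
    label = node.text if node.name == "#text" else node.name
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    page = "<html><head><title>Page Title</title></head><body><p>Hello World!</p></body></html>"
    print("\n".join(outline(build_dom(page))))
```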
NOTE: If the page requires any external resources, they are handled as follows:
- Non-blocking resources are fetched in parallel. Eg: images.
- Deferred resources are fetched in parallel but are executed after the DOM tree is constructed. Eg: script WITH the defer attribute & CSS files.
- Blocking resources are fetched and executed sequentially. Eg: script WITHOUT the defer attribute.
4. Parsing CSS
After the DOM tree is constructed, the browser parses the CSS file to generate the CSSOM (CSS Object Model).
This process is similar to DOM tree construction: the CSS is tokenized, and the tokens are used to generate the CSSOM.
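As a toy illustration of the idea, the sketch below turns a flat stylesheet into a nested dictionary of selectors and declarations. The `parse_css` helper is hypothetical, and it covers only simple `selector { property: value; }` rules. A real CSSOM also handles at-rules, comments, the cascade, and specificity.

```python
import re

# One rule: everything before "{" is the selector, everything inside the braces is the block.
RULE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def parse_css(css: str) -> dict[str, dict[str, str]]:
    """Parse simple CSS rules into {selector: {property: value}}."""
    cssom: dict[str, dict[str, str]] = {}
    for selector, block in RULE.findall(css):
        declarations = cssom.setdefault(selector.strip(), {})
        for declaration in block.split(";"):
            prop, separator, value = declaration.partition(":")
            if separator:  # skip empty fragments such as the one after the last ";"
                declarations[prop.strip()] = value.strip()
    return cssom


if __name__ == "__main__":
    print(parse_css("p { color: red; font-size: 16px; } h1 { margin: 0 }"))
```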
5. Executing JavaScript
As mentioned previously, if the page requires a blocking script, it is fetched and executed immediately while DOM tree construction is paused; otherwise, the script is fetched and then executed after DOM tree construction is completed.
Regardless of when the script is executed, it is handled by the JavaScript engine, which, like the browser engine, varies from browser to browser.
5.1. JIT compilation
Assuming you are familiar with the concept of interpreters & compilers, the JavaScript engine uses a hybrid approach called JIT (Just in Time) compilation.
Unlike a compiled language such as C, where compilation is done ahead of time (in other words, before the code is actually executed), JavaScript is compiled during execution.
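The hybrid idea can be sketched as follows: run code through a slow interpreter at first, count how often it runs, and compile it once it becomes "hot". This is only an analogy written in Python. The `HotExpression` class and its threshold are invented for the illustration, and "compiling" here means producing Python bytecode, whereas a real JavaScript engine emits machine code and applies many more optimizations.

```python
import ast
import operator

_BINARY_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def interpret(node: ast.AST, env: dict):
    """Evaluate a small arithmetic expression by walking its syntax tree (slow path)."""
    if isinstance(node, ast.Expression):
        return interpret(node.body, env)
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](interpret(node.left, env), interpret(node.right, env))
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.Constant):
        return node.value
    raise ValueError(f"unsupported syntax: {type(node).__name__}")


class HotExpression:
    """Interpret an expression until it has run `threshold` times, then compile it."""

    def __init__(self, source: str, threshold: int = 3) -> None:
        self.tree = ast.parse(source, mode="eval")
        self.threshold = threshold
        self.calls = 0
        self.compiled = None

    def run(self, **env):
        self.calls += 1
        if self.compiled is None and self.calls > self.threshold:
            # The expression is now "hot": compile it once and reuse the result.
            self.compiled = compile(self.tree, "<jit>", "eval")
        if self.compiled is not None:
            return eval(self.compiled, {"__builtins__": {}}, env)  # fast path
        return interpret(self.tree, env)


if __name__ == "__main__":
    square_plus_one = HotExpression("x * x + 1", threshold=2)
    print([square_plus_one.run(x=i) for i in range(5)])
```

Both paths must return the same results; the only difference is how much work each call costs once the code has been compiled.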
6. Rendering
It’s finally time to render the page. The browser uses the DOM tree & CSSOM to render the page.
6.1. Render tree construction
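The first step is to construct the render tree. The render tree is a subset of the DOM tree that contains only the elements that are visible on the page, combined with their styles from the CSSOM. For example, an element styled with display: none is left out.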
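A minimal sketch of this filtering step is shown below. Nodes are plain dictionaries with a `tag`, an optional `style`, and optional `children`; that shape and the `build_render_tree` name are assumptions made for the example. Only `display: none` is treated as invisible here.

```python
from typing import Optional


def build_render_tree(node: dict) -> Optional[dict]:
    """Copy a styled DOM node, dropping every subtree whose display is "none"."""
    style = node.get("style", {})
    if style.get("display") == "none":
        return None  # the node and all of its descendants are skipped
    children = (build_render_tree(child) for child in node.get("children", []))
    return {
        "tag": node["tag"],
        "style": style,
        "children": [child for child in children if child is not None],
    }


if __name__ == "__main__":
    dom = {
        "tag": "body",
        "children": [
            {"tag": "p", "style": {"color": "red"}},
            {"tag": "div", "style": {"display": "none"}, "children": [{"tag": "span"}]},
        ],
    }
    print(build_render_tree(dom))
```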
6.2. Layout
The next step is to lay out the render tree. This is done by calculating the exact size & position of each element in the render tree.
This step happens every time we change something in the DOM that affects the layout of the page, even partially.
Examples of situations when the positions of the elements are recalculated are:
- Adding or deleting elements from the DOM
- Resizing the browser window
- Changing the width, height, or position of an element
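To see why such changes trigger a new layout pass, consider a deliberately simplified model in which every element is a full-width block stacked below the previous one. The `layout_blocks` function is a hypothetical sketch; real layout also deals with inline content, floats, flexbox, grid, and more.

```python
def layout_blocks(heights: list[int], viewport_width: int) -> list[dict]:
    """Stack full-width blocks vertically and return each block's box."""
    boxes = []
    y = 0
    for height in heights:
        boxes.append({"x": 0, "y": y, "width": viewport_width, "height": height})
        y += height  # the next block starts where this one ends
    return boxes


if __name__ == "__main__":
    print(layout_blocks([20, 50, 30], viewport_width=800))
    # Making the first block taller moves every block after it.
    print(layout_blocks([40, 50, 30], viewport_width=800))
```

Even in this tiny model, changing one element's height changes the position of everything after it, and resizing the viewport changes every width.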
6.3. Painting
Once the browser has decided which nodes need to be visible and calculated their position in the viewport, it’s time to paint them (render the pixels) on the screen. This phase is also known as the rasterization phase, where the browser converts each element calculated in the layout phase to actual pixels on the screen.
Just like the layout phase, this phase happens every time we change the appearance of an element in the DOM, even partially.
Examples of situations when elements are repainted are:
- Changing the outline of an element
- Changing the opacity or visibility of an element
- Changing the background color of an element
6.4. Layering & Compositing
The final step is to composite the layers. This is done by the browser to optimize the rendering process.
Compositing is a technique to separate parts of a page into layers, paint them separately, and composite them into a page in a separate thread called the compositor thread. When sections of the document are drawn in different layers that overlap each other, compositing is necessary to ensure they are drawn to the screen in the right order and the content is rendered correctly.
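The ordering problem can be shown with a toy compositor that draws rectangular layers onto a character grid, lowest layer first. The layer format (`z`, `x`, `y`, `w`, `h`, `fill`) and the `composite` function are assumptions for this sketch; real compositors work on GPU textures and handle transparency and transforms.

```python
def composite(layers: list[dict], width: int, height: int) -> list[str]:
    """Draw layers onto a width x height grid in ascending z order."""
    canvas = [[" "] * width for _ in range(height)]
    for layer in sorted(layers, key=lambda item: item["z"]):
        for y in range(layer["y"], layer["y"] + layer["h"]):
            for x in range(layer["x"], layer["x"] + layer["w"]):
                if 0 <= x < width and 0 <= y < height:  # clip to the canvas
                    canvas[y][x] = layer["fill"]  # later (higher) layers overwrite lower ones
    return ["".join(row) for row in canvas]


if __name__ == "__main__":
    page_layers = [
        {"z": 1, "x": 2, "y": 1, "w": 4, "h": 2, "fill": "B"},
        {"z": 0, "x": 0, "y": 0, "w": 4, "h": 2, "fill": "A"},
    ]
    print("\n".join(composite(page_layers, width=6, height=3)))
```

Sorting by `z` before drawing is what guarantees that overlapping layers end up on screen in the right order, regardless of the order in which they were painted.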
NOTE: DOM updates, specifically layout & paint, are extremely expensive operations, and the cost is especially noticeable on low-end devices. So, it’s important to minimize the number of times they are triggered.