Load Balancing In Distributed Systems

"It's not the load that breaks you down, it's the way you carry it." Lou Holtz


Motivation

As businesses grow and the demand for online services and applications increases, distributed systems have become a popular choice for managing workloads and ensuring high availability. However, managing resources in distributed systems can be complex, and workload imbalances can lead to reduced performance and even system failure. That's where load balancers come in. This blog post describes key concepts in load balancing and walks through an end-to-end demonstration using a simple React app and Nginx.

What are load balancers?

A load balancer is a critical component in a distributed system that helps distribute incoming network traffic across multiple servers or nodes. The primary purpose of a load balancer is to improve the availability, scalability, and performance of the system by evenly distributing the workload across the servers.  

In a distributed system, multiple servers work together to perform a specific task or provide a service. The load balancer sits in front of these servers and receives incoming requests from users or other systems. It then analyzes the requests and determines which server is best equipped to handle the request based on factors such as current workload, response time, and available resources.
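
To make the idea concrete, here is a toy Node.js sketch of a load balancer (for illustration only; the backend addresses are hypothetical, and real deployments use battle-tested proxies such as Nginx). It accepts requests on one port and forwards each one to the next backend in a fixed list:

// toy-lb.js - a toy HTTP load balancer, for illustration only
const http = require('http');

// the pool of backend servers (addresses here are hypothetical)
const backends = [
  { host: '127.0.0.1', port: 9001 },
  { host: '127.0.0.1', port: 9002 },
];
let next = 0;

http.createServer((clientReq, clientRes) => {
  // pick the next backend in rotation
  const backend = backends[next];
  next = (next + 1) % backends.length;

  // forward the request and stream the backend's response to the client
  const proxyReq = http.request(
    {
      host: backend.host,
      port: backend.port,
      path: clientReq.url,
      method: clientReq.method,
      headers: clientReq.headers,
    },
    (proxyRes) => {
      clientRes.writeHead(proxyRes.statusCode, proxyRes.headers);
      proxyRes.pipe(clientRes);
    }
  );
  proxyReq.on('error', () => {
    // the chosen backend is unreachable
    clientRes.writeHead(502);
    clientRes.end();
  });
  clientReq.pipe(proxyReq);
}).listen(8000);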

Vertical vs Horizontal scaling

[Image: vertical vs. horizontal scaling. Source: https://www.cloudzero.com/blog/horizontal-vs-vertical-scaling]

Vertical scaling: also known as scaling up, adds more resources, such as CPU, memory, or storage, to a single server or node to handle higher traffic and workload. In vertical scaling, the system increases the capacity of a single server so that it can handle more requests and perform more complex tasks; that one server handles every request.

Horizontal scaling: also known as scaling out, adds more nodes or servers to the system to handle the increased traffic and workload. In horizontal scaling, the system distributes the workload across multiple servers so that it can handle more requests simultaneously. In this approach, each server or node performs the same tasks, and requests can be handled by any available server.

Load balancers allow us to scale our servers horizontally, adding more servers to the system as needed and efficiently distributing the workload among them. They also let us manage our servers more effectively: we can add and remove servers as needed without affecting the overall functionality of the system.

Types of load balancers

Hardware Load Balancer

A hardware load balancer is implemented as a physical device installed in a data center or a server room. It is a specialized appliance designed to distribute incoming network traffic across multiple servers, thereby improving the performance and availability of a system.

[Image: a typical hardware load balancer]

Software Load Balancer

A software load balancer is implemented as a software application that runs on a server or a virtual machine, making it a lightweight and flexible solution. Software load balancers are typically more affordable than hardware load balancers and can be deployed quickly and easily, making them ideal for small to medium-sized organizations.

Examples of software load balancers include Nginx, HAProxy, and Envoy.

Load balancing strategies and algorithms

Round-robin: Distributes incoming requests to each server in turn, cycling through the pool so that every server receives an equal share of the incoming requests.

Weighted round-robin: Similar to round-robin, but assigns a weight to each server based on its capacity or performance. The server with a higher weight receives more requests than the server with a lower weight (a code sketch of this and the least-connections strategy follows this list).

Least connections: Distributes incoming requests to the server with the least number of active connections, thereby ensuring that the workload is evenly distributed.

IP hash: Uses the client's IP address to determine which server to send the request to. The same client IP address is always sent to the same server, ensuring session persistence.

Least response time: Distributes incoming requests to the server with the shortest response time, ensuring that the workload is evenly distributed and minimizing the overall response time.

Weighted response time: Takes into account the response time of each server in the pool and assigns a weight to each server based on it: servers with faster response times are given higher weights and receive more requests, while servers with slower response times are given lower weights and receive fewer.

Agent-based: Uses software agents installed on servers to monitor their performance and availability. These agents periodically report their status to the load balancer, which uses this information to make decisions about where to route incoming traffic. The agents can collect various metrics, such as CPU usage, memory usage, disk utilization, network traffic, and other relevant data to determine the health of the server.
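
As a rough sketch of how two of these strategies pick a server (the server objects and their fields here are made up for illustration):

// illustrative selection logic; the server objects are hypothetical

// Least connections: pick the server with the fewest active connections
function leastConnections(servers) {
  return servers.reduce((best, s) =>
    s.activeConnections < best.activeConnections ? s : best
  );
}

// Weighted round-robin: a server with weight N appears N times in the rotation
function makeWeightedRoundRobin(servers) {
  const rotation = servers.flatMap((s) => Array(s.weight).fill(s));
  let i = 0;
  return () => rotation[i++ % rotation.length];
}

// Usage: with weights 3 and 1, the rotation is a, a, a, b, a, a, a, b, ...
// const pick = makeWeightedRoundRobin([
//   { name: 'a', weight: 3 },
//   { name: 'b', weight: 1 },
// ]);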

Load Balancer Demo With Nginx

Nginx installation

Let's start by installing Nginx.
For Mac users it is simple: just run brew install nginx.

Once the installation is done, execute cd /usr/local/etc/nginx/  

Open nginx.conf with VS Code or any other preferred editor and keep it open; we will edit it later on.
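
In short, the setup steps are (the cd path applies to Intel Macs; Homebrew on Apple Silicon uses /opt/homebrew/etc/nginx/ instead, and the code command assumes VS Code's CLI helper is installed):

brew install nginx
cd /usr/local/etc/nginx/    # /opt/homebrew/etc/nginx/ on Apple Silicon
code nginx.conf             # or open it with any other editor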

Create a simple React app to use as a server

To create a new React app we can use the npx tool.
Run npx create-react-app load-balancer-test-server

Once the installation is done, you can run npm start to check that the app runs as expected.

Open the app folder with VS Code or your preferred IDE, and edit the App.jsx file.

import './App.css';
import ServerHeader from './components/ServerHeader';

const serverName = process.env.REACT_APP_SERVER_NAME;

function App() {
  return (
    <div className="App">
      <ServerHeader name={serverName} />
    </div>
  );
}

export default App;

serverName will be read from an environment variable that we will inject into the Docker container later on. Inside the return block, remove everything inside the App div and add the ServerHeader component.

Create a new file named ServerHeader.jsx as follows.

import React from "react";

function ServerHeader(props) {
  return (
    <div>
        <h1>Hello from:</h1>
        <h3>Server {props.name}</h3>
    </div>
  );
}

export default ServerHeader;

Open a terminal and run npm start. If everything went fine, you should see the app rendered in the browser.

Adding a Dockerfile

If you do not have Docker installed on your machine, go to the official site and follow the installation steps.

Let's add the Dockerfile we will use to deploy our app in multiple containers:

# pull official base image
FROM node:16-alpine

# set working directory
WORKDIR /app

# add `/app/node_modules/.bin` to $PATH
ENV PATH /app/node_modules/.bin:$PATH

# install app dependencies
COPY package.json ./
COPY package-lock.json ./
RUN npm install --silent
RUN npm install react-scripts@3.4.1 -g --silent

# add app
COPY . ./

# start app
CMD ["npm", "start"]

Run docker build . -t testapp to build our app's Docker image. When the process is complete you'll see the image in the Images section of Docker Desktop.

We are now ready to run our app instances in multiple containers.
Run docker run -p 9001:3000 -e REACT_APP_SERVER_NAME='1' -d testapp
four times, changing the mapped port and the server name each time, as shown below.
Notice: port 9001 on our machine will be mapped to port 3000 inside the app's container. This trick allows us to run multiple apps side by side.
The -e flag injects the environment variable we need for displaying the server name.
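
For reference, the four commands look like this; host ports 9001-9004 match the upstream block we will configure in Nginx next:

docker run -p 9001:3000 -e REACT_APP_SERVER_NAME='1' -d testapp
docker run -p 9002:3000 -e REACT_APP_SERVER_NAME='2' -d testapp
docker run -p 9003:3000 -e REACT_APP_SERVER_NAME='3' -d testapp
docker run -p 9004:3000 -e REACT_APP_SERVER_NAME='4' -d testapp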

Check one of the containers (for example, open http://localhost:9001 in your browser) to see that everything works.

Go back to the nginx.conf file we opened earlier and edit it as follows.
Inside the http section, add the upstream block below. When no strategy is specified, Nginx defaults to round-robin, cycling between the servers:

    upstream backendserver {
        server 127.0.0.1:9001;
        server 127.0.0.1:9002;
        server 127.0.0.1:9003;
        server 127.0.0.1:9004;
    }

Inside the server section add:

    location / {
        proxy_pass http://backendserver/;
    }

Open a new terminal window and run nginx to start the Nginx server.
If all goes well, browsing to Nginx's listen address (port 8080 by default in the Homebrew configuration) should show our app, with the server name changing on each refresh. Use nginx -s reload if you need to update the configuration.

[Image: round-robin between the servers]


Nginx supports multiple algorithms and strategies out of the box.
You can see examples of the least-connections strategy (least_conn), ip_hash, and more advanced configurations, such as adding weights or setting a slow_start parameter for a server, on the Load Balancing example page.
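
For instance, a variant of our upstream block that combines least connections with a server weight might look like this (the weight value is illustrative, not a recommendation):

    upstream backendserver {
        least_conn;                       # prefer the server with the fewest active connections
        server 127.0.0.1:9001 weight=3;   # receives roughly three times the traffic of the others
        server 127.0.0.1:9002;
        server 127.0.0.1:9003;
        server 127.0.0.1:9004;
    }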

Summary

In this blog post, we discussed load balancers, their role in managing resources in distributed systems, and their ability to distribute incoming network traffic across multiple servers or nodes. We compared vertical and horizontal scaling and looked at the two main types of load balancers, hardware and software. We covered some load balancing strategies and algorithms, like round-robin, least connections, and IP hash. Finally, we built our own "mini" distributed system to demonstrate load balancing on a distributed app, using Nginx, React, and Docker.

All the code in this blog post can be found on my GitHub page.