Load Balancing Explained

Distribute traffic, prevent overload, and enable horizontal scaling — the foundation of high-availability systems.

Load Balancing

The distribution of incoming network traffic across multiple servers to ensure no single server is overwhelmed, improving availability, reliability, and performance.

Explanation

A load balancer sits between clients and your servers, routing each incoming request to a healthy backend according to a configured algorithm. If a server fails its health checks, the load balancer stops sending traffic to it until it recovers. Load balancing enables horizontal scaling: add more servers to handle more traffic. Common algorithms include round-robin, least connections, and IP hash. Cloud providers offer managed load balancers (AWS ALB/NLB, Google Cloud Load Balancing) that handle SSL/TLS termination, health checks, and auto-scaling integration.
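To make the algorithms concrete, here is a minimal sketch in Python of how a balancer might pick a backend. The server addresses, connection counts, and health flags are made-up values for illustration; real load balancers implement this selection in optimized, concurrent code.

```python
import hashlib
import itertools

# Hypothetical backend pool; addresses and state below are illustrative only.
servers = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
healthy = {s: True for s in servers}          # updated by periodic health checks
active_connections = {s: 0 for s in servers}  # tracked per backend

_rr_cycle = itertools.cycle(servers)

def round_robin():
    """Return the next healthy server in rotation."""
    for _ in range(len(servers)):
        candidate = next(_rr_cycle)
        if healthy[candidate]:
            return candidate
    raise RuntimeError("no healthy backends available")

def least_connections():
    """Return the healthy server currently handling the fewest connections."""
    candidates = [s for s in servers if healthy[s]]
    if not candidates:
        raise RuntimeError("no healthy backends available")
    return min(candidates, key=lambda s: active_connections[s])

def ip_hash(client_ip: str):
    """Pin a client to the same healthy server based on a stable hash of its IP."""
    candidates = [s for s in servers if healthy[s]]
    if not candidates:
        raise RuntimeError("no healthy backends available")
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return candidates[int(digest, 16) % len(candidates)]

if __name__ == "__main__":
    healthy["10.0.1.11"] = False  # simulate a failed health check
    print("round-robin:", [round_robin() for _ in range(4)])
    active_connections.update({"10.0.1.10": 3, "10.0.1.12": 1})
    print("least connections:", least_connections())
    print("ip hash:", ip_hash("203.0.113.7"))
```

Note that the failed server is skipped by every strategy: health checks and the routing algorithm work together, not separately.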

Bookuvai Implementation

Bookuvai configures load balancing for all production deployments. We use AWS ALB for HTTP/HTTPS traffic with path-based routing, health checks, and auto-scaling group integration. For WebSocket applications, we use NLB with sticky sessions. Load balancer configuration is defined in Terraform as part of our IaC setup.
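For context (this is not Bookuvai's actual configuration), an application instance behind an ALB typically exposes a lightweight health-check endpoint that the target group polls. The /healthz path and port 8080 below are assumptions chosen for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthCheckHandler(BaseHTTPRequestHandler):
    """Minimal endpoint a load balancer's health check could poll (path and port are illustrative)."""

    def do_GET(self):
        if self.path == "/healthz":
            # Return 200 while the instance can serve traffic; the load balancer
            # marks the target unhealthy after a configured number of failed checks.
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example quiet

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), HealthCheckHandler).serve_forever()
```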

Frequently Asked Questions

When do I need a load balancer?
As soon as you have more than one server instance, or when you need zero-downtime deployments (rolling updates). Most production applications should use a load balancer.
What is the difference between ALB and NLB?
ALB (Application Load Balancer) operates at Layer 7 (HTTP/HTTPS) and supports path-based routing, host-based routing, and WebSocket upgrades. NLB (Network Load Balancer) operates at Layer 4 (TCP/UDP) and is used for non-HTTP protocols or when ultra-low latency is required.