Load Balancing Explained
Distribute traffic, prevent overload, and enable horizontal scaling — the foundation of high-availability systems.
Load Balancing
The distribution of incoming network traffic across multiple servers to ensure no single server is overwhelmed, improving availability, reliability, and performance.
Explanation
A load balancer sits between users and your servers, routing each incoming request to a healthy server according to a chosen algorithm. Health checks detect failed servers, so the load balancer stops sending them traffic. Load balancing enables horizontal scaling: add more servers to handle more traffic. Common algorithms include round-robin, least connections, and IP hash. Cloud providers offer managed load balancers (AWS ALB/NLB, GCP Load Balancer) that handle SSL termination, health checks, and auto-scaling integration.
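To make these ideas concrete, here is a minimal sketch in Go of a round-robin reverse proxy with background health checks. The upstream addresses and the /healthz endpoint are hypothetical placeholders, not a prescribed setup; a managed load balancer does the same work for you.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"sync/atomic"
	"time"
)

// backend wraps one upstream server and its current health state.
type backend struct {
	url     *url.URL
	proxy   *httputil.ReverseProxy
	healthy atomic.Bool
}

// pool holds the backends and a round-robin counter.
type pool struct {
	backends []*backend
	next     uint64
}

// nextHealthy picks the next backend in round-robin order, skipping unhealthy ones.
func (p *pool) nextHealthy() *backend {
	for i := 0; i < len(p.backends); i++ {
		idx := atomic.AddUint64(&p.next, 1) % uint64(len(p.backends))
		if b := p.backends[idx]; b.healthy.Load() {
			return b
		}
	}
	return nil
}

// healthCheck probes each backend's /healthz endpoint (a hypothetical path)
// and marks servers up or down, so traffic stops flowing to failed instances.
func (p *pool) healthCheck(interval time.Duration) {
	for range time.Tick(interval) {
		for _, b := range p.backends {
			resp, err := http.Get(b.url.String() + "/healthz")
			ok := err == nil && resp.StatusCode == http.StatusOK
			if resp != nil {
				resp.Body.Close()
			}
			b.healthy.Store(ok)
		}
	}
}

func main() {
	// Hypothetical upstream servers; replace with real instance addresses.
	targets := []string{"http://10.0.1.10:8080", "http://10.0.1.11:8080"}

	p := &pool{}
	for _, t := range targets {
		u, err := url.Parse(t)
		if err != nil {
			log.Fatal(err)
		}
		b := &backend{url: u, proxy: httputil.NewSingleHostReverseProxy(u)}
		b.healthy.Store(true)
		p.backends = append(p.backends, b)
	}
	go p.healthCheck(5 * time.Second)

	// Every incoming request is forwarded to the next healthy backend.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		b := p.nextHealthy()
		if b == nil {
			http.Error(w, "no healthy backends", http.StatusServiceUnavailable)
			return
		}
		b.proxy.ServeHTTP(w, r)
	})
	log.Fatal(http.ListenAndServe(":80", nil))
}
```

Swapping the selection function is all it takes to move from round-robin to least connections or IP hash; the health-check loop stays the same.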
Bookuvai Implementation
Bookuvai configures load balancing for all production deployments. We use AWS ALB for HTTP/HTTPS traffic with path-based routing, health checks, and auto-scaling group integration. For WebSocket applications, we use NLB with sticky sessions. Load balancer configuration is defined in Terraform as part of our IaC setup.
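The ALB itself is managed, so there is no routing code to write, but a short Go sketch shows what path-based routing means in practice: requests whose path starts with /api/ go to one backend pool and everything else to another. The pool hostnames below are hypothetical placeholders, not part of the actual Terraform configuration.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

// proxyTo builds a reverse proxy for one backend pool (hypothetical address).
func proxyTo(target string) *httputil.ReverseProxy {
	u, err := url.Parse(target)
	if err != nil {
		log.Fatal(err)
	}
	return httputil.NewSingleHostReverseProxy(u)
}

func main() {
	api := proxyTo("http://api-pool.internal:8080")       // hypothetical API target group
	static := proxyTo("http://static-pool.internal:8080") // hypothetical static-assets target group

	// Path-based routing: /api/* goes to the API pool, everything else to static.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		if strings.HasPrefix(r.URL.Path, "/api/") {
			api.ServeHTTP(w, r)
			return
		}
		static.ServeHTTP(w, r)
	})
	log.Fatal(http.ListenAndServe(":80", nil))
}
```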
Frequently Asked Questions
- When do I need a load balancer?
- As soon as you have more than one server instance, or when you need zero-downtime deployments (rolling updates). Most production applications should use a load balancer.
- What is the difference between ALB and NLB?
- ALB (Application Load Balancer) operates at the HTTP level (layer 7) and supports path-based routing, host-based routing, and WebSocket upgrades. NLB (Network Load Balancer) operates at the TCP level (layer 4) and is used for non-HTTP protocols or when ultra-low latency is required. A minimal sketch contrasting the two layers follows this list.
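The sketch below illustrates layer-4 forwarding in the style of an NLB: raw TCP bytes are copied to a backend without ever parsing HTTP, in contrast to the layer-7 proxies above, which inspect the request path. The listening port and backend address are hypothetical.

```go
package main

import (
	"io"
	"log"
	"net"
)

// A layer-4 (NLB-style) forwarder: it copies raw TCP bytes to a single
// hypothetical backend without parsing HTTP, so it works for any TCP
// protocol. A layer-7 (ALB-style) proxy would instead parse the HTTP
// request and could route on the path or Host header.
func main() {
	ln, err := net.Listen("tcp", ":5432")
	if err != nil {
		log.Fatal(err)
	}
	for {
		client, err := ln.Accept()
		if err != nil {
			log.Print(err)
			continue
		}
		go func(c net.Conn) {
			defer c.Close()
			backend, err := net.Dial("tcp", "10.0.1.20:5432") // hypothetical backend address
			if err != nil {
				log.Print(err)
				return
			}
			defer backend.Close()
			go io.Copy(backend, c) // client -> backend
			io.Copy(c, backend)    // backend -> client
		}(client)
	}
}
```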