Notes | Load Balancing Orchestration using K8s and AWS

Load balancing is the practice of distributing network traffic across multiple backend servers. At its core, a load balancer (LB) acts as a reverse proxy, sitting in front of the servers and routing client requests across all of them.

Load Balancing Algorithms

Methods that an LB uses to decide which backend server a request should be sent to (a minimal sketch of a few of these follows the list).

Round Robin: Simply sends each request to the server next in line after the one that received the previous request. This does not incorporate the current load/health of servers.

Least Connections: Directs traffic to the server with the fewest active connections (useful when response times differ across requests).

IP Hash: Uses a hash of the incoming request's source IP to pick the server. Hence the same user gets routed to the same server (called user stickiness).

Weighted Round Robin / Weighted Least Connections: The admin assigns a weight to each server. Useful for servers with different capacities.
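
A minimal, illustrative sketch of a few of these selection strategies in Python (the backend addresses are made up for the example):

```python
import hashlib
from itertools import cycle

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical backend pool

# Round Robin: cycle through the pool, ignoring current load/health.
_rr = cycle(servers)

def round_robin() -> str:
    return next(_rr)

# IP Hash: hash the client's source IP so the same client always
# lands on the same backend (user stickiness).
def ip_hash(client_ip: str) -> str:
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

# Least Connections: pick the backend with the fewest active connections
# (the counts would be maintained by the LB as connections open and close).
active_connections = {s: 0 for s in servers}

def least_connections() -> str:
    return min(active_connections, key=active_connections.get)

print(round_robin(), ip_hash("203.0.113.7"), least_connections())
```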

Health Checks: Ensuring Reliability

The LB sends requests to healthy servers only. This is facilitated by periodic health checks at a configurable interval. Some examples of health checks:

  1. TCP connection attempt
  2. HTTP 200 OK from a particular endpoint (e.g. /ping)

If a check fails, the LB removes that server from the pool and redirects traffic to the healthy ones (this is called failover).
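
A minimal sketch of such a health-check loop, assuming each backend exposes a /ping endpoint (the addresses, port and 5-second interval are made up):

```python
import time
import urllib.request

backends = ["http://10.0.0.1:8080", "http://10.0.0.2:8080"]  # hypothetical pool
healthy = set(backends)

def check(url: str) -> bool:
    """Return True if GET /ping answers 200 within the timeout."""
    try:
        with urllib.request.urlopen(f"{url}/ping", timeout=2) as resp:
            return resp.status == 200
    except OSError:
        return False

while True:
    for b in backends:
        if check(b):
            healthy.add(b)      # server recovered: put it back in the pool
        else:
            healthy.discard(b)  # failover: stop routing to this server
    time.sleep(5)               # configurable health-check interval
```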

Session Persistence (Sticky Sessions): Maintaining State

For use cases where a customer should keep interacting with the same server (e-commerce shopping carts, for example), session persistence, or sticky sessions, is essential. On the client's first request, the server adds a cookie to the response. For all further requests, the LB reads this cookie and routes accordingly. This is essentially the IP Hash idea applied to a session ID instead of the client IP.
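
A toy sketch of cookie-based stickiness at the LB (the cookie name and backend names are made up):

```python
import random

backends = ["app-1", "app-2", "app-3"]   # hypothetical backend names
STICKY_COOKIE = "lb_session"             # cookie the LB reads on later requests

def route(request_cookies: dict) -> tuple[str, dict]:
    """Return (chosen backend, cookies to set on the response)."""
    backend = request_cookies.get(STICKY_COOKIE)
    if backend in backends:
        return backend, {}               # existing session: stay on the same server
    backend = random.choice(backends)    # first request: pick a server...
    return backend, {STICKY_COOKIE: backend}  # ...and pin it via a response cookie

# The first request carries no cookie; later requests reuse what the LB set.
server, set_cookies = route({})
server_again, _ = route(set_cookies)
assert server == server_again
```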

AWS Elastic Load Balancer (ELB) and Application Load Balancer (ALB)

AWS provides a managed load balancer (ELB) which routes traffic to multiple targets such as Amazon EC2 instances, containers and IP addresses. There are different types of LB available under the ELB umbrella; the two prominent ones are:

  1. Application Load Balancer (ALB): Commonly used, operates at the application layer (OSI Layer 7). This allows it to inspect the content of the traffic (HTTP headers, request paths, etc.). For instance, an ALB can route requests based on URL path (as in Kong's path-based routing) to a particular set of servers (a rough SDK sketch of such a rule follows the list). Best suited for HTTP & HTTPS traffic.
  2. Network Load Balancer (NLB): Operates at the transport layer (Layer 4). Routing is based on IP protocol data. More performant than the ALB. NLBs are best suited for TCP, UDP and TLS traffic.
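
As a rough illustration of path-based routing, here is how a rule like "send /api/* to a separate target group" might be attached to an ALB listener with boto3; the region and ARNs are placeholders, not real values:

```python
import boto3

elbv2 = boto3.client("elbv2", region_name="us-east-1")  # region is an assumption

# Route /api/* to a dedicated target group; everything else falls through
# to the listener's default action.
elbv2.create_rule(
    ListenerArn="arn:aws:elasticloadbalancing:...:listener/app/my-alb/...",  # placeholder
    Priority=10,
    Conditions=[{"Field": "path-pattern", "Values": ["/api/*"]}],
    Actions=[{"Type": "forward",
              "TargetGroupArn": "arn:aws:elasticloadbalancing:...:targetgroup/api/..."}],  # placeholder
)
```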

Symphony of Load Balancing with K8s, ELB/ALB and EC2

Kubernetes is an open-source container orchestration platform. A Kubernetes cluster is typically composed of a set of Amazon EC2 instances (if you use AWS services). These instances run and manage containers (application pods); hence EC2 provides the compute for K8s. K8s exposes a Service, which provides a stable endpoint for a set of pods. A Service is like an internal LB within the cluster. When a Service is defined as type LoadBalancer, K8s communicates with AWS behind the scenes and automatically provisions an AWS load balancer (usually an NLB).
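
A minimal sketch of such a Service using the official kubernetes Python client (the name, labels and ports are made up; the equivalent YAML manifest is the more common way to express this):

```python
from kubernetes import client, config

config.load_kube_config()  # assumes a kubeconfig pointing at the cluster

# Expose pods labelled app=web through a Service of type LoadBalancer;
# on AWS this asks the cloud provider to provision an external load balancer.
svc = client.V1Service(
    metadata=client.V1ObjectMeta(name="web-svc"),
    spec=client.V1ServiceSpec(
        type="LoadBalancer",
        selector={"app": "web"},  # the pods this Service fronts
        ports=[client.V1ServicePort(port=80, target_port=8080)],
    ),
)
client.CoreV1Api().create_namespaced_service(namespace="default", body=svc)
```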

AWS Load Balancer Controller

While the LoadBalancer Service type is functional, the preferred approach is to use an Ingress to manage external access. An Ingress manages external access to Services in K8s; it can provide load balancing, SSL termination and name-based virtual hosting.
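
A minimal sketch of an Ingress defined with the kubernetes Python client, assuming a Service named web-svc already exists; the "alb" ingress class and annotation are what the AWS controller described next watches for:

```python
from kubernetes import client, config

config.load_kube_config()  # assumes a kubeconfig pointing at the cluster

# Route HTTP traffic for app.com to the existing Service "web-svc".
ingress = client.V1Ingress(
    metadata=client.V1ObjectMeta(
        name="web-ingress",
        annotations={"alb.ingress.kubernetes.io/scheme": "internet-facing"},
    ),
    spec=client.V1IngressSpec(
        ingress_class_name="alb",
        rules=[client.V1IngressRule(
            host="app.com",
            http=client.V1HTTPIngressRuleValue(paths=[client.V1HTTPIngressPath(
                path="/",
                path_type="Prefix",
                backend=client.V1IngressBackend(
                    service=client.V1IngressServiceBackend(
                        name="web-svc",
                        port=client.V1ServiceBackendPort(number=80),
                    )
                ),
            )]),
        )],
    ),
)
client.NetworkingV1Api().create_namespaced_ingress(namespace="default", body=ingress)
```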

We need an Ingress controller to make Ingress work on AWS. The AWS Load Balancer Controller watches Ingress resources; when a new Ingress is detected, it provisions and configures an ALB. Here's the step-by-step flow of traffic:

  1. User Request: The user sends a request to the application's domain name (e.g. app.com).
  2. DNS Resolution: DNS resolves the domain to the public address of the AWS Application Load Balancer provisioned by the AWS LB Controller.
  3. ALB Receives the Request: The ALB receives the request and can terminate SSL/TLS, offloading that work from the application pods.
  4. ALB Routing: The ALB inspects the request and, based on the rules in the K8s Ingress, determines which K8s Service to forward the traffic to.
  5. Traffic to Node: The ALB forwards the request to the Service's NodePort on one of the EC2 worker nodes in the K8s cluster. A NodePort is a specific port opened on every node in the cluster.
  6. Internal Routing to Pod: K8s internal networking routes the traffic arriving at the NodePort to the correct pod, even if that pod is on a different EC2 instance.
  7. Application Processing: The container within the pod receives the request and sends the response back along the same path.

The entire process is dynamic. If a pod crashes, K8s reschedules it and the AWS LB Controller updates the targets with the new pod's location. The same happens while scaling out.

Load balancers are the unsung heroes of modern, scalable application delivery.