How in-depth are you on AWS Network Load Balancers and Reverse Proxy? — Part 2
Network Load Balancer
Do you know what idle_timeout is?
A TCP connection that remains idle for a period of time can time out. When the timeout occurs, the network appliance no longer considers the connection active and stops delivering packets in either direction. Idle timeouts exist for one reason: don’t keep connections open on the server that aren’t needed. Remember, open connections consume resources (memory, CPU, TCP connections, ephemeral ports, etc.).
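To make “open connections consume resources” concrete, here is a minimal local sketch (Python, loopback addresses only, standing in for a real client/server pair) showing that even a completely idle connection pins an ephemeral port and a file descriptor on each end until someone closes it:

```python
import socket

# A throwaway local listener standing in for "the server".
server = socket.socket()
server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
server.listen()

# One idle client connection: no data will ever be sent on it.
client = socket.create_connection(server.getsockname())
conn, _ = server.accept()

# Even while idle, the connection holds an ephemeral port on the client
# side and a socket (file descriptor) on each end until it is closed.
addr, port = client.getsockname()
print(f"idle connection pinned to ephemeral port {port}")

client.close()
conn.close()
server.close()
```

Multiply that by thousands of clients that never say goodbye, and you can see why appliances enforce an idle timeout instead of waiting forever.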
Conclusion
Cascade effect, how everything intertwines
When you’re working with an AWS Network Load Balancer (NLB) in front of NGINX (on ECS, EC2, etc.), with NGINX proxying to upstream servers, it’s critical to understand that connection timeouts are not isolated settings. They form a chain, and if one link is misconfigured, you can end up with broken sessions, failed downloads, or wasted server resources.
What Happens When Timeouts Are Misaligned
Imagine this scenario (based on AWS documentation):
- The client has sent a request to the server and is waiting for a response.
- The server is processing this request, and it isn’t transmitting anything back to the client for an extended period of time.
- The connection timeout occurs at the network appliance used between the client and server, but neither side is aware of it.
- The server continues to process the request not knowing that the connection is already closed.
- Once the server completes processing and starts to transmit a response back, it will receive an RST packet from the network appliance, notifying it that the connection is closed.
- This can result in a waste of resources and a redundant load on the server.
This is why idle timeouts must be considered end-to-end, not just at the load balancer.
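The RST surprise in the scenario above can be reproduced locally. In this sketch (Python, loopback only), one side resets an established connection with `SO_LINGER` set to zero — playing the role of the network appliance — and the other side, our “server”, only discovers the connection is gone when it finally tries to write its response:

```python
import socket
import struct
import time

# The "server": accepts a connection, then "processes" for a long time.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen()

# The "client" side stands in for the network appliance in the middle.
client = socket.create_connection(listener.getsockname())
server_conn, _ = listener.accept()

# SO_LINGER with a zero timeout makes close() send an RST instead of a
# graceful FIN — just like an appliance tearing down an idle connection.
client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                  struct.pack("ii", 1, 0))
client.close()
time.sleep(0.2)  # let the RST reach the server's kernel

try:
    # The server finishes "processing" and transmits its response.
    server_conn.sendall(b"response after long processing")
    server_conn.sendall(b"more data")  # a second write surfaces the reset reliably
    print("send succeeded")
except (ConnectionResetError, BrokenPipeError) as exc:
    print(f"server learned the connection was gone only on write: {exc!r}")
finally:
    server_conn.close()
    listener.close()
```

All the work the server did before that failed write was wasted — exactly the redundant load the AWS documentation warns about.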
The Full Chain: Client → NLB → NGINX → Upstream
Each layer has its own timeout and keepalive settings:
- NLB: Controls how long it keeps an idle TCP connection open.
- NGINX: Has directives like keepalive_timeout, proxy_read_timeout, and proxy_connect_timeout.
- Upstream: Application servers may also have their own idle timeout.
If these values are inconsistent, you risk:
- Premature connection closure (NLB closes before NGINX or upstream).
- 502 Bad Gateway errors (NGINX tries to reuse a closed connection).
- Unexpected timeouts at the client while a dead connection lingers.
Best Practice
Configure timeouts so that the first component in the chain closes the connection first.
Example strategy:
- Client timeout < NLB idle timeout < NGINX keepalive_timeout < Upstream keepalive_timeout (each layer’s timeout slightly shorter than the one behind it, so it closes first).
- Keep differences small (e.g., 5–10 seconds) to avoid race conditions.
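A sketch of what that strategy could look like on the NGINX side (all values and the upstream address are hypothetical; it assumes the NLB’s TCP idle timeout is at its default of 350 seconds, and that the application server’s own idle timeout is around 75 seconds):

```nginx
http {
    # Client-facing keepalive: slightly longer than the NLB idle timeout
    # (assumed 350s), so the NLB always closes that leg first.
    keepalive_timeout 360s;

    upstream app {
        server 10.0.0.10:8080;        # hypothetical backend
        keepalive 32;                 # pool of idle upstream connections
        keepalive_timeout 60s;        # close idle upstream connections before
                                      # the app server's own ~75s idle timeout
    }

    server {
        listen 80;

        location / {
            proxy_pass http://app;
            proxy_http_version 1.1;
            proxy_set_header Connection "";   # required for upstream keepalive
            proxy_connect_timeout 5s;
            proxy_read_timeout 65s;
        }
    }
}
```

Note that `keepalive_timeout` appears twice with two different jobs: in the `http` context it governs client-facing connections, while inside the `upstream` block (available in newer NGINX versions) it governs the pooled connections to the backend — and the ordering rule above applies to each leg independently.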
