Unexplained latency with AWS ELB

We've recently been experiencing unexplained latency issues with our AWS setup, as reflected in the ELB latency metric.

Our setup includes 3 EC2 c1.medium machines (each running NGINX, which talks to a uWSGI handler on the same machine) behind an ELB.

Our traffic peaks in the morning and evening, but that doesn't explain what we're seeing, i.e. latency peaks of 10 seconds well into the traffic peak.

Our NGINX logs and uWSGI stats show that we are not queuing any requests and that response times are solidly under 500 ms.
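
A claim like this can be checked by summarizing the backend timings directly. The following is a minimal sketch, assuming a hypothetical access log format in which NGINX's `$request_time` (seconds) is written as the last field of each line; the sample lines and the helper names `percentile` and `request_times` are illustrative and not part of the setup described here.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def request_times(lines):
    """Yield the trailing numeric field of each log line, in seconds."""
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        try:
            yield float(fields[-1])
        except ValueError:
            continue  # line does not end in a number; skip it

# Hypothetical log lines; the last field stands in for $request_time.
sample_log = [
    "GET /a 200 0.120",
    "GET /b 200 0.340",
    "GET /c 200 0.095",
    "GET /d 200 0.480",
]
times = list(request_times(sample_log))
print(percentile(times, 50), percentile(times, 99), max(times))
```

If the high percentiles on the instances stay well below what the ELB reports, the extra time is probably being added somewhere in front of NGINX.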

Some config details:

The ELB listens on port 8443 and forwards to port 8080.

NGINX has the following config on each EC2:

worker_processes 2;
pid /var/run/nginx.pid;

events {
    worker_connections 4000;
    multi_accept on;
    use epoll;
}

http {
    server {
        reset_timedout_connection on;
        access_log off;
        listen 8080;

        location / {
            include uwsgi_params;
            uwsgi_pass 127.0.0.1:3031;
        }
    }
}

I was wondering if anyone has experienced something similar or can offer an explanation.

Thank you.


I'm not sure if it's documented anywhere, but we've been using ELBs for quite a while. In essence, ELBs are EC2 instances in front of the instances you are load balancing. Our understanding is that when your ELB starts receiving more traffic, Amazon does some magic to move that ELB from, say, a c1.medium to an m1.xlarge.

So it could be that when your traffic starts to peak, Amazon is transitioning from the smaller to the larger ELB instance, and you are seeing delays during that transition.

Again, customers don't know what goes on inside Amazon, so for all you know they could be experiencing heavy traffic at the same time as your peaks, and their load balancers are going berserk.

You could probably avoid these delays by over-provisioning, but who wants to spend more money?

There are a couple of things I would recommend if you have the time and resources:

  • Set up an haproxy instance in front of your environment (on some large instance) and monitor your traffic that way. HAProxy has a command-line (or web) utility that lets you see stats. Of course, you also need to monitor that instance for things like CPU and memory.

  • You may not be able to do this in production, in which case you will have to run test traffic through it. I recommend using something like loader.io. Another option is to send part of the traffic to an haproxy instance, perhaps using GSLB (if your DNS provider supports it).
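
To make the haproxy stats easier to act on, they can be post-processed with a small script. The following is a rough sketch, assuming you have saved the CSV form of the stats (for example from `show stat` on the stats socket, or from the stats page with `;csv` appended to its URL). The sample data is made up and contains only a few of the real columns, and the function names are hypothetical.

```python
import csv
import io

def parse_haproxy_stats(csv_text):
    """Parse HAProxy stats CSV into a list of dicts keyed by column name."""
    # The header line is prefixed with "# ", which is stripped here.
    text = csv_text.lstrip()
    if text.startswith("#"):
        text = text[1:].lstrip()
    return list(csv.DictReader(io.StringIO(text)))

def queued_backends(rows):
    """Return (proxy, server, qcur) for rows with requests currently queued."""
    result = []
    for row in rows:
        # qcur is empty for rows where it does not apply (e.g. frontends).
        qcur = row.get("qcur") or "0"
        if int(qcur) > 0:
            result.append((row["pxname"], row["svname"], int(qcur)))
    return result

# Hypothetical sample with only a few of the real columns.
sample = """# pxname,svname,qcur,qmax,scur,smax
web,FRONTEND,,,12,40
app,node1,0,3,4,10
app,node2,5,9,6,10
"""
rows = parse_haproxy_stats(sample)
print(queued_backends(rows))
```

Running something like this periodically during the morning and evening peaks would show whether requests are queuing at the proxy or whether the delay is coming from somewhere else.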
