nginx-connection-limit

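The nginx connection limit module (limit_conn) limits the number of connections. At our company we use it to cap the maximum number of connections on a single nginx instance. Once the cap is exceeded, new connections are rejected, and by default nginx returns 503.

Alibaba Cloud SLB uses the same strategy.

But we ran into a problem. Right after one server was reloaded, it suddenly started returning a lot of 503s. Yet when we checked that machine's connection count at the time, it had not exceeded the limit we had configured. The other servers had no problems.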
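One way to check the connection count on a machine is to count established TCP sockets on the nginx listen port. The sketch below is an assumption about how such a check could look, not the command we actually ran; the helper name `count_port_conns` is hypothetical, and port 8998 is only the port used in the tests later in this post. It reads `ss -Htn` output on stdin so it can also be tried on saved output.

```shell
#!/bin/sh
# Count established TCP connections whose LOCAL port equals $1.
# Expects `ss -Htn` output on stdin, i.e. the columns:
#   State  Recv-Q  Send-Q  Local-Address:Port  Peer-Address:Port
# (hypothetical helper, for illustration only)
count_port_conns() {
    awk -v port="$1" '
        # keep only established sockets whose local address ends in ":<port>"
        $1 == "ESTAB" && $4 ~ (":" port "$") { n++ }
        END { print n + 0 }
    '
}
```

Live usage would be `ss -Htn | count_port_conns 8998`. Note that limit_conn only counts a connection once its request header has been fully read, so the socket count is an approximation of what the zone sees, not the same number.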

The configuration is as follows:

# Overall connection limit:
limit_conn_zone all zone=allserver:10m;
limit_conn allserver 50000;

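Moreover, after every reload we clean up the corresponding shutting down processes. In the end, the only thing that fixed it was restarting nginx. So we ran some tests.

We used wrk for the tests, mainly because ab cannot keep connections open.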

git clone https://github.com/wg/wrk.git wrk
cd wrk
make
sudo cp wrk /usr/bin/
wrk -t1 -c6 -d300s http://127.0.0.1:8998/

To verify the effect, we set the connection limit to 5.

# Overall connection limit:
limit_conn_zone all zone=allserver:10m;
limit_conn allserver 5;

The outcome of this test is a foregone conclusion: with 6 connections against a limit of 5, a lot of 503s come back.

2022/03/13 10:38:57 [error] 5491#0: *1381450 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"
2022/03/13 10:38:57 [error] 5491#0: *1381391 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"
2022/03/13 10:38:57 [error] 5492#0: *1437629 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"
2022/03/13 10:38:57 [error] 5492#0: *1437629 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"
2022/03/13 10:42:41 [error] 5491#0: *10140463 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"

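But what I really want to verify is what happens after the nginx worker processes are restarted. We set wrk's concurrency to 4 and then killed the worker processes. Although my test was still running, the connection count did not change, and no 503s were returned.

This is rather strange, because on my production server the 503s appeared suddenly right after a reload.

In production, however, the shutting down processes still hold many connections after a reload, so that needs to be simulated as well. But wrk cannot hold a connection open indefinitely. So I did a reload first, without killing the shutting down processes.

As the output below shows, I reloaded at 20:07:49, but the error log entries did not appear until 20:07:56. The test program runs with a concurrency of 4 and keeps 4 connections open. As I understand it, the limit should in theory have been exceeded at around 20:07:50, not 20:07:56.
Also, as long as the shutting down processes exist, the errors should keep being reported, yet they were only reported at 20:07:56. And why exactly 12 times rather than some other number? This still needs further verification.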

wrk -t4 -c4 -d300s http://127.0.0.1:8998/

timo   90335  5.4  0.4 758740 152956 ?       S    18:27   5:31 nginx: worker process is shutting down
timo 90566 5.0 0.4 758728 152960 ? S 18:28 5:06 nginx: worker process is shutting down
timo 128183 93.8 0.5 794176 185860 ? R 20:07 0:47 nginx: worker process
timo 128184 94.0 0.5 794176 185868 ? R 20:07 0:47 nginx: worker process
timo 128526 0.0 0.0 116976 2448 pts/1 S+ 20:08 0:00 grep --color=auto nginx
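To make it easier to tell whether old workers are still around after a reload, the process listing above can be reduced to a count and a short list. This is a minimal sketch under assumptions: the function names are hypothetical, and the input is assumed to be `ps aux`-style text (PID in column 2, start time in column 9, command starting in column 11), as in the listing above.

```shell
#!/bin/sh
# Both helpers read `ps aux`-style text on stdin (hypothetical names).
# Matching on column 11 == "nginx:" instead of a bare pattern keeps the
# awk process itself from being counted when used as `ps aux | ...`.

# Print how many nginx workers are still in the "shutting down" state.
count_shutting_down() {
    awk '$11 == "nginx:" && /is shutting down$/ { n++ } END { print n + 0 }'
}

# Print "<pid> <start time>" for each shutting-down worker.
list_shutting_down() {
    awk '$11 == "nginx:" && /is shutting down$/ { print $2, $9 }'
}
```

Live usage would be `ps aux | count_shutting_down` right after `nginx -s reload`; on the listing above it reports 2 old workers, started at 18:27 and 18:28.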
[timo@nginx1 nginx]$ ps -o lstart 128183
STARTED
Sun Mar 13 20:07:49 2022

2022/03/13 20:07:56 [error] 128183#0: *28300832 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"
2022/03/13 20:07:56 [error] 128183#0: *28300832 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"
2022/03/13 20:07:56 [error] 128183#0: *28300832 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"
2022/03/13 20:07:56 [error] 128183#0: *28300832 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"
2022/03/13 20:07:56 [error] 128183#0: *28302782 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"
2022/03/13 20:07:56 [error] 128183#0: *28218366 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"
2022/03/13 20:07:56 [error] 128183#0: *28302782 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"
2022/03/13 20:07:56 [error] 128183#0: *28300832 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"
2022/03/13 20:07:56 [error] 128183#0: *28310083 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"
2022/03/13 20:07:56 [error] 128183#0: *28300832 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"
2022/03/13 20:07:56 [error] 128183#0: *28310083 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"
2022/03/13 20:07:56 [error] 128183#0: *28310083 limiting connections by zone "allserver", client: 127.0.0.1, server: _, request: "GET / HTTP/1.1", host: "127.0.0.1:8998"
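For the open question of why there are exactly 12 entries, it may help to group the rejections by timestamp, worker PID and connection ID instead of reading the raw log. The following is only a sketch: `summarise_limit_conn` is a hypothetical helper, and it assumes the error log line layout shown above (date, time, level, `pid#tid:`, `*connection-id`).

```shell
#!/bin/sh
# Summarise limit_conn rejections from an nginx error log read on stdin.
# Prints one sorted line per (timestamp, worker PID, connection ID)
# together with how often that combination was logged.
summarise_limit_conn() {
    awk '/limiting connections by zone/ {
        split($4, w, "#")        # field 4 is like 128183#0: so w[1] is the PID
        conn = substr($5, 2)     # field 5 is like *28300832, drop the star
        seen[$1 " " $2 " pid=" w[1] " conn=" conn]++
    }
    END { for (k in seen) print k, "count=" seen[k] }' | sort
}
```

Usage would be `summarise_limit_conn < error.log`, where the log path is whatever `error_log` points to. Applied to the twelve lines above, it shows that they all come from worker 128183 and from four distinct connection IDs, which matches wrk's `-c4` but does not by itself explain the count.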