Summary
Many users would like to know what the TCP connection limit is on AWS Linux instances and how to increase the number of connections. This article explains the configurations that need to be checked and modified on Linux to increase the maximum number of TCP connections, on both the server side and the client side. Some tests were also performed to see the results with different configurations on Amazon Linux*.
*Amazon Linux v28.0 314 11
Tuning the Linux maximum TCP connections
The AWS network does not limit the number of TCP connections. When users encounter problems with the number of TCP connections, they should first check their system configuration.
In Linux, a TCP/IP socket is considered an open file. Thus, it requires an “open file descriptor”, or in GNU/Linux terminology a “file handle”.
To fine-tune the maximum number of TCP/IP connections on Linux, we should first check the maximum number of “file handles”.
In this article, we assume a server listens on a specific TCP port for clients to connect to. Users usually use one or a few clients to test the maximum number of TCP connections.
Below are the system configurations that should be taken care of on the server side.
Server side tuning
1. System Wide File Descriptors Limits in Linux
There is a “file-max” configuration that defines a system-wide limit on the number of open files for all processes [1]. The default value is set to about 10% of the system memory (if 10% of the system memory is less than 8192, it is set to 8192). The details are in the Linux kernel function files_maxfiles_init() in fs/file_table.c.
System calls that hit this limit fail with the error ENFILE. If you get error messages in the kernel log about running out of file handles (look for “VFS: file-max limit <number> reached”), try increasing this value.
Note that a process with root privilege (CAP_SYS_ADMIN) is not limited by this file-max limit, but the same error messages will still appear in the kernel log.
For example:
kernel: [ 908.991163] VFS: file-max limit 99580 reached
You can use the commands below to determine the maximum number of open files on a Linux system.
# sysctl fs.file-max
fs.file-max = 99580
# cat /proc/sys/fs/file-max
99580
If users think this value is not big enough, they can change it with:
# sysctl -w fs.file-max=500000
To make the change permanent, add or change the following line in the file /etc/sysctl.conf. The configuration will be applied during the boot process.
# echo "fs.file-max=500000" >> /etc/sysctl.conf
2. Process Level Maximum Open Files (fs.nr_open and RLIMIT_NOFILE)
fs.nr_open denotes the maximum number of file handles a process can allocate [1]. This file imposes a ceiling on the value to which the RLIMIT_NOFILE resource limit (described below) can be raised. This ceiling is enforced for both unprivileged and privileged processes.
The default value is 1024*1024 (1048576), which should be enough for most machines. The actual limit depends on the RLIMIT_NOFILE resource limit.
To check the nr_open value:
# sysctl fs.nr_open
fs.nr_open = 1048576
# cat /proc/sys/fs/nr_open
1048576
The default value should be big enough, but if you still want to modify the value of nr_open:
# sysctl -w fs.nr_open=2000000
fs.nr_open = 2000000
To make the change permanent, add or change the following line in the file /etc/sysctl.conf. The configuration will be applied during the boot process.
# echo "fs.nr_open=2000000" >> /etc/sysctl.conf
RLIMIT_NOFILE indicates the maximum file descriptor number that can be opened by a process or by processes it forks/spawns. This limit constrains the number of file descriptors that a process may allocate [2].
The value has an associated soft and hard limit. The soft limit is the value that the kernel enforces for the corresponding resource. The hard limit acts as a ceiling for the soft limit: an unprivileged process may set only its soft limit to a value in the range from 0 up to the hard limit.
In summary,
The RLIMIT_NOFILE soft limit is the effective value for that process right now. The process can increase its soft limit on its own when it needs more resources, but cannot set the soft limit higher than the hard limit.
The RLIMIT_NOFILE hard limit is the maximum allowed to a process, set by the superuser/root. This value is set in the file /etc/security/limits.conf. Think of it as an upper bound for the soft limit.
fs.nr_open is the upper bound to which the RLIMIT_NOFILE hard limit can be configured.
In mathematical or programming notation:
RLIMIT_NOFILE soft limit <= RLIMIT_NOFILE hard limit <= fs.nr_open
If a process opens files or creates sockets beyond the RLIMIT_NOFILE soft limit, the functions that allocate a file descriptor fail with errno set to EMFILE.
Check the RLIMIT_NOFILE (open file) limits with ulimit using the following commands:
# ulimit -Sn
# ulimit -Hn
The default open file limits on Amazon Linux are shown below.
The maximum number of open files for this process (RLIMIT_NOFILE soft limit):
# ulimit -Sn
1024
The maximum number of open files the superuser/root can configure for this process (RLIMIT_NOFILE hard limit):
# ulimit -Hn
4096
To increase the open file limit:
1. Modify /etc/security/limits.conf
# vi /etc/security/limits.conf
2. Add the two lines below to raise the maximum open file limit to 99999:
* hard nofile 99999
* soft nofile 99999
3. Log out and log back in to Linux.
4. Verify the new open file limits:
# ulimit -Sn
99999
# ulimit -Hn
99999
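To inspect the soft and hard limits from inside a program rather than from the shell, a process can call getrlimit()/setrlimit() as described in the man page [2]. Below is a minimal C sketch (my own illustration, not part of the tests) that prints the current RLIMIT_NOFILE values and raises the soft limit up to the hard limit, which even an unprivileged process is allowed to do.

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    printf("RLIMIT_NOFILE soft=%llu hard=%llu\n",
           (unsigned long long)rl.rlim_cur, (unsigned long long)rl.rlim_max);

    /* An unprivileged process may raise its soft limit up to the hard limit;
     * going above the hard limit requires privilege. */
    rl.rlim_cur = rl.rlim_max;
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("setrlimit");
        return 1;
    }
    printf("soft limit raised to %llu\n", (unsigned long long)rl.rlim_cur);
    return 0;
}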
3. Connection Tracking
If your Linux system enables the nf_conntrack module (e.g. you are using iptables), it keeps track of which connections are established and puts them into a connection tracking table. You may need to take care of the size of this connection tracking table.
By default, Amazon Linux does not enable nf_conntrack.
If your Linux does not enable nf_conntrack, you can skip this check.
To enable nf_conntrack
# modprobe nf_conntrack
If your Linux system enables the nf_conntrack module, you can see the current size of the tracking table with:
# sysctl net.netfilter.nf_conntrack_count
and its limit with:
# sysctl net.netfilter.nf_conntrack_max
net.netfilter.nf_conntrack_max = 32768
The defaults are described in the kernel documentation [3]:
nf_conntrack_max - INTEGER
Size of connection tracking table. Default value is nf_conntrack_buckets value * 4.
nf_conntrack_buckets - INTEGER
Size of hash table. If not specified as a parameter during module loading, the default size is calculated by dividing total memory by 16384 to determine the number of buckets, but the hash table will never have fewer than 32 buckets and is limited to 16384 buckets. For systems with more than 4GB of memory it will be 65536 buckets. This sysctl is only writeable in the initial net namespace.
To increase the size of the connection tracking table:
# sysctl -w net.netfilter.nf_conntrack_max=100000
To make it permanent after reboot, add this value to /etc/sysctl.conf:
# echo "net.netfilter.nf_conntrack_max=100000" >> /etc/sysctl.conf
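If you want an application to warn before the conntrack table fills up, it can read the same two entries from /proc. Below is a minimal C sketch (an illustration only; it assumes the nf_conntrack module is loaded so the /proc entries exist) that compares nf_conntrack_count with nf_conntrack_max.

#include <stdio.h>

/* Read a single integer from a /proc/sys entry; return -1 on failure. */
static long read_proc_long(const char *path)
{
    long value = -1;
    FILE *fp = fopen(path, "r");

    if (fp != NULL) {
        if (fscanf(fp, "%ld", &value) != 1)
            value = -1;
        fclose(fp);
    }
    return value;
}

int main(void)
{
    long count = read_proc_long("/proc/sys/net/netfilter/nf_conntrack_count");
    long max = read_proc_long("/proc/sys/net/netfilter/nf_conntrack_max");

    if (count < 0 || max < 0) {
        fprintf(stderr, "nf_conntrack does not appear to be loaded\n");
        return 1;
    }
    printf("conntrack entries: %ld of %ld in use\n", count, max);
    return 0;
}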
4. Application Limits
Users’ applications may have their own connection limits. For example, the Apache HTTP server can be configured with a maximum number of clients [4][5]. The nginx HTTP server also has a configuration to specify the maximum number of client connections [6]. This is out of the scope of this article.
Client side tuning
On the client side, users still need to take care of the three configurations below, which were described in the server side tuning above:
1. System Wide File Descriptors Limits in Linux
2. Process Level Maximum Open Files (fs.nr_open and RLIMIT_NOFILE)
3. Connection Tracking
In addition, if users would like to use one or a few clients to create many TCP connections to the server, those connections will need different TCP source ports. Be sure to check the configuration below.
1. Client Side Ephemeral Port Limits
When testing maximum connections, users may use one client to generate many connections with different TCP source ports, so they may need to increase the client side ephemeral port range.
ip_local_port_range defines the local port range that is used by TCP and UDP to choose the local port. The first number is the first local ephemeral port number. The second is the last local ephemeral port number [7].
If possible, these numbers should have different parity (one even and one odd value).
The maximum value for the last local port number is 65535, which is limited by the TCP protocol (ports are 16-bit).
The default values are 32768 and 60999 respectively, which gives a total of 60999 - 32768 = 28231 ports.
To check the current ephemeral port range:
# sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 32768 60999
For example, we could increase the range with:
# sysctl -w net.ipv4.ip_local_port_range="15000 65535"
To make it permanent after reboot, add this value to /etc/sysctl.conf:
# echo "net.ipv4.ip_local_port_range=15000 65535" >> /etc/sysctl.conf
The total number of usable ephemeral ports then becomes 65535 - 15000 = 50535.
Below are two situations that can occur when the application reaches this limit.
If this happens when the application uses bind(), it will get errno 98 (EADDRINUSE: Address already in use).
If this happens when the application uses connect(), it will get errno 99 (EADDRNOTAVAIL: Cannot assign requested address).
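As a rough illustration of the two errno values above, the C sketch below keeps opening TCP connections until a call fails. This is my own simplified example, not tcpkali; the server address 10.0.0.10 is a hypothetical placeholder and port 12345 matches the test setup later in this article. Depending on which limit is hit first, connect() fails with EADDRNOTAVAIL or socket() fails with EMFILE.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

int main(void)
{
    struct sockaddr_in addr;
    int opened = 0;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(12345);                       /* test port used below */
    inet_pton(AF_INET, "10.0.0.10", &addr.sin_addr);    /* hypothetical server IP */

    for (;;) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0) {
            perror("socket");      /* EMFILE once the RLIMIT_NOFILE soft limit is hit */
            break;
        }
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect");     /* EADDRNOTAVAIL once ephemeral ports run out */
            break;
        }
        opened++;                  /* sockets are intentionally left open */
    }
    printf("connections established: %d\n", opened);
    return 0;
}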
Test Maximum TCP Connections
The tests below exercise the above limits on both the server side and the client side. When testing the server side, all client side limits are configured above the maximum connection number we want to test, so we can assume the client side imposes no limit; and vice versa when testing the client side.
To test the maximum number of TCP connections, I launched two Amazon Linux t2.micro instances in the same subnet. One of them acts as the server, the other as the client. I used tcpkali from GitHub; tcpkali is a high-performance TCP and WebSocket load generator and sink.
https://github.com/machinezone/tcpkali
tcpkali is smart enough to check some system limits before conducting the test. If tcpkali detects that a limit is smaller than the number of connections it is going to create, it prints an error message and does not start the test. However, since my test is to see what happens when the server/client side really hits the system limits, I commented out that part of the tcpkali source code before installing it.
In src/tcpkali.c, wrap the part below with “#if 0” and “#endif”:
#if 0
    /*
     * Check that we'll have a chance to report latency
     */
    if(conf.latency_window) {
        if(conf.latency_window > conf.test_duration) {
            fprintf(stderr, "--statsd-latency-window=%gs exceeds --duration=%gs.\n",
                    conf.latency_window, conf.test_duration);
            exit(EX_USAGE);
        }
        if(conf.latency_window >= conf.test_duration / 2) {
            warning("--statsd-latency-window=%gs might result in too few latency reports.\n", conf.latency_window);
        }
        if(conf.latency_window < 0.5) {
            fprintf(stderr, "--statsd-latency-window=%gs is too small. Try 0.5s.\n",
                    conf.latency_window);
            exit(EX_USAGE);
        }
    }
#endif
Build and install tcpkali on both instances.
Test steps:
The server side listens on port 12345 for 3 hours. Command:
./tcpkali -l12345 -T3h -v
The client side tries to create 50000 TCP connections to the server within a total time of 600 s at a connection rate of 100/s.
./tcpkali --connections 50000 -T 600 --connect-rate=100 <server_IP>:12345
I used another console to keep monitoring the connection count. Since the count is sampled every second, there may be some difference between the server side and client side values.
while true; do echo "ESTABLISHED TCP connections with peer:"; ss | grep <peer_IP> | grep ESTAB | wc -l; sleep 1; done
Test Server Side
The client side system limits are configured as below and are never changed while testing the server side.
fs.file-max = 99384
fs.nr_open = 1048576
RLIMIT_NOFILE (ulimit nofile) = 99999
net.nf_conntrack_max = 65535
net.ipv4.ip_local_port_range = 15000 65535 (65535-15000=50535)
1. Limited by RLIMIT_NOFILE (ulimit nofile)
The server side is configured
fs.file-max = 99384
fs.nr_open = 1048576
RLIMIT_NOFILE (ulimit nofile) = 1024
net.nf_conntrack_max = 32767
net.ipv4.ip_local_port_range = 32768 60999 (60999 – 32768 = 28231)
Test result:
Side | Error message | Connections established
Server | Application: EMFILE: Too many open files | 1139
Client | (none) | 1887
The reason the client side counts more connections than the server side is that the client received SYN+ACK from the server and considered the connection established. However, when the server side application called accept() and tried to allocate a file handle, it failed due to the open file limit, so the connection was never established on the server application side.
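To see why the server stops short while the client keeps counting, consider a typical accept loop. The minimal C sketch below is my own illustration of this behavior, not tcpkali's actual code; it listens on the same test port 12345. The kernel has already completed the three-way handshake before accept() runs, so when accept() fails with EMFILE the client still sees an ESTABLISHED connection.

#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

int main(void)
{
    struct sockaddr_in addr;
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(12345);                       /* same test port as above */

    if (listen_fd < 0 ||
        bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(listen_fd, SOMAXCONN) < 0) {
        perror("socket/bind/listen");
        return 1;
    }

    for (;;) {
        int client_fd = accept(listen_fd, NULL, NULL);
        if (client_fd < 0) {
            if (errno == EMFILE || errno == ENFILE) {
                /* No file handle available: the handshake already completed in
                 * the kernel, so the peer still counts an ESTABLISHED connection
                 * even though the application never got a descriptor for it. */
                perror("accept");
                continue;
            }
            break;
        }
        /* keep client_fd open; a real server would hand it to a worker here */
    }
    return 0;
}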
2. Limited by net.nf_conntrack_max
The server side is configured
fs.file-max = 99384
fs.nr_open = 1048576
RLIMIT_NOFILE (ulimit nofile) = 99999
net.nf_conntrack_max = 32767
net.ipv4.ip_local_port_range = 32768 60999 (60999 – 32768 = 28231)
Test result:
Side | Error message | Connections established
Server | /var/log/messages: kernel: nf_conntrack: nf_conntrack: table full, dropping packet | 32757
Client | (none) | 32765
3. Limited by fs.file-max
The server side is configured
fs.file-max = 6000
fs.nr_open = 1048576
RLIMIT_NOFILE (ulimit nofile) = 99999
net.nf_conntrack_max = 32767
net.ipv4.ip_local_port_range = 32768 60999 (60999 – 32768 = 28231)
Test result:
Side | Error message | Connections established
Server (ec2-user) | Application: ENFILE: Too many open files in system; /var/log/messages: kernel: VFS: file-max limit 6000 reached | 5469
Client | (none) | 6585

Side | Error message | Connections established
Server (root) | /var/log/messages: kernel: nf_conntrack: nf_conntrack: table full, dropping packet | 32765
Client | (none) | 32765
We can see that when tcpkali runs on the server side with root privilege, it is not limited by fs.file-max; it is then limited by nf_conntrack_max instead.
Test Client Side
The server side system limits are configured as below and are never changed while testing the client side.
fs.file-max = 99384
fs.nr_open = 1048576
RLIMIT_NOFILE (ulimit nofile) = 99999
net.nf_conntrack_max = 65535
net.ipv4.ip_local_port_range = 32768 60999 (60999-32768 = 28231)
1. Limited by RLIMIT_NOFILE (ulimit nofile)
The client side is configured
fs.file-max = 99384
fs.nr_open = 1048576
RLIMIT_NOFILE (ulimit nofile) = 1024
net.nf_conntrack_max = 32767
net.ipv4.ip_local_port_range = 32768 60999 (60999 – 32768 = 28231)
Test result:
Side | Error message | Connections established
Server | (none) | 1010
Client | Application: EMFILE: Too many open files | 1010
2. Limited by ip_local_port_range
The client side is configured
fs.file-max = 99384
fs.nr_open = 1048576
RLIMIT_NOFILE (ulimit nofile) = 99999
net.nf_conntrack_max = 32767
net.ipv4.ip_local_port_range = 32768 60999 (60999 – 32768 = 28231)
Test result:
Side | Error message | Connections established
Server | (none) | 28231
Client | Application: EADDRINUSE: Address already in use, or EADDRNOTAVAIL: Cannot assign requested address | 28231
3. Limited by net.nf_conntrack_max
The client side is configured
fs.file-max = 99384
fs.nr_open = 1048576
RLIMIT_NOFILE (ulimit nofile) = 99999
net.nf_conntrack_max = 32767
net.ipv4.ip_local_port_range = 15000 65535 (65535 – 15000 = 50535)
Test result:
Side | Error message | Connections established
Server | (none) | 32747
Client | /var/log/messages: kernel: nf_conntrack: nf_conntrack: table full, dropping packet | 32758
4. Limited by fs.file-max
The client side is configured
fs.file-max = 6000
fs.nr_open = 1048576
RLIMIT_NOFILE (ulimit nofile) = 99999
net.nf_conntrack_max = 32767
net.ipv4.ip_local_port_range = 15000 65535 (65535 – 15000 = 50535)
Test result:
Side | Error message | Connections established
Server | (none) | 4778
Client (ec2-user) | Application: ENFILE: Too many open files in system; /var/log/messages: kernel: VFS: file-max limit 6000 reached | 6395

Side | Error message | Connections established
Server | (none) | 32765
Client (root) | /var/log/messages: kernel: VFS: file-max limit 6000 reached; kernel: nf_conntrack: nf_conntrack: table full, dropping packet | 32765
We can see that when tcpkali runs on the client side with root privilege, it is not limited by fs.file-max; it is then limited by nf_conntrack_max instead.
5. Increase all limits
The client side is configured
fs.file-max = 99384
fs.nr_open = 1048576
RLIMIT_NOFILE (ulimit nofile) = 99999
net.nf_conntrack_max = 65535
net.ipv4.ip_local_port_range = 15000 65535 (65535 – 15000 = 50535)
Test result:
Side | Error message | Connections established
Server | (none) | 50002
Client | (none) | 50002
The client could successfully establish 50000 connections to the server.
Summary:
This article explained the configurations that need to be checked and modified on Linux to increase the maximum number of TCP connections, and presented tests on Amazon Linux using the tcpkali tool.
Reference:
[1] Documentation for /proc/sys/fs/* - https://www.kernel.org/doc/Documentation/sysctl/fs.txt
[2] getrlimit man page - http://man7.org/linux/man-pages/man2/getrlimit.2.html
[3] /proc/sys/net/netfilter/nf_conntrack_* Variables - https://www.kernel.org/doc/Documentation/networking/nf_conntrack-sysctl.txt
[4] Apache MPM worker documentation - https://httpd.apache.org/docs/2.4/mod/worker.html
[5] Apache performance tuning - https://www.devside.net/articles/apache-performance-tuning
[6] Tuning NGINX for performance - https://www.nginx.com/blog/tuning-nginx
[7] /proc/sys/net/ipv4/* Variables - https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt