
Tuning Linux maximum TCP connections and testing on Amazon Linux

Summary 

Many users would like to know what the TCP connection limit is on an AWS Linux instance and how to increase the number of connections. This article explains the configurations to check and modify on Linux to increase the maximum number of TCP connections, on both the server side and the client side. Some tests were also performed to show the results with different configurations on Amazon Linux*. 

*Amazon Linux v28.0 


Tuning Linux maximum TCP connections

The AWS network does not limit the number of TCP connections. When users encounter problems with the number of TCP connections, they should check their system configuration first. 

In Linux, a TCP/IP socket is considered an open file. Thus, it requires an open file descriptor, or in GNU/Linux terminology a “file handle”.  

To fine-tune the maximum number of TCP/IP connections on Linux, we should first check the maximum number of file handles. 

In this article, we assume a server listens on a specific TCP port for clients to connect to. Usually users use one or a few clients to test the maximum number of TCP connections. 
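As a minimal illustration of the “socket is a file” point (this is only a sketch, not the test tool used later; port 12345 and the omitted error handling are assumptions for brevity), the C program below accepts TCP connections: socket() and every accept() return a file descriptor, so each established connection counts against the limits discussed below.

/* Minimal sketch: each accepted TCP connection consumes one file descriptor. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

int main(void) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);    /* one file descriptor */
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(12345);                        /* assumed test port */
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));  /* checks omitted for brevity */
    listen(listen_fd, SOMAXCONN);

    for (;;) {
        int conn_fd = accept(listen_fd, NULL, NULL);     /* one more fd per connection */
        if (conn_fd < 0) {
            perror("accept");                            /* EMFILE/ENFILE once limits are hit */
            break;
        }
        printf("accepted connection, fd = %d\n", conn_fd);
        /* keep conn_fd open so the connection stays established */
    }
    return 0;
}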

Below are the system configurations that should be taken care of on the server side. 

Server side tuning 

1. System Wide File Descriptors Limits in Linux  

There is a “file-max” setting that defines a system-wide limit on the number of open files for all processes [1]. The default value is derived from about 10% of the system memory (with a minimum of 8192). The details are in the Linux kernel function files_maxfiles_init() in fs/file_table.c.  

System calls that fail when encountering this limit fail with the error ENFILE. If you get error messages in the kernel log about running out of file handles (look for “VFS: file-max limit <number> reached”), try increasing this value. 

Note that if the process has root privilege (CAP_SYS_ADMIN), it is not bound by the file-max limit, but the same error messages will still appear in the kernel log. 

For example: 

kernel: [ 908.991163] VFS: file-max limit 99580 reached 

You can use the commands below to determine the maximum number of open files on a Linux system. 

# sysctl fs.file-max 
fs.file-max = 99580 
# cat /proc/sys/fs/file-max 
99580 

If users think this value is not big enough, they can change it with 

# sysctl -w fs.file-max=500000 

To make the change permanent, add or change the following line in the file /etc/sysctl.conf. The configuration will be applied during the boot process. 

# echo "fs.file-max=500000" >> /etc/sysctl.conf 

2.  Process Level Maximum Open Files (fs.nr_open and RLIMIT_NOFILE) 

fs.nr_open denotes the maximum number of file handles a process can allocate [1]. This file imposes a ceiling on the value to which the RLIMIT_NOFILE resource limit (described below) can be raised. This ceiling is enforced for both unprivileged and privileged processes.  

The default value is 1024*1024 (1048576), which should be enough for most machines. The actual limit depends on the RLIMIT_NOFILE resource limit. 

To check the fs.nr_open value: 

# sysctl fs.nr_open 
fs.nr_open = 1048576 
# cat /proc/sys/fs/nr_open 
1048576 

The default value should be big enough, but if you still want to modify the value of fs.nr_open: 

# sysctl -w fs.nr_open=2000000 
fs.nr_open = 2000000 

To make the change permanent, add or change the following line in the file /etc/sysctl.conf. The configuration will be applied during the boot process. 

# echo "fs.nr_open=2000000" >> /etc/sysctl.conf 

RLIMIT_NOFILE specifies a value one greater than the maximum file descriptor number that can be opened by this process, and it is inherited by processes forked/spawned by this process. This limit constrains the number of file descriptors that a process may allocate [2]. 

The value has an associated soft and hard limit. The soft limit is the value that the kernel enforces for the corresponding resource. The hard limit acts as a ceiling for the soft limit: an unprivileged process may set only its soft limit to a value in the range from 0 up to the hard limit. 

In summary, 

The RLIMIT_NOFILE soft limit is the effective value for that process right now. The process can raise its own soft limit when it needs more resources, but cannot set the soft limit higher than the hard limit. 

The RLIMIT_NOFILE hard limit is the maximum allowed for a process, set by the superuser/root. This value can be set in the file /etc/security/limits.conf. Think of it as an upper bound for the soft limit. 

fs.nr_open is the upper bound to which the RLIMIT_NOFILE hard limit can be raised. 

Expressed as an inequality: 

RLIMIT_NOFILE soft limit <= RLIMIT_NOFILE hard limit <= fs.nr_open 

If a process opens files or creates sockets beyond the RLIMIT_NOFILE soft limit, functions that allocate a file descriptor fail with errno set to EMFILE.  
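To see these limits from a program’s point of view, here is a minimal C sketch (only an illustration using the standard getrlimit()/setrlimit() calls, not something the tests later require) that reads the current RLIMIT_NOFILE soft and hard limits and raises the soft limit up to the hard limit:

/* Raise the RLIMIT_NOFILE soft limit to the hard limit for the current process. */
#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    printf("soft limit: %llu, hard limit: %llu\n",
           (unsigned long long)rl.rlim_cur, (unsigned long long)rl.rlim_max);

    rl.rlim_cur = rl.rlim_max;          /* the soft limit may go up to the hard limit */
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("setrlimit");            /* EPERM if we tried to exceed the hard limit */
        return 1;
    }
    /* From here on, open()/socket()/accept() fail with EMFILE only after
     * the (larger) soft limit is reached. */
    return 0;
}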

Check RLIMIT_NOFILE (the open files limit) with the following ulimit commands: 

# ulimit  -Sn 
# ulimit  -Hn 

The default open file limits on Amazon Linux are as follows. 

The maximum number of files that can be opened by this process (RLIMIT_NOFILE soft limit): 

# ulimit  -Sn 
1024 

The maximum value that the superuser/root can configure for this process (RLIMIT_NOFILE hard limit): 

# ulimit  -Hn 
4096 

To increase the open file limit: 

1. Modify /etc/security/limits.conf 

# vi /etc/security/limits.conf 

2. Add the two lines below to raise the maximum open file limit to 99999 

*       hard    nofile          99999 
*       soft    nofile          99999 

3. Log out of Linux and log back in. 

4. Verify the new open file limits. 

# ulimit -Sn 
99999 
# ulimit -Hn 
99999 

3. Connection Tracking 

If your Linux system enables the nf_conntrack module (e.g. you are using iptables), it keeps track of which connections are established and puts them into a connection tracking table. You may need to take care of the size of this table. 

By default, Amazon Linux does not enable nf_conntrack. 

If nf_conntrack is not enabled, this check can be skipped. 

To enable nf_conntrack: 

# modprobe nf_conntrack 

If your Linux system enables the nf_conntrack module, you can see the current number of entries in the tracking table with 

# sysctl net.netfilter.nf_conntrack_count 

and its limit using 

# sysctl net.netfilter.nf_conntrack_max 
net.netfilter.nf_conntrack_max = 32768 

The defaults are described in the kernel documentation [3]: 

nf_conntrack_max - INTEGER 

Size of connection tracking table.  Default value is nf_conntrack_buckets value * 4. 

nf_conntrack_buckets - INTEGER 

Size of hash table. If not specified as parameter during module loading, the default size is calculated by dividing total memory by 16384 to determine the number of buckets but the hash table will never have fewer than 32 and limited to 16384 buckets. For systems with more than 4GB of memory it will be 65536 buckets. This sysctl is only writeable in the initial net namespace. 

To increase the size of the connection tracking table: 

/sbin/sysctl -w net.netfilter.nf_conntrack_max=100000 

To make it permanent after reboot, add this value to /etc/sysctl.conf: 

echo net.netfilter.nf_conntrack_max=100000 >> /etc/sysctl.conf 

4. Application Limits 

Users’ applications may have their own connection limits. For example, the Apache HTTP server can configure the maximum number of clients [4][5]. The nginx HTTP server also has a configuration to specify the maximum number of client connections [6]. This is outside the scope of this article. 

Client side tuning 

On the client side, users still need to take care of the three configurations mentioned in the server side tuning: 

1. System Wide File Descriptors Limits in Linux  

2. Process Level Maximum Open Files (fs.nr_open and RLIMIT_NOFILE) 

3. Connection Tracking 

Besides, if users would like to use one or a few clients to create many TCP connections to the server, those connections need different TCP source ports. Be sure to check the configuration below. 

1. Client Side Ephemeral Ports Limits 

When testing maximum connections, users may use one client to generate many connections with different TCP source ports, so they may need to enlarge the client side ephemeral port range. 

ip_local_port_range defines the local port range that is used by TCP and UDP to choose the local port. The first number is the first local ephemeral port number. The second is the last local ephemeral port number [7]. 

If possible, it is better for these two numbers to have different parity (one even and one odd value). 

The last local port number can be at most 65535, since TCP port numbers are 16-bit values. 

The default values are 32768 and 60999 respectively, which gives a total of 60999 – 32768 = 28231 ports.  

To check the current ephemeral port range: 

# sysctl net.ipv4.ip_local_port_range 
net.ipv4.ip_local_port_range = 32768    60999

For example, we could increase the range to 

# sudo sysctl -w net.ipv4.ip_local_port_range="15000 65535" 

To make it permanent after reboot, add these values to /etc/sysctl.conf: 

echo "net.ipv4.ip_local_port_range=15000 65535" >> /etc/sysctl.conf 

With this range, the total number of usable ephemeral ports is 65535 – 15000 = 50535. 

Below are two situations in which an application hits this limit.  

If this happens when the application calls bind(), it gets errno 98, EADDRINUSE: Address already in use. 

If this happens when the application calls connect(), it gets errno 99, EADDRNOTAVAIL: Cannot assign requested address. 
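For illustration, the minimal C sketch below (the server address 10.0.0.10 and port 12345 are placeholder assumptions) opens outgoing connections in a loop and distinguishes the failure modes discussed so far by checking errno after socket() and connect():

/* Open outgoing TCP connections until a limit is hit, and report which one. */
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    struct sockaddr_in srv;
    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    srv.sin_port = htons(12345);                      /* assumed server port */
    inet_pton(AF_INET, "10.0.0.10", &srv.sin_addr);   /* assumed server IP */

    for (int i = 0; ; i++) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0) {
            /* EMFILE: RLIMIT_NOFILE soft limit; ENFILE: fs.file-max */
            fprintf(stderr, "socket() failed after %d connections: %s\n",
                    i, strerror(errno));
            break;
        }
        if (connect(fd, (struct sockaddr *)&srv, sizeof(srv)) < 0) {
            /* EADDRNOTAVAIL: no free port left in ip_local_port_range */
            fprintf(stderr, "connect() failed after %d connections: %s\n",
                    i, strerror(errno));
            close(fd);
            break;
        }
        /* keep fd open so the connection stays established */
    }
    pause();                                          /* hold the connections */
    return 0;
}

In the tests below, tcpkali plays this client role; the sketch only shows where each errno value comes from.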

Test Maximum TCP Connections 

The tests below exercise the above limits on both the server side and the client side. When testing the server side, all client side limits are configured above the maximum connection number we would like to test, so we can assume there are no limits on the client side, and vice versa when testing the client side. 

To test maximum TCP connections, I launched two Amazon Linux t2.micro instances in the same subnet. One of them is the server and the other is the client. I use tcpkali from GitHub; tcpkali is a high performance TCP and WebSocket load generator and sink. 

https://github.com/machinezone/tcpkali

tcpkali is smart enough to check some system limits before conducting the test. If tcpkali detects that a limit is smaller than the number of connections it is going to create, it prints an error message and does not start the test. However, since my test is to see what happens when the server/client side really hits the system limits, I commented out that part of the tcpkali source code before installing it. 

In src/tcpkali.c, wrap the part below with “#if 0” and “#endif”: 

#if 0 
    /* 
     * Check that we'll have a chance to report latency 
     */ 
    if(conf.latency_window) { 
        if(conf.latency_window > conf.test_duration) { 
            fprintf(stderr, "--statsd-latency-window=%gs exceeds --duration=%gs.\n", 
                conf.latency_window, conf.test_duration); 
            exit(EX_USAGE); 
        } 
        if(conf.latency_window >= conf.test_duration / 2) { 
            warning("--statsd-latency-window=%gs might result in too few latency reports.\n", conf.latency_window); 
        } 
        if(conf.latency_window < 0.5) { 
            fprintf(stderr, "--statsd-latency-window=%gs is too small. Try 0.5s.\n", 
                conf.latency_window); 
            exit(EX_USAGE); 
        } 
    } 
#endif 

Build and install tcpkali on both instances. 

Test steps: 

The server side listens on port 12345 for 3 hours. Command: 

./tcpkali -l12345 -T3h -v 

The client side tries to create 50000 TCP connections to the server, with a total duration of 600 s and a connect rate of 100 connections/s. 

./tcpkali --connections 50000 -T 600 --connect-rate=100 <server_IP>:12345 

I use another console to keep monitoring the connection count. Since the count is sampled every second, there may be some difference between the server side and client side values. 

while true; do echo "ESTABLISHED TCP connections with peer:"; ss | grep <peer_IP> | grep ESTAB | wc -l; sleep 1; done 

Test Server Side 

The client side system limits are configured as below and are never changed while testing the server side. 

fs.file-max = 99384 
fs.nr_open = 1048576
RLIMIT_NOFILE (ulimit nofile) = 99999  
net.nf_conntrack_max = 65535 
net.ipv4.ip_local_port_range = 15000    65535 (65535-15000=50535) 

1. Limited by RLIMIT_NOFILE (ulimit nofile) 

The server side is configured 

fs.file-max = 99384 
fs.nr_open = 1048576 
RLIMIT_NOFILE (ulimit nofile) = 1024 
net.nf_conntrack_max = 32767 
net.ipv4.ip_local_port_range = 32768    60999 (60999 – 32768 = 28231) 

Test result: 

Server: 1139 connections established. Error message from the application: EMFILE: Too many open files. 
Client: 1887 connections established. No error message. 

The reason the client side has more connections than the server side is that the client received SYN+ACK from the server and considered the connection established. However, when the server side called accept() and tried to allocate a file handle, it failed due to the open file limit, so the connection was never established at the application level.  

2. Limited by net.nf_conntrack_max 

The server side is configured 

fs.file-max = 99384 
fs.nr_open = 1048576 
RLIMIT_NOFILE (ulimit nofile) = 99999 
net.nf_conntrack_max = 32767 
net.ipv4.ip_local_port_range = 32768    60999 (60999 – 32768 = 28231) 

Test result: 

Server: 32757 connections established. Error message in /var/log/messages: kernel: nf_conntrack: nf_conntrack: table full, dropping packet. 
Client: 32765 connections established. No error message. 

3. Limited by fs.file-max 

The server side is configured 

fs.file-max = 6000 
fs.nr_open = 1048576 
RLIMIT_NOFILE (ulimit nofile) = 99999 
net.nf_conntrack_max = 32767 
net.ipv4.ip_local_port_range = 32768    60999 (60999 – 32768 = 28231) 

Test result: 

tcpkali on the server run as ec2-user: 
Server: 5469 connections established. Error messages: application: ENFILE: Too many open files in system; /var/log/messages: kernel: VFS: file-max limit 6000 reached. 
Client: 6585 connections established. No error message. 

tcpkali on the server run as root: 
Server: 32765 connections established. Error message in /var/log/messages: kernel: nf_conntrack: nf_conntrack: table full, dropping packet. 
Client: 32765 connections established. No error message. 

We can see that when tcpkali is executed on the server side with root privilege, it is not limited by fs.file-max; it is then limited by nf_conntrack_max. 

Test Client Side 

The server side system limits are configured as below and are never changed while testing the client side. 

fs.file-max = 99384 
fs.nr_open = 1048576 
RLIMIT_NOFILE (ulimit nofile) = 99999 
net.nf_conntrack_max = 65535 
net.ipv4.ip_local_port_range = 32768    60999 (60999-32768 = 28231) 

1. Limited by RLIMIT_NOFILE (ulimit nofile) 

The client side is configured 

fs.file-max = 99384 
fs.nr_open = 1048576 
RLIMIT_NOFILE (ulimit nofile) = 1024 
net.nf_conntrack_max = 32767 
net.ipv4.ip_local_port_range = 32768    60999 (60999 – 32768 = 28231) 

Test result: 

Server: 1010 connections established. No error message. 
Client: 1010 connections established. Error message from the application: EMFILE: Too many open files. 

2. Limited by ip_local_port_range 

The client side is configured 

fs.file-max = 99384 
fs.nr_open = 1048576 
RLIMIT_NOFILE (ulimit nofile) = 99999 
net.nf_conntrack_max = 32767 
net.ipv4.ip_local_port_range = 32768    60999 (60999 – 32768 = 28231) 

Test result: 

Server: 28231 connections established. No error message. 
Client: 28231 connections established. Error message from the application: EADDRINUSE: Address already in use or EADDRNOTAVAIL: Cannot assign requested address. 

3. Limited by net.nf_conntrack_max 

The client side is configured 

fs.file-max = 99384 
fs.nr_open = 1048576 
RLIMIT_NOFILE (ulimit nofile) = 99999 
net.nf_conntrack_max = 32767 
net.ipv4.ip_local_port_range = 15000    65535 (65535 – 15000 = 50535) 

Test result: 

Server: 32747 connections established. No error message. 
Client: 32758 connections established. Error message in /var/log/messages: kernel: nf_conntrack: nf_conntrack: table full, dropping packet. 

4. Limited by fs.file-max 

The client side is configured 

fs.file-max = 6000 
fs.nr_open = 1048576 
RLIMIT_NOFILE (ulimit nofile) = 99999 
net.nf_conntrack_max = 32767 
net.ipv4.ip_local_port_range = 15000     65535  (65535 – 15000 = 50535) 

Test result: 

tcpkali on the client run as ec2-user: 
Server: 4778 connections established. No error message. 
Client: 6395 connections established. Error messages: application: ENFILE: Too many open files in system; /var/log/messages: kernel: VFS: file-max limit 6000 reached. 

tcpkali on the client run as root: 
Server: 32765 connections established. No error message. 
Client: 32765 connections established. Error messages in /var/log/messages: kernel: VFS: file-max limit 6000 reached; kernel: nf_conntrack: nf_conntrack: table full, dropping packet. 

We can see that when tcpkali is executed on the client side with root privilege, it is not limited by fs.file-max; it is then limited by nf_conntrack_max. 

5. Increase all limits 

The client side is configured 

fs.file-max = 99384 
fs.nr_open = 1048576 
RLIMIT_NOFILE (ulimit nofile) = 99999 
net.nf_conntrack_max = 65535 
net.ipv4.ip_local_port_range = 15000     65535  (65535 – 15000 = 50535) 

Test result: 

Server: 50002 connections established. No error message. 
Client: 50002 connections established. No error message. 

The client could successfully establish 50000 connections to the server. 

Summary: 

This article explained the configurations to check and modify on Linux to increase the maximum number of TCP connections, and demonstrated tests on Amazon Linux using the tcpkali tool. 

Reference: 

[1] Documentation for /proc/sys/fs/* - https://www.kernel.org/doc/Documentation/sysctl/fs.txt 
[2] getrlimit man page - http://man7.org/linux/man-pages/man2/getrlimit.2.html 
[3] /proc/sys/net/netfilter/nf_conntrack_* Variables - https://www.kernel.org/doc/Documentation/networking/nf_conntrack-sysctl.txt 
[4] Apache MPM worker - https://httpd.apache.org/docs/2.4/mod/worker.html 
[5] Apache Performance Tuning - https://www.devside.net/articles/apache-performance-tuning 
[6] Tuning NGINX - https://www.nginx.com/blog/tuning-nginx 
[7] /proc/sys/net/ipv4/* Variables - https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt 
