In PART 1, I compared default installations of FreeBSD 14.2 and Debian 10.2 and performed some basic network tuning of FreeBSD to get closer to Debian's TCP throughput, which, based on my testing, is higher than the network throughput of FreeBSD. The testing in PART 1 was performed on a Cisco UCS enterprise server with 2x Intel Xeon E5-2680 v4 CPUs @ 2.40GHz running ESXi 8.0.3. This is approximately a 9-year-old server with Intel Xeon server-family CPUs.
In this PART 2, I will continue the network deep dive into network throughput tuning with some additional context and more advanced network tuning of FreeBSD and Debian. The tests will be performed on a 9-year-old consumer PC (Intel NUC 6i3SYH) with 1x Intel Core i3-6100U CPU @ 2.30GHz running ESXi 8.0.3.
The VM hardware used for the iperf tests has the following specification:
I run iperf -s on VM01 and iperf -c [IP-OF-VM01] -t600 -i5 on VM02. I use the iperf parameters -P1, -P2, -P3, and -P4 to test the impact of more parallel client threads and watch the results, because I realized that more parallel client threads have a positive impact on FreeBSD network throughput and no, or a slightly negative, impact on Debian (Linux).
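For clarity, the test invocations look like this (the IP address is a placeholder and iperf2 option syntax is assumed):

# VM01 - iperf server
iperf -s

# VM02 - iperf client: run for 600 seconds, report every 5 seconds, N parallel threads
iperf -c [IP-OF-VM01] -t 600 -i 5 -P 1
# repeat with -P 2, -P 3 and -P 4 to compare parallel client thread counts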
I test network throughput with and without the following hardware offload capabilities, which can be toggled as shown in the sketch below.
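As a minimal sketch, assuming the offloads in question are TCP segmentation offload (TSO), large receive offload (LRO), and TX/RX checksum offload, they can be toggled at runtime like this (vmx0 and ens192 are example interface names):

# FreeBSD - disable TSO, LRO and checksum offload on the vmx0 interface
ifconfig vmx0 -tso -lro -txcsum -rxcsum
# FreeBSD - re-enable them
ifconfig vmx0 tso lro txcsum rxcsum

# Debian - the same idea with ethtool
ethtool -K ens192 tso off lro off tx off rx off
ethtool -K ens192 tso on lro on tx on rx on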
When two VMs talk to each other on the same ESXi host:
RSS (Receive Side Scaling) is another important network technology for achieving high network throughput. RSS spreads incoming network traffic across multiple CPU cores by using a hash of the packet headers (IP, TCP, UDP).
Without RSS:
With RSS:
In this exercise we test network throughput of single-vCPU virtual machines, therefore RSS would not help us anyway. I will focus on multi-vCPU VMs in the future.
Anyway, it seems that RSS is not implemented in FreeBSD's vmx driver for the VMXNET3 network card and is only partly implemented in the Linux VMXNET3 driver. The reason is that full RSS would add overhead inside a VM.
Implementing RSS would:
In most cases, multiqueue + interrupt steering gives enough performance inside a VM without the cost of full RSS.
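For reference, a quick way to see how many queues the guest NIC actually uses (interface and device names are examples):

# Debian: show how many RX/TX channels (queues) the vmxnet3 NIC exposes
ethtool -l ens192

# FreeBSD: per-queue interrupt counters of the vmx NIC show how many queues are active
vmstat -i | grep vmx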
FreeBSD blacklists MSI/MSI-X (Message Signaled Interrupts) for some virtual and physical devices to avoid bugs or instability. In VMware VMs, this means that MSI-X (which allows multiple interrupt vectors per device) is disabled by default, limiting performance — especially for multiqueue RX/TX and RSS (Receive Side Scaling).
With MSI-X enabled, you get:
This setting affects all PCI devices, not just vmx, so it should be tested carefully in production VMs. On ESXi 6.7+ and FreeBSD 12+, MSI-X is generally stable for vmxnet3.
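Based on common FreeBSD tuning guides, such as the one referenced later in this article, the loader tunable that controls this behavior is hw.pci.honor_msi_blacklist; a sketch of disabling the blacklist would be:

# /boot/loader.conf - do not honor the MSI/MSI-X blacklist (affects all PCI devices)
hw.pci.honor_msi_blacklist="0"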
This is another potential improvement for multi-vCPU VMs, but it should not help us in a single-vCPU VM.
I have tested it and it really does not help in a single-vCPU VM. I will definitely test this setting along with RSS and RX/TX queues in future parts of this series of articles about FreeBSD network throughput, when I will test the impact of multiple vCPUs and network queues.
By default, FreeBSD uses a single thread to process all network traffic in accordance with the strong ordering requirements found in some protocols, such as TCP.
In order to increase potential packet processing concurrency, net.isr.maxthreads can be defined as "-1", which will automatically create as many netisr threads as there are CPU cores in the machine. Now, all CPU cores can be used for packet processing and the system will not be limited to a single thread running on a single CPU core.
As we are testing TCP network throughput on a single-CPU-core machine, this is not going to help us.
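For reference, the netisr tunables are read at boot time, so they belong in /boot/loader.conf; a minimal sketch (net.isr.bindthreads is an additional tunable commonly set together with it to pin the threads to cores):

# /boot/loader.conf - one netisr thread per CPU core (default is a single thread)
net.isr.maxthreads="-1"
# pin each netisr thread to its CPU core
net.isr.bindthreads="1"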
The net.isr.defaultqlimit setting in FreeBSD controls the queue length for Interrupt Service Routines (ISR), which are part of the network packet processing pipeline. Specifically, this queue holds incoming network packets that are being processed in the interrupt handler before being passed up to higher layers (e.g., the TCP/IP stack). The ISR queues help ensure that network packets are processed efficiently without being dropped prematurely when the network interface card (NIC) is receiving packets at a high rate.
We can experiment with different values. The default value is 256, but for high-speed networks, we might try values like 512 or 1024.
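It is set as a boot-time tunable in /boot/loader.conf; for example:

# /boot/loader.conf - larger netisr input queue limit (default 256)
net.isr.defaultqlimit="1024"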
The net.isr.dispatch sysctl in FreeBSD controls how inbound network packets are processed in relation to the netisr (network interrupt service routines) system. This is central to FreeBSD's network stack parallelization.
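It accepts the values direct, hybrid, and deferred; on recent FreeBSD releases it can be inspected and changed at runtime:

# show the current dispatch policy (direct, hybrid or deferred)
sysctl net.isr.dispatch
# queue packets to the netisr threads instead of processing them directly in interrupt context
sysctl net.isr.dispatch=deferred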
The net.link.ifqmaxlen sysctl in FreeBSD controls the maximum length of the interface output queue, i.e., how many packets can be queued for transmission on a network interface before packets start getting dropped.
Every network interface in FreeBSD has an output queue for packets that are waiting to be transmitted. net.link.ifqmaxlen defines the default maximum number of packets that can be held in this queue. If the queue fills up (e.g., due to a slow link or CPU bottleneck), additional packets are dropped until space becomes available again.
The default value is typically 50, which can be too low for high-throughput scenarios.
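net.link.ifqmaxlen is also a boot-time tunable; a common suggestion from FreeBSD tuning guides is something like:

# /boot/loader.conf - larger default interface output queue (default 50)
net.link.ifqmaxlen="2048"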
FreeBSD lets you set the number of queues via loader.conf if supported by the driver.
With only 1 core, there's no benefit (and typically no support) for having more than 1 TX and 1 RX queue. FreeBSD’s vmx driver will automatically limit the number of queues to match the number of cores available.
At the moment we are testing network throughput of single-vCPU machines; therefore, we do not tune this setting.
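For completeness, with the iflib-based vmx driver the queue counts would be overridden per device in /boot/loader.conf; this is a sketch only, see iflib(4) and vmx(4) for your FreeBSD release:

# /boot/loader.conf - example: force 4 RX and 4 TX queues on the first vmx NIC
dev.vmx.0.iflib.override_nrxqs="4"
dev.vmx.0.iflib.override_ntxqs="4"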
In this chapter I will consider additional FreeBSD network tuning described, for example, at https://calomel.org/freebsd_network_tuning.html and in other resources on the Internet.
soreceive_stream() can significantly reduce CPU usage and lock contention when receiving fast TCP streams. Additional gains are obtained when the receiving application, like a web server, is using SO_RCVLOWAT to batch up some data before a read (and wakeup) is done.
How to enable soreceive_stream?
Add the following line to /boot/loader.conf:
net.inet.tcp.soreceive_stream="1" # (default 0)
How to check the status of soreceive_stream?
sysctl -a | grep soreceive
soreceive_stream is disabled by default. During my testing I have not seen any increase in network throughput with soreceive_stream enabled, therefore we can keep it at the default (disabled).
There are several TCP congestion control algorithms in FreeBSD and Debian. FreeBSD offers cubic (default), newreno, htcp (Hamilton TCP), vegas, cdg, and chd. Debian offers cubic (default) and reno. Cubic is the default TCP congestion control algorithm in both FreeBSD and Debian; I tested all of these algorithms in FreeBSD and cubic is optimal.
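For reference, this is how an algorithm can be listed and switched on both systems (htcp and reno are just examples):

# FreeBSD - list available algorithms, load H-TCP and make it active
sysctl net.inet.tcp.cc.available
kldload cc_htcp
sysctl net.inet.tcp.cc.algorithm=htcp

# Debian - the Linux equivalents
sysctl net.ipv4.tcp_available_congestion_control
sysctl -w net.ipv4.tcp_congestion_control=reno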
FreeBSD 14.2 currently supports three TCP stacks.
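These are the default freebsd stack plus the loadable RACK and BBR stacks. A sketch of how to list them and switch the default for new connections:

# list the TCP stacks that are compiled in or loaded (freebsd is the default)
sysctl net.inet.tcp.functions_available
# load the RACK stack and make it the default for new connections
kldload tcp_rack
sysctl net.inet.tcp.functions_default=rack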
I found out that the default FreeBSD TCP stack has the highest throughput in a data center network, therefore changing the TCP stack does not help to increase network throughput.