Sunday, March 30, 2025

Network benchmark (iperf) of 10Gb Data Center Interconnect

I wanted to test the 10Gb Ethernet link I got as a data center interconnect between two datacenters. I generally do not trust anything I have not tested.

If you want to test something, it is important to have a good testing methodology and toolset.

Toolset

OS: FreeBSD 14.2 is IMHO the best x86-64 operating system in terms of networking. Your mileage may vary.

Network benchmark testing tool: iperf (iperf2) is a well-known tool for benchmarking network performance and bandwidth.

Hypervisor: VMware ESXi 8.0.3 is a best-in-class hypervisor for testing various virtual machines.

Methodology

I used two Virtual Machines. In the end, I will test network throughput between two VMs, one at each end of the network link (DC Interconnect). However, before the final test (Test 4) of DC interconnect throughput, I will test network throughput (Test 1) within a single VM to measure localhost throughput, (Test 2) between VMs within a single hypervisor (ESXi) host to avoid the physical network, and (Test 3) between VMs across two hypervisors (ESXi) within a single VLAN in one datacenter to measure local L2 throughput.
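All four tests follow the same pattern; only the -P (parallel connections) and -w (TCP window size) parameters vary. The generic command pair looks like this (placeholders in angle brackets are mine):

iperf server command: iperf -s
iperf client command: iperf -c <server-ip-or-localhost> [-P <parallel-connections>] [-w <tcp-window-size>] -t 60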

Results

Test 1: Network throughput within the same VM to test localhost throughput

The VMware Virtual Machines have the following hardware specification:

  • 8 vCPU (INTEL XEON GOLD 6544Y @ 3.6 Ghz)
  • 8 GB RAM
  • 8 GB vDisk
  • 1 vNIC (vmxnet) 
1 iperf connection / 2 CPU Threads (-P not specified, default setting in use)
iperf server command: iperf -s
iperf client command: iperf -c localhost -t 60
Network Throughput: 75.4Gb/s - 83Gb/s
CPU usage on server/client: 23%
MEM usage on server/client: ~500MB 
 
2 iperf connections / 4 CPU Threads (-P 2)
iperf server command: iperf -s
iperf client command: iperf -c localhost -P 2 -t 60
Network Throughput: 90.8Gb/s - 92Gb/s
CPU usage on server/client: 28%
MEM usage on server/client: ~500MB

4 iperf connections / 8 CPU Threads (-P 4)
iperf server command: iperf -s
iperf client command: iperf -c localhost -P 4 -t 60
Network Throughput: 88.5Gb/s - 89.1Gb/s
CPU usage on server/client: 29%
MEM usage on server/client: ~500MB 
 
8 iperf connections / 16 CPU Threads (-P 8)
iperf server command: iperf -s
iperf client command: iperf -c localhost -P 8 -t 60
Network Throughput: 91.6Gb/s - 95.3Gb/s
CPU usage on server/client: 30%
MEM usage on server/client: ~500MB 
 
Tests with Higher TCP Window Size (800kB)
 
1 iperf connection / 2 CPU Threads (-P not specified, default setting in use; -w 800k)
iperf server command: iperf -s
iperf client command: iperf -c localhost -w 800k -t 60
Network Throughput: 69.8Gb/s - 81.0Gb/s
CPU usage on server/client: 28%
MEM usage on server/client: ~500MB
 
2 iperf connections / 4 CPU Threads (-P 2 -w 800k)
iperf server command: iperf -s
iperf client command: iperf -c localhost -P 2 -w 800k -t 60
Network Throughput: 69.8Gb/s - 69.9Gb/s
CPU usage on server/client: 28%
MEM usage on server/client: ~500MB
 
4 iperf connections / 8 CPU Threads (-P 4 -w 800k)
iperf server command: iperf -s
iperf client command: iperf -c localhost -P 4 -w 800k -t 60
Network Throughput: 69.2Gb/s - 70.0Gb/s
CPU usage on server/client: 28%
MEM usage on server/client: ~500MB
 
8 iperf connections / 16 CPU Threads (-P 8 -w 800k)
iperf server command: iperf -s
iperf client command: iperf -c localhost -P 8 -w 800k -t 60
Network Throughput: 72.6Gb/s - 74.0Gb/s
CPU usage on server/client: 28%
MEM usage on server/client: ~500MB

Test 2: Network throughput between VMs within a single hypervisor (no physical network)

VMware Virtual Machines with the following hardware specification:

  • 8 vCPU (INTEL XEON GOLD 6544Y @ 3.6 Ghz)
  • 8 GB RAM
  • 8 GB vDisk
  • 1 vNIC (vmxnet) 
1 iperf connection / 2 CPU Threads (-P not specified, default setting in use)
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -t 60
Network Throughput: 6.5Gb/s - 6.71Gb/s
CPU usage on server: 70%
CPU usage on client: 30-50%
MEM usage on server/client: ~500MB 
 
2 iperf connections / 4 CPU Threads (-P 2)
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -P 2 -t 60
Network Throughput: 8.42Gb/s - 8.62Gb/s
CPU usage on server: ~33%
CPU usage on client: ~30%
MEM usage on server/client: ~500MB
 
4 iperf connections / 8 CPU Threads (-P 4)
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -P 4 -t 60
Network Throughput: 19.5Gb/s - 20.2Gb/s
CPU usage on server: 85%
CPU usage on client: 48%
MEM usage on server/client: ~500MB
 
 

8 iperf connections / 16 CPU Threads (-P 8)
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -P 8 -t 60
Network Throughput: 17.1Gb/s - 18.4Gb/s
CPU usage on server: ~85%
CPU usage on client: ~30%
MEM usage on server/client: ~500MB
 

Tests with Higher TCP Window Size (800kB)
 
1 iperf connection / 2 CPU Threads (-P not specified, default setting in use; -w 800k)
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -w 800k -t 60
Network Throughput: 6.57Gb/s - 6.77Gb/s
CPU usage on server: 24%
CPU usage on client: 24%
MEM usage on server/client: ~500MB
 
2 iperf connections / 4 CPU Threads (-P 2 -w 800k)
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -P 2 -w 800k -t 60
Network Throughput: 7.96Gb/s - 8.0Gb/s
CPU usage on server: ~30%
CPU usage on client: ~28%
MEM usage on server/client: ~500MB
 
4 iperf connections / 8 CPU Threads (-P 4 -w 800k)
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -P 4 -w 800k -t 60
Network Throughput: 15.8Gb/s - 18.8Gb/s
CPU usage on server: ~85%
CPU usage on client: ~40%
MEM usage on server/client: ~500MB
 
8 iperf connections / 16 CPU Threads (-P 8 -w 800k)
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -P 8 -w 800k -t 60
Network Throughput: 19.1Gb/s - 22.8Gb/s
CPU usage on server: ~98%
CPU usage on client: ~30%
MEM usage on server/client: ~500MB
 

Test 3: Network throughput between VMs across two hypervisors within a VLAN (25Gb switch ports) in one DC

The VMware Virtual Machines have the following hardware specification:

  • 8 vCPU (INTEL XEON GOLD 6544Y @ 3.6 Ghz)
  • 8 GB RAM
  • 8 GB vDisk
  • 1 vNIC (vmxnet) - connected to 25Gb physical switch ports
1 iperf connection / 2 CPU Threads (-P not specified, default setting in use)
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -t 60
Network Throughput: 6.1Gb/s - 6.34Gb/s
CPU usage on server: 23%
CPU usage on client: 17%
MEM usage on server/client: ~500MB 
 
2 iperf connections / 4 CPU Threads (-P 2)
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -P 2 -t 60
Network Throughput: 9.31Gb/s - 10.8Gb/s
CPU usage on server: ~43%
CPU usage on client: ~30%
MEM usage on server/client: ~500MB
 
4 iperf connections / 8 CPU Threads (-P 4)
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -P 4 -t 60
Network Throughput: 19.5Gb/s - 20.2Gb/s
CPU usage on server: 85%
CPU usage on client: 48%
MEM usage on server/client: ~500MB

8 iperf connections / 16 CPU Threads (-P 8)
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -P 8 -t 60
Network Throughput: 17.1Gb/s - 18.4Gb/s
CPU usage on server: ~80%
CPU usage on client: ~50%
MEM usage on server/client: ~500MB

Tests with Higher TCP Window Size (800kB)
 
1 iperf connection / 2 CPU Threads (-P not specified, default setting in use; -w 800k)
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -w 800k -t 60
Network Throughput: 6.11Gb/s - 6.37Gb/s
CPU usage on server: 16%
CPU usage on client: 22%
MEM usage on server/client: ~500MB
 
2 iperf connections / 4 CPU Threads (-P 2 -w 800k)
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -P 2 -w 800k -t 60
Network Throughput: 9.81Gb/s - 10.9Gb/s
CPU usage on server: ~39%
CPU usage on client: ~25%
MEM usage on server/client: ~500MB
 
4 iperf connections / 8 CPU Threads (-P 4 -w 800k)
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -P 4 -w 800k -t 60
Network Throughput: 16.5Gb/s - 19.8Gb/s
CPU usage on server: ~85%
CPU usage on client: ~40%
MEM usage on server/client: ~500MB
 
8 iperf connections / 16 CPU Threads (-P 8 -w 800k)
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -P 8 -w 800k -t 60
Network Throughput: 17.7Gb/s - 18.2Gb/s
CPU usage on server: ~80%
CPU usage on client: ~50%
MEM usage on server/client: ~500MB

Test 4: Network throughput between VMs on two hypervisors in two interconnected VLANs across two DCs

The VMware Virtual Machines have the following hardware specification:

  • 8 vCPU (INTEL XEON GOLD 6544Y @ 3.6 Ghz)
  • 8 GB RAM
  • 8 GB vDisk
  • 1 vNIC (vmxnet) 
iperf server command: iperf -s
iperf client command: iperf -c 10.202.201.6 -P 4 -t 60
Network Throughput: 9.74 Gb/s

Conclusion

Network throughput requires CPU cycles; therefore, the number of CPU cores matters.
 
The iperf client by default uses one connection for generating network traffic, where each connection uses 2 vCPUs (hyper-threading threads). In this default configuration I was able to achieve ~6.65 Gb/s in a VM with at least 2 vCPUs, which is not enough to test a 10Gb/s datacenter interconnect.
 
With parameter -P 4, four parallel iperf client connections are initiated, where each connection uses 2 vCPUs (hyper-threading threads); therefore it can leverage all 8 vCPUs we have in the testing VM.
 
With parameter -P 8, eight parallel iperf client connections are initiated, where each connection uses 2 vCPUs (hyper-threading threads); therefore it could leverage 16 vCPUs. But as we have only 8 vCPUs in our test machine, it only puts more stress on the existing CPUs and can therefore have a negative impact on overall network throughput.
 
The best practice is to use -P 4 for the iperf client on a machine with 8 CPUs, as the iperf client connections can be balanced across all 8 available CPUs. If you have more CPUs available, parameter -P should be half the number of available CPUs.
  • A 1 vCPU VM can achieve network traffic of up to 5.83 Gb/s. During such traffic the CPU is fully used (100% usage), and the maximum single-connection iperf throughput of 6.65 Gb/s cannot be achieved due to the CPU constraint.
  • A 2 vCPU VM can achieve network traffic of up to 6.65 Gb/s. During such traffic the CPU is fully used (100% usage).
  • A 4 vCPU VM with -P 2 is necessary to achieve network traffic up to 10 Gb/s.
  • An 8 vCPU VM with -P 4 is necessary to achieve network traffic over 10 Gb/s. These 8 threads can generate 20 Gb/s, which is good enough to test my 10Gb/s data center interconnect.
Another iperf parameter which in theory could improve network throughput is -w, which defines the TCP Window Size. By default, iperf uses a TCP Window Size between 32kB and 64kB. Increasing the TCP Window Size to 800kB (-w 800k) can slightly improve (~10%) performance under higher CPU stress (-P 8 = 8 processes / 16 threads) between VMs. However, the higher TCP Window Size (-w 800k) has a negative impact (in some cases almost 30%) on localhost network throughput.
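If you want to see what TCP buffer sizes your FreeBSD guest actually uses before experimenting with -w, you can query the standard FreeBSD TCP sysctl knobs (a quick check, not a tuning recommendation):

sysctl kern.ipc.maxsockbuf net.inet.tcp.sendspace net.inet.tcp.recvspace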

What real network throughput did I measure during this testing exercise?

Localhost network throughput is significantly higher than network throughput between Virtual Machines or across the physical network and servers. We can achieve between 75 Gb/s and 95 Gb/s on localhost because the traffic does not need to traverse virtual and physical hardware. It is logical that virtual and physical hardware introduces some bottlenecks.
 
Network throughput between VMs within a single hypervisor can achieve 6.5 Gb/s with a single connection (two threads), up to 22.8 Gb/s with eight connections / sixteen threads and a higher TCP Window Size (800kB), and up to 20.2 Gb/s with four connections / eight threads and the default TCP Window Size.

Network throughput between VMs within a VLAN (25 Gb switch ports) in one data center can achieve up to 20.2 Gb/s (four connections / eight threads and the default TCP Window Size).
 
If you need higher throughput than 20 Gb/s between VMware virtual machines, more CPU cores and special performance tuning of the vNIC/vmxnet driver would be required. Such performance tuning would involve enabling Jumbo Frames in the guest OS (MTU 9000, ifconfig_vmx0="inet <IP> netmask <NETMASK> mtu 9000"), increasing network buffers in the FreeBSD kernel (kern.ipc.maxsockbuf, net.inet.tcp.sendspace, net.inet.tcp.recvspace=4194304), enabling TCP offloading (ifconfig_vmx0="inet <IP> netmask <NETMASK> mtu 9000 txcsum rxcsum tso4 tso6 lro"), tuning interrupt moderation, and using multiple queues aka RSS (sysctl net.inet.rss.enabled=1, sysctl net.inet.rss.bits=4). Fortunately, 20 Gb/s of throughput is good enough to test my 10 Gb data center interconnect.
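For reference, a minimal sketch of such tuning in a FreeBSD guest could look like the snippet below. It is untested in this benchmark, the buffer values are examples only, and the interface name, IP, and netmask are placeholders; the vSwitch and physical switches must also allow MTU 9000.

# /etc/rc.conf - jumbo frames + TCP offloading on the vmxnet interface
ifconfig_vmx0="inet <IP> netmask <NETMASK> mtu 9000 txcsum rxcsum tso4 tso6 lro"

# /etc/sysctl.conf - larger network buffers (example values)
kern.ipc.maxsockbuf=16777216
net.inet.tcp.sendspace=4194304
net.inet.tcp.recvspace=4194304

# Multiple queues aka RSS - requires RSS support in the kernel, and these
# tunables may need to go into /boot/loader.conf instead of sysctl.conf
net.inet.rss.enabled=1
net.inet.rss.bits=4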

Network throughput between VMs across the 10 Gb data center interconnect can achieve 9.74 Gb/s (four iperf connections / eight vCPUs in use). 9.74 Gb/s of TCP throughput over a 10 Gb/s data center Ethernet interconnect is an acceptable result.

Thursday, March 20, 2025

VMware PowerCLI (PowerShell) on Linux

VMware PowerCLI is a very handy and flexible automation tool that allows automation of almost all VMware features. It is based on Microsoft PowerShell. I do not have any Microsoft Windows system in my home lab, but I would still like to use PowerShell. Fortunately, Microsoft PowerShell Core is available for Linux. Here is my latest runbook on how to leverage PowerCLI on a Linux management workstation using Docker application packaging.

Install Docker on your Linux Workstation

This is out of scope of this runbook.

Pull the official and verified Microsoft PowerShell image

sudo docker pull mcr.microsoft.com/powershell:latest

Now you can run the PowerShell container interactively (-i) with an allocated pseudo-TTY (-t). The option --rm stands for "automatically remove the container when it exits".

List container images

sudo docker image ls

Run powershell container

sudo docker run --rm -it mcr.microsoft.com/powershell

You can also skip the explicit image pull and just run the PowerShell container; the image will be pulled automatically on the first run.

Install PowerCLI in PowerShell

Install-Module -Name VMware.PowerCLI -Scope CurrentUser -Force
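Note that with --rm the container, including the module you just installed, is discarded on exit. One possible workaround (a sketch, assuming the image runs as root and the standard PowerShell Core user module path ~/.local/share/powershell/Modules) is to mount a host directory over the module path so the installation persists:

sudo docker run --rm -it -v ~/powercli-modules:/root/.local/share/powershell/Modules mcr.microsoft.com/powershell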

Allow Untrusted Certificates

Set-PowerCLIConfiguration -InvalidCertificateAction Ignore -Confirm:$false

Now you can connect to vCenter and list VMs

Connect-VIServer -Server <vcenter-server> -User <username> -Password <password>

Get-VM
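As a quick sanity check, you can also pull just a few VM properties (a minimal example; adjust the property list to your needs):

Get-VM | Select-Object Name, PowerState, NumCpu, MemoryGB | Sort-Object Name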


 

 

 

Saturday, March 15, 2025

How to update ESXi with unsupported CPU?

I have old, unsupported servers in my lab running ESXi 8.0.3. With such a configuration, you cannot update ESXi using the default GUI procedure.

vSphere Cluster Update doesn't allow remediation

ESXi host shows unsupported CPU

The solution is to allow legacy CPUs and update ESXi from the shell with esxcli.

Allow legacy CPU

The option allowLegacyCPU is not available in the ESXi GUI (DCUI or vSphere Client). It must be enabled using the ESXi shell or SSH. Below is the command to allow legacy CPUs.

esxcli system settings kernel set -s allowLegacyCPU -v TRUE

You can verify it with the command ...

esxcli system settings kernel list | grep allowLegacyCPU

If the above procedure fails, the other option is to edit the file /bootbank/boot.cfg and add allowLegacyCPU=true to the end of the kernelopt line.

In my case, it looks like ...

kernelopt=autoPartition=FALSE allowLegacyCPU=true

After modifying /bootbank/boot.cfg, the ESXi configuration should be saved to make the change persistent across reboots.

 /sbin/auto-backup.sh

A reboot of ESXi is obviously required to make the kernel option active.

reboot

After the reboot, you can follow the standard system update procedure using the ESXCLI method, as documented below.

ESXi update procedure (ESXCLI method)

  1. Download the appropriate ESXi offline depot. You can find the URL of the depot in the Release Notes of the particular ESXi version. You will need Broadcom credentials to download it from the Broadcom support site.
  2. Upload the ESXi offline depot to some datastore (using Datastore File Browser, scp, WinSCP, etc.)
    • in my case /vmfs/volumes/vsanDatastore/TMP
  3. List profiles in ESXi depot
    • esxcli software sources profile list -d /vmfs/volumes/vsanDatastore/TMP/VMware-ESXi-8.0U3d-24585383-depot.zip 
  4. Update ESXi to particular profile with no hardware warning
    • esxcli software profile update -d /vmfs/volumes/vsanDatastore/TMP/VMware-ESXi-8.0U3d-24585383-depot.zip -p ESXi-8.0U3d-24585383-no-tools --no-hardware-warning
  5. Reboot ESXi
    •   reboot
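After the reboot (step 5), it is worth verifying that the host runs the expected build. A quick check with a standard esxcli command:

esxcli system version get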

Hope this helps other folks in their home labs with unsupported CPUs.

Friday, February 07, 2025

Broadcom (VMware) Useful Links for Technical Designer and/or Architect

A lot of URLs have changed after the Broadcom acquisition of VMware. That's the reason I have started to document some of the links that are useful for me.

VMware Product Configuration Maximums - https://configmax.broadcom.com

Network (IP) ports Needed by VMware Products and Solutions - https://ports.broadcom.com/

VMware Compatibility Guide - https://compatibilityguide.broadcom.com/ (aka https://www.vmware.com/go/hcl)

VMware Product Lifecycle - https://support.broadcom.com/group/ecx/productlifecycle (aka https://lifecycle.vmware.com/)

Product Interoperability Matrix - https://interopmatrix.broadcom.com/Interoperability

VMware Hands-On Lab - https://labs.hol.vmware.com/HOL/catalog

Broadcom (VMware) Education / Learning - https://www.broadcom.com/education

VMware Validated Solutions - https://vmware.github.io/validated-solutions-for-cloud-foundation/

If you are an independent consultant and have to open a support ticket related to VMware Education or Certification, you can use the form at https://broadcomcms-software.wolkenservicedesk.com/web-form

VMware Health Analyzer

Do you know any other helpful links? Use the comments below to let me know. Thanks.

Tuesday, February 04, 2025

How does my Microsoft Windows OS sync the time?

This is a very short post with the procedure for checking time synchronization of a Microsoft Windows OS running in a VMware virtual machine.

There are two options for how time can be synchronized:

  1. via NTP 
  2. via VMware Tools with the ESXi host where the VM is running 

The command w32tm /query /status shows the current configuration of time sync.

 Microsoft Windows [Version 10.0.20348.2582]  
 (c) Microsoft Corporation. All rights reserved.  
 C:\Users\david.pasek>w32tm /query /status  
 Leap Indicator: 0(no warning)  
 Stratum: 6 (secondary reference - syncd by (S)NTP)  
 Precision: -23 (119.209ns per tick)  
 Root Delay: 0.0204520s  
 Root Dispersion: 0.3495897s  
 ReferenceId: 0x644D010B (source IP: 10.77.1.11)  
 Last Successful Sync Time: 2/4/2025 10:14:10 AM  
 Source: DC02.example.com  
 Poll Interval: 7 (128s)  
 C:\Users\david.pasek>   

If the Windows OS is joined to Active Directory (as in my case), it synchronizes time with AD via NTP by default. This is visible in the output of the w32tm /query /status command.

You are dependent on the Active Directory Domain Controllers; therefore, correct time on the Active Directory Domain Controllers is crucial. I blogged about how to configure time in a virtualized Active Directory Domain Controller back in 2011. It is a very old post, but it should still work.
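For completeness, on a standalone (non-domain-joined) Windows machine you would point the Windows Time service at an external NTP source yourself. A minimal sketch, assuming pool.ntp.org as the upstream time source:

 w32tm /config /manualpeerlist:"pool.ntp.org" /syncfromflags:manual /update  
 w32tm /resync  
 w32tm /query /source  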

To check whether VMware Tools is syncing time with the ESXi host, use the following command

 C:\>"c:\Program Files\VMware\VMware Tools\VMwareToolboxCmd.exe" timesync status  
 Disabled  

VMware Tools time sync is disabled by default, which is the VMware best practice. It is highly recommended not to synchronize time with the underlying ESXi host and to leverage NTP sync over the network with a trusted time provider. This will help you in case someone makes a configuration mistake and time is not configured properly on a particular ESXi host.
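If you ever find time sync enabled and prefer NTP only, the same toolbox command can switch it off (timesync status/enable/disable are standard VMwareToolboxCmd subcommands):

 C:\>"c:\Program Files\VMware\VMware Tools\VMwareToolboxCmd.exe" timesync disable  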

Hope you find this useful.

Friday, December 20, 2024

CPU cycles required for general storage workload

I recently published a blog post about CPU cycles required for network and VMware vSAN ESA storage workloads. I realized it would be nice to test and quantify the CPU cycles needed for a general storage workload without vSAN ESA backend operations like RAID/RAIN and compression.

Performance testing is always tricky as it depends on the guest OS, firmware, drivers, and application, but we are not looking for exact numbers; approximations are good enough for a general rule of thumb that helps a designer during capacity planning.

My test environment was an old Dell PowerEdge R620 (Intel Xeon CPU E5-2620 @ 2.00GHz) with ESXi 8.0.3 and Windows Server 2025 in a Virtual Machine (2 vCPU @ 2 GHz, 1x para-virtualized SCSI controller/PVSCSI, 1x vDisk). The storage subsystem was a VMware VMFS datastore on a local consumer-grade NVMe disk (Kingston SNVS1000GB flash).

Storage tests were done using the good old Iometer.

The test VM had a total CPU capacity of 4 GHz (4,000,000,000 Hz, aka CPU clock cycles per second).

Below are some test results to help me define another rule of thumb.

TEST - 512 B, 100% read, 100% random - 4,040 IOPS @ 2.07 MB/s @ avg response time 0.25 ms

  • 15.49% CPU = 619.6 MHz
  • 619.6 MHz  (619,600,000 CPU cycles) is required to deliver 2.07 MB/s (16,560,000 b/s)
    • 37.42 Hz to read 1 b/s
    • 153.4 KHz for reading 1 IOPS (512 B, random)

TEST - 512 B, 100% write, 100% random - 4,874 IOPS @ 2.50 MB/s @ avg response time 0.2 ms

  • 19.45% CPU = 778 MHz
  • 778 MHz  (778,000,000 CPU cycles) is required to deliver 2.50 MB/s (20,000,000 b/s)
    • 38.9 Hz to write 1 b/s
    • 159.6 KHz for writing 1 IOPS (512 B, random)

TEST - 4 KiB, 100% read, 100% random - 3,813 IOPS @ 15.62 MB/s @ avg response time 0.26 ms

  • 13.85% CPU = 554.0 MHz
  • 554.0 MHz  (554,000,000 CPU cycles) is required to deliver 15.62 MB/s (124,960,000 b/s)
    • 4.43 Hz to read 1 b/s
    • 145.3 KHz for reading 1 IOPS (4 KiB, random)

TEST - 4 KiB, 100% write, 100% random - 4,413 IOPS @ 18.08 MB/s @ avg response time 0.23 ms

  • 21.84% CPU = 873.6 MHz
  • 873.6 MHz  (873,600,000 CPU cycles) is required to deliver 18.08 MB/s (144,640,000 b/s)
    • 6.039 Hz to write 1 b/s
    • 197.9 KHz for writing 1 IOPS (4 KiB, random)

TEST - 32 KiB, 100% read, 100% random - 2,568 IOPS @ 84.16 MB/s @ avg response time 0.39 ms

  • 10.9% CPU = 436 MHz
  • 436 MHz  (436,000,000 CPU cycles) is required to deliver 84.16 MB/s (673,280,000 b/s)
    • 0.648 Hz to read 1 b/s
    • 169.8 KHz for reading 1 IOPS (32 KiB, random)

TEST - 32 KiB, 100% write, 100% random - 2,873 IOPS @ 94.16 MB/s @ avg response time 0.35 ms

  • 14.16% CPU = 566.4 MHz
  • 566.4 MHz  (566,400,000 CPU cycles) is required to deliver 94.16 MB/s (753,280,000 b/s)
    • 0.752 Hz to write 1 b/s
    • 197.1 KHz for writing 1 IOPS (32 KiB, random)

TEST - 64 KiB, 100% read, 100% random - 1,826 IOPS @ 119.68 MB/s @ avg response time 0.55 ms

  • 9.06% CPU = 362.4 MHz
  • 362.4 MHz  (362,400,000 CPU cycles) is required to deliver 119.68 MB/s (957,440,000 b/s)
    • 0.37 Hz to read 1 b/s
    • 198.5 KHz for reading 1 IOPS (64 KiB, random)

TEST - 64 KiB, 100% write, 100% random - 2,242 IOPS @ 146.93 MB/s @ avg response time 0.45 ms

  • 12.15% CPU = 486.0 MHz
  • 486.0 MHz  (486,000,000 CPU cycles) is required to deliver 146.93 MB/s (1,175,440,000 b/s)
    • 0.41 Hz to write 1 b/s
    • 216.7 KHz for writing 1 IOPS (64 KiB, random)

TEST - 256 KiB, 100% read, 100% random - 735 IOPS @ 192.78 MB/s @ avg response time 1.36 ms

  • 6.66% CPU = 266.4 MHz
  • 266.4 MHz  (266,400,000 CPU cycles) is required to deliver 192.78 MB/s (1,542,240,000 b/s)
    • 0.17 Hz to read 1 b/s
    • 362.4 KHz for reading 1 IOPS (256 KiB, random)

TEST - 256 KiB, 100% write, 100% random - 703 IOPS @ 184.49 MB/s @ avg response time 1.41 ms

  • 7.73% CPU = 309.2 MHz
  • 309.2 MHz  (309,200,000 CPU cycles) is required to deliver 184.49 MB/s (1,475,920,000 b/s)
    • 0.21 Hz to write 1 b/s
    • 439.9 KHz for writing 1 IOPS (256 KiB, random)

TEST - 256 KiB, 100% read, 100% seq - 2784 IOPS @ 730.03 MB/s @ avg response time 0.36 ms

  • 15.26% CPU = 610.4 MHz
  • 610.4 MHz  (610,400,000 CPU cycles) is required to deliver 730.03 MB/s (5,840,240,000 b/s)
    • 0.1 Hz to read 1 b/s
    • 219.25 KHz for reading 1 IOPS (256 KiB, sequential)

TEST - 256 KiB, 100% write, 100% seq - 1042 IOPS @ 273.16 MB/s @ avg response time 0.96 ms

  • 9.09% CPU = 363.6 MHz
  • 363.6 MHz  (363,600,000 CPU cycles) is required to deliver 273.16 MB/s (2,185,280,000 b/s)
    • 0.17 Hz to write 1 b/s
    • 348.4 KHz for writing 1 IOPS (256 KiB, sequential)

TEST - 1 MiB, 100% read, 100% seq - 966 IOPS @ 1013.3 MB/s @ avg response time 1 ms

  • 9.93% CPU = 397.2 MHz
  • 397.2 MHz  (397,200,000 CPU cycles) is required to deliver 1013.3 MB/s (8,106,400,000 b/s)
    • 0.05 Hz to read 1 b/s
    • 411.18 KHz for reading 1 IOPS (1 MiB, sequential)

TEST - 1 MiB, 100% write, 100% seq - 286 IOPS @ 300.73 MB/s @ avg response time 3.49 ms

  • 10.38% CPU = 415.2 MHz
  • 415.2 MHz  (415,200,000 CPU cycles) is required to deliver 300.73 MB/s (2,405,840,000 b/s)
    • 0.17 Hz to write 1 b/s
    • 1.452 MHz for writing 1 IOPS (1 MiB, sequential)

Observations

We can see that the CPU cycles required to read or write 1 b/s vary based on I/O size, read/write, and random/sequential pattern.

  • Small I/O (512 B, random) can consume almost 40 Hz to read or write 1 b/s. 
  • Normalized I/O (32 KiB, random) can consume around 0.7 Hz to read or write 1 b/s
  • Large I/O (1 MiB, sequential) can consume around 0.1 Hz to read or write 1 b/s
If we use the same approach as for vSAN and average the 32 KiB I/O (random) and 1 MiB I/O (sequential) results, we can define the following rule of thumb:
"0.5 Hz of general purpose x86-64 CPU (Intel Sandy Bridge) is required to read or write 1 bit/s from local NVMe flash disk"

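As a quick back-of-the-envelope example of this rule of thumb (an estimate, not a measurement): a VM that should sustain 1 GB/s (8,000,000,000 b/s) of storage traffic would need roughly 8,000,000,000 x 0.5 Hz = 4 GHz of CPU capacity, i.e. about two 2 GHz cores just for driving the storage I/O.
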
If we compare it with the 3.5 Hz rule of thumb for vSAN ESA RAID-5 with compression, we can see that vSAN ESA requires 7x more CPU cycles, but that makes perfect sense because vSAN ESA does a lot of additional processing on the backend. Such processing mainly involves data protection (RAID-5/RAIN-5) and compression.

I was curious how many CPU cycles a non-redundant storage workload requires, and the observed numbers IMHO make sense.

Hope this helps others during infrastructure design exercises. 

Wednesday, December 11, 2024

VMware Desktop Products direct download links

UPDATE: Direct links below do not work anymore. They are redirected to https://support.broadcom.com

Main URL for all desktop products: https://softwareupdate.vmware.com/cds/vmw-desktop/

VMware Fusion: https://softwareupdate.vmware.com/cds/vmw-desktop/fusion/

VMware Workstation: https://softwareupdate.vmware.com/cds/vmw-desktop/ws/

VMware Remote Console (VMRC): https://softwareupdate.vmware.com/cds/vmw-desktop/vmrc/

You do not need to have a Broadcom account. All VMware desktop products are directly downloadable without signing in.

VMware Health Analyzer - how to download and register the tool

Are you looking for VMware Health Analyzer? It is not easy to find, so here are the links to download and register the tool to get a license.

Full VHA download: https://docs.broadcom.com/docs/VHA-FULL-OVF10

Collector VHA download: https://docs.broadcom.com/docs/VHA-COLLECTOR-OVF10

Full VHA license Register Tool: https://pstoolhub.broadcom.com/

I publish it mainly for my own reference but I hope other VMware community folks find it useful.