I have just finished my first VMware vSAN ESA Plan, Design, and Implement project and had a chance to test vSAN ESA performance. By the way, every storage system should be stress-tested and benchmarked before being put into production. VMware's software-defined hyperconverged storage (vSAN) is no different. It is even more important here, because the server's CPU, RAM, and network, normally used only for VM workloads, are leveraged to provide enterprise-class storage.
vSAN ESA Environment
All storage performance tests were performed on
- 6-node vSAN ESA Cluster (6x ESXi hosts)
- ESXi Specification
- OS: VMware ESXi 8.0 U3 (8.0.3 Build: 24280767)
- Server Model: Cisco UCS X210c M7
- CPU: 32 CPU Cores - 2x CPU Intel Xeon Gold 6544Y 16C @ 3.6 GHz
- RAM: 1.5 TB
- NIC: Cisco VIC 15230 - 2x 50Gbps
- vSAN vmknic is active/standby, therefore active on one 50 Gbps NIC (vmnic)
- 50 Gbps is physically two 25G-KR (transceiver modules)
- Storage: 5x NVMe 6.4 TB 2.5in U.2 P5620 NVMe High Perf High Endurance
- The usable raw capacity of one disk is 5.82 TB; that's the difference between the vendor's "sales" capacity and reality, almost 0.6 TB per disk :-( (see the conversion sketch right after this list)
- Storage benchmark software - HCIBench 2.8.3
- 18 test VMs (8x data vDisk, 2 workers per vDisk) evenly distributed across the vSAN Cluster
- fio target storage latency 2.5 ms (2,500 us)
- vSAN Storage Policy:
- RAID-5
- compression enabled
- IOPS Limit 5,000 (to not totally overload the server's CPU)
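For clarity, most of the "missing" disk capacity is simply the difference between the vendor's decimal terabytes and the binary tebibytes that ESXi reports. A minimal sketch of that conversion:

```python
# Vendor capacity is sold in decimal terabytes (10^12 bytes), while ESXi
# reports binary units (TiB, 2^40 bytes) and labels them "TB".
vendor_tb = 6.4
bytes_total = vendor_tb * 10**12         # 6,400,000,000,000 bytes
reported_tib = bytes_total / 2**40       # what ESXi shows as "TB"
print(f"{vendor_tb} TB (decimal) ~ {reported_tib:.2f} TiB (binary)")
# 6.4 TB (decimal) ~ 5.82 TiB (binary) -> ~0.58 TB "lost" per disk
```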
The vSphere/vSAN storage architecture is depicted in the diagram below.
[Figure: vSphere/vSAN storage architecture]
The physical network topology is dictated by the Cisco UCS blade system and is depicted below.
[Figure: Cisco UCS Network Topology]
And here is the diagram of vSphere virtual networking on top of Cisco UCS.
[Figure: vSphere Network Architecture]
Test Cases
Random storage workloads
32KB IO, 100% read, 100% random
Test Case Name: fio-8vmdk-90ws-32k-100rdpct-100randompct-2500lt-1732885897
Performance Result
Datastore: CUST-1001-VSAN
=============================
JOB_NAME: job0
Number of VMs: 18
I/O per Second: 721,317.28 IO/S
Throughput: 22,540.00 MB/s
Read Latency: 2.03 ms
Write Latency: 0.00 ms
95th Percentile Read Latency: 2.00 ms
95th Percentile Write Latency: 0.00 ms
ESXi Host CPU Usage during the test: 78 GHz (1 GHz is used at idle)
vSAN vmnic4 transmit traffic ~3.4 GB/s (27.2 Gb/s)
vSAN vmnic4 receive traffic ~3.4 GB/s (27.2 Gb/s)
Storage IOPS per ESXi: 120,220 IOPS (721,317 IOPS / 6 ESXi hosts)
ESXi CPU Usage due to vSAN Storage + vSAN Network Traffic
120,220 Storage IOPS + 27.2 Gb/s Network transmit traffic + 27.2 Gb/s Network receive traffic requires 77 GHz
That means 1 vSAN read 32 KB I/O operation (including TCP network traffic) requires ~640 KHz.
In other words, 640,000 Hz for 32 KB read I/O (256,000 bits) means ~2.5 Hz to read 1 bit of data.
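For completeness, here is the arithmetic above as a minimal Python sketch (following the article's convention of counting a 32 KB I/O as 256,000 bits):

```python
# CPU cost per I/O and per bit for the 32 KB, 100% random read test.
cpu_hz = 77e9              # 77 GHz of host CPU attributed to the test load
iops_per_host = 120_220    # 721,317 IOPS / 6 ESXi hosts
bits_per_io = 256_000      # 32 KB read I/O, counted as in the text above

hz_per_io = cpu_hz / iops_per_host    # ~640,000 Hz (~640 kHz) per read I/O
hz_per_bit = hz_per_io / bits_per_io  # ~2.5 Hz per bit read
print(f"{hz_per_io / 1e3:.0f} kHz per I/O, {hz_per_bit:.2f} Hz per bit")
```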
ESXi CPU Usage due to vSAN network traffic
I have tested that
9.6 Gb/s of pure transmit network traffic requires 1,681 MHz (1.68 GHz) of CPU usage
That means
10,307,921,510 b/s transmit traffic requires 1,681,000,000 Hz
1 b/s transmit traffic requires 0.163 Hz
1 Gb/s transmit traffic requires 163 MHz
I have also tested that
10 Gb/s of pure receive network traffic requires 4,000 MHz (4 GHz) of CPU usage
That means
10,737,418,240 b/s receive traffic requires 4,000,000,000 Hz
1 b/s receive traffic requires 0.373 Hz
1 Gb/s receive traffic requires 373 MHz
vSAN ESXi host reports transmitting network traffic of 27.2 Gb/s, thus it requires ~ 4.43 GHz CPU
vSAN ESXi host reports receiving network traffic of 27.2 Gb/s, thus it requires ~ 10.15 GHz CPU
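Expressed as a tiny helper (my own illustrative function, not a vSAN formula), the two rules of thumb above can be applied to any observed traffic level:

```python
# Measured rules of thumb for pure TCP traffic on this hardware (see above):
TX_MHZ_PER_GBPS = 163   # ~163 MHz of CPU per 1 Gb/s transmitted
RX_MHZ_PER_GBPS = 373   # ~373 MHz of CPU per 1 Gb/s received

def network_cpu_ghz(tx_gbps: float, rx_gbps: float) -> float:
    """Estimated host CPU (GHz) consumed by TCP network traffic."""
    return (tx_gbps * TX_MHZ_PER_GBPS + rx_gbps * RX_MHZ_PER_GBPS) / 1000

# 32 KB, 100% read test: ~27.2 Gb/s transmitted and ~27.2 Gb/s received
print(f"{network_cpu_ghz(27.2, 27.2):.2f} GHz")  # ~14.58 GHz (4.43 + 10.15)
```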
ESXi CPU Usage due to vSAN Storage without vSAN network traffic
We can deduct 14.58 GHz (4.43 + 10.15) CPU usage (the cost of bidirectional network traffic) from 77 GHz total ESXi CPU usage. That means we need 62.42 GHz CPU usage for vSAN storage operations without network transfers.
We were able to achieve 120,220 IOPS on the ESXi host at 62.42 GHz (62,420,000,000 Hz)
That means 1 NVMe 32 KB read I/O operation without TCP network traffic requires ~519 KHz.
In other words, 519,000 CPU Hz for 32 KB read I/O (256,000 bits) means ~2 Hz to read 1 bit of data.
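And the storage-only derivation above as a sketch, reusing network_cpu_ghz() from the previous snippet:

```python
def storage_only_hz_per_bit(total_ghz, tx_gbps, rx_gbps, iops_per_host, bits_per_io):
    """CPU cost per bit of storage I/O after subtracting the network cost."""
    storage_ghz = total_ghz - network_cpu_ghz(tx_gbps, rx_gbps)  # 77 - 14.58 = 62.42 GHz
    hz_per_io = storage_ghz * 1e9 / iops_per_host                # ~519 kHz per I/O
    return hz_per_io / bits_per_io                               # ~2 Hz per bit

# 32 KB, 100% random read: 77 GHz total, 27.2 Gb/s each way, 120,220 IOPS per host
print(f"{storage_only_hz_per_bit(77, 27.2, 27.2, 120_220, 256_000):.2f} Hz per bit")
```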
32k IO, 100% write, 100% random
Test Case Name: fio-8vmdk-90ws-32k-0rdpct-100randompct-2500lt-1732885897
Performance Result
Datastore: CUST-1001-VSAN
=============================
JOB_NAME: job0
Number of VMs: 18
I/O per Second: 285,892.55 IO/S
Throughput: 8,934.00 MB/s
Read Latency: 0.00 ms
Write Latency: 1.74 ms
95th Percentile Read Latency: 0.00 ms
95th Percentile Write Latency: 2.00 ms
ESXi Host CPU Usage during the test: 88 GHz (1 GHz is used at idle)
vSAN vmnic4 transmit traffic ~4.44 GB/s (35.5 Gb/s)
vSAN vmnic4 receive traffic ~5 GB/s (40 Gb/s)
Storage IOPS per ESXi: 47,650 IOPS (285,892 IOPS / 6 ESXi hosts)
ESXi CPU Usage due to vSAN Storage + vSAN Network Traffic
47,650 Storage IOPS + 35.5 Gb/s Network transmit traffic + 40 Gb/s Network receive traffic requires 87 GHz
That means 1 vSAN write 32 KB I/O operation (including TCP network traffic) requires ~1,825 KHz.
In other words, 1,825,000 CPU Hz for 32 KB write I/O (256,000 bits) means ~7.13 Hz to write 1 bit of data.
ESXi CPU Usage due to vSAN network traffic
1 Gb/s transmit traffic requires 163 MHz
1 Gb/s receive traffic requires 373 MHz
vSAN ESXi host reports transmitting network traffic of 35.5 Gb/s, thus it requires ~ 5.79 GHz CPU
vSAN ESXi host reports receiving network traffic of 40 Gb/s, thus it requires ~ 14.92 GHz CPU
ESXi CPU Usage due to vSAN Storage without vSAN network traffic
We can deduct 20.71 GHz (5.79 + 14.92) CPU usage (the cost of bidirectional network traffic) from 87 GHz total ESXi CPU usage. We need 66.29 GHz CPU usage for vSAN storage operations without network transfers.
We were able to achieve 47,650 IOPS on the ESXi host at 66.29 GHz (66,290,000,000 Hz)
That means 1 NVMe 32 KB write I/O operation without TCP network traffic requires ~1,391 KHz.
In other words, 1,391,000 CPU Hz for 32 KB write I/O (256,000 bits) means ~5.43 Hz to write 1 bit of data.
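The same sketch from the read case applies here as well: storage_only_hz_per_bit(87, 35.5, 40, 47_650, 256_000) returns ~5.4 Hz per bit, in line with the value above.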
32k IO, 70% read - 30% write, 100% random
Test Case Name: fio-8vmdk-90ws-32k-70rdpct-100randompct-2500lt-1732908719
Performance Result
Datastore: CUST-1001-VSAN
=============================
JOB_NAME: job0
Number of VMs: 18
I/O per Second: 602,702.73 IO/S
Throughput: 18,834.00 MB/s
Read Latency: 1.55 ms
Write Latency: 1.99 ms
95th Percentile Read Latency: 2.00 ms
95th Percentile Write Latency: 2.00 ms
ESXi Host CPU Usage during the test: 95 GHz (1 GHz is used at idle)
vSAN vmnic4 transmit traffic ~4.5 GB/s (36 Gb/s)
vSAN vmnic4 receive traffic ~4.7 GB/s (37.6 Gb/s)
Storage IOPS per ESXi: 100,450 IOPS (602,702 IOPS / 6 ESXi hosts)
Sequential storage workloads
1024k IO, 100% read, 100% sequential
Test Case Name: fio-8vmdk-90ws-1024k-100rdpct-0randompct-2500lt-1732911329
Performance Result
Datastore: CUST-1001-VSAN
=============================
JOB_NAME: job0
Number of VMs: 18
I/O per Second: 22,575.50 IO/S
Throughput: 22,574.00 MB/s
Read Latency: 6.38 ms
Write Latency: 0.00 ms
95th Percentile Read Latency: 6.00 ms
95th Percentile Write Latency: 0.00 ms
ESXi Host CPU Usage during the test: 60 GHz (1 GHz is used at idle)
vSAN vmnic4 transmit traffic ~3.4 GB/s (27.2 Gb/s)
vSAN vmnic4 receive traffic ~3.2 GB/s (25.6 Gb/s)
Storage IOPS per ESXi: 3,762 IOPS (22,574 IOPS / 6 ESXi hosts)
Throughput per ESXi: 3,762.00 MB/s (22,574.00 MB/s / 6 ESXi hosts)
ESXi CPU Usage due to vSAN Storage + vSAN Network Traffic
3,762 Storage IOPS + 27.2 Gb/s Network transmit traffic + 25.6 Gb/s Network receive traffic requires 59 GHz
That means 1 vSAN read 1024 KB I/O operation (including TCP network traffic) requires ~15,683 KHz.
In other words, 15,683,000 CPU Hz for 1024 KB read I/O (8,388,608 bits) means ~1.86 Hz to read 1 bit of data.
ESXi CPU Usage due to vSAN network traffic
1 Gb/s transmit traffic requires 163 MHz
1 Gb/s receive traffic requires 373 MHz
vSAN ESXi host reports transmitting network traffic of 27.2 Gb/s, thus it requires ~4.43 GHz CPU
vSAN ESXi host reports receiving network traffic of 25.6 Gb/s, thus it requires ~9.55 GHz CPU
ESXi CPU Usage due to vSAN Storage without vSAN network traffic
We can deduct 13.98 GHz (4.43 + 9.55) CPU usage (the cost of bidirectional network traffic) from 59 GHz total ESXi CPU usage. That means we need 45.02 GHz CPU usage for vSAN storage operations without network transfers.
We were able to achieve 3,162 IOPS on the ESXi host at 45.02 GHz (45,020,000,000 Hz)
That means 1 NVMe 1024 KB read I/O operation without TCP network traffic requires ~14,238 KHz.
In other words, 14,238,000 CPU Hz for 1024 KB read I/O (8,388,608 bits) means ~ 1.69 Hz to read 1 bit of data.
1024k IO, 100% write, 100% sequential
Test Case Name: fio-8vmdk-90ws-1024k-0rdpct-0randompct-2500lt-1732913825
Performance Result
Datastore: CUST-1001-VSAN
=============================
JOB_NAME: job0
Number of VMs: 18
I/O per Second: 15,174.08 IO/S
Throughput: 15,171.00 MB/s
Read Latency: 0.00 ms
Write Latency: 8.30 ms
95th Percentile Read Latency: 0.00 ms
95th Percentile Write Latency: 12.00 ms
ESXi Host CPU Usage during the test: 60 GHz (1 GHz is used at idle)
vSAN vmnic4 transmit traffic ~3.9 GB/s (31.2 Gb/s)
vSAN vmnic4 receive traffic ~3.9 GB/s (31.2 Gb/s)
Storage IOPS per ESXi: 2,529 IOPS (15,171.00 IOPS / 6 ESXi hosts)
Throughput per ESXi: 2,529 MB/s (15,171.00 MB/s / 6 ESXi hosts)
ESXi CPU Usage due to vSAN Storage + vSAN Network Traffic
2,529 Storage IOPS + 31.2 Gb/s Network transmit traffic + 31.2 Gb/s Network receive traffic requires 59 GHz
That means 1 vSAN 1024 KB write I/O operation (including TCP network traffic) requires ~23,329 KHz.
In other words, 23,329,000 CPU Hz for 1024 KB write I/O (8,388,608 bits) means ~2.78 Hz to write 1 bit of data.
ESXi CPU Usage due to vSAN network traffic
1 Gb/s transmit traffic requires 163 MHz
1 Gb/s receive traffic requires 373 MHz
vSAN ESXi host reports transmitting network traffic of 27.2 Gb/s, thus it requires ~4.43 GHz CPU
vSAN ESXi host reports receiving network traffic of 25.6 Gb/s, thus it requires ~9.55 GHz CPU
ESXi CPU Usage due to vSAN Storage without vSAN network traffic
We can deduct 13.98 GHz (4.43 + 9.55) CPU usage (the cost of bidirectional network traffic) from 59 GHz total ESXi CPU usage. That means we need 45.02 GHz CPU usage for vSAN storage operations without network transfers.
We were able to achieve 2,259 IOPS on the ESXi host at 45.02 GHz (45,020,000,000 Hz)
That means 1 NVMe 1024 KB write I/O operation without TCP network traffic requires ~19,929 KHz.
In other words, 19,929,000 CPU Hz for 1024 KB write I/O (8,388,608 bits) means ~ 2.37 Hz to write 1 bit of data.
1024k IO, 70% read - 30% write, 100% sequential
Performance Result
Datastore: CUST-1001-VSAN
=============================
JOB_NAME: job0
Number of VMs: 18
I/O per Second: 19,740.90 IO/S
Throughput: 19,738.00 MB/s
Read Latency: 5.38 ms
Write Latency: 8.68 ms
95th Percentile Read Latency: 7.00 ms
95th Percentile Write Latency: 12.00 ms
ESXi Host CPU Usage during the test: 62 GHz (1 GHz is used at idle)
vSAN vmnic4 receive traffic ~4.15 GB/s (33.2 Gb/s)
vSAN vmnic4 transmit traffic ~4.3 GB/s (34.4 Gb/s)
Storage IOPS per ESXi: 3,290 IOPS (19,740.90 IOPS / 6 ESXi hosts)
Throughput per ESXi: 3,290 MB/s (19,738.00 MB/s / 6 ESXi hosts)
Observations and explanation
Observation 1 - Storage and network workloads require CPU resources.
This is obvious and logical; however, here is some observed data from our storage performance benchmark exercise.
32K, 100% read, 100% random (721,317.28 IOPS in VM guest, 22,540.00 MB/s in VM guest)
=> CPU Usage ~77 GHz
=> ~2.5 Hz to read 1 bit of data (storage + network)
=> ~2 Hz to read 1 bit of data (storage only)
=> 25% goes to network traffic
32K, 70% read 30% write, 100% random (602,702.73 IOPS in VM guest, 18,834.00 MB/s in VM guest)
=> CPU Usage ~94 GHz << THIS IS STRANGE, WHY IS IT MORE CPU THAN 100% WRITE? I DON'T KNOW.
32K, 100% write, 100% random (285,892.55 IOPS in VM guest, 8,934.00 MB/s in VM guest)
=> CPU Usage ~87 GHz
=> ~7.13 Hz to write 1 bit of data (storage + network)
=> ~5.43 Hz to write 1 bit of data (storage only)
=> 31% goes to network traffic
1M, 100% read, 100% sequential (22,575.50 IOPS in VM guest, 22,574.00 MB/s in VM guest)
=> CPU Usage ~60 GHz
=> ~1.86 Hz to read 1 bit of data (storage + network)
=> ~1.69 Hz to read 1 bit of data (storage only)
=> 10% goes to network traffic
1M, 70% read 30% write, 100% sequential (19,740.90 IOPS in VM guest, 19,738.00 MB/s in VM guest)
=> CPU Usage ~61 GHz
1M, 100% write, 100% sequential (15,174.08 IOPS in VM guest, 15,171.00 MB/s in VM guest)
=> CPU Usage ~59 GHz
=> ~2.78 Hz to write 1 bit of data (storage + network)
=> ~2.37 Hz to write 1 bit of data (storage only)
=> 17% goes to network traffic
Reading 1 bit of information from vSAN hyper-converged storage requires roughly between ~1.86 Hz (1024 KB I/O size) and 2.5 Hz (32 KB I/O size).
Writing 1 bit of information to vSAN hyper-converged storage requires roughly between ~2.78 Hz (1024 KB I/O size) and 7.13 Hz (32 KB I/O size).
The above numbers are not set in stone, but it is good to observe the system's behavior.
When I had no IOPS limits in the vSAN Storage Policies, I was able to fully saturate the ESXi CPUs.
[Figure: CPU usage -16.52 GHz - interesting, right?]
That's a clear sign that neither the storage subsystem (NVMe NAND flash disks) nor the Ethernet/IP network (up to 50 Gbps via a single vmnic4) is the bottleneck. The bottleneck in this case is the CPU. Remember, there is always some bottleneck, and we are not looking for maximum storage performance, but for predictable and consistent storage performance without a negative impact on other resources (CPU, network, disks).
That's the reason why it is really good to know at least these rough numbers to do some capacity/performance planning of the hyper-converged vSAN solution.
With an IOPS limit of 5,000, 144 vDisks @ 5,000 IOPS can have a sustainable response time of around 2 ms (32 KB I/O). The vSphere/vSAN infrastructure is designed for ~150 VMs, so that's perfectly balanced. We have two other VM Storage Policies (10,000 IOPS limit and 15,000 IOPS limit) for more demanding VMs hosting SQL Servers and other storage-intensive workloads.
That's about 720,000 IOPS aggregated in total. Pretty neat for a 6-node vSAN cluster, isn't it?
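As a hypothetical illustration of such capacity/performance planning, here is a minimal sketch that turns the measured per-bit costs into a rough per-host CPU estimate; vsan_cpu_ghz(), the workload mix, and the 70/30 example below are my own illustrative choices based only on the numbers measured above:

```python
# Rough CPU sizing from the measured per-bit costs for 32 KB I/O
# (RAID-5, compression on, including vSAN network traffic).
READ_HZ_PER_BIT = 2.5     # ~2.5 Hz per bit read
WRITE_HZ_PER_BIT = 7.13   # ~7.13 Hz per bit written
BITS_PER_IO = 256_000     # 32 KB I/O, counted as in the article

def vsan_cpu_ghz(read_iops: int, write_iops: int) -> float:
    """Estimated ESXi CPU (GHz) consumed by a given vSAN I/O load."""
    hz = (read_iops * READ_HZ_PER_BIT + write_iops * WRITE_HZ_PER_BIT) * BITS_PER_IO
    return hz / 1e9

# Example: a host carrying 20,000 read + 10,000 write 32 KB IOPS (roughly 70/30),
# out of the ~115 GHz nominally available per host (2x 16 cores @ 3.6 GHz).
print(f"{vsan_cpu_ghz(20_000, 10_000):.1f} GHz")  # ~31 GHz
```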
Observation 2 - Between 10% and 30% CPU is consumed due to TCP network traffic
vSAN is a hyper-converged (Compute, Storage, Network) software-defined storage that stripes data across ESXi hosts, thus heavily leveraging a standard Ethernet network and TCP/IP to transport storage data across vSAN nodes (ESXi hosts). vSAN RAID (Redundant Array of Independent Disks) is actually RAIN (Redundant Array of Independent Nodes); therefore, the network is highly utilized during heavy storage load. You can see the numbers in the test results above.
Since I planned, designed, and implemented vSAN on Cisco UCS infrastructure with 100 Gb networking (partitioned into 2x32Gb FCoE, 2x10Gb Ethernet, 2x10Gb Ethernet, 2x50Gb Ethernet), RDMA over Converged Ethernet (RoCE) would be great to use to decrease CPU requirements and further improve latency and I/O response time. RoCE v2 is supported on vSphere 8.0 U3 for my network interface card, Cisco VIC 15230 (nenic driver version 2.0.11), but Cisco is not listed among the vendors supporting vSAN over RDMA. I will ask somebody at Cisco why, and whether they have something on the roadmap.
9 comments:
Could you provide additional information to enhance my understanding:
Networking: Details on network topology, vSAN traffic configuration, and the rationale behind the active/standby vmknic setup would be valuable.
Storage: Clarification on the impact of NVMe capacity discrepancies and information about the vSAN cache tier configuration is needed.
Benchmark Configuration: More specifics on HCIBench workload profiles, the reasoning behind VM distribution, and the seemingly low IOPS limit would provide better context.
vSAN Configuration: Information on deduplication settings, vSAN version, and vSphere version is missing.
Monitoring Tools: Details on any additional performance monitoring or analysis tools being used alongside HCIBench would be beneficial.
Some observations and suggestions for further analysis:
The results demonstrate impressive performance, especially for random read operations (721,317 IOPS for 32KB 100% read).
Your CPU usage analysis is particularly insightful, breaking down the costs of storage operations and network traffic.
Consider including smaller I/O sizes (e.g., 4KB, 8KB) to represent common real-world scenarios.
It would be beneficial to see tests with varying queue depths to understand performance under different concurrency levels.
Adding long-duration tests could reveal any potential performance degradation over time.
Including failure scenario tests (e.g., disk, network, or node failures) would demonstrate the system's resilience.
Specific application workload patterns (e.g., OLTP, VDI) could make the results more relatable to real-world use cases.
Exploring the impact of deduplication on performance would be valuable for environments considering this feature.
Scalability tests with different cluster sizes could provide insights into the solution's growth potential.
The network topology is dictated (constrained) by the Cisco UCS blade system.
We have UCS Fabric Interconnect A (Fabric A) and Fabric Interconnect B (Fabric B). Each fabric (A and B) is a high-performance multi-25 Gb fabric (2x2x25Gb from each ESXi/UCS server). More details (architecture diagrams) were added to the blog post.
vSAN traffic configuration should be visible from the architecture diagrams added to the blog post.
The active/standby vmknic for vSAN was chosen because vSAN does not support multitenancy, and to optimize vSAN traffic, keep it within a particular fabric, and avoid crossing the upstream Ethernet switches. In a non-degraded state, vSAN traffic stays within Fabric Interconnect A (Fabric A). The second fabric is used only in case of Fabric A maintenance or a failure of some component within Fabric A.
In the future, if the customer would like to use NSX, NSX traffic can be pinned to Fabric B.
Such a design, IMHO, helps with traffic engineering and troubleshooting.
vSphere version is 8.0 U3 - VMware ESXi 8.0 U3 (8.0.3 Build: 24280767)
Blog post updated.
HCIBench workload profiles and VM distribution are described in the blog post.
>>> 18 test VMs (8x data vDisk, 2 workers per vDisk) evenly distributed across the 6-node vSAN Cluster
>>> 3 VMs per ESXi host
>>> fio target storage latency 2.5 ms (2,500 us)
>>> storage profiles are documented for each test case
For example:
32KB IO, 100% read, 100% random
Test Case Name: fio-8vmdk-90ws-32k-100rdpct-100randompct-2500lt-1732885897
The seemingly low IOPS limit? Do you mean the 5,000 IOPS limit per vDisk?
This is what the infrastructure is designed for. In production we have 3 storage policies:
Tier 1 - 15,000 IOPS
Tier 2 - 10,000 IOPS
Tier 3 - 5,000 IOPS
HCIBench deploys 18 VMs each with 8 data disks. That means 144 vDisks.
144 vDisks @ 5000 IOPS = 720,000 IOPS and that's what we see as HCIBench performance result.
That's enough. The goal is not to get maximum IOPS or throughput, but the expected (required) performance (IOPS) and throughput (MB/s) at a sustainable network and CPU load.
There are only two storage speeds.
Good enough and not good enough.
720,000 IOPS and ~20 GB/s throughput is IMHO good enough ;-)
And it seems that RDMA can add another 10, 20, or 30% performance/throughput at the same CPU load, or decrease CPU load for the same storage load.
Q: Information on deduplication settings
A: vSAN ESA does not support deduplication. It is in the roadmap though.
Q: vSAN version, and vSphere version is missing.
A: vSphere version is 8.0 U3 - VMware ESXi 8.0 U3 (8.0.3 Build: 24280767). Blog post updated.
Q: Monitoring Tools: Details on any additional performance monitoring or analysis tools being used alongside HCIBench would be beneficial.
A: I used classic vCenter real-time performance monitoring. It is pretty good as it collects metrics in 20-second samples. However, it only covers the last hour of data. We also have vROps (Aria Operations), so I observed some longer (more than an hour) tests there.
Q: Consider including smaller I/O sizes (e.g., 4KB, 8KB) to represent common real-world scenarios.
A: Yes, testing smaller I/O sizes (4KB, 8KB, 32KB) would be interesting; however, I had limited time for testing and the system is already in production, so I cannot play with it anymore.
Q: Adding long-duration tests could reveal any potential performance degradation over time.
A: Actually, I ran some tests for 60 minutes with a 10-minute warmup.
The whole test suite (6 tests) took 6x70 minutes (420 minutes, 7 hours), and I haven't seen any performance degradation over time.
Q: It would be beneficial to see tests with varying queue depths to understand performance under different concurrency levels.
A: Do you mean the number of outstanding I/Os pushed by the storage workers? Well, that would just increase the storage load from a single worker. HCIBench can do it in a distributed way, which is IMHO a better approach.
Q: Including failure scenario tests (e.g., disk, network, or node failures) would demonstrate the system's resilience.
A: Absolutely, but there was no time for such detailed testing. There was a business push to hand over the new infrastructure to the customer ASAP ;-) I was happy to get a few days for design validation tests.
Q: Specific application workload patterns (e.g., OLTP, VDI) could make the results more relatable to real-world use cases.
A: Absolutely; however, as I have already mentioned, there was no time for more tests. That's why we use enterprise-ready infrastructure (like VMware vSAN), where we trust that such testing was done by the vendor, and our testing is just to understand how the system behaves so we are not surprised during system operation in production.
Q: Exploring the impact of deduplication on performance would be valuable for environments considering this feature.
A: vSAN ESA does not support deduplication. It is in the roadmap though.
Q: Scalability tests with different cluster sizes could provide insights into the solution's growth potential.
A: Hmm. The cost of a single ESXi host's hardware is ~$50,000. You need a significant budget to test different cluster sizes. We do not have such a budget. We will add two additional nodes to the 6-node vSAN cluster very soon (in 1 or 2 months), but I will not be able to perform tests on the production system. You cannot test everything. That's why rules of thumb are important for technical designers.
Your performance testing on VMware vSAN ESA is impressive and provides valuable insights, but there are some areas to consider for future testing:
1. Include smaller I/O sizes (e.g., 4KB, 8KB) and application-specific workloads (e.g., OLTP, VDI) to better reflect real-world scenarios.
2. Conduct long-duration tests (>24 hours) to identify potential performance degradation over time.
3. Simulate failure scenarios (e.g., disk, network, or node failures) to evaluate resilience and recovery behavior.
4. Explore advanced network configurations like RDMA over Converged Ethernet (RoCE) to reduce CPU overhead and improve latency.
5. Test deduplication when supported and analyze the specific impact of compression on performance.
6. Perform scalability tests with varying cluster sizes to understand growth potential and performance trade-offs.
7. Vary queue depths to analyze concurrency impacts and test the overhead of enabling encryption (data-at-rest and in-transit).
8. Leverage advanced monitoring tools like VMware Aria Operations for deeper insights into long-term metrics.
These additions could further enhance the comprehensiveness of your testing and provide even more actionable insights for production environments!