I have just finished my first VMware vSAN ESA Plan, Design, and Implement project and had a chance to test vSAN ESA performance. By the way, every storage system should be stress-tested and benchmarked before being put into production. VMware's software-defined hyperconverged storage (vSAN) is no different. It is even more important here, because the server's CPU, RAM, and network, normally used only for VM workloads, are leveraged to provide enterprise-class storage.
vSAN ESA Environment
All storage performance tests were performed on
- 6-node vSAN ESA Cluster (6x ESXi hosts)
- ESXi Specification
- OS: VMware ESXi 8.0 U3 (8.0.3 Build: 24280767)
- Server Model: Cisco UCS X210c M7
- CPU: 32 CPU Cores - 2x CPU Intel Xeon Gold 6544Y 16C @ 3.6 GHz
- RAM: 1.5 TB
- NIC: Cisco VIC 15230 - 2x 50Gbps
- vSAN vmknic is active/standby, therefore active on one 50 Gbps NIC (vmnic)
- 50 Gbps is physically two 25G-KR (transceiver modules)
- Storage: 5x NVMe 6.4 TB 2.5in U.2 P5620 NVMe High Perf High Endurance
- The usable raw capacity of one disk is 5.82 TB; that's the difference between the vendor's "sales" capacity and reality, almost 0.6 TB per disk :-( (see the conversion sketch right after this list)
- Storage benchmark software - HCIBench 2.8.3
- 18 test VMs (8x data vDisk, 2 workers per vDisk) evenly distributed across the vSAN Cluster
- fio target storage latency 2.5 ms (2,500 us)
- vSAN Storage Policy:
- RAID-5
- compression enabled
- IOPS Limit 5,000 (to not totally overload the server's CPU)
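For clarity, most of the "missing" disk capacity is simply the difference between the vendor's decimal terabytes and the binary tebibytes that ESXi reports. A minimal sketch of that conversion:

```python
# Vendor capacity is sold in decimal terabytes (10^12 bytes), while ESXi
# reports binary units (TiB, 2^40 bytes) and labels them "TB".
vendor_tb = 6.4
bytes_total = vendor_tb * 10**12         # 6,400,000,000,000 bytes
reported_tib = bytes_total / 2**40       # what ESXi shows as "TB"
print(f"{vendor_tb} TB (decimal) ~ {reported_tib:.2f} TiB (binary)")
# 6.4 TB (decimal) ~ 5.82 TiB (binary) -> ~0.58 TB "lost" per disk
```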
The vSphere/vSAN storage architecture is depicted in the diagram below.
[Figure: vSphere/vSAN storage architecture]
The physical network topology is dictated by the Cisco UCS blade system and is depicted below.
[Figure: Cisco UCS Network Topology]
And here is the diagram of vSphere virtual networking on top of Cisco UCS.
[Figure: vSphere Network Architecture]
Test Cases
Random storage workloads
32KB IO, 100% read, 100% random
Test Case Name: fio-8vmdk-90ws-32k-100rdpct-100randompct-2500lt-1732885897
Performance Result
Datastore: CUST-1001-VSAN
=============================
JOB_NAME: job0
Number of VMs: 18
I/O per Second: 721,317.28 IO/S
Throughput: 22,540.00 MB/s
Read Latency: 2.03 ms
Write Latency: 0.00 ms
95th Percentile Read Latency: 2.00 ms
95th Percentile Write Latency: 0.00 ms
ESXi Host CPU Usage during the test: 78 GHz (1 GHz is used at idle)
vSAN vmnic4 transmit traffic ~3.4 GB/s (27.2 Gb/s)
vSAN vmnic4 receive traffic ~3.4 GB/s (27.2 Gb/s)
Storage IOPS per ESXi: 120,220 IOPS (721,317 IOPS / 6 ESXi hosts)
ESXi CPU Usage due to vSAN Storage + vSAN Network Traffic
120,220 Storage IOPS + 27.2 Gb/s Network transmit traffic + 27.2 Gb/s Network receive traffic requires 77 GHz
That means 1 vSAN read 32 KB I/O operation (including TCP network traffic) requires ~640 KHz.
In other words, 640,000 Hz for 32 KB read I/O (256,000 bits) means ~2.5 Hz to read 1 bit of data.
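For completeness, here is the arithmetic above as a minimal Python sketch (following the article's convention of counting a 32 KB I/O as 256,000 bits):

```python
# CPU cost per I/O and per bit for the 32 KB, 100% random read test.
cpu_hz = 77e9              # 77 GHz of host CPU attributed to the test load
iops_per_host = 120_220    # 721,317 IOPS / 6 ESXi hosts
bits_per_io = 256_000      # 32 KB read I/O, counted as in the text above

hz_per_io = cpu_hz / iops_per_host    # ~640,000 Hz (~640 kHz) per read I/O
hz_per_bit = hz_per_io / bits_per_io  # ~2.5 Hz per bit read
print(f"{hz_per_io / 1e3:.0f} kHz per I/O, {hz_per_bit:.2f} Hz per bit")
```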
ESXi CPU Usage due to vSAN network traffic
I have tested that
9.6 Gb/s of pure transmit network traffic requires 1,681 MHz (1.68 GHz) of CPU usage
That means
10,307,921,510 b/s transmit traffic requires 1,681,000,000 Hz
1 b/s transmit traffic requires 0.163 Hz
1 Gb/s transmit traffic requires 163 MHz
I have also tested that
10 Gb/s of pure receive network traffic requires 4,000 MHz (4 GHz) of CPU usage
That means
10,737,418,240 b/s receive traffic requires 4,000,000,000 Hz
1 b/s receive traffic requires 0.373 Hz
1 Gb/s receive traffic requires 373 MHz
vSAN ESXi host reports transmitting network traffic of 27.2 Gb/s, thus it requires ~ 4.43 GHz CPU
vSAN ESXi host reports receiving network traffic of 27.2 Gb/s, thus it requires ~ 10.15 GHz CPU
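Expressed as a tiny helper (my own illustrative function, not a vSAN formula), the two rules of thumb above can be applied to any observed traffic level:

```python
# Measured rules of thumb for pure TCP traffic on this hardware (see above):
TX_MHZ_PER_GBPS = 163   # ~163 MHz of CPU per 1 Gb/s transmitted
RX_MHZ_PER_GBPS = 373   # ~373 MHz of CPU per 1 Gb/s received

def network_cpu_ghz(tx_gbps: float, rx_gbps: float) -> float:
    """Estimated host CPU (GHz) consumed by TCP network traffic."""
    return (tx_gbps * TX_MHZ_PER_GBPS + rx_gbps * RX_MHZ_PER_GBPS) / 1000

# 32 KB, 100% read test: ~27.2 Gb/s transmitted and ~27.2 Gb/s received
print(f"{network_cpu_ghz(27.2, 27.2):.2f} GHz")  # ~14.58 GHz (4.43 + 10.15)
```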
ESXi CPU Usage due to vSAN Storage without vSAN network traffic
We can deduct 14.58 GHz (4.43 + 10.15) CPU usage (the cost of bidirectional network traffic) from 77 GHz total ESXi CPU usage. That means we need 62.42 GHz CPU usage for vSAN storage operations without network transfers.
We were able to achieve 120,220 IOPS on the ESXi host at 62.42 GHz (62,420,000,000 Hz)
That means 1 NVMe 32 KB read I/O operation without TCP network traffic requires ~519 KHz.
In other words, 519,000 CPU Hz for 32 KB read I/O (256,000 bits) means ~2 Hz to read 1 bit of data.
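And the storage-only derivation above as a sketch, reusing network_cpu_ghz() from the previous snippet:

```python
def storage_only_hz_per_bit(total_ghz, tx_gbps, rx_gbps, iops_per_host, bits_per_io):
    """CPU cost per bit of storage I/O after subtracting the network cost."""
    storage_ghz = total_ghz - network_cpu_ghz(tx_gbps, rx_gbps)  # 77 - 14.58 = 62.42 GHz
    hz_per_io = storage_ghz * 1e9 / iops_per_host                # ~519 kHz per I/O
    return hz_per_io / bits_per_io                               # ~2 Hz per bit

# 32 KB, 100% random read: 77 GHz total, 27.2 Gb/s each way, 120,220 IOPS per host
print(f"{storage_only_hz_per_bit(77, 27.2, 27.2, 120_220, 256_000):.2f} Hz per bit")
```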
32k IO, 100% write, 100% random
Test Case Name: fio-8vmdk-90ws-32k-0rdpct-100randompct-2500lt-1732885897
Performance Result
Datastore: CUST-1001-VSAN
=============================
JOB_NAME: job0
Number of VMs: 18
I/O per Second: 285,892.55 IO/S
Throughput: 8,934.00 MB/s
Read Latency: 0.00 ms
Write Latency: 1.74 ms
95th Percentile Read Latency: 0.00 ms
95th Percentile Write Latency: 2.00 ms
ESXi Host CPU Usage during the test: 88 GHz (1 GHz is used at idle)
vSAN vmnic4 transmit traffic ~4.44 GB/s (35.5 Gb/s)
vSAN vmnic4 receive traffic ~5 GB/s (40 Gb/s)
Storage IOPS per ESXi: 47,650 IOPS (285,892 IOPS / 6 ESXi hosts)
ESXi CPU Usage due to vSAN Storage + vSAN Network Traffic
47,650 Storage IOPS + 35.5 Gb/s Network transmit traffic + 40 Gb/s Network receive traffic requires 87 GHz
That means 1 vSAN write 32 KB I/O operation (including TCP network traffic) requires ~1,825 KHz.
In other words, 1,825,000 CPU Hz for 32 KB write I/O (256,000 bits) means ~7.13 Hz to write 1 bit of data.
ESXi CPU Usage due to vSAN network traffic
1 Gb/s transmit traffic requires 163 MHz
1 Gb/s receive traffic requires 373 MHz
vSAN ESXi host reports transmitting network traffic of 35.5 Gb/s, thus it requires ~ 5.79 GHz CPU
vSAN ESXi host reports receiving network traffic of 40 Gb/s, thus it requires ~ 14.92 GHz CPU
ESXi CPU Usage due to vSAN Storage without vSAN network traffic
We can deduct 20.71 GHz (5.79 + 14.92) CPU usage (the cost of bidirectional network traffic) from 87 GHz total ESXi CPU usage. We need 66.29 GHz CPU usage for vSAN storage operations without network transfers.
We were able to achieve 47,650 IOPS on the ESXi host at 66.29 GHz (66,290,000,000 Hz)
That means 1 NVMe 32 KB write I/O operation without TCP network traffic requires ~1,391 KHz.
In other words, 1,391,000 CPU Hz for 32 KB write I/O (256,000 bits) means ~5.43 Hz to write 1 bit of data.
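The same sketch from the read case applies here as well: storage_only_hz_per_bit(87, 35.5, 40, 47_650, 256_000) returns ~5.4 Hz per bit, in line with the value above.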
32k IO, 70% read - 30% write, 100% random
Test Case Name: fio-8vmdk-90ws-32k-70rdpct-100randompct-2500lt-1732908719
Performance Result
Datastore: CUST-1001-VSAN
=============================
JOB_NAME: job0
Number of VMs: 18
I/O per Second: 602,702.73 IO/S
Throughput: 18,834.00 MB/s
Read Latency: 1.55 ms
Write Latency: 1.99 ms
95th Percentile Read Latency: 2.00 ms
95th Percentile Write Latency: 2.00 ms
ESXi Host CPU Usage during the test: 95 GHz (1 GHz is used at idle)
vSAN vmnic4 transmit traffic ~4.5 GB/s (36 Gb/s)
vSAN vmnic4 receive traffic ~4.7 GB/s (37.6 Gb/s)
Storage IOPS per ESXi: 100,450 IOPS (602,702 IOPS / 6 ESXi hosts)
Sequential storage workloads
1024k IO, 100% read, 100% sequential
Test Case Name: fio-8vmdk-90ws-1024k-100rdpct-0randompct-2500lt-1732911329
Performance Result
Datastore: CUST-1001-VSAN
=============================
JOB_NAME: job0
Number of VMs: 18
I/O per Second: 22,575.50 IO/S
Throughput: 22,574.00 MB/s
Read Latency: 6.38 ms
Write Latency: 0.00 ms
95th Percentile Read Latency: 6.00 ms
95th Percentile Write Latency: 0.00 ms
ESXi Host CPU Usage during the test: 60 GHz (1 GHz is used at idle)
vSAN vmnic4 transmit traffic ~3.4 GB/s (27.2 Gb/s)
vSAN vmnic4 receive traffic ~3.2 GB/s (25.6 Gb/s)
Storage IOPS per ESXi: 3,762 IOPS (22,574 IOPS / 6 ESXi hosts)
Throughput per ESXi: 3,762.00 MB/s (22,574.00 MB/s / 6 ESXi hosts)
ESXi CPU Usage due to vSAN Storage + vSAN Network Traffic
3,762 Storage IOPS + 27.2 Gb/s Network transmit traffic + 25.6 Gb/s Network receive traffic requires 59 GHz
That means 1 vSAN read 1024 KB I/O operation (including TCP network traffic) requires ~15,683 KHz.
In other words, 15,683,000 CPU Hz for 1024 KB read I/O (8,388,608 bits) means ~1.86 Hz to read 1 bit of data.
ESXi CPU Usage due to vSAN network traffic
1 Gb/s transmit traffic requires 163 MHz
1 Gb/s receive traffic requires 373 MHz
vSAN ESXi host reports transmitting network traffic of 27.2 Gb/s, thus it requires ~4.43 GHz CPU
vSAN ESXi host reports receiving network traffic of 25.6 Gb/s, thus it requires ~9.55 GHz CPU
ESXi CPU Usage due to vSAN Storage without vSAN network traffic
We can deduct 13.98 GHz (4.43 + 9.55) CPU usage (the cost of bidirectional network traffic) from 59 GHz total ESXi CPU usage. That means we need 45.02 GHz CPU usage for vSAN storage operations without network transfers.
We were able to achieve 3,162 IOPS on the ESXi host at 45.02 GHz (45,020,000,000 Hz)
That means 1 NVMe 1024 KB read I/O operation without TCP network traffic requires ~14,238 KHz.
In other words, 14,238,000 CPU Hz for 1024 KB read I/O (8,388,608 bits) means ~ 1.69 Hz to read 1 bit of data.
1024k IO, 100% write, 100% sequential
Test Case Name: fio-8vmdk-90ws-1024k-0rdpct-0randompct-2500lt-1732913825
Performance Result
Datastore: CUST-1001-VSAN
=============================
JOB_NAME: job0
Number of VMs: 18
I/O per Second: 15,174.08 IO/S
Throughput: 15,171.00 MB/s
Read Latency: 0.00 ms
Write Latency: 8.30 ms
95th Percentile Read Latency: 0.00 ms
95th Percentile Write Latency: 12.00 ms
ESXi Host CPU Usage during the test: 60 GHz (1 GHz is used at idle)
vSAN vmnic4 transmit traffic ~3.9 GB/s (31.2 Gb/s)
vSAN vmnic4 receive traffic ~3.9 GB/s (31.2 Gb/s)
Storage IOPS per ESXi: 2,529 IOPS (15,171.00 IOPS / 6 ESXi hosts)
Throughput per ESXi: 2,529 MB/s (15,171.00 MB/s / 6 ESXi hosts)
ESXi CPU Usage due to vSAN Storage + vSAN Network Traffic
2,529 Storage IOPS + 31.2 Gb/s Network transmit traffic + 31.2 Gb/s Network receive traffic requires 59 GHz
That means 1 vSAN 1024 KB write I/O operation (including TCP network traffic) requires ~23,329 KHz.
In other words, 23,329,000 CPU Hz for 1024 KB write I/O (8,388,608 bits) means ~2.78 Hz to write 1 bit of data.
ESXi CPU Usage due to vSAN network traffic
1 Gb/s transmit traffic requires 163 MHz
1 Gb/s receive traffic requires 373 MHz
vSAN ESXi host reports transmitting network traffic of 27.2 Gb/s, thus it requires ~4.43 GHz CPU
vSAN ESXi host reports receiving network traffic of 25.6 Gb/s, thus it requires ~9.55 GHz CPU
ESXi CPU Usage due to vSAN Storage without vSAN network traffic
We can deduct 13.98 GHz (4.43 + 9.55) CPU usage (the cost of bidirectional network traffic) from 59 GHz total ESXi CPU usage. That means we need 45.02 GHz CPU usage for vSAN storage operations without network transfers.
We were able to achieve 2,259 IOPS on the ESXi host at 45.02 GHz (45,020,000,000 Hz)
That means 1 NVMe 1024 KB write I/O operation without TCP network traffic requires ~19,929 KHz.
In other words, 19,929,000 CPU Hz for 1024 KB write I/O (8,388,608 bits) means ~ 2.37 Hz to write 1 bit of data.
1024k IO, 70% read - 30% write, 100% sequential
Performance Result
Datastore: CUST-1001-VSAN
=============================
JOB_NAME: job0
Number of VMs: 18
I/O per Second: 19,740.90 IO/S
Throughput: 19,738.00 MB/s
Read Latency: 5.38 ms
Write Latency: 8.68 ms
95th Percentile Read Latency: 7.00 ms
95th Percentile Write Latency: 12.00 ms
ESXi Host CPU Usage during the test: 62 GHz (1 GHz is used at idle)
vSAN vmnic4 receive traffic ~4.15 GB/s (33.2 Gb/s)
vSAN vmnic4 transmit traffic ~4.3 GB/s (34.4 Gb/s)
Storage IOPS per ESXi: 3,290 IOPS (19,740.90 IOPS / 6 ESXi hosts)
Throughput per ESXi: 3,290 MB/s (19,738.00 MB/s / 6 ESXi hosts)
Observations and explanation
Observation 1 - Storage and network workloads require CPU resources.
This is obvious and logical; however, here is some observed data from our storage performance benchmark exercise.
32K, 100% read, 100% random (721,317.28 IOPS in VM guest, 22,540.00 MB/s in VM guest)
=> CPU Usage ~77 GHz
=> ~2.5 Hz to read 1 bit of data (storage + network)
=> ~2 Hz to read 1 bit of data (storage only)
=> 25% goes to network traffic
32K, 70% read 30% write, 100% random (602,702.73 IOPS in VM guest, 18,834.00 MB/s in VM guest)
=> CPU Usage ~94 GHz << THIS IS STRANGE, WHY IS IT MORE CPU THAN 100% WRITE? I DON'T KNOW.
32K, 100% write, 100% random (285,892.55 IOPS in VM guest, 8,934.00 MB/s in VM guest)
=> CPU Usage ~87 GHz
=> ~7.13 Hz to write 1 bit of data (storage + network)
=> ~5.43 Hz to write 1 bit of data (storage only)
=> 31% goes to network traffic
1M, 100% read, 100% sequential (22,575.50 IOPS in VM guest, 22,574.00 MB/s in VM guest)
=> CPU Usage ~60 GHz
=> ~1.86 Hz to read 1 bit of data (storage + network)
=> ~1.69 Hz to read 1 bit of data (storage only)
=> 10% goes to network traffic
1M, 70% read 30% write, 100% sequential (19,740.90 IOPS in VM guest, 19,738.00 MB/s in VM guest)
=> CPU Usage ~61 GHz
1M, 100% write, 100% sequential (15,174.08 IOPS in VM guest, 15,171.00 MB/s in VM guest)
=> CPU Usage ~59 GHz
=> ~2.78 Hz to write 1 bit of data (storage + network)
=> ~2.37 Hz to write 1 bit of data (storage only)
=> 17% goes to network traffic
Reading 1 bit of information from vSAN hyper-converged storage requires roughly between ~1.86 Hz (1024 KB I/O size) and 2.5 Hz (32 KB I/O size).
Writing 1 bit of information to vSAN hyper-converged storage requires roughly between ~2.78 Hz (1024 KB I/O size) and 7.13 Hz (32 KB I/O size).
The above numbers are not set in stone, but it is good to observe the system's behavior.
When I had no IOPS limits in the vSAN Storage Policies, I was able to fully saturate the ESXi CPUs.
[Figure: CPU usage -16.52 GHz - interesting, right?]
That's a clear sign that neither the storage subsystem (NVMe NAND flash disks) nor the Ethernet/IP network (up to 50 Gbps via a single vmnic4) is the bottleneck. The bottleneck in this case is the CPU. Remember, there is always some bottleneck, and we are not looking for maximum storage performance, but for predictable and consistent storage performance without a negative impact on other resources (CPU, network, disks).
That's the reason why it is really good to know at least these rough numbers to do some capacity/performance planning of the hyper-converged vSAN solution.
With an IOPS limit of 5,000, 144 vDisks @ 5,000 IOPS can have a sustainable response time of around 2 ms (32 KB I/O). The vSphere/vSAN infrastructure is designed for ~150 VMs, so that's perfectly balanced. We have two other VM Storage Policies (10,000 IOPS limit and 15,000 IOPS limit) for more demanding VMs hosting SQL Servers and other storage-intensive workloads.
That's about 720,000 IOPS aggregated in total. Pretty neat for a 6-node vSAN cluster, isn't it?
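As a hypothetical illustration of such capacity/performance planning, here is a minimal sketch that turns the measured per-bit costs into a rough per-host CPU estimate; vsan_cpu_ghz(), the workload mix, and the 70/30 example below are my own illustrative choices based only on the numbers measured above:

```python
# Rough CPU sizing from the measured per-bit costs for 32 KB I/O
# (RAID-5, compression on, including vSAN network traffic).
READ_HZ_PER_BIT = 2.5     # ~2.5 Hz per bit read
WRITE_HZ_PER_BIT = 7.13   # ~7.13 Hz per bit written
BITS_PER_IO = 256_000     # 32 KB I/O, counted as in the article

def vsan_cpu_ghz(read_iops: int, write_iops: int) -> float:
    """Estimated ESXi CPU (GHz) consumed by a given vSAN I/O load."""
    hz = (read_iops * READ_HZ_PER_BIT + write_iops * WRITE_HZ_PER_BIT) * BITS_PER_IO
    return hz / 1e9

# Example: a host carrying 20,000 read + 10,000 write 32 KB IOPS (roughly 70/30),
# out of the ~115 GHz nominally available per host (2x 16 cores @ 3.6 GHz).
print(f"{vsan_cpu_ghz(20_000, 10_000):.1f} GHz")  # ~31 GHz
```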
Observation 2 - Between 10% and 30% CPU is consumed due to TCP network traffic
vSAN is a hyper-converged (Compute, Storage, Network) software-defined storage that stripes data across ESXi hosts, thus heavily leveraging a standard Ethernet network and TCP/IP to transport storage data across vSAN nodes (ESXi hosts). vSAN RAID (Redundant Array of Independent Disks) is actually RAIN (Redundant Array of Independent Nodes); therefore, the network is highly utilized during heavy storage load. You can see the numbers in the test results above.
Since I planned, designed, and implemented vSAN on Cisco UCS infrastructure with 100 Gb networking (partitioned into 2x32Gb FCoE, 2x10Gb Ethernet, 2x10Gb Ethernet, 2x50Gb Ethernet), RDMA over Converged Ethernet (RoCE) would be great to use to decrease CPU requirements and further improve latency and I/O response time. RoCE v2 is supported on vSphere 8.0 U3 for my network interface card, Cisco VIC 15230 (nenic driver version 2.0.11), but Cisco is not listed among the vendors supporting vSAN over RDMA. I will ask somebody at Cisco why, and whether they have something on the roadmap.
9 comments:
Could you provide additional information to enhance my understanding:
Networking: Details on network topology, vSAN traffic configuration, and the rationale behind the active/standby vmknic setup would be valuable.
Storage: Clarification on the impact of NVMe capacity discrepancies and information about the vSAN cache tier configuration is needed.
Benchmark Configuration: More specifics on HCIBench workload profiles, the reasoning behind VM distribution, and the seemingly low IOPS limit would provide better context.
vSAN Configuration: Information on deduplication settings, vSAN version, and vSphere version is missing.
Monitoring Tools: Details on any additional performance monitoring or analysis tools being used alongside HCIBench would be beneficial.
Some observations and suggestions for further analysis:
The results demonstrate impressive performance, especially for random read operations (721,317 IOPS for 32KB 100% read).
Your CPU usage analysis is particularly insightful, breaking down the costs of storage operations and network traffic.
Consider including smaller I/O sizes (e.g., 4KB, 8KB) to represent common real-world scenarios.
It would be beneficial to see tests with varying queue depths to understand performance under different concurrency levels.
Adding long-duration tests could reveal any potential performance degradation over time.
Including failure scenario tests (e.g., disk, network, or node failures) would demonstrate the system's resilience.
Specific application workload patterns (e.g., OLTP, VDI) could make the results more relatable to real-world use cases.
Exploring the impact of deduplication on performance would be valuable for environments considering this feature.
Scalability tests with different cluster sizes could provide insights into the solution's growth potential.
The network topology is dictated (constrained) by the Cisco UCS blade system.
We have UCS Fabric Interconnect A (Fabric A) and Fabric Interconnect B (Fabric B). Each fabric (A and B) is a high-performance multi-25 Gb fabric (2x2x25Gb from each ESXi/UCS server). More details (architecture diagrams) were added to the blog post.
vSAN traffic configuration should be visible from the architecture diagrams added to the blog post.
The active/standby vmknic for vSAN was chosen because vSAN does not support multitenancy, and to optimize vSAN traffic, keep it within a particular fabric, and avoid crossing the upstream Ethernet switches. In a non-degraded state, vSAN traffic stays within Fabric Interconnect A (Fabric A). The second fabric is used only in case of Fabric A maintenance or a failure of some component within Fabric A.
In the future, if the customer would like to use NSX, NSX traffic can be pinned to Fabric B.
Such a design, IMHO, helps with traffic engineering and troubleshooting.
vSphere version is 8.0 U3 - VMware ESXi 8.0 U3 (8.0.3 Build: 24280767)
Blog post updated.
HCIBench workload profiles and VM distribution are described in the blog post.
>>> 18 test VMs (8x data vDisk, 2 workers per vDisk) evenly distributed across the 6-node vSAN Cluster
>>> 3 VMs per ESXi host
>>> fio target storage latency 2.5 ms (2,500 us)
>>> storage profiles are documented for each test case
For example:
32KB IO, 100% read, 100% random
Test Case Name: fio-8vmdk-90ws-32k-100rdpct-100randompct-2500lt-1732885897
The seemingly low IOPS limit? Do you mean the 5,000 IOPS limit per vDisk?
This is what the infrastructure is designed for. In production we have 3 storage policies:
Tier 1 - 15,000 IOPS
Tier 2 - 10,000 IOPS
Tier 3 - 5,000 IOPS
HCIBench deploys 18 VMs each with 8 data disks. That means 144 vDisks.
144 vDisks @ 5000 IOPS = 720,000 IOPS and that's what we see as HCIBench performance result.
That's enough. The goal is not to get maximum IOPS or throughput, but the expected (required) performance (IOPS) and throughput (MB/s) at a sustainable network and CPU load.
There are only two storage speeds.
Good enough and not good enough.
720,000 IOPS and ~20 GB/s throughput is IMHO good enough ;-)
And it seems that RDMA can add another 10, 20, or 30% performance/throughput at the same CPU load, or decrease CPU load for the same storage load.
Q: Information on deduplication settings
A: vSAN ESA does not support deduplication. It is in the roadmap though.
Q: vSAN version, and vSphere version is missing.
A: vSphere version is 8.0 U3 - VMware ESXi 8.0 U3 (8.0.3 Build: 24280767). Blog post updated.
Q: Monitoring Tools: Details on any additional performance monitoring or analysis tools being used alongside HCIBench would be beneficial.
A: I used classic vCenter real-time performance monitoring. It is pretty good as it collects metrics in 20-second samples. However, it only covers the last hour of data. We also have vROps (Aria Operations), so I observed some longer (more than an hour) tests there.
Q: Consider including smaller I/O sizes (e.g., 4KB, 8KB) to represent common real-world scenarios.
A: Yes, testing smaller I/O sizes (4KB, 8KB, 32KB) would be interesting; however, I had limited time for testing and the system is already in production, so I cannot play with it anymore.
Q: Adding long-duration tests could reveal any potential performance degradation over time.
A: Actually, I ran some tests for 60 minutes with a 10-minute warmup.
The whole test suite (6 tests) took 6x70 minutes (420 minutes, 7 hours), and I haven't seen any performance degradation over time.
Q: It would be beneficial to see tests with varying queue depths to understand performance under different concurrency levels.
A: Do you mean the number of outstanding I/Os pushed by the storage workers? Well, that would just increase the storage load from a single worker. HCIBench can do it in a distributed way, which is IMHO a better approach.
Q: Including failure scenario tests (e.g., disk, network, or node failures) would demonstrate the system's resilience.
A: Absolutely, but there was no time for such detailed testing. There was a business push to hand over the new infrastructure to the customer ASAP ;-) I was happy to get a few days for design validation tests.
Q: Specific application workload patterns (e.g., OLTP, VDI) could make the results more relatable to real-world use cases.
A: Absolutely; however, as I have already mentioned, there was no time for more tests. That's why we use enterprise-ready infrastructure (like VMware vSAN), where we trust that such testing was done by the vendor, and our testing is just to understand how the system behaves so we are not surprised during system operation in production.
Q: Exploring the impact of deduplication on performance would be valuable for environments considering this feature.
A: vSAN ESA does not support deduplication. It is in the roadmap though.
Q: Scalability tests with different cluster sizes could provide insights into the solution's growth potential.
A: Hmm. The cost of a single ESXi host's hardware is ~$50,000. You need a significant budget to test different cluster sizes. We do not have such a budget. We will add two additional nodes to the 6-node vSAN cluster very soon (in 1 or 2 months), but I will not be able to perform tests on the production system. You cannot test everything. That's why rules of thumb are important for technical designers.
Your performance testing on VMware vSAN ESA is impressive and provides valuable insights, but there are some areas to consider for future testing:
1. Include smaller I/O sizes (e.g., 4KB, 8KB) and application-specific workloads (e.g., OLTP, VDI) to better reflect real-world scenarios.
2. Conduct long-duration tests (>24 hours) to identify potential performance degradation over time.
3. Simulate failure scenarios (e.g., disk, network, or node failures) to evaluate resilience and recovery behavior.
4. Explore advanced network configurations like RDMA over Converged Ethernet (RoCE) to reduce CPU overhead and improve latency.
5. Test deduplication when supported and analyze the specific impact of compression on performance.
6. Perform scalability tests with varying cluster sizes to understand growth potential and performance trade-offs.
7. Vary queue depths to analyze concurrency impacts and test the overhead of enabling encryption (data-at-rest and in-transit).
8. Leverage advanced monitoring tools like VMware Aria Operations for deeper insights into long-term metrics.
These additions could further enhance the comprehensiveness of your testing and provide even more actionable insights for production environments!