Wednesday, July 30, 2025

vSAN ESA RAID5 issue?

I observed unexpected behavior in my vSAN ESA cluster. I have a 6-node vSAN ESA cluster and a VM with a Storage Policy configured for RAID-5 (Erasure Coding). Based on the cluster size, I would expect a 4+1 stripe configuration. However, the system was using 2+1 striping, which normally applies to clusters with only 3 to 5 nodes.

RAID-5 (2+1) striping consumes 150% of the written data in raw storage (1.5x overhead)

RAID-5 (4+1) striping consumes 125% of the written data in raw storage (1.25x overhead)

A difference of 25 percentage points is worth investigating.
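To sanity-check these numbers, here is a minimal sketch in plain Python (nothing vSAN-specific) that derives the raw-capacity overhead of each erasure coding scheme from its data/parity layout:

# Raw-capacity overhead of an erasure coding scheme:
# (data + parity) / data = raw storage consumed per unit of data written.

def ec_overhead(data: int, parity: int) -> float:
    """Return the raw/usable capacity factor for a data+parity stripe."""
    return (data + parity) / data

for name, data, parity in [("RAID-5 2+1", 2, 1),
                           ("RAID-5 4+1", 4, 1),
                           ("RAID-6 4+2", 4, 2)]:
    print(f"{name}: {ec_overhead(data, parity):.0%} of written data in raw storage")

# Output:
# RAID-5 2+1: 150% of written data in raw storage
# RAID-5 4+1: 125% of written data in raw storage
# RAID-6 4+2: 150% of written data in raw storage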

Screenshots from my environment are attached below:

6-Node vSAN ESA Cluster
 

Storage Policy with RAID-5 Erasure Coding
 

VM with RAID-5 Storage Policy on a 6-node vSAN Cluster. Why 2+1 and not 4+1?

Is there something I’m misunderstanding or doing incorrectly?

No, I wasn't doing anything incorrectly, and here is the explanation of what happened.

I had one host in long-term maintenance mode, and the RAID-5 objects were proactively rebuilt from 4+1 to 2+1, because that is how vSAN ESA works on a 5-node cluster. This is expected behavior, and I was fine with it.
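As a simplified mental model of this adaptive RAID-5 behavior, the stripe width is derived from the number of available hosts. The sketch below is my own illustration of the documented rule, not VMware's actual code; the roughly 24-hour re-evaluation interval matches what I observed:

def esa_raid5_layout(available_hosts: int) -> str:
    # Simplified model of vSAN ESA adaptive RAID-5 stripe selection.
    # Documented rule: 3-5 hosts -> 2+1, 6 or more hosts -> 4+1.
    # vSAN re-evaluates the cluster size periodically (roughly every
    # 24 hours), which explains the delay described below.
    if available_hosts >= 6:
        return "4+1"
    if available_hosts >= 3:
        return "2+1"
    raise ValueError("RAID-5 requires at least 3 available hosts")

print(esa_raid5_layout(5))  # 2+1 (one host in maintenance)
print(esa_raid5_layout(6))  # 4+1 (after the re-evaluation)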

When the 6th host was added back to the cluster, it took 24 hours for vSAN to rebuild the objects back to 4+1. I just did not wait long enough :-) I expected it would take a while and checked the status after 22 hours, but that was not enough.

After 24 hours, the vSAN object striping was 4+1, as depicted in the screenshot below.

The problem solved itself after 24 hours

Conclusion

The issue resolved itself after approximately 24 hours. It's important to be aware of this behavior, as it can impact capacity planning and design decisions for a 6-node vSAN ESA cluster.

I’m planning to scale out to a 7-node vSAN cluster soon, which will enable the use of RAID-6 with a consistent 4+2 erasure coding scheme. Even so, documenting this kind of adaptive RAID-5 behavior in 6-node configurations could be valuable for other VMware users relying on vSAN ESA and similar storage policies.
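For capacity planning, the overhead factor translates directly into usable capacity. A back-of-the-envelope sketch, assuming a hypothetical 10 TB of raw capacity per node (the per-node figure is invented for illustration, and slack space, metadata, and other real-world reservations are ignored):

RAW_TB_PER_NODE = 10  # hypothetical per-node raw capacity, for illustration only

def usable_tb(nodes: int, data: int, parity: int) -> float:
    # Usable capacity if all raw space were consumed by one erasure
    # coding scheme (ignores slack space, metadata, and reservations).
    return nodes * RAW_TB_PER_NODE * data / (data + parity)

print(f"6 nodes, RAID-5 2+1: {usable_tb(6, 2, 1):.0f} TB usable")   # 40 TB
print(f"6 nodes, RAID-5 4+1: {usable_tb(6, 4, 1):.0f} TB usable")   # 48 TB
print(f"7 nodes, RAID-6 4+2: {usable_tb(7, 4, 2):.1f} TB usable")   # 46.7 TB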

What still confuses me is that new vSAN objects created during those 24 hours were also provisioned with the 2+1 layout, even though the host had already exited long-term maintenance mode.
