Showing posts with label vSAN. Show all posts

Wednesday, January 13, 2016

Don't use 4K Native drives for VMware vSphere ESXi or VSAN

First of all, let's be absolutely clear. Disks with a 4K sector size are not currently supported by VMware. See VMware KB: Support statement for 512e and 4K Native drives for VMware vSphere and VSAN (2091600).

UPDATE: vSphere 6.5 and VSAN 6.5 introduced 512e support, so drives with 4K physical sectors and 512-byte emulation (512e) are supported. In other words, 4K native drives without 512e are still not supported.

UPDATE 2018-04-18: vSphere 6.7 introduced support for 4K native drives.

IMPORTANT STATEMENTS FROM KB

Does current GA version of vSphere and VSAN support 4K Native drives?
No. 4K Native drives are not supported in current GA releases of vSphere and VSAN.

Does current GA version of vSphere and VSAN support 512e drives?
No. 512e drives are not supported with the current versions of vSphere and VSAN due to potential performance issues when using these drives. 

Therefore only 512n (native) drives are supported on any ESXi (5.x, 6.x) at the moment.
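
Taken together with the updates at the top of this post, the support statements can be summarized as a small lookup. The following is only a minimal sketch in Python: the function name and data layout are hypothetical, and the table encodes nothing beyond what this post states (512n only before 6.5, 512e from 6.5, 4Kn from 6.7). It ignores any per-device or per-feature restrictions, so always validate against the KB.

```python
# Sector formats supported per vSphere release, as stated in this post.
# Entries are ordered from newest to oldest minimum version.
SUPPORT_MATRIX = [
    ((6, 7), {"512n", "512e", "4Kn"}),
    ((6, 5), {"512n", "512e"}),
    ((0, 0), {"512n"}),
]


def supported_formats(version):
    """Return the sector formats supported by a vSphere version such as '6.5'."""
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    for minimum, formats in SUPPORT_MATRIX:
        if major_minor >= minimum:
            return formats
    return set()


for release in ("5.5", "6.0", "6.5", "6.7"):
    print(release, sorted(supported_formats(release)))
```

With such a lookup, the customer's ESXi 5.5 host in the story below resolves to 512n only, which is why the 4Kn and 512e disks are a problem there.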

It is usually not a big deal on shared storage systems (aka disk arrays) because logical volumes are virtualized and the sector size is usually 512 by default or configurable (512 or 4K).

I thought it was the same with RAID controllers because virtual volumes are also emulated by the RAID controller. However, I have recently learned that this is not true, at least not for all RAID controllers. For example, the DELL PERC H730 (the best RAID controller DELL currently offers) doesn't allow you to choose the sector size for a virtual volume. Instead, the sector size is passed from the physical disks to the operating system, which is the ESXi hypervisor in our case.

Here is one real customer story with 4K native disks. 

The customer was not able to create a datastore on some disks. The error message was ...
Call "HostDatastoreSystem.QueryVmfsDatastoreCreateOptions" for object "ha-datastoresystem" on ESXi "esxi-test" failed.  
The error message is depicted on the screenshot below.



Server Model: PowerEdge R530 – System Revision I
Operating System: VMware ESXi 5.5.0 build-3343343
BIOS Version: 1.5.4
Lifecycle Controller Firmware: 2.21.21.21
RAID Controller: Perc H730Mini 

RAID1 (SYSTEM)
Physical Disk 0:1:0 Online 0 278.88 GB Not Capable SAS HDD No
Physical Disk 0:1:1 Online 1 278.88 GB Not Capable SAS HDD No
DATASTORE – successfully created during ESXi installation  – OK

RAID1 (DATA)
Physical Disk 0:1:2 Online 2 558.38 GB Not Capable SAS HDD No
Physical Disk 0:1:3 Online 3 558.38 GB Not Capable SAS HDD No
DATASTORE – Datastore cannot be created because 4K - FATAL ISSUE

RAID5 (DATA)
Physical Disk 0:1:4 Online 4 1862.50 GB Not Capable SAS HDD No
Physical Disk 0:1:5 Online 5 1862.50 GB Not Capable SAS HDD No
Physical Disk 0:1:6 Online 6 1862.50 GB Not Capable SAS HDD No
DATASTORE – datastore can be created but not officially supported by VMware because 512e – RISK

RAID Controller list of disks

T17: C0:PD   Flags    State Type Size          S N F P Vendor   Product          Rev  P C ID SAS Addr         Port Phy DevH WU BFw  BRev
T17: C0:------------------------------------------------------------------------------------------------------------------------------
T17: C0:0    f1400005 00020 00   22ecb25b      0 0 0 1 TOSHIBA  AL13SXB30EN      DK02 0 0 500003969802a076 03   04  0010   1  NA   NA - 512b (512 Native)
T17: C0:1    f1400005 00020 00   22ecb25b      0 0 0 1 TOSHIBA  AL13SXB30EN      DK02 0 0 500003969802a062 00   00  000a   1  NA   NA - 512b (512 Native)
T17: C0:2    f1400005 00020 00   8bba5f5       0 0 0 1 HGST     HUC156060CS4204  EK11 0 0 5000cca059596e59 05   06  000e   1  NA   NA - 4kn (4k Native)
T17: C0:3    f1400005 00020 00   8bba5f5       0 0 0 1 HGST     HUC156060CS4204  EK11 0 0 5000cca0595aa1cd 02   02  000c   1  NA   NA - 4kn (4k Native)
T17: C0:4    f1400005 00020 00   e8e088af      0 0 0 1 SEAGATE  ST2000NX0273     NS28 0 0 5000c5008f3a8efd 04   05  000d   1  NA   NA - 512e (512 Emulation)
T17: C0:5    f1400005 00020 00   e8e088af      0 0 0 1 SEAGATE  ST2000NX0273     NS28 0 0 5000c5008f3ae545 01   01  000b   1  NA   NA - 512e (512 Emulation)
T17: C0:6    f1400005 00020 00   e8e088af      0 0 0 1 SEAGATE  ST2000NX0273     NS28 0 0 5000c5008f3b0661 06   07  000f   1  NA   NA - 512e (512 Emulation)
T17: C0:20   01400005 00020 0d   0             0 0 0 0 DP       BP13G+           2.23 0 0 524180704c645200 00   08  0009   0  NA   NA

T17: C0:100  00400005 00020 03   0             0 0 0 0 LSI      SMP/SGPIO/SEP    4402 0 0                0 00   ff  ffff   0  NA   NA
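
The Size column in the listing above is hexadecimal, and it only makes sense once the logical block size is known. A minimal sketch, assuming the field holds the last addressable logical block (this is my interpretation of the log, not something the controller documentation in this post confirms) and taking the block sizes from the annotations at the end of each line:

```python
# (hex value from the Size column, logical block size in bytes from the annotation)
DISKS = {
    "TOSHIBA AL13SXB30EN": ("22ecb25b", 512),
    "HGST HUC156060CS4204": ("8bba5f5", 4096),
    "SEAGATE ST2000NX0273": ("e8e088af", 512),
}


def capacity_bytes(size_hex, block_size):
    """Capacity in bytes, assuming size_hex is the last logical block address."""
    # Assumption: block addresses start at 0, so the block count is value + 1.
    return (int(size_hex, 16) + 1) * block_size


for model, (size_hex, block_size) in DISKS.items():
    gigabytes = capacity_bytes(size_hex, block_size) / 10**9
    print(f"{model}: {gigabytes:.1f} GB with {block_size}-byte logical blocks")
```

Under this assumption the three models come out at roughly 300 GB, 600 GB and 2 TB, which fits the drive classes listed earlier. Reading the HGST value with 512-byte blocks instead would give only about 75 GB, which illustrates that the 4Kn disks really present 4096-byte logical blocks to the host.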

CONCLUSION AND  LESSONS LEARNED
It is obvious that VMware will support 4K disks sometime in the future because the industry is moving there, but if you are planning to use directly attached disks, choose disks with 512-byte sectors. It is an ESXi limitation at the moment. VMware VSAN is also impacted by this limitation because VSAN relies on ESXi.

Update 2016-01-21:
I have just received the following question from a reader ...
"How can I list physical disks connected to internal PERC8?"
Unfortunately, I don't have access to any 13G Dell server, but I did some research and there should be four available methods.

Method 1/ racadm
If you have a DRAC (Dell Remote Access Controller), you can leverage racadm.
Based on the racadm documentation, it should be possible via the following attribute:
  • Storage.PhysicalDisk.BlockSizeInBytes (Read Only)
    Description: This property indicates the logical block size of the physical drive that this virtual disk belongs to.
    Legal Values: 512 or 4096
Method 2/ Export PERC Raid Controller Log with Dell Support Live Image Version 2.0

http://de.community.dell.com/techcenter/support-services/w/wiki/369.export-perc-raid-controller-log-with-dell-support-live-image-version-2-0-englisch

Method 3/ perccli
Dell has a utility called perccli. You can check the perccli documentation for all details, but there is a command for viewing physical drive details for the specified slot in the controller.

  • Syntax is perccli /c0/e32/s4 show all

The downside of this method is that perccli binaries exist just for Windows and Linux, so you cannot use it directly from ESXi and you have to boot, for example, a Linux live CD.

Method 4/ megacli (not supported)

The fourth method is leveraging the LSI megacli utility. Dell PERC is manufactured by LSI, so it should work. LSI has a megacli VIB for ESXi, but it is not officially supported by VMware or Dell.
See details at http://de.community.dell.com/techcenter/support-services/w/wiki/909.how-to-install-megacli-on-esxi-5-x




Saturday, March 14, 2015

VMware Virtual SAN Diagnostics and Troubleshooting Reference Manual

Cormac Hogan, VMware's well-known storage evangelist, wrote and published another VMware VSAN related document. Well, it is a book of almost 300 pages. And the nice thing is that this document/book/manual is publicly available for free.

Snip from the document's Introduction chapter ...
VMware’s Virtual SAN is designed to be simple: simple to configure, and simple to operate. This simplicity masks a sophisticated and powerful storage product. The purpose of this document is to fully illustrate how Virtual SAN works behind the scenes: whether this is needed in the context of problem solving, or just to more fully understand its inner workings.
Here is the link ... http://www.vmware.com/files/pdf/products/vsan/VSAN-Troubleshooting-Reference-Manual.pdf

So if you want to know VSAN details for diagnostics and troubleshooting, you have to read it.

Monday, February 02, 2015

vSphere 6 Announcements

Below is a brief transcript of VMware vSphere 6 related announcements. The list of new features may not be complete because I have noted just the features that are important and interesting for me as a vSphere Architect designing datacenter infrastructures.

Disclaimer: I'm not responsible for any errors and inaccuracies in the transcript below.


vSphere 6 New Features

  • vSphere HA (High Availability) Cluster supports up to 64 hosts
  • vSphere FT (Fault Tolerance) supports up to 4 vCPUs
  • VM supports up to 128 vCPU/4TB vRAM
  • vMotion across vCenters
  • Long Distance vMotion (should work up to 100 ms of round trip time)
  • VVOLs released
  • NFS 4.1 support with multipathing and Kerberos authentication
  • Up to 2x increase in concurrent vCenter operations
  • 10x faster vCenter operations
  • vCenter Server Appliance supports 1,000 ESXi hosts and 10,000 VMs
  • vSphere Web Client 5x faster
  • Platform Services Controller (PSC) introduced. SSO and SSL certificate management are subcomponents of PSC.

VSAN 6 New Features

  • All flash architecture supported
  • Limits increased: 64 hosts, 200 VMs per host, 32 snapshots per VM, vDisk up to 62 TB
  • Rack Awareness (Fault Domains)
  • Health Checks

Cloud New Features

  • VMware Integrated OpenStack (VIO) - vSphere 6 very tightly integrated with the OpenStack cloud management layer
  • OpenStack fully supported by VMware and included in support fees

You can check out VMware Online Announcement recording at http://bcove.me/m0amsphc

List of other What's new blog posts I found very useful ...

Other vExpert's vSphere 6 related blog posts ...

vExperts participating in vSphere beta program wrote lot of blog posts about various vSphere 6 topics. All these blog posts are aggregated at 

Warning: I strongly believe that all bloggers are doing a great job, but don't trust everything written on the internet, and validate any information against official VMware documentation.

Please, let me know if I missed or misunderstood something important.

Monday, January 26, 2015

DELL 13G servers with PERC H730 finally certified for VSAN

I'm reading and learning about VMware's VSAN a lot. I really believe there will be a lot of use cases in the future for software-defined distributed storage. However, I don't see VSAN momentum right now because of several factors. The three most obvious factors are mentioned below:

  • Maturity
  • TCO
  • Single point of support, compared with the support offered by traditional SAN-based storage vendors

That's the reason I haven't had the chance or time to play with VMware VSAN so far, but I'm getting a lot of questions from colleagues, DELL partners, customers and folks from the VMware community about the right DELL storage controller for VSAN which can be used on the latest DELL server generation.

DELL's 13th server generation was unveiled on September 8, 2014. Since then, there has not been any DELL storage controller for DELL 13G servers officially supported by VMware for VSAN.
Today I got the information that the DELL PERC H730 is officially supported by DELL and VMware for VSAN. For more information look here.
This is really great news for VSAN early adopters planning to use DELL servers. One little piece of advice to all VSAN enthusiasts ... If you are not going to use officially supported VSAN nodes or an EVO:RAIL appliance and you are designing your own VSAN cluster, do it very carefully. Don't forget to do a PoC before or during the design phase, and perform design and operational validation tests (aka test plan) before putting VSAN into real production. Be sure you know something about the queue depth of adapters (AQLEN) and disks (DQLEN).

If you build your own software-defined storage, then you are the storage architect, with a somewhat higher risk and responsibility in comparison to a classic storage system (this is my opinion). That's the risk of any modern (aka emerging) technology before it becomes a commodity. On the other hand, this can be your added value to your customers, and there is no doubt there are some benefits.

But never forget why "data centers" are so important and business critical: because we usually keep very valuable data there, which must always be available with reasonable performance. Think about 99.999% storage uptime with a reasonable response time (3-20 ms) for the expected IOPS workload.
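
To put the 99.999% figure into perspective, it helps to convert it into allowed downtime. A minimal sketch in Python, assuming a 365-day year and a hypothetical helper name:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def allowed_downtime_minutes(availability_percent):
    """Minutes of downtime per year permitted by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


print(f"99.999% -> {allowed_downtime_minutes(99.999):.2f} minutes per year")
```

Five nines therefore leaves only a little more than five minutes of storage downtime per year, which is a demanding target for a storage cluster you design and validate yourself.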

I wish everybody a lot of success with hyper-converged systems like VSAN. Please leave a comment with your (hopefully) success stories and use cases. And I'm still looking forward to my first VSAN project  :-)

Friday, May 16, 2014

Unable to unmount ESX datastore

I've just been notified about an annoying problem by a customer for whom I did a vSphere 5.5 design. It was not possible to unmount the datastore. The ESX logs contained something similar to the message below.
Cannot unmount volume 'Datastore Name: vm3:xxx VMFS uuid: 517c9950-10f30962-931f-00304830a1ea' because file system is busy. Correct the problem and retry the operation.
There is a KB about this symptom. The VSAN component VSANTRACE was using the datastore. That was the reason for the busy file system. It was a pretty annoying issue as VSAN was neither used nor enabled.

The solution is to disable the vsantraced service, so it is necessary to issue the following command on every ESX ...
chkconfig vsantraced off 
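
Running this by hand on many hosts is tedious, so here is a minimal dry-run sketch in Python that only prints the SSH command lines for a list of hosts. The host names and the helper function are hypothetical, SSH access as root is an assumption, and stopping the running daemon through its init script before disabling it is my addition (the chkconfig line is the only part taken from this post), so verify it against the KB before use.

```python
import shlex

# Hypothetical host names; replace them with your own ESXi hosts.
ESXI_HOSTS = ["esxi-01.example.com", "esxi-02.example.com"]

# The chkconfig command is the one from this post. Stopping the running
# daemon first via its init script is an assumption; check the KB.
REMOTE_COMMANDS = [
    "/etc/init.d/vsantraced stop",
    "chkconfig vsantraced off",
]


def build_ssh_command(host, user="root"):
    """Build the ssh argument list that runs all remote commands on one host."""
    return ["ssh", f"{user}@{host}", " && ".join(REMOTE_COMMANDS)]


# Dry run: print the commands instead of executing them.
for host in ESXI_HOSTS:
    print(shlex.join(build_ssh_command(host)))
```

Printing the commands first makes it easy to review them; once they look right, each argument list could be passed to subprocess.run.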

Not so nice, right? That's the downside of fully integrating VSAN software into the general ESX hypervisor. I'm not happy with this approach. In my opinion, it would be much better to distribute VSAN as additional software installed as a regular VIB (vSphere Installation Bundle).