Wednesday, September 18, 2013

Open Manage Essentials Network Connection (tcp/udp ports)

I was recently engaged to implement the Datacenter version of DELL OME (OpenManage Essentials). DELL OME is a quite easy and efficient tool for basic DELL hardware management. In other words, it is a free-of-charge element system management tool for DELL servers, network, and also some storage elements. It allows you to do typical administrator tasks like
  • Hardware Discovery and Inventory
  • Monitor Hardware Status
  • Send email Notification or trigger SNMP trap to another system management
  • Inventory and System Reporting
  • In-band (OMSA) or Out-of-band (DRAC) Server Firmware Management - upgrades and downgrades
It is important to note that DELL OME is not an enterprise management system like Altiris, MS System Center, and so on. For customers considering integrating DELL hardware into some enterprise management system, it is very likely DELL has an integration toolkit or management plugin for that particular system. But that's another story.

DELL OME is straightforward to install in a small environment, but it is usually more complex to implement in a bigger enterprise environment where firewalls with strict policies exist. In such environments you have to cooperate tightly with network departments to create firewall rules allowing communication between the OME server and hardware elements.

Unfortunately, the OME User Guide doesn't describe network connections in detail. TCP/UDP ports are listed, but for firewall rules you need to know the detailed network flows and flow directions.

That's the reason I created the document "Open Manage Essentials Network Connection and useful information for creating firewall rules" and published it on SlideShare here.

Direct link to the document:
http://www.slideshare.net/davidpasek/ome-network-connections-and-firewall-rules-v04
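
To hand the network department something actionable, the flow list can be turned into a firewall-rule checklist. Below is a minimal sketch in shell; the OME server address 10.0.0.10 is hypothetical and the four flows are only illustrative examples of typical OME traffic (SNMP, WS-Man, OMSA web). Always take the authoritative port list and directions from the document above.

```shell
#!/bin/sh
# Generate a human-readable firewall-rule checklist from a simple flow table.
# "out" = OME server -> managed nodes, "in" = managed nodes -> OME server.
OME="10.0.0.10"   # hypothetical OME server address

gen_rules() {
  # flow table format: direction protocol port description
  while read -r dir proto port desc; do
    case "$dir" in
      out) echo "allow $proto from $OME to managed-nodes dport $port  # $desc" ;;
      in)  echo "allow $proto from managed-nodes to $OME dport $port  # $desc" ;;
    esac
  done <<EOF
out udp 161 SNMP discovery and inventory
in udp 162 SNMP traps from managed nodes
out tcp 443 WS-Man to iDRAC and ESXi
out tcp 1311 OMSA web services
EOF
}

gen_rules
```

The same table can later be translated mechanically into the syntax of whatever firewall the network team operates.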

And as always ... any comments are highly appreciated.

Tuesday, September 17, 2013

DELL Force10 configuration for VMware VXLAN transportation

Right now I'm working on a vSphere design where network virtualization is leveraged to simplify network management and provide segmentation of multiple tenants. Therefore I have been testing VXLANs in my lab. I have the equipment listed below:
  • 1x DELL Blade Chassis  M1000e
  • 2x DELL Force10 IOA (IO Aggregators - blade chassis network modules)
  • 2x DELL Force10 S4810 as top of the rack switches
  • 1x DELL Force10 S60 acting as physical router (L3 switch)
  • 1x DELL EqualLogic storage PS-4110 (iSCSI storage module inside Blade Chassis)

Here are the VXLAN physical switch requirements:
  • Minimum MTU size requirement is 1600; however, we will use maximum Jumbo Frames across the physical network
  • IGMP snooping should be enabled on L2 switches, to which VXLAN participating hosts are attached.
  • IGMP Querier enabled on router or L3 switch with connectivity to the multicast enabled networks.
  • Multicast routing (PIM-SM) must be enabled on routers.
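
The 1600-byte minimum in the first requirement comes straight from the VXLAN encapsulation overhead. A quick back-of-the-envelope sketch (standard header sizes; the outer Ethernet header itself does not count against the MTU):

```shell
#!/bin/sh
# VXLAN adds outer IP/UDP/VXLAN headers around the whole inner Ethernet frame.
INNER_PAYLOAD=1500   # standard inner payload
INNER_ETH=14         # inner Ethernet header
INNER_VLAN=4         # inner 802.1Q tag, if the VM traffic is tagged
OUTER_IP=20          # outer IPv4 header
OUTER_UDP=8          # outer UDP header
VXLAN_HDR=8          # VXLAN header

NEEDED_MTU=$((INNER_PAYLOAD + INNER_ETH + INNER_VLAN + OUTER_IP + OUTER_UDP + VXLAN_HDR))
echo "Minimum required MTU: $NEEDED_MTU bytes"   # 1554; 1600 adds headroom
```

So roughly 1554 bytes are needed on the transport network, and VMware's 1600 recommendation simply adds headroom.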

Force10 switches are by default configured to allow Jumbo Frames. However, physical interfaces, VLAN interfaces, and port-channels have to be configured explicitly. 
On Force10 S-series switches the interface MTU can be set up to 12000. In Cisco Nexus environments the max MTU is 9216.

Force10 IOA (I/O Aggregator) is by default set to MTU 12000, so it is already prepared for VXLAN and nothing has to be configured.

Let's assume we use VLAN 14 for VXLAN transport.

Router config (Force10 S60)
config
igmp snooping enable
ip multicast-routing
interface vlan 14
  ip pim sparse-mode
  mtu 12000
  tagged gigabitethernet 0/46-47
  exit
! For all interfaces in VLAN 14 we have to set MTU at least 1600
interface range gigabitethernet 0/46 - 47
  mtu 12000
  end
Switch config (Force10 S4810)
! IGMP snooping must be enabled
conf
ip igmp snooping enable
interface vlan 14
  mtu 12000
  exit
interface range tengigabitethernet 0/46 , tengigabitethernet 0/48 - 51 , fortyGigE 0/56 , fortyGigE 0/60
  mtu 12000
  exit
interface range port-channel 1 , port-channel 128
  mtu 12000
  exit
end
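
After the switch configuration is in place, the end-to-end path MTU can be verified with a do-not-fragment ping from an ESXi host. A small sketch computing the right ICMP payload size (the IP and ICMP headers are subtracted from the MTU under test); 10.0.14.1 is a hypothetical vmkernel address in VLAN 14:

```shell
#!/bin/sh
# To exercise a target MTU with ping, subtract the 20B IP header and the
# 8B ICMP header from it to get the ICMP payload size.
TARGET_MTU=1600   # the VXLAN minimum; use 9000 etc. to test jumbo frames
PAYLOAD=$((TARGET_MTU - 20 - 8))

# Print the command to run on the ESXi host (-d = don't fragment, -s = size)
echo "vmkping -d -s $PAYLOAD 10.0.14.1"
```

If the ping with the do-not-fragment bit fails, some hop in the path still has a smaller MTU configured.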
IO Aggregator (Force10 IOA)
Force10 IOA default configuration already has the maximum MTU in factory settings, so it is VXLAN ready and no changes are required.

Here are Force10 IOA default values:
mtu 12000
ip mtu 11982
igmp snooping enabled
Check out these excellent blog articles for more details on VXLAN theory and implementation:

VXLAN requirements
http://www.yellow-bricks.com/2012/10/04/vxlan-requirements/

VXLAN on UCS and vSphere: from L3 to Nexus 1000V
http://vmtrooper.com/vxlan-on-ucs-and-vsphere-from-l3-to-nexus-1000v/

Adjusting MTU and Configuring Jumbo Frame Settings
http://www.force10networks.com/CSPortal20/TechTips/0008_mtu-settings.aspx

UPDATE 2015-02-02:
I have a multicast router enabled on my VLAN 14 (see the configuration of Force10 S60), therefore it works as an IGMP querier. However, if you need to run a VXLAN overlay over a network without a multicast router, you should configure an IGMP querier on the particular VLAN; otherwise multicast traffic will be flooded into the whole broadcast domain (VLAN). The IGMP querier can be configured with the following command:
ip igmp snooping querier 



Sunday, September 15, 2013

Default credentials and initial setup of VMware vSphere components

vCenter Server Appliance
Username: root
Password: vmware

vShield Manager
Username:admin
Password: default

Initial setup:
  1. Log in to console to use CLI
  2. enable
  3. setup (it will start setup wizard where you can set network settings of vShield Manager appliance)
  4. Log out from console
  5. Log in to web management https://A.B.C.D/ (A.B.C.D is the address of the vShield Manager appliance; use default credentials)
  6. Continue configuration in web management.

High latency on vSphere datastore backed by NFS

Last week one of my customers experienced high latency on a vSphere datastore backed by an NFS mount. Generally, the usual root cause of high latency is too few disk spindles used for a particular datastore, but that was not the case here.

NFS datastore for vSphere
Although NFS was always understood as a lower storage tier, VMware and NFS vendors have been working very hard on NFS improvements in recent years. Another plus for NFS nowadays is that 10Gb Ethernet is already a commodity, which helps NFS significantly because it doesn't support multi-pathing (aka MPIO) as FC or iSCSI do. On the other hand, it is obvious that NFS is another abstract storage layer for vSphere, and other details like the NFS client implementation, Ethernet/IP queue management, QoS, and so on can impact the whole solution. Therefore when someone tells me NFS for vSphere, I'm always cautious. Don't get me wrong, I really like abstractions, layering, unification, and simplification, but they must not have any influence on stability and performance.

I don't want to discuss advantages and disadvantages of particular protocols as it depends on particular environment requirements and what someone wants to achieve. By the way, I have recently prepared one particular design decision protocol comparison for another customer here, so you can check it out and comment on it there.

Here in this case the customer had a really good reason to use NFS, but the latency issue was a potential show stopper.

I have to say that I also had a bad NFS experience back in 2010 when I was designing and implementing Vblock0 for one customer. Vblock0 used EMC Celerra, therefore NFS or iSCSI were the only options. NFS was the better choice because of the Celerra iSCSI implementation (that's another topic). We were not able to decrease disk response times below 30ms, so in the end NFS (EMC Celerra) was used as Tier 3 storage and the customer bought another block storage (EMC Clariion) for Tier 1. That is history, because I was implementing the new vSphere 4.1 and SIOC had just been introduced, without broad knowledge about SIOC benefits, especially for NFS.

A lot of things have changed with NFS since then, so that's just one piece of history and field experience from one engagement. Let's go back to today's high latency problem on NFS and the troubleshooting steps we did with this customer.

TROUBLESHOOTING

Environment overview
Customer has vSphere 5.0 (Enterprise Plus) Update 2 patched to the latest versions (ESXi build 1254542).
NFS storage is NetApp FAS with the latest ONTAP version (NetApp Release 8.2P2 7-Mode).
Compute is based on CISCO UCS and networking on top of UCS is based on Nexus 5500.

Step 1/ Check SIOC or MaxQueueDepth
I told the customer about the known NFS latency issue documented in KB article 2016122 and broadly discussed in Cormac Hogan's blog post here. Based on community and my own experience I have a hypothesis that the problem is not related only to NetApp storage but is most probably an ESXi NFS client issue. This is just my opinion without any proof.

Activating SIOC or setting /NFS/MaxQueueDepth to 64 is the workaround documented in the KB article mentioned earlier. Therefore I asked them if SIOC was enabled as we discussed during the Plan & Design phase. The answer was yes, it is.

Hmm. Strange.

Step 2/ NetApp firmware
Yes, this customer has a NetApp filer, and in the KB article there is an updated comment that the latest NetApp firmware solves this issue. The customer has the latest 8.2 firmware which should fix the issue. But it evidently doesn't help.

Hmm. Strange.

Step 3/ Open support case with NetApp and VMware
I suggested opening a support case and continuing with troubleshooting in parallel.

I don't know why, but customers in the Czech Republic are ashamed to use the support line. I don't know why, when they are paying a significant amount of money for it. But it is how it is, and even this customer didn't engage VMware or NetApp support and continued with troubleshooting. OK, I understand we can solve everything by ourselves, but why not ask for help? That's more a social than a technical question, and I would like to know if this administrator behavior is a global habit or some special habit here in Central Europe. Don't be shy and speak out in the comments even about this more social subject.

Step 4/ Go deeper in SIOC troubleshooting

Check if storageRM (Storage Resource Management) is running
/etc/init.d/storageRM status
Enable advanced logging in Software Advanced Settings -> Misc -> Misc.SIOControlLogLevel = 7
The default value is 0 and 7 is the max value.

Customer found strange log message in "/var/log/storagerm.log"
Open /vmfs/volumes/ /.iorm.sf/slotsfile (0x10000042, 0x0) failed: permission denied 
There is no VMware KB for it, but Frank Denneman blogged about it here.

So the customer is experiencing the same issue as Frank in his lab.

The solution is to change the *nix file permissions as Frank was instructed by VMware Engineering (that's the beauty of having direct access to engineering) ...

chmod 755 /vmfs/volumes/DATASTORE/.iorm.sf/slotsfile

The change takes effect immediately and you can check it in "/var/log/storagerm.log":
...
DATASTORE: read 2406 KB in 249 ops, wrote 865 KB in 244 ops avgReadLatency
1.85, avgWriteLatency 1.42, avgLatency 1.64 iops = 116.59, throughput =
773.65 KBps
...
Advanced logging can be disabled in Software Advanced Settings -> Misc -> Misc.SIOControlLogLevel = 0

After this the normalized latency is between 5-7 ms, which is quite normal.

Incident solved ... waiting for other incidents :-)

Problem management continues ...

Lessons learned from this case
SIOC is an excellent VMware technology helping with datastore-wide performance fairness. In this example it helped us significantly with dynamic queue management, improving NFS response times.

However, even in excellent technology there can be bugs ...

SIOC can be leveraged only by customers having Enterprise Plus licenses.

Customers with lower licenses have to use a static queue value (/NFS/MaxQueueDepth) of 64 or even less, based on response times. BTW, the default max NFS queue depth value is 4294967295. I understand NFS.MaxQueueDepth as the Disk.SchedNumReqOutstanding for block devices. The default value of Disk.SchedNumReqOutstanding is 32, helping with sharing LUN queues which usually have a queue depth of 256. It is OK for usual situations, but if you have more disk-intensive VMs per LUN, this parameter can be tuned. This is where SIOC helps us with dynamic queue management, even across ESX hosts sharing the same device (LUN, datastore).
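
For reference, here is a sketch of checking and applying the static /NFS/MaxQueueDepth workaround from the ESXi shell. The option path follows KB 2016122; verify it against your ESXi version before use.

```
# check the current value
esxcli system settings advanced list -o /NFS/MaxQueueDepth

# apply the static workaround from KB 2016122
esxcli system settings advanced set -o /NFS/MaxQueueDepth -i 64
```

On some ESXi versions a host reboot may be required for the new value to take effect.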

For a deep-dive Disk.SchedNumReqOutstanding explanation I suggest reading Jason Boche's blog post here.

Static queue management brings significant operational overhead and maybe other issues we don't know about right now. So go with SIOC if you can; if you have an enterprise environment, consider upgrading to Enterprise Plus. If you still have response time issues, troubleshoot SIOC to verify it does what it is supposed to do.

Anyway, it would be nice if VMware could improve NFS behavior. SIOC is just one of two workarounds we can use to mitigate the risk of high-latency NFS datastores.

The customer unfortunately didn't engage the VMware Global Support Organization, therefore nobody in VMware knows about this issue and cannot write a new KB article or update an existing one. I'll try to make some social network noise to help highlight the problem.

Friday, September 13, 2013

Troubleshooting Storage Performance in vSphere

A very good blog post series introducing storage performance troubleshooting in VMware vSphere infrastructures.

Part 1 - The Basics
Part 2 - Troubleshooting Storage Performance in vSphere
Part 3 - SSD Performance

Everybody should read these storage basics before deep diving into storage performance in shared infrastructures.

Wednesday, September 11, 2013

NFS or Fibre Channel Storage for VMware vSphere?

The final decision depends on what you want to get from your storage. Check out my newly uploaded presentation on SlideShare: http://www.slideshare.net/davidpasek/design-decision-nfsversusfcstorage-v03 where I try to compare both options against specific requirements from a real customer engagement.

If you have any storage preference, experience or question please feel free to speak up in the comments.

What type of NIC teaming, loadbalancing and physical switch configuration to use for VMware's VXLAN?

As a former CISCO UCS Architect I have been observing the VXLAN initiative for almost 2 years, so I was looking forward to doing a real customer project. Finally it is here. I'm working on a vSphere design for vCloud Director (vCD). To be honest, I'm responsible just for the vSphere design and someone else is doing the vCD design, because I'm not a vCD expert and I have just conceptual and high-level vCD knowledge. I'm not planning to change that in the near future because I'm more focused on next generation infrastructure, and vCD is in my opinion just another software for selling IaaS. I'm not saying it is not important. It is actually very important, because IaaS is not just technology but a business process. However, nobody knows everything and I leave some work for other architects :-)

We all know that vCD sits on top of vSphere, providing multi-tenancy and other IaaS constructs, and since vCD 5.1 the network multi-tenancy segmentation is done by the VXLAN network overlay. Therefore I finally have the opportunity to plan, design, and implement VXLANs for a real customer.

Right now I'm designing the network part of the vSphere architecture, and I describe the VXLAN oriented design decision point below.

VMware VXLAN Information sources:
I would like to thank Duncan for his blog post back in October 2012, right before the Barcelona VMworld 2012 where VXLANs were officially introduced by VMware. Even though it is an unofficial information source it is very informative, and I'm verifying it against official VMware documentation and white papers. Unfortunately, I have realized that there is a lack of trustworthy and publicly available technical information even today, and some information is contradictory. See below what confusion I'm facing; I would be very happy if someone helped me jump out of the circle.

Design decision point:
What type of NIC teaming, loadbalancing and physical switch configuration to use for VMware's VXLAN?

Requirements:
  • R1: Fully supported solution
  • R2: vSphere 5.1 and vCloud Director 5.1
  • R3: VMware vCloud Network & Security (aka vCNS or vShield) with VMware distributed virtual switch
  • R4: Network Virtualization and multi-tenant segmentation with VXLAN network overlay 
  • R5: Leverage standard access datacenter switches like CISCO Nexus 5000, Force10 S4810, etc.
Constraints:
  • C1: LACP 5-tuple hash algorithm is not available on current standard access datacenter physical switches mentioned in requirement R5
  • C2: VMware Virtual Port ID loadbalancing is not supported with VXLAN Source: S3
  • C3: VMware LBT loadbalancing is not supported with VXLAN Source: S3
  • C4: LACP must be used with 5-tuple hash algorithm Source: S3, S2, S1 on Page 48. [THIS IS STRANGE CONSTRAINT, WHY IT IS HASH DEPENDENT?] Updated 2013-09-11: It looks like there is a bug in VMware documentation and KB Article. Thanks @DuncanYB and @fojta for confirmation and internal VMware escalations.
Available Options:
  • Option 1: Virtual Port ID
  • Option 2: Load based Teaming
  • Option 3: LACP
  • Option 4: Explicit fail-over

Option comparison:
  • Option 1: not supported because of C1
  • Option 2: not supported because of C2
  • Option 3: supported
  • Option 4: supported but not optimal because only one NIC is used for network traffic. 
Design decision and justification:
Based on available information, options 3 and 4 comply with the requirements and constraints. Option 3 is better because network traffic is load balanced across physical NICs. That's not the case for option 4.

Other alternatives not compliant with all requirements:
  • Alt 1: Use physical switches with 5-tuple hash loadbalancing. That means high-end switch models like Nexus 7000, Force10 E Series, etc.
  • Alt 2: Use CISCO Nexus 1000V with VXLAN. They support LACP with any hash algorithm. 5-tuple hash is also recommended but not strictly required.
Conclusion:
I hope some information in constraints C2, C3, and C4 is wrong and will be clarified by VMware. I'll tweet this blog post to some VMware experts and hope someone will help me jump out of the decision circle.
If you have any official/unofficial topic related information or you see anything where I'm wrong, please feel free to speak up in the comments.
Updated 2013-09-11: Constraint C4 doesn't exist and the VMware doc will be updated.
Based on updated information, LACP and "Explicit fail-over" teaming/load-balancing are supported for VXLANs. LACP is the better way to go, and "Explicit fail-over" is an alternative in case LACP is not achievable in your environment.

Tuesday, September 10, 2013

Storage System Performance Analysis with Iometer

An excellent write-up about Iometer usage is here.

Quick troubleshooting of ESX and 10Gb Broadcom NeXtreme II negotiated only to 1Gb

I have just realized that the vmnic(s) in one DELL blade server M620 (let's call it BLADE1) are connected only at 1Gb speed, even though I have 10Gb NIC(s) connected to Force10 IOA blade module(s). It should be connected at 10Gb, and another blade (let's call it BLADE2) with the same config really is connected at 10Gb speed.

So quick troubleshooting ... we have to find where the difference is.

Let's go step by step ...

  1. NIC ports on ESX vSwitch in BLADE1 are configured to  use auto negotiation so no problem here
  2. Ports on Force 10 IOA are also configured for auto negotiation and configuration is consistent across all ports in switch modules so that's not a problem.
  3. ESX builds are the same on both blade servers.
  4. What about the NIC firmware? On BLADE1 there is 7.2.14 and on BLADE2 7.6.15.

Bingo!!! Let's upgrade the NIC firmware on BLADE1 and check if this was the root cause of the problem ...
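
For steps 1 and 4 above, the ESXi shell can show both the negotiated speed and the driver details in one place. A sketch, assuming vmnic0 is the NIC in question:

```
# negotiated link speed, duplex and driver per vmnic
esxcli network nic list

# driver details for one vmnic (on older builds, ethtool -i vmnic0 is similar)
esxcli network nic get -n vmnic0
```

Comparing this output between BLADE1 and BLADE2 quickly reveals firmware/driver mismatches.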

Monday, September 09, 2013

Using SSL certificates for VMware vSphere Components

Streamlining the certificate replacement and management process in a VMware environment can be challenging at times. For instance, changing certificates for vCenter 5.1 is a hugely laborious process. And in a typical environment where there is a large number of hosts running, tracking and managing their certificates is difficult and time consuming. More importantly, security breaches due to lapsed certificates can prove to be very expensive to the organization. vCert Manager from VSS Labs provides fully automated management of SSL certificates in a VMware environment across the entire lifecycle.

VSS Labs has solution to simplify SSL management. For more info look at http://vsslabs.com/vCert.html

To be honest, I have had no chance to test it because I avoid signed SSL certificates if possible. However, when I have a customer who requires SSL, I will definitely have to evaluate the VSS Labs solution.

Wednesday, September 04, 2013

OpenManage Integration for VMware vCenter 2.0

OpenManage Integration for VMware vCenter 2.0 is the new generation of the DELL vCenter Management Plugin, targeted as a plugin for the vSphere 5.5 Web Client.



Looking forward to test it with vSphere 5.5 in my lab.

Monday, September 02, 2013

Configure Force10 S4810 for SNMP

Enabling SNMP on Force10 S4810 switches is straightforward. Below is a configuration sample.

conf

! Enable SNMP for read only access
snmp-server community public ro

! Enable SNMP traps and send them to SNMP receiver 192.168.12.70
snmp-server host 192.168.12.70 version 1
snmp-server enable traps
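
To verify the read-only community from a management station, an snmpwalk against the switch is the quickest check. This assumes the net-snmp tools are installed and uses a hypothetical switch address:

```
snmpwalk -v 1 -c public 192.168.12.1 system
```

If the community or an ACL is wrong, the walk simply times out, which is a useful first data point before blaming the monitoring system.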

Configuring Dell EqualLogic management interface

All credits go to Mike Poulson because he published this procedure back in 2011.
[Source: http://www.mikepoulson.com/2011/06/configuring-dell-equallogic-management.html]

I have just rewritten, reformatted, and slightly changed the most important steps for the EqualLogic out-of-band interface IP configuration.

The Dell EqualLogic iSCSI SAN supports an out-of-band management network interface. This is for managing the device from a separate network than the iSCSI traffic is on. So this is a quick set of commands that are used to configure the management (in this case eth2) interface on the device.

The web interface is nice and all, but you have to have your 10Gig network set up before you can access it. Also the "setup" command does not really give you an easy option to configure the management interface.

Steps:
Login to Console Port with grpadmin username and grpadmin password.

After you run setup you will need to know the "member name". You can get your member name by running the command
member show
This will list the name, status, version, and size information for each member configured on the array. Here is an example:

grpname> member show
Name Status Version Disks Capacity FreeSpace Connections
---------- ------- ---------- ----- ---------- ---------- -----------
member01 online V4.3.6 (R1 16 621.53GB 0MB 0
grpname>


The member name for my device is member01.

Once you know the member name you will need to set the IP address for your management interface. This IP address will need to be one that you can access from your management network. The port is an untagged port similar to other out-of-band management ports on devices (network switches).

To configure the IP use steps described below.

  • First set the interface to be management ONLY. Use the member command again.
member select member01 eth select 2 mgmt-only enable
  • Set the IP address and Network Mask
member select member01 eth select 2 ipaddress xxx.xxx.xxx.xxx netmask 255.255.255.0
  • Enable the interface (by default the MGMT (eth2) interface is disabled and will not provide a LINK).
member select member01 eth select 2 up
  • Then you will be asked to confirm that you wish to enable the Management port
This port is for group management only. If enabling, make sure it is connected to a dedicated management network. If disabling, make sure you can access the group through another Ethernet port.
 

Do you really want to enable the management interface? (y/n) [n] y
  • To view current IP and state of an Eth interface use
member select member01 show eths

Once that is complete you can use the management IP address to establish an http or https connection to the Array.

Veeam Backup Components Requirements

Veeam is excellent backup software for virtualized environments. Veeam is relatively easy to install and use. However, when you have a bigger environment and are looking for better backup performance, it is really important to know the infrastructure requirements and size your backup infrastructure appropriately.

Here are hardware requirements for particular Veeam components.
 
Veeam Console
  Windows Server 2008 R2
  4GB RAM + 0.5GB per concurrent backup job

Veeam Proxy
  Windows Server 2008 R2
  2GB RAM + 0.2GB per concurrent task

Veeam WAN Accelerator
  Windows Server 2008 R2
  8GB RAM
  Disk for cache

Veeam Repository
  Windows Server 2008 R2, Linux or NAS (CIFS) 
  4GB RAM + 2GB per each concurrent ingress backup job
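
The per-component rules above turn directly into a sizing calculation. A small sketch in shell; the workload numbers (8 concurrent jobs, 16 concurrent proxy tasks) are made up purely for illustration:

```shell
#!/bin/sh
# RAM sizing from the rules above. Shell arithmetic is integer only,
# so values are kept in tenths of a GB (x10) and formatted at the end.
JOBS=8     # concurrent backup jobs (illustrative)
TASKS=16   # concurrent proxy tasks (illustrative)

CONSOLE_RAM_X10=$(( 40 + 5 * JOBS ))    # Console: 4GB + 0.5GB per job
PROXY_RAM_X10=$(( 20 + 2 * TASKS ))     # Proxy:   2GB + 0.2GB per task
REPO_RAM_X10=$(( 40 + 20 * JOBS ))      # Repo:    4GB + 2GB per ingress job

echo "Console: $((CONSOLE_RAM_X10 / 10)).$((CONSOLE_RAM_X10 % 10)) GB"
echo "Proxy:   $((PROXY_RAM_X10 / 10)).$((PROXY_RAM_X10 % 10)) GB"
echo "Repo:    $((REPO_RAM_X10 / 10)).$((REPO_RAM_X10 % 10)) GB"
```

Note how quickly the repository requirement grows with the number of concurrent jobs; this is usually the component to size first.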
 

Saturday, August 31, 2013

DELL Force10 S6000 as a physical switch for VMware NSX

Based on this document http://www.vmware.com/files/pdf/products/nsx/vmw-nsx-dell-systems.pdf
the DELL Force10 S6000 is going to be fully integrated with VMware NSX (NSX is a software defined networking platform).

Dell Networking provides:
  • Data center switches for robust underlays for L2 overlays
  • CLI for virtual and physical networks
  • Network management and automation with Active Fabric Manager
  • S6000 Data Center Switch Gateway for physical workloads to connect to virtual networks
  • Complete end-to-end solutions that include server, storage, network, security, management and services with world wide support
Dell S6000 use cases:
  • Extend virtual networks to physical servers -  S6000 works as VXLAN gateway to VLANs on physical network (VXLAN VTEP).
  • Connect physical workloads reachable on a specific VLAN to logical networks via an L2 service
  • Connect physical workloads reachable on a specific port to logical networks via an L2 service
  • Connect to physical workloads in a Physical to virtual migration
  • Migration from existing virtualized environments to public clouds, creating hybrid clouds
  • Access physical router, firewall, load balancer, WAN optimization and other network resources

I cannot wait to test it in my lab or on customer PoC engagement. After hands-on experience I'll share it on this blog. 

Tuesday, August 27, 2013

What’s New in vSphere 5.5

In this article I'll try to collect all the important (at least for me) vSphere 5.5 news and improvements announced at VMworld 2013. I wasn't there, so I rely on other blog posts and VMware materials.

Julian Wood reported about vCloud Suite 5.5 news announced at VMworld 2013 at
http://www.wooditwork.com/2013/08/26/whats-new-vcloud-suite-5-5-introduction/

Chris Wahl wrote deep dive blog posts into vSphere 5.5 improvements at
http://wahlnetwork.com/category/deep-dives/5-5-vsphere-improvements/

Cormac Hogan listed storage improvements in vSphere 5.5 at
http://blogs.vmware.com/vsphere/2013/08/whats-new-in-vsphere-5-5-storage.html?utm_source=twitterfeed&utm_medium=linkedin&goback=%2Egde_3217230_member_269944857#%21 

Thanks Julian, Chris, and Cormac for the excellent blog posts keeping those of us who were not able to attend VMworld 2013 informed.

BTW: Official VMware What's New paper is at http://www.vmware.com/files/pdf/vsphere/VMware-vSphere-Platform-Whats-New.pdf 

Here are a few citations with my comments from the above blog posts. I'll mention just the improvements which are important and/or interesting for me. I will concentrate on these topics, and in the near future I have to find and test more hidden details.
  1. Management: VMware is strongly recommending using a single VM for all vCenter Server core components (SSO, Web Client, Inventory Service and vCenter Server) or to use the appliance rather than splitting things out which just add complexity and makes it harder to upgrade in the future. << "This is excellent approach and I really like it."
     
  2. Management: The vCenter Appliance has also been beefed up and with its embedded database supports 300 hosts and 3000 VMs or if you use an external Oracle DB the supported hosts and VMs are the same as for Windows. << "Finally"
     
  3. Storage: vSphere 5.5 now supports VMDK disks larger than 2TB. Disks can be created up to 63.36TB in size on both VMFS and NFS. The max disk size needs to be about 1% less than the datastore file size limit. << "The last vSphere storage limit disappeared; however, how big datastores will we create?"
     
  4. Storage: vSphere Flash Read Cache leverages local SSDs to eliminate read IO operations from datastores and save storage performance (IOPS) for other purposes (writes, other workloads, etc.). For more info look at http://wahlnetwork.com/2013/08/26/vsphere-5-5-improvements-part-5-vsphere-flash-read-cache-vflash/ or http://www.yellow-bricks.com/2013/08/26/introduction-to-vsphere-flash-read-cache-aka-vflash/ << "Sounds good but pernixdata.com looks better."
     
  5. Storage: vSphere vSAN leverages SSD and SATA server internal disks and forms them into a shared storage pool. VMware promised it is much better than VSA (VMware Storage Appliance). For more info look at http://wahlnetwork.com/2013/08/26/vsphere-5-5-improvements-part-4-virtual-san-vsan/ << "We will see. Have you tested VSA? I still believe real storage is real storage. At least for now. However, if someone considers vSAN I would recommend investing in really good server disks and SSDs."
     
  6. Storage: PDL AutoRemove in vSphere 5.5 automatically removes a device with PDL from the host. PDL stands for Permanent Device Lost and is received from the storage array as a SCSI sense code. << "It would be beneficial when some storage admin removes an empty LUN. Then nothing should be done on vSphere in case the storage sends the appropriate SCSI sense code. MUST BE CAREFULLY TESTED!!!"
     
  7. Networking: LACP in 5.5 gives you over 22 load balancing algorithms and you are now able to create 32 LAGs per host, so you can bond together all those physical NICs. << "Finally, Nexus 1000v had it already from the beginning."
     
  8. Networking: Flow based marking and filtering provides granular traffic marking and filtering capabilities from a simple UI integrated with VDS UI. You can provide stateless filtering to secure or control VM or Hypervisor traffic. Any traffic that requires specific QoS treatment on physical networks can now be granularly marked with COS and DSCP marking at the vNIC or Port group level. << "Nice improvement, but I have never had such requirement so far."
     
  9. High Availability: Someone mentioned to me that VMware announced vSphere 5.5 Multi-processor Fault Tolerance (FT) at VMworld 2013. << "This would be interesting but must be validated, as I cannot find any official statement or blog post about it. It seems to me it was a Fault Tolerance tech preview, like the VMworld 2012 session I attended last year."
     
  10. Authentication:  SSO 2.0 is now a multi-master model. Replication between SSO servers is automatic and built-in. SSO is now site aware. The SSO database is completely removed. For more info look at http://wahlnetwork.com/2013/08/26/vsphere-5-5-improvements-part-7-single-sign-on-completely-redesigned/  << "Finally, previous SSO 1.0 was a nightmare!!!"
     
  11. Disaster Recovery: VMware Replication (VR) now supports more VR Server Appliances responsible for replication, more point in time instances (aka snapshots), the ability to use Storage vMotion on protected VMs, and vSphere Web Client will show you details on your vSphere Replication status when you click on the vCenter object. For more info look at http://wahlnetwork.com/2013/08/26/vsphere-5-5-improvements-part-6-site-recovery-manager-srm-and-vsphere-replication/ << "Cool. Good evolution."

Sunday, August 25, 2013

DELL OpenManage Essentials (OME)

OpenManage Essentials (OME) is a systems management console that provides simple, basic Dell hardware management and is available as a free download.

DELL OME can be downloaded at https://marketing.dell.com/dtc/ome-software?dgc=SM&cid=259733&lid=4682968

Patch 1.2.1 downloadable at
http://www.dell.com/support/drivers/us/en/555/DriverDetails?driverId=P1D4C

For more information look at DELL Tech Center.

Data Center Bridging

DCB 4 key protocols:
  •  Priority-based Flow Control (PFC): IEEE 802.1Qbb
  •  Enhanced Transmission Selection (ETS): IEEE 802.1Qaz
  •  Congestion Notification (CN or QCN): IEEE 802.1Qau
  •  Data Center Bridging Capabilities Exchange Protocol (DCBx)
PFC - provides a link-level flow control mechanism that can be controlled independently for each frame priority. The goal of this mechanism is to ensure zero loss under congestion in DCB networks. PFC pauses traffic independently per priority and enables lossless packet buffers/queuing for a particular 802.1p CoS.
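
As an illustration, here is a minimal Python sketch of the per-priority pause behavior: pausing one 802.1p priority holds its queue while the other priorities keep transmitting. The `PfcPort` class and its queues are purely hypothetical, not any real switch API.

```python
from collections import deque

class PfcPort:
    """Toy model of Priority-based Flow Control (IEEE 802.1Qbb).

    One egress queue per 802.1p priority (0-7); a PFC pause frame stops
    transmission for the paused priorities only, while the remaining
    priorities keep draining.
    """

    def __init__(self):
        self.queues = {prio: deque() for prio in range(8)}
        self.paused = set()

    def enqueue(self, prio, frame):
        self.queues[prio].append(frame)

    def receive_pause(self, prio):
        """Peer signalled congestion for this priority."""
        self.paused.add(prio)

    def receive_resume(self, prio):
        self.paused.discard(prio)

    def transmit_round(self):
        """Dequeue one frame from every non-paused, non-empty priority."""
        sent = []
        for prio in range(8):
            if prio not in self.paused and self.queues[prio]:
                sent.append(self.queues[prio].popleft())
        return sent

port = PfcPort()
port.enqueue(3, "iSCSI-frame")   # lossless storage traffic on CoS 3
port.enqueue(0, "LAN-frame")
port.receive_pause(3)            # congestion: pause only priority 3
print(port.transmit_round())     # → ['LAN-frame']; CoS 3 is held, not dropped
```

Note that the paused frame is buffered, never dropped, which is exactly the lossless property DCB needs for storage traffic.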

ETS - provides a common management framework for assignment of bandwidth to frame priorities. Bandwidth can be dynamic based on congestion and relative ratios between defined flows. ETS provides minimum, guaranteed bandwidth allocation per traffic class/priority group during congestion and permits additional bandwidth allocation during non-congestion.
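
The allocation behavior described above can be sketched in Python. The `ets_allocate` function and its percent-based weights are illustrative assumptions, not a real DCB API: each priority group first gets its guaranteed share, and bandwidth left unused by under-subscribed groups is redistributed to groups that still have demand.

```python
def ets_allocate(link_bw, weights, demands):
    """Toy ETS (IEEE 802.1Qaz) bandwidth allocator.

    weights - guaranteed share per priority group, in percent (sums to 100)
    demands - offered load per group, in the same unit as link_bw
    """
    # Step 1: guaranteed minimum per group, capped by its actual demand.
    alloc = {g: min(demands[g], link_bw * w / 100) for g, w in weights.items()}
    spare = link_bw - sum(alloc.values())
    # Step 2: hand spare bandwidth to still-hungry groups, weight-proportionally.
    while spare > 1e-9:
        hungry = {g: w for g, w in weights.items() if demands[g] > alloc[g]}
        if not hungry:
            break
        total_w = sum(hungry.values())
        given = 0
        for g, w in hungry.items():
            extra = min(spare * w / total_w, demands[g] - alloc[g])
            alloc[g] += extra
            given += extra
        spare -= given
        if given < 1e-9:
            break
    return alloc

# 40Gb link; SAN is guaranteed 30% but may borrow what LAN does not use.
alloc = ets_allocate(40, {"lan": 50, "san": 30, "other": 20},
                     {"lan": 10, "san": 30, "other": 5})
print(alloc["san"])  # → 25.0 (guaranteed 12Gb plus 13Gb of unused bandwidth)
```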

CN - provides end-to-end congestion management for protocols that are capable of transmission rate limiting to avoid frame loss. It is expected to benefit protocols such as TCP that do have native congestion management, as CN reacts to congestion in a more timely manner. Excellent blog post about CN is here.

DCBX - a discovery and capability exchange protocol used to convey capabilities and configuration of the above features between neighbors, ensuring consistent configuration across the network. It performs discovery, configuration, and mismatch resolution using the Link Layer Discovery Protocol (IEEE 802.1AB - LLDP).

DCBX can be leveraged for many applications.
One DCBX application example is iSCSI application priority - support for the iSCSI protocol in the application priority DCBX Type-Length-Value (TLV). It advertises the priority value (IEEE 802.1p CoS, the PCP field in the VLAN tag) for the iSCSI protocol. End devices identify and tag Ethernet frames containing iSCSI data with this priority value.
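
To make the tagging concrete, here is a small Python sketch that packs an advertised iSCSI priority into the PCP field of an 802.1Q VLAN tag. The priority value 4 and VLAN 100 are just example inputs, not values mandated by the standard.

```python
import struct

ETH_P_8021Q = 0x8100  # 802.1Q TPID

def vlan_tag(pcp, vid, dei=0):
    """Build the 4-byte 802.1Q tag: TPID + TCI (PCP:3 | DEI:1 | VID:12)."""
    tci = (pcp & 0x7) << 13 | (dei & 0x1) << 12 | (vid & 0xFFF)
    return struct.pack("!HH", ETH_P_8021Q, tci)

def parse_pcp(tag):
    """Return the 802.1p priority (PCP) carried in a VLAN tag."""
    _tpid, tci = struct.unpack("!HH", tag)
    return tci >> 13

# Suppose the DCBX application-priority TLV advertised priority 4 for iSCSI;
# the end device then tags every iSCSI frame with PCP 4.
ISCSI_PRIORITY = 4
tag = vlan_tag(pcp=ISCSI_PRIORITY, vid=100)
print(tag.hex())       # → 81008064
print(parse_pcp(tag))  # → 4
```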

Friday, August 23, 2013

DELL Force10 I/O Aggregator 40Gb Port Question

Today I received a question about how to interconnect a DELL Force10 IOA 40Gb uplink with DELL Force10 S4810 top-of-rack switches.

I assume the reader is familiar with DELL Force10 datacenter networking portfolio.

Even if you have a 40Gb<->40Gb twinax cable with QSFPs between the IOA and the Force10 S4810 switch, the IOA side is by default configured as 4x10Gb links grouped in Port-Channel 128.

If you connect it directly into a 40Gb port on the Force10 S4810 switch, that port is by default configured as a 1x40Gb interface.

That’s the reason why it doesn’t work out-of-the-box. Port speeds are simply mismatched.

To make it work you have to change the 40Gb switch port to 4x10Gb mode. Here is the S4810 command to change the switch port from 1x40Gb to 4x10Gb:
stack-unit 0 port 48 portmode quad

Here is a snippet from the S4810 configuration where 40Gb port 0/48 is configured as 4x10Gb ports grouped in port-channel 128:
interface TenGigabitEthernet 0/48
no ip address
!
port-channel-protocol LACP
  port-channel 128 mode active
no shutdown
!
interface TenGigabitEthernet 0/49
no ip address
!
port-channel-protocol LACP
  port-channel 128 mode active
no shutdown
!
interface TenGigabitEthernet 0/50
no ip address
!
port-channel-protocol LACP
  port-channel 128 mode active
no shutdown
!
interface TenGigabitEthernet 0/51
no ip address
!
port-channel-protocol LACP
  port-channel 128 mode active
no shutdown

interface Port-channel 128
no ip address
portmode hybrid
switchport
no shutdown

Tuesday, August 20, 2013

Best Practices for Faster vSphere SDK Scripts

Reuben Stump published an excellent blog post at http://www.virtuin.com/2012/11/best-practices-for-faster-vsphere-sdk.html about performance optimization of Perl SDK scripts.

The main takeaway is to minimize the ManagedEntity's Property Set.

So instead of

my $vm_views = Vim::find_entity_views(view_type => "VirtualMachine") ||
  die "Failed to get VirtualMachines: $!";

you have to use

# Fetch all VirtualMachines from SDK, limiting the property set
my $vm_views = Vim::find_entity_views(view_type => "VirtualMachine",
          properties => ['name', 'runtime.host', 'datastore']) ||
  die "Failed to get VirtualMachines: $!";

This small improvement has a significant impact on performance because it eliminates the generation and transfer of large SOAP/XML payloads between the vCenter service and the SDK script.

It helped me improve the performance of my script from 25 seconds down to just 1 second, and the impact is even bigger in larger vSphere environments. My old version of the script was almost unusable, and this simple improvement helped a lot.
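
The same principle can be illustrated with a toy Python sketch. This mimics, very loosely, the response the server would otherwise have to build; it is not the real vSphere SOAP schema, and the property names and sizes are made-up stand-ins.

```python
# Toy illustration of why limiting the property set shrinks the payload
# the server must serialize and send over the wire.
import xml.etree.ElementTree as ET

def serialize(vm, properties=None):
    """Serialize only the requested properties of a VM record to XML."""
    root = ET.Element("VirtualMachine")
    for key, value in vm.items():
        if properties is None or key in properties:
            ET.SubElement(root, key).text = str(value)
    return ET.tostring(root)

vm = {
    "name": "vm01",
    "runtime.host": "esx01.example.com",  # hypothetical inventory names
    "datastore": "ds01",
    "config": "x" * 50_000,               # stands in for the huge config subtree
    "guest": "y" * 20_000,                # ... and the guest-info subtree
}

full = serialize(vm)                       # no property filter: everything
small = serialize(vm, properties=["name", "runtime.host", "datastore"])
print(len(full) // len(small))  # the full payload is hundreds of times bigger
```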

Thanks Reuben for sharing this information.