Thursday, June 27, 2013

Calculating optimal segment size and stripe size for storage LUN backing vSphere VMFS Datastore

A colleague of mine (BTW a very good storage expert) asked me what the best segment size is for a storage LUN used for a VMware vSphere Datastore (VMFS). Recommendations vary among storage vendors and models, but I think the basic principles are the same for any storage.

I found the explanation in the IBM Redbook [SOURCE: IBM RedBook redp-4609-01] the most descriptive, so here it is.
The term segment size refers to the amount of data that is written to one disk drive in an array before writing to the next disk drive in the array. For example, in a RAID 5, 4+1 array with a segment size of 128 KB, the first 128 KB of the LUN storage capacity is written to the first disk drive and the next 128 KB to the second disk drive. For a RAID 1, 2+2 array, 128 KB of an I/O is written to each of the two data disk drives and to the mirrors. If the I/O size is larger than the number of disk drives times 128 KB, this pattern repeats until the entire I/O is completed. For very large I/O requests, the optimal segment size for a RAID array is one that distributes a single host I/O across all data disk drives. 
The formula for optimal segment size is:
LUN segment size = LUN stripe width ÷ number of data disk drives 
For RAID 5, the number of data disk drives is equal to the number of disk drives in the array minus 1, for example:
RAID5, 4+1 with a 64 KB segment size = (5-1) * 64KB = 256 KB stripe width 
For RAID 1, the number of data disk drives is equal to the number of disk drives divided by 2, for example:
RAID 10, 2+2 with a 64 KB segment size = (2) * 64 KB = 128 KB stripe width 
For small I/O requests, the segment size must be large enough to minimize the number of segments (disk drives in the LUN) that must be accessed to satisfy the I/O request, that is, to minimize segment boundary crossings. 
For IOPS environments, set the segment size to 256KB or larger, so that the stripe width is at least as large as the median I/O size. 
IBM Best practice: For most implementations set the segment size of VMware data partitions to 256KB.
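To make the Redbook formula concrete, here is a small sketch (my own illustration, not from the Redbook) that computes the data stripe width from the segment size and the RAID layout; the disk counts match the examples above:

```python
# Sketch of the IBM formula: stripe width = segment size x number of data drives.
def stripe_width_kb(segment_kb, disks, raid):
    """Return the data stripe width in KB for a given RAID layout."""
    if raid == "raid5":
        data_disks = disks - 1        # one drive's worth of capacity holds parity
    elif raid == "raid10":
        data_disks = disks // 2       # half of the drives are mirrors
    else:
        raise ValueError("unsupported RAID level")
    return segment_kb * data_disks

print(stripe_width_kb(64, 5, "raid5"))    # RAID 5, 4+1  -> 256
print(stripe_width_kb(64, 4, "raid10"))   # RAID 10, 2+2 -> 128
```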

Note: If I am decoding IBM terminology correctly, then the IBM term "stripe width" is actually the "data stripe size". We need to clarify the terminology, because the term "stripe width" is normally used for the number of disks in a RAID group. The "data stripe size" is the payload without the parity; the parity is stored in additional segment(s) depending on the selected RAID level.

To make the terminology clear, I've created the RAID 5 (4+1) segment/stripe visualization depicted below.

RAID 5 (4+1) striping example

Even though I found this IBM description very informative, I'm not sure why they recommend a 256 KB segment size for VMware. It is true that the biggest I/O size issued from ESX can by default be 32 MB, because bigger I/Os issued from the guest OS are split by ESX into multiple I/Os (for more information about the big I/O split see this blog post). However, the most important factor is the I/O size issued by the guest OSes. If you want to monitor the max/average/median I/O size from ESX, you can use the vscsiStats tool already included in ESXi for exactly this purpose. It can show you a histogram, which is really cool (for more information about vscsiStats read this excellent blog post). Based on all these assumptions, and also on my own I/O size monitoring in the field, it seems to me that the average I/O size issued from ESX is usually somewhere between 32 and 64 KB. So let's use 64 KB as the average data stripe (I/O size issued from the OS). Then, for RAID 5 (4+1), the data stripe will be composed of 4 segments, and the optimal segment size in this particular case should be 16 KB (64/4).
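A minimal sketch of that calculation (the 64 KB average I/O size is my own assumption from field monitoring, as noted above):

```python
# Pick the segment size so one "average" guest-OS IO (e.g. measured with
# vscsiStats) spans all data drives in the RAID group.
def optimal_segment_kb(avg_io_kb, disks, raid5=True):
    data_disks = disks - 1 if raid5 else disks // 2
    return avg_io_kb / data_disks

print(optimal_segment_kb(64, 5))   # RAID 5 (4+1), 64KB average IO -> 16.0
```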

Am I right, or did I miss something? Any comments are welcome and highly appreciated.

Update 2014/01/31:
We discuss this topic quite frequently with my colleague, who works as a DELL storage specialist. The theory is nice, but only a real test can prove any theory. Recently he performed a set of IOmeter tests against a DELL PV MD3600f, which is actually the same array as the IBM DS3500. He found that optimal performance (number of IOPS versus response times) is achieved when the segment size is as close as possible to the I/O size issued by the operating system. So the key takeaway from this exercise is that the optimal segment size for the example above is not 16 KB but 64 KB. Now I understand IBM's general recommendation (best practice) to use a 256 KB segment size for VMware workloads, as this is the biggest segment size that can be chosen.

Update 2014/07/23:
After more thinking about this topic, I've realized that the idea of using a segment size bigger than your biggest I/O size can make sense for several reasons:

  • each I/O gets a single spindle (disk) to handle it, so it uses the queues down the route and is served within a single spindle's latency, which is the minimum for that single I/O, right?
  • a typical virtual infrastructure environment runs several VMs generating many I/Os, based on the queues available in the guest OS and the ESX disk scheduler settings (see more here on Duncan Epping's blog), so at the end of the day you are able to generate a lot of IOPS from different threads and the load is evenly distributed across the RAID group
However, please note that all of this discussion relates to legacy (traditional) storage architectures. Some modern (virtualized) storage arrays do some magic in their controllers, such as I/O coalescing. I/O coalescing is an I/O optimization that reorders smaller I/O writes into a bigger I/O in the controller cache and sends this bigger I/O down to the disks. This can significantly change segment size recommendations, so please try to understand the particular storage architecture, or follow the storage vendor's best practices and try to understand the reasoning behind these recommendations in your particular use case. I remember EMC CLARiiONs used I/O coalescing into 64 KB blocks. 


Wednesday, June 26, 2013

IOBlazer

IOBlazer is a multi-platform storage stack micro-benchmark. IOBlazer runs on Linux, Windows and OSX and it is capable of generating a highly customizable workload. Parameters like IO size and pattern, burstiness (number of outstanding IOs), burst interarrival time, read vs. write mix, buffered vs. direct IO, etc., can be configured independently. IOBlazer is also capable of playing back VSCSI traces captured using vscsiStats. The performance metrics reported are throughput (in terms of both IOPS and bytes/s) and IO latency.
IOBlazer evolved from a minimalist MS SQL Server emulator which focused solely on the IO component of that workload. The original tool had limited capabilities, as it was able to generate only a very specific workload based on the MS SQL Server IO model (Asynchronous, Un-buffered, Gather/Scatter). IOBlazer now has a far more generic IO model, but two limitations still remain:
  1. The alignment of memory accesses on 4 KB boundaries (i.e., a memory page)
  2. The alignment of disk accesses on 512 B boundaries (i.e., a disk sector).
Both limitations are required by the gather/scatter and un-buffered IO models.
A very useful new feature is the capability to playback VSCSI traces captured on VMware ESX through the vscsiStats utility. This allows IOBlazer to generate a synthetic workload absolutely identical to the disk activity of a Virtual Machine, ensuring 100% experiment repeatability.

TBD - TEST & WRITE REVIEW

PXE Manager for vCenter

PXE Manager for vCenter enables ESXi host state (firmware) management and provisioning. Specifically, it allows:
  • Automated provisioning of new ESXi hosts stateless and stateful (no ESX)
  • ESXi host state (firmware) backup, restore, and archiving with retention
  • ESXi builds repository management (stateless and stateful)
  • ESXi Patch management
  • Multi vCenter support
  • Multi network support with agents (Linux CentOS virtual appliance will be available later)
  • Wake on Lan
  • Hosts memtest
  • vCenter plugin
  • Deploy directly to VMware Cloud Director
  • Deploy to Cisco UCS blades
TBD - TEST & WRITE REVIEW

vBenchmark

vBenchmark provides a succinct set of metrics in these categories for your VMware virtualized private cloud. Additionally, if you choose to contribute your metrics to the community repository, vBenchmark also allows you to compare your metrics against those of comparable companies in your peer group. The data you submit is anonymized and encrypted for secure transmission.

Key Features:

  • Retrieves metrics across one or multiple vCenter servers
  • Allows inclusion or exclusion of hosts at the cluster level
  • Allows you to save queries and compare over time to measure changes as your environment evolves
  • Allows you to define your peer group by geographic region, industry and company size, to see how you stack up
TBD - TEST & WRITE REVIEW

Tuesday, June 25, 2013

How to create your own vSphere Performance Statistics Collector

StatsFeeder is a tool that enables performance metrics to be retrieved from vCenter and sent to multiple destinations, including 3rd-party systems. The goal of StatsFeeder is to make it easier to collect statistics in a scalable manner. The user specifies the statistics to be collected in an XML file, and StatsFeeder will collect and persist them. The default persistence mechanism is comma-separated values, but the user can extend it to persist the data in a variety of formats, including a standard relational database or a key-value store. StatsFeeder is written leveraging significant experience with the performance APIs, allowing the metrics to be retrieved in the most efficient manner possible.
The white paper StatsFeeder: An Extensible Statistics Collection Framework for Virtualized Environments can give you a better understanding of how it works and how to leverage it.




Monday, June 24, 2013

vCenter Single Sign-On Design Decision Point

When designing vSphere 5.1, you have to implement vCenter SSO. Therefore you have to make a design decision about which SSO mode to choose.

There are actually three available options

  1. Basic
  2. HA (not to be confused with vSphere HA)
  3. Multisite
Justin King wrote an excellent blog post about SSO here, and it is a worthy source of information for making the right design decision. I fully agree with Justin and recommend Basic SSO to my customers when possible. SSO server protection can be achieved by standard backup/restore methods, and SSO availability can be increased by vSphere HA. All these methods are well known and have been used for a long time.

You have to use Multisite SSO when vCenter Linked Mode is required, but think twice about whether you really need it and whether the benefits outweigh the drawbacks.

Thursday, June 20, 2013

Force10 Open Automation Guide - Configuration and Command Line Reference

This document describes the components and uses of the Open Automation Framework designed to run on the Force10 Operating System (FTOS), including:
• Smart Scripting
• Virtual Server Networking (VSN)
• Programmatic Management
• Web graphic user interface (GUI) and HTTP Server

http://www.force10networks.com/CSPortal20/KnowledgeBase/DOCUMENTATION/CLIConfig/FTOS/Automation_2.2.0_4-Mar-2013.pdf

Tuesday, June 18, 2013

How to – use vmkping to verify Jumbo Frames

Here is a nice blog post about Jumbo Frame configuration on vSphere and how to test that it works as expected. This is, BTW, an excellent test for operational verification (aka a test plan).
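As a quick reference, the usual check pings with the "do not fragment" bit set and an ICMP payload sized so the whole packet exactly fills a 9000-byte MTU. This is only a sketch: it prints the command to run on the ESXi shell, and the target IP is a hypothetical vmkernel peer.

```shell
# Jumbo frame verification sketch (run the printed command on an ESXi host).
MTU=9000
PAYLOAD=$((MTU - 28))   # subtract 20-byte IP header + 8-byte ICMP header -> 8972
echo vmkping -d -s "$PAYLOAD" 192.168.10.1   # -d sets "do not fragment"
```

If the ping fails with this payload but succeeds with a small one, some device in the path is not configured for jumbo frames end to end.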

Architectural Decisions

Josh Odgers, VMware Certified Design Expert (VCDX) #90, is continuously building a database of architectural decisions available at http://www.joshodgers.com/architectural-decisions/

It is a very nice example of one approach to architecture.
 

Monday, June 17, 2013

PowerCLI One-Liners to make your VMware environment rock out!

Christopher Kusek wrote an excellent blog post about useful PowerCLI scripts that fit on a single line. He calls them one-liners. These one-liners can significantly help you with daily vSphere administration. On top of that, you can very easily learn PowerCLI constructs just from reading them.

http://www.pkguild.com/2013/06/powercli-one-liners-to-make-your-vmware-environment-rock-out/

Tuesday, June 04, 2013

Software Defined Networking - SDN

SDN is another big topic in the modern virtualized datacenter, so it is worth understanding what it is and how it can help us solve real datacenter challenges.

Brad Hedlund's explanation "What is Network Virtualization"
http://bradhedlund.com/2013/05/28/what-is-network-virtualization/
Brad Hedlund is a very well-known networking expert. He now works for VMware | Nicira on the VMware NSX product, which should be the next network virtualization platform (aka network hypervisor). He is ex-Cisco and ex-DELL | Force10, so there is a big probability he fully understands what is going on.

It is obvious that "dynamic service insertion" is the most important thing in SDN. OpenFlow and Cisco vPath both try to do it, each in a different way: the same goal, but different approaches. Which is better? Who knows? The future and real experience will show us. Jason Edelman's blog post very nicely and clearly compares both approaches.
http://www.jedelman.com/1/post/2013/04/openflow-vpath-and-sdn.html

Cisco, as a long-term networking leader and pioneer, of course has its own vision of SDN. The Nexus 1000V and virtual network overlays play a pivotal role in Cisco's approach to software defined networks. A very nice explanation of the Cisco approach is available at
http://blogs.cisco.com/datacenter/nexus-1000v-and-virtual-network-overlays-play-pivotal-role-in-software-defined-networks/


Saturday, May 25, 2013

PernixData: New storage status quo is coming

Storage SMEs have known for ages that storage design begins with performance. Storage performance is usually much more important than capacity; one IOPS costs more money than one GB of storage. Flash disks, EFDs and SSDs have already changed the storage industry. But the magic, and the future, is in software. PernixData FVP (Flash Virtualization Platform) looks like a very intelligent, fully redundant and reliable cluster-aware software storage acceleration platform. It leverages any local flash devices to accelerate any back-end storage used for server virtualization. Right now only VMware vSphere is supported, but the solution is hypervisor agnostic and it is just a matter of time before it is ported to other server virtualization platforms like Hyper-V, Xen, or KVM.

PernixData sets an absolutely new standard of storage quality in the virtualized datacenter. If you have an issue with storage response time (latency), then look at PernixData FVP. But what impressed me is the future, because I believe the platform can be improved significantly and new functionality will come soon. I can imagine data compression and deduplication, data encryption, vendor-independent replication, cloning, snapshotting, etc.

So software-defined storage virtualization has just begun.

Happy journey PernixData.

For more information look at
http://www.pernixdata.com/
http://www.pernixdata.com/SFD3/

Wednesday, May 22, 2013

Magic Quadrant for General-Purpose Disk Arrays

http://www.gartner.com/technology/reprints.do?id=1-1ENAPKJ&ct=130325&st=sg

A pretty nice overview and comparison of storage vendors. Because I have the privilege to practically design, implement and work with many storage arrays, I can't agree with IBM's positioning and description. In the past I was also impressed by IBM storage products, but the reality is a little bit different. I have troubleshot several big issues with the IBM mid-range storage array IBM V7000 (Storwize) and also with the high-end IBM DS8700 (Shark).
  

Monday, May 20, 2013

Difference between SCSI-2 and SCSI-3 reservation

SCSI-3 reservations are persistent across SCSI bus resets and support multiple paths from a host to a disk. In contrast, only one host can use SCSI-2 reservations with one path. If the need arises to block access to a device because of data integrity concerns, only one host and one path remain active. The requirements for larger clusters, with multiple nodes reading and writing to storage in a controlled manner, make SCSI-2 reservations obsolete.

Info retrieved from:
http://sfdoccentral.symantec.com/sf/5.0/hpux/html/vcs_install/ch_vcs_install_iofence4.html

Thursday, May 16, 2013

Reduce vCenter DB size by deleting old events and tasks from the vCenter database


The vCenter MS-SQL database contains a stored procedure called cleanup_events_tasks_proc which deletes old data based on the event and task retention settings. vCenter retention settings can be set up in vCenter Settings through the vSphere Client, or changed directly in the database. Using the vSphere Client is recommended.


c:> "C:\Program Files\Microsoft SQL Server\90\Tools\Binn\OSQL.EXE" -S \SQLEXP_VIM -E
1> use VIM_VCDB
2> go
1> update vpx_parameter set value='' where name='event.maxAge'
2> update vpx_parameter set value='' where name='task.maxAge'
3> update vpx_parameter set value='true' where name='event.maxAgeEnabled'
4> update vpx_parameter set value='true' where name='task.maxAgeEnabled'
5> go
(1 row affected)
(1 row affected)
(1 row affected)
(1 row affected)
1> exec cleanup_events_tasks_proc
2> go
1> dbcc shrinkdatabase ('VIM_VCDB')
2> go
DbId   FileId      CurrentSize MinimumSize UsedPages   EstimatedPages
------ ----------- ----------- ----------- ----------- --------------
      5           1       81080         280       78776          78776
      5           2         128         128         128            128

(2 rows affected)
DBCC execution completed. If DBCC printed error messages, contact your system
administrator.
1> quit

Monday, April 29, 2013

DELL PowerConnect Time Configuration


Here is the procedure to set it up:

enable
configure
sntp unicast client enable
sntp server ntp.cesnet.cz
end

Here is how to verify:


console#show sntp configuration

Polling interval: 64 seconds
MD5 Authentication keys:
Authentication is not required for synchronization.
Trusted keys:
No trusted keys.
Unicast clients: Enable

Unicast servers:
Server          Key             Polling         Priority
---------       -----------     -----------     ----------
195.113.144.201 Disabled        Enabled         1
195.113.144.204 Disabled        Enabled         1

Here is how to check current time:


console#show clock

10:23:42 (UTC+0:00) Apr 29 2013
Time source is SNTP

Summary:
That's how to set the time on DELL PowerConnect switches. Please note that the time is in UTC+0:00, so if you want to localize your time you can use the "clock timezone" and "clock summer-time" commands in conf mode, but I don't like that. UTC time is better for troubleshooting.



Tuesday, April 16, 2013

Network Overlays vs. Network Virtualization

Scott Lowe published a very nice blog post (a philosophical reflection) titled "Network Overlays vs. Network Virtualization".

And this was my comment to his post ..

Scott, excellent write-up. As always. First of all, I absolutely agree that good definitions, terminology, and a conceptual view of each particular layer are fundamental to fully understanding any technology or system. Modern hardware infrastructure is complex, and the complexity is growing year over year.
Software programming has the same history. Who programs in assembler nowadays? Why have we used object-oriented programming for more than 20 years? The answer is ... to avoid complexity and keep control over system behavior. In software, the MVC model is often used; it stands for Model-View-Controller. The Model is a logical representation of something we want to run in software, the View is a simplified presentation of the model to the end user, and the Controller is the engine behind the scenes. The same concept applies to SDI (Software Defined Infrastructure), where SDN (Software Defined Networking) is another example of the same story.
VMware did an excellent job with infrastructure abstraction. Everything in VMware vSphere is an object, or better said a managed object, which has some properties and methods. So that is the Model. The vSphere Client, Web Client, vCLI and PowerCLI are different user interfaces into the system. So that is the View. And who is the Controller? The Controller is vCenter, because it orchestrates system behavior. The vCenter controller includes prepackaged behavior (out-of-the-box), but it can be extended by custom scripts and orchestrated externally, for example by vCenter Orchestrator. That's what I really love about VMware vSphere. And it was from the beginning architected to purely represent hardware infrastructure in software constructs.
Now back to network virtualization. In my opinion, a network overlay (for example VXLAN) is a mandatory component to abstract L2 from the physical switches and have it in software. The particular network overlay protocol must be implemented in a "network hypervisor", which is a software L2 switch. But the "network hypervisor" also has to implement other protocols and components to be classified as "network virtualization" and not just another software vSwitch.
As Scott already mentioned in his post, networking is not just L2 but also L3-7 network services, so all network services must be available before we can speak about full "network virtualization". Am I correct, Scott? And I feel the open question in this post is ... who is the controller of "network virtualization"? :-)  

Monday, April 15, 2013


How to get Managed Object Reference ID ( aka MoRef ) from vSphere

If you've ever scripted the vSphere infrastructure, you probably know that everything has a software representation, also known as a Managed Object. Each Managed Object has a unique identifier referred to as the Managed Object ID. Sometimes this Managed Object ID is needed.

In PowerCLI you can get it via following two lines
$VM = Get-VM -Name $VMName 
$VMMoref = $VM.ExtensionData.MoRef.Value
You can also use a Perl script leveraging the VMware vSphere Perl SDK to get the Managed Object ID of a particular virtual machine or datastore. If you need the MOID of another entity, it's pretty easy to slightly change the script.

The script was developed and tested on vMA (VMware Management Assistant) in the directory /usr/lib/vmware-vcli/apps/general, and the script name is getmoid.pl.

Here is usage example how to get MOID of datastore called FreeNAS-iSCSI-01:
./getmoid.pl --server --username --password --dsname FreeNAS-iSCSI-01

Manage Object ID: datastore-162

Here is usage example how to get MOID of virtual machine called VMA:
./getmoid.pl --server --username --password --vmname VMA
Manage Object ID: vm-122

Any feedback or comments are welcome.

Monday, April 08, 2013

How to create FreeBSD memstick in running FreeBSD system

# make 2GB image file:
dd if=/dev/zero of=./memstick.img bs=1m count=2000
# load image as virtual disk device:
mdconfig -a -t vnode -f ./memstick.img -u 0
fdisk -iI /dev/md0
bsdlabel -wB /dev/md0s1
newfs /dev/md0s1a
mount /dev/md0s1a /mnt
cd /usr/src
make installkernel installworld DESTDIR=/mnt
umount /mnt
# insert memstick now, assuming it will be /dev/da0...
# raw copy virtual disk content to memstick.
dd if=/dev/md0 of=/dev/da0 bs=1m

Saturday, March 30, 2013

VMware VXLAN Deployment Guide

Vyenkatesh Deshpande recently published the "VMware Network Virtualization Design Guide", which can be downloaded here. However, the deployment guide, which is here, is very valuable if you really want to implement VXLAN in your environment.

Sunday, February 24, 2013

SG3_UTILS: How to send SCSI commands to devices

http://sg.danny.cz/sg/sg3_utils.html
http://linux.die.net/man/8/sg3_utils

The sg3_utils package contains utilities that send SCSI commands to devices. As well as devices on transports traditionally associated with SCSI (e.g. Fibre Channel (FCP), Serial Attached SCSI (SAS) and the SCSI Parallel Interface(SPI)) many other devices use SCSI command sets.


How the Cluster service reserves a disk and brings a disk online

http://support.microsoft.com/kb/309186

This article (link above) describes how the Microsoft Cluster service reserves and brings online disks that are managed by cluster service and related drivers.


Wednesday, February 20, 2013

PuppetLabs | Razor: Next-Generation Provisioning


System administrators require the same agility and productivity from their hardware infrastructure that they get from the cloud. In response, Puppet Labs and EMC collaboratively developed Razor, a next-generation physical and virtual hardware provisioning solution. Razor provides you with unique capabilities for managing your hardware infrastructure, including:
  • Auto-Discovered Real-Time Inventory Data
  • Dynamic Image Selection
  • Model-Based Provisioning
  • Open APIs and Plug-in Architecture
  • Metal-to-Cloud Application Lifecycle Management
Together, Razor and Puppet enable system administrators to automate every phase of the IT infrastructure lifecycle, from bare metal to fully deployed cloud applications.



Monday, February 18, 2013

Automated Storage Tiering - Sub-LUN tiering

Excellent comparisons between Automated Storage Tiering technologies of different vendors.
I personally believe automated storage tiering (AST) is really important for a dynamic virtualized datacenter, and because AST differs among vendors, I'm going to collect important information for design considerations. I don't want to favor or disparage any product. Each product has advantages and disadvantages, and we as infrastructure architects have to fully and deeply understand the technology to be able to prepare a good design, which is the most important factor for a reliable and well-performing infrastructure.

Good mid-range storage products on the market (my personal opinion):

  • DELL Compellent
  • Hitachi HUS
  • EMC VNX

DELL Compellent
Tiers: SSD, SAS, NL-SAS (SATA)
AST Sub-LUN tiering block: 512KB, 2MB (default), 4MB
Tiering optimisation analysis period: [TBD]
Tiering optimisation relocation period: [TBD]
Tiering algorithm: [TBD]
QoS per LUN: no

Hitachi HUS (HUS 110, HUS 130, HUS 150)
Tiers: SSD, SAS, NL-SAS (SATA)
AST Sub-LUN tiering block: 32MB
Tiering optimisation analysis period: 30 minutes
Tiering optimisation relocation period: [TBD]
Tiering algorithm: [TBD]
QoS per LUN: no

EMC VNX 
Tiers: SSD, SAS, NL-SAS (SATA)
AST Sub-LUN tiering block: 1GB
Tiering optimisation analysis period: 60 minutes
Tiering optimisation relocation period: user defined
Tiering algorithm:
During the user-defined relocation window, 1GB slices are promoted according to both the rank ordering performed in the analysis stage and a tiering policy set by the user. During relocation, FAST VP relocates higher-priority slices to higher tiers; slices are relocated to lower tiers only if the space they occupy is required for a higher-priority slice. This way, FAST VP fully utilizes the highest-performing spindles first. Lower-tier spindles are utilized as capacity demand grows. Relocation can be initiated manually or by a user-configurable, automated scheduler. The relocation process aims to keep 10% free capacity in the highest tiers in the pool. Free capacity in these tiers is used for new slice allocations of high-priority LUNs between relocations.
QoS per LUN: yes


I've collected this information from several public resources, so if something is wrong please let me know directly or via the comments.



Wednesday, February 13, 2013

Understand SCSI, SCSI command responses and sense codes

When troubleshooting VMware vSphere and storage-related issues, it is quite useful to understand SCSI command responses and sense codes.

Usually you will see something like "failed H:0x8 D:0x0 P:0x0 Possible sense data: 0xA 0xB 0xC" in the log.

H: means host codes
D: means device codes
P: means plugin codes
A: is Sense Key
B: is Additional Sense Code
C: is Additional Sense Code Qualifier

Some host codes:
0x2 Bus state busy
0x3 Timeout for other reason
0x5 Told to abort for some other reason
0x8 Bus reset

Some device codes:
00h  GOOD
02h  CHECK CONDITION
04h  CONDITION MET
08h  BUSY
18h  RESERVATION CONFLICT
28h  TASK SET FULL
30h  ACA ACTIVE
40h  TASK ABORTED

Some plugin codes:
00h  No error.
01h  An unspecified error occurred. Note: The I/O cmd should be retried.
02h  The device is a deactivated snapshot. Note: The I/O cmd failed because the device is a deactivated snapshot and so the LUN is read-only.
03h  SCSI-2 reservation was lost.
04h  The plug-in wants to requeue the I/O back. Note: The I/O will be retried.
05h  The test and set data in the ATS request returned false for equality.
06h  Allocating more thin provision space. Device server is in the process of allocating more space in the backing pool for a thin provisioned LUN.
07h  Thin provisioning soft-limit exceeded.
08h  Backing pool for thin provisioned LUN is out of space.

Some SCSI Sense Keys:
SCSI Sense Keys appear in the Sense Data available when a command returns with a CHECK CONDITION status. The sense key contains all the information necessary to understand why the command has failed.

Code Name
0h   NO SENSE
1h   RECOVERED ERROR
2h   NOT READY
3h   MEDIUM ERROR
4h   HARDWARE ERROR
5h   ILLEGAL REQUEST
6h   UNIT ATTENTION
7h   DATA PROTECT
8h   BLANK CHECK
9h   VENDOR SPECIFIC
Ah   COPY ABORTED
Bh   ABORTED COMMAND
Dh   VOLUME OVERFLOW
Eh   MISCOMPARE
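Putting the tables above together, here is a small decoding sketch for such log lines. Only the codes listed above are mapped; anything else is reported as "unknown", and the tables are intentionally partial.

```python
# Decode the "H:... D:... P:..." part of a vmkernel SCSI error line using the
# (partial) host, device, plugin and sense key tables above.
import re

HOST_CODES = {0x2: "Bus state busy", 0x3: "Timeout for other reason",
              0x5: "Told to abort for some other reason", 0x8: "Bus reset"}
DEVICE_CODES = {0x00: "GOOD", 0x02: "CHECK CONDITION", 0x04: "CONDITION MET",
                0x08: "BUSY", 0x18: "RESERVATION CONFLICT", 0x28: "TASK SET FULL",
                0x30: "ACA ACTIVE", 0x40: "TASK ABORTED"}
PLUGIN_CODES = {0x00: "No error", 0x02: "Device is a deactivated snapshot",
                0x03: "SCSI-2 reservation was lost",
                0x08: "Thin provisioned LUN out of space"}
SENSE_KEYS = {0x0: "NO SENSE", 0x1: "RECOVERED ERROR", 0x2: "NOT READY",
              0x3: "MEDIUM ERROR", 0x4: "HARDWARE ERROR", 0x5: "ILLEGAL REQUEST",
              0x6: "UNIT ATTENTION", 0x7: "DATA PROTECT"}

HEX = r"(0x[0-9a-fA-F]+)"
PATTERN = re.compile(rf"H:{HEX} D:{HEX} P:{HEX}"
                     rf"(?: Possible sense data: {HEX} {HEX} {HEX})?")

def decode(line):
    """Return a dict describing the status codes found in a log line, or None."""
    m = PATTERN.search(line)
    if not m:
        return None
    h, d, p = (int(x, 16) for x in m.groups()[:3])
    out = {"host": HOST_CODES.get(h, "unknown"),
           "device": DEVICE_CODES.get(d, "unknown"),
           "plugin": PLUGIN_CODES.get(p, "unknown")}
    if m.group(4):  # sense data present -> first byte is the sense key
        out["sense_key"] = SENSE_KEYS.get(int(m.group(4), 16), "unknown")
    return out

print(decode("failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0"))
```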

There is a VMware KB with further details here.

It is worth reading the following documents:
http://www.tldp.org/LDP/khg/HyperNews/get/devices/scsi.html (this is quite old document for programmers willing to write SCSI driver)
http://en.wikipedia.org/wiki/SCSI
http://en.wikipedia.org/wiki/SCSI_contingent_allegiance_condition
http://en.wikipedia.org/wiki/SCSI_Request_Sense_Command

What is SCSI reservation
http://mrwhatis.com/scsi-reservation.html

SCSI-3 Persistent Group Reservation
http://scsi3pr.blogspot.cz/


Tuesday, February 12, 2013

Using the VMware I/O Analyzer v1.5: A Guide to Testing Multiple Workloads

I encourage you to watch this great video about good practices for using the VMware I/O Analyzer (VMware's bundle of IOmeter).

It mentions a very important step to get relevant results: increase the size of the second disk in the virtual machine (OVF appliance). The default size is 4GB, which is not enough, because it fits in the cache of almost any storage array and the results are unrealistic and misleading.

Video is here
bit.ly/118kWs1 
or here
http://www.youtube.com/watch?v=zHJr957kN1s&feature=youtu.be

Enjoy.

Tuesday, January 22, 2013

HP Flex-10 Design, Plan, Implement, Test

Before the design phase of a VMware vSphere infrastructure, I recommend reading the blog post "Understanding HP Flex-10 Mappings with VMware ESX/vSphere" to get a general overview of the server infrastructure and advanced network interconnect. During the design phase, prepare a detailed test plan (aka operational verification) and run it during the implementation phase. You can use the blog post "Testing Scenario's VMware / HP c-Class Infrastructure" as a template for your test plan. I don't doubt that you normally test infrastructure before putting it into production :-)

Saturday, January 19, 2013

MSCS RDMs causing long boot of ESX

That's because an RDM LUN attached to an MSCS cluster has a permanent SCSI reservation initiated by the active node of the cluster.

In ESX 5 you have to mark all such LUNs as perennially reserved, and your ESX boot can be as fast as usual.

Here is CLI command to mark LUN
esxcli storage core device setconfig -d naa.id --perennially-reserved=true

This has to be changed on all ESX hosts with visibility to the LUN.
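With many MSCS RDM LUNs, you can generate the commands in a loop and run the output on each host. This is a dry-run sketch that only prints the commands; the naa IDs below are hypothetical placeholders for your actual RDM LUN identifiers.

```shell
# Dry-run: print the esxcli command for every MSCS RDM LUN in the list.
MSCS_LUNS="naa.60060160a1b11111 naa.60060160a1b22222"
for lun in $MSCS_LUNS; do
  echo "esxcli storage core device setconfig -d $lun --perennially-reserved=true"
done
```

Pipe the printed commands into the ESXi shell on every host that sees the LUNs, or remove the `echo` to execute them directly on a host.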

More info at http://kb.vmware.com/kb/1016106

Wednesday, January 09, 2013

How to calculate storage performance from host perspective

Storage performance is usually quantified in IOPS (I/O operations per second). Performance from the storage perspective is quite easy: it really depends on the speed of each particular disk, also known as a spindle. Each disk has some speed, and below are the average values usually used for storage performance calculations:
  • SATA disk = 80 IOPS
  • SCSI DISK(SAS or FC) 10k RPM = 150 IOPS
  • SCSI DISK(SAS or FC) 15k RPM = 180 IOPS
  • SSD disk (SLC aka EFD) = 6000 IOPS
So when we need higher performance we have to bundle disks. Disks can be bundled with standard RAID technology.

Here are the most common RAID types used on standard disk arrays:
  • RAID 0 - no redundancy, pure striping, highest performance => WRITE PENALTY = 1 (each host write still costs a single back-end I/O)
  • RAID 1 - disk mirror, bundle of exactly 2 disks, high performance => WRITE PENALTY = 2
  • RAID 10 - RAID 1 + RAID 0 for bundling disk pairs, max disk bundle depends on disk array limits, high performance => WRITE PENALTY = 2
  • RAID 5 - block-level striping with rotated parity, max disk bundle depends on disk array limits, moderate performance => WRITE PENALTY = 4
  • RAID 6 - block-level striping with double parity, max disk bundle depends on disk array limits, lower performance => WRITE PENALTY = 6


So performance from the storage perspective and performance from the host perspective are different things. Performance from the storage perspective is simply the sum of the speeds of all disks in the RAID group. Performance from the host perspective also depends on the selected RAID type.

To calculate the estimated storage performance from the host perspective, we need a formula with several variables.

First of all let's define variables

P=write penalty of selected RAID type
R=Read % of disk workload
W=Write % of disk workload
H=IOPS from host perspective
S=IOPS from storage perspective

and now we can write the formula to calculate storage performance from the host perspective:
H = S / (R+W*P)


Do you want to know how to derive this formula? It is simple. Start from another formula, which describes the storage behavior:
R*(1*H) + W*(P*H) = S
This formula says that each host read IOPS generates a single storage IOPS, but each host write IOPS generates multiple storage IOPS, depending on the write penalty (P) of the RAID type. Solving for H gives the formula above.

Does it make sense? If not, an example can help you understand.

My RAID group has 9 SAS disks, 600 GB / 15k RPM, and I use RAID 5 (8+1).
From the storage perspective I have 9 disks, each able to perform 180 IOPS, which means I have 1620 IOPS from the storage perspective. Let's assume a write-heavy read/write ratio of 20:80.

S = 1620
P = 4 (because of RAID 5)
R = 20% = 0.2
W = 80% = 0.8
I need to know H, the storage performance from the host perspective.

H = 1620 / (0.2 + 0.8 * 4) = 1620 / 3.4 = 476.47 IOPS from host perspective.

Note: Modern disk arrays often offer AST (Automated Storage Tiering). The calculation described in this blog post is valid even for those disk arrays. You have to fully understand the internal architecture and design of the particular storage, but generally all storage pools are built from sub-groups of disks bundled and protected by some RAID type. So if you have 125 disks bundled in groups of 5 in RAID 5 (4+1), the principle is the same: we have 125 spindles and the write penalty is 4 because of RAID 5.
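The formula and the worked example above can be sketched in a few lines of Python. The penalty table restates the RAID list earlier in the post (with P = 1 for RAID 0, since even with no redundancy a write still costs one back-end I/O):

```python
# Write penalty P for common RAID types.
WRITE_PENALTY = {"RAID0": 1, "RAID1": 2, "RAID10": 2, "RAID5": 4, "RAID6": 6}

def host_iops(storage_iops, read_ratio, write_ratio, raid_type):
    """H = S / (R + W * P): effective IOPS as seen by the host."""
    p = WRITE_PENALTY[raid_type]
    return storage_iops / (read_ratio + write_ratio * p)

# The example from the post: 9 x 15k SAS disks in RAID 5, 20:80 read/write.
s = 9 * 180                      # 1620 IOPS from the storage perspective
h = host_iops(s, 0.2, 0.8, "RAID5")
print(round(h, 2))               # 476.47 IOPS from the host perspective
```

Changing the read/write ratio or the RAID type in the call shows immediately how dramatically the write penalty shapes host-visible performance.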

Thursday, December 20, 2012

Set the Scratch Partition from the vSphere Client

If a scratch partition is not set up, you might want to configure one, especially if low memory is a concern. When a scratch partition is not present, vm-support output is stored in a ramdisk.
The directory to use for the scratch partition must exist on the host.

1. Use the vSphere Client to connect to the host.
2. Select the host in the Inventory.
3. In the Configuration tab, select Software.
4. Select Advanced Settings.
5. Select ScratchConfig. The field ScratchConfig.CurrentScratchLocation shows the current location of the scratch partition.
6. In the field ScratchConfig.ConfiguredScratchLocation, enter a directory path that is unique for this host.
   An example directory path is /vmfs/volumes/NFS-SYNOLOGY-SSD/scratch/esx21.home.uw.cz. In this example, I have a datastore named NFS-SYNOLOGY-SSD with a subdirectory scratch, which has another subdirectory esx21.home.uw.cz.
7. Reboot the host for the changes to take effect.

(copy from vSphere documentation)
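The per-host uniqueness requirement from step 6 is easy to script. Below is a minimal Python sketch that only builds the per-host directory values following the /vmfs/volumes/&lt;datastore&gt;/scratch/&lt;hostname&gt; convention from the example; the datastore and host names are illustrative:

```python
# Build a unique scratch directory per ESXi host, following the
# /vmfs/volumes/<datastore>/scratch/<hostname> convention from the example.
# The datastore and host names below are illustrative.
DATASTORE = "NFS-SYNOLOGY-SSD"
HOSTS = ["esx21.home.uw.cz", "esx22.home.uw.cz"]

def scratch_location(datastore, hostname):
    """Value for ScratchConfig.ConfiguredScratchLocation on one host."""
    return f"/vmfs/volumes/{datastore}/scratch/{hostname}"

for h in HOSTS:
    print(h, "->", scratch_location(DATASTORE, h))
```

Applying the generated values to ScratchConfig.ConfiguredScratchLocation still has to be done via the UI, vCLI, or PowerCLI.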

For automated scratch partition configuration you can use vCLI or PowerCLI. For details, see VMware KB 1033696.

And here is my PowerCLI script, inspired by the KB above, to set the scratch location on all ESXi hosts in particular vSphere clusters.

Wednesday, December 19, 2012

ESXi: strange storage-related log entries in /var/log/vmkernel.log


I've just found a lot of the following storage errors in /var/log/vmkernel.log:


2012-12-19T01:34:02.010Z cpu2:4098)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x93 (0x412401965f00, 5586) to dev "naa.60060e80102d5f500511c97d000000d4" on path "vmhba2:C0:T0:L2" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x96 0x32. Act:NONE
2012-12-19T01:34:02.010Z cpu2:4098)ScsiDeviceIO: 2322: Cmd(0x412401965f00) 0x93, CmdSN 0xc6fd5 from world 5586 to dev "naa.60060e80102d5f500511c97d000000d4" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x96 0x32.



The main part of the log entry is "failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x96 0x32".

If I understand correctly:
D:0x2 = DEVICE CHECK CONDITION
Sense key 0x5 = ILLEGAL REQUEST

What is it? What does it mean?

I have ESXi 5.0 build 768111, storage HDS AMS 2300, CISCO UCS blade system, CISCO FC switches.

Update 1:
I've thought more about the root cause ... an important detail is that it happens when Storage vMotion or another data migration is running. So I have a hypothesis that it is related to VAAI. The storage is VAAI-enabled and VAAI is supported. However, the disk block size differs between datastores (we are just in the middle of a migration from VMFS-3 to VMFS-5).

So I have to do deeper diagnostics and root-cause troubleshooting.

Stay tuned.


Update 2:
Solved: the VAAI primitives must also be enabled in HDS Host Masking. For more information, check
http://www.hds.com/assets/pdf/optimizing-the-hitachi-ams-2000-family-in-vsphere-4-environments.pdf




Friday, December 07, 2012

Storage Queues and Performance

VMware recently published a paper titled Scalable Storage Performance that delivered a wealth of information on storage with respect to the  ESX Server architecture.  This paper contains details about the storage  queues that are a mystery to many of VMware's customers and partners.   I  wanted to start a wiki article on some aspects of this paper that may  be interesting to storage enthusiasts and performance freaks.

A blog post with more information is at http://communities.vmware.com/docs/DOC-6490

This information is very useful for a deep understanding of the full storage stack.

Wednesday, December 05, 2012

Best Practices for Faster vSphere SDK Scripts

Source at http://www.virtuin.com/2012/11/best-practices-for-faster-vsphere-sdk.html 
The VMware vSphere API is one of the more powerful vendor SDKs available in the Virtualization Ecosystem.  As adoption of VMware vSphere has grown over the years, so has the size of Virtual Infrastructure environments.  In many larger enterprises, the increasing number of VirtualMachines and HostSystems is driving the architectural requirement to deploy multiple vCenter Servers.
In response, the necessity for automation tooling has grown just as quickly.  Automation to create daily reports, perform bulk operations, and aggregate data from large, distributed Virtual Infrastructure environments is a common requirement for managing the increasing virtual sprawl.
In a Virtual Infrastructure comprised of thousands of objects, even a simple script to list all VirtualMachines and their associated HostSystem and Datastores can result in very slow runtime execution.  Developing automation with the following, simple best practices can take orders of magnitude off your vSphere API tool's runtime.

 READ FULL ARTICLE

Monday, December 03, 2012

DELL Active System Manager

DELL Active System is managed by DELL Active System Manager. It is DELL's converged infrastructure solution (blade servers, networking, storage), aiming to be a "mainframe of the 21st century" by leveraging server virtualization (hypervisors) for the flexibility needed to achieve the required infrastructure SLAs.

http://www.youtube.com/watch?v=xU1I93wEHuU


Configuring a Chassis in Dell Active System Manager
http://www.youtube.com/watch?v=cRO0546yJ8U


IBM PureFlex

IBM PureFlex System is another next-generation computing system leveraging the converged infrastructure concept. It is managed by IBM Flex System Manager. Who can honestly and precisely compare it with HP Virtual Connect, CISCO UCS, and DELL Active System?

Introduction video is available at
http://www.youtube.com/watch?v=GDGpzkQm8kU


Saturday, December 01, 2012

VAAI - VMware API for Array Integration deep dive

http://www.vmware.com/files/pdf/techpaper/VMware-vSphere-Storage-API-Array-Integration.pdf

Tuesday, November 20, 2012

Correlating vCenter Server and ESXi/ESX host build numbers to update levels

VMware software versions can be found on VMware KB Article 1014508.

Very nice list of VMware ESX server build numbers and versions mappings together with mapping to VMware tools (aka vmtools) versions is at https://packages.vmware.com/tools/versions

Brocade Secure SAN Zoning Best Practices

White Paper
http://www.brocade.com/downloads/documents/white_papers/Zoning_Best_Practices_WP-00.pdf

This paper describes and clarifies Zoning, a security feature in Storage Area Network (SAN) fabrics. By understanding the terminology and implementing Zoning best practices, a Brocade® SAN fabric can be easily secured and scaled while maintaining maximum uptime.
The following topics are discussed:
• Zoning defined and LUN security in the fabric
• Identifying hosts and storage members of a zone
• How do SAN switches enforce Zoning?
• Avoiding Zoning terminology confusion
• Approaches to Zoning, how to group hosts and storage in zones
• Brocade Zoning recommendations and summary


What is Zoning?
Zoning is a fabric-based service in Storage Area Networks that groups host and storage nodes that need to communicate. Zoning creates a situation in which nodes can communicate with each other only if they are members of the same zone. Nodes can be members of multiple zones, allowing for a great deal of flexibility when you implement a SAN using Zoning.

Zoning not only prevents a host from unauthorized access of storage assets, but it also stops undesired host-to-host communication and fabric-wide Registered State Change Notification (RSCN) disruptions. RSCNs are managed by the fabric Name Server and notify end devices of events in the fabric, such as a storage node or a switch going offline. Brocade isolates these notifications to only the zones that require the update, so nodes that are unaffected by the fabric change do not receive the RSCN. This is important for non-disruptive fabric operations, because RSCNs have the potential to disrupt storage traffic. When this disruption was more common, that is, with older Host Bus Adapter (HBA) drivers, RSCNs gained an undeserved negative reputation. However, since that time most HBA vendors have addressed the issues. When nodes are zoned into small, granular groupings, the occurrences of disruptive RSCNs are virtually eliminated. See a discussion of single HBA zoning in the section of this paper entitled, “Approaches to Zoning.”

ESX and disk issues

ESX 4 & 5: Resolving SCSI reservation conflicts
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1002293
This KB article describes the process of finding which ESX host holds a SCSI reservation on a LUN.

ESX 5: Vmware vSphere 5 dead LUN and pathing issues and resultant SCSI errors
http://raj2796.wordpress.com/2012/03/14/vmware-vsphere-5-dead-lun-and-pathing-issues-and-resultant-scsi-errors/

All ESX versions: After repeated SAN path failovers, operations that involve VMFS changes might fail for all hosts accessing a particular LUN
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009899

ESX 4.x: ESX/ESXi hosts in APD may appear Not Responding in vCenter Server
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1030980

ESX 4.1: Virtual machines stop responding when any LUN on the host is in an all-paths-down (APD) condition
http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=1016626

ESX 5.1 has significant improvements in APD and PDL handling
http://www.vmware.com/files/pdf/techpaper/Whats-New-VMware-vSphere-51-Storage-Technical-Whitepaper.pdf


Saturday, November 17, 2012

ESX Automated Provisioning on CISCO UCS

This is a demo of automation showing how a VMware vSphere ESX host can be automatically deployed to a CISCO UCS Service Profile that is booted from SAN.


If you want to know more, don't hesitate to write a comment below the blog post.

Wednesday, October 31, 2012

How to defend against ARP poisoning/spoofing attack in vSphere infrastructure

There are a few enterprise options in a vSphere infrastructure for dealing with this type of attack. I know of two: VMware vShield and CISCO Nexus 1000V.

However, here I would like to share an idea of how to do it with open-source tools integrated into an enterprise infrastructure.

Disclaimer: 
Please be aware that this is not an out-of-the-box enterprise solution; you have to know what you are doing, and you take full responsibility for all impacts.

How can we simulate the attack?
Below is a tutorial inspired by another tutorial from
http://blog.facilelogin.com/2011/01/arp-poisoning-with-dsniff.html
You can simply adapt the installation procedure to your OS distribution.

ARP poisoning with dsniff
dsniff is a collection of tools for network auditing and penetration testing. dsniff, filesnarf, mailsnarf, msgsnarf, urlsnarf, and webspy passively monitor a network for interesting data (passwords, e-mail, files, etc.). arpspoof, dnsspoof, and macof facilitate the interception of network traffic normally unavailable to an attacker (e.g, due to layer-2 switching). sshmitm and webmitm implement active monkey-in-the-middle attacks against redirected SSH and HTTPS sessions by exploiting weak bindings in ad-hoc PKI.

To install dsniff on CentOS 6.

yum -y install wget

cd /usr/src
wget http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-7.noarch.rpm
rpm -ivh epel-release-6-7.noarch.rpm
yum -y install openssl gcc flex bison libpcap-devel libnet

yum install dsniff

Now you need to find out two things,

1. IP address of the target machine - say 192.168.1.4
2. IP address of the Gateway - say 192.168.1.1

Let's start ARP poisoning from the attacker's machine - with arpspoof tool which comes with dsniff.

$ sudo arpspoof -i en1 -t 192.168.1.4 192.168.1.1

This will poison the target machine's ARP table, mapping the gateway's IP address to the attacker's MAC address.

Now start tcpdump on the same interface on your machine to view all the traffic going to and from the target machine.

$ sudo tcpdump -i en1

How can we detect the attack?
We can use arpwatch, for example on my favorite OS, FreeBSD, running in a virtual machine.

Installation is as simple as
cd /usr/ports/net-mgmt/arpwatch/
make install
...
then you have to add
arpwatch_enable="YES"
into your /etc/rc.conf
...
 
And last but not least, you have to enable promiscuous mode on the VMware vSwitch port group where the arpwatch VM is connected. The best way is to create another port group (a single port is enough) with the same VLAN ID as the protected VLAN and, in its Security settings, set Promiscuous Mode to Accept.

Arpwatch then sends a message to syslog, so you can see something similar in your /var/log/messages:
Oct 31 09:08:02 ips arpwatch: flip flop a.b.c.d 0:50:56:8d:2e:bc (54:52:0:fe:47:95)

Arpwatch can also send an e-mail message about the incident. The message looks like:

hostname:
ip address: 95.80.240.1
ethernet address: 54:52:0:fe:47:95
ethernet vendor:
old ethernet address: 0:50:56:8d:2e:bc
old ethernet vendor: VMWare, Inc.
timestamp: Wednesday, October 31, 2012 8:57:33 +0100
previous timestamp: Wednesday, October 31, 2012 8:57:33 +0100
delta: 0 seconds
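To act on arpwatch alerts automatically, the first step is parsing the syslog message. Below is a minimal Python sketch assuming the "flip flop" message format shown above; the function and regex are my own illustration, not part of arpwatch:

```python
import re

# Matches arpwatch "flip flop" syslog entries, e.g.:
#   Oct 31 09:08:02 ips arpwatch: flip flop 95.80.240.1 0:50:56:8d:2e:bc (54:52:0:fe:47:95)
# The first MAC is the newly seen address, the one in parentheses is the old one.
FLIP_FLOP = re.compile(
    r"arpwatch: flip flop (?P<ip>\S+) (?P<new_mac>[0-9a-f:]+) \((?P<old_mac>[0-9a-f:]+)\)"
)

def parse_flip_flop(line):
    """Return (ip, new_mac, old_mac) from a flip-flop syslog line, or None."""
    m = FLIP_FLOP.search(line)
    if not m:
        return None
    return m.group("ip"), m.group("new_mac"), m.group("old_mac")

line = "Oct 31 09:08:02 ips arpwatch: flip flop 95.80.240.1 0:50:56:8d:2e:bc (54:52:0:fe:47:95)"
print(parse_flip_flop(line))
```

The extracted MAC addresses are exactly what you need to look up the offending port on the distributed switch, as described below.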

How can we protect against the attack?

Well, this is another story. It really depends on your environment, but in my environment I have a vSphere Distributed Switch and all virtual machines have VMware Tools installed, so I trust the VMware MAC:IP associations. Based on this information (the MAC address) I can find the attacker's port on the distributed switch and disable it.

And it is just a small step to do this in an automated way by leveraging VMware vCLI (aka the VMware Perl SDK).

WARNING!!!
Currently the described solution works only on a single ESX host and doesn't work across multiple ESX hosts because of virtual networking principles. A VMware virtual switch (or a module of a distributed switch) is not a real switch but a port extender, and the difference is significant. The ARP poisoning attack is based on permanently unicasting ARP replies to the victim, so when the arpwatch server is not on the same ESX host as the attacker or the victim, even a promiscuous port on a distributed switch port group will not help us catch it, because the ARP reply packet is not visible on the arpwatch server.

The solution would be to have an arpwatch VM on each ESX host. VMware introduced the concept of ESX agents (aka Agent Virtual Machines), so I believe this is the right use case for an ESX agent implementation.
For more information, read the document "Deploying vSphere Solutions, vServices, and ESX Agents".

I have to test this architecture ... so stay tuned.

Any comments or thoughts are appreciated.

Wednesday, October 24, 2012

Accessing Microsoft SQL Server from Linux using DBD::Sybase

Citation from: http://www.perlmonks.org/?node_id=392385
Author:  Lindsay Leeds (2004 Sep 20)

Recently, I made yet another attempt to get Perl to access Microsoft SQL Server using DBD.  Usually, when I want to connect to a Microsoft SQL Server, it is from Perl on Windows.  So I take the easy route and use DBD::ODBC and use an ODBC connection.  This time though, I wanted to connect to Microsoft SQL Server 2000 from a Linux box.  Having no ODBC to fall back on, I looked for native DBD driver of some sort.
It took me several hours of struggling to make it work.  I almost gave up several times, so I am writing this outline to help anyone else trying to accomplish the same task.
In the end, we will use the DBD::Sybase perl module from CPAN to access the Microsoft SQL Server.  Before we can do that however, we must first compile the freetds library.

Note: From now on I will refer to Microsoft SQL Server as SQL Server.  Please do not confuse this with a generic sql server.  We can all now pause to gripe about the lack of imagination in product naming at Microsoft.
Compiling Freetds
Download and compile freetds from http://www.freetds.org/.

Once you unzip and untar it, enter the directory and run:
./configure --prefix=/usr/local/freetds --with-tdsver=7.0
make
make install

Configuring Freetds
Now we have freetds compiled, but we still have to configure it.  This is the part that threw me off and is so different from other DBD drivers.  The DBD::Sybase driver will ultimately be affected by the contents of the /usr/local/freetds/etc/freetds.conf file.  If that file is not configured correctly, your DBD::Sybase connection will fail.
Okay, now that we have established the relationship between the freetds.conf file and the DBD::Sybase module, let's edit the freetds.conf file.
The strategic modifications I made to the freetds.conf file were:
1) uncomment the following lines and modify if necessary:
try server login = yes
try domain login = no

Note: this forces the module to attempt a database login instead of a domain login.  I could not get domain login to work, though I will admit I did not try very hard.
2) uncomment the following line and modify if necessary:
tds version = 7.0
This supposedly sets the default tds version to establish a connection with.  I have only SQL Server 2000 servers, and they won't talk at any lower version.  So I set it to 7.0.  If for some reason you had older SQL Servers, you might leave it at the default 4.2.
3) create a server entry for my server sql1:
[sql1]
host = sql1
port = 1433
tds version = 8.0
Note: My server here is sql1.  Ping sql1 worked, so I am sure I can resolve it using DNS.  You can also specify an IP address instead of the host name.  The sql1 in the brackets is just a descriptor.  It could be 'superduperserver' and it would still work as long as 'host =' is set correctly.  I tried 'tds version = 7.0' for my SQL Server 2000 and it worked.  Version 5.0, though, resulted in an error.  You might want to verify your SQL Server is listening on port 1433 with a 'netstat -a -n' run from the command line on the SQL Server.
At this point you can verify your configuration.
/usr/local/freetds/bin/tsql -S sql1 -U sqluser
You will then be prompted for a password and if everything is well, you will see a '1)' waiting for you to enter a command.  If you can't get the 1) using tsql, I doubt your DBD::Sybase perl code is going to work.  Please note that sqluser is not an Active Directory/Windows Domain user, but an SQL Server user.
Compiling DBD::Sybase
Now that we have the freetds library prerequisite for DBD::Sybase installed and configured, we can compile the DBD::Sybase perl module.  Obtain it from www.cpan.org if you haven't already.
Once you have untarred it and are in the directory, run:
export SYBASE=/usr/local/freetds
perl Makefile.PL
make
make install
Note: The export line is to let the compilation process know where to find the freetds libraries.

Using DBD::Sybase
You are now ready to test your DBD::Sybase module.
#!/usr/bin/perl

use DBI;

$dsn = 'DBI:Sybase:server=sql1';

my $dbh = DBI->connect($dsn, "sqluser", 'password');
die "unable to connect to server $DBI::errstr" unless $dbh;

$dbh->do("use mydatabase");

$query = "SELECT * FROM MYTABLE";
$sth = $dbh->prepare ($query) or die "prepare failed\n";
$sth->execute( ) or die "unable to execute query $query   error $DBI::errstr";

$rows = $sth->rows ;
print "$rows rows returned by query\n";

while ( @first = $sth->fetchrow_array ) {
   foreach $field (@first) {
      print "field: $field\n";
   }
}

Good luck!

Sunday, October 07, 2012

Adding ESXi 5.1 to "5.1 vCenter Appliance" fail

Finally I found time to install vSphere 5.1 in my home lab. I have a 5.0 environment running, so I've bought another old DELL PE 2950 on the Czech "eBay-like" auction site Aukro (www.aukro.cz) for just 6,500 CZK (approx. 330 USD) to leave my current lab untouched and try 5.1.

So, I upgraded the BIOS and DRAC to the latest firmware and installed the DELL version of ESXi 5.1 (embedded) on my DELL PE 2950. Then I installed the vCenter appliance (OVF) on top of this new ESXi host.

I was able to add my old ESX 5.0 host to this new vCenter, but not the new one.

WHAT'S WRONG???

Troubleshooting process ...

In the vCenter event log I can see the message: "A general system error occurred: Timed out waiting for vpxa to start".

In /var/log/vpxa.log ...

2012-10-07T08:33:29.941Z [FFFE1B90 error 'SoapAdapter'] Unsupported namespace "urn:vpxa3" in content of SOAP body
-->
--> while parsing SOAP body
--> at line 9, column 0
-->
--> while parsing SOAP envelope
--> at line 2, column 0
-->
--> while parsing HTTP request before method was determined
--> at line 1, column 0

So, what  versions am I running?


ESXi 5.1.0 (799733)
vCenter Appliance 5.0.0 (755629)


Oooops ... I believed I had installed vCenter 5.1 because in my local install archive it is stored as
VMware-vCenter-Server-Appliance-5.1.0.5100-799730_OVF10

So the cause is clear ... bad installation image management cost me one hour of troubleshooting :-(





Tuesday, October 02, 2012

NAKIVO - another virtual infrastructure backup software

NAKIVO (http://nakivo.com) is another virtual infrastructure backup product. It can be installed on a Windows or Linux (Ubuntu) server. The Linux installation is what interests me. I have to test it and compare it against Veeam Backup and Replication.

New Nexus 1000V (2.1) will also be available as a free edition

Source

Nexus 1000V version 2.1 (currently in beta) will have two editions. The Essential edition is free of charge, so VMware Enterprise Plus customers can leverage CISCO virtual networking. The Advanced edition is a paid version with significantly enhanced features. The most interesting thing is that VSG (Virtual Security Gateway) is also included in the Nexus 1000V Advanced edition.

Monday, October 01, 2012

Enabling Nested ESXi in vSphere 5.1

A nice article on how to check a physical ESX host's capability to virtualize ESX (aka nested ESX).

esxcli for vSphere 5

An excellent introduction to esxcli.

Automating ESXi 5 Kickstart Tips & Tricks

Here is the link to excellent blog post.

iReasoning MIB browser - Free MIB Browser

iReasoning MIB browser is a powerful and easy-to-use tool powered by iReasoning SNMP API . MIB browser is an indispensable tool for engineers to manage SNMP enabled network devices and applications. It allows users to load standard, proprietary MIBs, and even some mal-formed MIBs. It also allows them to issue SNMP requests to retrieve agent's data, or make changes to the agent. A built-in trap receiver can receive SNMP traps and handle trap storm.

Major features:

    Intuitive GUI
    Complete SNMPv1, v2c and v3 (USM and VACM) support
    Complete SNMPv3 USM support, including HMAC-MD5, HMAC-SHA, CBC-DES, CFB128-AES-128, CFB128-AES-192, CFB128-AES-256 (128-bit, 192-bit and 256-bit AES) algorithms
    Robust and powerful SMIv1/SMIv2 MIB parser
    IPv6 support
    Trap Receiver
    Trap Sender
    Log window to display application log and SNMP packets exchanged between browser and agents
    Port view (bandwidth utilization, error percentages) for network interface cards
    Switch port mapper for mapping switch ports
    Table view for MIB tables
    SNMPv3 USM user management (usmUserTable in SNMP-USER-BASED-SM-MIB)
    Device snapshot
    Cisco device snapshot
    Performance graph tool for monitoring of numerical OID values
    Ping and traceroute tools
    SNMP Agents Comparison
    Network discovery tool
    Runs on Windows, Mac OS X, Linux and other UNIX platforms

http://ireasoning.com/mibbrowser.shtml

Note: other free MIB browsers are getif and Mibble.