Friday, February 28, 2014

VMware Site Recovery Manager network ports

Here are the documented network port numbers and protocols that must be open for Site Recovery Manager, vSphere Replication, and vCenter Server. It is a very nice and useful VMware KB article; however, during my last SRM implementation I realized that some required ports are not documented in it.

We spent some time with the customer's network admins tracking which other ports are required, so here they are. These additional ports must be open for full functionality of SRM + vSphere Replication.

Source                | Target                | Protocol_Port
----------------------+-----------------------+---------------------------------------
SRM SERVER            | VCENTER SERVER        | http_80, https_443, tcp_80, tcp_8095
SRM SERVER            | ESX HOSTS             | tcp/udp_902
VCENTER SERVER        | SRM SERVER            | http_9085, https_9086, tcp_8095, tcp_9085
REPLICATION APPLIANCE | VCENTER SERVER        | tcp_80
REPLICATION APPLIANCE | ESX HOSTS             | http_80, tcp/udp_902
ESX HOSTS             | REPLICATION APPLIANCE | tcp_31031, tcp_44046
VCENTER SERVER        | VCENTER SERVER        | http_80, tcp_10443, https_443
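
If you want to verify the firewall rules before the SRM installation, the TCP part of the table above can be probed with a few lines of Python. This is only a minimal sketch: the dictionary simply restates the TCP ports from the table, the function names are my own, and udp_902 is not covered because a UDP port cannot be verified with a simple connect test. Run it from a machine that sits in the "source" role.

```python
import socket

# TCP ports from the table above, keyed by (source, target).
SRM_TCP_PORTS = {
    ("SRM SERVER", "VCENTER SERVER"): [80, 443, 8095],
    ("SRM SERVER", "ESX HOSTS"): [902],
    ("VCENTER SERVER", "SRM SERVER"): [8095, 9085, 9086],
    ("REPLICATION APPLIANCE", "VCENTER SERVER"): [80],
    ("REPLICATION APPLIANCE", "ESX HOSTS"): [80, 902],
    ("ESX HOSTS", "REPLICATION APPLIANCE"): [31031, 44046],
    ("VCENTER SERVER", "VCENTER SERVER"): [80, 443, 10443],
}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host, ports, timeout=2.0):
    """Map each port to True (reachable) or False (blocked or closed)."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

For example, on the SRM server you could call check_ports("vcenter.example.local", SRM_TCP_PORTS[("SRM SERVER", "VCENTER SERVER")]), where the host name is a placeholder for your own vCenter. A False result only says the connection failed; it does not tell you whether a firewall or a stopped service is the reason.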


If you use an external MS-SQL database, don't forget to allow network communication to the database server. Typically that means udp_1434 (MS-SQL Resolver) and the TCP port of the MS-SQL instance.

Credits: The network protocols and ports were captured by the customer's network admins (Ladislav Hajek and Ondrej Safranek), who worked with me on the SRM project.

Storage design verification - performance test

I had a unique chance to work with a relatively big customer on a VMware vSphere Architecture Design from scratch. I prepared the vSphere Architecture Design based on their real business and technical requirements, and the customer used the outcome to prepare a hardware RFI and RFP to buy the best hardware technology on the market from both a technical and a cost point of view. Before the design I did capacity and performance monitoring of the customer's current environment, and we used these numbers for capacity sizing of the new infrastructure. I designed the logical hardware architecture of fully integrated compute/storage/network infrastructure blocks (aka PODs - Performance Optimized Datacenter), where PODs are leveraged as vSphere Clusters with predefined and well-known performance characteristics and ratios among CPU, memory, storage and network.

We all know the most complicated part is storage performance sizing, especially when leveraging the automated storage tiering technology that exists in almost all modern storage systems. I was able to prepare some estimates based on standard storage calculations and my experience; however, we left the final responsibility to the hardware vendors and their technical pre-sales teams. Our requirement was pretty simple: 60TB of capacity and 25,000 IOPS generated from servers with a read/write ratio of 70/30.

The validation and acceptance test of the storage was clearly defined. The storage system must be able to handle a 25,000 IOPS workload generated synthetically with the free tool IOmeter. The test environment was composed of 250 Linux VMs, each with a single worker (IOmeter dynamo). All these workers were connected to a single IOmeter GUI, which reported the total workload nicely in one place. Each of the 250 workloads was defined as described below:
  • Outstanding IO: 1
  • IO size: 64KB
  • Workload pattern Random/Sequential ratio: 70:30
  • Read/Write Ratio: 70:30
The hardware vendor was informed that we would run this workload for 24 hours and that we wanted to see an average performance of 25,000 IOPS with response times below 25 ms.
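
A rough sanity check of these test parameters can be done with Little's law (throughput = I/Os in flight / latency). The sketch below is plain arithmetic on the numbers above and assumes that each of the 250 workers really keeps exactly one I/O outstanding; the function names are only illustrative.

```python
def max_iops_per_worker(outstanding_io, latency_ms):
    """Little's law: IOPS one worker can reach at a given average latency."""
    return outstanding_io * 1000.0 / latency_ms


def required_latency_ms(target_iops, workers, outstanding_io):
    """Average latency at which all workers together reach target_iops."""
    return workers * outstanding_io * 1000.0 / target_iops


def throughput_mib_s(iops, io_size_kib):
    """Data throughput in MiB/s for a given IOPS rate and I/O size."""
    return iops * io_size_kib / 1024.0


workers, outstanding_io, target_iops, io_size_kib = 250, 1, 25000, 64

print(required_latency_ms(target_iops, workers, outstanding_io))  # 10.0
print(workers * max_iops_per_worker(outstanding_io, 25))          # 10000.0
print(throughput_mib_s(target_iops, io_size_kib))                 # 1562.5
```

Under that assumption, 25,000 IOPS from 250 single-I/O workers needs an average response time of about 10 ms. At exactly 25 ms the same workers could generate only about 10,000 IOPS, so the IOPS target is the stricter of the two criteria. The target also corresponds to roughly 1.5 GiB/s of data throughput with 64KB I/Os.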

The selected hardware vendor delivered storage with the following disk tiers:
  • Tier 1: 8x 400GB 6G SAS 2.5” MLC SSD R5 (3+1)
  • Tier 2: 128x 300GB 6G SAS 15K 2.5” HDD R5 (7+1)
  • Tier 3: 40x 900GB 6G SAS 10K 2.5” HDD R5 (7+1)
We asked the hardware vendor how the LUNs for vSphere datastores should be created to fulfil our capacity and performance requirements. The vendor recommended leveraging automated storage tiering (AST) and stretching the LUN across all disk tiers. We were able to choose the disk tier for the first write into the LUN, and Tier 2 was selected.

It is important to mention that the AST process runs once a day by default and that this schedule can be changed. However, from my experience changing it usually makes things even worse: if you generate a continuous storage workload and the AST background process starts, it generates additional workload on already highly loaded storage, response times become unpredictable, and sometimes it causes an even bigger problem. AST is a good technology for typical enterprise workloads when you have a good capacity and performance ratio among tiers and a tiering window in which your storage is lightly loaded, so the AST background process can run and optimize your storage system. AST requires really good planning, and it is not a good technology for a continuous stress workload. But that's what the hardware vendor's pre-sales team has to know, right?

Where we are right now is that we are able to achieve 15,600 front-end IOPS, which can simply be recalculated into back-end IOPS based on the read/write ratio and the write penalty, which is 4 for RAID 5. The figure below is a screenshot from IOmeter just for illustration; the final result really was an average of 15,600 IOPS from the beginning of the test.


Back-end IOPS = 10,920 reads + (4 x 4,680 writes) = 29,640, where 10,920 and 4,680 are 70% and 30% of the 15,600 front-end IOPS. Spread over the 128 Tier 2 disks, that is 29,640 / 128 = 231 IOPS per disk. 231 IOPS per 15k rpm disk is pretty high, and the other storage tiers are not leveraged, so we are calling the hardware vendor and asking how we can achieve our numbers.
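
The same calculation as a small Python sketch, so you can plug in your own numbers. The function name is only illustrative; the write penalty of 4 is the RAID 5 value used above, and the last two lines apply the same formula to the 25,000 IOPS target.

```python
def backend_iops(frontend_iops, read_pct, write_penalty):
    """Translate front-end IOPS into back-end (disk) IOPS."""
    reads = frontend_iops * read_pct // 100
    writes = frontend_iops - reads
    return reads + write_penalty * writes


TIER2_DISKS = 128  # 15K disks in Tier 2

achieved = backend_iops(15600, read_pct=70, write_penalty=4)
print(achieved)                          # 29640
print(round(achieved / TIER2_DISKS, 1))  # 231.6

target = backend_iops(25000, read_pct=70, write_penalty=4)
print(target)                            # 47500
print(round(target / TIER2_DISKS, 1))    # 371.1
```

In other words, if only Tier 2 serves the workload, the 25,000 IOPS target would need roughly 371 back-end IOPS from every 15K disk.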

BTW: this is a hardware acceptance test and the vendor has to prove these numbers, otherwise he has to upgrade his storage (at his expense) or take the hardware back and return the money.

To be continued ... stay tuned ...

UPDATE: Long story short ... at the end of the day the storage vendor had to add additional disk enclosures with more spindles. The storage vendor had to pay for it, and it is worth mentioning that it was a significant additional cost covered 100% from his margin!!! Not a single additional cent was paid by my customer. It is just another reason to engage a subject matter expert for infrastructure design: when the logical infrastructure design and the test plan are prepared before the RFI and RFP, the strict RFP requirements can be properly written and clearly articulated to all tender participants.

VMware vSphere 5 Memory Management and Monitoring

Do you think you fully understand VMware vSphere ESXi memory management?
Compare your understanding with the memory diagram in VMware KB 2017642.

Now another question. Do you still think you are able to know exactly how much memory is used and how much is available? Do you? It is very important to know that this task is complex in any operating system because of the many memory virtualization layers, garbage collection algorithms, caching, buffering, etc. Therefore nobody is able to know the exact numbers. Of course you can monitor ESX memory usage, but that is always an estimated number.

Real memory over-allocation and potential memory issues can be monitored by several mechanisms:

  • Ballooning in running VMs - because ballooning starts only when there is not enough memory
  • VM (ESX) swapping - mainly a swap in/out rate higher than 0, because that's the real indication that you have a memory problem

Wednesday, February 26, 2014

DELL Force10 S4810 fans

The S4810 comes from the factory with one power supply and two fan modules installed in the chassis. Both the fan module and the integrated fan power supply are hot-swappable if a second (redundant) power supply is installed and running. With redundant power supplies, traffic will not be interrupted if a fan module is removed. In addition to the integrated fan power-supply modules, fan modules can be ordered separately and additional modules can be inserted in the chassis.


The S4810 system fans are supported with two air-flow options. Be sure to order the fans that are suitable to support proper ventilation for your site. Use a single type of fan in your system. Do not mix Reverse and Normal air-flows in a single chassis. The system will shut down in one minute if the airflow directions are mismatched.

Air-flow options:
  • Normal is airflow from I/O panel to power supply
  • Reversed is airflow from power supply to I/O panel

So if you want to use S4810 as a top of rack switch for servers in the server rack you probably want to have ports (I/O panel) on the rear of the rack to simplify cable management. The reversed air-flow option is the way to go for this use case.

Monday, February 24, 2014

VMware vShield Manager - VXLAN limit

We all know that all technologies have some limits. The only important thing is to know about the particular limits that constrain your solution.

Do you know that VMware vShield Manager has a limit on the number of virtual networks?

The limit is 5,000 networks even if you use VXLAN network virtualization. So even though VXLAN can theoretically have up to 16M segments (24-bit segment ID), you are effectively limited to 5,000, which is not significantly more than the old VLAN ID limit of 4,096 (12-bit segment ID).
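
To put those numbers side by side, here is the arithmetic as a tiny Python sketch. The constant names are mine; the values are the ones quoted above.

```python
VLAN_ID_BITS = 12      # 802.1Q VLAN ID width
VXLAN_VNI_BITS = 24    # VXLAN segment ID width
VSHIELD_LIMIT = 5000   # vShield Manager virtual network limit

vlan_ids = 2 ** VLAN_ID_BITS
vxlan_ids = 2 ** VXLAN_VNI_BITS

print(vlan_ids)                                  # 4096
print(vxlan_ids)                                 # 16777216
print(round(VSHIELD_LIMIT / vlan_ids, 2))        # 1.22
print(round(100 * VSHIELD_LIMIT / vxlan_ids, 2)) # 0.03
```

So the limit gives you only about 1.22 times the VLAN ID space, and about 0.03% of the theoretical VXLAN segment ID space.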

The strangest thing is that this limit is not documented in the vSphere Configuration Maximums. Only the following network limits are documented there:
  • Static/Dynamic port groups per distributed switch = 6,500
  • Ports per distributed switch = 60,000
  • vCloud Director "Number of networks" = 10,000
Thanks Tom Fojta for this information and link to VMware KB 2042799.

On top of that, the current VMware VXLAN implementation provides a VXLAN-based network overlay only within a single vCenter domain, so it will not help you with a network tunnel for the DR (VMware SRM) use case, where two vCenters are required.

So the only two benefits of the current VMware VXLAN implementation I see today are:
  • software-defined network segments in a single vCenter domain, allowing automation of VXLAN provisioning. A nice blog about it is here.
  • a split between physical networking segments (system VLANs, internet VLANs, MPLS VLANs, ...) and multi-tenant virtual network segments used for the tenants' private use.
To be honest, even those two benefits are very useful, and the limits will increase year by year as the technology evolves and matures. That's the usual behavior of technology.

Sunday, February 16, 2014

Good or Bad Backup Job?

Veeam is very good backup software specialized in agentless VM backups. But we all know that bugs are everywhere and Veeam is no exception. If you have a VMware vSphere VM with an independent disk, Veeam cannot back that disk up. That's logical, because independent disks cannot have snapshots, which are mandatory for agentless VM backups leveraging the VMware vStorage APIs for Data Protection (aka VADP). The problem, however, is that the backup job covering the independent virtual disk is reported green. That can give you the impression that everything is OK, but it is not. You falsely expect to have a correct backup, and if you don't check the logs you may find out really late ... during a restore, which is then not possible.

You can see what happens on the screenshot below.


The correct behavior would be for the backup job to fail so that the backup administrator can fix the issue. This behavior was seen in Veeam version 6.5. Veeam support has been informed about it, so hopefully it will be fixed in the future.

Performance Data charts for datastore LUNs report the message: No data available

Performance Data charts for datastore LUNs are extremely useful for understanding storage performance trends.

However, sometimes you can see a message like this:
"Performance Data charts for datastore LUNs report the message: No data available"
I didn't know the root cause. Recently a colleague of mine told me he had found the root cause, which is described in VMware KB 2054403.

The workaround is to not use the E1000E virtual network adapter. In a larger environment it is no fun to search for these adapters by hand. My colleague wrote a useful PowerCLI one-liner to find VMs with an E1000E adapter, which should then be changed manually. Here is my colleague's script:
Get-VM | Get-NetworkAdapter | Where-Object { $_.Type -like "Unknown" -or $_.Type -like "E1000E" } | Select-Object @{N="VM";E={$_.Parent.Name}},Name,Type | Export-Csv C:\VM-Network_Interface.csv -NoTypeInformation

He asked me to share this information with the community, so enjoy it.

Wednesday, February 12, 2014

Sunday, February 09, 2014

VMware vSphere: Migration of RDM disk to VMDK

I have received the following question from my customer ...
"We have business critical application with MS-SQL running in virtual machine on top of VMware vSphere. OS disk is vmdk but data disk is on RDM disk. We want to get rid of RDM and migrate it into normal vmdk disk. We know there are several methods but we would like to know the safest method. We cannot accept too long service downtime but we prefer certainty against shorter down time."
Let's write down the customer requirements:
  • migrate RDM into VMDK
  • migrate a business critical application
  • service downtime as short as possible
  • guarantee seamless migration
So here are my recommended options ...

IMPORTANT: First of all you have to do backup before any migration.

Assumptions
  • The RDM disk is in virtual mode (if not, you have to remove the physical RDM from the VM and connect the RDM in virtual mode)
  • A recent system and data backup exists
  • At least two datastores exist: one where the VM currently resides and a second one which will be the migration target.
  • Just for Option 1: Experience with VMware Cold Migration
  • Just for Option 2: Experience with VMware live disk migration (aka Storage vMotion)
  • Just for Option 2: Availability of a VMware vSphere Storage vMotion license
Option 1 - Cold Migration
Procedure
  1. Shut down the guest operating system
  2. Use the VMware Migrate function and migrate the powered-off VM to another datastore. You must select a different virtual disk format (for example Lazy Zeroed) and a datastore other than the VM's current one. This converts the RDM to VMDK during the migration.
  3. Power on the VM and validate system functionality
Option 2 - Live Migration without server downtime
Procedure
  1. Use the VMware Migrate function and migrate the powered-on VM to another datastore. You must select a different virtual disk format (for example Lazy Zeroed) and a datastore other than the one where the VM currently resides. This converts the RDM to VMDK during the data migration.
  2. Validate system functionality
Options comparison

Option 1
Advantages
  • the system is powered off, so it is just a disk conversion, which is a very safe method
Drawbacks
  • offline migration which means service downtime
Option 2
Advantages
  • No service downtime, because the disk is migrated online
  • Leveraging your investment in VMware enterprise capabilities
Drawbacks
  • potential issues, especially on disks with high load
  • if there is high disk load on the RDM, the migration will generate additional I/O, which can lead to worse response times and lower overall service quality and availability
  • the system is migrated while all services are running, so there is a potential risk of data corruption, but the risk is very low and is mitigated by the existing data backup
Dear Mr. Customer, the final decision about which method is better for your particular use case is up to you. Both methods are relatively safe, but Option 1 is probably a little bit safer, while Option 2 requires no downtime at all and is totally transparent for the services running inside the VM.

There are other methods to convert RDM to VMDK, but these two options are relatively easy, fast and safe, and they don't require any special software. They simply leverage native vSphere capabilities.

Hope this helps.

Wednesday, February 05, 2014

Configure default settings on a VMware virtual distributed switch


Original blog post and full text is here. All credits go to http://kickingwaterbottles.wordpress.com

Here is the PowerCLI script that will set the ‘Teaming and Failover’ defaults on the vDS to work with etherchannel and two active uplinks.

Connect-VIServer vCenter
$vDSName = ""
$vds = Get-VDSwitch $vDSName
$spec = New-Object VMware.Vim.DVSConfigSpec
$spec.configVersion = $vds.ExtensionData.Config.ConfigVersion

$spec.defaultPortConfig = New-Object VMware.Vim.VMwareDVSPortSetting
$uplinkTeamingPolicy = New-Object VMware.Vim.VmwareUplinkPortTeamingPolicy

# Set load balancing policy to IP hash
$uplinkTeamingPolicy.policy = New-Object VMware.Vim.StringPolicy
$uplinkTeamingPolicy.policy.inherited = $false
$uplinkTeamingPolicy.policy.value = "loadbalance_ip"

# Configure uplinks. If an uplink is not specified, it is placed into the ‘Unused Uplinks’ section.
$uplinkTeamingPolicy.uplinkPortOrder = New-Object VMware.Vim.VMwareUplinkPortOrderPolicy
$uplinkTeamingPolicy.uplinkPortOrder.inherited = $false
$uplinkTeamingPolicy.uplinkPortOrder.activeUplinkPort = New-Object System.String[] (2) # (#) designates the number of uplinks you will be specifying.
$uplinkTeamingPolicy.uplinkPortOrder.activeUplinkPort[0] = "dvUplink1"
$uplinkTeamingPolicy.uplinkPortOrder.activeUplinkPort[1] = "dvUplink2"

# Set notify switches to true
$uplinkTeamingPolicy.notifySwitches = New-Object VMware.Vim.BoolPolicy
$uplinkTeamingPolicy.notifySwitches.inherited = $false
$uplinkTeamingPolicy.notifySwitches.value = $true

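# Enable failback. rollingOrder is the inverse of the "Failback" setting in the UI, so Failback = Yes means rollingOrder = $false.
$uplinkTeamingPolicy.rollingOrder = New-Object VMware.Vim.BoolPolicy
$uplinkTeamingPolicy.rollingOrder.inherited = $false
$uplinkTeamingPolicy.rollingOrder.value = $false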

# Set network failover detection to "link status only"
$uplinkTeamingPolicy.failureCriteria = New-Object VMware.Vim.DVSFailureCriteria
$uplinkTeamingPolicy.failureCriteria.inherited = $false
$uplinkTeamingPolicy.failureCriteria.checkBeacon = New-Object VMware.Vim.BoolPolicy
$uplinkTeamingPolicy.failureCriteria.checkBeacon.inherited = $false
$uplinkTeamingPolicy.failureCriteria.checkBeacon.value = $false

$spec.DefaultPortConfig.UplinkTeamingPolicy = $uplinkTeamingPolicy
$vds.ExtensionData.ReconfigureDvs_Task($spec)

And here is a simplified version:


$vDSName = "XXX"  ## << dvSwitch name
$vds = Get-VDSwitch $vDSName
$spec = New-Object VMware.Vim.DVSConfigSpec
$spec.configVersion = $vds.ExtensionData.Config.ConfigVersion

$spec.defaultPortConfig = New-Object VMware.Vim.VMwareDVSPortSetting
$uplinkTeamingPolicy =  New-Object VMware.Vim.VmwareUplinkPortTeamingPolicy

# Set load balancing policy to IP hash
$uplinkTeamingPolicy.policy = New-Object VMware.Vim.StringPolicy
$uplinkTeamingPolicy.policy.inherited = $false
$uplinkTeamingPolicy.policy.value = "loadbalance_ip"   ## << Teaming Policy Type

$spec.DefaultPortConfig.UplinkTeamingPolicy = $uplinkTeamingPolicy
$vds.ExtensionData.ReconfigureDvs_Task($spec)
 

Network Port list of vSphere 5.5 Components

Year by year the vSphere platform becomes more complex. It is pretty logical, as virtualization is the de facto standard in modern datacenters and new enterprise capabilities are required by VMware users.

At the beginning of VMware server virtualization there was just vCenter (Virtual Center, a database and simple integration with Active Directory). Today the vSphere management plane is composed of more software components integrated over the network. So it becomes more complex ...

Although I use, consult on and architect vSphere daily, sometimes I get lost in the network ports of vSphere components.

That's the reason I have created, and will maintain, the following vSphere component network ports table.

Component | L7 Protocol | L4 Protocol/Port
----------+-------------+-----------------
vCenter Single Sign-On | https | tcp/7444
vSphere Web Client HTTPS port (https://WebClient_host_FQDN_or_IP:9443) | https | tcp/9443
vSphere Web Client HTTP port | http | tcp/9090
vCenter Inventory Service (https://Inventory_Service_host_FQDN_or_IP:10443) | https | tcp/10443
vCenter Inventory Service management port | unknown | tcp/10109
vCenter Inventory Service Linked Mode communication port | unknown | tcp/10111
vCenter SSO Lookup Service (https://SSO_host_FQDN_or_IP:7444/lookupservice/sdk) | https | tcp/7444
vCenter Server HTTPS port | https | tcp/443
vCenter Server HTTP port | http | tcp/80
vCenter Server Management Web Services HTTP | http | tcp/8080
vCenter Server Management Web Services HTTPS | https | tcp/8443
vCenter Server Web Service - Change Service Notification | https | tcp/60099
vCenter Server Appliance (VCSA) - VAMI management GUI (https://VCSA_host_FQDN_or_IP:5480) | https | tcp/5480
I'll add other components to the list as needed in the future ...

Monday, February 03, 2014

DELL Storage useful links

Shared storage is an essential and common component in today's era of modern virtualized datacenters. Sorry hyper-converged evangelists, that's how it is today :-) DELL has two very popular datacenter storage products: EqualLogic and Compellent. Useful links for datacenter architects and/or administrators are listed below.
 

EqualLogic
Switch Configuration Guides for EqualLogic SANs provide step-by-step instructions for configuring Ethernet switches for use with EqualLogic PS Series storage using Dell Best Practices.

Other switch configuration guides are available in the "Rapid EqualLogic Configuration Portal by SIS".

Compellent

Friday, January 31, 2014

Working with VCSA embedded database

It's not often, but sometimes you have to work with the vCenter database. Usually it should be done only if you are instructed by VMware Support or there is a VMware KB article (like this one http://kb.vmware.com/kb/1005680) solving your problem.

Please do it very carefully on production systems.

VMware vSphere admin veterans usually have experience with MS-SQL, but what about the vCenter Server Appliance (VCSA) with its embedded database? It is not very different. VMware uses a PostgreSQL database (aka vPostgres), so logically it works the same as any other SQL database. I would say it is even easier than MS-SQL, but that's highly dependent on the administrator's background and previous experience. I'm probably biased due to my *nix history and general open-source (GNU) preference.

Here are the basic logical steps for working with the vCenter database.
  • Connect to database server
  • Discover database tables
  • Issue SQL commands
  • Exit from database server
CONNECT TO DATABASE SERVER

Change working directory to vpostgres
cd /opt/vmware/vpostgres/current/bin/
Display database configuration
cat /etc/vmware-vpx/embedded_db.cfg
The output should look like this:
EMB_DB_INSTALL_DIR='/opt/vmware/vpostgres/9.0'
EMB_DB_TYPE='PostgreSQL'
EMB_DB_SERVER='127.0.0.1'
EMB_DB_PORT='5432'
EMB_DB_INSTANCE='VCDB'
EMB_DB_USER='vc'
EMB_DB_PASSWORD='(generated password)'
EMB_DB_STORAGE='/storage/db/vpostgres'
Connect to the database
./psql VCDB -U vc
Update 2015-09-15: For VCSA 6 use /opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgres (password is not required)
and you are in.

DISCOVER DATABASE TABLES

It's really good to know what tables are in the database. You need table names to compose SQL commands allowing you to select, insert and update data in the database.

The PostgreSQL client psql has special DBA (database administrator) commands which start with the character \ (backslash). You can list all of these commands with the sequence \?

The output looks like this
vc01:/opt/vmware/vpostgres/current/bin # ./psql VCDB -U vc
psql.bin (9.0.13)
Type "help" for help.

VCDB=> \?
  \d[S+]                 list tables, views, and sequences
  \d[S+]  NAME           describe table, view, sequence, or index
  \da[S]  [PATTERN]      list aggregates
  \db[+]  [PATTERN]      list tablespaces
  \dc[S]  [PATTERN]      list conversions
  \dC     [PATTERN]      list casts
  \dd[S]  [PATTERN]      show comments on objects
  \ddp    [PATTERN]      list default privileges
  \dD[S]  [PATTERN]      list domains
  \des[+] [PATTERN]      list foreign servers
  \deu[+] [PATTERN]      list user mappings
  \dew[+] [PATTERN]      list foreign-data wrappers
We want to list the database tables, so the command we are looking for is
\dt
and the output looks like
                    List of relations
 Schema |              Name              | Type  | Owner
--------+--------------------------------+-------+-------
 vpx    | vpx_access                     | table | vc
 vpx    | vpx_alarm                      | table | vc
 vpx    | vpx_alarm_action               | table | vc
 vpx    | vpx_alarm_disabled_actions     | table | vc
 vpx    | vpx_alarm_expr_comp            | table | vc
 vpx    | vpx_alarm_expression           | table | vc
 vpx    | vpx_alarm_repeat_action        | table | vc
 vpx    | vpx_alarm_runtime              | table | vc
 vpx    | vpx_alarm_state                | table | vc
 vpx    | vpx_binary_data                | table | vc
 vpx    | vpx_bulletin_operation         | table | vc
 vpx    | vpx_change_tag                 | table | vc
 vpx    | vpx_compliance_status          | table | vc
 vpx    | vpx_compute_res_failover_host  | table | vc
 vpx    | vpx_compute_res_user_hb_ds     | table | vc
 vpx    | vpx_compute_resource           | table | vc
 vpx    | vpx_compute_resource_das_vm    | table | vc
 vpx    | vpx_compute_resource_dpm_host  | table | vc
 vpx    | vpx_compute_resource_drs_vm    | table | vc
 vpx    | vpx_compute_resource_vsan_host | table | vc
ISSUE SQL COMMANDS

If we want to select and view some data from the database, we use the SQL statement SELECT. As an example we will use the first table from the list, which is vpx_access. The table vpx_access contains all vCenter users/groups who have access to vCenter, together with their roles. Here is the SELECT statement:
select * from vpx_access;
and output

 id  |          principal          | role_id | entity_id | flag
-----+-----------------------------+---------+-----------+------
   1 | root                        |      -1 |         1 |    1
 101 | VSPHERE.LOCAL\Administrator |      -1 |         1 |    1
 201 | VPOD01\vsphere-admins       |      -1 |         1 |    3
(3 rows)
UPDATE and DELETE statements can be composed in a similar manner following the ANSI SQL standard. PostgreSQL supports most of the major features of ANSI SQL:2008.
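
If you want to practice such statements before touching a real vCenter database, a throwaway copy of the table is enough. The sketch below is an assumption-heavy stand-in, not the VCSA itself: it uses Python's built-in SQLite instead of vPostgres, the column types are guessed from the output above, and the helper name is hypothetical. It shows a filtered SELECT and an UPDATE that is rolled back instead of committed.

```python
import sqlite3

# In-memory stand-in for vpx_access (SQLite, not vPostgres).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vpx_access ("
    "id INTEGER PRIMARY KEY, principal TEXT, "
    "role_id INTEGER, entity_id INTEGER, flag INTEGER)"
)
conn.executemany(
    "INSERT INTO vpx_access VALUES (?, ?, ?, ?, ?)",
    [
        (1, "root", -1, 1, 1),
        (101, "VSPHERE.LOCAL\\Administrator", -1, 1, 1),
        (201, "VPOD01\\vsphere-admins", -1, 1, 3),
    ],
)
conn.commit()


def principals_with_flag(conn, flag):
    """Return principals whose flag column equals the given value."""
    rows = conn.execute(
        "SELECT principal FROM vpx_access WHERE flag = ? ORDER BY id", (flag,)
    )
    return [row[0] for row in rows]


print(principals_with_flag(conn, 1))  # ['root', 'VSPHERE.LOCAL\\Administrator']

# Try an UPDATE, inspect the result, then roll it back.
conn.execute("UPDATE vpx_access SET flag = 1 WHERE id = ?", (201,))
print(len(principals_with_flag(conn, 1)))  # 3
conn.rollback()
print(len(principals_with_flag(conn, 1)))  # 2
```

The same habit is worth keeping in psql: run a change inside a transaction, check the result with a SELECT, and only then decide between COMMIT and ROLLBACK.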

EXIT FROM DATABASE SERVER

To exit from the database server, simply use the DBA command \q

That's it. Pretty easy, isn't it? Working with the vCenter database is not a daily task for a vSphere admin; however, we all know that sometimes you can be instructed by VMware support or a KB article to change something in the database. Don't be afraid - it's easy.

Saturday, January 18, 2014

DELL NPAR and VMware vSphere

DELL NPAR is Network Partitioning of a single 10Gb NIC or, better to say, a 10Gb CNA (Converged Network Adapter). NPAR technology is implemented on modern Broadcom and QLogic CNAs and allows a single physical NIC to be split into up to 4 logical NICs. More about NPAR can be found for example here or here.

Please be aware that
  • NPAR is not implemented on Intel 10G NIC (X520, X540)
  • NPAR is not SR-IOV. More about SR-IOV is here and here.
The biggest NPAR value propositions are
  • More logical interfaces partitioned from a single interface, each of which appears in the OS as a normal PCI-e adapter.
  • Switch-independent solution. I'll explain what that means in a minute.
I have seen several customers complaining about NPAR. NPAR is just another technology, and each technology has to be used correctly, with respect to the purpose for which it was invented and designed. I have depicted the NPAR architecture in the drawing below.


Let's describe the picture. You can see one physical server with the ESXi hypervisor and two CNAs. Each CNA is divided into four logical partitions, where each partition acts as an independent NIC with a unique MAC address. You can see two physical wires interconnecting the CNA ports with switch ports. Inside each physical wire are four "virtual wires" interconnecting the CNA logical interfaces with a single physical switch port. That's important!!! Four virtual ports on the CNA are connected to a single switch port. You can imagine it as four connectors on one side of the wire and just a single connector on the other side.

That's not common, right?
The benefit of this architecture is switch independence.
The drawback is that Ethernet flows between NPAR interfaces on a single CNA port will fail.

So with this information in mind, let's explain the NPAR architecture behavior in more detail.

A physical switch will never forward an Ethernet frame back out of the port on which the frame arrived. So if src-mac and dst-mac are both learned on the same physical switch port (these are entries in the switch mac-address-table), the L2 communication will be broken. That's standard Ethernet switch behavior.
So what happens in the NPAR architecture, where 4 virtual cables (NPAR interfaces with independent MAC addresses) are connected to a single physical switch port? No communication.

It is shown on picture below.




That's the reason CISCO has VN-TAG (802.1Qbh) and HP has multi-channel VEPA (802.1Qbg).
These solutions multiplex Ethernet on both sides of the wire.

Note:
I have hands-on experience with CISCO VN-TAG, so I can confirm it works correctly, but I have never tested HP VEPA.

NPAR is a relatively good technology for separating and prioritizing storage and Ethernet traffic on unified (converged) Ethernet networks. It can also be used to separate and prioritize L2 traffic. But it will not work if L2 communication between logical NPAR interfaces is required.
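
The switch behavior behind this limitation can be shown with a toy model. The sketch below is a minimal learning switch in Python, not any vendor's implementation; the class name and the MAC labels are made up for illustration. Two NPAR partitions share switch port 1, and a third host sits on port 2.

```python
class LearningSwitch:
    """Toy L2 switch: learns source MACs, never sends a frame back out of its ingress port."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port where it was learned

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is forwarded to."""
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the ingress port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination lives behind the ingress port: the frame is filtered.
            return []
        return [out_port]


switch = LearningSwitch(ports=[1, 2])

# Partition 1 talks to partition 2; both sit behind switch port 1.
print(switch.receive(1, "mac-npar1", "mac-npar2"))  # [2]
print(switch.receive(1, "mac-npar2", "mac-npar1"))  # []
# A host on port 2 can reach the partition without any problem.
print(switch.receive(2, "mac-other", "mac-npar1"))  # [1]
```

In both NPAR-to-NPAR cases the frame never comes back down port 1: it is either flooded elsewhere or dropped, which is exactly why the two partitions cannot talk to each other through the switch.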

Problematic scenarios include, for example, the following configurations:
  • vCenter in VM <-> ESX vmkernel management port in the same L2 segment but different portgroups routed through separated NPAR interfaces (uplinks) as depicted above.
  • Cisco Nexus 1000v VSM in VM <-> ESX VEM communicate over L2 protocol routed through separated NPAR interfaces.
Hope this helps the DELL and VMware community.

Monday, January 13, 2014

Deploying ESXi 5.x using the Scripted Install feature

Unfortunately I have had no chance to design and implement an automated vSphere deployment for any customer. I have tried several automated deployment options in the lab, but I have never met a customer with such a requirement. That's probably because right now I do vSphere consulting in a small country in the middle of Europe where a farm of 32 ESX hosts is a "PRETTY BIG" vSphere environment ;-)

Nevertheless, an excellent VMware KB article about the PXE & KickStart file method of ESXi scripted installation is here.

Sunday, January 12, 2014

VMware Update Manager DELL depot

DELL has a VMware Update Manager (VUM) depot at https://vmwaredepot.dell.com/index.xml

You can simply add the depot in the VUM Download Settings. It should look like the screenshot below.


You have to wait for the next download task, or you can click the "Download Now" button to start downloading patches immediately. When the patches are downloaded you can see them in the "Patch Repository".


Why would someone use the DELL VUM depot? There are two DELL software components that simplify hardware management.

The first component is OpenManage (a.k.a. OpenManage Server Administrator or OMSA). This component is necessary when you want to integrate your ESX host with the 1:many management console OpenManage Essentials or with the vSphere management plugin called "OpenManage Integration for VMware vCenter".

The second component is iSM - Integrated Dell Remote Access Controller (iDRAC) Service Module. It is a lightweight optional software application that can be installed on Dell 12G servers or later. The iDRAC Service Module complements the iDRAC interfaces – Graphical User Interface (GUI), RACADM CLI and Web Service Management (WSMAN) – with additional monitoring data.

The nice thing about VUM is that everything is done automatically based on baselines, and you don't need to search for the plugin version you need for different ESX versions.

Maybe you know I work for DELL Global Infrastructure Services, so I could stop here. However, I often do consulting for customers running non-DELL equipment in their datacenters. Right now I am designing vSphere on an HP Blade system with 3PAR storage. For HP hardware you can add the HP VUM depot located at http://vibsdepot.hp.com/index.xml

Saturday, January 04, 2014

VMware All Paths Down (aka APD)

All Paths Down (APD) is the condition in which an ESXi host loses all paths to a storage device because of a storage failure or an administrative error. It is properly handled in ESX 5.1 thanks to enhancements made by VMware. Previously, in ESX versions 5.0 or 4.1, the host would continuously try to revive the storage links and, as a result, performance would suffer for the VMs that were still working. A host reboot was required to clear this error.

I was engaged by several customers hit by the APD issue and it was always a disaster. If you operate ESX 5.0 or older, consider upgrading to ESX 5.1 or, even better, to ESX 5.5.

What is SAN Fill Word?

This is a snippet from the Brocade SAN Admin Best Practices ...

Note: Fill Word (apply for 8 Gbps platform only)

Prior to the introduction of 8 Gb, IDLEs were used for link initialization, as well as fill words after link initialization. To help reduce electrical noise in copper-based equipment, the use of ARB (FF) instead of IDLEs was standardized. Because this aspect of the standard was published after some vendors had already begun development of 8 Gb interfaces, not all equipment can support ARB (FF). IDLEs are still used with 1, 2, and 4 Gb interfaces. To accommodate the new specifications and different vendor implementations, Brocade developed a user-selectable method to set the fill words to either IDLEs or ARB (FF). Currently, setting the fill word can be done only via the CLI command portCfgFillWord (Ex: portcfgfillword [slot/]port, mode). There are four modes:

Mode 0 - Use IDLEs in link initialization and IDLEs as fill word (default mode).
Mode 1 - Use ARB (FF) in link initialization and ARB (FF) as fill words.
Mode 2 - Use IDLEs in link initialization and ARB (FF) as fill words.
Mode 3 - Try Mode 1 first; if it fails, then try Mode 2.

Traffic outside of frame traffic is made up of fill words: IDLEs or ARB (F0) or ARB (FF). Encoding errors on fill words are generally not considered impactful. This is why you may see very high counts of enc_out (encoding outside of the frame) and not have customer traffic affected. If many fill words are lost at once, the link may lose synchronization. On standard E_Ports, primitives are set to ARB, regardless of the portcfgfillword setting when not in R_RDY mode.

The recommended best practices are:
  • Ensure that the fill word is configured to Mode 3.
  • When connecting to a HDS storage device, set to Mode 2.
  • When upgrading firmware, recheck the settings, since the fill word primitive has evolved over several Brocade FOS releases.
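
The recommendations above boil down to a tiny lookup, so here is a minimal Python sketch of it. The function names and the `is_hds` flag are my own illustration, and the command string assumes the usual `portcfgfillword [slot/]port mode` form, so please check it against the command reference of your FOS release before pushing it to a switch.

```python
def recommended_fill_word_mode(is_hds: bool = False) -> int:
    """Return the fill word mode suggested by the best practices quoted above."""
    # Mode 2 for ports facing HDS storage, Mode 3 (try 1, then 2) otherwise.
    return 2 if is_hds else 3


def fill_word_command(port: str, is_hds: bool = False) -> str:
    """Build the CLI line for one 8 Gbps port, e.g. port "1/4" or "12".

    Assumption: arguments are separated by spaces; verify for your FOS release.
    """
    return f"portcfgfillword {port} {recommended_fill_word_mode(is_hds)}"


if __name__ == "__main__":
    print(fill_word_command("1/4"))
    print(fill_word_command("2/7", is_hds=True))
```

Generating the lines like this makes it easy to recheck all ports after a firmware upgrade, which is exactly what the last recommendation asks for.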

Friday, January 03, 2014

Do you know - MS Excel max file path is 213?

I have just tried to open an .xls file in MS Excel 2010 and it failed with a message like ...

"File could not be found. Check the spelling of the file name, and verify that the file location is correct."
... and because I had opened the file by double-clicking it, I was pretty sure the file existed. BTW, Notepad was able to open it. So what the hell? The only thing I could think of was the length of the absolute path to the file. So I tested what the maximum file path length is and I was surprised that it is just 213 characters!!!

It's good to know, isn't it?
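
If you want to find such files before Excel complains, a short Python sketch like the one below can help. The limit of 213 is only what I observed with Excel 2010, so it is a parameter here, and the function names are hypothetical.

```python
import os

# 213 is the limit observed in this post with Excel 2010; treat it as an assumption.
OBSERVED_EXCEL_PATH_LIMIT = 213


def path_too_long(path, limit=OBSERVED_EXCEL_PATH_LIMIT):
    """Return True when the absolute path is longer than the limit."""
    return len(os.path.abspath(path)) > limit


def find_long_paths(root, limit=OBSERVED_EXCEL_PATH_LIMIT):
    """Walk a directory tree and collect spreadsheet paths over the limit."""
    hits = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            if name.lower().endswith((".xls", ".xlsx")) and path_too_long(full, limit):
                hits.append(full)
    return hits


if __name__ == "__main__":
    for hit in find_long_paths("."):
        print(len(os.path.abspath(hit)), hit)
```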

Thursday, January 02, 2014

GSM/GPRS Modem Siemens ES75 - useful AT commands

I have been asked by a customer to prepare an automated system that can dial the admin's cellular phone number in case of any trouble. They use PRTG to monitor their environment. PRTG is IMHO a very good monitoring system. It can send an email notification when a sensor is down or a threshold is reached. Email is OK, but when you have 24/7/365 SLAs it is important to know about critical events as soon as possible. My idea was to prepare a simple system that periodically checks PRTG sensors over the API and dials a cellular phone when any critical sensor is down.

So here is the system description. The hardware is based on SOEKRIS or ALIX boards with FreeBSD installed on a read-only CompactFlash card. I use a Siemens ES75 GSM modem connected via an RS-232 serial cable to dial the GSM phone number.

This blog post is not about the hardware, FreeBSD or PRTG API integration but about using the Siemens ES75. Still, I believe this overview is important to show you the full context.

So, first of all we have to connect to the modem. We need a terminal emulator like Windows HyperTerminal, PuTTY, Minicom, etc. I use the default unix terminal program cu.

The default terminal speed of the Siemens ES75 is 115200 baud.

So here is the cu command syntax to connect to the modem over my USB<->RS-232 adapter for Mac.
cu -s 115200 -l /dev/tty.usbserial-00007324
If you have FreeBSD the cu syntax is the same; only the COM port device is different. Below is a connection over COM2 (/dev/cuau1).
cu -s 115200 -l /dev/cuau1
or

cu -s 115200 -l /dev/ttyU0
Once we are connected to the modem we can use AT commands to work with it. Useful AT commands follow.

Reset the modem to factory defaults
at&f
If you want to disable echo use
ate0
to enable echo use
ate1
Write running configuration to EEPROM
at&w
To slow down the modem terminal speed to 38400 baud
at+ipr=38400
Get modem vendor
at+cgmi 
Get modem model
at+cgmm
On my Siemens ES75 modem the vendor and model strings look like this
at+cgmi
Cinterion
 
OK
at+cgmm
MC75i
OK 
To display the signal strength of the device
at+csq
Returned signal value can be compared with table here.
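
If you prefer a number in dBm over a table lookup, the conversion is simple arithmetic. The sketch below assumes the standard `+CSQ: <rssi>,<ber>` response format, where the rssi index 0 to 31 maps to -113 to -51 dBm in 2 dB steps and 99 means unknown. The sample line in the usage part is made up for illustration and was not captured from my modem.

```python
import re
from typing import Optional


def csq_to_dbm(rssi: int) -> Optional[int]:
    """Map the <rssi> index of +CSQ to dBm; 99 means not known or not detectable."""
    if rssi == 99:
        return None
    if not 0 <= rssi <= 31:
        raise ValueError(f"unexpected rssi index: {rssi}")
    # 0 stands for -113 dBm or less, 31 for -51 dBm or greater.
    return -113 + 2 * rssi


def parse_csq(line: str) -> Optional[int]:
    """Extract the rssi index from a '+CSQ: <rssi>,<ber>' line and convert it."""
    match = re.match(r"\+CSQ:\s*(\d+),(\d+)", line.strip())
    if match is None:
        raise ValueError(f"not a +CSQ response: {line!r}")
    return csq_to_dbm(int(match.group(1)))


if __name__ == "__main__":
    # Hypothetical response line, used only as sample input.
    print(parse_csq("+CSQ: 18,99"))
```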

Display SIM card identification number
at^scid
Extended event indicator control
at^sind
Here is an example of how to get all available indicators
at^sind?
^SIND: battchg,1,5
^SIND: signal,1,99
^SIND: service,1,0
^SIND: sounder,1,0
^SIND: message,1,1
^SIND: call,1,0
^SIND: roam,1,0
^SIND: smsfull,1,0
^SIND: rssi,1,4
^SIND: audio,0,0
^SIND: simstatus,0,5
^SIND: vmwait1,0,0
^SIND: vmwait2,0,0
^SIND: ciphcall,0,1
^SIND: adnread,0,1
^SIND: eons,0,0,"","T-Mobile CZ"
^SIND: nitz,0,,,
^SIND: lsta,0,0
^SIND: band,0,3
^SIND: simlocal,0,1
OK
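
For the automated system it is handy to have this output as a dictionary. Here is a minimal Python sketch, assuming every line has the shape `^SIND: name,mode,value` as in the transcript above. Anything after the second comma is kept as a raw string, because indicators like eons and nitz carry more than one value. The function name is hypothetical.

```python
def parse_sind(lines):
    """Turn '^SIND: name,mode,value' lines into {name: (mode, raw_value)}."""
    indicators = {}
    for line in lines:
        line = line.strip()
        if not line.startswith("^SIND:"):
            continue  # skip the command echo, blank lines and the final OK
        payload = line[len("^SIND:"):].strip()
        name, mode, value = payload.split(",", 2)
        indicators[name] = (int(mode), value)
    return indicators


if __name__ == "__main__":
    sample = [
        "at^sind?",
        "^SIND: signal,1,99",
        "^SIND: rssi,1,4",
        '^SIND: eons,0,0,"","T-Mobile CZ"',
        "^SIND: nitz,0,,,",
        "OK",
    ]
    for name, (mode, value) in parse_sind(sample).items():
        print(name, mode, value)
```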
Before you can use the GSM network you usually have to register and authenticate with your PIN. Here is an example of the AT+CPIN read command, which returns whether SIM PIN authentication is required.

at+cpin?
+CPIN: SIM PIN
OK

The return value is SIM PIN, which means we have to enter the PIN to register in the GSM network. Here is how to authenticate with PIN 3303

at+cpin=3303
OK
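
In a script the same decision can be made from the response lines. The sketch below is only an illustration with a hypothetical function name. It handles the SIM PIN state shown above and the READY state that the modem reports when no PIN is pending; any other state (for example a PUK request) is treated as an error on purpose.

```python
def pin_command(cpin_response, pin):
    """Return the AT command to send next, or None if the SIM needs no PIN."""
    for line in cpin_response:
        line = line.strip()
        if line.startswith("+CPIN:"):
            state = line.split(":", 1)[1].strip()
            if state == "READY":
                return None  # no password pending
            if state == "SIM PIN":
                return f"at+cpin={pin}"
            # Do not guess for SIM PUK and similar states.
            raise RuntimeError(f"unhandled SIM state: {state}")
    raise ValueError("no +CPIN line in response")


if __name__ == "__main__":
    print(pin_command(["+CPIN: SIM PIN", "OK"], "3303"))
```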


Now the PIN is accepted and the modem can register in the GSM network. You can verify it by running the AT+CPIN? read command again; it should now answer READY instead of SIM PIN
at+cpin?
+CPIN: READY
OK

No other authentication is required, so this is the proof that we can use the GSM network. If you want to completely disable PIN authentication you can use the command

at+clck="SC",0,"3303"
So now let's call some mobile number.

atd602123456;
BUSY
Here I dialed phone number 602123456, which is my mobile, and because I rejected the call the status was returned as BUSY.
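
This is the part my automated system needs, so here is a minimal Python sketch of dialing from a script. It works with any object that offers `write()` and `readline()`. A pyserial `serial.Serial` port opened at 115200 baud with a read timeout should fit, but that is an assumption and pyserial is not used here. The `FakeModem` class is a hypothetical stand-in that replays the transcript above, and the result codes other than BUSY are the usual ones I would expect, not ones shown in this post.

```python
import io

# Result codes that end a response. BUSY appears in this post; the others are
# the usual final codes and are listed here as an assumption.
FINAL_RESULTS = {"OK", "ERROR", "BUSY", "NO CARRIER", "NO ANSWER", "NO DIALTONE"}


def send_at(port, command):
    """Send one AT command and return the response lines up to the final code."""
    port.write((command + "\r").encode("ascii"))
    lines = []
    while True:
        raw = port.readline()
        if not raw:  # read timeout or end of stream
            break
        text = raw.decode("ascii", errors="replace").strip()
        if not text or text == command:  # skip blank lines and the echo
            continue
        lines.append(text)
        if text in FINAL_RESULTS or text.startswith("+CME ERROR"):
            break
    return lines


def dial(port, number):
    """Place a voice call (note the trailing ';') and return the final code."""
    response = send_at(port, f"atd{number};")
    return response[-1] if response else "TIMEOUT"


class FakeModem:
    """Hypothetical stand-in for a serial port, used only to try the code."""

    def __init__(self, canned_reply):
        self.sent = b""
        self._reply = io.BytesIO(canned_reply)

    def write(self, data):
        self.sent += data
        return len(data)

    def readline(self):
        return self._reply.readline()


if __name__ == "__main__":
    modem = FakeModem(b"atd602123456;\r\n\r\nBUSY\r\n")
    print(dial(modem, "602123456"))
```

Keeping the port as a parameter means the same functions can be tried without hardware and later pointed at the real serial device.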

And if you want to watch for incoming calls, during ringing you will see on the terminal

RING

RING

RING

for every ring.

If you want to see the caller's phone number (aka calling line identification presentation) you have to instruct the modem with the following command

at+clip=1 
OK
and during ringing you will also see caller identification

RING

+CLIP: "+420602123456",145,,,,0

RING

+CLIP: "+420602123456",145,,,,0

RING

+CLIP: "+420602123456",145,,,,0

or you can ask for the caller's phone number during ringing with the command
at+clcc
and response is

RING

RING

RING
at+clcc
+CLCC: 1,1,4,0,0,"+420602525736",145

OK

RING

RING
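
Both responses carry the caller number as the first quoted field followed by the type of address, so one small parser can serve both. This is a Python sketch with a hypothetical function name. It assumes the line layout seen in the transcripts above, and that type 145 marks a number in international format.

```python
import re

# First quoted string after the prefix, followed by the numeric type of address.
_NUMBER = re.compile(r'^\+(CLIP|CLCC):.*?"([^"]*)",(\d+)')


def caller_number(line):
    """Return (number, type_of_address) from a +CLIP or +CLCC line, else None."""
    match = _NUMBER.match(line.strip())
    if match is None:
        return None  # RING, OK, blank lines and anything else
    return match.group(2), int(match.group(3))


if __name__ == "__main__":
    for sample in (
        "RING",
        '+CLIP: "+420602123456",145,,,,0',
        '+CLCC: 1,1,4,0,0,"+420602525736",145',
    ):
        print(caller_number(sample))
```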

And if you want to hang up an incoming call you can use the following command
ath
OK 
That's it for now. If you need more AT commands for the Siemens ES75 GSM modem, search Google for the document "mc75_atc_01001_eng.pdf". I found one copy here