Wednesday, April 08, 2015

Force10 Link Dampening

First of all, let's explain why we should use Link Dampening.

Interface state changes occur when interfaces are administratively brought up or down, or when the link state itself changes. Every time an interface changes state, or flaps, routing protocols are notified of the status of the routes affected by the change. These protocols then go through the momentous task of re-converging. Flapping therefore puts the entire network at risk of transient loops and black holes. Link dampening minimizes the risk created by flapping by imposing a penalty for each interface flap and decaying the penalty exponentially. After the penalty exceeds a certain threshold, the interface is put in an Error-Disabled state and, for all practical purposes of routing, the interface is deemed to be “down.” After the interface becomes stable and the penalty decays below a certain threshold, the interface comes up again and the routing protocols re-converge.

Dampening parameters:
Syntax: dampening [[[[half-life] [reuse-threshold]] [suppress-threshold]] [max-suppress-time]]
· half-life
  o The number of seconds after which the penalty is decreased. The penalty is halved each time the half-life period expires. The range is from 1 to 30 seconds. The default is 5 seconds.
· reuse-threshold
  o The penalty value below which the interface state is changed to “up”. The range is from 1 to 20000. The default is 750.
· suppress-threshold
  o The penalty value above which the interface state is changed to “error disabled”. The range is from 1 to 20000. The default is 2500.
· max-suppress-time
  o The maximum time, in seconds, for which the interface can be suppressed. The range is from 1 to 86400. The default is four times the half-life value, which is 20 seconds with the default half-life.

Dampening algorithm:
With each flap, Dell Networking OS penalizes the interface by adding a penalty (1024) that decays exponentially at a rate set by the configured half-life. After the accumulated penalty exceeds the suppress threshold value, the interface moves to the Error-Disabled state. This interface state is deemed as “down” by all static/dynamic Layer 2 and Layer 3 protocols. After the penalty decays below the reuse threshold, the interface is enabled again.

Dampening settings timing example: 
Let's say we have dampening 10 100 1000 60
· half-life = 10 seconds
· reuse-threshold = 100
· suppress-threshold = 1000
· max-suppress-time = 60 seconds
Time after flap | Penalty | Port state | Comment
0s | 1024 | Down | Penalty set to 1024. Penalty (1024) > suppress-threshold (1000), so the port goes down.
10s | 512 | Down | Penalty set to 1024 / 2. Penalty (512) > reuse-threshold (100), so the port stays down.
20s | 256 | Down | Penalty set to 512 / 2. Penalty (256) > reuse-threshold (100), so the port stays down.
30s | 128 | Down | Penalty set to 256 / 2. Penalty (128) > reuse-threshold (100), so the port stays down.
40s | 64 | Up | Penalty set to 128 / 2. Penalty (64) < reuse-threshold (100), so the port state is changed to up.
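
The timing example can also be reproduced with a few lines of Python. This is only a rough sketch of the arithmetic, not of how Dell Networking OS implements dampening: the function names are made up for illustration, it models a single flap at time 0, it assumes smooth exponential decay between the half-life marks, and it ignores max-suppress-time.

```python
import math

FLAP_PENALTY = 1024  # penalty added per flap, as stated in the text above


def penalty_after(seconds, half_life, initial=FLAP_PENALTY):
    """Penalty left `seconds` after one flap (assumes smooth exponential decay)."""
    return initial * 0.5 ** (seconds / half_life)


def port_states(half_life, reuse, suppress, steps):
    """Sample the penalty at whole half-life intervals after a single flap."""
    rows = []
    suppressed = False
    for step in range(steps + 1):
        t = step * half_life
        penalty = penalty_after(t, half_life)
        if penalty > suppress:
            suppressed = True    # above suppress-threshold: error-disabled
        elif penalty < reuse:
            suppressed = False   # below reuse-threshold: interface comes back
        rows.append((t, round(penalty), "Down" if suppressed else "Up"))
    return rows


def seconds_until_reuse(half_life, reuse, initial=FLAP_PENALTY):
    """Time for the penalty of one flap to decay down to the reuse threshold."""
    return half_life * math.log2(initial / reuse)


if __name__ == "__main__":
    # dampening 10 100 1000 60
    for t, penalty, state in port_states(10, 100, 1000, steps=4):
        print(f"{t:>3}s  penalty={penalty:>4}  {state}")
    print(f"reuse threshold reached after ~{seconds_until_reuse(10, 100):.1f}s")
```

Sampled every 10 seconds, this gives the same penalties and port states as the table. Under the smooth-decay assumption the reuse threshold is actually crossed somewhere between the 30s and 40s rows; when the switch really re-enables the port depends on how often it re-evaluates the penalty.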



Saturday, April 04, 2015

ESXi root password complexity

Warning: This is just for lab experimenting and not for production use. 

When experimenting with ESXi in the lab, you sometimes have to reset ESXi to default settings. After "Reset System Configuration" from the DCUI, your password is removed and you have to set a new one. I prefer to have a simple root password in the lab. However, ESXi enforces fairly strict password complexity, and below is the procedure to relax it.

1/ Login to ESXi shell console.

2/ Edit /etc/pam.d/passwd (vi /etc/pam.d/passwd)
By default, password complexity is set as follows:
password     requisite    /lib/security/$ISA/pam_passwdqc.so retry=3 min=disabled,disabled,disabled,7,7

3/ Change the password requisite line to
password     requisite    /lib/security/$ISA/pam_passwdqc.so retry=3 min=disabled,disabled,disabled,1,1

4/ Change the root password with the passwd command.

For more information, look at the vSphere documentation.

Tuesday, March 31, 2015

VCDX Application submitted - time for mock defenses

I have just submitted my VCDX application for the June defense in Frimley, UK. I assume all my readers know what VCDX stands for. Those who don't can look at VCDX.vmware.com for further details. I don't want to write about the VCDX defense process, preparation, etc. because there are a lot of other blog posts and resources available on the internet.

I think that VCDX is about continuous lifelong learning at home and practicing in the field. However, I believe that learning must be significantly boosted before the defense because the VCDX panel is made up of the most skilled vSphere architects on this planet. Therefore, your probability of success increases when you are prepared for any question regarding your design.

Preparing together is better. That's the reason I'm looking for other VCDX candidates who have already submitted their VCDX applications and are targeting the July defense. I would be more than happy to organize study sessions or mock defenses over WebEx.

Below are the session times that suit me best. However, if you prefer another time, just write a comment or send a tweet to @david_pasek and I can arrange other sessions.

All times are in Central European Summer Time (GMT+2). If you want to register, send a tweet to @david_pasek or post a comment with the date(s) you are planning to attend.


Session time | Location & Topic | Attendees | Status
April 06, Mon, 9pm – 11pm | Location: webex TBD; Topic: TBD | David Pasek (O) | S
April 13, Mon, 9pm – 11pm | Location: webex link; Topic: Mock defense | David Pasek (O), Olivier B (A,P) | S
April 20, Mon, 9pm – 11pm | Location: webex link; Topic: Mock defense | David Pasek (O,G), Olivier B (A,P), @nickbowienz (A,P), Shady Ali (A), Kiran Reid (A) | S
April 27, Mon, 9pm – 11pm | Location: webex link; Topic: Larus's Mock defense | David Pasek (O,P), Larus Hjartarson (G), Simon H. (P) | S
May 04, Mon, 9pm – 11pm | Location: webex link; Topic: Simon's Mock defense | David Pasek (O,P), Larus Hjartarson (P), Simon H. (G) | S
May 11, Mon, 9pm – 11pm | Location: webex link; Topic: David's Mock defense | David Pasek (O,G), Larus Hjartarson (P), Simon H. (P) | S
May 18, Mon, 9pm – 11pm | Location: webex link; Topic: Larus's Mock defense | David Pasek (O,P), Larus Hjartarson (G), Simon H. (P) | S
May 25, Mon, 9pm – 11pm | Location: webex link; Topic: Simon's Mock defense | David Pasek (O), Larus Hjartarson (P), Simon H. (G) |
June 01, Mon, 9pm – 11pm | Location: webex TBD; Topic: TBD | |

Legend:
S - Session scheduled
(O) - Organizer
(A) - Attendee
(P) - Panelist
(G) - VCDX candidate to be grilled :-)

Sunday, March 15, 2015

DELL Force10 : mVLT – Ethernet Loop Free Topology Design

Last week I received the following question from one of my readers:
I came to your blog post http://blog.igics.com/2014/05/dell-force10-vlt-virtual-link-trunking.html and I am really happy that you shared this information with us. However I was wondering if you have tested a scenario with 4 S4810 with VLT configured on 2 x 2 and connected together (somewhere called mLAG). How do you continue to add VLT couples to the setup? I would be really happy if you could provide any info regarding such setup.
So let’s deep dive into the VLT port-channel between two Force10 VLT domains, also known as mVLT. Please note that VLT can be configured not only between two Force10 VLT domains but also between a Force10 VLT domain and another multi-chassis port-channel technology, for instance Cisco virtual Port Channel (vPC). However, this blog post focuses on the single-vendor solution, mVLT on DELL S-Series switches (previously known as Force10 S-Series).

If you are not familiar with DELL Force10 VLT technology, read my previous blog post where VLT is described in detail. It is really important to understand VLT before you try to understand mVLT (Multi-domain VLT). By the way, mVLT is called eVLT (Enhanced VLT) in Force10 documentation, so it might be a little bit confusing. Anyway, mVLT is nothing more than a regular virtual port channel (VLT) between two VLT domains. Therefore mVLT is quite a good term, if you ask me.

mVLT Logical Design
mVLT logical design is pretty straightforward. The requirement is to stretch L2 over two datacenters without any loops. This topology is often called a loop-free topology, and it is depicted in the figure below from the spanning tree (STP) point of view.


However, we would like to have hardware and link redundancy. Therefore a multi-chassis port-channel technology (Force10 VLT in our particular case) is used to keep a simple loop-free topology from the spanning tree point of view while adding switch unit and physical link redundancy. The Force10 mVLT solution is logically depicted in the figure below.


Please note that each VLT domain acts as a single logical switch in spanning tree.

DELL highly recommends using four links between VLT domains because of higher redundancy and optimal data flow. However, sometimes you are constrained by the number of links between sites. A two-link DCI is also a supported design, but it is not recommended because there is obviously lower link redundancy and therefore a higher probability of communication over the VLTi, which adds a hop and therefore latency. The two-link mVLT DCI, also known as the square design, is depicted in the figure below.


Even though the topology is loop free and, from the logical view, we have just one switch in each datacenter, spanning tree protocol should be enabled and configured just in case of human error or a VLT domain failure or split. Rapid Spanning Tree Protocol (RSTP) is good enough and is therefore used later in the physical configurations.

mVLT Physical Design and Configuration
The physical design below shows the connectivity of four (2 x 2) Force10 S4810 switches leveraging four links for the DCI port-channel (mVLT).


The physical design for a DCI with just two links is depicted in the following schema.


Switch configuration snippets for the four-link mVLT are listed below for completeness. The two-link DCI is just a variation of the same configuration, so you can simply reuse the four-link configuration with slight changes.

DCA-SWCORE-A – acts as primary Root Bridge in RSTP in case of loop
!
hostname DCA-SWCORE-A
!
protocol spanning-tree rstp
 no disable
 hello-time 1
 max-age 6
 forward-delay 4
 bridge-priority 4096
!
vlt domain 1
 peer-link port-channel 128
 back-up destination 172.16.201.2
 primary-priority 1
 system-mac mac-address 02:00:00:00:00:01
 unit-id 0
 peer-routing
!
 proxy-gateway lldp
  peer-domain-link port-channel 127
!
interface TenGigabitEthernet 0/46
 no ip address
 mtu 12000
 port-channel-protocol LACP
  port-channel 127 mode active
 dampening 10 100 1000 60
 no shutdown
!
interface TenGigabitEthernet 0/47
 no ip address
 mtu 12000
 port-channel-protocol LACP
  port-channel 127 mode active
 dampening 10 100 1000 60
 no shutdown
!
interface fortyGigE 0/56
 no ip address
 mtu 12000
 no shutdown
!
interface fortyGigE 0/60
 no ip address
 mtu 12000
 no shutdown
!
interface ManagementEthernet 0/0
 ip address 172.16.201.1/24
 no shutdown
!
interface Port-channel 127
 description "mVLT - interconnect link"
 no ip address
 mtu 12000
 switchport
 vlt-peer-lag port-channel 127
 no shutdown
!
interface Port-channel 128
 description "VLTi - interconnect link"
 no ip address
 mtu 12000
 channel-member fortyGigE 0/56,60
 no shutdown
!

DCA-SWCORE-B  – acts as secondary Root Bridge in RSTP in case of loop
!
hostname DCA-SWCORE-B
!
protocol spanning-tree rstp
 no disable
 hello-time 1
 max-age 6
 forward-delay 4
 bridge-priority 8192
!
vlt domain 1
 peer-link port-channel 128
 back-up destination 172.16.201.1
 primary-priority 8192
 system-mac mac-address 02:00:00:00:00:01
 unit-id 1
 peer-routing
!
 proxy-gateway lldp
  peer-domain-link port-channel 127
!
interface TenGigabitEthernet 0/46
 no ip address
 mtu 12000
 port-channel-protocol LACP
  port-channel 127 mode active
 dampening 10 100 1000 60
 no shutdown
!
interface TenGigabitEthernet 0/47
 no ip address
 mtu 12000
 port-channel-protocol LACP
  port-channel 127 mode active
 dampening 10 100 1000 60
 no shutdown
!
interface fortyGigE 0/56
 no ip address
 mtu 12000
 no shutdown
!
interface fortyGigE 0/60
 no ip address
 mtu 12000
 no shutdown
!
interface ManagementEthernet 0/0
 ip address 172.16.201.2/24
 no shutdown
!
interface Port-channel 127
 description "mVLT - interconnect link"
 no ip address
 mtu 12000
 switchport
 vlt-peer-lag port-channel 127
 no shutdown
!
interface Port-channel 128
 description "VLTi - interconnect link"
 no ip address
 mtu 12000
 channel-member fortyGigE 0/56,60
 no shutdown
!
DCB-SWCORE-A – acts as tertiary Root Bridge in RSTP in case of loop
!
hostname DCB-SWCORE-A
!
protocol spanning-tree rstp
 no disable
 hello-time 1
 max-age 6
 forward-delay 4
 bridge-priority 12288
!
vlt domain 2
 peer-link port-channel 128
 back-up destination 172.16.202.2
 primary-priority 1
 system-mac mac-address 02:00:00:00:00:02
 unit-id 0
 peer-routing
!
 proxy-gateway lldp
  peer-domain-link port-channel 127
!
interface TenGigabitEthernet 0/46
 no ip address
 mtu 12000
 port-channel-protocol LACP
  port-channel 127 mode active
 dampening 10 100 1000 60
 no shutdown
!
interface TenGigabitEthernet 0/47
 no ip address
 mtu 12000
 port-channel-protocol LACP
  port-channel 127 mode active
 dampening 10 100 1000 60
 no shutdown
!
interface fortyGigE 0/56
 no ip address
 mtu 12000
 no shutdown
!
interface fortyGigE 0/60
 no ip address
 mtu 12000
 no shutdown
!
interface ManagementEthernet 0/0
 ip address 172.16.202.1/24
 no shutdown
!
interface Port-channel 127
 description "mVLT - interconnect link"
 no ip address
 mtu 12000
 switchport
 vlt-peer-lag port-channel 127
 no shutdown
!
interface Port-channel 128
 description "VLTi - interconnect link"
 no ip address
 mtu 12000
 channel-member fortyGigE 0/56,60
 no shutdown
!

DCB-SWCORE-B – acts as quaternary Root Bridge in RSTP in case of loop
!
hostname DCB-SWCORE-B
!
protocol spanning-tree rstp
 no disable
 hello-time 1
 max-age 6
 forward-delay 4
 bridge-priority 16384
!
vlt domain 2
 peer-link port-channel 128
 back-up destination 172.16.202.1
 primary-priority 8192
 system-mac mac-address 02:00:00:00:00:02
 unit-id 1
 peer-routing
!
 proxy-gateway lldp
  peer-domain-link port-channel 127
!
interface TenGigabitEthernet 0/46
 no ip address
 mtu 12000
 port-channel-protocol LACP
  port-channel 127 mode active
 dampening 10 100 1000 60
 no shutdown
!
interface TenGigabitEthernet 0/47
 no ip address
 mtu 12000
 port-channel-protocol LACP
  port-channel 127 mode active
 dampening 10 100 1000 60
 no shutdown
!
interface fortyGigE 0/56
 no ip address
 mtu 12000
 no shutdown
!
interface fortyGigE 0/60
 no ip address
 mtu 12000
 no shutdown
!
interface ManagementEthernet 0/0
 ip address 172.16.202.2/24
 no shutdown
!
interface Port-channel 127
 description "mVLT - interconnect link"
 no ip address
 mtu 12000
 switchport
 vlt-peer-lag port-channel 127
 no shutdown
!
interface Port-channel 128
 description "VLTi - interconnect link"
 no ip address
 mtu 12000
 channel-member fortyGigE 0/56,60
 no shutdown
!
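
The four configurations above differ only in a handful of values: hostname, RSTP bridge priority, VLT domain number, unit-id, primary-priority, management IP, back-up destination and system MAC. As a rough sketch, the Python snippet below puts those values side by side, renders the vlt domain stanza and checks the patterns visible in the listings. The CoreSwitch class and the helper functions are hypothetical names for illustration only, and the checks reflect what the configs above do rather than an official Dell rule set.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreSwitch:
    hostname: str
    bridge_priority: int
    vlt_domain: int
    unit_id: int
    primary_priority: int
    mgmt_ip: str
    backup_destination: str
    system_mac: str


# Values copied from the four configuration listings above.
SWITCHES = [
    CoreSwitch("DCA-SWCORE-A", 4096, 1, 0, 1, "172.16.201.1", "172.16.201.2", "02:00:00:00:00:01"),
    CoreSwitch("DCA-SWCORE-B", 8192, 1, 1, 8192, "172.16.201.2", "172.16.201.1", "02:00:00:00:00:01"),
    CoreSwitch("DCB-SWCORE-A", 12288, 2, 0, 1, "172.16.202.1", "172.16.202.2", "02:00:00:00:00:02"),
    CoreSwitch("DCB-SWCORE-B", 16384, 2, 1, 8192, "172.16.202.2", "172.16.202.1", "02:00:00:00:00:02"),
]


def render_vlt_domain(sw):
    """Render the 'vlt domain' stanza in the same form as the listings above."""
    return "\n".join([
        f"vlt domain {sw.vlt_domain}",
        " peer-link port-channel 128",
        f" back-up destination {sw.backup_destination}",
        f" primary-priority {sw.primary_priority}",
        f" system-mac mac-address {sw.system_mac}",
        f" unit-id {sw.unit_id}",
        " peer-routing",
    ])


def validate(switches):
    """Return a list of inconsistencies (empty when the plan is consistent)."""
    problems = []
    priorities = [sw.bridge_priority for sw in switches]
    if len(set(priorities)) != len(priorities):
        problems.append("bridge priorities are not unique")
    for sw in switches:
        if sw.bridge_priority % 4096:
            problems.append(f"{sw.hostname}: bridge priority is not a multiple of 4096")
    domains = defaultdict(list)
    for sw in switches:
        domains[sw.vlt_domain].append(sw)
    for domain, members in sorted(domains.items()):
        if len(members) != 2:
            problems.append(f"domain {domain}: expected 2 peers, found {len(members)}")
            continue
        a, b = members
        if a.system_mac != b.system_mac:
            problems.append(f"domain {domain}: peers use different system MACs")
        if {a.unit_id, b.unit_id} != {0, 1}:
            problems.append(f"domain {domain}: unit-ids must be 0 and 1")
        if a.backup_destination != b.mgmt_ip or b.backup_destination != a.mgmt_ip:
            problems.append(f"domain {domain}: back-up destinations do not point at the peer")
    return problems


if __name__ == "__main__":
    print(render_vlt_domain(SWITCHES[0]))
    print(validate(SWITCHES) or "plan is consistent")
```

Keeping the per-switch values in one place like this makes it easier to spot a copy and paste mistake, for example a back-up destination that points at the switch's own management IP instead of its peer.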

Conclusion

Force10 mVLT is a great technology for loop-free L2 network topologies. It can be leveraged for local loop-free topologies inside a single datacenter or as an L2 extension between datacenters. However, our networks are usually built to support IP traffic, therefore L3 considerations have to be addressed as well. Just think about default IP gateway behavior and the potential DCI trombone. That’s where the other VLT features, peer-routing and proxy-gateway, come into play and mitigate the DCI trombone issue. You can see these technologies configured in the VLT configurations above. But that’s another topic for another blog post.

To be absolutely honest, I personally don't recommend L2 interconnects between datacenters without good justification. I strongly recommend L3 datacenter interconnects, and when stretched L2 is needed, some network overlay technology can be leveraged. L3 guarantees independent availability zones and splits the L2 failure domain. On the other hand, such a network overlay needs some other bits and pieces which in some cases increase complexity and cost. Therefore mVLT can be seriously considered for cost-effective datacenter L2 extensions. That's a typical "it depends" scenario where these two design options have to be compared and the final decision clearly justified.

If you want to know more about these technologies or use cases, just ask and we can go deeper or broader. And as always, any feedback or comment is highly appreciated.

Saturday, March 14, 2015

VMware Virtual SAN Diagnostics and Troubleshooting Reference Manual

VMware's well-known storage evangelist Cormac Hogan has written and published another VSAN-related document. Actually, it is a book of almost 300 pages. And the nice thing is that this document/book/manual is publicly available for free.

Snippet from the document's Introduction chapter:
VMware’s Virtual SAN is designed to be simple: simple to configure, and simple to operate. This simplicity masks a sophisticated and powerful storage product. The purpose of this document is to fully illustrate how Virtual SAN works behind the scenes: whether this is needed in the context of problem solving, or just to more fully understand its inner workings.
Here is the link ... http://www.vmware.com/files/pdf/products/vsan/VSAN-Troubleshooting-Reference-Manual.pdf

So if you want to know the VSAN details needed for diagnostics and troubleshooting, you have to read it.