A quick NSX microsegmentation example

This short post demonstrates the power of NSX. My example is a DMZ full of webservers – you don’t want any of your webservers talking to each other. If one of your webservers happens to be compromised, you don’t want the attacker to then have an internal launching pad to attack the rest of the webservers. They only need to communicate with your application or database servers.

We’ll use my lab’s Compute Cluster A as an a sample. Just pretend it’s a DMZ cluster with only webservers in it.

Compute Cluster A


I’ve inserted a rule into my Layer 3 ruleset and named it “Isolate all DMZ Servers”. In my traffic source, you can see that you’re not stuck with IP addresses or groups of IP addresses like a traditional firewall – you can use your vCenter groupings like Clusters, Datacenters, Resource Pools, or Security Tags to name a few.

Rule Source

I add Computer Cluster A as the source of my traffic. I do the same for the destination.

NSX Source Cluster


My rule is now ready to publish. As soon as I hit publish changes, all traffic from any VM in this cluster will be blocked if it’s destined for any other VM in this cluster.

Ready to publish


Note that these were only Layer3 rules – so we’re secured traffic going between subnets. However, nothing’s stopping webservers on the same subnet from talking to each other. No worries here though, we can implement the same rule at layer 2.

Once this rule gets published, even VMs that are layer 2 adjacent in this cluster will be unable to communicate with each other!

NSX layer 2 block

This is clearly not a complete firewall policy as our default rule is to allow all. We’d have to do more work to allow traffic through to our application or database servers, and we’d probably want to switch our default rule to deny all. However, because these rules are tied to Virtual Center objects and not IP addresses, security policies apply immediately upon VM creation. There is no lag time between VM creation and application of the firewalling policy – it is instantaneous!  Anybody who’s worked in a large enterprise knows it can take weeks or months before a firewall change request is pushed into production.

Of course, you still have flexibility to write IP-to-IP rules, but once you start working with Virtual Center objects and VM tags, you’ll never want to go back.

Alzheimer’s Association – Forgotten Donation

Chris Wahl just put up this blog post showing the donation of royalties from his book, Networking for VMware Administrators. I won my copy for free and at the time I promised to donate to the Alzheimer’s association. I failed to do so, but I have rectified that today.

Below is my personal donation along with VMware’s matching gift. $31.41 is the minimum donation to receive a matching gift with VMware’s matching program.  alz-donation alz-matching

Moving VMs to a different vCenter

I had to move a number of clusters into a different Virtual Center and I didn’t want to have to deal with manually moving VMs into their correct folders. In my case I happened to have matching folder structures in both vCenters and I didn’t have to worry about creating an identical folder structure on the target vCenter. All I need to do was to record the current folder location and move the VM to the correct folder in the new vCenter.

I first run this script against the source cluster in the source vCenter. It generates a CSV file with the VM name and the VM folder name

$VMCollection = @()
Connect-VIServer "Source-vCenter
$CLUSTERNAME = "MySourceCluster"
$vms = Get-Cluster $CLUSTERNAME | Get-VM
foreach ( $vm in $vms )
	$Details = New-Object PSObject
	$Details | Add-Member -Name Name -Value $vm.Name -membertype NoteProperty 
	$Details | Add-Member -Name Folder -Value $vm.Folder -membertype NoteProperty
	$VMCollection += $Details
$VMCollection | Export-CSV "folders.csv"

Once the first script is run, I disconnect each host from the old vCenter and add it into a corresponding cluster in the new vCenter. I can now run this command aginst the new vCenter to ensure the VMs go back into their original folders.

Connect-VIServer "Dest-vCenter"
$vmlist = Import-CSV "folders.csv"
foreach ( $vm in $vmlist )
	Move-VM -VM $vm.Name -Destination $vm.Folder

The parent virtual disk has been modified since the child was created

Some VMs in my environment had virtual-mode RDMs on them, along with multiple nested snapshots. Some of the RDMs were subsequently extended at the storage array level, but the storage team didn’t realize there was an active snapshot on the virtual-mode RDMs. This resulted in immediate shutdown of the VMs and a vSphere client error “The parent virtual disk has been modified since the child was created” when attempting to power them back on.

I had done a little bit of work dealing with broken snapshot chains before, but the change in RDM size was outside of my wheelhouse, so we logged a call with VMware support. I learned some very handy debugging techniques from them and thought I’d share that information here. I went back into our test environment and recreated the situation that caused the problem.

In this example screenshot, we have a VM with no snapshot in place and we run vmkfstools –q –v10  against the vmdk file
-q means query, -v10 is verbosity level 10

The command opens up the disk, checks for errors, and reports back to you.



In the second example, I’ve taken a snapshot of the VM. I’m now passing the snapshot VMDK into the vmkfstools command. You can see the command opening up the snapshot file, then opening up the base disk.




In the third example, I  pass it the snapshot vmdk for a virtual-mode RDM on the same VM –  it traverses the snapshot chain and also correctly reports that the VMDK is a non-passthrough raw device mapping, which means virtual mode RDM.



Part of the problem that happened was the size of the RDM changed (increased size) but the snapshot pointed to the wrong smaller size.  However, even without any changes to the storage, a corrupted snapshot chain can  happen  during an out-of-space situation.

I have intentionally introduced a drive geometry mismatch in my test VM below – note that the value after RW in the snapshot TEST-RDM_1-00003.vmdk  is 1 less than the value in the base disk  TEST-RDM_1.vmdk



Now if I run it through the vmkfstools command, it reports the error that we were seeing in the vSphere client in Production when trying to boot the VMs – “The parent virtual disk has been modified since the child was created”. But the debugging mode gives you an additional clue that the vSphere client does not give– it says that the capacity of each link is different, and it even gives you the values (20368672 != 23068671).

The fix was to follow the entire chain of snapshots and ensure everything was consistent. Start with the most current snap in the chain. The “parentCID” value must be equal to the “CID” value in the next snapshot in the chain. The next snapshot in the chain is listed in the “parentFileNameHint”.  So TEST-RDM_1-00003.vmdk is looking for a ParentCID value of 72861eac, and it expects to see that in the file TEST-RDM_1.vmdk.

If you open up Test-RDM_1.vmdk, you see a CID value of 72861eac – this is correct.  You also see an RW value of 23068672. Since this file is the base RDM, this is the correct value. The value in the snapshot is incorrect, so you have to go back and change it to match.  All snapshots in the chain must match in the same way.



I change the RW value in the snapshot back to match  23068672 – my vmkfstools command succeeds, and I’m also able to delete the snapshot from the vSphere client6_vmkfstools


VMware load balancing with Cisco C-series and 1225 VIC card

I recently did a UCS C-series rackmount deployment. The servers came with a 10gbps 1225 VIC card and the core routers were a pair of 4500s in VSS mode.

The 1225 VIC card lets you carve virtual NICs from your physical NICs. You can put COS settings directly on the virtual NICS, enabling you to prioritize traffic directly on the physical NIC. For this deployment, I created 3 virtual NIC for each pNIC – Management, vMotion, and VM traffic. By setting COS to 6 for management, 5 for VMs, and 4 for vMotion on the vNICs, I ensure that management traffic is never interrupted, and I also guarantee that VM traffic will be prioritized over vMotion. This still allows me to take full advantage of 10gbps of bandwidth when the VMs are under light load.

Cisco 1225 VIC vNIC

Cisco 1225 VIC vNIC

My VCAP5-DTD exam experience

I took the VCAP5-DTD beta exam on January 3rd, 2013. Like many people, I received the welcome news today that I passed the exam.

I’m laughing a little to myself as I write this post because my certification folder contains a log of my studying. I downloaded the beta blueprint on December 17, 2012, but I already had Microsoft exams scheduled for December 28th.  I did no studying for this VCAP until the day before the exam, January 2rd, where you can clearly see my feverish morning download activity. I will say though that I have several years of View deployments under my belt, so my knowledge on the engineering side was up-to-date and at the front of my mind.

VCAP5-DTD Folder

I downloaded every PDF referenced in the exam blueprint, and I already had most of the product documentation already downloaded. I am primarily a delivery engineer, but to be successful on the exam you need to put on your designer’s hat. I tried to keep that in mind as I pored through the PDFs – it does make a difference because different information will stand out if you actively look for design elements.

My exam was just after lunch and it was well over an hour away, so I left early and brought my Kindle. I continued going through the PDFs until exam time. The sheer volume of information you have to read through makes VMware design exams quite difficult. I suggest reading the answers before you read the question – this helps you identify clues in the question. There are detailed descriptions requiring 6 or more paragraphs of reading just to answer a single multiple choice question.

The GA version of the exam has 115 questions and 6 diagramming scenarios. Keep track of the number of diagramming questions you get so you can budget your time appropriately. You should not spend any more than 15 minutes on a diagram. Keep in mind that 15 * 6 = 90 minutes, leaving you only 105 minutes to answer 109 questions. The pace you have to sustain is mentally exhausting. The beta was even more difficult with 131  questions, plus the expectation to provide comment feedback on the questions.

I found the diagramming questions to be even more involved than the DCD questions.. I’d say the tool was a bit better behaved than the DCD exam, but not by much. It’s easy to get sucked in to a design scenario and waste far too much time. Remember that you’re not designing the perfect system, it just has to be good enough to meet the stated requirements.

Is It Time To Remove the VCP Class Requirement – Rebuttal

This post is a rebuttal of @networkingnerd‘s blog post Is It Time To Remove the VCP Class Requirement.

I would like to acknowledge that it’s easy for me to have the perspective I do as a VCP holder since version 3. I’ve already got it, so I naturally want it to remain valuable. The fact that my employer at the time paid for the class has opened up an entire career path for me that would have otherwise been closed. But I believe the VCP cert remains fairly elite specifically because of the course requirement.

First, consider Microsoft’s certifications. As a 15-year veteran of the IT industry, I believe I am qualified to state unequivocally that Microsoft certifications are utterly worthless. This is partially because of the proliferation of braindumps. There is no knowledge requirement whatsover to sit the Microsoft exams. You don’t even need to look at a Microsoft product to pass a Microsoft test – go memorize a braindump and pass the test. We’ve all encountered paper MCSEs – their existence completely devalues the certification. I consider the MCSE nothing more than a little checkbox on some recruiter’s wish list.

I would go so far as to say that Microsoft’s test are specifically geared towards memorizers – they acutally encourage braindumping by focusing on irrelevant details and not on core skills. Passing a Microsoft exam has everything to do with memorization and almost nothing to do with your skill as a Windows admin.

On the other hand, to sit the VCP exam you have to go through a week of training. At the very least, you’ve touched the software. You installed it. You configured it. You (wait for it)… managed it.  Obviously there are braindumps out there for the VCP exam too, but everybody starts with the same core of knowledge. The VCP exams have improved to a point where they are not memorize-and-regurgitate. A person who has worked with the product actually stands a reasonable chance of passing the exam.

Quoted directly from the blog post:

For those that say that not taking the class devalues the cert, ask yourself one question. Why does VMware only require the class for new VCPs? Why are VCPs in good standing allowed to take the test with no class requirement and get certified on a new version? If all the value is in the class, then all VCPs should be required to take a What’s New class before they can get upgraded. If the value is truly in the class, no one should be exempt from taking it. For most VCPs, this is not a pleasant thought. Many that I talked to said, “But I’ve already paid to go to the class. Why should I pay again?” This just speaks to my point that the value isn’t in the class, it’s in the knowledge. Besides VMware Education, who cares where people acquire the knowledge and experience? Isn’t a home lab just as good as the ones that VMware built.

There is a tiny window of opportunity after the release of new vSphere edition to go take the exam without a course requirement. Those of us who are able to pass the exam in that small window are the people who do exactly as you say – we are downloading and installing the software in our labs. We are putting in the time to make sure that our knowledge of the newest features is up to par. Many of us partipate in alpha and beta programs, spending far more time with the software than a typical certification candidate. Some of us participate in the certification beta program, where we have just a couple of short weeks to study for and book the exam. I’ve put in quite a few late nights prepping for beta exams.

VMware forces us to learn the new features by putting a time limit on the upgrade period. We have a foundation of knowledge that was created by the original class that we took. There isn’t enough time for braindumps to leak out there, and the vast majority of upgraders wouldn’t use one anyhow. VMware can be reasonably certain that a VCP upgrader without the class really is taking the time to learn the new features. @networkingnerd is correct, the value IS in the knowledge, but the focus is ensuring that every VCP candidate starts with the same core of knowledge.

@networkingnerd suggests an alternative lower level certification such as a VCA with a much less expensive course requirement. I think it’s an interesting idea, but I don’t know how you’d put it into practice. I’m not sure what a 1-day class could prepare you for. It’s one thing for experienced vSphere admins to attend a 2-day What’s New class. But what could you really teach and test on? Just installing vSphere? There’s not a whole lot of value for an engineer who can install but not configure.

Again quoting from the article:

Employers don’t see the return on investment for a $3,000US class, especially if the person that they are going to send already has the knowledge shared in the class. That barrier to entry is causing VMware to lose out on the visbility that having a lot of VCPs can bring.

This suggests that the entry-level certification from the leader in virtualization is somehow not well-known. While I would agree that the VCAP-level certifications do not enjoy the same level of name recognition as the CCNP, I talk to seniors in college who know what the VCP is. There is no lack of awareness of the VCP certification. I also agree that it’s ridiculous to send a VMware admin who has years of experience to the Install Configure Manage class. That’s why the Optimize and Scale and the Fast Track classes exist.

I don’t believe dropping the course requirement would do anything to enhance VMware’s market share. The number of VCP individuals has long since reached a critical mass.  Nobody is going to avoid buying vSphere because of a lack of VCPs qualified to administer the environment. While I agree that Hyper-V poses a credible threat, Microsoft is just now shipping features that vSphere has had for years. Hyper-V will start to capture the SMB market, but it will be a long time before it has the chance to unseat vSphere in the enterprise.

VMware View Composer starts, but does no work.

I worked on a client outage over the weekend, Virutal Center and View Composer were down. It started with a disk full situation on the SQL server hosting the vCenter, Composer, and Events databases. The client was shut down for winter break, so the Composer outage was not noticed for several days. After fixing the SQL Server disk space problem, everything came back up. I was able to restart all services and they appeared to be running. Composer started without issue, but it didn’t respond to any commands – any operations I requested in View Manager were ignored. I didn’t find any obvious errors in the logs.

I ran through the troubleshooting options in KB1030698 without finding any issues. I validated the SDK was responding by going to https://vcenteripaddress/sdk/vimService.wsdl . I couldn’t find any cause for the outage, so I opened up a Sev-1 ticket with VMware Support.

The support tech concluded that a problem with the ADAM database was preventing Composer from doing the job. He had me shut down all but one connection broker, then restart the View services on the remaining broker. At this point, commands issued on the broker were obeyed by Composer. We deleted or refreshed all of the desktops listed under Problem Desktops. Once we were sure that the ADAM database reflected the true state of the environment as reflected in vCenter, we restarted the other brokers. They synced databases and the problem was resolved.