Alzheimer’s Association – Forgotten Donation

Chris Wahl just put up this blog post showing the donation of royalties from his book, Networking for VMware Administrators. I won my copy for free, and at the time I promised to donate to the Alzheimer’s Association. I failed to do so, but I have rectified that today.

Below is my personal donation along with VMware’s matching gift. $31.41 is the minimum donation to receive a matching gift with VMware’s matching program.

alz-donation alz-matching

Custom Views and Dashboards in vRealize Operations

This post covers a few of the most common questions my customers ask me as I demonstrate what you can do with vROps. I’m going to take you through an example of needing to frequently check the CPU ready % of your VMs – this was my customer’s most recent request, but know that you can make this happen for any metric collected by vROps.

First, we’re going to create a custom view for CPU Ready % by going to Content>Views, then clicking on the green Plus to add a new View.

1-AddView

I named this one “Custom-CPU Ready” and gave it a description.

2-CustomCPUReady

Next, we pick what the View looks like. In this case, I want to see all of the data in a list format, so I pick List.

3-Presentation

Now to select the subjects – these are the objects that the View will be looking at. We want CPU Ready % on Virtual Machines, so we pick the vCenter Adapter and scroll down until we find Virtual Machine.

4a-Subjects


4b-Subjects

We now need to find the CPU Ready % metric.

5-CPUData

Double-click on it when you find it in the list on the left; it will then appear in the Data section. Change the Sort order to descending because we want to see the VM with the highest CPU ready on top.

6-SelectReady

The Availability options let you control where inside vROps the View will be usable. I left the defaults.

7-Visibility

You now see the custom view in the Views list.

8-CustomView

How can we use our brand new View? We want to see the CPU ready for all VMs in the Production cluster. Go to Environment, then drill down into the vSphere World until you reach the Production cluster. Click on the Details tab, then scroll down and find the custom View that we created. Click on it and all of your VMs show up, sorted by highest CPU ready.

9-ShowReady
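
If you ever need to grab the same number outside of vROps, the raw counter is also available through PowerCLI. Below is a minimal sketch, assuming an existing Connect-VIServer session and the Production cluster from this example; cpu.ready.summation is reported in milliseconds per sample interval, so it is converted to a percentage by dividing by the interval length.

# Minimal PowerCLI sketch (assumes Connect-VIServer has already been run against your vCenter).
# Takes the most recent realtime cpu.ready.summation sample per VM and converts it to a percentage:
# Ready % = ready milliseconds / (sample interval in seconds * 1000) * 100.
# For multi-vCPU VMs the aggregate instance is the total across all vCPUs.
Get-Cluster "Production" | Get-VM | ForEach-Object {
	$sample = Get-Stat -Entity $_ -Stat cpu.ready.summation -Realtime |
		Where-Object { $_.Instance -eq "" } |
		Sort-Object Timestamp -Descending | Select-Object -First 1
	New-Object PSObject -Property @{
		Name     = $_.Name
		ReadyPct = [math]::Round(($sample.Value / ($sample.IntervalSecs * 1000)) * 100, 2)
	}
} | Sort-Object ReadyPct -Descending

vROps is still the better place for this day to day, since it keeps the history and the custom View does the sorting for you.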


Let’s say this is a metric that you look at daily or multiple times a day. If you’re using vROps Advanced, you can create a custom dashboard so the metric is immediately visible.

To create a new dashboard, from the Home menu, click Actions, then Create Dashboard.

1-AddDashboard

Name the dashboard and select a layout.

2-NameDashboard

We want to show a View in the widget, so we drag View over into the right pane.

2a-WidgetList

Click the Edit icon in the blank View to customize it.

3-EditWidget

Click on Self Provider to allow us to specify the Production Cluster object on the left, then select our Custom CPU Ready View on the right and click Save.

4-AddWidget

The dashboard is now ready, and the CPU Ready for the Production VMs shows up right on it.

5-FinalDashboard

VMware Hands-on Labs

It always surprises me how few customers are aware of the Hands-on Labs. The labs are full installations of VMware software running in VMware’s cloud, accessible from any browser and completely free for anyone to use.

Each self-paced lab is guided with a step-by-step lab manual. You can follow the manual from start to finish for a complete look at the specific software product. Alternatively, you can focus on specific learning areas due to the modular structure of the lab manual. You can even ignore the manual completely and use the lab as a playground for self-directed study.

You can give the labs a try here: http://labs.hol.vmware.com/

Moving VMs to a different vCenter

I had to move a number of clusters into a different vCenter and I didn’t want to have to deal with manually moving VMs into their correct folders. In my case I happened to have matching folder structures in both vCenters, so I didn’t have to worry about creating an identical folder structure on the target vCenter. All I needed to do was record the current folder location of each VM and move it to the correct folder in the new vCenter.

I first run this script against the source cluster in the source vCenter. It generates a CSV file with the VM name and the VM folder name.

# Record the current folder for every VM in the source cluster
$VMCollection = @()
Connect-VIServer "Source-vCenter"
$CLUSTERNAME = "MySourceCluster"

$vms = Get-Cluster $CLUSTERNAME | Get-VM
foreach ( $vm in $vms )
{
	$Details = New-Object PSObject
	$Details | Add-Member -Name Name -Value $vm.Name -MemberType NoteProperty
	$Details | Add-Member -Name Folder -Value $vm.Folder -MemberType NoteProperty
	$VMCollection += $Details
}

$VMCollection
$VMCollection | Export-CSV "folders.csv"

Once the first script is run, I disconnect each host from the old vCenter and add it into a corresponding cluster in the new vCenter. I can now run this script against the new vCenter to ensure the VMs go back into their original folders.

Connect-VIServer "Dest-vCenter"
$vmlist = Import-CSV "folders.csv"
 
foreach ( $vm in $vmlist )
{
	$vm.Name
	$vm.Folder
	Move-VM -VM $vm.Name -Destination $vm.Folder
}
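
One caveat: Move-VM is handed a folder name from the CSV, so this assumes each recorded folder name exists (and is unique) in the destination vCenter, which was true in my case because the folder structures matched. A quick pre-flight sketch along these lines can confirm that before you move anything:

# Pre-flight sketch: list any recorded folders that can't be found in the destination vCenter.
# Assumes the Connect-VIServer "Dest-vCenter" session from the script above.
Import-CSV "folders.csv" | ForEach-Object {
	if ( -not (Get-Folder -Name $_.Folder -Type VM -ErrorAction SilentlyContinue) )
	{
		Write-Host -ForegroundColor Red ("Missing folder " + $_.Folder + " for VM " + $_.Name)
	}
}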

The parent virtual disk has been modified since the child was created

Some VMs in my environment had virtual-mode RDMs on them, along with multiple nested snapshots. Some of the RDMs were subsequently extended at the storage array level, but the storage team didn’t realize there was an active snapshot on the virtual-mode RDMs. This resulted in immediate shutdown of the VMs and a vSphere client error “The parent virtual disk has been modified since the child was created” when attempting to power them back on.

I had done a little bit of work dealing with broken snapshot chains before, but the change in RDM size was outside of my wheelhouse, so we logged a call with VMware support. I learned some very handy debugging techniques from them and thought I’d share that information here. I went back into our test environment and recreated the situation that caused the problem.

In this example screenshot, we have a VM with no snapshot in place, and we run vmkfstools -q -v10 against the .vmdk file.
-q means query, -v10 means verbosity level 10.

The command opens up the disk, checks for errors, and reports back to you.
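
For reference, the full command looks something like this when run from the ESXi shell (the datastore path is only an example); point it at the descriptor .vmdk, not the -flat or -delta file:

# -q = query the disk, -v10 = verbosity level 10 (path below is just an example)
vmkfstools -q -v10 /vmfs/volumes/datastore1/TEST-VM/TEST-VM.vmdk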

1_vmkfstools


In the second example, I’ve taken a snapshot of the VM. I’m now passing the snapshot VMDK into the vmkfstools command. You can see the command opening up the snapshot file, then opening up the base disk.


2_vmkfstools


In the third example, I pass it the snapshot vmdk for a virtual-mode RDM on the same VM. It traverses the snapshot chain and correctly reports that the VMDK is a non-passthrough raw device mapping, which means a virtual-mode RDM.


3_vmkfstools

Part of the problem here was that the size of the RDM increased, but the snapshot still pointed to the old, smaller size. However, even without any changes to the storage, a corrupted snapshot chain can happen during an out-of-space situation.

I have intentionally introduced a drive geometry mismatch in my test VM below. Note that the value after RW in the snapshot TEST-RDM_1-00003.vmdk is 1 less than the value in the base disk TEST-RDM_1.vmdk.

4_vmkfstools


Now if I run it through the vmkfstools command, it reports the error that we were seeing in the vSphere client in Production when trying to boot the VMs – “The parent virtual disk has been modified since the child was created”. But the debugging mode gives you an additional clue that the vSphere client does not give – it says that the capacity of each link is different, and it even gives you the values (23068672 != 23068671).

5_vmkfstools
The fix was to follow the entire chain of snapshots and ensure everything was consistent. Start with the most current snapshot in the chain. Its “parentCID” value must be equal to the “CID” value of its parent disk, and that parent disk is listed in the “parentFileNameHint”. So TEST-RDM_1-00003.vmdk is looking for a parentCID value of 72861eac, and it expects to see that in the file TEST-RDM_1.vmdk.

If you open up TEST-RDM_1.vmdk, you see a CID value of 72861eac, which is correct. You also see an RW value of 23068672. Since this file is the base RDM, this is the correct value. The value in the snapshot is incorrect, so you have to go back and change it to match. All snapshots in the chain must match in the same way.
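
To make that concrete, here is roughly what the relevant descriptor lines look like in this example. The extent types and delta file name are illustrative; the CID, parentCID, and RW values are the ones discussed above.

# TEST-RDM_1-00003.vmdk (snapshot descriptor, relevant lines only)
CID=<the snapshot's own CID>
parentCID=72861eac
parentFileNameHint="TEST-RDM_1.vmdk"
# the RW value below is the mismatch - it must be edited back to 23068672
RW 23068671 VMFSSPARSE "TEST-RDM_1-00003-delta.vmdk"

# TEST-RDM_1.vmdk (base virtual-mode RDM descriptor)
CID=72861eac
parentCID=ffffffff
RW 23068672 VMFSRDM "TEST-RDM_1-rdm.vmdk"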

4_vmkfstools


I change the RW value in the snapshot back to 23068672; my vmkfstools command now succeeds, and I’m also able to delete the snapshot from the vSphere client.

6_vmkfstools


VMware recertification

VMware just announced a new recertification policy for the VCP. A VCP certification expires 2 years after it is achieved. You can recertify by taking any VCP or VCAP exam.

Part of VMware’s justification for this change is “Recertification is widely recognized in the IT industry and beyond as an important element of continuing professional growth.” While I do agree with this statement in general, I don’t believe this decision makes much sense for several reasons:

  • Other vendors – Cisco and Microsoft as two examples – expire their certifications after 3 years, not 2. Two years is unnecessarily short. It’s also particularly onerous given the VMware course requirement for VCP certification. It’s hard enough to remain current with all of the vendors’ recertification policies at 3 years.


  • Other vendors – again, Cisco and Microsoft as examples – have no version number tied to their certifications. You are simply “MCSE” or “CCNA”. With VMware, you are “VCP3”, “VCP4”, or “VCP5”. The certifications naturally age themselves out. A VCP3 is essentially worthless at this point. The VCP4 is old, and the VCP5 is current. An expiration policy doesn’t need to be in place for this to remain true.


  • The timing of this implementation is not ideal. VMware likes to announce releases around VMworld, so we’re looking at August 2014 for 6.0.  Most VMware technologists will be interested in keeping certifications with the current major release, so demand for the VCP6 will be high. Will the certification department release 6 in time for everybody to test before expiration? It’s really a waste of my time and money to force me to recertify on 5 when 6 is right around the corner.


  • The expiration policy makes no sense in light of the policy on VCAPs and VCPs. Currently, any VCP makes you eligible to take a VCAP in any of the three tracks, and achieving the VCAP in a track automatically gives you a VCP in the same track. This is a significant timesaver for those of us who are heavily invested in VMware – skip the entry level exam and go straight to the advanced exam. VCAP exam development is obviously even slower than VCP exam development. I have doubts that the VCAPs will come out quickly enough to meet the March 2015 deadline.


  • Adam Eckerle commented in his blog post, “I also think it is important to point out that I think it encourages individuals to not only keep their skills up to date but also to branch out. If your VCP-DCV is going to expire why not take a look at sitting the VCP-DT or Cloud, or IaaS exams? If you don’t use the Horizon products or vCloud Suite as part of your job that can be difficult.” I agree that in some cases, this might encourage you to pursue a certification in a separate track. Before I had the desktop certifications, I might have considered accelerating my exam preparation to meet this recertification date. However, I already own 4 of 6 VCAPs. Even as a consultant I have no use for vCloud; there’s just not enough demand from our customers to build a practice area around it. There’s currently no business benefit in pursuing the Cloud track.

It’s VMware’s program and they can do as they please, but I hope they consider 3 years instead of 2 for recertification.

Christian Mohn’s blog has a fairly lively discussion going on, and Vladan Seget also has some thoughts and comments.

More loathing of Pearson Vue … or, my [redacted] beta exam experience

This is what I get for taking beta exams. I understand. I create my own mess. But that doesn’t change the fact that Pearson Vue is spectacularly incompetent. I’ve previously blogged my strongly negative opinions of Pearson Vue and today’s experience doesn’t do much to improve my outlook.

I sat the [redacted] VMware beta exam today and there were problems with the remote environment. It took me almost 40 minutes to complete the first two questions because the performance of the environment was so lousy. Per my beta exam instructions, after about 15 minutes I asked the Vue proctor to contact VMware for assistance.

The proctor returned to say that we should reboot my local exam station, as if that had anything at all to do with the slow response from the remote lab. I asked her who told her to do that – she had called the Vue helpdesk, not VMware. I told her to call VMware, to which she replied “We can’t do that.”  By then the environment had improved from ‘lousy’ to ‘nearly tolerable’ so I gave up complaining.

After the exam I spoke with the VMware certification team and I received confirmation of the following:

  1. VMware has been assured by Pearson Vue that VCAP candidates will be able to get in contact with VMware’s support team for problems with the exam environment.
  2. VMware’s support team has the authority to grant extended time in the event of an environment failure.
  3. Pearson Vue has the capability to extend the exam time.

The slow performance of the environment set me back too far to recover;  I was unable to complete the exam. Maybe it will cost me a passing grade, maybe it won’t, but Pearson Vue’s failure to rectify the situation is inexcusable.

VMware load balancing with Cisco C-series and 1225 VIC card

I recently did a UCS C-series rackmount deployment. The servers came with a 10gbps 1225 VIC card and the core routers were a pair of 4500s in VSS mode.

The 1225 VIC card lets you carve virtual NICs from your physical NICs. You can put CoS settings directly on the virtual NICs, enabling you to prioritize traffic directly on the physical NIC. For this deployment, I created 3 virtual NICs for each pNIC – Management, vMotion, and VM traffic. By setting CoS to 6 for management, 5 for VMs, and 4 for vMotion on the vNICs, I ensure that management traffic is never interrupted, and I also guarantee that VM traffic will be prioritized over vMotion. This still allows me to take full advantage of 10gbps of bandwidth when the VMs are under light load.

Cisco 1225 VIC vNIC

My VCAP5-DTD exam experience

I took the VCAP5-DTD beta exam on January 3rd, 2013. Like many people, I received the welcome news today that I passed the exam.

I’m laughing a little to myself as I write this post because my certification folder contains a log of my studying. I downloaded the beta blueprint on December 17, 2012, but I already had Microsoft exams scheduled for December 28th. I did no studying for this VCAP until the day before the exam, January 2nd, when you can clearly see my feverish morning download activity. I will say, though, that I have several years of View deployments under my belt, so my knowledge on the engineering side was up to date and at the front of my mind.

VCAP5-DTD Folder

I downloaded every PDF referenced in the exam blueprint, and I already had most of the product documentation downloaded. I am primarily a delivery engineer, but to be successful on the exam you need to put on your designer’s hat. I tried to keep that in mind as I pored through the PDFs – it does make a difference because different information will stand out if you actively look for design elements.

My exam was just after lunch and it was well over an hour away, so I left early and brought my Kindle. I continued going through the PDFs until exam time. The sheer volume of information you have to read through makes VMware design exams quite difficult. I suggest reading the answers before you read the question – this helps you identify clues in the question. There are detailed descriptions requiring 6 or more paragraphs of reading just to answer a single multiple choice question.

The GA version of the exam has 115 questions and 6 diagramming scenarios. Keep track of the number of diagramming questions you get so you can budget your time appropriately. You should not spend any more than 15 minutes on a diagram. Keep in mind that 15 * 6 = 90 minutes, leaving you only 105 minutes to answer 109 questions. The pace you have to sustain is mentally exhausting. The beta was even more difficult with 131  questions, plus the expectation to provide comment feedback on the questions.

I found the diagramming questions to be even more involved than the DCD questions. I’d say the tool was a bit better behaved than in the DCD exam, but not by much. It’s easy to get sucked into a design scenario and waste far too much time. Remember that you’re not designing the perfect system; it just has to be good enough to meet the stated requirements.

Moving PVS VMs from e1000 to VMXNET3 network adapter

A client needed to remove the e1000 NIC from all VMs in a PVS pool and replace it with the VMXNET3 adapter. PVS VMs are registered by MAC address – replacing the NIC means a new MAC, and PVS has to be updated to allow the VM to boot.

I needed a script to remove the old e1000 NIC, add a new VMXNET3 NIC, and register the new NIC’s MAC with PVS. I knew I could easily accomplish the VM changes with PowerCLI, but I didn’t know what options there were with Citrix. I found what I needed in McliPSSnapIn, a PowerShell snap-in installed on all PVS servers. The snap-in gives you PowerShell control over just about anything you need to do on a PVS server.

I didn’t want to install PowerCLI on the production PVS servers, and I didn’t want to install PVS somewhere else or try manually copying files over. I decided I needed one script to swap out the NICs and dump a list of VMs and MAC addresses to a text file, and a second script to read the text file and make the PVS changes.

First, the PowerCLI script. We put the desktop pool into maintenance mode with all desktops shut down. It takes about 10 seconds per VM to execute this script.

Param(
	[switch] $WhatIf
,
	[switch] $IgnoreErrors
,
	[ValidateSet("e1000","vmxnet3")]
	[string] 
 	$NICToReplace = "e1000"
)

# vCenter folder containing the VMs to update
$FOLDER_NAME = "YourFolder"

# vCenter Name
$VCENTER_NAME = "YourvCenter"

#The portgroup that the replacement NIC will be connected to
$VLAN_NAME = "VLAN10"

#If you want all VMs in $FOLDER_NAME, leave $VMFilter empty. Otherwise, set it to a pipe-delimited list of VM names
$VMFilter = ""
#$VMFilter = "DESKTOP001|DESKTOP002"

$LOG_FILE_NAME = "debug.log"

Connect-VIServer $VCENTER_NAME

$NICToSet = "e1000"

if ( $NICToReplace -eq "e1000" )
{
	$NICToSet = "vmxnet3"
}
elseif ( $NICToReplace -eq "vmxnet3" )
{
	$NICToSet = "e1000"
}


function LogThis
{
	Param([string] $LogText,
      	[string] $color = "Gray")
 Process
 {
    write-host -ForegroundColor $color $LogText 
    Add-Content -Path $LOG_FILE_NAME $LogText
 }
}

if ( Test-Path $LOG_FILE_NAME )
{
    Remove-Item $LOG_FILE_NAME
}

$errStatus = $false
$warnStatus = $false
$msg = ""

if ( $VMFilter.Length -eq 0 )
{
	$vms = Get-Folder $FOLDER_NAME | Get-VM
}
else
{
	$vms = Get-Folder $FOLDER_NAME | Get-VM | Where{ $_.Name -match $VMFilter }
}

foreach ($vm in $vms)
{
	$vm.Name
	$msg = ""


	if ( $vm.NetworkAdapters[0] -eq $null )
	{
		$errStatus = $true
		$msg = "No NIC found on " + $vm.Name
		LogThis $msg "Red"

	}
	else
	{
		if ( ($vm.NetworkAdapters | Measure-Object).Count -gt 1 )
		{
			$errStatus = $true
			$msg = "Multiple NICs found on " + $vm.Name
			LogThis $msg "Red"

		}
		else
		{
			if ( $vm.NetworkAdapters[0].type -ne $NICToReplace )
			{
				$warnStatus = $true
				$msg = "NIC is not " + $NICToReplace + ", found " + $vm.NetworkAdapters[0].type + " on " + $vm.Name
				LogThis $msg "Yellow"				
			}

				LogThis $vm.Name,$vm.NetworkAdapters[0].MacAddress

		}

	}



}

if ( $errStatus -eq $true -and $IgnoreErrors -ne $true )
{
	LogThis "Errors found, please correct and rerun the script." "Red"
 
}
else
{
	if ( $warnStatus -eq $true )
	{
		LogThis "Warnings were found, continuing." "Yellow"
	}
	foreach ( $vm in $vms )
	{
		if ( $WhatIf -eq $true )
		{
			$msg = "Whatif switch enabled, would have added " + $NICToSet + " NIC to " + $vm.Name
			LogThis $msg
		}
		else
		{
			$vm.NetworkAdapters[0] | Remove-NetworkAdapter -confirm:$false
			$vm | New-NetworkAdapter -NetworkName $VLAN_NAME -StartConnected -Type $NICToSet -confirm:$false
		}
	}

	if ( $VMFilter.Length -eq 0 )
	{
		$vms = Get-Folder $FOLDER_NAME | Get-VM
	}
	else
	{
		$vms = Get-Folder $FOLDER_NAME | Get-VM | Where{ $_.Name -match $VMFilter }
	}

	LogThis("Replaced MAC addresses:")
	foreach ( $vm in $vms )
	{
		LogThis $vm.Name,$vm.NetworkAdapters[0].MacAddress
	}
	
	
}

The script offers a -WhatIf switch so you can run it in test mode without actually replacing the NIC. It writes all its output to $LOG_FILE_NAME. First it logs the VMs with their old MACs, then the replaced MACs. The output looks something like this:
VD0001 00:50:56:90:00:0a
VD0002 00:50:56:90:00:0b
VD0003 00:50:56:90:00:0c
VD0004 00:50:56:b8:00:0d
VD0005 00:50:56:b8:00:0e
Replaced MAC addresses:
VD0001 00:50:56:90:57:1b
VD0002 00:50:56:90:57:1c
VD0003 00:50:56:90:57:1d
VD0004 00:50:56:90:57:1e
VD0005 00:50:56:90:57:1f

Scan the log file for any problems in the top section. The data after “Replaced MAC addresses:” is what the PVS server needs. Copy this over to the PVS host. Now we need to use McliPSSnapIn, but first we have to register the DLL. I followed this Citrix blog for syntax:
“C:\Windows\Microsoft.NET\Framework64\v2.0.50727\installutil.exe” “C:\Program Files\Citrix\Provisioning Services Console\McliPSSnapIn.dll”

I copied the VM names and new MAC addresses to a text file, vmlist.txt, and put it on my PVS server in the same folder as the following PowerShell script. It runs very quickly; it takes only a few seconds even if you are updating hundreds of VMs.

Add-PSSnapIn mclipssnapin
$vmlist = get-content "vmlist.txt"
foreach ($row in $vmlist)
{
	$vmname=$row.Split(" ")[0]
	$macaddress=$row.Split(" ")[1]
	$vmname
	$macaddress
	Mcli-Set Device -p devicename=$vmname -r devicemac=$macaddress
}
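
To spot-check the result, you can pull a device record back out of PVS and confirm the new MAC took; something like this should work (look for the deviceMac field in the output):

# Quick verification sketch: dump one device's record and check its deviceMac field
Mcli-Get Device -p devicename=VD0001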

Now, replace the PVS pool’s image with one that is prepared for a VMXNET3 adapter and boot the pool. Migration complete!