NSX per-VM licensing compliance

I had a customer with a production cluster of around 100 VMs. They needed to segment off a few VMs due to PCI compliance, and were looking at a large expense to create a physically separate PCI DMZ. I suggested instead the purchase of our per-VM NSX solution. This is a great cost-effective way to license NSX when you’re in a situation like this that doesn’t require socket licenses for your whole cluster.

The problem with per-VM licensing is compliance. VMware doesn’t have a KB explaining a way to make sure you aren’t using more than the number of licenses that you bought. If you add a 25-pack of NSX licenses to a cluster with 100 VMs in it, the vCenter licensing portal will show that you’re using 100 licenses but only purchased 25. VMware KB 2078615 does say “There is no hard enforcement on the number of Virtual Machines licensed and the product will be under compliance.” However, that KB describes the way per-socket licensing displays when you add it to vCenter, not per-VM licensing.

I’ve had a few conversations with the NSX Business Unit (NSBU) and the intent of per-VM licensing is to allow customers to use the NSX distributed firewall without physically segmenting clusters. You can run NSX-protected VMs in a cluster alongside non-NSX-protected VMs. However, you have to take some steps to ensure that you’re remaining in licensing compliance. This post shows you how to do it.

One way to do this is to avoid using ‘any’ in your firewall rules. If all of your firewall rules are VM to VM or security group to security group, all you have to do is keep the number of VMs those rules cover at or below the number of licenses you purchased. This is the simplest method if your requirements allow it, but it is difficult to craft a complete firewall policy without using ‘any’.
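A minimal sketch of that counting check, in Python. The rule format (dicts with `source` and `destination` lists of VM names) and both function names are hypothetical, made up for illustration; this is not an NSX API. It collects every distinct VM named in a rule and compares the count against the licenses purchased.

```python
def vms_covered_by_rules(rules):
    """Return the set of distinct VM names referenced by any rule."""
    covered = set()
    for rule in rules:
        covered.update(rule["source"])
        covered.update(rule["destination"])
    return covered


def is_compliant(rules, licenses_purchased):
    """True when no rule uses 'any' and the covered VMs fit the license count."""
    covered = vms_covered_by_rules(rules)
    if "any" in covered:
        # 'any' could match every VM in the cluster, so the count is unbounded.
        return False
    return len(covered) <= licenses_purchased


if __name__ == "__main__":
    # Hypothetical VM names, for illustration only.
    rules = [
        {"source": ["pci-web-01"], "destination": ["pci-db-01"]},
        {"source": ["pci-web-02"], "destination": ["pci-db-01"]},
    ]
    print(sorted(vms_covered_by_rules(rules)))
    print(is_compliant(rules, licenses_purchased=25))
```

In a real environment you would also have to expand security groups into their member VMs before counting; the sketch assumes that has already been done.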

An alternative way is to use security tags. It’s a bit more involved but lets you have precise control over where your NSX security policy is applied.

First, I create two custom tags: Custom_NSX.FirewallDisabled and Custom_NSX.FirewallEnabled.


I then assign the tags to my VMs. The disadvantage of this method is that you have to stay on top of security tagging your VMs. But it does make rule writing easier. I’m only creating two groups – NSX enabled and disabled. However, there’s nothing stopping you from creating multiple tags – maybe you have a DMZ1 and DMZ2, maybe PCI and HIPAA are separate tags.

In this case, I assign all of my PCI VMs the FirewallEnabled tag and the rest of my VMs the FirewallDisabled tag.


Now, instead of going to the Firewall section, I go to Service Composer. Don’t be confused by the fact that the security groups already exist – I took the screenshot of the Security Groups tab after I created the groups.


First, I create an NSX_Disabled group with a dynamic membership of Custom_NSX.FirewallDisabled.


Next, I create an NSX_Enabled security group with a dynamic membership of Custom_NSX.FirewallEnabled.


I then specifically exclude NSX_Disabled from the NSX_Enabled group. This guarantees that no firewall rules can touch my excluded VMs.
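To make the effect of the exclusion concrete, here is a small Python model of the two groups. It is only a sketch: the `vm_tags` dictionary and the function name are hypothetical, and it assumes that an exclusion takes precedence over a dynamic-membership match, which is the behaviour this setup relies on.

```python
ENABLED_TAG = "Custom_NSX.FirewallEnabled"
DISABLED_TAG = "Custom_NSX.FirewallDisabled"


def nsx_enabled_members(vm_tags):
    """Model the NSX_Enabled group: tagged-enabled VMs minus tagged-disabled VMs.

    vm_tags maps a VM name to the set of security tags assigned to it.
    """
    included = {vm for vm, tags in vm_tags.items() if ENABLED_TAG in tags}
    excluded = {vm for vm, tags in vm_tags.items() if DISABLED_TAG in tags}
    return included - excluded


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    vm_tags = {
        "pci-web-01": {ENABLED_TAG},
        "pci-db-01": {ENABLED_TAG},
        "file-01": {DISABLED_TAG},
        "oops-01": {ENABLED_TAG, DISABLED_TAG},  # tagged both ways by mistake
    }
    members = nsx_enabled_members(vm_tags)
    print(sorted(members))
    print(len(members) <= 25)  # compare against a 25-pack of licenses
```

Note that a VM carrying both tags by mistake falls out of the enabled group in this model, so a tagging error fails towards "not protected" rather than towards "over-licensed".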


I create a new security policy in Service Composer.


In the Firewall Rules section, NSX has something called “Policy’s Security Groups”. If we assign the policy to the NSX_Enabled security group, we can safely use an ‘any’ rule as long as the other side is ‘Policy’s Security Groups’. So the source can be ‘any’ if the destination is Policy’s Security Groups, or the destination can be ‘any’ if the source is Policy’s Security Groups. The security group we made ensures that NSX won’t apply rules to VMs that aren’t in the NSX_Enabled group.
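The logic can be sketched as a toy model in Python. Nothing here is an NSX API: the constants, the rule dictionaries, and the function names are all hypothetical, and the model only captures the idea that a rule with one side pinned to the policy's groups can never match traffic that does not involve a member of those groups.

```python
ANY = "any"
POLICY_GROUPS = "policys-security-groups"  # stand-in for "Policy's Security Groups"


def side_matches(side, vm, policy_members):
    """Does one side of a rule (source or destination) match this VM?"""
    if side == ANY:
        return True
    if side == POLICY_GROUPS:
        return vm in policy_members
    return False


def rule_matches(rule, src_vm, dst_vm, policy_members):
    """A rule matches a flow only when both of its sides match."""
    return (side_matches(rule["source"], src_vm, policy_members)
            and side_matches(rule["destination"], dst_vm, policy_members))


def rule_is_safe(rule):
    """Safe for per-VM licensing in this model: at least one side is pinned."""
    return POLICY_GROUPS in (rule["source"], rule["destination"])


if __name__ == "__main__":
    members = {"pci-web-01", "pci-db-01"}  # hypothetical NSX_Enabled members
    inbound = {"source": ANY, "destination": POLICY_GROUPS}
    print(rule_matches(inbound, "file-01", "pci-web-01", members))  # touches a member
    print(rule_matches(inbound, "file-01", "print-01", members))    # no member involved
    print(rule_is_safe({"source": ANY, "destination": ANY}))
```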


I then apply my new policy to the NSX_Enabled security group.


Doing a security policy this way is a bit more involved than simply using the Firewall section of NSX, but it’s worth considering. It’s a perfect way to ensure 100% compliance in a per-VM model. It also helps you unlock the power of NSX – all you have to do is security tag VMs and they automatically get their security policy.


OpenStack introduction

I have a large deal on the table and the customer is asking about VMware’s support for OpenStack. Since I know nothing about OpenStack, other than the fact that VMware offers VMware Integrated OpenStack (VIO), I decided it was time to find some training. Fortunately I have many specialists inside VMware who can help answer customer questions around VIO.

There’s plenty of VIO training internally at VMware, but I needed something even more basic, just an intro. I went to my trusty Pluralsight subscription and found Eric Wright’s Introduction to OpenStack. This is a great course for coming up to speed on the basics of OpenStack in only 2.5 hours.

Unable to connect virtual NIC in vCloud Air DRaaS

I had a customer open a service request. They were in the middle of a DR test using vCloud Air DRaaS and were unable to connect one virtual machine to the network. It kept failing with a generic “unable to connect” error.

It turns out that their VM had a VMDK with a fractional size, like 50.21GB instead of just 50GB. I don’t see it often, but this sometimes happens when you P2V a machine. The vCloud Director backend can’t handle the decimal point in the disk size, so it errors out.

I’m not entirely sure why the backend can’t handle it, but the fix is to resize your source disk to a whole number of GB and run replication again.
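If you want to check a list of disks before a DR test, the test itself is trivial. A minimal sketch in Python, with hypothetical function names and the disk size supplied by hand in GB (it does not query vCenter or vCloud Director). It rounds up, on the assumption that growing a disk is the safer direction.

```python
import math


def needs_resize(size_gb):
    """True when the disk size is not a whole number of GB."""
    return not float(size_gb).is_integer()


def next_whole_gb(size_gb):
    """Smallest whole-GB size that is at least as large as the current disk."""
    return math.ceil(size_gb)


if __name__ == "__main__":
    for size in (50, 50.21, 120.5):
        if needs_resize(size):
            print(f"{size} GB -> resize to {next_whole_gb(size)} GB")
        else:
            print(f"{size} GB is fine")
```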

Life and death monitoring

The Brookfield Zoo posted this statement on Facebook today and also emailed it out to all members.

On July 10, there was a drop in the oxygen level at Brookfield Zoo’s Stingray Bay habitat. Veterinary staff was promptly on the scene to provide medical treatment to the affected stingrays. Additionally, immediate action was taken by animal care staff to rectify the situation and get the levels back to normal. Despite tireless efforts by staff, all the animals, which included four southern stingrays and 50 cownose rays, succumbed.

“We are devastated by the tragic loss of these animals,” said Bill Zeigler, senior vice president of animal programs for the Chicago Zoological Society, which operates the zoo. “Our staff did everything possible to try and save the animals, but the situation could not be reversed.”

Staff is currently analyzing the life support system to determine the exact cause of the malfunction. At this time, the Chicago Zoological Society has made the decision to not reopen the summer-long temporary exhibit for the remainder of the season. The popular exhibit has been operating since 2007.

The zoo posted a further clarification on what kind of monitoring was used in the enclosure.

Brookfield Zoo 15 minute monitoring - Facebook


I’m not a zoologist and I have no experience with monitoring systems for animals. But I do have vast experience monitoring critical services. A simple Google search finds $200 monitors for home aquariums that take readings every 6 seconds. It seems reasonable to assume that a commercial system would offer something similar.

In IT, we take painstaking care to ensure that our critical servers stay online.  Most of what we do has nothing to do with life-and-death. Even though we’re only protecting our company financials and reputation, any halfway decent system that I stand up has:

  • Independent power feeds from separate areas of the building
  • Dual power distribution units, dual power on all equipment
  • N+1 architecture to sustain the loss of one host in the cluster
  • Redundant storage controllers with redundant paths
  • Battery backup
  • Appropriate monitoring

In general, a monitoring interval of 5 minutes is the maximum I would ever allow for a Production server. Critical servers could be monitored as frequently as 1 minute. Load balancers watch services as frequently as every 5-10 seconds. All of this work is done to ensure availability of the services.

Anything can go wrong and it’s possible that a more frequent monitoring interval would not have made a difference in this case. But at first glance, a 15 minute interval seems negligent. If I can monitor my goldfish every 6 seconds, it seems that the zoo should have been monitoring the rays more closely. A 15 minute monitoring interval means that you can’t expect a human response for at LEAST 20 minutes, and that doesn’t seem sufficient when the lives of 54 animals depend on it.
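The 20 minute figure is simple arithmetic: in the worst case a problem starts just after a reading, so it goes unnoticed for a full interval, and then somebody still has to respond. A small Python sketch of that sum, where the 5 minute human response time is my assumption (it is what turns 15 minutes into 20) and the function name is made up for illustration:

```python
ASSUMED_RESPONSE_S = 5 * 60  # assumption: 5 minutes for a human to react to an alert


def worst_case_response_seconds(poll_interval_s, response_s=ASSUMED_RESPONSE_S):
    """Worst case: the fault begins right after a poll, then a human responds."""
    return poll_interval_s + response_s


if __name__ == "__main__":
    intervals = [
        ("zoo, every 15 minutes", 15 * 60),
        ("production server, every 5 minutes", 5 * 60),
        ("critical server, every 1 minute", 60),
        ("home aquarium monitor, every 6 seconds", 6),
    ]
    for label, interval_s in intervals:
        total_min = worst_case_response_seconds(interval_s) / 60
        print(f"{label}: up to {total_min:.1f} minutes")
```

With a short polling interval the human response time dominates; with a 15 minute interval the monitoring itself is the larger part of the delay.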

A quick NSX microsegmentation example

This short post demonstrates the power of NSX. My example is a DMZ full of webservers – you don’t want any of your webservers talking to each other. If one of your webservers happens to be compromised, you don’t want the attacker to then have an internal launching pad to attack the rest of the webservers. The webservers only need to communicate with your application or database servers.

We’ll use my lab’s Compute Cluster A as a sample. Just pretend it’s a DMZ cluster with only webservers in it.

Compute Cluster A


I’ve inserted a rule into my Layer 3 ruleset and named it “Isolate all DMZ Servers”. In my traffic source, you can see that you’re not stuck with IP addresses or groups of IP addresses like a traditional firewall – you can use your vCenter groupings like Clusters, Datacenters, Resource Pools, or Security Tags to name a few.

Rule Source

I add Compute Cluster A as the source of my traffic. I do the same for the destination.

NSX Source Cluster


My rule is now ready to publish. As soon as I hit publish changes, all traffic from any VM in this cluster will be blocked if it’s destined for any other VM in this cluster.
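As a toy model of what this rule does, here is a short Python sketch. The function name and the VM names are hypothetical, and the set of cluster members stands in for whatever vCenter reports at the moment a packet is evaluated; it is not how NSX is implemented.

```python
def is_blocked(src_vm, dst_vm, cluster_members):
    """Block a flow when both ends live in the cluster named in the rule."""
    return src_vm in cluster_members and dst_vm in cluster_members


if __name__ == "__main__":
    compute_cluster_a = {"web-01", "web-02"}  # hypothetical webservers

    print(is_blocked("web-01", "web-02", compute_cluster_a))  # webserver to webserver
    print(is_blocked("web-01", "app-01", compute_cluster_a))  # webserver to app tier

    # A new VM lands in the cluster: the rule itself is not edited,
    # only the membership of the object it points at changes.
    compute_cluster_a.add("web-03")
    print(is_blocked("web-03", "web-01", compute_cluster_a))
```

The point of the last step is that the rule references the cluster object rather than a list of IP addresses, so membership changes are picked up without touching the rule.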

Ready to publish


Note that these were only Layer 3 rules, so we’ve secured traffic going between subnets. However, nothing’s stopping webservers on the same subnet from talking to each other. No worries here though: we can implement the same rule at Layer 2.

Once this rule gets published, even VMs that are layer 2 adjacent in this cluster will be unable to communicate with each other!

NSX layer 2 block

This is clearly not a complete firewall policy as our default rule is to allow all. We’d have to do more work to allow traffic through to our application or database servers, and we’d probably want to switch our default rule to deny all. However, because these rules are tied to vCenter objects and not IP addresses, security policies apply immediately upon VM creation. There is no lag time between VM creation and application of the firewalling policy – it is instantaneous! Anybody who’s worked in a large enterprise knows it can take weeks or months before a firewall change request is pushed into production.

Of course, you still have the flexibility to write IP-to-IP rules, but once you start working with vCenter objects and VM tags, you’ll never want to go back.

Enable and Disable GPOs with PowerShell

Post Updated April 13, 2015

I received a comment below saying the BSonPosh link to Microsoft was dead. It appears that Microsoft has retired the code modules. It also looks like they have a native PowerShell equivalent; examples of how to use it are here.

If you don’t want to modify the script to use the native Microsoft method, I do still have my original download of the modules. Here’s a link to the original BSonPosh modules: BSonPosh.zip

Original Post Feb 19, 2011

We had a need to enable and disable groups of GPOs on a recurring basis and wanted to automate the process.

This script relies on the BSonPosh module. It also relies on the Windows PowerShell Group Policy cmdlets. The Group Policy cmdlets are on Windows 2008 R2 DCs, a server with the GPMC installed, or Windows 7 with the RSAT installed.

To use the script, create a text file with the name of each GPO that you want to control via the script, one per line. Lines starting with # are treated as comments. The script takes 2 parameters: whether to enable or disable the GPOs, and the name of the text file with the list of GPOs.

	Param(
		#Enabled or Disabled, whether you want the GPOs enabled or disabled
		[string]$GPOStatus = $(Throw '$GPOStatus is required'),
		#List of GPOs to enable/disable
		[string]$GPOList = $(Throw '$GPOList is required')
	)

	$GPO_DISABLED = "AllSettingsDisabled"
	$GPO_ENABLED  = "AllSettingsEnabled"

	#Change the specified GPO's GpoStatus property
	function SetGPOStatus( [string]$GPOName, [string]$Status ) {
		$gpo = Get-GPO -Name $GPOName -Server $PDC.ServerName -ErrorAction SilentlyContinue
		if ( $null -eq $gpo ) {
			Write-Host "Could not locate" $GPOName
		} else {
			$gpo.GpoStatus = $Status
			Write-Host "Set" $gpo.DisplayName "to" $gpo.GpoStatus
		}
	}

	#Attempt to load a module with Import-Module
	function TryImportModule( [string]$ModuleName ) {
		if ( $null -eq (Get-Module $ModuleName) ) {
			Import-Module $ModuleName
			if ( $null -eq (Get-Module $ModuleName) ) {
				Write-Host "Unable to load module" $ModuleName
				return $false
			}
		}
		return $true
	}

	# Microsoft module to manage Group Policy
	$retval = TryImportModule "grouppolicy"
	if ( $retval -eq $false ) { return } #Cannot continue without the module

	# Community module that will help retrieve FSMO roles
	$retval = TryImportModule "bsonposh"
	if ( $retval -eq $false ) { return } #Cannot continue without the module

	# Modify the GPOs on the server with the PDC Master FSMO role
	$PDC = Get-Fsmo -role "PDCMaster" -ErrorAction SilentlyContinue
	if ( $null -eq $PDC ) {
		Write-Host "Could not locate PDC Master"
		return
	}

	# Validate Status flag input
	if ( $GPOStatus.ToLower() -eq "disabled" ) {
		$SetFlag = $GPO_DISABLED
	} elseif ( $GPOStatus.ToLower() -eq "enabled" ) {
		$SetFlag = $GPO_ENABLED
	} else {
		Write-Host "Invalid value '$GPOStatus' for param GPOStatus. Allowed values: [Disabled|Enabled]."
		return
	}

	# Ensure we actually have a list of GPOs in our text file
	if ( (Test-Path $GPOList) -eq $false ) {
		Write-Host "Could not locate" $GPOList
	} else {
		$AllGPOs = Get-Content $GPOList
		if ( $null -eq $AllGPOs ) {
			Write-Host $GPOList "is empty."
		} else {
			foreach ( $myGPO in $AllGPOs ) {
				#Skip blank lines, and allow comments in the text file
				if ( $myGPO.Trim() -ne "" -and -not $myGPO.StartsWith("#") ) {
					SetGPOStatus $myGPO $SetFlag
				}
			}
		}
	}

Example usage: .\SetGPOStatus.ps1 -GPOStatus "Disabled" -GPOList "gpolist.txt"