Custom Dashboards in vROps 6

I was doing a vRealize Operations demo with a customer today, and they had a specific request for viewing CPU data. They wanted to see a list of physical hosts, CPU utilization metrics for each of those hosts, and then be able to drill into specific CPU stats for the VMs running on each host. We will create a custom dashboard to display this information easily.

Here’s the finished product first. On the top left, we want all of our hosts and clusters to display. When you click on a host or cluster, we want the host metrics to show up in the top right box. Then, we want all of the VMs in the host or cluster to show up in the bottom left box with VM-specific metrics.

18-DashboardResults2-ViewHost

First, we want to create a new custom view for the host CPU metrics. There are many out-of-the-box views inside vROps – you can use any of them, or create your own. We will see both methods in this post – a custom view for the host metrics, and an out-of-the-box view for the VM metrics.

To create a new view, go into Views and click the green plus.

1-NewView

Name the view and add a description.

2-ViewName

Pick how the data will display – in this case, we want a list format with columns.

3-ListView

We want vSphere host metrics, which come from the vCenter Adapter. We pick vCenter Adapter, then Host System.

4-SubjectsvCenterAdapter

5-Subjects-HostSystem

We want three host metrics to show. CPU Demand shows how much CPU the VMs on the host are demanding. CPU Capacity Usage shows how much CPU is actually used. CPU Demand can be higher than CPU Capacity Usage due to limits, either set directly on the VM or imposed by resource pools. There are resource pool limits in this test environment, so we might expect to see higher CPU demand than usage. The final metric we want is CPU Contention. We drag all three from the metrics list on the right to the include box in the middle.

6-Metrics
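As a side note, these same metrics can also be pulled programmatically from the vROps Suite API. The sketch below is only a rough illustration, not part of the dashboard build – the address, credentials, and especially the stat keys are assumptions, so verify the exact metric keys in your own environment before relying on them.

```python
# Rough sketch: pull host CPU metrics from the vROps Suite API (vROps 6.x).
# The stat keys below are assumptions -- verify the exact keys in your environment.
import requests

VROPS = "https://vrops.example.com"  # placeholder vROps address
STAT_KEYS = ["cpu|demandPct", "cpu|usage_average", "cpu|capacity_contentionPct"]

# Acquire a token (verify=False only because lab certs are self-signed)
auth = requests.post(f"{VROPS}/suite-api/api/auth/token/acquire",
                     json={"username": "admin", "password": "changeme"},
                     headers={"Accept": "application/json"}, verify=False)
headers = {"Authorization": f"vRealizeOpsToken {auth.json()['token']}",
           "Accept": "application/json"}

# List all host resources, then fetch the latest value of each stat key
hosts = requests.get(f"{VROPS}/suite-api/api/resources",
                     params={"resourceKind": "HostSystem"},
                     headers=headers, verify=False).json()["resourceList"]
for host in hosts:
    stats = requests.get(
        f"{VROPS}/suite-api/api/resources/{host['identifier']}/stats/latest",
        params={"statKey": STAT_KEYS}, headers=headers, verify=False).json()
    print(host["resourceKey"]["name"], stats)
```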

Finally, we pick the availability settings for our new view. We want to be able to include it in Dashboards, so we make sure that box is checked. A couple other boxes are checked by default – we leave them checked.

7-Visibility

Now we create a new dashboard: from the Home screen, click on Actions, then Create Dashboard.

8-CreateDashboard

Name the dashboard and provide a description.

9-NameDashboard

We’re going to add 3 widgets to our dashboard. First, we drag an Object List widget into the top left corner. We then drag a View widget into the top right and bottom left.

10-AddWidgets

Now, we customize the Object List. Click on the Edit button.

11-EditObjectList

We name the Object List. We only want vSphere Hosts and Clusters showing up, so we expand the Object Types option.

12-ModifyObjectList

We want Cluster Compute Resources and Host System. We click on the first one, then Ctrl-click the second to highlight both.

13-SelectClusterResource

14-SelectHostSystem

After these changes, we save the object list.

Now, we edit the View widget on the top right. We name it Host CPU Summary, then pick our Custom CPU view that we created at the beginning of this post.

15-HostCPUSummary

We edit the bottom left view widget. We name it VM CPU Details, and we pick a standard view called Virtual Machine CPU Diagnose List.

16-VMCPUDiagnose

Finally, we modify the Widget interactions. When we select a host or cluster object in the Host / Cluster list box, we want it to change the two view boxes. We configure the Widget Interactions to use the Host / Cluster List selection as the data source, and we have it feed the Host CPU Summary and VM CPU Details view boxes. Click Apply Interactions to save the interactions.

10a-WidgetInteraction

In our completed dashboard, we click on the demo-mgmt Cluster. All of the hosts in the cluster show up in the Host CPU Summary box. All of the VMs in the cluster show up in the VM CPU Details box.

17-DashboardResults1-ViewCluster

This is an example of clicking a single host – only the metrics for the one host show up in the Host CPU Summary box, and the VMs running on that one host show up in the VM CPU Details box.

18-DashboardResults2-ViewHost

Here we see more of the metrics available in the Virtual Machine CPU Diagnose List view. Again, we could have created a custom view for the widget instead – it all depends on what metrics you want to show.

19-DashboardResults3-ViewHost

Here is a link to the zipfile containing the JSON dashboard definition and the XML definition for the Custom CPU view that we created.

vROps Custom CPU exported objects

Custom Groups and Policies in vROps 6

This post is based on our Hands-On Lab HOL-SDC-1610, which you can use for free at http://labs.hol.vmware.com

This post shows you how to create a custom monitoring policy in vROps 6.

First, this is what my cluster looks like in the lab vCenter Web Client. In this scenario, we want a custom monitoring policy for all VMs in Cluster Site A because they are critical VMs and need more aggressive monitoring. We want to change the memory % contention symptom so that it alerts at a lower percentage of contention.

1-Site-A-VMs

We go into Custom Groups from inside vROps and click on the plus to add a new group.

2-CustomGroups

We name the group “VMs in Cluster A Prod”, pick a Group Type of “Function”, and for now pick the Default Policy. There are various group types – in this case, we are separating the VMs based on function (Critical Prod). We check the Keep group membership up to date box. This ensures that new VMs added to the cluster get picked up by the group.

We want to select VMs, so the Object Type is Virtual Machine. We want to select VMs based on the cluster they run in. In the vROps nav tree, VMs are descendants of a cluster, so we set the object criteria to Relationship, Descendant of, and contains. We set the nav tree dropdown to “vSphere Hosts and Clusters”.

3-NewGroup

The name box autofills as we type – Cluster Site A appears, and we click on it to fill the box. We now have our custom group of all VMs inside Cluster Site A.
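If you want to sanity-check which VMs the dynamic group should pick up, a quick pyVmomi script can walk the same descendant relationship in vCenter. This is just an illustrative sketch – the vCenter address, credentials, and cluster name are placeholders.

```python
# Sketch: list the VMs that are descendants of a cluster, i.e. the VMs
# our "Descendant of Cluster Site A" criterion should capture.
from pyVim.connect import SmartConnectNoSSL, Disconnect
from pyVmomi import vim

si = SmartConnectNoSSL(host="vcenter.example.com",  # placeholder vCenter
                       user="administrator@vsphere.local", pwd="changeme")
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
for cluster in view.view:
    if cluster.name == "Cluster Site A":
        # Every VM on every host in the cluster is a descendant of the cluster
        for host in cluster.host:
            for vm in host.vm:
                print(vm.name)
view.Destroy()
Disconnect(si)
```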

4-NewGroup_2

We now move into the Policy Library. The default policy is indicated with a priority of “D”. The concept of inheritance lets you have a master default policy and then override a few specific settings based on group membership.

5-DefaultPolicy

We’re going to create a new policy for Cluster A and base it on the Default policy.

6-NewPolicy

We jump down to the Alert/Symptom Definitions.

6a-SymptomDef

To easily find our symptom, we can pick vCenter Adapter > Virtual Machine from this dropdown, and then type “memory” in the filter box to find all VM-related memory symptoms.

7-MemoryPolicy

Here, we’ve changed the State to Local with Override, then changed the threshold from 10 to 8. Any VMs bound to this policy will alert when memory contention reaches 8% instead of the default 10%.

8-OverrrideContention

The final step is to select our groups that will use the new policy. We check the box for our VMs in Cluster A Prod custom group.

9-PickGroup

Here is the Default policy with its subordinate policies. In Lab 1610, there is also another subordinate policy for a specific VM, linux-App-02a. This is an example of how granular you can get with your policies, getting down to overriding settings even for a specific VM.

10-Policy

We have a YouTube video on this topic as well: Customize Operational Policies in vRealize Operations Manager

Workspace One screenshots

Today, VMware announced the launch of Workspace One and I wanted to throw a couple of screenshots out there. As a field engineer, I use Horizon Workspace every day to access my work applications. I’ve been using Workspace One for the last month and I’m happy with how responsive it is.

This is the Android URL to get the Workspace One App on your phone:
https://play.google.com/store/apps/details?id=com.airwatch.vmworkspace&hl=en

And the Apple AppStore:
https://itunes.apple.com/us/app/vmware-workspace-one/id1031603080?mt=8

This is what my workspace looks like in Google Chrome. I’ve got the Favorites showing, which are the 6 primary apps I use at VMware. Our catalog is full of many dozens of apps, so it’s nice to have a quick Favorites list.

Workspace One - Chrome

This is the Workspace One app on my iPhone. It’s an almost identical look and feel, and the favorites I set while in Chrome are the same favorites on my iPhone.

Workspace One - iPhone

At VMware, we use two-factor authentication to access Workspace One. However, I only had to enter my credentials and RSA key once. After that, I can get back into the app with the Touch ID stored on my iPhone.

Workspace One - Credentials

Using snapshots to back up SQL under heavy IOPS loads

I find this problem coming up frequently in the field – you’ve virtualized your SQL server and you’re trying to back it up with your snapshot-based software. In many cases, the SQL server is under such a heavy load that it’s impossible to commit the snapshot after the backup is taken. There’s just too much IO demand. You end up having to take a SQL outage to stop IO long enough to get the snapshot committed.

Here’s one strategy for setting up your virtual SQL servers to avoid this problem altogether. It uses a disk mode called independent persistent. An independent persistent disk is excluded from snapshots – all data written to an independent persistent disk is immediately committed, even if a snapshot is active on the VM. If you place the SQL datafile and logfile drives in independent persistent mode, they will never be snapshotted, eliminating the problem of having to commit a post-backup snapshot.
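For reference, here’s a hedged pyVmomi sketch of flipping an existing disk into independent persistent mode. The VM name and disk label are placeholders, and the disk should have no existing snapshots when you change the mode – treat this as a starting point, not a production script.

```python
# Sketch: set a virtual disk to independent persistent mode with pyVmomi.
# Placeholder names throughout; the VM should have no snapshots when changing modes.
from pyVim.connect import SmartConnectNoSSL, Disconnect
from pyVmomi import vim

si = SmartConnectNoSSL(host="vcenter.example.com",
                       user="administrator@vsphere.local", pwd="changeme")
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "sql01")  # hypothetical VM name

for dev in vm.config.hardware.device:
    # Target the datafile/logfile disk by its label, e.g. "Hard disk 3"
    if isinstance(dev, vim.vm.device.VirtualDisk) and dev.deviceInfo.label == "Hard disk 3":
        dev.backing.diskMode = "independent_persistent"  # excluded from snapshots
        spec = vim.vm.ConfigSpec(deviceChange=[vim.vm.device.VirtualDeviceSpec(
            operation=vim.vm.device.VirtualDeviceSpec.Operation.edit, device=dev)])
        vm.ReconfigVM_Task(spec)

view.Destroy()
Disconnect(si)
```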

Here’s a disk layout that I’ve used for SQL servers. These drives are set to standard mode, so a snapshot picks them up.

C:\ – 40GB, SCSI 0:0
D:\ – 30GB, SQL binaries, SCSI 0:1
L:\ – 1GB, logs, SCSI 1:0
O:\ – 1GB, datafiles, SCSI 2:0
Y:\ – 1GB, backups, SCSI 3:0
Y:\SQLBAK01 – 2TB+, SCSI 3:1, mounted filesystem under Y:\

Your backup drive is limited to 2TB minus 512 bytes if you’re using vSphere 5.1 or earlier, but can go up to 62TB on later versions of vSphere.

L:\Logs01 – SCSI 1:1, independent persistent, variable size, mounted filesystem under L:\
O:\SQLData01 – SCSI 2:1, independent persistent, variable size, mounted filesystem under O:\

Part of why we used mountpoints was for consistency – no matter what, L: was always logs, O: was always SQL data, and Y: was always backups. There were no questions as to whether a particular SQL server had a certain number of drives for a specific purpose – the entire structure was under a single, consistent drive letter.

Depending on the workload, O:\SQLData01 might have only 1 heavily used database on a single LUN, or it might have a bunch of small databases.  When we needed another one, we’d attach another mountpoint O:\SQLData02 on SCSI 2:2, L:\Logs02 on SCSI 1:2, Y:\SQLBAK02 on SCSI 3:2. Nightly backup jobs wrote SQL backups out to Y:\.  Since the Y drives are all in standard mode, backup jobs picked up the dumps in the normal snapshotting process.

If you had a complete loss of the entire SQL VM, you could restore from backup and you’d still have the L:, O:, and Y: drives with their mountpoints (although they might not have any disk attached to them), and you’d have to restore the databases from the SQL dumps on Y:\. Depending on the nature of the VM loss, you might have to spend some time manually fixing the mounts.

It took a little bit of work to maintain, but our backups worked every time. Part of setting up a new database was that the DBAs wrote a restore script and stored it in the root of Y:, which got backed up as part of the snapshot. Once the VM came back from a Veeam restore, the DBAs would bring up SQL, run the restore scripts, and we were off and running. You also need to coordinate your DBAs’ backup schedule carefully with your backup software’s schedule – what you don’t want is backups being written to the Y: drive while you’ve got an active snapshot in place, because you could easily fill up the datastore if your backups are large enough. Some backup software allows you to execute pre-job scripting, and it’s a fairly simple task to add some code there to check whether an active SQL backup is running – if so, postpone your backup snapshot and try again later.
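Here’s a minimal sketch of that pre-job check, assuming Python with pyodbc and a SQL login that has VIEW SERVER STATE. The server name and credentials are placeholders, and how the exit code gets interpreted depends on your backup software’s pre-job hook.

```python
# Minimal pre-job sketch: exit non-zero if a SQL backup is currently running,
# so the backup software can postpone its snapshot and retry later.
import sys
import pyodbc

conn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                      "SERVER=sql01;UID=backupcheck;PWD=changeme")  # placeholders
row = conn.execute("SELECT COUNT(*) FROM sys.dm_exec_requests "
                   "WHERE command LIKE 'BACKUP%'").fetchone()
conn.close()

if row[0] > 0:
    print("SQL backup in progress -- postponing snapshot job")
    sys.exit(1)
print("No SQL backup running -- safe to snapshot")
```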

NSX per-VM licensing compliance

I had a customer with a production cluster of around 100 VMs. They needed to segment off a few VMs due to PCI compliance and were looking at a large expense to create a physically separate PCI DMZ. I suggested instead that they purchase our per-VM NSX solution. This is a cost-effective way to license NSX when you’re in a situation like this, where you don’t need socket licenses for your whole cluster.

The problem with per-VM licensing is compliance. VMware doesn’t have a KB explaining how to make sure you aren’t using more than the number of licenses you bought. If you add a 25-pack of NSX licenses to a cluster with 100 VMs in it, the vCenter licensing portal will show that you’re using 100 licenses but only purchased 25. VMware KB 2078615 does say “There is no hard enforcement on the number of Virtual Machines licensed and the product will be under compliance.” However, that KB relates to the way per-socket licensing displays when you add it to vCenter, not to per-VM pricing.

I’ve had a few conversations with the NSX Business Unit (NSBU) and the intent of per-VM licensing is to allow customers to use the NSX distributed firewall without physically segmenting clusters. You can run NSX-protected VMs in a cluster alongside non-NSX-protected VMs. However, you have to take some steps to ensure that you’re remaining in licensing compliance. This post shows you how to do it.

One way to do this is to avoid using ‘any’ in your firewall rules. If all of your firewall rules are VM-to-VM or security-group-to-security-group, all you have to do is keep the total VM count below your purchased VM count. It is difficult to craft a firewall policy without using ‘any’, but this is the simplest approach if your requirements lend themselves to it.
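A simple scheduled check can watch that count for you. Below is a hedged pyVmomi sketch that compares the number of VMs in vCenter against a purchased license count – the connection details and the 25-pack figure are placeholders.

```python
# Sketch: compare the VM count against a purchased per-VM NSX license count.
from pyVim.connect import SmartConnectNoSSL, Disconnect
from pyVmomi import vim

PURCHASED_LICENSES = 25  # hypothetical 25-pack

si = SmartConnectNoSSL(host="vcenter.example.com",
                       user="administrator@vsphere.local", pwd="changeme")
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
vm_count = len(view.view)
view.Destroy()
Disconnect(si)

print(f"{vm_count} VMs vs {PURCHASED_LICENSES} purchased licenses")
if vm_count > PURCHASED_LICENSES:
    print("WARNING: VM count exceeds purchased per-VM NSX licenses")
```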

An alternative way is to use security tags. It’s a bit more involved but lets you have precise control over where your NSX security policy is applied.

First, I create two custom tags, Custom_NSX.FirewallDisabled and Custom_NSX.FirewallEnabled.

nsx-custom-tags

I then assign tags to my VMs as shown below. The disadvantage of this method is that you have to keep making sure VMs get security tagged, but it does make rule writing easier. I’m only creating two groups – NSX enabled and disabled. However, there’s nothing stopping you from creating multiple tags – maybe you have a DMZ1 and a DMZ2, or maybe PCI and HIPAA are separate tags.

In this case, I assign all of my PCI VMs the FirewallEnabled tag and the rest of my VMs the FirewallDisabled tag.

assign-security-tag
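Tagging can also be scripted so new VMs don’t get missed. The sketch below uses the NSX-v REST API as I understand it – the tag ID, the VM managed-object ID, and the credentials are all placeholders, so verify the paths against the API guide for your NSX version.

```python
# Sketch: attach a security tag to a VM via the NSX-v REST API.
# IDs and credentials are placeholders; confirm paths in your NSX API guide.
import requests

NSX = "https://nsxmgr.example.com"
AUTH = ("admin", "changeme")

# List existing security tags to find the objectId of Custom_NSX.FirewallEnabled
tags = requests.get(f"{NSX}/api/2.0/services/securitytags/tag",
                    auth=AUTH, verify=False)
print(tags.text)  # look for the securitytag-### objectId in the XML

# Attach that tag to a VM by its vCenter managed-object ID (e.g. vm-42)
resp = requests.put(
    f"{NSX}/api/2.0/services/securitytags/tag/securitytag-10/vm/vm-42",
    auth=AUTH, verify=False)
print(resp.status_code)  # 200 means the tag was applied
```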

Now, instead of going to the Firewall section, I go to Service Composer. Don’t be confused by the fact that the security groups already exist – I took the screenshot of the Security Groups tab after I created the groups.

service-composer

First, I create an NSX_Disabled security group with a dynamic membership of Custom_NSX.FirewallDisabled.

custom-disabled

Next, I create an NSX_Enabled security group with a dynamic membership of Custom_NSX.FirewallEnabled.

custom-enabled

I then specifically exclude NSX_Disabled from the NSX_Enabled group. This guarantees that no firewall rules can touch my excluded VMs.

nsx-exclude

I create a new security policy in Service Composer.

new-security-policy

In the Firewall Rules section, NSX has something called “Policy’s Security Groups”. If we assign the policy to the NSX_Enabled security group, we can safely use an ‘any’ rule as long as the other side is ‘Policy’s Security Groups’. So the source could be ‘any’ if the destination is Policy’s Security Groups, or the destination could be ‘any’ if the source is Policy’s Security Groups. The security groups we made ensure that NSX won’t apply rules to VMs that aren’t in the NSX_Enabled group.

policys-security-groups

I then apply my new policy to the NSX_Enabled security group.

policy_apply

security-group-select

Doing a security policy this way is a bit more involved than simply using the Firewall section of NSX, but it’s worth considering. It’s a perfect way to ensure 100% compliance in a per-VM model. It also helps you unlock the power of NSX – all you have to do is security tag VMs and they automatically get their security policy.

OpenStack introduction

I have a large deal on the table, and they are asking about VMware’s support for OpenStack. Since I know nothing about OpenStack other than the fact that VMware offers VMware Integrated OpenStack (VIO), I decided it was time to find some training. Fortunately, I have many specialists inside VMware who can help answer customer questions around VIO.

There’s plenty of VIO training internally at VMware, but I needed something even more basic, just an intro. I went to my trusty Pluralsight subscription and found Eric Wright’s Introduction to OpenStack. This is a great course for coming up to speed on the basics of OpenStack in only 2.5 hours.

Unable to connect virtual NIC in vCloud Air DRaaS

I had a customer open a service request: they were in the middle of a DR test using vCloud Air DRaaS and were unable to connect one virtual machine to the network. It kept failing with a generic unable-to-connect error.

It turns out that their VM had a VMDK sized with a decimal point, like 50.21GB instead of just 50GB. I don’t see it often, but this sometimes happens when you P2V a machine. The vCloud Director backend can’t handle the decimal point in the disk size, so it errors out.

I’m not entirely sure why the error happens, but the fix is to resize your source disk to a whole number of GB and run replication again.
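If you’d rather find these VMs before a DR test instead of during one, a quick pyVmomi sweep can flag any disk whose size isn’t a whole number of GB. As usual, this is only a sketch – the vCenter address and credentials are placeholders.

```python
# Sketch: flag VMDKs whose capacity is not a whole number of GB.
from pyVim.connect import SmartConnectNoSSL, Disconnect
from pyVmomi import vim

GB_IN_KB = 1024 * 1024  # vSphere reports disk capacity in KB

si = SmartConnectNoSSL(host="vcenter.example.com",
                       user="administrator@vsphere.local", pwd="changeme")
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
for vm in view.view:
    if not vm.config:
        continue
    for dev in vm.config.hardware.device:
        if isinstance(dev, vim.vm.device.VirtualDisk) and dev.capacityInKB % GB_IN_KB:
            print(f"{vm.name}: {dev.deviceInfo.label} = "
                  f"{dev.capacityInKB / GB_IN_KB:.2f}GB (not a whole GB)")
view.Destroy()
Disconnect(si)
```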