I worked on a client outage over the weekend, Virutal Center and View Composer were down. It started with a disk full situation on the SQL server hosting the vCenter, Composer, and Events databases. The client was shut down for winter break, so the Composer outage was not noticed for several days. After fixing the SQL Server disk space problem, everything came back up. I was able to restart all services and they appeared to be running. Composer started without issue, but it didn’t respond to any commands – any operations I requested in View Manager were ignored. I didn’t find any obvious errors in the logs.
I ran through the troubleshooting options in KB1030698 without finding any issues. I validated the SDK was responding by going to https://vcenteripaddress/sdk/vimService.wsdl . I couldn’t find any cause for the outage, so I opened up a Sev-1 ticket with VMware Support.
The support tech concluded that a problem with the ADAM database was preventing Composer from doing the job. He had me shut down all but one connection broker, then restart the View services on the remaining broker. At this point, commands issued on the broker were obeyed by Composer. We deleted or refreshed all of the desktops listed under Problem Desktops. Once we were sure that the ADAM database reflected the true state of the environment as reflected in vCenter, we restarted the other brokers. They synced databases and the problem was resolved.
I am building 3 new ESXi hosts to add to an existing 3 host cluster. The cluster is supporting a View install. The backend SAN is a fibre channel VNX. I attached my ethernet and installed ESXi via kickstart. As always, I did not attach the HBAs to ensure no accidental overwrites of a VMFS datastore.
I joined my 3 new hosts to the cluster and ran Update Manager to get them patched and up to date. While that ran, I started looking at zoning the fibre switches. As I did that, a problem with the desktop image was reported to me. I fixed the image, took a snapshot, and went to recompose my test pool to validate that the fix worked. The recompose fails with “View Composer Fault: VMware.Sim.Fault.VcDatastoreInaccessibleFault”
Hmm. I checked all three existing hosts and they were able to see all datastores. A quick Google search turned up KB2001736. The article states “This issue occurs when one or more hosts in the cluster do not have access to a datastore required by the pool. You may also encounter this if one of the hosts is in maintenance mode.”
This article doesn’t seem to convey the seriousness of the problem. View composer will fail to recompose if ANY host in your cluster loses access to the storage. It doesn’t matter if the host is in maintenance mode or not. I performed several tests to validate and they were consistent – any host with all paths down causes Composer to bomb.
Evicting the offending host from the cluster is the only solution. While this isn’t a major problem for my brand new hosts, it is inconvenient. For an actual production host, it is quite annoying. If you have a failing HBA, you can’t recompose unless you evict the host. This causes you to lose all of your historic performance statistics. Again, not a showstopper – but it’s not a choice you should have to make. View Composer should ignore hosts in maintenance mode.