vCenter Event Broker Appliance – Part VIII – Basic Troubleshooting Techniques

In Part VII of this series, we deployed a second sample function – the Host Maintenance function written in PowerShell. In this post, we discuss some basic troubleshooting techniques. This post was updated on 2020-03-07 with updated screenshots for the VEBA 0.3 release.

The VEBA is running Kubernetes.  Going into this project, I knew nothing about Kubernetes – and still don’t know very much. But I’ve learned enough to be able to do some basic troubleshooting. In the end it’s all a series of Linux commands.

You might want to SSH to the appliance instead of using the console. To do this, log in as root on the console, then execute the command systemctl start sshd. SSH will run, but it will not start automatically on the next reboot.
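If you want SSH to survive a reboot, systemd can also enable the unit at boot. The sketch below wraps both steps in one function. The name enable_ssh is made up for this example, and it assumes the unit is called sshd, as in the command above. Whether to leave SSH enabled permanently is a security decision for your own environment.

```shell
# enable_ssh is a hypothetical helper name, not part of VEBA.
# Assumption: the SSH unit is named sshd, as in the start command above.
enable_ssh() {
  # Start SSH for the current boot, then register it to start on every boot.
  systemctl start sshd &&
    systemctl enable sshd
}

# Usage, as root on the appliance console:
#   enable_ssh
```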

VEBA code is running in Kubernetes pods, and those pods are running in namespaces. kubectl lets you run commands against Kubernetes. So first we list out the namespaces with kubectl get namespaces.

One of the namespaces you will see is openfaas. We issue the command kubectl get pods -n openfaas and see all of the pods that make up OpenFaaS itself. Each of your functions runs inside its own container.

PROTIP: kubectl get pods -A gives you all the pods and their associated namespaces in one command.
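Building on that command, a quick first check is to list only the pods that are not in the Running state. This is a minimal sketch: not_running is a name invented for this example, and it assumes STATUS is the fourth column of the kubectl get pods -A output (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# not_running is a hypothetical helper name.
# Assumption: with -A the columns are NAMESPACE NAME READY STATUS RESTARTS AGE,
# so the pod status is field 4.
not_running() {
  kubectl get pods -A --no-headers |
    awk '$4 != "Running" { print $1 "/" $2 ": " $4 }'
}

# Usage on the appliance:
#   not_running
```

An empty result means every pod reports Running. Anything listed is a good candidate for kubectl logs.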

kubectl logs gets us logfiles from the pods. The VMware Event Router pod is responsible for communicating with vCenter. Obviously these logs could contain interesting information if you’re trying to troubleshoot VEBA communications with vCenter.

kubectl logs vmware-event-router-5dd9c8f858-nph7k -n vmware --follow

 

Here is the output in a screengrab and also pasted. Note that we see event topics firing (AlarmStatusChangedEvent), followed by successful interception and function invocation. Rather than trying to dig through API documentation, one great way to figure out how to react to a vCenter event is to start tailing the logfile with --follow – this is essentially the same as tail -f. Then perform an action on vCenter and see which topic fired. You can then build a function to react to the event.

[OpenFaaS] 2020/03/08 00:28:19 invoking function(s) on topic: AlarmStatusChangedEvent
2020/03/08 00:28:19 Invoke function: powershell-datastore-usage
2020/03/08 00:28:20 Syncing topic map

[OpenFaaS] 2020/03/08 00:28:21 processing event [1] of type *types.AlarmStatusChangedEvent from source https://******************************/sdk: &{AlarmEvent:{Event:{DynamicData:{} Key:8689704 ChainId:8689704 CreatedTime:2020-03-08 00:28:15.875458 +0000 UTC UserName: Datacenter:0xa84c560 ComputeResource:<nil> Host:<nil> Vm:<nil> Ds:0xa84c6c0 Net:<nil> Dvs:<nil> FullFormattedMessage:Alarm 'Datastore usage on disk' on WorkloadDatastore changed from Gray to Green ChangeTag:} Alarm:{EntityEventArgument:{EventArgument:{DynamicData:{}} Name:Datastore usage on disk} Alarm:Alarm:alarm-7}} Source:{EntityEventArgument:{EventArgument:{DynamicData:{}} Name:Datacenters} Entity:Folder:group-d1} Entity:{EntityEventArgument:{EventArgument:{DynamicData:{}} Name:WorkloadDatastore} Entity:Datastore:datastore-60} From:gray To:green}

 

In the above log output, we can see that OpenFaaS caught an AlarmStatusChangedEvent (the Datastore usage on disk alarm on WorkloadDatastore changing from gray to green) and invoked the function powershell-datastore-usage. But we don’t know what happened inside that function.

PROTIP: you can also use --since=2m for the last two minutes of logs, or --tail=20 for the last 20 lines of log.
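When the event router log is busy, it can help to reduce it to just the topic names that fired. The sketch below is one way to do that, assuming the log lines keep the "invoking function(s) on topic:" wording shown in the output above. topics_fired is a name invented for this example.

```shell
# topics_fired is a hypothetical helper name.
# It reads event router log lines on stdin and prints each topic once.
# Assumption: topic lines contain "invoking function(s) on topic: <Topic>",
# as in the log output pasted above.
topics_fired() {
  sed -n 's/.*invoking function(s) on topic: //p' | sort -u
}

# Usage, with the pod name from this post (yours will differ):
#   kubectl logs vmware-event-router-5dd9c8f858-nph7k -n vmware --since=2m | topics_fired
```

Perform an action in vCenter, run the pipeline, and the topics printed are the ones your function could subscribe to.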

Let’s look directly into the function logs. Your functions run in the openfaas-fn namespace. We list out the pods.
kubectl get pods -n openfaas-fn

We then look at the logs with:
kubectl logs -n openfaas-fn powershell-datastore-usage-847d5c7875-286hv

Looking at the logs, we see the function firing.

 

Let’s look at a function that I broke intentionally. I update my secret with a configuration that has a bad password – you can test this behavior with any of the sample functions. We happen to be using the pytag function below.

 

Now I want to look at the logs for my pytag function. Again, the commands are:

kubectl get namespace
kubectl get pods -n openfaas-fn
kubectl logs pytag-fn-f66d6cffc-x5ghk -n openfaas-fn

An astute reader might notice that my pytag container changed names between the previous screenshot and the current one. The content for this post was written in two sessions – when I came back for the second screenshot, I had redeployed the function numerous times. This means I got a new pod with a new name. You can always find the pod name with the get pods command.
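Since the pod name changes with every redeploy, you may prefer to look it up by prefix instead of copying it by hand. This is a minimal sketch: find_pod is a name invented for this example, and it assumes the pod name starts with the function name, as it does for the pods shown in this post.

```shell
# find_pod is a hypothetical helper name.
# find_pod NAMESPACE PREFIX prints the first pod in NAMESPACE whose name
# starts with PREFIX, in the pod/<name> form that kubectl logs accepts.
find_pod() {
  kubectl get pods -n "$1" -o name | grep "^pod/$2" | head -n 1
}

# Usage:
#   kubectl logs "$(find_pod openfaas-fn pytag-fn)" -n openfaas-fn
```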

Here are the logs when I try to power up a VM after I pushed the broken vcconfig secret:

The log clues us in to an unauthorized error. It doesn’t specifically say password error, but you know there’s something wrong with authentication – at least you have a place to start troubleshooting.

2019/12/29 00:46:07 Path /
{"status": "500", "message": "could not connect to vCenter 401 Client Error: Unauthorized for url: https://vc01.ad.patrickkremer.com/rest/com/vmware/cis/session"}
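In a longer function log, the authentication failures can be pulled out with a simple filter. The sketch below assumes the error text contains the word "Unauthorized", as in the message above. auth_failures is a name invented for this example.

```shell
# auth_failures is a hypothetical helper name.
# It reads function log lines on stdin and keeps only those that mention
# "unauthorized" (case-insensitive), as seen in the pytag error above.
auth_failures() {
  grep -i 'unauthorized'
}

# Usage, with the pod name from this post (yours will differ):
#   kubectl logs pytag-fn-f66d6cffc-x5ghk -n openfaas-fn | auth_failures
```

Other failure types will word their messages differently, so treat this as a starting point and adjust the pattern to what your function actually logs.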

 

Now let’s fix it – we update the secret with the correct password.

You can see in the logfile that there was one more authentication failure before the new secret was picked up. At the time of writing, I’m not sure exactly what governs how quickly a new secret is picked up, but it is important to know that it is not instant.

 

In Part IX, we deploy the datastore usage alarms sample function.
