ASUS stock firmware routing problem?

I have a very simple setup: an ASUS edge router on a /24, a routed connection to my homelab Cisco layer 3 switch, and a few /24 SVIs on the Cisco. I have static routes on the ASUS for the Cisco SVI networks, and a default route on the Cisco pointing to the ASUS.
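To make the intended forwarding behavior concrete, here is a minimal sketch of the longest-prefix-match lookup the ASUS should be doing with those static routes. Every address below is hypothetical (I'm not publishing my real addressing), and `next_hop` is just an illustrative helper, not anything that runs on the router.

```python
import ipaddress

# Hypothetical addressing for illustration only; none of these values are from my lab.
ASUS_ROUTES = [
    # (destination network, next hop; None means directly connected)
    (ipaddress.ip_network("192.168.1.0/24"), None),
    (ipaddress.ip_network("10.10.10.0/24"), ipaddress.ip_address("192.168.1.2")),
    (ipaddress.ip_network("10.10.20.0/24"), ipaddress.ip_address("192.168.1.2")),
    (ipaddress.ip_network("0.0.0.0/0"), ipaddress.ip_address("203.0.113.1")),
]


def next_hop(routes, destination):
    """Return the (network, next hop) pair with the longest matching prefix."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if dest in net]
    if not matches:
        raise LookupError(f"no route to {destination}")
    # The most specific route (largest prefix length) wins.
    return max(matches, key=lambda match: match[0].prefixlen)
```

With a table like this, traffic for a host on a Cisco SVI network should be handed to the Cisco's routed interface (192.168.1.2 in the sketch) rather than sent out the default route.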

A few months back, lightning struck near the house and fried my cable modem, ASUS, and Cisco switch. I replaced all of them, but I could never correctly communicate with the homelab. When I was directly connected to the Cisco switch (3750), I had no problems and could communicate with all SVIs. I could ping back and forth between the 3750 and the ASUS (RT-AC66U_B1). But I could never SSH (or drive any other traffic) from the 3750 to the RT-AC66U, or from the RT-AC66U to the 3750. This has baffled me for some time, but I was bypassing it by directly connecting to the lab with an ethernet cable. I finally sat down to solve it today.

Even though my ethernet cable between the ASUS and Cisco was able to carry successful ping traffic, and tested OK with a cable tester, I decided to replace it. I apparently can still make my own ethernet cables successfully 🙂 The problem persisted after replacement.

Thinking maybe my laptop was the culprit, I tried other devices but they all exhibited the same behavior. Then I started looking at the ASUS. I had always used the Merlin firmware for my ASUS because the stock firmware was severely lacking in features. However, the newest stock firmware looked OK when I bought the new ASUS, so I kept it. And there was my mistake. I saw a couple of posts saying that static routing wasn’t working correctly on ASUS routers.

Stock ASUS firmware 3.0.0.4_380_7743 running on my RT-AC66U_B1 does not seem to correctly handle static routes. As soon as I flashed the router to Merlin-RT-AC68U_380.68_4, all of my routing problems disappeared. I didn’t even lose my config.

 

A reflection on the VMworld Hackathon

Many others have written posts summarizing VMworld, so I won’t do that here. If you’d like a live-Tweet archive of the keynotes, you can look on my Twitter timeline starting on August 28, 2017. For a full blogpost, please check out Paul Woodward Jr.’s recap, as well as Sheng Sheen’s detailed VMware announcements post.

I had a great opportunity to participate in the VMworld Hackathon and I believe it was a career-changing experience. Back to that in a minute. First, let’s explore why I was part of the Hackathon at all. I’m not a developer. I’m a presales engineer. Although I wrote code for a living a while back, I haven’t developed anything professionally in almost 10 years. Most of what I did was classic ASP and VBA, and a few monster T-SQL stored procs. It wasn’t what I considered “real” programming at the time – that was for folks who wrote object-oriented code, used big fancy source control systems, worked on large team projects, etc.

Paul did a vBrownBag tech talk at VMworld; see the replay of From CNC to VCP: A Journey of Professional Growth. One of the things Paul talked about is building your personal brand and the power of social media. To help build his brand, Paul decided to start the ExploreVM Podcast. Without Twitter, I wouldn’t have known that he was starting a podcast. Without Twitter, I wouldn’t have seen him offering guest slots on the podcast, and I wouldn’t have made Episode 7 – Making the Move to a Pre-Sales Role with him.

Without Twitter, Nick Korte wouldn’t have found the podcast, listened to it, and reached out to me via Twitter DM to ask questions.

Without Twitter, I wouldn’t have known Nick’s name as I scrolled through the list of Hackathon leaders when I was considering a team. And I probably wouldn’t have joined a team because I was intimidated – I’m not a programmer. But I knew Nick, and he’s not a programmer either; he’s a sysadmin. It’s not scary to join a team with a sysadmin, right? So I joined. Nick did a great post-Hackathon writeup; check that out here.

Without Twitter, I wouldn’t have met Chris Dye, one of the professional developers on our team. He kindly spent his time filling in some of my knowledge gaps as I struggled to understand how software development works today.

A number of people spent considerable time running pre-Hackathon training sessions. I went to Jeeyun Lim’s excellent “Getting started with Clarity” session. I learned that I still have a lot to learn – but I understood what Jeeyun was doing. I understood how things like Node.js and Angular make my life much simpler. I understood how the frameworks take what I used to do in hundreds of lines of classic ASP and turn it into a few configuration options. And thankfully, VMware has invested in a Pluralsight account, allowing me to learn what I’ve missed in the last decade.

I’ll never become a world-class developer. I won’t write any earth-shattering algorithms or contribute to the Linux kernel. There’s a reason I moved out of development and into the infrastructure side. But in this world of automation and devops, being able to write and understand code is a necessity. Hackathon rekindled my interest in programming. It made me realize that I don’t have to be somebody who builds APIs, or builds PowerShell libraries, or writes kernel code. Being able to programmatically consume what others have already made for me is enough. I took my first step towards understanding last week, and I will continue this week and future weeks. I hope I get to go to VMworld next year, and if there’s a Hackathon, you can bet that I’ll participate. I might even contribute some code this time.

I will close by saying that you do NOT need to be a developer to participate in Hackathon. In fact, the best teams have a mix of infrastructure folks and developers, as there is always plenty for the infra folks to do. If you get the opportunity next year, sign up. It’s worth it!

Invoking the vRealize Automation API – Part II

In Part I, I talked about why I wanted to learn API calls in vRA and how I got my lab environment working. In Part II, I will talk about how I learned how to make an API call.

I relied heavily on Grant Orchard’s getting started guides. I have linked to Part I, II, and III below, with my explanations of how I used his blog to achieve my goal.

Part I – Getting Started [grantorchard.com]

I couldn’t figure out how to browse through API calls because I wasn’t seeing what Grant was showing. It took me forever to realize that at the very bottom of the page, you can click on Show/Hide – then the API calls appear and you can drill into each one for full details.

Part II – Building Your First API Call [grantorchard.com]

Grant wrote:
Before we start, perform the following steps.
1. Download Postman.
2. Import this Postman collection of the vRA 7.2 API.
3. Import this Postman environment variables file.
4. Open up the API docs at https://{{vra-fqdn}}/component-registry/services/docs

Postman? What’s Postman? It’s a GUI tool to issue API calls.

What’s a Postman collection? It’s a group of API calls that you can easily click on in the GUI.

I can easily search a Collection for the API I want. In this case, I know I want to get the Bearer token (Grant explains this, it’s how the API requests are authenticated), so I search for ‘token’. I click on the “returns a token associated with the provided credentials” and it opens up the request complete with the proper URL. It saves me from having to manually piece together the API calls and paste them into Postman.

You’ve probably noticed {{vra-fqdn}} in the URL. It’s not just a placeholder. It’s an environment variable.

Grant provided a bunch of environment variables in his post – you can import the environment variables and change them to match your lab environment. You can reference these variables inside Postman.
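As a rough illustration of what that substitution amounts to, here is a small sketch that resolves `{{name}}` placeholders from a dictionary. The `render` helper and the `vra.lab.example` hostname are hypothetical. This is not how Postman is implemented, and unlike Postman the sketch raises an error on an undefined variable so mistakes surface early.

```python
import re


def render(template, environment):
    """Replace each {{name}} placeholder with environment[name].

    Raises KeyError when a placeholder has no value (a choice made for this sketch).
    """
    def lookup(match):
        return environment[match.group(1)]

    # Variable names here may contain letters, digits, underscores, and hyphens.
    return re.sub(r"\{\{\s*([\w-]+)\s*\}\}", lookup, template)


# Hypothetical lab value, not my real FQDN.
environment = {"vra-fqdn": "vra.lab.example"}
docs_url = render("https://{{vra-fqdn}}/component-registry/services/docs", environment)
```

Changing one entry in the environment retargets every request that references it, which is why importing Grant's environment file and editing it is quicker than fixing each URL by hand.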

Following Grant’s example, I opened the token API in Postman. The ‘Tests’ section saves the Bearer token in a variable named “token”.
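Outside Postman, the same token request might look roughly like the sketch below. The endpoint path (`/identity/api/tokens`), the JSON body fields, and the `id` field in the response are what I understand the vRA 7.x identity API to use, so treat them as assumptions and check them against the API docs in your own environment. The function names and sample values are hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request


def build_token_request(vra_fqdn, username, password, tenant):
    """Build (but do not send) the POST request that asks vRA for a bearer token."""
    body = json.dumps(
        {"username": username, "password": password, "tenant": tenant}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{vra_fqdn}/identity/api/tokens",  # assumed path
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def extract_token(response_body):
    """Pull the token out of the JSON response, like the Postman 'Tests' script does."""
    return json.loads(response_body)["id"]  # assumed field name
```

Sending the request with `urllib.request.urlopen` in a lab with self-signed certificates would also need an SSL context that trusts the appliance certificate, which is left out of this sketch.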

Part III – Requesting a Catalog Item [grantorchard.com]

Grant’s post said the API I needed was ‘entitledCatalogItemViews’. You can see that I’m using the {{vra-fqdn}} variable in the URL as well as passing the Bearer {{token}} value. One problem I ran into is that you must have a space between Bearer and {{token}}.
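The Bearer-plus-space detail is easy to get wrong in a script too, so here is a minimal sketch of building the header and the catalog URL. The `/catalog-service/api/consumer/` path prefix is my understanding of where `entitledCatalogItemViews` lives in vRA 7.x and should be treated as an assumption. `auth_headers` and `catalog_url` are hypothetical helper names.

```python
def auth_headers(token):
    """Return request headers carrying the bearer token.

    The scheme and the token must be separated by exactly one space.
    """
    token = token.strip()
    if not token:
        raise ValueError("token must not be empty")
    return {"Authorization": f"Bearer {token}", "Accept": "application/json"}


def catalog_url(vra_fqdn):
    """URL of the entitled catalog item views API (path prefix is an assumption)."""
    return f"https://{vra_fqdn}/catalog-service/api/consumer/entitledCatalogItemViews"
```

Stripping whitespace from the token before formatting guards against a stray newline copied along with the token value.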

Hit Send and my results come back. I have only one blueprint, a Linked Clone blueprint with Photon Linux in it. You can see two links – one for the GET: Request Template, and the other for the POST: Submit Request. The Request Template will return an example set of JSON showing you how to make the POST call to start the Blueprint.

Now I open another Postman tab and paste in the Request Template URL. Add the proper header for Authorization, and hit Send.

This is just a subset of the JSON I got back. I left the tab open and launched a new tab.

In the new tab, I used the URL in the Submit Request response that I got above. I did the same Authorization header as used previously, and pasted the Template JSON from above into the Body field.

 

After pressing Send, I got this response in the Body. You can see a Request ID as well as a state of “Submitted”.

There is also an API where you can check on the state of a request. You can see now that the state has changed from Submitted to In Progress.

You can see my request in progress inside vRA

You can also see activity in the vSphere Web Client.

You can continue checking on the provisioning status by clicking Send in Postman. You would do the same thing programmatically: periodically poll the API for this asynchronous request to determine when it has completed. We now see that the state is Successful instead of In Progress.
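A polling loop for that asynchronous request could be sketched as follows. `fetch_state` is a hypothetical callable that performs the GET against the request-status API and returns the state string. The uppercase state names and the set of terminal states are assumptions on my part, so confirm them against what your vRA actually returns.

```python
import time

# Assumed terminal states; verify against the responses from your own vRA.
TERMINAL_STATES = {"SUCCESSFUL", "FAILED", "REJECTED"}


def wait_for_request(fetch_state, interval_seconds=10.0, max_attempts=60, sleep=time.sleep):
    """Call fetch_state() until it returns a terminal state, then return that state.

    Raises TimeoutError if no terminal state is seen within max_attempts polls.
    """
    for _ in range(max_attempts):
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        sleep(interval_seconds)  # wait before asking again
    raise TimeoutError(f"request still not finished after {max_attempts} polls")
```

Passing the sleep function in as a parameter keeps the loop easy to exercise without waiting in real time, and the attempt cap stops a stuck request from hanging a demo-prep script forever.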

I now have a new Item in vRA.

Now that I know the correct APIs to use, and that they work as expected in my lab environment, I can get to work calling them from PowerShell. Part III of the series will document this process.

Invoking the vRealize Automation API – Part I

This post was inspired by a desire to speed up the prep time of my demos. We use nested demo environments hosted inside vCloud Director. The nested environments have resource limitations and we sometimes have to shut down unused VMs in a demo environment to ensure that other components get enough resources to execute. I also wanted to do as little prep work inside vRA as possible – automatically launch blueprints so I have a few managed VMs to show off. My idea was to write a PowerShell script that could be easily launched from the desktop.

First, I did a simple install of vRA in my home lab (this was back in May, vRA 7.2). I’d like to thank my friend Eric Shanks for his fantastic vRA7 guide available at The IT Hollow. His posts have been extremely valuable in helping get my lab environment working. When I built my environment, I used the same Windows 2012 template machine for both my IAAS box and my SQL Server. This ended up being a major source of trouble for me, which I will detail later.

This week, I started following Eric’s guide to configure vRA. I got it to the point of creating a new tenant and got AD authentication working. Then I tried using the vCenter endpoint that had been created but the logs were throwing SSL errors. I deleted it and recreated it, which was successful, but then I saw logs in Infrastructure > Monitoring > Logs that said it was looking for something named ‘vCenter’. So I deleted the endpoint again and named it vCenter. I tried a bunch of stuff including deleting and recreating, then I got other errors and eventually got it to work and I saw my compute resources under the vCenter endpoint.

I moved on to try making a Fabric Group, but I could only select my lab cluster. It didn’t have any resources in it, and I couldn’t assign any compute or storage. I went back to the logs and found “DataBaseStatsService: ignoring exception: Error executing query usp_SelectAgent Inner Exception: Error executing query usp_SelectAgentCapabilities”

I googled the error and came up with this Communities page as well as KB543238. They both pointed me to MSDTC being a problem. But the KB seemingly only applied to vRA 6.x. I followed the communities post and tried uninstalling and reinstalling MSDTC, but no success.

At this point I wondered if I was hitting some 7.2 bug. Since 7.3 was out, I ran an upgrade. The vRA appliance and IAAS box upgraded without issue. As soon as I logged back in, the vCenter Endpoint wasn’t working at all. The log was full of errors saying “Failed to connect to the endpoint. To validate that a secure connection can be established to this endpoint, go to the vSphere endpoint on the Endpoints page and click the Test Connection button. Inner Exception: Certificate is not trusted (RemoteCertificateChainErrors).”

Per the vRA 7.3 Release Notes, certificate validation is turned on. Not wanting to mess around with signed certificate replacement in the lab, I got around this problem by downloading the root CA certificate from the homepage of my VCSA, and installing it in the Trusted Root Certification Authorities store on the IAAS box. Making this change brought me back to the usp_SelectAgent error. I logged into SQL and tried to see if I could execute the usp_SelectAgent stored procedure, which worked fine.

Having debugged the problems for the better part of two days at this point, I went for help, which thankfully came quickly in our internal message board. My problem was definitely the MSDTC – even if you Sysprep a box, it doesn’t reset the MSDTC unique CID – so the IAAS box was unable to communicate with the SQL server.

I followed this procedure to reset the CID on both SQL and IAAS:

1. Stop the Manager Service.
2. Stop the SQL Server service.
3. Open a command prompt on the machine with the Manager Service and issue the following command:
msdtc -uninstall
4. Open a registry editor on the machine with the Manager Service and delete the following keys if they exist:

HKLM\Software\Microsoft\MSDTC
HKLM\System\CurrentControlSet\Services\MSDTC
HKEY_CLASSES_ROOT\CID

5. Reboot the machine with the Manager Service.
6. Open a command prompt on the machine with the Manager Service and issue the following command:
msdtc -install
7. Perform steps 3-6 on the machine running the SQL Server.
This procedure generates new CID values for MSDTC on both servers.

After this procedure was completed, everything worked and I was able to continue my vRA configuration without issue.

In Part II, I will cover how I learned some basic vRA API operations.

 

 

Home Lab, NSX, NTP, and Update Manager

I’m writing about a comedy of errors trying to deploy vRA and NSX in my homelab. As usual, I just grab the bits and start trying to make things work.

tl;dr: Use NTP in your home lab, NSX host prep relies on Update Manager, and look at the VMware interop guide.

My first problem was an inability to deploy the NSX VIBs with error “Agent VIB module not installed”. I googled around and found this KB, “‘Agent VIB module not installed’ when installing EAM/VXLAN Agent using VUM (2053782)”, but it didn’t apply to 6.5.

I started tailing /var/log/messages and my vpxd.log on the VCSA and saw certificate errors as well as complaints about NTP drift. So I tried refreshing the certificates:
/usr/lib/vmware-updatemgr/bin/updatemgr-util refresh-certs

This eliminated the certificate complaints, and the NTP errors were because I hadn’t properly configured NTP across the whole lab. I rarely have dedicated lab time, so I’m doing the work in small chunks across many days. Sometimes I miss some of the basic steps like NTP. I ended up syncing everything in the lab to the NTP service running on my core switch.

I still was unable to deploy the VIBs, and at that point I discovered that the VIB deployment is reliant on Update Manager. I opened up update manager in the web client only to discover that it errored out. Having services down was not a surprise because I’m experimenting with just how little RAM I can give the VCSA in my lab and still have it functional.

root@vc01 [ /usr/lib/vmware-updatemgr/bin ]# service-control --status
Running:
applmgmt lwsmd vmafdd vmonapi vmware-cm vmware-content-library vmware-eam vmware-perfcharts vmware-rhttpproxy vmware-sca vmware-sps vmware-vapi-endpoint vmware-vmon vmware-vpostgres vmware-vpxd vmware-vpxd-svcs vmware-vsan-health vmware-vsm vsphere-client vsphere-ui
Stopped:
vmcam vmware-imagebuilder vmware-mbcs vmware-netdumper vmware-rbd-watchdog vmware-statsmonitor vmware-updatemgr vmware-vcha
root@vc01 [ /usr/lib/vmware-updatemgr/bin ]# service-control --start vmware-updatemgr
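If you'd rather not eyeball that wall of service names, here is a small sketch that parses the text printed by the status command and tells you whether a given service is stopped. It assumes the output layout shown above (a `Running:` or `Stopped:` header followed by space-separated names) and that you pass in only the command output, not the shell prompts. `parse_service_status` is a hypothetical helper, not part of any VMware tooling.

```python
def parse_service_status(output):
    """Group service names under the lowercase section header they appear beneath."""
    status = {}
    current = None
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            # A header such as "Running:" or "Stopped:" starts a new section.
            current = line[:-1].lower()
            status[current] = []
        elif current is not None:
            status[current].extend(line.split())
    return status


# Shortened sample in the same layout as the output above.
sample = """Running:
 applmgmt lwsmd vmware-eam vmware-vpxd
Stopped:
 vmcam vmware-imagebuilder vmware-updatemgr vmware-vcha
"""
update_manager_stopped = "vmware-updatemgr" in parse_service_status(sample)["stopped"]
```

A check like this would have pointed me at Update Manager before I ever opened the web client.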

Once I started the service I could use update manager in the GUI, but still not deploy the VIBs. I rebooted vCenter, still no success deploying VIBs even though all of the services were up. At this point I went back to the basics of how to install NSX, and the first step is to check the interop. Uh-oh. My lab is running vSphere 6.5.0d. I was trying to deploy NSX 6.2.7. And they’re not compatible.

I ended up fixing some bugs in the lab, which was good, but I spent a lot of time troubleshooting something that was never going to work!

Once I used a supported version, things got much better.


New job, same company

I’m excited to announce my promotion to the newly formed Dell Execution Team (DET) at VMware, effective May 8, 2017! No, we’re not a team of highly trained assassins. The Dell Execution Team helps drive VMware’s software-defined datacenter message through the strength of Dell Technologies. I’ve already done a few DET meetings with customers and I’m beyond thrilled with customer response. Although I’m leaving my role as a Commercial Systems Engineer, I do get to remain in the same excellent Commercial organization at VMware – new manager, same director.

We are actively hiring for my old job! Check out the job posting. If you’re in Chicago and have great VMware skills, hit me up! If I know you, I’ll refer you.

Windows 10 – OneDrive won’t start

OneDrive mysteriously stopped working on my Windows 10 laptop. I have no idea what went wrong. I tried reinstalling it even though it’s now part of Windows 10, but nothing worked. I finally came across this post and wanted to repost it in case it ever disappears. My situation was the same: the GPO setting was unset, which should have had the same effect as Disabled, but I had to explicitly set it to Disabled to get OneDrive working again.

User ‘pirwen’ posted the solution that worked for me at this link in the Microsoft Community.

“I remembered seeing an option to prevent the usage of OneDrive via the Group Policy editor, seems that there is also an option to force enable it. Here are the steps I followed:
On your keyboard hit Windows Key + R to open the Run dialog and type: gpedit.msc and hit Enter to open Local Group Policy Editor.
Next navigate to Computer Configuration\Administrative Templates\Windows Components\OneDrive. In the right panel, double click Prevent the usage of OneDrive for File Storage.
Then here instead of selecting Enabled (as many tutorials suggest to disable OneDrive on Windows 10) I selected Disabled, and saved my changes. This option was originally unset, which should have worked just as if it was disabled, except it didn’t.
After doing this I opened OneDrive again and got a notification for an update. After a few seconds it opened, and was finally working again.”

VCIX6 designation clarifications

There is a lot of confusion out there regarding the upgrade paths, the VCIX6, and underlying VCP6 requirements to achieve certification. The Certification folks are working on clarifying language on our website and accurate instructions for our customer-facing employees behind the certification@vmware.com alias.

I am writing this post from the standpoint of the Datacenter Virtualization exam, since that is the track that I am following for my VCDX attempt. If you’re in a different track, the same policy applies for your specific track.

      • If you are brand new to the DCV track, you have to pass the VCP6-DCV exam. You can’t use a VCP6-DTM to start up the DCV track.
      • If you are a VCAP5-DCA, you can pass the VCAP6-DCV Design exam to achieve the VCIX6-DCV designation.
      • If you are a VCAP5-DCD, you can pass the VCAP6-DCV Administration exam to achieve the VCIX6-DCV designation.
      • If you are both a VCAP5-DCA and VCAP5-DCD, you can take either the VCAP6-DCV Design *OR* VCAP6-DCV Administration exam to achieve your VCIX6-DCV designation.
      • Your VCP in the Datacenter Virtualization track must be valid (unexpired). VCAP5 holders with a valid VCP do NOT have to take the VCP6-DCV exam to sit a VCAP6 exam.
      • Passing the VCAP6-DCV Design or Administration exam extends the expiration date of your VCP for 2 additional years.
      • Achieving the VCIX6-DCV designation will not give you the underlying VCP6-DCV certification.
      • The VCIX6-DCV is the only prerequisite for VCDX. You DO NOT need a VCP6-DCV certification.

 

VCAP6-DCV Design exam 3V0-622 – Rescore

Update December 6, 2016

The rescore process is complete, all results have been posted to Cert Manager.

Update December 5, 2016

The batch processing at Pearson continues to fail. The certification team is manually updating all score results. This will take a considerable amount of time, but they are making good progress. The hope is to have all rescore results posted by the end of the day Pacific time on December 6th.

Original Post November 30, 2016

Exam takers who failed the 3V0-622 received a notice from Pearson that the exam was under review and might be rescored. The email said a rescore was expected by November 20th. We are obviously well beyond that date and people are still anxiously awaiting results of the rescore. I am among those waiting for news.

I have volunteered some of my time with the certification team as a SME to help develop exam content (not for 3V0-622). It’s given me insight into just how extraordinarily time consuming it is to create a legally defensible certification exam. No portion of the process is simple. It’s quite similar to putting code into production – even the slightest change means you have to run your entire battery of testing before promoting code. Any hiccup means re-running your tests from the beginning.

I have spoken internally with the Certification team at VMware regarding the status of 3V0-622. They are doing everything they can to get the rescores out. However, you cannot magically make the processes work faster – the whole process from end-to-end takes 3-4 days. QA processes take the amount of time they take and cannot be rushed or skipped. Pearson has encountered a number of technical difficulties with the exam drivers and has had to run the process multiple times. Progress was further impeded by various resources being unavailable due to the Thanksgiving holiday last week.

At this point we are hoping for exam results to be available online on Friday December 2nd.