The Security Association (SA) lifetime defines how long a VPN tunnel stays up before swapping out encryption keys. In VMC, the default lifetimes are 86,400 seconds (1 day) for Phase 1 and 3,600 seconds (1 hour) for Phase 2.
I had a customer who needed Phase 2 set to 86,400 seconds. Strictly speaking, they were using IKEv2, and there really isn’t a Phase 2 with IKEv2 – but IKE is a discussion for a different day. Regardless, if you need an 86,400-second lifetime, you have to configure the setting as shown in this post, and it can only be done via API call. I will show you how to do it through the VMC API Explorer – you will not need any programming ability to execute the API calls using API Explorer.
Log into VMC and go to Developer Center > API Explorer, pick your SDDC from the dropdown in the Environment section, then click on the NSX VMC Policy API.
Search for VPN in the search box, then find Policy > Networking > Network Services > VPN > IPSec > IPSec Profiles.
Expand the first GET call, for /infra/ipsec-vpn-tunnel-profiles. Scroll down to the Execute button and click it to run the API call. You should get a response of type IPSecVpnTunnelProfileListResult.
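If you would rather script this step, here is a minimal Python sketch of the same GET. It assumes you have a CSP API token and your SDDC’s NSX Policy reverse-proxy URL (found in the VMC console under the SDDC details); both values below are placeholders, and the exact base path of the proxy URL varies by SDDC.

```python
import requests

# Placeholders – substitute your own values
CSP_AUTH = "https://console.cloud.vmware.com/csp/gateway/am/api/auth/api-tokens/authorize"
REFRESH_TOKEN = "<your-CSP-API-token>"
NSX_PROXY = "<your-SDDC-NSX-policy-proxy-URL>"

# Exchange the long-lived API token for a short-lived access token
auth = requests.post(CSP_AUTH, params={"refresh_token": REFRESH_TOKEN})
auth.raise_for_status()
headers = {"csp-auth-token": auth.json()["access_token"]}

# The same call API Explorer executes: list all IPsec VPN tunnel profiles
resp = requests.get(f"{NSX_PROXY}/policy/api/v1/infra/ipsec-vpn-tunnel-profiles",
                    headers=headers)
resp.raise_for_status()
for profile in resp.json()["results"]:
    print(profile["id"], profile.get("sa_life_time"))
```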
Click on the result list to expand it. The number of profiles will vary by customer – in my lab, we have 11 profiles.
I click on the first one and see my name in it, so I can identify it as the one I created and the one I want to change. I find the key sa_life_time set to 3600 – this is the value that needs to change to 86,400.
Click the down arrow next to the tunnel profile to download its JSON. Open it in a text editor and change the value from 3600 to 86400 (no commas in the number).
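For reference, the downloaded profile looks something like this – the field values below are illustrative, not my actual profile; only sa_life_time needs to change:

```json
{
  "resource_type": "IPSecVpnTunnelProfile",
  "id": "<tunnel-profile-id>",
  "display_name": "my-tunnel-profile",
  "encryption_algorithms": ["AES_256"],
  "digest_algorithms": ["SHA2_256"],
  "dh_groups": ["GROUP14"],
  "sa_life_time": 86400
}
```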
Now we need to push our changes back to VMC via a PATCH API call. Find the PATCH call under the GET call and expand it.
Paste the entirety of the JSON from your text editor into the TunnelProfile box and confirm the 86400 value is there. Paste the tunnel profile ID into the tunnel-profile-id field – you can find it as “id” in the JSON file. Click Execute. If successful, you will get a “Status: 200, OK” response.
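The programmatic equivalent is a PATCH of the edited JSON back to the same path. A sketch, reusing the placeholder proxy URL and token from the earlier example:

```python
import json
import requests

NSX_PROXY = "<your-SDDC-NSX-policy-proxy-URL>"  # placeholder
HEADERS = {"csp-auth-token": "<access-token>"}  # from the CSP token exchange

# Load the JSON downloaded from API Explorer and bump the lifetime
with open("tunnel-profile.json") as f:
    profile = json.load(f)
profile["sa_life_time"] = 86400

resp = requests.patch(
    f"{NSX_PROXY}/policy/api/v1/infra/ipsec-vpn-tunnel-profiles/{profile['id']}",
    headers=HEADERS,
    json=profile,
)
print(resp.status_code)  # expect 200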
Now to verify. Find the GET request that takes a tunnel profile ID – this will return just a single tunnel profile instead of all of them.
Pass it the tunnel ID and click Execute. You should get a response with a single tunnel profile object.
Click on the response object and you should find an sa_life_time value of 86400.
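Scripted, the verification is a single GET against the same path (placeholders as before):

```python
import requests

NSX_PROXY = "<your-SDDC-NSX-policy-proxy-URL>"  # placeholder
HEADERS = {"csp-auth-token": "<access-token>"}

resp = requests.get(
    f"{NSX_PROXY}/policy/api/v1/infra/ipsec-vpn-tunnel-profiles/<tunnel-profile-id>",
    headers=HEADERS,
)
resp.raise_for_status()
print(resp.json()["sa_life_time"])  # expect 86400
```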
My customer reported an issue with DNS zones in VMC, so I needed to reproduce it in the lab to check the behavior. The DNS service in VMC allows you to specify DNS resolvers to forward requests to. By default, DNS is directed to 8.8.8.8 and 8.8.4.4. If you’re using Active Directory, you will generally set the forwarders to your domain controllers. But some customers need more granular control over DNS forwarding. For example, you could set the default forwarders to domain controllers for vmware.com, but maybe you just acquired Pivotal, and their domain controllers are at pivotal.com. DNS zones allow you to direct any request for *.pivotal.com to a different set of DNS servers.
First, I needed an IPsec VPN from my homelab into VMC. I run Ubiquiti gear at home, and I decided to go with a policy-based VPN because my team’s VMC lab has many different subnets, with lots of overlap with my home network. I went to the Networking & Security tab, Overview screen, which gave me the VPN public IP as well as my appliance subnet. All of the management components, including the DNS resolvers, sit in the appliance subnet, so I need those available across the VPN. Not shown here is another subnet, 10.47.159.0/24, which contains jump hosts for our VMC lab.
I set up the VPN in the Networks section of the UniFi controller – you add a VPN network just like you add any other wired or wireless network. I add the appliance subnet 10.46.192.0/18 and my jump host subnet 10.47.159.0/24. Peer IP is the VPN public IP, and local WAN IP is the public IP at home.
I could not get SHA2 working and never figured out why. Since this was just a temporary lab scenario, I went with SHA1.
On the VMC side I created a policy-based VPN. I selected the Public Local IP address, which matches the 34.x.x.x VMC VPN IP shown above. The remote Public IP is my public IP at home. Remote networks: 192.168.75.0/24, which contains my laptop, and 192.168.203.0/24, which contains my domain controllers. For local networks I added the appliance subnet 10.46.192.0/18 and 10.47.159.0/24. They show up under their friendly names in the UI.
The VPN comes up.
Now I need to open an inbound firewall rule on the management gateway to allow my on-prem subnets to communicate with vCenter. I populate the Sources object with subnet 192.168.75.0/24 so my laptop can hit vCenter. I also set up a reverse rule (not shown) outbound from vCenter to that same group. This isn’t strictly necessary to get DNS zones working, since zones can only be configured on the compute gateway – it’s the compute VMs that need the DNS zone – but I wanted to reach vCenter over the VPN.
I create similar rules on the compute gateway to allow communication between my on-prem subnets and anything behind the compute gateway – best practice would be to lock down specific subnets and ports.
I try to ping vCenter at 10.46.224.4 from my laptop on 192.168.75.0/24 and it fails. I run a traceroute and see that my first hop is my VPN connection into VMware corporate. I run ‘route print’ on my laptop and see that the entire 10.0.0.0/8 is routed to the VPN connection.
This means I will either have to disconnect from the corporate VPN to hit 10.x IP addresses in VMC, or route around the VPN with static routes.
At an elevated command prompt, I run these commands:
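Something like the following – the next hop 192.168.75.1 is an assumption here (the UniFi gateway on my laptop’s subnet), so substitute your own:

```
rem persistent routes for the VMC appliance subnet (/18) and jump host subnet (/24)
rem 192.168.75.1 is an assumed next hop - use your own LAN gateway
route -p add 10.46.192.0 mask 255.255.192.0 192.168.75.1
route -p add 10.47.159.0 mask 255.255.255.0 192.168.75.1
```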
This inserts two routes into my laptop’s route table. The -p flag makes them persistent, so the routes survive reboots.
Now when I run route print, I can see entries for my VMC appliance subnet and jump host subnet.
I can now ping vCenter by private IP from my laptop. I can also ping servers in the jump host subnet.
Now to create a DNS zone. I point it to one of my domain controllers on-prem – in production you would, of course, point to multiple domain controllers.
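I did this in the UI, but the zone is just a Policy API object, so it can also be sketched in Python. The zone name, domain, and domain controller address below are assumptions for illustration:

```python
import requests

NSX_PROXY = "<your-SDDC-NSX-policy-proxy-URL>"  # placeholder
HEADERS = {"csp-auth-token": "<access-token>"}

# Hypothetical zone: forward *.homelab.local to an on-prem domain controller
zone = {
    "resource_type": "PolicyDnsForwarderZone",
    "display_name": "homelab-ad",
    "dns_domain_names": ["homelab.local"],    # assumption: homelab AD domain
    "upstream_servers": ["192.168.203.10"],   # assumption: a DC in 192.168.203.0/24
}
resp = requests.put(
    f"{NSX_PROXY}/policy/api/v1/infra/dns-forwarder-zones/homelab-ad",
    headers=HEADERS,
    json=zone,
)
print(resp.status_code)
```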
I flip back to DNS Services and edit the Compute Gateway forwarder. The existing DNS forwarders point to our own internal lab domain, and we don’t want to break that communication. What we do want is for queries destined for my homelab AD to be redirected to my homelab DNS servers. We add the zone to the FQDN Zones box and click Save.
Now we run a test – you can use nslookup, but I downloaded the BIND tools so I can use dig on Windows.
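The test query looks like this – the record’s domain and the forwarder IP are placeholders (use the CGW DNS forwarder IP shown in the VMC console):

```
rem homelab.local and 10.96.0.1 are placeholders for my AD zone
rem and the CGW DNS forwarder IP from the VMC console
dig vmctest88.homelab.local @10.96.0.1
```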
The correct record appears. You can see the TTL next to the DNS entry at 60 seconds – the VMC DNS server will cache the entry for the TTL that I have configured on-prem. If I dig again, you can see the TTL counting down toward 0.
I do another lookup after the remaining 21 seconds expire and you can see a fresh record was pulled with a new TTL of 60 seconds.
Let’s make a change. I update vmctest88 to point to 192.168.203.88 instead of .188, and I update the TTL to 1 hour.
This will be cached for 1 hour in VMC.
I switch it back to .188 with a TTL of 60 seconds on-prem, and the change is reflected instantly on the on-prem DNS server. But in VMC, the query still returns the stale .88 IP, with the TTL timer counting down from 3,600 seconds (1 hour).
My customer had the same caching problem, except their cached TTL was 3 days and we couldn’t wait for it to fix itself. We needed to clear the DNS resolver cache, and to do that, we go to the API. A big thank you to my coworker Matty Courtney for helping me get this part working.
You could, of course, do this programmatically. But if consuming APIs in Python isn’t your thing, you can do it from the UI. Go to the Developer Center in the VMC console, then API explorer. Pick your Org and SDDC from the dropdowns.
Click on the NSX VMC Policy API. In the NSX VMC Policy API, find Policy > Networking > IP Management > DNS > Forwarder, then the POST operation on the tier-1 DNS forwarder.
Fill out the parameter values:
tier-1-id: cgw
action: clear_cache
enforcement_point: /infra/sites/default/enforcement-points/vmc-enforcementpoint
We see Status: 200, OK – success on the clear cache operation. We do another dig against the VMC DNS server, and even though we were still within the old 1-hour cache window, the cache has been cleared: the VMC DNS server pulls the latest record from my homelab, and we see the correct .188 IP with a new TTL of 60 seconds.
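For completeness, a Python sketch of the same cache-clear call, using the parameter names shown in API Explorer above (proxy URL and token are placeholders as before):

```python
import requests

NSX_PROXY = "<your-SDDC-NSX-policy-proxy-URL>"  # placeholder
HEADERS = {"csp-auth-token": "<access-token>"}

resp = requests.post(
    f"{NSX_PROXY}/policy/api/v1/infra/tier-1s/cgw/dns-forwarder",
    headers=HEADERS,
    params={
        "action": "clear_cache",
        "enforcement_point": "/infra/sites/default/enforcement-points/vmc-enforcementpoint",
    },
)
print(resp.status_code)  # expect 200
```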