Tanzu: How to recover from orphaned TKC nodes

Kubernetes is quite good at maintaining the desired state of application-level components like pods, thanks to deployments, replicasets, daemonsets and so on. But when it comes to cluster-level components like nodes, this becomes a challenging task even with Cluster API in the picture.

Today, I wanted to see how it is possible to recover from a situation where something really goes wrong with TKC nodes at the vSphere level. Before that, let me elaborate on what I mean by “something really wrong”.

If you carefully inspect the virtual machines deployed by WCP (Workload Control Plane) at the vSphere layer, you’ll see that the actions you are familiar with from traditional VMs (like stop, start, remove and even open console) are grayed out. This is by design and gives full control of the VMs to the WCP Service, so vi-admins cannot accidentally stop or remove these VMs.

A failed ESXi host will not cause anything spectacular either, as it simply triggers a typical HA scenario and the node is up and running on a different ESXi host within a minute or two. The scenario I have in mind is: what if something goes wrong at the storage layer and the VM becomes completely inaccessible? We all know that this can happen.

Triggering a failed node

In order to destroy a TKC node on vSphere, my plan is to trigger the failure from the ESXi host that it runs on. Finding the ESXi host that a VM runs on from the vSphere Client is the most straightforward thing imaginable, but here I’ll show the Kubernetes way.

  • Login to the Supervisor Cluster with the default administrator or an authorized SSO account (see the login example after this list).
  • Query virtualmachine resources with kubectl and get host names.
kubectl -n your-namespace get tkc 
kubectl -n your-namespace get virtualmachines -o custom-columns='NAME:.metadata.name,HOST:.status.host'
  • SSH to the ESXi server with root.
  • Get the ID of the virtual machine, stop and then destroy it.
vim-cmd vmsvc/getallvms
vim-cmd vmsvc/power.off <id-of-the-VM>
vim-cmd vmsvc/destroy <id-of-the-VM>
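
As a reference for the first step, logging in to the Supervisor Cluster is done with the kubectl vsphere plugin; the values below are placeholders for your own environment:

kubectl vsphere login --server=<supervisor-vip> --vsphere-username administrator@vsphere.local --insecure-skip-tls-verify
kubectl config use-context your-namespace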

If we just power off the VM without destroying it, we’ll see it back up in a minute, which means the Supervisor Cluster is taking care of it. But here we’re pushing the limits, and in the end I’d expect the VM to show up as orphaned in the vCenter inventory.

How to recover

This is where we can ask why this VM is not recovered automatically if Kubernetes is so capable of maintaining the desired state. My explanation is that we have triggered the failure at the vSphere (or, in Kubernetes terms, infrastructure or cloud provider) layer and Cluster API is not aware of the problem. We can see this by listing the virtualmachine and machine resources registered in our Supervisor namespace.
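
For example (the powerState column is just one way to surface the relevant status field; the exact field path may differ between releases):

kubectl -n your-namespace get virtualmachines -o custom-columns='NAME:.metadata.name,POWER:.status.powerState'
kubectl -n your-namespace get machines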

Even though the VM no longer exists and is shown as orphaned in vCenter, the related resources within the Supervisor Cluster still appear as poweredOn and Running. Normally, there are two other resource types, machinedeployment and machineset, which are supposed to create new machines if reality does not match the desired state, but in this case everything looks normal at that level and no action is taken.
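
You can check those higher-level resources as well; in my case they report nothing missing:

kubectl -n your-namespace get machinedeployments,machinesets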

I think this is something that will be fixed in future releases; until then, a simple manual step suffices.

  • We need cluster-admin privileges on the Supervisor Cluster to perform this. Please check out my previous blog post for more details.
  • Delete the machine resource that corresponds to the failed node (see the example after this list).
  • This will trigger vCenter to delete the VM from its database (no more orphaned VM).
  • The machineset will also notice that one machine is missing and, in order to keep the desired state, will create a new machine resource (and eventually a virtualmachine resource).
  • After a few minutes, the node and the machine resource will be shown as Ready.
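
A minimal sketch of the manual step, assuming the orphaned node is the one we destroyed earlier (resource names are placeholders):

kubectl -n your-namespace get machines
kubectl -n your-namespace delete machine <name-of-the-failed-machine>
kubectl -n your-namespace get machines,virtualmachines    # watch the replacement being created
kubectl get nodes                                         # run against the TKC cluster once the new node joins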

This is applicable to TKC worker nodes as well as TKC control plane nodes, but not to Supervisor Control Plane VMs.

Note: There is another way to remediate this situation with the help of health checks at the Kubernetes level. This will be the subject of a future post.

Tanzu: Proxy and Registry Configurations in TKC Clusters

In enterprise environments, there are always two things that every system admin needs to deal with. And if you are into Kubernetes or any other CNCF project, the effects of those items are greatly amplified.

  • Internet connectivity via proxies
  • Secure TLS communication

VMware Tanzu products are no different. You would expect your TKG guest clusters to access Docker registries and pull images via a corporate proxy server. Alternatively, if you plan to use an internal Harbor registry, you will need to make sure that your TKC nodes trust the CA certificate that signs the server certificate of your Harbor registry. Most probably, you’ll need both.

Edit: vSphere 7.0U2 has recently been released and, according to the release notes, there are improvements related to both topics. The VC version I’m running is 7.0.1 build 17005016, so at least I know that what follows is applicable before 7.0U2.

For the versions I work with, there is no declarative way of defining these configurations, but there is always a workaround. VMware engineers have compiled some very useful scripts to define a global proxy within the TKG guest clusters and to add a CA certificate to the nodes (thanks to Oren Penso for showing us the way), so they deserve the credit.

I’ve slightly tweaked those scripts and you can easily find them in my GitHub repo. Clone the repo and set the variables in the script (00-tkg-script.sh) according to your environment.

  • SV_IP: VIP address of the Supervisor Cluster
  • VC_IP: IP address of the vCenter Server
  • VC_ADMIN_USER: vCenter admin user
  • VC_ADMIN_PASSWORD: Password of the vCenter admin user
  • VC_ROOT_PASSWORD: Root password of the vCenter Server
  • Line 144: Contents of the CA certificate from the private registry you would like to use. If using embedded Harbor, you can easily get this certificate from the UI of vCenter (Cluster -> Configure -> Namespaces -> Image Registry -> Root Certificate).
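
Inside the script these end up as plain shell variables (same names as in the list above); the values here are purely hypothetical and only illustrate the expected format:

SV_IP="192.168.100.10"                          # VIP of the Supervisor Cluster
VC_IP="192.168.100.20"                          # vCenter Server IP
VC_ADMIN_USER="administrator@vsphere.local"
VC_ADMIN_PASSWORD='<vcenter-admin-password>'
VC_ROOT_PASSWORD='<vcenter-root-password>'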

Now you’re ready to go. In order to run the script with proper parameters, you’ll need the name of the TKC resource, the namespace where the TKC resource resides and the hostname or the IP address of the private registry.

./00-tkg-script.sh <tkc-cluster-name> <namespace-name> <registry>

The tricky part about this script is that it provides a way to access the TKC nodes, which might not be directly reachable from outside the overlay network. So we can run the script from anywhere we want, as long as we have connectivity to vCenter and to the management interfaces of the Supervisor Cluster master VMs. What it basically does is SSH-over-SSH to the TKC nodes.
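
Conceptually, it uses a Supervisor Cluster master VM as a jump host. A rough equivalent with plain OpenSSH would be something like the line below (addresses are placeholders and the node user name is an assumption; the script automates the key and password handling):

ssh -i tkc-node-key -o ProxyJump=root@<supervisor-master-ip> vmware-system-user@<tkc-node-ip>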

What this script does in detail:

  • Get the TKG Kubernetes API token via a curl request to the Supervisor Cluster
  • Get the list of the nodes and their IP addresses in the TKC cluster.
  • SSH into vCenter to get credentials for the Supervisor Cluster master VMs.
  • Get a Supervisor Cluster Kubernetes API token in order to retrieve the TKC nodes’ SSH password.
  • Get the TKC nodes’ SSH private key from the Supervisor Cluster.
  • Transfer the TKC nodes’ SSH private key to the Supervisor Cluster master VM. This will be used to enable SSH from the Supervisor Cluster master VM to the TKC nodes.
  • SSH to every node in a loop, modify the proxy configuration of the containerd service and add the CA certificate of the private registry to the node’s trusted certificates (sketched after this list).
  • Restart the containerd service on each node.
  • Clean up temporary files on the server where the script is run.
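
The per-node changes boil down to a containerd proxy drop-in and an extra CA certificate in the node’s trust store. A minimal sketch of what that looks like on a node, assuming a systemd-managed containerd and a Photon OS style trust store (the paths and the rehash helper are assumptions, the actual script may differ):

cat <<EOF > /etc/systemd/system/containerd.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://proxy.corp.example:3128"
Environment="HTTPS_PROXY=http://proxy.corp.example:3128"
Environment="NO_PROXY=localhost,127.0.0.1,10.0.0.0/8"
EOF
cp registry-ca.crt /etc/ssl/certs/             # CA certificate of the private registry
/usr/bin/rehash_ca_certificates.sh             # Photon OS helper to rebuild the trust store (assumption)
systemctl daemon-reload && systemctl restart containerd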

The downside of this workaround is that if you add new nodes or replace existing ones, the procedure needs to be reapplied. So I strongly recommend not using this in production environments without proper testing.

Again, vSphere 7.0U2 is supposed to let us define these configurations in a declarative way (within the YAML manifests of the proper resources). That will be a follow-up post in the near future. But the way this script works might still come in handy if other cases arise that require customizations on the TKC nodes.

Tanzu: Access to Supervisor Cluster with cluster-admin privileges

Let’s face it, vSphere with Tanzu isn’t the most straightforward product ever if you try to get it up and running in an enterprise environment where subnets are backed by firewalls and internet connectivity is limited to proxies. But when the dust settles after all the hassle around enabling Workload Management on vSphere, it’s quite enjoyable to play with the TKC clusters and the Supervisor Cluster itself.

As soon as I started to play with the Supervisor Cluster, I figured out that I had limited RBAC permissions even when logged in with the default administrator account. If you are really into Kubernetes and know what you are doing, you would expect to have cluster-admin privileges.

FYI, my vCenter version is 7.0.1 build: 17005016.

In order to act with cluster-admin privileges, there are two ways I can talk about, which I’m going to call the “Tanzu way” and the “Kubernetes way”.

What is described in this article is neither recommended nor supported in production environments. Please apply at your own risk.

Tanzu Way:

The most obvious way is to log in to a Supervisor Control Plane VM and run kubectl commands there. This way, you log in to the server as root, and the root account authenticates to the Kubernetes cluster with a certificate that grants the cluster-admin role.

  • In order to log in to a Supervisor Control Plane VM, you’ll need the root password, which can be obtained from vCenter. So log in to the vCenter server with your root account, switch your shell and run the little Python script below. It connects to PostgreSQL, queries the database and gives you the root password of the Supervisor VM as well as the VIP of the K8s api-server.
/usr/lib/vmware-wcp/decryptK8Pwd.py
  • The next step is to SSH to that IP address with the human-unreadable password. From that point on, you can easily demonstrate that you have cluster-admin privileges and list/modify system-level resources as well as all the custom resources that come with Tanzu (see the quick check after this list).
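
A quick sanity check once you are on the VM (the auth query simply asks the API server whether the current identity can perform any verb on any resource):

ssh root@<supervisor-vip>
kubectl auth can-i '*' '*'          # returns "yes" for cluster-admin
kubectl get pods -A | head          # system namespaces are now visible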

Kubernetes Way:

This might come in handy and provide a workaround, but what I would really like is to assign the cluster-admin role to my SSO account so that I can run kubectl commands from any server I can log in from.

In order to do that, you need to leverage the concepts of clusterroles and clusterrolebindings in K8s, which I will not go into in detail; I suggest the official documentation for further understanding.

Long story short, if you run the command below while logged in to the Supervisor VM, you can create a clusterrolebinding that assigns cluster-admin privileges to your SSO account.

kubectl create clusterrolebinding orcunuso:cluster-admin --user sso:orcunuso@your.domain --clusterrole cluster-admin
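
Afterwards, you can log in with your SSO account from any workstation that can reach the Supervisor Cluster and confirm the new privileges (the server address is a placeholder):

kubectl vsphere login --server=<supervisor-vip> --vsphere-username orcunuso@your.domain
kubectl auth can-i '*' '*'          # should now return "yes"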

Now you are ready to use your SSO account and access your Supervisor Cluster with cluster-admin permissions. But never forget:

Great power comes with great responsibility