Summary
When working with Kubernetes, we make calls to retrieve or create resources all of the time. Whenever we run kubectl apply or kubectl get, we ask for an object to be added to, or retrieved from, the database. When we write a manifest, we always include an apiVersion attribute, but what does this mean and how does Kubernetes respond to different values?
In Kubernetes, whenever there is a request to create objects (Pod, ConfigMap, etc.), they are stored in etcd, the Kubernetes database, before a controller reacts to their creation or modification. These objects always include an apiVersion attribute. Other systems that run their own API server may handle requests for each version separately, processing them and then storing them in the database as the new version. Kubernetes doesn’t. Instead, it lets you extend its API server so that you can focus on writing the controller, not the API server. As a result, the behaviour you see may differ from what you would expect.
The first thing to note is that Kubernetes will only ever store a single copy of the requested resource. If multiple versions of a GVK (Group Version Kind) are available (say v1 and v2), it will only ever persist one or the other. However, it’s key to understand that you will be able to request any version which is defined in the API schema as served. This is most notable in Custom Resource Definitions (CRDs), which are the official way to extend the Kubernetes API with your own resource definitions and controllers, and which declare that only one version can be the storage version at any one time.
CRDs have become more and more common in recent years, and the big cloud providers now provide their own operators for creating cloud resources. For example, Google offers Config Connector, which can be added to your own cluster or used as a fully managed version through GKE Enterprise and its Config Controller feature (which Mesoform is leveraging as a core technology for our new developer platform, Athena).
What this all means for version management and upgrades will be discussed in more detail in another article. However, first we’ll look at the simple experience an operator may have with these different versions.
Be careful what you wish for
What you ask for is what you will get.
First off, we should cover how Kubernetes can say that it’s going to serve multiple versions of a resource but only ever store one version. The simple answer is that it transforms whatever version you ask for (as long as it’s defined as a served version on the CRD) into whatever version is the storage version, and vice versa, and it does this in-flight.
But the stored version has different fields to the served version
This is true, and usually why there would be a different version in the first place. So, how does it know what to present back to the operator? Well, it does so by implementing one of two strategies:
None: This first one is simple. No conversion logic is defined, so the API server only changes the apiVersion field and relies on the API developer having understood the consequences of the differences between each version. This usually means that the stored version must include all of the fields available in all of the served versions, with sensible defaults for each version, so as not to lose data.
Webhook: This option means that the API developer has also created some additional code which converts objects between versions. It’s a webhook because the Kubernetes API server sends the object to this bit of code to be transformed whenever the requested version differs from the one it has in hand, for example before it stores the operator’s request in etcd.
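To make the webhook strategy more concrete, here is a minimal sketch of the conversion logic such a webhook might contain. It only handles the ConversionReview payload as plain dictionaries and leaves out the HTTPS server entirely. The field names oldProperty and newProperty come from the example later in this article; the annotation key used to carry newProperty through v1alpha1, and the function names, are hypothetical and chosen purely for illustration.

```python
import copy

# Hypothetical annotation used to carry v1alpha2-only data through v1alpha1.
STASH_KEY = "devops.mesoform.com/new-property"


def convert_object(obj, desired_api_version):
    """Return a copy of obj expressed in desired_api_version."""
    out = copy.deepcopy(obj)
    spec = out.setdefault("spec", {})
    annotations = out.setdefault("metadata", {}).setdefault("annotations", {})
    version = desired_api_version.rsplit("/", 1)[-1]

    if version == "v1alpha1" and "newProperty" in spec:
        # Downgrade: v1alpha1 has no newProperty, so park it in an annotation.
        annotations[STASH_KEY] = spec.pop("newProperty")
    elif version == "v1alpha2":
        # Upgrade: restore a parked value, keep an existing one, else assume "".
        spec["newProperty"] = annotations.pop(STASH_KEY, spec.get("newProperty", ""))

    if not annotations:
        del out["metadata"]["annotations"]
    out["apiVersion"] = desired_api_version
    return out


def handle_conversion_review(review):
    """Build the ConversionReview response for a ConversionReview request."""
    request = review["request"]
    converted = [
        convert_object(obj, request["desiredAPIVersion"])
        for obj in request["objects"]
    ]
    return {
        "apiVersion": review["apiVersion"],
        "kind": "ConversionReview",
        "response": {
            "uid": request["uid"],  # must echo the request uid
            "convertedObjects": converted,
            "result": {"status": "Success"},
        },
    }
```

Parking the newer field in an annotation is one common way to keep a round trip through the older version lossless; a real webhook may well choose a different approach.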
Watches
Watches deserve a particular mention. As a human operator, you may not think about this much, but as a developer of a software operator, you may be inclined to use the watched version to drive conditions in your code, or even to select entire code bases (i.e. running two controllers side-by-side to support both versions together). Because Kubernetes stores only one version in etcd and transforms it to match the requested version, the apiVersion property cannot reliably indicate which code to execute. You always get the version you asked for when you set up the watch.
For example, if the user is requesting apiVersion: v1alpha1, the stored version is apiVersion: v1alpha2, and you’re watching apiVersion: v1alpha1, apiVersion: v1alpha2 and apiVersion: v1alpha3, you will see watch events for every single version, and the object returned will have the specific apiVersion you requested.
When you open a watch against a resource, you are asking the server to provide you with a representation of the stored resource from etcd in the schema of the version you specify, optionally including the type of operation that caused the last state change (e.g., ADDED, DELETED).
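The behaviour described above can be modelled with a toy, in-memory stand-in for the API server. This is only a sketch: the FakeApiServer class and its methods are hypothetical, it rewrites nothing but apiVersion when converting, and it ignores everything else a real API server does. What it does show is that one stored copy produces the same stream of event types on every watch, each labelled with the version that watch asked for.

```python
from collections import defaultdict

GROUP = "devops.mesoform.com"


class FakeApiServer:
    """Toy model: one stored copy, one event stream per watched version."""

    def __init__(self, storage_version):
        self.storage_version = storage_version
        self.stored = {}                  # name -> object in the storage version
        self.streams = defaultdict(list)  # watched version -> events received

    def watch(self, version):
        """Register a watch and return the list its events are appended to."""
        return self.streams[version]

    def _as_version(self, obj, version):
        # Simplification: only apiVersion is rewritten in this model.
        return {**obj, "apiVersion": f"{GROUP}/{version}"}

    def apply(self, obj):
        """Persist obj in the storage version and notify every watch."""
        name = obj["metadata"]["name"]
        event_type = "MODIFIED" if name in self.stored else "ADDED"
        self.stored[name] = self._as_version(obj, self.storage_version)
        for version, events in self.streams.items():
            events.append({
                "type": event_type,
                "object": self._as_version(self.stored[name], version),
            })


server = FakeApiServer(storage_version="v1alpha2")
watches = {v: server.watch(v) for v in ("v1alpha1", "v1alpha2", "v1alpha3")}

manifest = {
    "apiVersion": f"{GROUP}/v1alpha1",  # the version the user submitted
    "kind": "TestWatch",
    "metadata": {"name": "demo"},
    "spec": {"oldProperty": "a"},
}
server.apply(manifest)  # first apply creates the object
server.apply(manifest)  # second apply modifies it

for version, events in watches.items():
    print(version, [(e["type"], e["object"]["apiVersion"]) for e in events])
```

Nothing in any of the three streams reveals that the user submitted v1alpha1 or that v1alpha2 is what was persisted, which is why the apiVersion on a watched object is a poor signal for choosing a code path.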
Example
In this example, we will create a new custom resource and then modify it while observing two different served versions of the API schema.
Our custom resource definition is as follows. For the purpose of the example, you could just create this with kubectl apply -f testwatches.devops.mesoform.com_crd.yaml and test. However, we have controller code watching this resource so we can demonstrate a set of events.
The next thing we will do is create watches against both versions:
[√]> kubectl get testwatch.v1alpha1.devops.mesoform.com --watch=true --output-watch-events=true -o yaml
[√]> kubectl get testwatch.v1alpha2.devops.mesoform.com --watch=true --output-watch-events=true -o yaml
Notice the version in the middle of each resource type and that we’re setting --output-watch-events to see what type of event produced the last change.
Now we need to create a resource of type TestWatch. To do this, we run kubectl apply -f testwatch_manifest.yaml which looks like:
First of all, let’s cover what’s going on:
It can be seen that both watches get a stream of events
The types of events seen by both watches are the same
There are 3 events, 1 x ADDED (line 31) and 2 x MODIFIED (lines 54 and 93)
ADDED is pretty self-explanatory. We created a new resource and the full -o yaml output is what was created (with some version differences we’ll come to in a minute).
MODIFIED events were created by our controller and the -o yaml output shows us what changed. In both cases, the controller was updating the status conditions as it progressed through its control loop. You can see the first modification was to show that the controller was Running reconciliation; and the second modification was to show that the Last reconciliation succeeded and is now Awaiting next reconciliation (plus some other status info)
All of these show up on both watches. However, a closer look reveals some differences:
Wherever apiVersion (lines 2, 34 and 57) is output, what is presented is whatever the specific watch asked for
The spec for all v1alpha1 events only has one property, oldProperty
The spec for all v1alpha2 events has two properties, oldProperty and newProperty, and newProperty has taken the default value "" because it wasn’t set by the user.
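The differences listed above can be reproduced with a small sketch of what happens when no conversion code is involved: the apiVersion is rewritten, fields unknown to the requested schema are dropped, and schema defaults are filled in. The SPEC_FIELDS and DEFAULTS tables are assumptions inferred from the output described here rather than copied from the real CRD, and the value "hello" and the view_as function are hypothetical.

```python
import copy

GROUP = "devops.mesoform.com"

# Assumed spec properties per served version (inferred, not the real CRD).
SPEC_FIELDS = {
    "v1alpha1": {"oldProperty"},
    "v1alpha2": {"oldProperty", "newProperty"},
}
# Assumed schema defaults per version.
DEFAULTS = {
    "v1alpha2": {"newProperty": ""},
}


def view_as(stored, version):
    """Project the stored object into the schema of the requested version."""
    view = copy.deepcopy(stored)
    view["apiVersion"] = f"{GROUP}/{version}"
    # Pruning: drop spec fields the requested schema does not define.
    view["spec"] = {
        key: value
        for key, value in view.get("spec", {}).items()
        if key in SPEC_FIELDS[version]
    }
    # Defaulting: fill in schema defaults for fields that are unset.
    for field, default in DEFAULTS.get(version, {}).items():
        view["spec"].setdefault(field, default)
    return view


# What ends up persisted after applying a manifest that only set oldProperty.
stored = view_as(
    {"kind": "TestWatch", "metadata": {"name": "demo"}, "spec": {"oldProperty": "hello"}},
    "v1alpha2",
)

for version in ("v1alpha1", "v1alpha2"):
    seen = view_as(stored, version)
    print(seen["apiVersion"], seen["spec"])
```

The same stored object yields a one-property spec for v1alpha1 and a two-property spec for v1alpha2, which is why the stored version needs to carry every field that any served version exposes.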
The default version
The experienced operator has probably noted something about the example above. We don’t normally specify any version with our get, describe or create (with -f it’s normally specified in the manifest file) and our command would normally be something like kubectl get testwatch.devops.mesoform.com -o yaml, so in this case, what version do we get?
The obvious assumption is that we get whatever version is persisted to etcd, but this would be wrong. The Kubernetes API defines a priority order for versions (stable versions before beta, beta before alpha, then higher version numbers first), and the default version is whichever served version has the highest priority. So in our case we receive v1alpha2.
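The priority order can be sketched in a few lines. This is an illustrative reimplementation of the ordering described in the Kubernetes documentation, not the API server's own code, and the function names are hypothetical: stable versions sort first, then beta, then alpha, each by descending major and then minor number, with names that don't follow the pattern sorted last alphabetically.

```python
import re

# vN, vNbetaM or vNalphaM; anything else counts as non-conforming.
_VERSION = re.compile(r"^v(\d+)(?:(alpha|beta)(\d+))?$")
_STABILITY = {None: 2, "beta": 1, "alpha": 0}


def priority_key(version):
    """Sort key that places higher-priority versions first."""
    match = _VERSION.match(version)
    if match is None:
        # Non-conforming names come last, in alphabetical order.
        return (1, 0, 0, 0, version)
    major, stability, minor = match.groups()
    return (0, -_STABILITY[stability], -int(major), -int(minor or 0), "")


def sort_by_priority(versions):
    """Return versions ordered from highest to lowest priority."""
    return sorted(versions, key=priority_key)


served = ["v1alpha1", "v1alpha2"]
print(sort_by_priority(served)[0])
```

For the two versions served in this example, the first entry is v1alpha2, which matches what an unversioned kubectl get returns.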
Conclusion
Here we’ve given a summary of resource versions and how Kubernetes handles requests for different versions. We can see that it’s not quite what many would expect. What did we learn?
Be careful what you wish for because whatever you ask for is what you’ll get.
If you ask Kubernetes for a specific version, the API will transform whatever is stored in etcd to the schema of the specified version
As a result, you can’t really watch for a specific version
As a developer, if you need to be doing anything with different versions, you should consider using Conversion Webhooks
As a developer, definitely pay close attention to what properties are stored and their default values
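The default version for a request which doesn’t specify an apiVersion is the version with the highest priority.
Without being able to run etcdctl get using the etcd client (which, in almost all Kubernetes cases, we can’t), we cannot look at the persisted object itself to see what version is in the database. The most we can see is the storage version declared on the CRD and the versions listed in its status.storedVersions.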