Kubernetes container guarantees (and oversubscription)

Reading through the release notes of Kubernetes 1.4, I came across some fantastical news. News so good, I should have expected it. News that could not come at a better time. I’m talking about container guarantees, or what Kubernetes calls Resource Quality of Service. Let me be frank here: it’s like the Kubernetes team was just trying to confuse me. I’m sure the rest of you immediately knew what they were talking about, but I’m a simpleton. So after reading it 5 times, I think I finally got a hold of it.

In a nutshell, when resource min and max values (requests and limits) are set, quality of service dictates container priority when a server is oversubscribed.

Let me say this another way: we can oversubscribe server resources AND decide which containers stay alive and which ones get killed off.

Think of it like the Linux OOM killer but with more fine-grained control. With the Linux OOM killer, the only thing you can do to influence what does or does not get killed off is adjust oom_score_adj per process. Which, as it turns out, is exactly what Kubernetes is doing.

Here are the details:

There are 3 levels of priority.

BestEffort – These are the containers Kubernetes will kill off first when under memory pressure.

Guaranteed – These take top priority over everything else. Kubernetes will try everything to keep these alive.

Burstable – These are likely to be killed off once no BestEffort pods remain and they have exceeded their requested amount.
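To make the oom_score_adj connection a bit more concrete, here is a rough sketch of how the three levels could map onto scores. The numbers follow the Kubernetes resource QoS design proposal from around this release, so treat them as an assumption that may not hold for your version. The function name and its parameters are made up for illustration.

```python
def oom_score_adj(qos_class: str,
                  memory_request_bytes: int = 0,
                  node_memory_bytes: int = 1) -> int:
    """Sketch of a QoS class -> oom_score_adj mapping (higher = killed sooner).

    Values are assumptions taken from the 1.4-era design proposal.
    """
    if qos_class == "Guaranteed":
        return -998  # almost never picked by the OOM killer
    if qos_class == "BestEffort":
        return 1000  # first in line to be killed
    if qos_class == "Burstable":
        # The more memory a pod requests relative to the node,
        # the lower (safer) its score, clamped to the range 2..999.
        adj = 1000 - (1000 * memory_request_bytes) // node_memory_bytes
        return min(max(2, adj), 999)
    raise ValueError(f"unknown QoS class: {qos_class}")


GIB = 1024 ** 3
# A Burstable pod requesting 4 GiB on a 16 GiB node.
burstable_score = oom_score_adj("Burstable", 4 * GIB, 16 * GIB)
```

The clamping keeps every Burstable score strictly between the Guaranteed and BestEffort values, which is what preserves the kill order described above.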


And there are two parameters you need to consider.

request – the baseline amount of resources (CPU and RAM) a container asks for at runtime.

limit – the maximum the container can consume, provided the resources are not already in use elsewhere.

Notice how I mentioned memory pressure up above. Under CPU pressure, nothing will be killed off. Containers will simply get throttled instead.
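Here is a small sketch of why the request/limit split allows oversubscription. It assumes pods are placed on a node based on the sum of their requests, so the sum of their limits is free to exceed what the node actually has. The helper name, the node size, and the pod numbers are all hypothetical.

```python
from typing import Dict, List, Tuple

GIB = 1024 ** 3


def memory_summary(node_bytes: int, pods: List[Tuple[int, int]]) -> Dict[str, object]:
    """Summarise memory for pods given as (request_bytes, limit_bytes) pairs."""
    total_requests = sum(request for request, _ in pods)
    total_limits = sum(limit for _, limit in pods)
    return {
        # Placement only needs the requests to fit on the node.
        "fits": total_requests <= node_bytes,
        # If every pod burst to its limit at once, would the node run out?
        "oversubscribed": total_limits > node_bytes,
        "limit_ratio": total_limits / node_bytes,
    }


# Three hypothetical pods on an 8 GiB node: requests total 4 GiB, limits 10 GiB.
pods = [(1 * GIB, 4 * GIB), (2 * GIB, 4 * GIB), (1 * GIB, 2 * GIB)]
summary = memory_summary(8 * GIB, pods)
```

In this example the pods fit by request but the node is oversubscribed by limit, which is exactly the situation where the priority levels decide who gets killed under memory pressure.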


So how do we determine which priority level a container will have?

Guaranteed if request == limit for every container, OR only the limits are set

which looks like:

containers:
    - name: mywebapp
      resources:
          limits:
              cpu: 10m
              memory: 1Gi
          requests:
              cpu: 10m
              memory: 1Gi

OR

containers:
    - name: foo
      resources:
          limits:
              cpu: 10m
              memory: 1Gi

Setting requests is optional here: when only the limits are given, the requests default to the limits.

Burstable if request is less than limit, OR one of the containers has nothing set while another does

containers:
    - name: foo
      resources:
          limits:
              cpu: 10m
              memory: 1Gi
          requests:
              cpu: 10m
              memory: 1Gi

    - name: bar

Note that there are two containers above. One has nothing specified, so on its own it would be BestEffort, which makes the Pod as a whole Burstable.

OR

containers:
    - name: foo
      resources:
          limits:
              memory: 1Gi

    - name: bar
      resources:
          limits:
              cpu: 100m

The config above sets a different resource on each container: one has only memory set and the other only CPU. Since neither container has limits for both resources, the Pod is once again Burstable.

BestEffort if no resources are defined at all.

containers:
    - name: foo
      resources:
    - name: bar
      resources:
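Putting the three rules together, here is a minimal sketch of the classification as a function. It is an approximation under a few assumptions: it only looks at cpu and memory, it treats a missing request as equal to the limit, and it compares quantities as plain strings (so "1Gi" and "1024Mi" would not count as equal, which a real quantity parser would handle). The function name is hypothetical.

```python
from typing import List

RESOURCES = ("cpu", "memory")


def qos_class(containers: List[dict]) -> str:
    """Classify a pod, given its containers as parsed YAML dictionaries."""
    anything_set = False
    guaranteed = True
    for container in containers:
        resources = container.get("resources") or {}
        limits = resources.get("limits") or {}
        requests = resources.get("requests") or {}
        for name in RESOURCES:
            limit = limits.get(name)
            # Assumption: a request that is not set defaults to the limit.
            request = requests.get(name, limit)
            if limit is not None or request is not None:
                anything_set = True
            # Guaranteed needs a limit on every resource, with request == limit.
            if limit is None or request != limit:
                guaranteed = False
    if not anything_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


# The "foo plus empty bar" example from above.
mixed_pod = [
    {"name": "foo", "resources": {
        "limits": {"cpu": "10m", "memory": "1Gi"},
        "requests": {"cpu": "10m", "memory": "1Gi"}}},
    {"name": "bar"},
]
mixed_class = qos_class(mixed_pod)
```

Feeding in the other examples from this post should give the class stated for each of them, which makes this a handy sanity check before deploying a pod spec.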


This is just the tip of the iceberg on container guarantees.

There is a lot more there around cgroups, swap, and compressible vs incompressible resources.

Head over to the GitHub page to read more.