Kube-DNS – a little tuning

We recently upgraded Kube-dns to the following images:

gcr.io/google_containers/kubedns-amd64:1.6
gcr.io/google_containers/kube-dnsmasq-amd64:1.3

Having used SkyDNS up to this point, we ran into some unexpected performance issues. In particular, we were seeing badly exaggerated response times from kube-dns on requests for which it is not authoritative (i.e. anything outside cluster.local).

Fortunately this was on a cluster not yet serving any production customers.

It took several hours of troubleshooting and getting a lot more familiar with our new DNS setup and Dnsmasq, in particular the various knobs we could turn. What tipped us off to our solution was the following issue:

https://github.com/kubernetes/kubernetes/issues/27679

** Update

Adding the following lines to our “- args” config under gcr.io/google_containers/kube-dnsmasq-amd64:1.3 did the trick and significantly improved DNS performance.

- --server=/cluster.local/127.0.0.1#10053
- --resolv-file=/etc/resolv.conf.pods

By adding the second entry we ensure requests only go upstream from kube-dns instead of back to the host-level resolver.

/etc/resolv.conf.pods points only to external DNS; in our case that's the AWS DNS server for our VPC, which is always the .2 address of your VPC CIDR (e.g. 10.0.0.2 for a 10.0.0.0/16 VPC).
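For context, here's roughly how that lands in the dnsmasq container spec of the kube-dns manifest. This is a sketch, not our exact manifest, and the nameserver address below assumes a 10.0.0.0/16 VPC:

  - name: dnsmasq
    image: gcr.io/google_containers/kube-dnsmasq-amd64:1.3
    args:
    # cluster.local lookups go to kube-dns listening on 127.0.0.1:10053
    - --server=/cluster.local/127.0.0.1#10053
    # everything else resolves against the servers in this file only
    - --resolv-file=/etc/resolv.conf.pods

And /etc/resolv.conf.pods itself is just a nameserver list:

  # example for a 10.0.0.0/16 VPC
  nameserver 10.0.0.2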

** End Update

In any case, we have significantly improved performance on DNS lookups and are excited to see how our new DNS performs under load.

Final thoughts:

Whether you're tuning for performance or just haven't realized your cluster requires a bit more than 200Mi of RAM and 1/10 of a CPU, it's quite easy to overlook kube-dns as a potential performance bottleneck.
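If it's the resource side that bites you, the fix is the usual one: raise the requests/limits on the kube-dns containers. A sketch, with purely illustrative numbers; size these to your own query volume:

  resources:
    requests:
      cpu: 500m       # default is around 100m
      memory: 512Mi   # default is around 200Mi
    limits:
      memory: 512Mi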

We have a saying on the team: if it's slow, check DNS. If it looks good, check it again. And if it still looks good, have someone else check it. Then move on to other things.

Kube-dns has bitten us so many times that we have a dashboard to monitor and alert just on it. These are old screen caps from SkyDNS, but you get the point.

[Screenshots: SkyDNS monitoring dashboards]

Kubernetes Init Containers

Kubernetes Init containers. Alright, I’m just going to tell the truth here. When I first started reading about them, I didn’t get it. I thought to myself, “with all the other stuff they could be doing right now at this early stage of Kubernetes, what the hell were they thinking? Seriously?” But that’s because I just didn’t get it. I didn’t see the value. I mean, don’t get me wrong, Init containers are good for many reasons: transferring state between Pets, detecting that databases are up prior to starting an app, configuring PVCs with information the primary app needs, etc. These are all important things, but there are already workarounds for this stuff. Entrypoint, anyone?

And then I read one line in the PetSet documentation (of all places) and I had an Aha! moment.

“…allows you to run docker images from third-party vendors without modification.”

That is a HUGE reason for Init Containers and, in my mind, should be the biggest validation of their need as a broader Kubernetes use case.

At Pearson we have to modify existing docker images all the time to fit our needs, whether it’s clustering Consul, modding Fluentd, seeding Cassandra, or setting up discovery for ElasticSearch clustering. These are all things we have done and had to create our own custom images to manage, in some cases requiring a private docker repository to do so. Hell, half the stuff I’ve written about has caused me to put out our Dockerfiles just so you could take advantage of them. If we had init containers in the first place, it would have been a lot less code and a lot more “hey, go pull this init container and use it” in my blog posts.

Alright, with that, I’m actually just going to point you to the documentation on this one. It’s pretty good and gives you exactly what you need to get started.

Kubernetes Init Containers

One key thing to remember: init containers for a given app run in serial.
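To make that concrete, here's a minimal sketch in the beta annotation form these releases used (the names my-db, config-svc, and vendor/app are placeholders, not anything we actually run): two init containers that execute one after the other, first waiting for a database to be resolvable, then staging config onto a shared volume, before the unmodified third-party image ever starts.

  apiVersion: v1
  kind: Pod
  metadata:
    name: my-app
    # init containers run in the order listed, each to completion,
    # before the 'app' container starts
    annotations:
      pod.beta.kubernetes.io/init-containers: '[
        {
          "name": "wait-for-db",
          "image": "busybox",
          "command": ["sh", "-c", "until nslookup my-db; do echo waiting for my-db; sleep 2; done"]
        },
        {
          "name": "fetch-config",
          "image": "busybox",
          "command": ["sh", "-c", "wget -q -O /work/app.json http://config-svc/app.json"],
          "volumeMounts": [{"name": "workdir", "mountPath": "/work"}]
        }
      ]'
  spec:
    containers:
    - name: app
      # unmodified third-party image; config arrives via the shared volume
      image: vendor/app:1.0
      volumeMounts:
      - name: workdir
        mountPath: /etc/app
    volumes:
    - name: workdir
      emptyDir: {}

Because they run in serial, fetch-config won't start until wait-for-db succeeds, and the app container won't start until both are done.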

Now my team has to go back to work rewriting all our old shit.