StackStorm for Kubernetes just took a giant leap forward (beta)

 

came up with it one morning around 4am while trying to get the baby to sleep.

i’m pretty proud. mostly because it works 😉

 – Andy Moore

 

As many of you know, my team began integrating StackStorm with Kubernetes via ThirdPartyResources (TPRs), which we showed off at KubeCon London in March 2016. This was a great start to our integration with Kubernetes: simply by posting a TPR to the Kubernetes API, we could expand our capabilities around managing datastores, allowing StackStorm to build, deploy and manage our database clusters automatically.

This, however, only worked with ThirdPartyResources. In fact, it only worked with the beta TPRs, which were significantly revamped before making it into GA.

With that in mind, Andy Moore figured out how to automatically generate a StackStorm pack crammed full of exciting new capabilities for both StackStorm sensors and actions.

Link:

https://github.com/pearsontechnology/st2contrib/tree/bite-1162/packs/kubernetes

You will notice this has not been committed back upstream to StackStorm yet. Our latest version diverges significantly from the original pack we pushed, and we need to work with the StackStorm team on the best approach to move forward.

@stackstorm if you want to help us out with this, we would be very appreciative.


The list of new capabilities for Kubernetes is simply astounding. Here are just a few:

Authentication
RBAC
HorizontalPodAutoscalers
Batch Jobs
CertificateSigningRequests
ConfigMaps
PersistentVolumes
Daemonsets
Deployments/DeploymentRollBack
Ingress
NetworkPolicy
ThirdPartyResources
StorageClasses
Endpoints
Secrets

Imagine being able to configure network policies through an automated StackStorm workflow based on a particular project's needs.

Think about how RBAC could be managed using our Kubernetes Authz Webhook through StackStorm.

Or how about kicking off Kubernetes Jobs to administer some cluster-level cleanup activity and handing that off to your NOC?

Or allowing your Operations team to patch a HorizontalPodAutoscaler through a UI.

We could build a metadata framework derived from the Kubernetes API annotations/labels for governance.

The possibilities are now literally endless. Mad props go out to Andy Moore for all his work in this endeavor.

 

Ok so why am I listing this as beta?

There are a freak ton of capabilities in our new st2 pack that we haven’t finished testing. So if you are adventurous, want to play with something new and can help us out, we would love your feedback.

Thus far our testing has included the following:

Secrets
Services
Deployments
Ingresses
Persistent Volumes
Replication Controllers
Quotas
Service Accounts
Namespaces
Volumes

 

Hope you get as excited about this as we are. We now have a way to rapidly integrate Kubernetes with ….. well …… everything else.

@devoperandi

 

Note: As soon as we have cleaned up a few things with the generator for this pack, we’ll open source it to the community.

 

Past Blogs around this topic:

KubeCon 2016 Europe (Slides)

 

 

Kubernetes, StackStorm and third party resources

 

Kubernetes, StackStorm and third party resources – Part 2

 

 

KubeCon Seattle Video

Finally posting this after my speaking engagement at KubeCon Seattle in November 2016. Thanks to all who came. My hope is that releasing our Deployment Pipeline will help the Kubernetes community build an ecosystem of open source CI/CD pipelines to support an awesome platform.

Below the video are links to the various open source projects we have created which are in the last slide of the Conference deck.

Link to the Deployment Pipeline:

https://github.com/pearsontechnology/deployment-pipeline-jenkins-plugin

 

 

 

Vault SSL Integration:

https://github.com/devlinmr/contrib/tree/master/ingress/controllers/nginx-alpha-ssl

 

Kubernetes Tests:

https://github.com/pearsontechnology/kubernetes-tests

 

StackStorm Integrations:

https://github.com/pearsontechnology/st2contrib

 

Authz Webhook:

https://github.com/pearsontechnology/bitesize-authz-webhook

Kube-DNS – a little tuning

We recently upgraded Kube-dns.

gcr.io/google_containers/kubedns-amd64:1.6
gcr.io/google_containers/kube-dnsmasq-amd64:1.3

Having used SkyDNS up to this point, we ran into some unexpected performance issues. In particular, we were seeing pretty exaggerated response times from kube-dns on requests it is not authoritative for (i.e. anything outside cluster.local).

Fortunately this was on a cluster not yet serving any production customers.

It took several hours of troubleshooting and getting a lot more familiar with our new DNS setup and Dnsmasq, in particular the various knobs we could turn. What tipped us off to our solution was the following issue:

https://github.com/kubernetes/kubernetes/issues/27679

** Update

Adding the following lines to our “- args” config under gcr.io/google_containers/kube-dnsmasq-amd64:1.3 did the trick and significantly improved DNS performance.

- --server=/cluster.local/127.0.0.1#10053
- --resolv-file=/etc/resolv.conf.pods

By adding the second entry we ensure requests only go upstream from kube-dns instead of back to the host-level resolver.

/etc/resolv.conf.pods points only to external DNS, in our case AWS DNS for our VPC, which is always the .2 address of whatever your VPC IP range is.
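As a hedged sketch (the CIDR below is made up, and the arithmetic assumes an octet-aligned VPC range), deriving that resolver address and writing the file might look like:

```shell
# Hypothetical helper: derive the AWS VPC DNS resolver (the .2 address of
# the VPC range) from a CIDR, then write the pods resolv file. The CIDR
# and file path here are illustrative only.
VPC_CIDR="10.50.0.0/16"
VPC_DNS=$(echo "$VPC_CIDR" | awk -F'[./]' '{print $1"."$2"."$3".2"}')
echo "nameserver $VPC_DNS" > /tmp/resolv.conf.pods
cat /tmp/resolv.conf.pods
```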

** End Update

Either way, we have significantly improved performance on DNS lookups and are excited to see how our new DNS performs under load.

 

Final thoughts:

Whether you are tuning for performance or simply haven’t realized your cluster needs a bit more than 200Mi of RAM and 1/10 of a CPU, it’s quite easy to overlook kube-dns as a potential performance bottleneck.

We have a saying on the team: if it’s slow, check DNS. If it looks good, check it again. And if it still looks good, have someone else check it. Then move on to other things.

Kube-dns has bitten us so many times that we have a dashboard to monitor and alert on it alone. These are old screen caps from SkyDNS, but you get the point.

[Screenshots: SkyDNS monitoring dashboard]

 

 

Kubernetes Init Containers

Kubernetes init containers. Alright, I’m just going to tell the truth here: when I first started reading about them, I didn’t get it. I thought to myself, “with all the other stuff they could be doing right now at this early stage of Kubernetes, what the hell were they thinking? Seriously?” But that’s because I just didn’t get it. I didn’t see the value. Don’t get me wrong, init containers are good for many reasons: transferring state between Pets, detecting that a database is up before starting an app, configuring PVCs with information the primary app needs, etc. These are all important things, but there were already workarounds for this stuff. Entrypoint, anyone?

And then I read one line in the PetSet documentation (of all places) and had an Aha! moment.

“…allows you to run docker images from third-party vendors without modification.”

That is a HUGE reason for init containers and, in my mind, should be the biggest validation of their need as a broader Kubernetes use case.

At Pearson we have to modify existing Docker images all the time to fit our needs, whether it’s clustering Consul, modding Fluentd, seeding Cassandra or setting up discovery for Elasticsearch clustering. These are all things we have done and had to create our own custom images to manage, in some cases requiring a private Docker repository to do so. Hell, half the stuff I’ve written about has caused me to put out our Dockerfiles just so you could take advantage of them. If we had had init containers in the first place, it would have been a lot less code and a lot more “hey, go pull this init container and use it” in my blog posts.

Alright, with that I’m actually just going to point you to the documentation on this one. It’s pretty good and gives you exactly what you need to get started.

Kubernetes Init Containers

One key thing to remember: init containers for a given app run serially, each completing before the next starts.
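As a hedged sketch (every name and image below is made up), a pod that waits for a database before its app container starts might look like this. Note that on the 1.4/1.5 releases current at the time of writing, init containers were declared through the pod.beta.kubernetes.io/init-containers annotation rather than the spec.initContainers field shown:

```shell
# Write a hypothetical pod manifest using init containers. The
# 'wait-for-db' container blocks until the (made-up) 'mydb' service
# answers on 5432; only then does the 'app' container start.
cat <<'EOF' > /tmp/init-demo.yaml
apiVersion: v1
kind: Pod
metadata:
  name: init-demo
spec:
  initContainers:        # run one at a time, in order, before 'app' starts
  - name: wait-for-db
    image: busybox
    command: ['sh', '-c', 'until nc -z mydb 5432; do sleep 2; done']
  containers:
  - name: app
    image: mywebapp:latest
EOF
grep -c initContainers /tmp/init-demo.yaml
```

Applying it would just be `kubectl create -f /tmp/init-demo.yaml` on a version that supports the spec field.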

Now my team has to go back to work rewriting all our old shit.

Kubernetes container guarantees (and oversubscription)

Reading through the release notes of Kubernetes 1.4, I came across some fantastical news. News so good, I should have expected it. News that could not come at a better time. I’m talking about container guarantees, or what Kubernetes calls Resource Quality of Service. Let me be frank here: it’s like the Kubernetes team was just trying to confuse me. I’m sure the rest of you immediately knew what they were talking about, but I’m a simpleton. So after reading it five times, I think I finally got ahold of it.

In a nutshell, when resource min and max values are set, quality of service dictates container priority when a server is oversubscribed.

Let me say this another way, we can oversubscribe server resources AND decide which containers stay alive and which ones get killed off.

Think of it like the Linux OOM killer but with more fine-grained control. With the Linux OOM killer, the only thing you can do to influence what does or does not get killed off is adjust oom_score_adj per process, which, as it turns out, is exactly what Kubernetes does under the hood.
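On a Linux box you can see the knob itself; this is just a peek at the kernel mechanism, not anything Kubernetes-specific:

```shell
# Linux-only: every process carries an oom_score_adj between -1000 and
# 1000; higher values are killed first under memory pressure. Kubernetes
# sets this per container according to its QoS class.
cat /proc/self/oom_score_adj
```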

Here are the details:

There are 3 levels of priority.

BestEffort – These are the containers Kubernetes will kill off first when under memory pressure.

Guaranteed – Take top priority over everything else. Kubernetes will try everything to keep these alive.

Burstable – Likely to be killed off once no BestEffort pods remain and they have exceeded their request amount.

 

And there are two parameters you need to consider.

request – the baseline amount of resources (CPU and RAM) a container is guaranteed at runtime.

limit – the upper bound the container may consume if the resources aren’t already in use elsewhere.

Notice how I mentioned memory pressure above. Under CPU pressure nothing gets killed off; containers simply get throttled instead.

 

So how do we determine which priority level a container will have?

Guaranteed if request == limit, or only limits are set (requests then default to the limits)

which looks like:

containers:
- name: mywebapp
  resources:
    limits:
      cpu: 10m
      memory: 1Gi
    requests:
      cpu: 10m
      memory: 1Gi

OR

containers:
- name: foo
  resources:
    limits:
      cpu: 10m
      memory: 1Gi
### Setting requests is optional

 

 

Burstable if request is less than limit, or one of the containers has nothing set

containers:
- name: foo
  resources:
    limits:
      cpu: 10m
      memory: 1Gi
    requests:
      cpu: 10m
      memory: 1Gi
- name: bar

Notice there are two containers above, one with nothing specified, so that container gets BestEffort, which makes the pod as a whole Burstable.

OR

containers:
- name: foo
  resources:
    limits:
      memory: 1Gi
- name: bar
  resources:
    limits:
      cpu: 100m

The config above has two containers with different resources set, one memory and the other CPU. Thus, once again, Burstable.

 

BestEffort if no resources are defined at all.

containers:
- name: foo
  resources: {}
- name: bar
  resources: {}
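If you want to check which class a running pod actually got: newer Kubernetes releases record it in the pod status (something like `kubectl get pod foo -o jsonpath='{.status.qosClass}'`). Without a cluster handy, here’s the same extraction simulated against a canned blob of pod JSON (the pod and values are made up):

```shell
# Simulate pulling qosClass out of pod JSON, as `kubectl get pod -o json`
# would return it. The JSON blob is fabricated for the demo.
cat <<'EOF' > /tmp/pod.json
{"status": {"qosClass": "Guaranteed", "phase": "Running"}}
EOF
sed -n 's/.*"qosClass": *"\([A-Za-z]*\)".*/\1/p' /tmp/pod.json
```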

 

This is just the tip of the iceberg on container guarantees.

There is a lot more there around cgroups, swap and compressible vs incompressible resources.

Head over to the github page to read more.

Cluster Consul using Kubernetes API

Recently we had the desire to cluster Consul (HashiCorp’s K/V store) without calling out to Atlas. We deploy many clusters per day as we are constantly testing, and we want Consul to simply bring itself up without having to reach out over the internet.

So we added a few changes to our Consul setup. Here goes:

Dockerfile:

This is a typical Dockerfile for Consul running on Alpine. Nothing out of the ordinary.

FROM alpine:3.2
MAINTAINER 	Martin Devlin <martin.devlin@pearson.com>

ENV CONSUL_VERSION    0.6.3
ENV CONSUL_HTTP_PORT  8500
ENV CONSUL_HTTPS_PORT 8543
ENV CONSUL_DNS_PORT   53

RUN apk --update add openssl zip curl ca-certificates jq \
&& cat /etc/ssl/certs/*.pem > /etc/ssl/certs/ca-certificates.crt \
&& sed -i -r '/^#.+/d' /etc/ssl/certs/ca-certificates.crt \
&& rm -rf /var/cache/apk/* \
&& mkdir -p /etc/consul/ssl /ui /data \
&& wget http://releases.hashicorp.com/consul/${CONSUL_VERSION}/consul_${CONSUL_VERSION}_linux_amd64.zip \
&& unzip consul_${CONSUL_VERSION}_linux_amd64.zip \
&& mv consul /bin/ \
&& rm -f consul_${CONSUL_VERSION}_linux_amd64.zip \
&& cd /ui \
&& wget http://releases.hashicorp.com/consul/${CONSUL_VERSION}/consul_${CONSUL_VERSION}_web_ui.zip \
&& unzip consul_${CONSUL_VERSION}_web_ui.zip \
&& rm -f consul_${CONSUL_VERSION}_web_ui.zip

COPY config.json /etc/consul/config.json

EXPOSE ${CONSUL_HTTP_PORT}
EXPOSE ${CONSUL_HTTPS_PORT}
EXPOSE ${CONSUL_DNS_PORT}

COPY run.sh /usr/bin/run.sh
RUN chmod +x /usr/bin/run.sh

ENTRYPOINT ["/usr/bin/run.sh"]
CMD []

 

config.json:

And here is config.json referenced in the Dockerfile.

{
  "data_dir": "/data",
  "ui_dir": "/ui",
  "client_addr": "0.0.0.0",
  "ports": {
    "http"  : %%CONSUL_HTTP_PORT%%,
    "https" : %%CONSUL_HTTPS_PORT%%,
    "dns"   : %%CONSUL_DNS_PORT%%
  },
  "start_join":{
    %%LIST_PODIPS%%
  },
  "acl_default_policy": "deny",
  "acl_datacenter": "%%ENVIRONMENT%%",
  "acl_master_token": "%%MASTER_TOKEN%%",
  "key_file" : "/etc/consul/ssl/consul.key",
  "cert_file": "/etc/consul/ssl/consul.crt",
  "recursor": "8.8.8.8",
  "disable_update_check": true,
  "encrypt" : "%%GOSSIP_KEY%%",
  "log_level": "INFO",
  "enable_syslog": false
}

If you have read my past Consul blog, you might notice we have added the following:

  "start_join":{
    %%LIST_PODIPS%%
  },

This is important because each Consul container will query the Kubernetes API, using its Kubernetes service account token, to pull in a list of IPs for the Consul cluster to join up with.

Important note – if you are running more than just the default token per namespace, you need to explicitly grant read access to the API for the token associated with the container.

And here is run.sh:

#!/bin/sh
KUBE_TOKEN=`cat /var/run/secrets/kubernetes.io/serviceaccount/token`
NAMESPACE=`cat /var/run/secrets/kubernetes.io/serviceaccount/namespace`

if [ -z ${CONSUL_SERVER_COUNT} ]; then
  export CONSUL_SERVER_COUNT=3
fi

if [ -z ${CONSUL_HTTP_PORT} ]; then
  export CONSUL_HTTP_PORT=8500
fi

if [ -z ${CONSUL_HTTPS_PORT} ]; then
  export CONSUL_HTTPS_PORT=8543
fi

if [ -z ${CONSUL_DNS_PORT} ]; then
  export CONSUL_DNS_PORT=53
fi

if [ -z ${CONSUL_SERVICE_HOST} ]; then
  export CONSUL_SERVICE_HOST="127.0.0.1"
fi

if [ -z ${CONSUL_WEB_UI_ENABLE} ]; then
  export CONSUL_WEB_UI_ENABLE="true"
fi

if [ -z ${CONSUL_SSL_ENABLE} ]; then
  export CONSUL_SSL_ENABLE="true"
fi

if [ "${CONSUL_SSL_ENABLE}" = "true" ]; then
  if [ ! -z ${CONSUL_SSL_KEY} ] &&  [ ! -z ${CONSUL_SSL_CRT} ]; then
    echo ${CONSUL_SSL_KEY} > /etc/consul/ssl/consul.key
    echo ${CONSUL_SSL_CRT} > /etc/consul/ssl/consul.crt
  else
    openssl req -x509 -newkey rsa:2048 -nodes -keyout /etc/consul/ssl/consul.key -out /etc/consul/ssl/consul.crt -days 365 -subj "/CN=consul.kube-system.svc.cluster.local"
  fi
fi

export CONSUL_IP=`hostname -i`

if [ -z ${ENVIRONMENT} ] || [ -z ${MASTER_TOKEN} ] || [ -z ${GOSSIP_KEY} ]; then
  echo "Error: ENVIRONMENT, MASTER_TOKEN and GOSSIP_KEY environment vars must be set"
  exit 1
fi

LIST_IPS=`curl -sSk -H "Authorization: Bearer $KUBE_TOKEN" https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_PORT_443_TCP_PORT/api/v1/namespaces/$NAMESPACE/pods | jq '.items[] | select(.status.containerStatuses[].name=="consul") | .status .podIP'`

#basic test to see if we have ${CONSUL_SERVER_COUNT} number of containers alive
VALUE='0'

while [ $VALUE != ${CONSUL_SERVER_COUNT} ]; do
  echo "waiting 10s on all the consul containers to spin up"
  sleep 10
  LIST_IPS=`curl -sSk -H "Authorization: Bearer $KUBE_TOKEN" https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_PORT_443_TCP_PORT/api/v1/namespaces/$NAMESPACE/pods | jq '.items[] | select(.status.containerStatuses[].name=="consul") | .status .podIP'`
  echo "$LIST_IPS" | sed -e 's/$/,/' -e '$s/,//' > tester
  VALUE=`cat tester | wc -l`
done

LIST_IPS_FORMATTED=`echo "$LIST_IPS" | sed -e 's/$/,/' -e '$s/,//'`

sed -i "s,%%ENVIRONMENT%%,$ENVIRONMENT,"             /etc/consul/config.json
sed -i "s,%%MASTER_TOKEN%%,$MASTER_TOKEN,"           /etc/consul/config.json
sed -i "s,%%GOSSIP_KEY%%,$GOSSIP_KEY,"               /etc/consul/config.json
sed -i "s,%%CONSUL_HTTP_PORT%%,$CONSUL_HTTP_PORT,"   /etc/consul/config.json
sed -i "s,%%CONSUL_HTTPS_PORT%%,$CONSUL_HTTPS_PORT," /etc/consul/config.json
sed -i "s,%%CONSUL_DNS_PORT%%,$CONSUL_DNS_PORT,"     /etc/consul/config.json
sed -i "s,%%LIST_PODIPS%%,$LIST_IPS_FORMATTED,"      /etc/consul/config.json

cmd="consul agent -server -config-dir=/etc/consul -dc ${ENVIRONMENT} -bootstrap-expect ${CONSUL_SERVER_COUNT}"

if [ ! -z ${CONSUL_DEBUG} ]; then
  ls -lR /etc/consul
  cat /etc/consul/config.json
  echo "${cmd}"
  sed -i "s,INFO,DEBUG," /etc/consul/config.json
fi

consul agent -server -config-dir=/etc/consul -dc ${ENVIRONMENT} -bootstrap-expect ${CONSUL_SERVER_COUNT}

Let’s go through the options here. Notice in the script we do have some defaults set, so we may or may not include them when starting the container.

LIST_PODIPS = a list of Consul IPs for the consul node to join to

CONSUL_WEB_UI_ENABLE = true|false – if you want a web ui

CONSUL_SSL_ENABLE = SSL for cluster communication

If true expects:

CONSUL_SSL_KEY – SSL Key

CONSUL_SSL_CRT – SSL Cert

 

First we pull in the Kubernetes service account token and namespace. This is the default location for this information in every container and should work for your needs.

KUBE_TOKEN=`cat /var/run/secrets/kubernetes.io/serviceaccount/token`
NAMESPACE=`cat /var/run/secrets/kubernetes.io/serviceaccount/namespace`

And then we use those environment variables, with some fancy jq, to get a list of IPs formatted so we can shove them into config.json.

LIST_IPS=`curl -sSk -H "Authorization: Bearer $KUBE_TOKEN" https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_PORT_443_TCP_PORT/api/v1/namespaces/$NAMESPACE/pods | jq '.items[] | select(.status.containerStatuses[].name=="consul") | .status .podIP'`

And then we wait until CONSUL_SERVER_COUNT containers have started up:

#basic test to see if we have ${CONSUL_SERVER_COUNT} number of containers alive
echo "$LIST_IPS" | sed -e 's/$/,/' -e '$s/,//' > tester
VALUE=`cat tester | wc -l`

while [ $VALUE != ${CONSUL_SERVER_COUNT} ]; do
  echo "waiting 10s on all the consul containers to spin up"
  sleep 10
  LIST_IPS=`curl -sSk -H "Authorization: Bearer $KUBE_TOKEN" https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_PORT_443_TCP_PORT/api/v1/namespaces/$NAMESPACE/pods | jq '.items[] | select(.status.containerStatuses[].name=="consul") | .status .podIP'`
  echo "$LIST_IPS" | sed -e 's/$/,/' -e '$s/,//' > tester
  VALUE=`cat tester | wc -l`
done

You’ll notice this could certainly be cleaner, but it’s working.
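The sed pair doing the comma-join can be tried anywhere; here’s a quick local run with made-up pod IPs:

```shell
# Local demo of the join used in run.sh: append a comma to every line,
# then strip it from the last line. The IPs are fabricated.
LIST_IPS='"10.2.0.4"
"10.2.1.7"
"10.2.2.9"'
echo "$LIST_IPS" | sed -e 's/$/,/' -e '$s/,//'
```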

Then we inject the IPs into the config.json:

sed -i "s,%%LIST_PODIPS%%,$LIST_IPS_FORMATTED,"      /etc/consul/config.json

which simplifies our consul runtime command quite nicely:

consul agent -server -config-dir=/etc/consul -dc ${ENVIRONMENT} -bootstrap-expect ${CONSUL_SERVER_COUNT}
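The %%PLACEHOLDER%% substitution pattern itself is easy to try locally. This sketch assumes GNU sed (for -i) and uses a scratch file and a made-up value:

```shell
# Substitute a %%PLACEHOLDER%% token in a scratch config file. Using a
# comma as the sed delimiter keeps slashes in the value from clashing
# with the usual s/// form. The path and value are illustrative only.
cat <<'EOF' > /tmp/config.json
{ "acl_datacenter": "%%ENVIRONMENT%%" }
EOF
ENVIRONMENT="dev-cluster"
sed -i "s,%%ENVIRONMENT%%,$ENVIRONMENT," /tmp/config.json
cat /tmp/config.json
```

One caveat: because comma is the delimiter here, a replacement value that itself contains commas (like a formatted IP list) would need the delimiter escaped or changed.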

 

Alright, so all of that is baked into the Consul image.

Now let’s have a look at the Kubernetes config files.

consul.yaml

apiVersion: v1
kind: ReplicationController
metadata:
  namespace: kube-system
  name: consul
spec:
  replicas: ${CONSUL_COUNT}                               # number of consul containers
  # selector identifies the set of Pods that this
  # replication controller is responsible for managing
  selector:
    app: consul
  template:
    metadata:
      labels:
        # Important: these labels need to match the selector above
        # The api server enforces this constraint.
        pool: consulpool
        app: consul
    spec:
      containers:
        - name: consul
          env:
            - name: "ENVIRONMENT"
              value: "SOME_ENVIRONMENT_NAME"             # some name
            - name: "MASTER_TOKEN"
              value: "INITIAL_MASTER_TOKEN_FOR_ACCESS"   # UUID preferable
            - name: "GOSSIP_KEY"
              value: "ENCRYPTION_KEY_FOR_GOSSIP"         # some random key for encryption
            - name: "CONSUL_DEBUG"
              value: "false"                             # to debug or not to debug
            - name: "CONSUL_SERVER_COUNT"
              value: "${CONSUL_COUNT}"                   # integer value for number of containers
          image: 'YOUR_CONSUL_IMAGE_HERE'
          resources:
            limits:
              cpu: ${CONSUL_CPU}                         # how much CPU are you giving the container
              memory: ${CONSUL_RAM}                      # how much RAM are you giving the container
          imagePullPolicy: Always
          ports:
          - containerPort: 8500
            name: ui-port
          - containerPort: 8400
            name: alt-port
          - containerPort: 53
            name: udp-port
          - containerPort: 8543
            name: https-port
          - containerPort: 8500
            name: http-port
          - containerPort: 8301
            name: serflan
          - containerPort: 8302
            name: serfwan
          - containerPort: 8600
            name: consuldns
          - containerPort: 8300
            name: server
#      nodeSelector:                                     # optional
#        role: minion                                    # optional

You might notice we still need to move this from a ReplicationController to a Deployment/ReplicaSet.

These vars should look familiar by now.

CONSUL_COUNT = number of consul containers we want to run

CONSUL_HTTP_PORT = set port for http

CONSUL_HTTPS_PORT = set port for https

CONSUL_DNS_PORT = set port for dns

ENVIRONMENT = consul datacenter name

MASTER_TOKEN = the root token you want to have super admin privs to access the cluster

GOSSIP_KEY = an encryption key for cluster communication

 

consul-svc.yaml

---
apiVersion: v1
kind: Service
metadata:
  name: consul
  namespace: kube-system
  labels:
    name: consul
spec:
  ports:
    # the port that this service should serve on
    - name: http
      port: 8500
    - name: https
      port: 8543
    - name: rpc
      port: 8400
    - name: serflan
      port: 8301
    - name: serfwan
      port: 8302
    - name: server
      port: 8300
    - name: consuldns
      port: 53
  # label keys and values that must match in order to receive traffic for this service
  selector:
    pool: consulpool

 

consul-ing.yaml

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: consul
  namespace: kube-system
  labels:
    ssl: "true"
    httpsBackend: "true"
    httpsOnly: "true"
spec:
  rules:
  - host: consul.%%ENVIRONMENT%%.%%DOMAIN%%
    http:
      paths:
      - backend:
          serviceName: consul
          servicePort: 8543
        path: /

We run ingress controllers, so this will provision an ingress that makes Consul externally available.

 

 

 

@devoperandi

 

Kubernetes/PaaS: Automated Test Framework

First off, mad props go out to Ben Somogyi and Martin Devlin. They have been digging deep on this and have made great progress. I wanted to make sure I call them out, because all the honors go to them. I just have the honor of telling you about it.

You might be thinking right about now, “why an automated test framework? Doesn’t the Kubernetes team test their own stuff already?” Of course they do, but we have a fair number of apps/integrations to make sure our platform components all work together with Kubernetes. Take, for example, when we upgrade Kubernetes, deploy a new StackStorm integration or add some authentication capability. All of these things need to be tested to ensure our platform works every time.

At what point did we decide we needed an automated test framework? Right about the time we realized we were committing so much back to our project that we couldn’t keep up with the testing. Prior to this, we tested each PR, requiring two +1s (minus the author) before it could be merged. What we found was that we were spending so much time testing (thoroughly?) that we were losing valuable development time. We are a pretty small dev shop, literally 5 (+3 Ops) guys developing new features into our PaaS, so naturally there is a balancing act here. Do we spend more time writing test cases or actually testing ourselves? There comes a tipping point when it makes more sense to write the test cases, automate them and use people for other things. We felt we had hit that point.

Here is what our current test workflow looks like. It’s subject to change, but this is our most recent iteration.

[Diagram: QA automation workflow]

Notice we are running TravisCI to kick everything off. If you have read our other blog posts, you know we also have a Jenkins plugin, so you are probably thinking, “why Travis when you already have your own Jenkins plugin?” It’s rather simple, really. We use TravisCI, triggered through GitHub, to deploy a completely new AWS VPC / Kubernetes cluster from scratch, run a series of tests to make sure it came up properly and all the endpoints are available, and then deploy Jenkins into a namespace, which kicks off a series of internal tests on the cluster.

Basically, TravisCI handles external/infrastructure testing to make sure Terraform/Ansible run correctly and all the external dependencies come up, while Jenkins deploys and tests at the container level for internal components.

If you haven’t already read it, you may consider reading Kubernetes A/B Cluster Deploys, because we are capable of deploying two completely separate clusters inside the same AWS VPC for the purpose of A/B migrations.

Travis watches any pull requests (PRs) made against our dev branch. For each PR, TravisCI will run through the complete QA automation process. Below are the highlights; you can look at the image above for details.

1. create a branch from the PR and merge in the dev branch

2. Linting/Unit tests

3. Cluster deploy

  • If anything fails during deploy of the VPC, paasA or paasB, the process will fail and tear down the environment, with its logs captured in the TravisCI build logs.

Here is an example of one of our builds failing, from TravisCI:

[Screenshot: failing TravisCI build log]

4. Test paasA with paasB

  • Smoke Test
  • Deploy ‘Testing’ containers into paasB
  • Retrieve tests
  • Execute tests against paasA
  • Capture results
  • Publish back to Travis

5. Destroy environment

 

One massive advantage of having A and B clusters is we can use one to test the other. This enables a large portion of our testing automation to exist in containers, making our test automation parallel, fast and scalable to a large extent.

The entire process takes about 25 minutes. Not too shabby for literally building an entire environment from the ground up and running tests against it, and we don’t expect the length of time to change much, in large part because of the parallel testing. This is a from-scratch, completely automated QA framework for a PaaS. I’m thinking 25-30 minutes is pretty damn good. You?


 

Alright, get to the testing already.

First up is our helper script, which sets a few params like timeouts and the number of servers of each type. Anything in ‘${}’ is a Terraform variable that we inject on Terraform deploy.

helper.bash

#!/bin/bash

## Statics

#Long Timeout (For bootstrap waits)
LONG_TIMEOUT=<integer_seconds>

#Normal Timeout (For kubectl waits)
TIMEOUT=<integer_seconds>

# Should match minion_count in terraform.tfvars
MINION_COUNT=${MINION_COUNT}

LOADBALANCER_COUNT=${LOADBALANCER_COUNT}

ENVIRONMENT=${ENVIRONMENT}

## Functions

# retry_timeout takes 2 args: command [timeout (secs)]
retry_timeout () {
  count=0
  while [[ ! `eval $1` ]]; do
    sleep 1
    count=$((count+1))
    if [[ "$count" -gt $2 ]]; then
      return 1
    fi
  done
}

# values_equal takes 2 values, both must be non-null and equal
values_equal () {
  if [[ "X$1" != "X" ]] && [[ "X$2" != "X" ]] && [[ $1 == $2 ]]; then
    return 0
  else
    return 1
  fi
}

# min_value_met takes 2 values, both must be non-null and 2 must be equal or greater than 1
min_value_met () {
  if [[ "X$1" != "X" ]] && [[ "X$2" != "X" ]] && [[ $2 -ge $1 ]]; then
    return 0
  else
    return 1
  fi
}
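For a quick sanity check outside the cluster, here’s a POSIX-sh condensation of the two comparison helpers (with the null checks requiring both arguments to be non-empty, per the comments in helper.bash), exercised locally:

```shell
# Condensed versions of the helper.bash comparison functions: both
# arguments must be non-empty, then compare for equality / minimum.
values_equal () {
  [ -n "$1" ] && [ -n "$2" ] && [ "$1" = "$2" ]
}

min_value_met () {
  [ -n "$1" ] && [ -n "$2" ] && [ "$2" -ge "$1" ]
}

values_equal 3 3 && echo "equal"           # both set and equal
min_value_met 3 5 && echo "minimum met"    # 5 >= 3
min_value_met 3 2 || echo "below minimum"  # 2 < 3
```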

 

You will notice we have divided our high-level tests by Kubernetes resource type: services, ingresses, pods, etc.

First we test a few things to make sure our minions and load balancer minions came up. Notice we are using kubectl for much of this. May as well; it’s there and it’s easy.

If you want to know more about what we mean by load balancer minions.

instance_counts.bats

#!/usr/bin/env bats

set -o pipefail

load ../helpers

# Infrastructure

@test "minion count" {
  MINIONS=`kubectl get nodes --selector=role=minion --no-headers | wc -l`
  min_value_met $MINION_COUNT $MINIONS
}

@test "loadbalancer count" {
  LOADBALANCERS=`kubectl get nodes --selector=role=loadbalancer --no-headers | wc -l`
  values_equal $LOADBALANCER_COUNT $LOADBALANCERS
}

 

pod_counts.bats

#!/usr/bin/env bats

set -o pipefail

load ../helpers

@test "bitesize-registry pods" {
  BITESIZE_REGISTRY_DESIRED=`kubectl get rc bitesize-registry --namespace=default -o jsonpath='{.spec.replicas}'`
  BITESIZE_REGISTRY_CURRENT=`kubectl get rc bitesize-registry --namespace=default -o jsonpath='{.status.replicas}'`
  values_equal $BITESIZE_REGISTRY_DESIRED $BITESIZE_REGISTRY_CURRENT
}

@test "kube-dns pods" {
  KUBE_DNS_DESIRED=`kubectl get rc kube-dns-v18 --namespace=kube-system -o jsonpath='{.spec.replicas}'`
  KUBE_DNS_CURRENT=`kubectl get rc kube-dns-v18 --namespace=kube-system -o jsonpath='{.status.replicas}'`
  values_equal $KUBE_DNS_DESIRED $KUBE_DNS_CURRENT
}

@test "consul pods" {
  CONSUL_DESIRED=`kubectl get rc consul --namespace=kube-system -o jsonpath='{.spec.replicas}'`
  CONSUL_CURRENT=`kubectl get rc consul --namespace=kube-system -o jsonpath='{.status.replicas}'`
  values_equal $CONSUL_DESIRED $CONSUL_CURRENT
}

@test "vault pods" {
  VAULT_DESIRED=`kubectl get rc vault --namespace=kube-system -o jsonpath='{.spec.replicas}'`
  VAULT_CURRENT=`kubectl get rc vault --namespace=kube-system -o jsonpath='{.status.replicas}'`
  values_equal $VAULT_DESIRED $VAULT_CURRENT
}

@test "es-master pods" {
  ES_MASTER_DESIRED=`kubectl get rc es-master --namespace=default -o jsonpath='{.spec.replicas}'`
  ES_MASTER_CURRENT=`kubectl get rc es-master --namespace=default -o jsonpath='{.status.replicas}'`
  values_equal $ES_MASTER_DESIRED $ES_MASTER_CURRENT
}

@test "es-data pods" {
  ES_DATA_DESIRED=`kubectl get rc es-data --namespace=default -o jsonpath='{.spec.replicas}'`
  ES_DATA_CURRENT=`kubectl get rc es-data --namespace=default -o jsonpath='{.status.replicas}'`
  values_equal $ES_DATA_DESIRED $ES_DATA_CURRENT
}

@test "es-client pods" {
  ES_CLIENT_DESIRED=`kubectl get rc es-client --namespace=default -o jsonpath='{.spec.replicas}'`
  ES_CLIENT_CURRENT=`kubectl get rc es-client --namespace=default -o jsonpath='{.status.replicas}'`
  values_equal $ES_CLIENT_DESIRED $ES_CLIENT_CURRENT
}

@test "monitoring-heapster-v6 pods" {
  HEAPSTER_DESIRED=`kubectl get rc monitoring-heapster-v6 --namespace=kube-system -o jsonpath='{.spec.replicas}'`
  HEAPSTER_CURRENT=`kubectl get rc monitoring-heapster-v6 --namespace=kube-system -o jsonpath='{.status.replicas}'`
  values_equal $HEAPSTER_DESIRED $HEAPSTER_CURRENT
}

 

service.bats

#!/usr/bin/env bats

set -o pipefail

load ../helpers

# Services

@test "kubernetes service" {
  retry_timeout "kubectl get svc kubernetes --namespace=default --no-headers" $TIMEOUT
}

@test "bitesize-registry service" {
  retry_timeout "kubectl get svc bitesize-registry --namespace=default --no-headers" $TIMEOUT
}

@test "fabric8 service" {
  retry_timeout "kubectl get svc fabric8 --namespace=default --no-headers" $TIMEOUT
}

@test "kube-dns service" {
  retry_timeout "kubectl get svc kube-dns --namespace=kube-system --no-headers" $TIMEOUT
}

@test "kube-ui service" {
  retry_timeout "kubectl get svc kube-ui --namespace=kube-system --no-headers" $TIMEOUT
}

@test "consul service" {
  retry_timeout "kubectl get svc consul --namespace=kube-system --no-headers" $TIMEOUT
}

@test "vault service" {
  retry_timeout "kubectl get svc vault --namespace=kube-system --no-headers" $TIMEOUT
}

@test "elasticsearch service" {
  retry_timeout "kubectl get svc elasticsearch --namespace=default --no-headers" $TIMEOUT
}

@test "elasticsearch-discovery service" {
  retry_timeout "kubectl get svc elasticsearch-discovery --namespace=default --no-headers" $TIMEOUT
}

@test "monitoring-heapster service" {
  retry_timeout "kubectl get svc monitoring-heapster --namespace=kube-system --no-headers" $TIMEOUT
}

 

ingress.bats

#!/usr/bin/env bats

set -o pipefail

load ../helpers

# Ingress

@test "consul ingress" {
  retry_timeout "kubectl get ing consul --namespace=kube-system --no-headers" $TIMEOUT
}

@test "vault ingress" {
  retry_timeout "kubectl get ing vault --namespace=kube-system --no-headers" $TIMEOUT
}
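Each of these files does `load ../helpers`, but the helpers themselves aren't shown in this post. A minimal sketch of what they might look like, with names and behavior inferred purely from how the tests call them, could be:

```shell
# Hypothetical ../helpers sketch -- not our actual file.
# values_equal: pass only when both values are non-empty and identical,
# so a missing rc (empty jsonpath output) fails the test.
values_equal() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# retry_timeout: retry a command once per second until it succeeds
# or the timeout (in seconds) expires.
retry_timeout() {
  local cmd=$1 timeout=$2 i=0
  while [ "$i" -lt "$timeout" ]; do
    eval "$cmd" >/dev/null 2>&1 && return 0
    sleep 1
    i=$((i + 1))
  done
  return 1
}
```

The non-empty check in `values_equal` matters: without it, an rc that doesn't exist at all would return two empty strings and the test would pass vacuously.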

Now that we have a pretty good level of certainty the cluster stood up as expected, we can begin deeper testing of the various components and integrations within our platform: StackStorm, Kafka, ElasticSearch, Grafana, Keycloak, Vault and Consul; AWS endpoints, internal endpoints, port mappings, security… and the list goes on. All core components our team provides to our customers.

Stay tuned for more as it all begins to fall into place.

Kubernetes – Stupid Human mistakes going to prod

So I figured we would have a little fun with this post. It doesn't all have to be highly technical, right?

As with any new platform, application or service, the opportunity for learning what NOT to do is ever present and, when taken in the right light, can be quite hilarious for others to consume. So without further ado, here is our list of what NOT to do going to production with Kubernetes:

  1. Disaster Recovery testing should probably be a planned event
  2. Don’t let the Lead make real quick ‘minor’ changes
  3. Don’t let anyone else do it either
  4. Kube-dns resources required
  5. Communication is good
  6. ETCD….just leave it alone
  7. Init 0 is not a good command to ‘restart’ a server

 

I think you will recognize virtually everything here is related to people. We are the biggest disasters waiting to happen. With that being said, give your people some slack to make mistakes. It can be more fun that way.

 

Disaster Recovery testing should probably be a planned event

This one is probably my favorite for two reasons: I can neither confirm nor deny who did this, and somehow no customers managed to notice.

Back in Oct/Nov of 2015 we had dev teams working on Kubernetes/PaaS/Bitesize, but their applications weren't in production yet. We, however, were treating the environment as if it were production: semi-scheduled upgrades, no changes without going through the test process, and so on. As you probably know by now, our entire deploy process for Kubernetes and the surrounding applications is in Terraform. But this was before we started using Terraform remote state. So if someone happened to be in the wrong directory AND happened to run `terraform destroy` AND THEN typed YES to confirm that's what they wanted, we can confirm a production (kinda) environment will go down with reckless abandon.
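One cheap belt-and-suspenders option (purely a sketch; the `tf` wrapper and the `TF_ALLOW_DESTROY` variable are names invented here, not something we actually ran) is a shell wrapper that makes a destroy an explicitly deliberate act:

```shell
# Hypothetical guard: refuse `terraform destroy` unless the current
# directory's name is explicitly set in TF_ALLOW_DESTROY first.
tf() {
  if [ "$1" = "destroy" ] && [ "${TF_ALLOW_DESTROY:-}" != "$(basename "$PWD")" ]; then
    echo "refusing destroy in $(basename "$PWD"): set TF_ALLOW_DESTROY=$(basename "$PWD") first" >&2
    return 1
  fi
  terraform "$@"
}
```

It doesn't replace remote state or proper access controls, but it does turn "wrong directory + muscle memory" into a loud no-op.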

The cool part is we managed to redeploy, restore databases and applications running on the platform within 30 minutes and NO ONE NOTICED (or at least they never said anything… maybe they felt bad for us).

Yes yes yes, our customers didn’t have proper monitoring in place just yet.

Needless to say, the particular individual has been harassed endlessly by teammates and will never live it down.

The term “Beer Mike” exists for a reason.

what we learned: “Beer Mike” will pull off some cool shit or blow it up. Middle ground is rare.

Follow up: Our first customer going to production asked us sometime later if we had ever performed DR testing. We were able to honestly say, ‘yes’. 😉

 

Don’t let the Lead make real quick ‘minor’ changes

As most of you know by now, we automate EVERYTHING. But I've been known to make changes and then go back and add them to automation afterward, especially in the early days before we were in production, even though we had various development teams using the platform. During troubleshooting I made a change to a security group that allowed two of our platform components to communicate with each other in the environment. It was a valid change, it needed to happen, and it made our customers happy… until we upgraded.

what we learned: AUTOMATE FIRST

 

Don’t let anyone else do it either

What's worse about this one is that this particular individual is extremely smart, but made a change and took off for a week. That was fun.

what we learned: don’t let anyone else do it either.

 

Kube-dns resources required

This one pretty much speaks for itself, but here are the details. Our kube-dns container was set for best-effort and got a grand total of 50Mi of memory and 1/10th of a CPU to serve internal DNS for our entire platform. Notice I said container, not containers. We also failed to scale it out. So when one of our customers decided to perform a 6500 concurrent (and I mean concurrent) user load test, things were a tad bit slow.

what we learned: scale out kube-dns. Running roughly one kube-dns replica for every 3 to 4 nodes in the cluster is a good idea; at larger scale, above 100 nodes, one replica per 8 nodes can be enough. These ratios depend heavily on how many services in your environment use kube-dns. Example: Nginx Ingress Controllers rely on it heavily.
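Those ratios can be sketched as a small helper (the function name, the one-per-eight cutover at 100 nodes, and the floor of two replicas are our own framing of the rule of thumb above, not a hard spec):

```shell
# Pick a kube-dns replica count from the node count: roughly one replica
# per 4 nodes, one per 8 above 100 nodes, and never fewer than 2.
dns_replicas() {
  local nodes=$1 replicas
  if [ "$nodes" -gt 100 ]; then
    replicas=$(( nodes / 8 ))
  else
    replicas=$(( (nodes + 3) / 4 ))
  fi
  [ "$replicas" -lt 2 ] && replicas=2
  echo "$replicas"
}

# usage, roughly (against a 1.x cluster with an rc-managed kube-dns):
#   NODES=$(kubectl get nodes --no-headers | wc -l)
#   kubectl scale rc kube-dns --namespace=kube-system --replicas=$(dns_replicas "$NODES")
```

Whatever numbers you pick, the point stands: one best-effort container serving DNS for an entire platform is a load test away from hurting.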

 

Communication is good

Establish good communication channels early on with your customers; in our case, the various teams using our platform. We didn't know until it started that there was a 6500 concurrent user load test. Ouch! What's worse, it was part of a really large perf testing effort, and they thought it was only going to be 450 concurrent users.

what we learned: Stay close to our customers. Keep in touch.

 

ETCD….just leave it alone

Yes, this is Kubernetes, and yes, it's quite resilient, but don't just restore ETCD. Plan it out a bit, test that it's going to work, and have others validate that shit.

We had been playing around with scrubbing ETCD data and then dropping everything into a new Kubernetes cluster, which would successfully start up all the namespaces, containers, replication controllers and everything we needed to run. The problem came when we scrubbed the data and restored it back into the same cluster, whose servers already had labels and config values. You see, node labels live in ETCD. Our scrub script would pull all of that out to make the data generic so it could be deployed into another cluster. Do that to an existing cluster instead of a fresh one, and it wipes all the labels associated with our minions, which meant NONE of the containers would spin up. Fun for all.

what we learned: if you want to migrate shit, pull the data from the API and do it that way. Getting fancy with ETCD leads to DUH moments.
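The API-first approach looks roughly like this (a sketch only; `scrub_manifest` is a hypothetical stand-in for our actual scrub script, and the fields stripped here are just the obvious cluster-specific ones):

```shell
# Export objects through the API with kubectl, strip cluster-specific
# metadata, then re-create them in the target cluster -- instead of
# editing raw ETCD data.
scrub_manifest() {
  sed -e '/resourceVersion:/d' \
      -e '/uid:/d' \
      -e '/selfLink:/d' \
      -e '/creationTimestamp:/d'
}

# usage, roughly:
#   kubectl get rc,svc,secrets --all-namespaces -o yaml | scrub_manifest > export.yaml
#   kubectl create -f export.yaml   # run against the new cluster
```

The win is that the API only hands back object specs; node labels and the rest of the cluster's own state stay where they belong.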

 

Init 0 is not a good command to ‘restart’ a server

An anonymous colleague of mine meant to restart a server running an nginx controller during troubleshooting. `init 0` doesn't work so well for that. Fortunately we run everything in ASGs, so it just spun up another node, but it's still not the smartest thing in the world if you can avoid it.

 

That’s it folks. I’m sure there are more. We’ll add to this list as the team’s memory becomes less forgetful.

 

@devoperandi