Kubernetes: FaaS Options (part 1)

Over the last few months I’ve been diving into the various Serverless/FaaS architectures that can run on Kubernetes. To say this space has exploded would be a severe understatement. The number of amazing developers working in this space is remarkable, and even more so the number of them integrating with Kubernetes.

I’m not going to talk about wrappers around Lambda (of which there are a TON). I’m talking about true FaaS capabilities that can at least demonstrate they run on Kubernetes.

As it turns out there are a fair number of them.

I’ve worked with several of these now, but I’ll point out the ones I haven’t as we go along. In most circumstances I was able to narrow the list of candidates simply by reviewing their architectures to understand their pros and cons.

I’ve also come across what I believe are some key indicators we may want to be aware of before taking on one of these capabilities:

  • Language support
  • Performance and Scalability (how quickly can a basic function execute)
  • Asynchronous/Synchronous support
  • Monitoring
  • Architecture

OpenWhisk

OpenWhisk was built and designed by IBM. It seems to be gaining a fair amount of traction in this space and has a good reputation. An excellent overview of the OpenWhisk architecture, written by Markus Thömmes, an OpenWhisk contributor, can be found on medium.com.

Language Support – OpenWhisk has full language support for just about anything. It even has integrations with Swift, Cloudant, Slack and YouTube.
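
To give a feel for the workflow, here is a minimal sketch using the wsk CLI (the action names and files are hypothetical):

# Create actions from a NodeJS function and a Python function
wsk action create helloNode hello.js
wsk action create helloPython hello.py

# Invoke an action; this returns an activation id since requests are asynchronous
wsk action invoke helloNode --param name devoperandi

# Fetch the result for a given activation id
wsk activation result <activation_id>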

Performance and Scalability – I found performance of OpenWhisk to be somewhat sluggish out of the box, but Markus provides some pretty good ways to increase performance here. Scalability is quite good, as all components, including the controllers, can be scaled out.

Asynchronous/Synchronous – Asynchronous only. It sounds like there are some plans for (semi?) synchronous support.

Monitoring – IBM does have a dashboard that can be used with IBM Bluemix, and the CLI can be used for gaining insights as well, but built-in integrations with open source monitoring platforms are non-existent.

Architecture – 

CouchDB and Kafka are directly in the execution path of every function: CouchDB handles both authentication and action retrieval, and Kafka is involved because all requests are asynchronous (at this time). Personally, I couldn’t see us requiring authentication through OpenWhisk, and I would imagine most others have their own auth capabilities that support far more than what is offered here.

The primary problem with the above is availability. The more stateful (semi or otherwise) services that must be available, the more opportunity for failure. However, you’ll find this is fairly common in FaaS: some sort of message queue and some sort of storage for holding code. This tends to limit the number of languages (and/or versions) supported, but IBM has done a good job here. Basically any container can be an invoker as long as it conforms to a few specifics.
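
As a sketch of that contract (my summary, not the official docs): an action container listens on port 8080 and implements POST /init and POST /run, and the openwhisk/dockerskeleton base image provides that plumbing for you. A hypothetical custom image could then be registered as a black-box action (flag placement varies a bit by CLI version):

# Register any conforming container image as an action
wsk action create blackBoxAction --docker myrepo/my-action-image

# Invoke it like any other action
wsk action invoke blackBoxAction --param input somevalue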

Notes for OpenWhisk on Kubernetes:

  • No use of Kubernetes scheduling.
  • OpenWhisk controller talks directly with the Docker API on the host, thus limiting scalability to what that host can handle. Also not going to work for availability.

Recap: Overall, OpenWhisk is a platform that’s been around for a little bit now. It largely resembles Lambda in its capability but is open to the masses. I could see OpenWhisk being used in very large FaaS implementations, but its number of dependencies in the critical path scares me, and its performance could use some enhancing out of the box. For a FaaS that relies on injecting functions into containers, though, its language support is stellar and it has some pretty cool direct integrations.

Bottom Line: I can’t recommend this platform if running Kubernetes at this time.

Kubeless

Kubeless is almost a brand new project. As of this writing, Kubeless has only been committed to in earnest for the last 5 months. It is truly Kubernetes native and plugs right in to the Serverless project. I did not get the chance to really test this platform, but I’m aiming to get a handle on it in the next few weeks.

Language Support – Python and NodeJS

Performance and Scalability – I just don’t know yet

Asynchronous/Synchronous – Both

Monitoring – Baked in monitoring with Prometheus.

Architecture – 

Kubeless relies heavily on built-in Kubernetes capabilities such as ThirdPartyResources (or CustomResourceDefinitions, depending on your version of Kubernetes) and takes advantage of the built-in API server. Everything needed to run a function exists in the ThirdPartyResource. As a result, however, the Kubeless team has to provide support per language/version for functions to run. My hope is they will make this a bit more generic to allow custom runtimes; otherwise I fear they won’t be able to keep up.

Correction: Executions with Kubeless are through HTTP or triggered events. Thank you @sebgoa for pointing this out.
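
For a feel of both paths, a minimal sketch with the kubeless CLI (file, handler and topic names here are hypothetical):

# Deploy a Python function behind an HTTP trigger
kubeless function deploy get-python --runtime python2.7 --handler test.foo --from-file test.py --trigger-http

# Synchronous call through the CLI
kubeless function call get-python --data '{"hello": "world"}'

# Deploy the same code behind a topic for event-triggered execution
kubeless function deploy post-python --runtime python2.7 --handler test.foo --from-file test.py --trigger-topic test-topic
kubeless topic publish --topic test-topic --data 'hello world'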

Notes: Kubeless has an easily consumable UI and directly plugs in to the Serverless Framework.

Kubeless runs on vanilla Kubernetes and OpenShift, and hooks seamlessly into Kubernetes RBAC for security.

Recap: Really cool new up and coming project integrating deeply with Kubernetes. Could be a heavy contender in the future.

Bottom Line: Not yet, unless you are solely Python and NodeJS based.

IronFunctions

IronFunctions is the current unsung hero in my mind. Under heavy development since July 2016, it is an easily consumable open source project that integrates well with Kubernetes while having the unique ability to run Lambda-style functions as well. So for all you Lambda junkies wanting to break your addiction, this might be a pretty damn good option.

Language Support – Only limited by the Docker containers you can dream up.

Performance and Scalability – Only limited by the infrastructure it’s running on. I quite easily executed functions in several languages, both locally and on a full cluster, in the 200-250ms range for synchronous requests and 300ish for asynchronous. I don’t see any scalability issues at this time. If a ceiling were hit, it would be quite easy to simply spin up a new IronFunctions capability in a different Kubernetes namespace.

Asynchronous/Synchronous – Both

Monitoring – Logs

Architecture – 

IronFunctions is a truly well-built platform that I’d dare say could serve many different use cases. There are a few basic components to running IronFunctions (a workflow sketch follows the list):

  • IronFunctions – Essentially the controller/API that manages incoming requests and starts up resources/containers to fulfill said requests.
  • Database – for configuration only. Not in the critical request path.
  • Message Queue – For Asynchronous requests.
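
A rough workflow against the IronFunctions API, assuming the defaults from the project README (iron/hello is their demo image; the app and route names are mine):

# Create an app
curl -X POST -d '{"app": {"name": "myapp"}}' http://localhost:8080/v1/apps

# Create a route that maps /hello to a container image
curl -X POST -d '{"route": {"path": "/hello", "image": "iron/hello"}}' http://localhost:8080/v1/apps/myapp/routes

# Synchronous call (this is where I saw the 200-250ms numbers above)
time curl http://localhost:8080/r/myapp/hello

# Routes created with "type": "async" return a call id instead of the result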

Notes: Has a usable UI for managing functions. HotFunctions are pretty awesome. The CLI is very easy to use.

All in all, IronFunctions was the dark horse that surprised me by a long shot. I would love to see Prometheus monitoring make it in, as I’m not terribly excited about logging being the metrics collection point. Overall, a minor gripe.

Recap: I was pleasantly surprised by its scalability, performance and maturity for a project I just happened to run across. It has all the makings of a truly scalable, production-capable FaaS offering. With synchronous, asynchronous AND HotFunction capabilities, I was very impressed. Combine that with ease of use and just enough integration with Kubernetes, and I’m pretty much sold. Keep up the good work.

Bottom Line: Of the ones I’ve reviewed so far, a definite Yes. Just get me some metrics into Prometheus. 😉

In a future post I’ll have a look at Fission, Funktion and maybe an up-and-comer by alexellis called faas-netes.

@devoperandi

Kubernetes – PodPresets

PodPresets in Kubernetes are a cool new addition to container orchestration, arriving in v1.7 as an alpha capability. At first they seem relatively simple, but when I began to realize their current AND potential value, I came up with all kinds of use cases.

Basically, PodPresets inject configuration into any pod that carries a specific Kubernetes label. So what does this mean? Have a damn good labeling strategy. This configuration can come in the form of:

  • Environment variables
  • Config Maps
  • Secrets
  • Volumes/Volume Mounts

Everything in a PodPreset configuration will be appended to the pod spec unless there is a conflict, in which case the pod spec wins.

Benefits:

  • Reusable config across anything with the same service type (datastores as an example)
  • Simplify Pod Spec
  • Pod author can simply include PodPreset through labels

Example Use Case: What if data stores could be configured with environment variables? I know, wishful thinking… but we can work around this. We could then set up a PodPreset for MySQL/MariaDB that exposes port 3306, configures the InnoDB storage engine and applies other generic config for all MySQL servers that get provisioned on the cluster.

Generic MySQL Pod Spec:

apiVersion: v1
kind: Pod
metadata:
  name: mysql
  labels:
    app: mysql-server
    preset: mysql-db-preset
spec:
  containers:
    - name: mysql
      image: mysql:8.0
      command: ["mysqld"]
  initContainers:
    - name: init-mysql
      image: initmysql
      command: ['script.sh']

Now notice there is an init container in the pod spec. Thus no modification of the official MySQL image should be required.

The script executed in the init container could be written to templatize the MySQL my.ini file prior to starting mysqld. It may look something like this.

#!/bin/bash

cat >/etc/mysql/my.ini <<EOF

[mysqld]

# Connection and Thread variables

port                           = $MYSQL_DB_PORT
socket                         = $SOCKET_FILE         # Use mysqld.sock on Ubuntu, conflicts with AppArmor otherwise
basedir                        = $MYSQL_BASE_DIR
datadir                        = $MYSQL_DATA_DIR
tmpdir                         = /tmp

max_allowed_packet             = 16M
default_storage_engine         = $MYSQL_ENGINE
...

EOF

Corresponding PodPreset:

kind: PodPreset
apiVersion: settings.k8s.io/v1alpha1
metadata:
  name: mysql-db-preset
  namespace: somenamespace
spec:
  selector:
    matchLabels:
      preset: mysql-db-preset
  env:
    - name: MYSQL_DB_PORT
      value: "3306"
    - name: SOCKET_FILE
      value: "/var/run/mysql.sock"
    # MYSQL_BASE_DIR is referenced by the init script above; /usr is an assumption for the official image
    - name: MYSQL_BASE_DIR
      value: "/usr"
    - name: MYSQL_DATA_DIR
      value: "/data"
    - name: MYSQL_ENGINE
      value: "innodb"

This was a fairly simple example of how MySQL servers might be implemented using PodPresets, but hopefully you can begin to see how PodPresets can abstract away much of the complex configuration.

More ideas –

Standardized log configuration – Many large enterprises would like to have a logging standard. Say something simple like: all logs in JSON, formatted as key:value pairs. So what if we simply included that as configuration via PodPresets?

Default metrics – Default metrics per language, depending on the monitoring platform used? Example: exposing a default set of metrics for Prometheus and just baking it in through config.
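
As a sketch, a PodPreset along those lines could inject nothing but environment variables that applications agree to honor (the label and variable names here are my own invention):

kind: PodPreset
apiVersion: settings.k8s.io/v1alpha1
metadata:
  name: logging-standard
  namespace: somenamespace
spec:
  selector:
    matchLabels:
      preset: logging-standard
  env:
    # Hypothetical variables a shared logging library would read
    - name: LOG_FORMAT
      value: "json"
    - name: LOG_STRUCTURE
      value: "key:value"
    # Hypothetical default port for Prometheus to scrape
    - name: METRICS_PORT
      value: "9102"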

I see PodPresets being expanded rapidly in the future. Some possibilities might include:

  • Integration with alternative Key/Value stores
    • Our team runs Consul (Hashicorp) to share/coordinate config, DNS and Service Discovery between container and virtual machine resources. It would be awesome not to have to bake envconsul or the Consul agent into our Docker images.
  • Configuration injection from Cloud Providers
  • Secrets injection from alternate secrets management stores
    • A very similar pattern for us with Vault as with Consul: one single Secrets/Cert Management store for container and virtual machine resources.
  • Cert injection
  • Init containers
    • What if Init containers could be defined in PodPresets?

I’m sure there are a ton more ways PodPresets could be used. I look forward to seeing this progress as it matures.

@devoperandi

Kong API Gateway

Why an API gateway for micro services?

API gateways can be an important part of a micro services / serverless architecture.

API gateways can assist with:

  • managing multiple protocols
  • acting as an aggregator for web components across multiple backend micro services (backend for front-end)
  • reducing the number of round trip requests
  • managing auth between micro services
  • TLS termination
  • Rate Limiting
  • Request Transformation
  • IP blocking
  • ID Correlation
  • and many more…..

Kong Open Source API Gateway has been integrated into several areas of our platform.

Benefits:

  1. lightweight
  2. scales horizontally
  3. database backend only for config (unless basic auth is used)
  4. propagations happen quickly
    1. when a new configuration is pushed to the database, the other scaled kong containers get updated quickly
  5. full service API for configuration
    1. The API is far more robust than the UI
  6. supports some newer protocols like HTTP/2
    1. https://github.com/Mashape/kong/pull/2541
  7. Kong continues to work even when the database backend goes away
  8. Quite a few Authentication plugins are available for Kong
  9. ACLs can be utilized to assist with Authorization (see the sketch after this list)
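
As an example of those last two items, a minimal sketch against the Admin API (the API, consumer and group names are hypothetical):

# Protect an API with key authentication
curl -X POST http://localhost:8001/apis/myapi/plugins --data "name=key-auth"

# Restrict it to a consumer group with the ACL plugin
curl -X POST http://localhost:8001/apis/myapi/plugins --data "name=acl" --data "config.whitelist=internal-teams"

# Create a consumer, give it a key and add it to the group
curl -X POST http://localhost:8001/consumers/ --data "username=team-a"
curl -X POST http://localhost:8001/consumers/team-a/key-auth --data "key=some-secret-key"
curl -X POST http://localhost:8001/consumers/team-a/acls --data "group=internal-teams"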

Limitations for us (not necessarily you):

  • Kong doesn’t have enough capabilities around metrics collection and audit compliance
  • It’s written in Lua (we have only one guy with extensive skills in this area)
  • Had to write several plugins to add functionality for requirements

We run Kong as containers that are quite small in resource consumption. As an example, the gateway for our documentation site consumes minimal CPU and 50MB of RAM, and if I’m honest, we could probably reduce that.

Kong is anticipated to fulfill a very large deployment for us in the near future, as one of our prime customers (an internal Pearson development team) is also adopting Kong.

Kong is capable of supporting both Postgres and Cassandra as storage backends. I’ve chosen Postgres because Cassandra seemed like overkill for our workloads, but both work well.

Below is an example of a workflow for a micro service using Kong.

In the diagram above, the request contains the /v1/lec URI used to route to the correct micro service. In line with this request, Kong can trigger an OAuth workflow or even choose not to authenticate for specific URIs like /health or /status if need be. As an example, we use one type of authentication for webhooks and another for users.
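
A sketch of that selective behavior, assuming the 0.x Admin API we run and hypothetical upstream names: register /v1/lec and /health as separate Kong APIs and only attach an auth plugin to the former.

# Route /v1/lec to its micro service and require OAuth2
curl -X POST http://localhost:8001/apis/ --data "name=lec" --data "uris=/v1/lec" --data "upstream_url=http://lec.default.svc.cluster.local"
curl -X POST http://localhost:8001/apis/lec/plugins --data "name=oauth2" --data "config.enable_client_credentials=true"

# /health gets its own API definition with no auth plugin attached
curl -X POST http://localhost:8001/apis/ --data "name=health" --data "uris=/health" --data "upstream_url=http://lec.default.svc.cluster.local"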

In the very near future we will be open sourcing a few plugins for Kong for things we felt were missing.

  • rewrite rules plugin
  • https redirect plugin (I think someone else also has one)
  • dynamic access control plugin

All written by Jeremy Darling on our team.

We deploy Kong alongside the rest of our container applications and scale it as any other micro service. Configuration is completed via curl commands to the Admin API.

The Admin API properly responds with both an HTTP code (200 OK) and a JSON object containing the result of the call if it’s a POST request.

Automation has been put in place to enable us to configure Kong and completely rebuild the API Gateway layer in the event of a DR scenario. Consider this our “the database went down, backups are corrupt, what do we do now?” scenario.
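
Our automation is more involved than this, but the essence is replaying stored definitions against the Admin API. A hypothetical sketch (the backup paths are my own invention):

#!/bin/bash
# Rebuild a fresh Kong by replaying backed-up API definitions
KONG_ADMIN="http://localhost:8001"

for f in /backup/kong/apis/*.json; do
  curl -s -X POST "$KONG_ADMIN/apis/" -H "Content-Type: application/json" -d @"$f"
done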

We have even taken it so far as to wrap it into a Universal API so Kong gateways can be configured across multiple geographically dispersed regions that serve the same micro services while keeping the Kong databases separate and local to their region.

In the end, we have chosen Kong because it has the right architecture to scale, has a good number of capabilities one would expect in an API Gateway, sits on Nginx (a well-known, stable proxy technology), is easy to consume, and is API driven and flexible, yet small enough in size that we can choose different implementations depending on the requirements of the application stack.

Example Postgres Kubernetes config

apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  ports:
  - name: pgsql
    port: 5432
    targetPort: 5432
    protocol: TCP
  selector:
    app: postgres
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: postgres
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:9.4
          env:
            - name: POSTGRES_USER
              value: kong
            - name: POSTGRES_PASSWORD
              value: kong
            - name: POSTGRES_DB
              value: kong
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          ports:
            - containerPort: 5432
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: pg-data
      volumes:
        - name: pg-data
          emptyDir: {}

Kubernetes Kong config for Postgres

apiVersion: v1
kind: Service
metadata:
  name: kong-proxy
spec:
  ports:
  - name: kong-proxy
    port: 8000
    targetPort: 8000
    protocol: TCP
  - name: kong-proxy-ssl
    port: 8443
    targetPort: 8443
    protocol: TCP
  selector:
    app: kong
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kong-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: kong-deployment
        app: kong
    spec:
      containers:
      - name: kong
        image: kong
        env:
          - name: KONG_PG_PASSWORD
            value: kong
          - name: KONG_PG_HOST
            value: postgres.default.svc.cluster.local
          - name: KONG_HOST_IP
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: status.podIP
        command: [ "/bin/sh", "-c", "KONG_CLUSTER_ADVERTISE=$(KONG_HOST_IP):7946 KONG_NGINX_DAEMON='off' kong start" ]
        ports:
        - name: admin
          containerPort: 8001
          protocol: TCP
        - name: proxy
          containerPort: 8000
          protocol: TCP
        - name: proxy-ssl
          containerPort: 8443
          protocol: TCP
        - name: surf-tcp
          containerPort: 7946
          protocol: TCP
        - name: surf-udp
          containerPort: 7946
          protocol: UDP

If you look at the configs above, you’ll notice they do not expose Kong externally. This is because we use Ingress Controllers, so here is an ingress example.

Ingress Config example:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  labels:
    name: kong
  name: kong
  namespace: ***my-namespace***
spec:
  rules:
  - host: ***some_domain_reference***
    http:
      paths:
      - backend:
          serviceName: kong-proxy
          servicePort: 8000
        path: /

Docker container – Kong

https://hub.docker.com/_/kong/

Kubernetes Deployment for Kong

https://github.com/Mashape/kong-dist-kubernetes

@devoperandi