Python Client for Kubernetes

For reasons I’ll divulge in a future post, we needed a Python client to interact with Kubernetes. Our latest and greatest work is going to rely pretty heavily on it, and we’ve had difficulty finding one that is fully functional.

SPOILER: Go to the bottom of the article if you just want the code. 😉

We explored options like libcloud, pykube and even went back to some of the original python-kubernetes clients like you would see on PyPI. What we found was they were all either a) out of date, b) still very much in their infancy or c) no longer being contributed to. And we realized sitting around waiting on someone else to build and maintain one just wasn’t going to work.

So with a lot of exploring and a ton of learning (primarily due to my lack of python skillz), we came to realize we could simply generate our own with codegen. You see, Kubernetes uses Swagger for its API, and swagger-codegen allows us to create our own Python client from the Swagger spec.

# on mac install swagger-codegen

brew install swagger-codegen

Acquire v1.json from the Kubernetes website

and run something like:

swagger-codegen generate -l python -o k8sclient -i v1.json

And this was fantastic… until it didn’t work and the build failed.

You see, Kubernetes is running Swagger spec 1.2 and uses “type”: “any”, which is an undefined custom type that codegen doesn’t know how to handle.

See the GitHub issues referenced here and here for a more detailed explanation.

The end result is, while custom types are allowed in Swagger spec 1.2, there was no way to document those custom types for codegen to consume. This is fixed in Swagger spec 2.0 with “additionalProperties”, which allows this mapping to occur.

But we still had a problem. We couldn’t easily create a python client from codegen.

So what we did, right or wrong, was replace every instance in the v1.json of

"type": "any"

with

"type": "string"

and it works.
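If you would rather script that swap than hand-edit the file, a quick Python pass along these lines does the job. This is just a minimal sketch; adjust the filename to whichever spec you are patching.

import json

# Recursively swap the undefined "type": "any" for "type": "string"
# so swagger-codegen can consume the spec.
def patch_any_types(node):
    if isinstance(node, dict):
        if node.get("type") == "any":
            node["type"] = "string"
        for value in node.values():
            patch_any_types(value)
    elif isinstance(node, list):
        for item in node:
            patch_any_types(item)

with open("v1.json") as f:
    spec = json.load(f)

patch_any_types(spec)

with open("v1.json", "w") as f:
    json.dump(spec, f, indent=2)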

With that, here is a link to the v1.json file with the change.

But we also did the same thing for extensions/v1beta1 because we are working on some future endeavors, so here is a link to that as well.

With these v1.json and v1.beta1.json files you should be able to create your own python client for Kubernetes.

Or if you choose, you could just use the clients we created. We intend to keep these clients updated, but if you find we haven’t, feel free to create your own. It’s dead simple.

https://github.com/mward29/python-k8sclient

https://github.com/mward29/python-k8sclient-v1beta1
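Once you’ve generated (or cloned) one of these clients, using it is pretty straightforward. Here is a minimal sketch; note that the module, class and method names below are assumptions based on how swagger-codegen typically names things, so check the generated package for the real ones.

# NOTE: module/class/method names here are assumptions about the generated
# code -- swagger-codegen derives them from the spec, so verify against
# what it actually produced.
from k8sclient.client.api_client import ApiClient
from k8sclient.client.apis.apiv_api import ApivApi

# Point the client at your API server (address is just an example).
client = ApiClient("http://127.0.0.1:8080/api/")
api = ApivApi(client)

# List the pods in a namespace and print their names.
pods = api.list_namespaced_pod(namespace="default")
for pod in pods.items:
    print(pod.metadata.name)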

 

As a final departing note, these Python clients have NOT been fully vetted. We have not run across any issues as of this moment, but if you find an issue before we do, PLEASE be a good Samaritan and let us know.

The beta version, because it’s running against the beta API extensions, may not have everything you would expect in it.

 

How we do builds in Kubernetes

First off. All credit for this goes to my friend Simas. I’m simply relaying what he has accomplished because it would be a shame if others didn’t benefit from his expertise. He is truly talented in this space and provides simple yet elegant designs that just work.

Coming into my current position, we have 400+ development teams, virtually all of which are managing their own build pipelines. This requires significant time and effort to manage, develop and automate. Each team designates their own developer on a rotating basis or, worse, completely dedicates a dev to make sure the build process goes smoothly.

What we found when looking across these teams was they were all basically doing the same thing. Sometimes using a different build server, automating using a different scripting language or running in a different code repo, but all in all, it’s the same basic process with the same basic principles. And because we have so many dev teams, we were bound to run into enough teams developing in a particular language that it would make sense to standardize their process so multiple teams could take advantage of it. Combine this with the power of Docker images and we have a win/win situation.

So let me define what I mean by “build process” just so we can narrow the scope a bit. Build process – The process of building application(s) code using a common build platform. This is our first step in a complete CI/CD workflow.

So why haven’t we finished it already? Along with the Dev teams we have quite a few other engineering teams involved including QA/Performance/CISO etc etc and we haven’t finished laying out how all those teams will work together in the pipeline.

We have questions like:

Do QA/Perf/Security engineers all have access to multiple kubernetes namespaces or do they have their own project area and provide a set of endpoints and services which can be called to utilize their capabilities?

Do we mock cross-functional services in each namespace or provide endpoints to be accessed from anywhere?

What about continuous system/integration testing?

Continuous performance testing? How do we do this without adversely affecting our dev efforts?

Those are just a few of the questions we are working through. We have tons of them. Needless to say, we started with the build process.

We create Docker images for each and every Java/NodeJS/Go/Ruby/language_of_the_month our developers choose. These images are very much standardized, allowing for built-in, centrally managed, monitored, secure containers that deploy in very short periods of time. The only deltas are the packages for the actual application. We build those as deb packages and standardize the install process, directory locations, versioning per language type, etc.

Dev teams get their own namespace in Kubernetes. In fact, in most cases they get three: Dev, Stage and Prod. For the purpose of this conversation every dev team is developing an application stack which could consist of one to many microservices. Every namespace has its own Hubot and its own Jenkins build server, which is completely vanilla to start with.

See Integrating Hubot and Kubernetes for more info on Hubot.

Each Jenkins build server connects to at least two repositories: a standard Jenkins job repo that consists of all the standardized builds for each language, and the application code repositories for the applications. EVERY Jenkins server connects to the same Jenkins job repo. Jenkins polls each repo for changes every X minutes depending on the requirements of the team. We thought about webhooks to notify Jenkins when a new build is needed but chose to poll from Jenkins instead, primarily because we treat every external resource as if it has gremlins and we didn’t want to deal with firewalls. We’ve been looking at options to replace this but haven’t settled on anything at this point.

[Image: Screen Shot 2016-01-08 at 6.34.53 PM]

 

jenkins job repo –

  1. all the possible standardized build jobs
  2. Dockerfiles for building base images – i.e. Java, NodeJS, Ruby, etc.
  3. metadata on communicating with the local hubot
  4. sets up kubectl for its namespace

application code repo –

  1. Contains application code
  2. Contains a default.json file

default.json is key to the success of the build process.

It has three primary functions:

  1. Informs Jenkins what type of build it should be set up for. Ex. If XYZ team writes code in Java and NodeJS, it tells Jenkins to configure itself for those build types. This way we aren’t configuring every Jenkins server for build artifacts it will never build.
  2. It tells Jenkins metadata about the application like application name, version, namespace(s) to deploy to, min/max number of containers to deploy, associated Kubernetes services, etc.
  3. Provides Jenkins various build commands and artifacts particular to the application

Here is a very simple example of what that default.json might look like.

{
  "namespace": "someproject",
  "application": {
    "name": "sample-application",
    "type": "http_html",
    "version": "3.x.x"
  },
  "build": {
    "system_setup": {
      "buildfacts": [ // Configure the Jenkins server
        "java",
        "nodejs"
      ]
    },
    "build_steps": [
      {
        "shell": "some shell commands"
      },
      {
        "gradle": {
          "useWrapper": true,
          "tasks": "clean build -Ddeployment.target=???"
        }
      }
    ]
  },
  "build_command": "some command to execute the build",
  "artifacts": "target/",
  "services": [
    {
      "name": "sample-service",
      "external_url": "www.sample-service.com",
      "application": "someproject/sample-application",
      "instances": {
        "min": 2,
        "max": 5
      }
    }
  ]
}
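To make default.json’s role a bit more concrete, here is a rough sketch of how a build job might consume it. This is purely illustrative and not our actual Jenkins job code; the image tag format and the way steps are executed are assumptions.

import json
import subprocess

# Load the application's default.json from the code repo checkout.
with open("default.json") as f:
    config = json.load(f)

# 1. Configure Jenkins only for the build types this app actually needs.
buildfacts = config["build"]["system_setup"]["buildfacts"]
print("Configuring Jenkins for: %s" % ", ".join(buildfacts))

# 2. Pull out application metadata used for tagging and deployment.
app = config["application"]
image_tag = "%s/%s:%s" % (config["namespace"], app["name"], app["version"])

# 3. Run the build steps in order, whatever flavor they are.
for step in config["build"]["build_steps"]:
    if "shell" in step:
        subprocess.check_call(step["shell"], shell=True)
    elif "gradle" in step:
        gradle_cmd = "./gradlew" if step["gradle"].get("useWrapper") else "gradle"
        subprocess.check_call("%s %s" % (gradle_cmd, step["gradle"]["tasks"]), shell=True)

print("Build complete; image would be tagged %s" % image_tag)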

 

Ok now for a little more complexity:

 

[Image: Screen Shot 2016-01-08 at 6.41.45 PM]

So what just happened?

1) Dev commits code to application repository

2) Jenkins polls the jenkins build repo and application repositories for changes

3) If there is a new standard build image (say for Java), Jenkins will build the latest version of the application with this image and push the image to the Docker registry with a specialized tag, then notify the Dev team of the change to provide feedback through Hubot.

When there is a version change in the application code repository, Jenkins runs typical local tests, builds a deb package, ships it to the apt repository, then builds a Docker image combining a standardized image from the Jenkins build repo with the deb package for the application, and pushes the image to the Docker registry.

4) Deploy application into namespace with preconfigured kubectl client

5) Execute system/integration tests

6) Feedback loop to Dev team through Hubot

7) Rinse and repeat into Staging/Prod on success

 

Now you are probably thinking, what about all those extra libraries that some applications may need but others do not?

Answer: If it’s not a common library, it goes in the application build.

 

All in all this is a pretty typical workflow. And for the most part you are absolutely correct. So what value do we get by separating out the standard/base build images and placing them into their own repository?

  • App Eng develops standard images for each language and bakes in security/compliance/regulatory concerns
  • Separation of concerns – Devs write code, System/App eng handles the rest including automated feedback loops
  • Security Guarantee – Baked in security, compliance and regulatory requirements ensuring consistency across the platform
  • Devs spend more time doing what they do best
  • Economies of scale – Now we can have a few people creating/managing images while maintaining a distributed build platform
  • Scalable build process – Every Dev team has their own Jenkins without the overhead associated with managing it
  • Jenkins servers can be upgraded, replaced, redeployed, refactored, screwed up, thrown out, crapped on and we can be back to a running state in a matter of minutes. WOOHOO Jenkins is now cattle.
  • Standardized containers means less time spent troubleshooting
  • Less chance of unrecognized security concerns across the landscape
  • Accelerated time to market with even less risk

 

Let’s be realistic, there are always benefits and limitations to anything, and this design is no exception.

Here are some difficulties SO FAR:

  • Process challenges in adjusting to change
  • Devs can’t run whatever version for a given language they want
  • Devs could be prevented from taking advantage of new features in the latest versions of say Java IF the App Eng team can’t keep up

 

Worth Mentioning:

  • Neither Devs nor App Eng have direct access to Jenkins servers
  • Because direct access is discouraged, exceptional logging combined with exceptional analytics is an absolute must

 

Ok, so if you made it this far, I’m either a damn good writer, you’re seriously interested in what I have to say, or you’re totally crazy about build pipelines. Somehow I don’t think it’s option 1. Cheers

 

@devoperandi

envconsul and Docker ….. soo long config files

As Docker continues to grow in popularity there are quite a few things that become readily apparent. Fortunately I’m only going to address one of them. Enter envconsul to retrieve application config data at run time.

This post assumes you already have a running Consul server with some data you wish to retrieve.

envconsul was written by HashiCorp, a great company that I personally respect. I’ve yet to be disappointed by anything I’ve touched that this great little company has made. Their applications are rock solid.

Github Link:

https://github.com/hashicorp/envconsul

 

envconsul utilizes a key/value store called Consul to retrieve configuration data and present it as environment variables to the application at runtime. This concept offers up a lot of opportunity around dynamic configuration, centralized configuration management and security, because there aren’t plain-text usernames and passwords hanging around the file system. Not that any respectable company would ever do that, right? No way. Never. Ok, maybe it kinda happens almost always. With envconsul, we can solve that.

 

Build envconsul:

Currently there is no package in the general package managers for envconsul, so I like to pull the repo, make the binary and copy it into /usr/bin, which places the binary in the path and makes it immediately executable.

git clone https://github.com/hashicorp/envconsul.git
cd envconsul
make

If you decide you like envconsul, bake it right into your vm or container and you’ll always have it available.

 

Create an envconsul.cnf file:

Basically this file tells envconsul where the Consul server exists.

consul = "consul.mydomain.com:8500"
timeout = "5s"

 

Add it to your Dockerfile:

I mentioned this had to do with Docker, right? Well, in the Dockerfile when you build your images you can bake envconsul right into the run command with something like the following:

CMD /usr/sbin/apache2ctl -k start && envconsul -config=/etc/envconsul.cnf -sanitize=false -upcase=false myblog env /usr/local/tomcat/bin/catalina.sh run

Let’s imagine I have a Tomcat container with Apache Web Server running in front of it. In the command above I’m starting Apache and then executing envconsul to call the Consul server.
So what have I really done here?

  • I’ve set sanitize to false, otherwise envconsul will replace “invalid” characters with underscores
  • I’ve referenced the envconsul.cnf with -config
  • I’ve set upcase to false because, being a Linux nut, I know some devs like to ingest environment variables that aren’t just uppercase
  • I’ve specified the key myblog to get data back from Consul
  • I’ve added env so envconsul presents the results from Consul as environment variables to catalina.sh
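On the application side, picking up those values is just a matter of reading the environment instead of a config file. A tiny illustration of the idea (the variable names here are made up; in practice they’d be whatever keys live under myblog in Consul):

import os

# envconsul exports each key under "myblog" in Consul as an environment
# variable, so the app reads its configuration from the environment
# rather than a config file. These key names are purely examples.
db_host = os.environ.get("DB_HOST", "localhost")
db_user = os.environ.get("DB_USER")
db_password = os.environ.get("DB_PASSWORD")

print("Connecting to %s as %s" % (db_host, db_user))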

 

One thing I love about envconsul is when it provides the environment variables to the application, it is ONLY to the application. Logging in as root and running printenv won’t even provide the variables envconsul presents to the application.

 

This has been a very basic “get it up and running” scenario around envconsul. There are other things to explore like SSL, authentication and Consul API tokens, so head over to the GitHub page and dig in.

 

And if you have found this valuable, Tweet it please.

devops – a start

I’ve come to realize DevOps is highly misunderstood when it’s actually simple.

Tools + Communication + Automation = Speed, Stability and Value

Simple right?

So that’s how I look at it but what is the actual definition?

DevOps (a portmanteau of “development” and “operations”) is a software development method that stresses communication, collaboration, integration, automation, and measurement of cooperation between software developers and other information-technology (IT) professionals.

In theory it’s dead simple. In practice, it requires work just like everything else.

In my experience, even the companies that have a “culture of change” don’t change quite as easily as they may like to think. All the hemming and hawing still happens. The naysayers still exist. All the long drawn-out meetings trying to convince key stakeholders still happen. And nothing moves quite as fast as the champions want it to, AND THAT IS OK. Yep, you heard me right. It’s perfectly ok.

Why do I say it’s ok? Because implementing a DevOps movement is scary and the vast majority of companies get it wrong while claiming a huge success. What generally happens is, we pick out the parts that sound cool like “Automation” (I’m 100% guilty of this), call our team members “DevOps Engineers” (managed not to do this), create a few meetings between Developers and Operations and call it good. Things get better between the teams, releases get better and we can deploy shit with the click of a button if we have spent enough time on it. But wait there’s more….if your company really gets a hair up their ass, they bake QA into the automation process. Thus hopefully we’ve gotten some integration between subsystems and system tests. Realizing that system tests generally catch issues which drive system integrations to happen. Bassackwards much?

After that, we officially have the gold nugget to success and true nirvana has hit its peak. We are deploying to production 80 times per day, PagerDuty never wakes us up and everything is airtight, right?

THE HOLY GRAIL!!!!

But as we soon come to find out, it’s not all beer and whiskey from here on out. The hard part is yet to come.

What’s missing? The most difficult parts are missing: feedback, collaboration, measurement of cooperation, communication. Which mostly boils down to communication. But we can get away with what we have. I mean, it’s just a flesh wound, right?

https://www.youtube.com/watch?v=ikssfUhAlgg

So inherently there is an order. A continuum by which DevOps is implemented. Easiest first, most difficult last. Go figure right? What’s the easiest? The parts that don’t require communication of course, automation and tools. Every geek can get behind those two. I mean, who doesn’t like tools? And who doesn’t like to click a button and see lots of cool shit happen that you can show off at your favorite meetup?

What do these other four things really buy us? How about planning, buy-in, innovation, reduced MTTR (mean time to recover), coordinated work, motivation, effectiveness, morale, speed, change lead time, retention, reduced costs from production defects, lower defect escape rate, reduced deployment times, and the list goes on.

I would propose that communication is the linchpin to a truly successful DevOps paradigm yet the one most avoided when getting started. Without it, all we really have is some cool automation.

But how do we do all this? It’s hard, right? It requires effort that geeks don’t generally engage in. How do we accomplish this part of DevOps and do it well? What makes this so difficult? Is it different for every company? Are we really that bad at communication? Or do we simply talk in different languages? Are there best practices to follow? How do we measure cooperation? What degree of collaboration is good?

In a follow-on post we’ll start to explore this paradigm of communication and how different companies make it happen.

 

Thoughts on communication:

Speak the same language – As odd as this sounds, encourage people to define what they mean by xyz. Make it a point to get definitions that may not be clear. I’ve seen and contributed to so many long conversations that could have been done in 3 minutes because multiple people were working from different definitions of the same word.

Innovate on communication – Have a retro on communication. Find out how to communicate in a manner that others can consume. Iterate on it.

Don’t allow different communication mechanisms to become a barrier – Allow people to consume asynchronous information in different ways. Provide an ESB for communication if you will. Think ChatOps. Write some CoffeeScript that will allow engineers to register for various types of communication: RSS feeds, email, message boards, online reporting, IM, Reddit, text to speech (maybe that’s too far?). Let people consume information how they want to when it’s asynchronous. Let them subscribe to what they think is important.

Provide venues for communication to happen organically – That sounds like a high level, completely ambiguous piece of no information. What I mean is, if multiple teams like beer, make it a habit to go to lunch and drink every Friday. Find commonalities between the teams and exploit those to the company’s benefit. Increase morale, collaboration and communication in one fell swoop. Some of the best ideas come from this one thing. DO NOT see this as a bunch of engineers taking off for the afternoon. You would be amazed at how many ideas come out of these venues.

Synchronous communication – should all happen the same way. Keep it consistent. One, maybe two technologies is fine. Beyond that is too much. Don’t have 10 different ways to constantly interrupt your colleagues.

Close the damn loop – All too often someone important gets left out of the communication loop. This can often happen because the email that got sent looks like the 1000 other emails received that day. Find a way to emphasize information that is important. You can even require an acknowledgement.

 

Painful to watch:

Hiring people that do both – really? I guess in a small organization it can work for some duration of time, but I would have to argue that one person will never measure up to an engineer that is dedicated to one or the other. It’s preposterous. If the application has any degree of complexity, the concept of one engineer being great at everything is plain ridiculous. What you end up with is a mediocre solution that is lacking in more than one thing. It’s why we have QA, developers, operations, performance engineers and security professionals. Because there is too much for one person to be good at all of it. Don’t be stupid, let people do what they are good at.

 

What do you think? What can we do to communicate better?