Registry Migration (ECR)

Today I’m going to share a registry migration script, written in Python, that lets you migrate from a private Docker registry to ECR. Keep in mind, it’s a script, people. It got the job done. It’s not fancy. It’s not meant to cover every possible way you could do this. It doesn’t have much error handling. It’s not meant to be run all the time. But it should give you a start if you need or want to do something similar. Please read the comments in the script; there are some environment variables and such to set before running it.

Make sure the AWS CLI is configured, then run:

aws ecr get-login --region us-east-1

Then run the command it prints back to you to log in.
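Once you are logged in, migrating a single image comes down to a pull, a tag and a push. Here is a minimal sketch of building those commands. It is not the script itself; the function name, the example registry host and the account ID are made up for illustration, and it assumes the usual ECR address layout of account ID, region and repository name.

```python
def migration_commands(image, tag, source_registry, account_id, region="us-east-1"):
    """Build the docker commands that copy one image tag into ECR.

    All arguments are illustrative; nothing here is taken from the real script.
    """
    source = "%s/%s:%s" % (source_registry, image, tag)
    # ECR repositories live under <account>.dkr.ecr.<region>.amazonaws.com
    target = "%s.dkr.ecr.%s.amazonaws.com/%s:%s" % (account_id, region, image, tag)
    return [
        ["docker", "pull", source],         # fetch from the private registry
        ["docker", "tag", source, target],  # give it its ECR name
        ["docker", "push", target],         # upload to ECR
    ]
```

Each returned list can be handed to subprocess.check_call one at a time, which keeps the work serial.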

If you see the following error when running the script, you just managed to overload your repo. I made the script run serially (instead of in parallel) to help with this, but I still managed to overload it once, even in serial mode.

Received unexpected HTTP status: 500 Internal Server Error
Traceback (most recent call last):
  File "migrate.py", line 101, in <module>

  File "migrate.py", line 29, in __init__
    self._get_catalog()
  File "migrate.py", line 39, in _get_catalog
    self._run(mylist)
  File "migrate.py", line 55, in _run
    else:
  File "migrate.py", line 98, in _upload_image

  File "/usr/lib64/python2.7/subprocess.py", line 575, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
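If you run into that 500 a lot, one option is to retry the failing command after a pause. The script does not do this; what follows is just a rough sketch, and the function name, the attempt count and the delay are all assumptions of mine.

```python
import subprocess
import time


def run_with_retry(command, attempts=3, delay=5.0,
                   runner=subprocess.check_output, sleep=time.sleep):
    """Run a command, pausing a little longer after each failed attempt."""
    for attempt in range(1, attempts + 1):
        try:
            return runner(command)
        except subprocess.CalledProcessError:
            if attempt == attempts:
                raise  # out of retries, let the original error surface
            sleep(delay * attempt)  # linear backoff: delay, 2 * delay, ...
```

The runner and sleep parameters exist only so the function can be exercised without Docker; in real use you would call it with something like a docker push command list and leave the defaults alone.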

 

If you get something like the output below, you probably aren’t logged in to ECR as the user you are running the script with.

Traceback (most recent call last):
  File "migrate.py", line 98, in <module>
    MigrateToEcr()
  File "migrate.py", line 29, in __init__
    self._get_catalog()
  File "migrate.py", line 39, in _get_catalog
    self._run(mylist)
  File "migrate.py", line 43, in _run
    self._ensure_new_repo_exists(line)
  File "migrate.py", line 74, in _ensure_new_repo_exists
    checkrepo = subprocess.check_output(command, shell=True)
  File "/usr/lib64/python2.7/subprocess.py", line 575, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '/usr/local/bin/aws ecr describe-repositories' returned non-zero exit status 255
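To catch this before the migration starts, you could run the same describe-repositories call up front and bail out with a friendlier message. This is a sketch and not part of the script; the function name is made up, and it assumes the aws binary is on your PATH.

```python
import subprocess


def check_ecr_access(region="us-east-1", runner=subprocess.check_output):
    """Return True if 'aws ecr describe-repositories' succeeds for this user."""
    command = ["aws", "ecr", "describe-repositories", "--region", region]
    try:
        runner(command)
    except (OSError, subprocess.CalledProcessError):
        # OSError: aws is not installed; CalledProcessError: non-zero exit,
        # which is what bad or missing credentials produce.
        return False
    return True
```

Calling it first and exiting when it returns False saves you from a traceback halfway through the run.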

Link to the script on GitHub.

 

I’ll cover why we aren’t using ECR in a follow-on post.

Kubernetes – ServiceAccounts

serviceAccounts are a relatively unknown entity within Kubernetes. Everyone has heard of them, and everyone has likely added them to --admission-control on the API server, but few have actually configured or used them. Having been guilty of this myself for quite some time, I figured I would give a brief idea of why they are important and how they can be used.

serviceAccounts are for any process running inside a pod that needs access to the Kubernetes API or to a Secret. Is a serviceAccount mandatory to access a Kubernetes Secret? No. Is it recommended? You bet. Not having serviceAccounts active through --admission-control can also leave a gaping security hole in your platform, so make sure it’s active.

Here is the high-level overview:

  1. serviceAccounts are tied to Namespaces.
  2. Kubernetes Secrets can be tied to serviceAccounts and thus limited to specific Namespaces.
  3. If none are specified, a ‘default’ serviceAccount with relatively limited access is supplied when the Namespace is created.
  4. Policies can be placed on serviceAccounts to add/remove API access.
  5. serviceAccounts can be specified during Pod or RC creation.
  6. In order to change the serviceAccount for a Pod, a restart of the Pod is necessary.
  7. A serviceAccount must be created before it is used in a Pod.
  8. serviceAccount tokens are used to allow a serviceAccount to access a Kubernetes Secret.
  9. Using ImagePullSecrets for various Container Registries can be done with serviceAccounts.
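To make the token point a little more concrete: Kubernetes normally mounts the serviceAccount credentials into the pod under /var/run/secrets/kubernetes.io/serviceaccount. Here is a small sketch of a process picking them up. The function names are mine, and it assumes the default mount path and the usual token and namespace files.

```python
import os

# Default mount path for serviceAccount credentials inside a pod.
SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"


def load_service_account(sa_dir=SA_DIR):
    """Read the mounted token and namespace, and locate the CA certificate."""
    def read(name):
        with open(os.path.join(sa_dir, name)) as handle:
            return handle.read().strip()

    return {
        "token": read("token"),
        "namespace": read("namespace"),
        "ca_cert_path": os.path.join(sa_dir, "ca.crt"),
    }


def auth_headers(token):
    """Build the bearer-token header the Kubernetes API expects."""
    return {"Authorization": "Bearer " + token}
```

A client would then send those headers with its API requests and verify TLS against the ca.crt path.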

 

Creating a custom serviceAccount is dead simple. Below is a yaml file to do so.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: pulse
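Once the serviceAccount exists, a Pod picks it up through spec.serviceAccountName. Here is a quick sketch that builds such a manifest as JSON, which kubectl accepts just like YAML. The pod name and image below are placeholders of mine, not anything from a real deployment.

```python
import json


def pod_manifest(name, image, service_account):
    """Build a minimal Pod manifest that runs under the given serviceAccount."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # The serviceAccount must already exist in the Pod's Namespace.
            "serviceAccountName": service_account,
            "containers": [{"name": name, "image": image}],
        },
    }


# Placeholder pod name and image, for illustration only.
manifest = pod_manifest("pulse-worker", "example.invalid/pulse:latest", "pulse")
print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and running kubectl create -f on it would create the Pod.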

 

And creating a policy for a serviceAccount isn’t too bad either.

(NOTE: the API server must have --authorization-mode=ABAC set for the authorization plugin.)

[Screenshot: ABAC policy for the pulse serviceAccount]
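The policy itself is not available as text here, so the following is only a guess at its shape. ABAC policy files hold one JSON object per line, and serviceAccounts show up as users named system:serviceaccount:NAMESPACE:NAME. This sketch generates a read-only line for events using the v1beta1 policy format; the function name is mine, and your Kubernetes version may expect a different format.

```python
import json


def abac_policy_line(namespace, service_account, resource, readonly=True):
    """Build one line of an ABAC policy file for a serviceAccount.

    Assumes the abac.authorization.kubernetes.io/v1beta1 policy format.
    """
    policy = {
        "apiVersion": "abac.authorization.kubernetes.io/v1beta1",
        "kind": "Policy",
        "spec": {
            "user": "system:serviceaccount:%s:%s" % (namespace, service_account),
            "namespace": namespace,
            "resource": resource,
            "readonly": readonly,
        },
    }
    # The policy file format is one JSON object per line, so no indentation.
    return json.dumps(policy, sort_keys=True)
```

Calling abac_policy_line("pulse", "pulse", "events") gives a line you could append to the file passed to the API server with --authorization-policy-file.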

 

Now we have a serviceAccount named pulse, and we’ve applied a policy that allows read-only Kube API access to view events in the pulse Namespace.

Now let’s associate a Secret with this pulse serviceAccount.

apiVersion: v1
kind: Secret
metadata:
  name: pulse-secret
  annotations:
    kubernetes.io/service-account.name: pulse
type: kubernetes.io/service-account-token
data:
  # Values under data must be base64-encoded.
  password: eUiXZDFIOPU2ErTmCg==
  username: my_special_user_name  # placeholder; base64-encode the real value
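One thing that trips people up: values under data in a Secret have to be base64-encoded. Here is a small sketch for producing them. The function name and the sample value are made up for illustration.

```python
import base64


def encode_secret_data(values):
    """Base64-encode each plain-text value for a Secret's data section."""
    return {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }


# Sample value for illustration only.
print(encode_secret_data({"username": "admin"}))  # {'username': 'YWRtaW4='}
```

Keep in mind base64 is an encoding, not encryption, so treat the manifest file itself as sensitive.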

OK, now we have a Secret that is only accessible from a process running in the pulse Namespace that is using the pulse serviceAccount.

Name:   pulse-secret
Namespace:  pulse
Annotations:  kubernetes.io/service-account.name=pulse,kubernetes.io/service-account.uid=930e6ia5-35cf-5gi5-8d06-00549fi45306

Type: kubernetes.io/service-account-token

Data
====
ca.crt: 1452 bytes
token: some_token_for_pulse_serviceaccount

Which brings me to my next point: you can have multiple serviceAccounts per Namespace. That gives you granularity over which processes get access to which pieces of the Kubernetes API, and over which processes within a Namespace can access a given Secret.

In closing, serviceAccounts can be granular, they can limit access to Secrets, and when combined with ABAC policies they can provide specific access to the Kube API. They are also fairly easy to use and consume.