Running TeamCity on Kubernetes


Kubernetes – κυβερνήτης • (kyvernítis)

  1. governor (leader of a region or state)
  2. (nautical) captain, skipper
  3. pilot (of an aircraft)


We recently moved to a new office and discovered that one of our bare metal Continuous Integration build agents didn’t survive the move. Since other developers were already unhappy with the fact that the CI could only be reached from within the office, we took that as an incentive to give building on the Google Cloud Platform a shot.

We already experimented with CoreOS for our internal logging system. Although we liked it, we were quite curious how a cluster setup with Kubernetes (K8S if you like) performs.

So in this post, we’ll show you how to set things up, what pitfalls to avoid, and how we managed to create a robust and easy to set up solution.


Cluster Setup

The setup of a Kubernetes cluster on the Google Cloud Platform is quite easy (and fast, as a Google engineer will happily demonstrate) with the Google Cloud Container Engine.
I recommend using the command line tool gcloud because there are almost no Kubernetes features implemented in the GUI yet.

For our use case we figured two CPU cores should be enough and used an n1‑standard‑2 machine:

gcloud alpha container clusters create \
  -m n1-standard-2 \
  ...

Build Agent Setup

Photo: a pod of dolphins, by Serguei S. Dukachev

Now that you have a three-node Kubernetes cluster running, it is time to start up some containers. In Kubernetes a group of containers and their resources (volumes, ports…) forms a pod (as in: a pod of dolphins). To ensure that n pods are always running, you can use a replication controller. Almost anything you do in a Kubernetes cluster you do through kubectl, which is hidden under:

gcloud alpha container kubectl

By the way, your life will be a bit easier when you set up an alias first:

alias kubectl='gcloud alpha container kubectl'

Here’s an example for our replication controller:

id: teamcity-agent
kind: ReplicationController
apiVersion: v1beta1
labels:
  name: teamcity-agent
desiredState:
  replicas: 3
  replicaSelector:
    name: teamcity-agent
  podTemplate:
    labels:
      name: teamcity-agent
    desiredState:
      manifest:
        version: v1beta1
        id: teamcity-agent
        containers:
          - name: teamcityagent
            image: smallimprovements/teamcity-agent-docker
            env:
              - name: TEAMCITY_SERVER
                value: http://teamcity:8111
            volumeMounts:
              - name: agent-data
                mountPath: /data
        volumes:
          - name: agent-data
            source:
              hostDir:
                path: /var/buildAgent

This will spawn a replication controller (line 2) named teamcity-agent (line 1). The controller will fire up three pods (line 7). Each pod consists of:

  • a container named teamcityagent (line 18) that has its working directory /data mounted to a volume (line 23)
  • a volume named agent-data (line 27) that maps to a host directory (line 29)

If you aren’t comfortable writing YAML, don’t worry: Kubernetes understands JSON too, as you’ll see in Google’s examples.
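For illustration, here is how the head of the controller above might look in JSON – a hand-translated sketch, not copied from Google’s docs:

```json
{
  "id": "teamcity-agent",
  "kind": "ReplicationController",
  "apiVersion": "v1beta1",
  "labels": { "name": "teamcity-agent" },
  "desiredState": {
    "replicas": 3,
    "replicaSelector": { "name": "teamcity-agent" }
  }
}
```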

For a full reference of replication controllers refer to the excellent docs.

To fire up anything in Kubernetes you use kubectl create:

kubectl create -f ./teamcity-agent-rc.yml

If everything went well, you can list your running controller and pods with:

kubectl get pods,rc

With the parameter “-l” you can filter the resources by the labels attached to them. In our example we gave the replication controller (line 5) – and all the pods spawned by it – the label “teamcity-agent” (line 12). So to list the controller and the pods, we run:

kubectl get pods,rc -l name=teamcity-agent

POD                    IP                  CONTAINER(S)        IMAGE(S)                                    HOST                                                            LABELS                STATUS              CREATED
teamcity-agent-acnbt          teamcityagent       smallimprovements/teamcity-agent-docker:latest   k8s-teamcity-node-1.c.eastern-kit-781.internal/   name=teamcity-agent   Running             12 hours
teamcity-agent-hvz97          teamcityagent       smallimprovements/teamcity-agent-docker:latest   k8s-teamcity-node-2.c.eastern-kit-781.internal/   name=teamcity-agent   Running             12 hours
teamcity-agent-tycvv          teamcityagent       smallimprovements/teamcity-agent-docker:latest   k8s-teamcity-node-3.c.eastern-kit-781.internal/    name=teamcity-agent   Running             12 hours

CONTROLLER          CONTAINER(S)        IMAGE(S)                                    SELECTOR              REPLICAS
teamcity-agent      teamcityagent       smallimprovements/teamcity-agent-docker:latest   name=teamcity-agent   3

We tried running an in-memory disk (ramdisk) inside the containers, because we weren’t happy with the build times at first (but eventually didn’t see any noticeable performance improvements). If you want to do this, you need to start the whole Kubernetes cluster with --allow_privileged=true so you can mount a tmpfs inside a container.

But it’s much easier to set up a ramdisk on your host system and use it as a volume for your containers. This is what we are doing in the volumes section (line 26) of the example above.

The fstab on our host system looks like this:

tmpfs /var/buildAgent tmpfs noatime,size=2G  0  0
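The fstab entry only takes effect on the next mount; to create the same ramdisk immediately, something like the following should work (a sketch mirroring the fstab line above – it must run as root on the node):

```shell
# create the mount point and mount a 2 GB tmpfs ramdisk on it
mkdir -p /var/buildAgent
mount -t tmpfs -o noatime,size=2G tmpfs /var/buildAgent
```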

Build Server Setup

Another important part of our setup is the TeamCity server. Again we use a replication controller.

“Why,” you might ask, “do you use a replication controller if you only need one pod?”

That’s because pods are not durable. When a pod fails, it’s gone. In their design document Google says that you “should almost always” use controllers to facilitate self-healing in the cluster.

id: teamcity
kind: ReplicationController
apiVersion: v1beta1
labels:
  name: teamcity
desiredState:
  replicas: 1
  replicaSelector:
    name: teamcity
  podTemplate:
    labels:
      name: teamcity
    desiredState:
      manifest:
        version: v1beta1
        id: teamcity
        containers:
          - name: teamcity
            image: smallimprovements/teamcity-docker
            ports:
              - name: teamcity
                containerPort: 8111
            volumeMounts:
              - name: teamcity-data
                mountPath: /var/lib/teamcity
        volumes:
          - name: teamcity-data
            source:
              persistentDisk:
                pdName: teamcity-backup
                fsType: ext4

Nothing special here, but note that we exposed port 8111 of the teamcity container and named the port “teamcity” (line 21). Furthermore we have given the label “teamcity” to the pods (line 12). The label will be important for the setup of our TeamCity service.

Which brings us to the last Kubernetes resource of this post: the service.


Because pods are ephemeral and their containers will get a new IP address every time they are fired up, the only way to get a permanent address to access our TeamCity server is to expose it as a service:

id: teamcity
kind: Service
apiVersion: v1beta1
port: 8111
containerPort: teamcity
labels:
  name: teamcity
selector:
  name: teamcity

Here a service with the id “teamcity” (line 1) is created.
We use the container port with the name “teamcity” (line 5) from the pods that match the selector “teamcity” (line 9), i.e. the pods that have the label “teamcity”.

The container port is exposed as the port 8111 (line 4) on the teamcity service.
So how do you access the service now? From within a container the service is reachable under a hostname named after the service’s id. So inside a teamcityagent container you can now access TeamCity through http://teamcity:8111.
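To sanity-check that link from inside an agent container (assuming curl is available in the image – it isn’t necessarily), you could run something like:

```shell
# from inside a teamcityagent container:
# print the HTTP status code returned by the TeamCity server
curl -s -o /dev/null -w "%{http_code}\n" http://teamcity:8111/
```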

But we wanted to access TeamCity from our office. Luckily every service will get an IP address through which it is accessible from the nodes inside the cluster.

To find out the service’s IP address use kubectl, as in the following example:

kubectl get services

NAME       LABELS         SELECTOR       IP             PORT
teamcity   name=teamcity  name=teamcity  8111

So now we know that from any node in the cluster, TeamCity will be accessible via the service’s IP on port 8111.
Because we didn’t want to expose our TeamCity server to the public, we decided to dig an SSH tunnel from our office into the cluster, and from there to the TeamCity service.

# we use a load balanced SSH connection
# to reach any node in the teamcity cluster

# from inside the cluster we access TeamCity
# on the service's IP

ssh -NfC \
  ...

You might have stumbled over the load balanced comment: To make things more stable we decided to use a load balancer for the SSH connection. So when we destroy or create nodes in our Kubernetes cluster, we add them to the target pool and they are eligible for the tunnel. How to set up a load balancer on the cloud platform is described here.
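Pieced together, the tunnel might look roughly like this – the addresses and user are placeholders, since ours are specific to our project:

```shell
# forward local port 8111 through the load-balanced SSH entry point
# to the TeamCity service IP inside the cluster
# (<user>, <service-ip> and <load-balancer-ip> are placeholders)
ssh -NfC -L 8111:<service-ip>:8111 <user>@<load-balancer-ip>
```

With the tunnel up, TeamCity is then reachable from the office on http://localhost:8111.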

To wrap things up, here are the steps to get from zero to a cluster with three nodes running TeamCity and the agents:

# create the cluster
gcloud alpha container clusters create \
  -m n1-standard-2 \
  ...

# create TeamCity
kubectl create -f teamcity-rc.yml

# expose it as a service
kubectl create -f teamcity-service.yml

# create the agents
kubectl create -f teamcity-agent-rc.yml


To be fair, you’ll spend most of the time getting the resource configurations right, or debugging your containers. A nice touch is that you can read the logs from outside the cluster through kubectl:

# kubectl logs <pod> <container>
kubectl logs teamcity-grizy teamcity

If the pods won’t start, this doesn’t help you much, because, as Kubernetes will tell you: there are no logs. In that case a “kubectl get events” will usually help. This gives you a list of events that happened in your cluster before things went bad.

In theory you can execute commands inside the containers through kubectl, as we tried in the following example:

kubectl exec -it \
  -p teamcity-agent-acnbt \
  -c teamcityagent \
  -- bash

Alas, in practice this consistently failed for us, but we guess it will work once the Container Engine reaches beta (it might work for you).


It’s been fun setting up a build system on Kubernetes. Debugging was easier than on plain CoreOS because we could read the cluster’s events.

We loved the service abstraction for linking containers and were quite impressed by the replication controllers (a feature lacking in plain CoreOS).

What we didn’t like was that it wasn’t possible to enforce that only one agent runs per node. So it took some tries (kubectl {delete, create}) till they ran where they should.

Per cluster, one additional instance is created that hosts the Kubernetes API: the master. Currently it has the same machine type as every node in your cluster. So for our three-node setup an additional two-core machine is quite a waste, but this evens out with larger clusters.