After the fantastic DockerCon Europe and the recent releases of Docker Machine 0.5.2 and Swarm 1.0.1, I finally have all the missing bits to automatically deploy a suite of Atlassian products to a Swarm cluster:
- Docker Network and Docker Swarm are now production ready.
- Docker Compose finally works in a multi-host configuration.
- Swarm is capable of handling 1000+ hosts and 50,000 containers, as demonstrated live on stage.
This is the dream I have - and we probably all have as an industry: to describe our software components, describe how they are linked together and let the infrastructure automatically arrange itself to match our needs. It's here! It has been cooking for a while and depending on the technology stack maybe it is already there for you. Nonetheless the Docker suite of tools have reached that moment for me. And it's glorious.
Let me show you an example of the possibilities.
This is the end result I have in mind: a setup that does not mention a single hard-coded IP address.
As a prerequisite I need an account on an IaaS provider; this time around I chose Digital Ocean, but any of the other Docker Machine drivers will do. I create an authenticated API token, which allows me to create nodes at will using docker-machine.
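All the docker-machine calls below read the token from an environment variable. A minimal sketch (the DO_TOKEN name matches the commands that follow; the value is a placeholder for your own token):

```shell
# Export the Digital Ocean API token once per shell session.
# Replace the placeholder with the token generated in the DO control panel.
export DO_TOKEN="your-digitalocean-api-token"
```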
Install and run the discovery server
The new multi-host capabilities of Compose and Swarm require a more complete discovery service than the basic Docker Hub Swarm tokens, so in this piece I will use Consul, a discovery server and key/value store from HashiCorp.
First step, create the consul node using docker-machine:
docker-machine create -d digitalocean \
    --digitalocean-access-token=$DO_TOKEN \
    --digitalocean-region "ams2" \
    consul
This specifies the "ams2" region, passes my token and names this machine "consul".
Running pre-create checks...
Creating machine...
Waiting for machine to be running, this may take a few minutes...
Machine is running, waiting for SSH to be available...
Detecting operating system of created instance...
Provisioning created instance...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
To see how to connect Docker to this machine, run: docker-machine env consul
After the machine is ready, switch our docker environment to run commands on that instance by evaluating:
eval "$(docker-machine env consul)"
Finally, run the Consul server in a simple, non-redundant configuration:
docker run -d -p 8400:8400 -p 8500:8500 -p 8600:53/udp -h consul progrium/consul -server -bootstrap
Test it by curling:
curl $(docker-machine ip consul):8500/v1/catalog/nodes
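Consul can take a few seconds to come up after docker run returns, so the curl check may fail on the first attempt. A small retry helper (my own addition, not part of the original setup) makes the check robust:

```shell
# retry N CMD...: run CMD up to N times, sleeping between attempts;
# succeeds as soon as CMD succeeds, fails after N failed attempts.
retry() {
  max="$1"; shift
  attempt=1
  until "$@"; do
    [ "$attempt" -ge "$max" ] && return 1
    attempt=$((attempt + 1))
    sleep 2
  done
}

# Usage (assumes the consul machine created above exists):
# retry 10 curl -fsS "$(docker-machine ip consul):8500/v1/catalog/nodes"
```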
Set up a 3-node Swarm cluster
Now we can create a cluster of 3 machines, with slightly different requirements.
Let's start with the Swarm master, which will control our entire cluster:
docker-machine create -d digitalocean \
    --digitalocean-access-token=$DO_TOKEN \
    --digitalocean-image "debian-8-x64" \
    --digitalocean-region "ams3" \
    --swarm --swarm-master \
    --swarm-discovery=consul://$(docker-machine ip consul):8500 \
    --engine-opt="cluster-store=consul://$(docker-machine ip consul):8500" \
    --engine-opt="cluster-advertise=eth0:2376" \
    cluster
Note that we chose a specific Debian 8.2 image, debian-8-x64: the default Ubuntu image that Docker Machine picks on Digital Ocean won't work, because its kernel is too old for Docker overlay networks.
We also pass cluster-store and cluster-advertise options to the Docker Engine on this new machine, telling it where the swarm stores the keys and values describing the infrastructure we are building: on the consul instance we readied before.
Next, create a machine with 2GB of RAM to run Bitbucket Server:
docker-machine create -d digitalocean \
    --digitalocean-access-token=$DO_TOKEN \
    --digitalocean-image "debian-8-x64" \
    --digitalocean-region "ams3" \
    --digitalocean-size "2gb" \
    --swarm \
    --swarm-discovery=consul://$(docker-machine ip consul):8500 \
    --engine-label instance=java \
    --engine-opt="cluster-store=consul://$(docker-machine ip consul):8500" \
    --engine-opt="cluster-advertise=eth0:2376" \
    node1
We give the machine 2GB of RAM and tag it with the label instance=java so that we can place our application using label constraints.
Third, create a machine to host the PostgreSQL database:
docker-machine create -d digitalocean \
    --digitalocean-access-token=$DO_TOKEN \
    --digitalocean-image "debian-8-x64" \
    --digitalocean-region "ams3" \
    --swarm \
    --swarm-discovery=consul://$(docker-machine ip consul):8500 \
    --engine-label instance=db \
    --engine-opt="cluster-store=consul://$(docker-machine ip consul):8500" \
    --engine-opt="cluster-advertise=eth0:2376" \
    node2
We tag this machine with the label instance=db so that we can place the database using label constraints.
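The two worker-node invocations differ only in name, label, and droplet size. A small helper that prints the corresponding docker-machine create command can cut the repetition; this is my own sketch (the function name and the dry-run approach are not from the original post), taking the consul IP as a parameter so it has no side effects until you pipe its output to sh:

```shell
# print_create_cmd NAME LABEL CONSUL_IP [SIZE]
# Print the docker-machine create command for one swarm worker node.
# Review the output, then pipe it to sh to actually create the droplet.
print_create_cmd() {
  name="$1"; label="$2"; consul_ip="$3"; size="${4:-512mb}"
  printf '%s\n' "docker-machine create -d digitalocean --digitalocean-access-token=\$DO_TOKEN --digitalocean-image debian-8-x64 --digitalocean-region ams3 --digitalocean-size $size --swarm --swarm-discovery=consul://$consul_ip:8500 --engine-label instance=$label --engine-opt=cluster-store=consul://$consul_ip:8500 --engine-opt=cluster-advertise=eth0:2376 $name"
}

# Usage (assumes the consul machine from above exists):
# print_create_cmd node1 java "$(docker-machine ip consul)" 2gb | sh
# print_create_cmd node2 db   "$(docker-machine ip consul)"     | sh
```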
Check with docker-machine ls that the machines have been created:

NAME      ACTIVE   DRIVER         STATE     URL                          SWARM
consul    -        digitalocean   Running   tcp://220.127.116.11:2376
cluster   *        digitalocean   Running   tcp://18.104.22.168:2376     cluster (master)
node1     -        digitalocean   Running   tcp://22.214.171.124:2376    cluster
node2     -        digitalocean   Running   tcp://126.96.36.199:2376     cluster
Connect our local docker command to the entire Swarm:
eval $(docker-machine env --swarm cluster)
Running docker info against the Swarm now reports all three nodes:

Containers: 15
Images: 12
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 3
 cluster: 188.8.131.52:2376
  └ Containers: 2
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 519.2 MiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.16.0-4-amd64, operatingsystem=Debian GNU/Linux 8 (jessie), provider=digitalocean, storagedriver=aufs
 node1: 184.108.40.206:2376
  └ Containers: 10
  └ Reserved CPUs: 0 / 2
  └ Reserved Memory: 0 B / 2.061 GiB
  └ Labels: executiondriver=native-0.2, instance=java, kernelversion=3.16.0-4-amd64, operatingsystem=Debian GNU/Linux 8 (jessie), provider=digitalocean, storagedriver=aufs
 node2: 220.127.116.11:2376
  └ Containers: 3
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 519.2 MiB
  └ Labels: executiondriver=native-0.2, instance=db, kernelversion=3.16.0-4-amd64, operatingsystem=Debian GNU/Linux 8 (jessie), provider=digitalocean, storagedriver=aufs
CPUs: 4
Total Memory: 3.075 GiB
Name: c5e1ce85f79a
Multi-host Docker Compose configuration
Next on the list is to write the multi-host configuration in a
docker-compose.yml, which will take care of starting both our Java application and our database in the proper order. It will also create a transparent overlay network between the cluster nodes involved.
The interesting points of the setup are:
- We do not specify any IP addresses for the physical infrastructure.
- We allocate applications to nodes using label constraints.
- We create a data-only container holding the Bitbucket Server licensing information.
- We only use official images from the Docker Hub.
This is the complete docker-compose.yml:

bitbucket:
  image: atlassian/bitbucket-server
  ports:
    - "7990:7990"
    - "7999:7999"
  volumes_from:
    - license
  user: root
  privileged: true
  environment:
    - "constraint:instance==java"
db:
  image: postgres
  ports:
    - "5432:5432"
  environment:
    - "POSTGRES_PASSWORD=somepassword"
    - "constraint:instance==db"
license:
  build: .
The license data-only container was built from a Dockerfile written like this:

FROM alpine
RUN mkdir -p /var/atlassian/application-data/bitbucket/shared
COPY ./bitbucket.properties /var/atlassian/application-data/bitbucket/shared/bitbucket.properties
VOLUME /var/atlassian/application-data/bitbucket
CMD ["/bin/true"]
In reality the only file it stores is a single bitbucket.properties with this content:

setup.displayName=Bitbucket Server
setup.baseUrl=http://localhost:7990
setup.license=<fill your license>
setup.sysadmin.username=admin
setup.sysadmin.password=admin
setup.sysadmin.displayName=<User Name>
setup.sysadmin.emailAddress=<Email Address>
jdbc.driver=org.postgresql.Driver
jdbc.url=jdbc:postgresql://orchestration_db_1:5432/postgres
jdbc.user=postgres
jdbc.password=somepassword
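Since that file contains a license and a database password, one option is to generate it at build time instead of committing it. A sketch of that idea, not from the original post (BB_LICENSE and BB_DB_PASSWORD are hypothetical variable names I chose for illustration):

```shell
# Generate bitbucket.properties from environment variables so secrets
# stay out of version control; falls back to the placeholder values
# used in the article when the variables are unset.
: "${BB_LICENSE:=<fill your license>}"
: "${BB_DB_PASSWORD:=somepassword}"

cat > bitbucket.properties <<EOF
setup.displayName=Bitbucket Server
setup.baseUrl=http://localhost:7990
setup.license=${BB_LICENSE}
setup.sysadmin.username=admin
setup.sysadmin.password=admin
jdbc.driver=org.postgresql.Driver
jdbc.url=jdbc:postgresql://orchestration_db_1:5432/postgres
jdbc.user=postgres
jdbc.password=${BB_DB_PASSWORD}
EOF
```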
To start everything we can now invoke
docker-compose, making sure we turn on
the multi-host networking and specify we want to use an overlay network:
docker-compose --x-networking --x-network-driver=overlay up -d
The result is our application deployed to the cluster:
docker ps

CONTAINER ID   IMAGE                        COMMAND                  CREATED       STATUS       PORTS                                                         NAMES
0f6adc9a14bb   atlassian/bitbucket-server   "./bin/start-bitbucke"   2 hours ago   Up 2 hours   18.104.22.168:7990->7990/tcp, 22.214.171.124:7999->7999/tcp   node1/orchestration_bitbucket_1
0a305957925f   postgres                     "/docker-entrypoint.s"   2 hours ago   Up 2 hours   126.96.36.199:5432->5432/tcp                                  node2/orchestration_db_1
Note that the Java application, Bitbucket Server, was deployed to the instance with 2GB of RAM labelled java as planned, and PostgreSQL went to node2, which was labelled db.
Hurdles along the way

While creating the setup above I ran into a whole set of issues, partly due to the novelty of the tools and partly due to my hastiness.
Proper orchestration only works with a fully fledged discovery service like Consul, not the default token you get when running the basic docker swarm create.
In the cluster-advertise flag I initially specified eth1, but the right interface depends on the specific machine and provider used; in the case of Digital Ocean the correct interface is eth0. I tracked that down by looking into /var/log/upstart/docker.log, where I found this bit:
Error starting daemon: discovery advertise parsing failed (no available advertise IP address in interface (eth1:2376))
To understand what happened I even went looking into the source code.
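Rather than guessing the interface name, one can inspect the droplet's interfaces directly. The tiny awk filter below is my own helper (not from the post); it reads the output of ip -o -4 addr show and prints the interface carrying an address with a given prefix:

```shell
# iface_with_prefix PREFIX: read `ip -o -4 addr show` output on stdin and
# print the first interface whose IPv4 address starts with PREFIX.
# Field layout of that output: <idx>: <iface> inet <addr>/<mask> ...
iface_with_prefix() {
  awk -v p="$1" 'index($4, p) == 1 { print $2; exit }'
}

# On the droplet (assumes docker-machine and the node exist):
# docker-machine ssh node1 'ip -o -4 addr show' | iface_with_prefix 10.
```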
At one point I got a very cryptic failure on vxlan interface creation, like the following:
ERROR: Cannot start container 774f639d4275af7f53dd8c8f3d65387d053c8000ab96ce3c6765b982428c3a2d: subnet sandbox join failed for "10.0.0.0/24": vxlan interface creation failed for subnet "10.0.0.0/24": failed in prefunc: failed to set namespace on link "vxlana389573": invalid argument
It turns out that to get full-blown multi-host support in Compose and Swarm you need at least a 3.15+ Linux kernel (as explained here), and the default Digital Ocean Ubuntu image had an older one:
Linux node3 3.13.0-68-generic #111-Ubuntu SMP Fri Nov 6 18:17:06 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
To make things work I had to add --digitalocean-image "debian-8-x64" to my docker-machine create commands.
To find the proper Digital Ocean image I installed a neat tool called tugboat, a command line client for working with Digital Ocean droplets and images:
gem install tugboat
tugboat authorize
tugboat images | grep ubuntu
12.04.5 x64 (slug: ubuntu-12-04-x64, id: 10321756, distro: Ubuntu)
12.04.5 x32 (slug: ubuntu-12-04-x32, id: 10321777, distro: Ubuntu)
15.10 x64 (slug: ubuntu-15-10-x64, id: 14169855, distro: Ubuntu)
15.10 x32 (slug: ubuntu-15-10-x32, id: 14169868, distro: Ubuntu)
15.04 x64 (slug: ubuntu-15-04-x64, id: 14169884, distro: Ubuntu)
15.04 x32 (slug: ubuntu-15-04-x32, id: 14169999, distro: Ubuntu)
14.04.3 x64 (slug: ubuntu-14-04-x64, id: 14530089, distro: Ubuntu)
14.04.3 x32 (slug: ubuntu-14-04-x32, id: 14530129, distro: Ubuntu)
I tried all the Ubuntu images and they all failed, including 15.04, so I switched to the Debian 8.2 image, which had the proper kernel version and didn't crash.
Whenever I needed to restart the containers, the overlay network creation would fail because stale vxlan state was left behind. To clear it on the node I used:
sudo umount /var/run/docker/netns/* && sudo rm /var/run/docker/netns/* && start docker
The source of the above configurations can be found on Bitbucket.
This for me was the first magical step towards having an entire suite of Atlassian tools deployed and run automatically on a Docker Swarm. Stay tuned for the next chapter in the series. If you found this interesting and want more, follow me at @durdn or my awesome team at @atlassiandev.