Running Apache Mesos inside Docker on CoreOS

Jake Sanders | 16 February 2015

If you've heard of Docker containers, you've probably also heard of CoreOS. If not, CoreOS is a lightweight, minimal Linux distribution designed for running containers in a clustered environment. This sounds ideal for running Apache Mesos, with one caveat: CoreOS has no package manager, so to run Mesos we have to create containerised versions of all of Mesos's services.

Let's quickly go over the key components of Mesos:

  • Zookeeper
  • Mesos master
  • Mesos slave
  • Framework

Neither Zookeeper nor the Mesos master has any trouble running in a container. The Mesos slave is a little more complex, because it expects to have access to the Docker daemon. To provide that, we mount the host's Docker socket, executable, and related libraries into the container.
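
The library mounts are the repetitive part of the slave's docker run command. As a rough sketch, they could be generated with a small helper. The function name lib_mounts is hypothetical, and it assumes the host keeps the libraries in /lib64 while the image expects them in /lib, as in the slave command in this post:

```shell
# Hypothetical helper: print one read-only bind-mount flag per library name.
# Assumes host libraries live in /lib64 and the container looks in /lib.
lib_mounts() {
  for lib in "$@"; do
    printf '%s /lib64/%s:/lib/%s:ro\n' -v "$lib" "$lib"
  done
}

# The four libraries the Docker binary is linked against in this setup.
lib_mounts libdevmapper.so.1.02 libpthread.so.0 libsqlite3.so.0 libudev.so.1
```

Left unquoted, $(lib_mounts ...) splits into separate arguments, so it can be dropped into a docker run command line in place of the hand-written -v flags.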

For a quick-and-dirty single-node setup, fire up a fresh CoreOS installation and run the following commands:


# Grab our IP
export HOST_IP=`ip -o -4 addr list eth0 | grep global | awk '{print $4}' | cut -d/ -f1`
# Start Zookeeper
docker run -d --name=zookeeper --net=host jplock/zookeeper
# Start mesos master
docker run -d --name=mesos_master --net=host mesosphere/mesos-master:0.20.1 --ip=$HOST_IP --zk=zk://$HOST_IP:2181/mesos --work_dir=/var/lib/mesos/master --quorum=1
# Start mesos slave
docker run -d --name=mesos_slave --privileged --net=host -v /sys:/sys -v /usr/bin/docker:/usr/bin/docker:ro -v /var/run/docker.sock:/var/run/docker.sock -v /lib64/libdevmapper.so.1.02:/lib/libdevmapper.so.1.02:ro -v /lib64/libpthread.so.0:/lib/libpthread.so.0:ro -v /lib64/libsqlite3.so.0:/lib/libsqlite3.so.0:ro -v /lib64/libudev.so.1:/lib/libudev.so.1:ro mesosphere/mesos-slave:0.20.1 --ip=$HOST_IP --containerizers=docker --master=zk://$HOST_IP:2181/mesos --work_dir=/var/lib/mesos/slave --log_dir=/var/log/mesos/slave
# Start framework, for example Marathon:
docker run -d --name marathon -e LIBPROCESS_PORT=9090 -p 8080:8080 -p 9090:9090 mesosphere/marathon:v0.7.6 --master zk://$HOST_IP:2181/mesos --zk zk://$HOST_IP:2181/marathon --checkpoint --task_launch_timeout 300000

Alternatively, start CoreOS with this cloud-config file. Just like that, we have a single-host Mesos "cluster".
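
The HOST_IP line in the commands above is the piece most likely to need adjusting on other hosts, so it may help to see the parsing step in isolation. This is a minimal sketch: extract_ipv4 is a hypothetical name, the sample line uses a made-up address, and a single awk call stands in for the grep, awk, and cut pipeline:

```shell
# Hypothetical helper: read `ip -o -4 addr list <iface>` output on stdin and
# print the first address with global scope, without its /prefix length.
extract_ipv4() {
  awk '/global/ { split($4, a, "/"); print a[1]; exit }'
}

# Sample line in the format `ip -o -4 addr list eth0` prints (address is made up).
sample='2: eth0    inet 10.0.2.15/24 brd 10.0.2.255 scope global eth0'
printf '%s\n' "$sample" | extract_ipv4   # prints 10.0.2.15
```

On a real host the equivalent call would be ip -o -4 addr list eth0 | extract_ipv4, assuming the interface is named eth0 as it is in this post.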

Expanding to a multiple-host cluster

If you're unfamiliar with Mesos' architecture, I covered it briefly in a previous post, or you can read the official docs. The key point is that Mesos needs to know the address(es) of a running Zookeeper quorum in order for nodes to register themselves in the cluster. In the above (single host) example, this was easily achieved by grabbing our own IP address. But now we'll need to use some sort of service discovery.
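
With more than one Zookeeper host, the address passed to --zk and --master lists every member, and the master's --quorum should be a majority of the masters. The sketch below shows both calculations. The function names zk_url and quorum_size are hypothetical, the IP addresses are placeholders, and port 2181 is assumed as in the single-host example:

```shell
# Hypothetical helper: build a zk:// URL from a znode path and host IPs.
# Assumes every Zookeeper instance listens on the default port 2181.
zk_url() {
  path=$1; shift
  hosts=""
  for ip in "$@"; do
    hosts="${hosts:+$hosts,}$ip:2181"
  done
  printf 'zk://%s%s\n' "$hosts" "$path"
}

# Hypothetical helper: majority quorum for a given number of Mesos masters.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

zk_url /mesos 10.0.0.1 10.0.0.2 10.0.0.3   # prints zk://10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181/mesos
quorum_size 3                              # prints 2
```

For a single master, quorum_size 1 gives 1, which matches the --quorum=1 used in the single-host commands.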

The easy way would be to set up Zookeeper on a dedicated host (or three) and use DNS. However, seeing as we talked about automated service discovery in the previous blog post, let's roll a completely automated solution. The only thing the hosts need to know is each other's IP addresses, which we can glean using CoreOS's built-in etcd discovery.

Start some new CoreOS hosts with the following cloud-config file:


#cloud-config

coreos:
  etcd:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new
    discovery: -------Generate your own token here----------
    # use $public_ipv4 if your datacenter of choice does not support private networking
    addr: $private_ipv4:4001
    peer-addr: $private_ipv4:7001
  fleet:
    public-ip: $private_ipv4   # used for fleetctl ssh command
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start

Here are some systemd unit files for launching Consul and Registrator, and for bootstrapping the Consul cluster. Launch them on each node using systemctl.

consul.service


[Unit]
Description=Consul
After=docker.service
Requires=docker.service

[Service]
Restart=on-failure
TimeoutStartSec=0
EnvironmentFile=/etc/environment
ExecStartPre=-/usr/bin/docker kill consul
ExecStartPre=-/usr/bin/docker rm consul
ExecStartPre=/usr/bin/docker pull progrium/consul
ExecStartPre=-/usr/bin/etcdctl mk /consul $COREOS_PUBLIC_IPV4
ExecStart=/usr/bin/sh -c "/usr/bin/docker run --rm --name consul -h $(/usr/bin/cat /etc/hostname) -p 8300:8300 -p 8301:8301 -p 8301:8301/udp -p 8302:8302 -p 8302:8302/udp -p 8400:8400 -p 8500:8500 -p 53:53/udp progrium/consul -server -bootstrap-expect 3 -advertise $(/usr/bin/ip -o -4 addr list eth0 | /usr/bin/grep global | /usr/bin/awk \'{print $4}\' | /usr/bin/cut -d/ -f1)"
ExecStop=/usr/bin/docker stop consul

[Install]
WantedBy=multi-user.target

consul-discovery.service


[Unit]
Description=Consul Discovery
BindsTo=consul.service
After=consul.service

[Service]
Restart=on-failure
EnvironmentFile=/etc/environment
ExecStart=/bin/sh -c "while true; do etcdctl mk /services/consul $COREOS_PUBLIC_IPV4 --ttl 60;/usr/bin/docker exec consul consul join $(etcdctl get /services/consul);sleep 45;done"
ExecStop=/usr/bin/etcdctl rm /services/consul --with-value $COREOS_PUBLIC_IPV4

[Install]
WantedBy=multi-user.target

registrator.service


[Unit]
Description=Registrator
After=docker.service
Requires=docker.service
[Service]
Restart=on-failure
TimeoutStartSec=0
EnvironmentFile=/etc/environment
ExecStartPre=-/usr/bin/docker kill registrator
ExecStartPre=-/usr/bin/docker rm registrator
ExecStartPre=/usr/bin/docker pull progrium/registrator
ExecStart=/usr/bin/sh -c "/usr/bin/docker run --rm --name registrator -h $(/usr/bin/cat /etc/hostname) -v /var/run/docker.sock:/tmp/docker.sock progrium/registrator consul://$(/usr/bin/ip -o -4 addr list eth0 | grep global | awk \'{print $4}\' | cut -d/ -f1):8500"
ExecStop=/usr/bin/docker stop registrator
[Install]
WantedBy=multi-user.target
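
Starting the three units with systemctl might look like the sketch below. It assumes the unit files have already been saved under /etc/systemd/system/ on the node. The helper names run and launch_units, and the DRY_RUN switch that prints the commands instead of executing them, are illustrative additions and not part of the original setup:

```shell
# Units in start order: Consul first, then the discovery loop that joins it
# to its peers, then Registrator, which registers containers with Consul.
UNITS="consul.service consul-discovery.service registrator.service"

# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: reload systemd so it sees the new files, then start each unit.
launch_units() {
  run sudo systemctl daemon-reload
  for unit in $UNITS; do
    run sudo systemctl start "$unit"
  done
}

# Preview the commands without touching systemd.
DRY_RUN=1 launch_units
```

Calling launch_units with DRY_RUN unset or set to 0 executes the same commands for real on the node.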

Alternatively, grab our example repository, which contains all the appropriate unit files embedded inside cloud-config files already, and give it a whirl!
