Jake Sanders | 16 February 2015
If you've heard of Docker containers, you've probably also heard of CoreOS. If not: CoreOS is a lightweight, minimal Linux distribution designed for running containers in a clustered environment. That sounds ideal for running Apache Mesos, with one caveat: CoreOS has no package manager, so in order to run Mesos we're going to have to create containerised versions of all of Mesos' services.
Let's quickly go over the key components of Mesos:

- Zookeeper: coordinates the cluster and handles master election
- Mesos master: collects resources from slaves and offers them to frameworks
- Mesos slave: runs on every node and launches tasks with the resources it advertises
- Frameworks (e.g. Marathon): accept resource offers and schedule tasks on the cluster
Neither Zookeeper nor the Mesos master has any issues running in a container. The Mesos slave, however, is a little more complex, as it expects access to the Docker daemon. To accomplish this, we're going to mount the host's Docker socket, the docker executable, and the libraries it links against into the container.
For a quick-and-dirty single-node setup, fire up a fresh CoreOS installation and run the following commands:
# Grab our IP
export HOST_IP=`ip -o -4 addr list eth0 | grep global | awk '{print $4}' | cut -d/ -f1`
# Start Zookeeper
docker run -d --name=zookeeper --net=host jplock/zookeeper
# Start mesos master
docker run -d --name=mesos_master --net=host mesosphere/mesos-master:0.20.1 \
  --ip=$HOST_IP --zk=zk://$HOST_IP:2181/mesos --work_dir=/var/lib/mesos/master --quorum=1
# Start mesos slave
docker run -d --name=mesos_slave --privileged --net=host \
  -v /sys:/sys \
  -v /usr/bin/docker:/usr/bin/docker:ro \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /lib64/libdevmapper.so.1.02:/lib/libdevmapper.so.1.02:ro \
  -v /lib64/libpthread.so.0:/lib/libpthread.so.0:ro \
  -v /lib64/libsqlite3.so.0:/lib/libsqlite3.so.0:ro \
  -v /lib64/libudev.so.1:/lib/libudev.so.1:ro \
  mesosphere/mesos-slave:0.20.1 --ip=$HOST_IP --containerizers=docker \
  --master=zk://$HOST_IP:2181/mesos --work_dir=/var/lib/mesos/slave --log_dir=/var/log/mesos/slave
# Start framework, for example Marathon:
docker run -d --name marathon -e LIBPROCESS_PORT=9090 -p 8080:8080 -p 9090:9090 \
  mesosphere/marathon:v0.7.6 --master zk://$HOST_IP:2181/mesos --zk zk://$HOST_IP:2181/marathon \
  --checkpoint --task_launch_timeout 300000
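To sanity-check the setup, both the master and Marathon expose HTTP interfaces; a quick check against the ports above looks like this:

# The master's state endpoint should report one activated slave:
curl -s http://$HOST_IP:5050/master/state.json | grep activated_slaves
# Marathon's UI should respond on port 8080:
curl -sI http://$HOST_IP:8080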
Alternatively, start CoreOS with this cloud-config file. Just like that, we have a single-host Mesos "cluster".
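If you'd rather assemble such a cloud-config yourself, one approach is to wrap each of the docker run commands above in a systemd unit. Here's a minimal sketch covering just Zookeeper (the unit name is mine; the master, slave, and Marathon units follow the same pattern):

#cloud-config
coreos:
  units:
    - name: zookeeper.service
      command: start
      content: |
        [Unit]
        Description=Zookeeper
        After=docker.service
        Requires=docker.service

        [Service]
        Restart=on-failure
        # Clean up any leftover container, then run in the foreground so systemd can supervise it
        ExecStartPre=-/usr/bin/docker kill zookeeper
        ExecStartPre=-/usr/bin/docker rm zookeeper
        ExecStart=/usr/bin/docker run --rm --name=zookeeper --net=host jplock/zookeeper
        ExecStop=/usr/bin/docker stop zookeeper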
If you're unfamiliar with Mesos' architecture, I covered it briefly in a previous post, or you can read the official docs. The key point is that Mesos needs to know the address(es) of a running Zookeeper quorum so that nodes can register themselves with the cluster. In the single-host example above, this was easily achieved by grabbing our own IP address, but across multiple hosts we'll need some sort of service discovery.
The easy way would be to set up Zookeeper on a dedicated host (or three) and use DNS. However, seeing as we talked about automated service discovery in the previous blog post, let's roll a completely automated solution. The only information each node needs is the other nodes' IP addresses, which we can glean using CoreOS's built-in etcd discovery.
Start some new CoreOS hosts with the following cloud-config file:
#cloud-config
coreos:
  etcd:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new
    discovery: -------Generate your own token here----------
    # use $public_ipv4 if your datacenter of choice does not support private networking
    addr: $private_ipv4:4001
    peer-addr: $private_ipv4:7001
  fleet:
    public-ip: $private_ipv4 # used for fleetctl ssh command
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start
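Remember to generate a fresh discovery token for each cluster before booting the nodes, for example:

# Request a new discovery token and paste it into the cloud-config above:
curl -s https://discovery.etcd.io/new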
Here are some systemd unit files for launching Consul and Registrator and for bootstrapping the Consul cluster. Launch them on each node using systemctl (an example follows the unit files).
consul.service
[Unit]
Description=Consul
After=docker.service
Requires=docker.service
[Service]
Restart=on-failure
TimeoutStartSec=0
EnvironmentFile=/etc/environment
ExecStartPre=-/usr/bin/docker kill consul
ExecStartPre=-/usr/bin/docker rm consul
ExecStartPre=/usr/bin/docker pull progrium/consul
ExecStartPre=-/usr/bin/etcdctl mk /consul $COREOS_PUBLIC_IPV4
ExecStart=/usr/bin/sh -c "/usr/bin/docker run --rm --name consul -h $(/usr/bin/cat /etc/hostname) -p 8300:8300 -p 8301:8301 -p 8301:8301/udp -p 8302:8302 -p 8302:8302/udp -p 8400:8400 -p 8500:8500 -p 53:53/udp progrium/consul -server -bootstrap-expect 3 -advertise $(/usr/bin/ip -o -4 addr list eth0 | /usr/bin/grep global | /usr/bin/awk \'{print $4}\' | /usr/bin/cut -d/ -f1)"
ExecStop=/usr/bin/docker stop consul
[Install]
WantedBy=multi-user.target
consul-discovery.service
[Unit]
Description=Consul Discovery
BindsTo=consul.service
After=consul.service
[Service]
Restart=on-failure
EnvironmentFile=/etc/environment
ExecStart=/bin/sh -c "while true; do etcdctl mk /services/consul $COREOS_PUBLIC_IPV4 --ttl 60; /usr/bin/docker exec consul consul join $(etcdctl get /services/consul); sleep 45; done"
ExecStop=/usr/bin/etcdctl rm /services/consul --with-value %H
[Install]
WantedBy=multi-user.target
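The consul-discovery unit implements a simple heartbeat: every 45 seconds it refreshes an etcd key holding this node's IP (with a 60-second TTL, so dead nodes drop out) and joins the Consul cluster via whichever node's IP the key currently holds. Once it's running everywhere, you can verify the bootstrap from any node, assuming the key and container names above:

# The etcd key should always hold the IP of a live Consul node:
etcdctl get /services/consul
# All three Consul agents should show up as members:
docker exec consul consul members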
registrator.service
[Unit]
Description=Registrator
After=docker.service
Requires=docker.service
[Service]
Restart=on-failure
TimeoutStartSec=0
EnvironmentFile=/etc/environment
ExecStartPre=-/usr/bin/docker kill registrator
ExecStartPre=-/usr/bin/docker rm registrator
ExecStartPre=/usr/bin/docker pull progrium/registrator
ExecStart=/usr/bin/sh -c "/usr/bin/docker run --rm --name registrator -h $(/usr/bin/cat /etc/hostname) -v /var/run/docker.sock:/tmp/docker.sock progrium/registrator consul://$(/usr/bin/ip -o -4 addr list eth0 | grep global | awk \'{print $4}\' | cut -d/ -f1):8500"
ExecStop=/usr/bin/docker stop registrator
[Install]
WantedBy=multi-user.target
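To launch everything, copy the three unit files into /etc/systemd/system on each node and start them:

sudo cp consul.service consul-discovery.service registrator.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl start consul.service consul-discovery.service registrator.service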
Alternatively, grab our example repository, which contains all the appropriate unit files embedded inside cloud-config files already, and give it a whirl!