Service Discovery with Docker, Consul, Consul-Template, & Registrator

There are many good articles that explain in detail how to set up service discovery using the services above. Instead, this article focuses on the challenges of getting service discovery set up on AWS.

Reasons for using Consul as a service discovery tool

Our primary reason for setting up Consul is that we are moving our applications to a containerized environment in AWS. We decided on a combination of Docker for containerization and AWS CloudFormation for infrastructure provisioning.

Quickstart

A good place to start is HashiCorp's Getting Started section. Another useful reference page for listing Consul services and nodes is Catalog – HTTP API – Consul by HashiCorp.

General outline of setting up service discovery

  1. Set up a cluster of Consul servers
  2. Run Consul clients on all Docker hosts / ECS instances
  3. Run Registrator for automatic service discovery on all Docker hosts / ECS instances
  4. Run Consul Template to generate configuration files & reload services programmatically

Set up a cluster of Consul servers

In order to create the Consul cluster programmatically, we use a CloudFormation template with AWS::CloudFormation::Init metadata that is executed by the CloudFormation helper scripts.

The metadata defines configuration sets that install Consul, set it up to run as a service, and start that service on boot. We run the suggested minimum of three Consul servers on t2.micro instances, as described in the minimum server requirements in HashiCorp's documentation.
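As a rough sketch, the commands baked into those configuration sets boil down to something like the following; the Consul version, install paths, and service tooling are illustrative assumptions, not the exact template contents:

# Download and install the Consul binary (version and paths are assumptions)
curl -sLo /tmp/consul.zip https://releases.hashicorp.com/consul/1.0.0/consul_1.0.0_linux_amd64.zip
unzip -o /tmp/consul.zip -d /usr/local/bin/
mkdir -p /opt/consul /etc/consul.d

# Register Consul as a system service and start it on boot; the actual template
# writes an init script / unit file, shown here with Amazon Linux style commands
chkconfig consul on
service consul start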

Consul Node Name

For both servers and clients, we define a unique node name based on the CloudFormation stack name in combination with the last octet of the instance’s private IP address:

-node=${AWS::StackName}-$(echo ${!EC2_INSTANCE_IP_ADDRESS} | tr . " " | awk '{print $4}')

Consul Bind Address

By default, Consul binds to 0.0.0.0 and tries to advertise the machine's private IPv4 address to the rest of the cluster. This process of "automatically advertising the first available private IPv4" address (Configuration – Consul by HashiCorp) didn't fully work for us, so we set the bind address explicitly when starting the Consul service:

-bind=${!EC2_INSTANCE_IP_ADDRESS}

Automatic Cluster Joining

We make use of Consul's auto-discovery mechanism on AWS (Cloud Auto-Joining), which discovers other instances by their EC2 tags. Since CloudFormation automatically tags the instances it creates, we join on the name of the CloudFormation stack:

-retry-join "provider=aws tag_key=aws:cloudformation:stack-name tag_value=${AWS::StackName}"
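Putting the pieces together, the server agent startup ends up looking roughly like this; the -server, -bootstrap-expect, and -data-dir flags are assumptions based on the three-node cluster described above, while the placeholders are the same CloudFormation Fn::Sub expressions used in the snippets:

# Start a Consul server agent that expects a 3-node cluster (sketch)
consul agent -server \
  -bootstrap-expect 3 \
  -data-dir /opt/consul \
  -node=${AWS::StackName}-$(echo ${!EC2_INSTANCE_IP_ADDRESS} | tr . " " | awk '{print $4}') \
  -bind=${!EC2_INSTANCE_IP_ADDRESS} \
  -retry-join "provider=aws tag_key=aws:cloudformation:stack-name tag_value=${AWS::StackName}"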

Run Consul clients on all Docker hosts / ECS instances

Similar to installing the Consul servers, we use CloudFormation metadata and the AWS::CloudFormation::Init helper scripts to download, configure, and run the Consul client on the Docker hosts / ECS instances. These nodes then automatically join the Consul cluster once they come online.
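The client invocation is essentially the server one without the -server and -bootstrap-expect flags. Binding the HTTP API to all interfaces via -client is an assumption on my part, but something along these lines is needed so that containers such as Registrator (below) can reach the agent on port 8500 over the host's private IP:

# Start a Consul client agent on a Docker host / ECS instance (sketch)
consul agent \
  -data-dir /opt/consul \
  -client 0.0.0.0 \
  -node=${AWS::StackName}-$(echo ${!EC2_INSTANCE_IP_ADDRESS} | tr . " " | awk '{print $4}') \
  -bind=${!EC2_INSTANCE_IP_ADDRESS} \
  -retry-join "provider=aws tag_key=aws:cloudformation:stack-name tag_value=${AWS::StackName}"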

Run Registrator for automatic service discovery on all Docker hosts / ECS instances

Full documentation for Registrator.

Registrator is a service registry bridge for Docker. It works with several service registries, Consul being one of them. Instead of us having to write custom scripts that watch for container starts and stops, Registrator takes care of that task: it talks to Consul and registers / deregisters services whenever a container starts or stops.

Running Registrator on Docker host / ECS instance

By default, Registrator runs in Docker's host networking mode (Docker run reference | Docker Documentation), which gives the container full access to the host's system services.

In AWS ECS, this networking mode is available, but only as a global setting in a task definition, i.e. every container in a task definition would have to run in this mode. Since this is not desirable, we use a CloudFormation template to start Registrator outside of ECS and manually set the Docker container hostname as well as Registrator's -ip option. EC2's instance metadata and user data come in handy and provide the required values:

docker run -d --restart=always -v /var/run/docker.sock:/tmp/docker.sock \
-h $(curl -s http://169.254.169.254/latest/meta-data/instance-id) \
--name consul-registrator gliderlabs/registrator:latest \
-ip $(curl -s http://169.254.169.254/latest/meta-data/local-ipv4) \
consul://$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4):8500
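To check that containers actually show up, the catalog endpoints mentioned in the Quickstart section can be queried on any agent; the service name below is just an illustration:

# List all registered services and their tags
curl -s http://localhost:8500/v1/catalog/services

# Show the nodes, addresses, and ports behind a specific service
curl -s http://localhost:8500/v1/catalog/service/web-80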

Run Consul Template to generate configuration files & reload services programmatically

Consul Template on Github: GitHub – hashicorp/consul-template

Our use case requires a Varnish instance whose configuration file is generated dynamically and reloaded afterwards. HashiCorp's consul-template service does that for us:
consul-template watches the Consul catalog for changes and renders configuration files from Go templates.

Varnish Template Example

Consul Template’s templating language is documented in detail here. Here’s a basic example that dynamically creates backends based on tags and services:

{{range service "dummy-app.web-80"}}
backend {{.Node | regexReplaceAll "([^0-9a-zA-Z_]+)" "_"}}_{{.Port}} {
  .host = "{{.Address}}";
  .port = "{{.Port}}";
  .probe = cda_healthcheck;
}
{{end}}

In this example, we are monitoring all web services running on port 80 that are tagged with dummy-app. We use tags because we might have multiple web services running on the same port, but only want specific applications to be included as Varnish backends. Consul provides the IP address ({{.Address}}) and port ({{.Port}}) of each instance, and Consul Template renders them into the backend definitions.
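For completeness, the service name and tag that the template queries come from the application container itself: Registrator picks up SERVICE_NAME and SERVICE_TAGS (or per-port SERVICE_<port>_* variants) from the container's environment. A hypothetical application container might therefore be started like this; the image name is made up, and whether the port gets appended to the service name depends on Registrator's naming rules:

# Start an app container whose published port gets registered in Consul
# with the name "web" and the tag "dummy-app" (illustrative image name)
docker run -d -p 80 \
  -e SERVICE_NAME=web \
  -e SERVICE_TAGS=dummy-app \
  my-dummy-app-image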

Consul Template Exec Mode

Consul Template offers a supervisor mode that is supposed to work well with containers. Since a container is only supposed to run one command, having two services (in this case Consul Template and Varnish) run simultaneously is challenging. The exec mode in Consul Template is a simple supervisor that allows us to run consul-template as the container command, which in turn supervises another service, e.g. Varnish. When any of the listed templates change, Consul Template sends a configurable reload signal to the child process. Unfortunately, this did not work for me.
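For reference, a minimal sketch of such an exec-mode invocation, with illustrative varnishd flags; as noted, this approach did not work in my setup:

# consul-template renders the VCL, then starts and supervises varnishd;
# on template changes it sends the configured reload signal to the child process
consul-template \
  -consul-addr "10.0.110.124:8500" \
  -template "/default.ctmpl:/etc/varnish/default.vcl" \
  -exec "varnishd -F -f /etc/varnish/default.vcl -a :80 -s malloc,256m" \
  -exec-reload-signal SIGHUP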

Approach One: Run a simple process manager in a container

Since containers don't ship with a process manager by default, using one requires setting it up manually. There are multiple options out there, with varying degrees of complexity.

I decided on Chaperone, a lightweight, all-in-one process manager for lean containers, which seemed like it did the job and was easy to set up. Once installed, Chaperone reads a configuration file and starts the defined services. An example configuration file for Consul Template and Varnish could look like this:

varnish.service: {
  command: "varnishd -F -T localhost:6082 -f ${VCL_CONFIG} -S /etc/varnish/secret -s malloc,${CACHE_SIZE} ${VARNISHD_PARAMS}"
}

consul-template.service: {
  command: "/usr/local/bin/consul-template \
           -log-level debug \
           -consul-addr \"10.0.110.124:8500\" \
           -template \"/default.ctmpl:/etc/varnish/default.vcl:varnish_reload_vcl\""
}

console.logging: {
  selector: '*.warn',
  stdout: true
}

Consul Template's -template parameter has three colon-separated parts:
1. The source template
2. The target file
3. An optional command to execute after the template is rendered, in this case varnish_reload_vcl, which reloads the Varnish configuration.

This approach was unsuccessful for me, as the Varnish reload triggered by Consul Template would kill the container. On to the next approach: Docker in Docker.

Approach Two: Run Consul Template and Varnish in separate containers

This approach entails running the two services in separate containers, which is much closer to a standardized Docker setup. The only challenge: the consul-template container needs to execute a command in the varnish container.

The only way I was able to achieve this was by mounting the Docker host's Docker socket into the consul-template container. Some consider this bad practice (Don't expose the Docker socket (not even to a container) | lvh), but time was running out to complete this task, and the Registrator container relies on the same mechanism anyway.

When creating the Varnish image, we make the directory that contains the Varnish configuration files available as a volume, and then mount that volume into the Consul Template container using Docker's volumes_from parameter.
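A minimal sketch of that wiring with the plain Docker CLI (--volumes-from is the CLI counterpart of the compose volumes_from key; image names, labels, and paths are assumptions):

# The Varnish image declares /etc/varnish as a volume and gets a label so the
# consul-template container can find it later
docker run -d --name varnish --label varnish=engine -p 80:80 my-varnish-image

# consul-template shares that volume and the host's Docker socket
docker run -d --name consul-template \
  --volumes-from varnish \
  -v /var/run/docker.sock:/var/run/docker.sock \
  my-consul-template-image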

The consul-template container runs only Consul Template. It generates the Varnish configuration file in the volume mounted from the Varnish container and executes a local shell script whenever it detects a service change. Since this container has access to the Docker daemon, we can find the correct Varnish container via a Docker label and execute a script inside the varnish container that reloads the Varnish configuration:

docker exec $(docker ps --last 1 --filter "label=varnish=engine" --quiet) /usr/local/bin/reload-varnish-config.sh
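The reload script itself lives in the Varnish container and is not shown here; a hypothetical version based on varnishadm would load the freshly rendered VCL under a unique name and then switch the active configuration to it:

#!/bin/sh
# Hypothetical reload-varnish-config.sh: compile the rendered VCL under a
# unique name, then make it the active configuration
set -e
NAME="reload_$(date +%s)"
varnishadm -T localhost:6082 -S /etc/varnish/secret vcl.load "$NAME" /etc/varnish/default.vcl
varnishadm -T localhost:6082 -S /etc/varnish/secret vcl.use "$NAME"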

Summary

Service discovery on AWS ECS is slightly frustrating to set up, as the EC2 Container Service restricts how we can run containers. In particular, the inability to use different networking modes for individual containers in a task definition, as well as the inability to set dynamic environment variables for containers in a task definition, are annoying. But once everything is set up and working, service discovery takes care of dynamically updating all services with the right private IP addresses and ports, a task that would otherwise render a containerized architecture impractical.