RabbitMQ cluster on Kubernetes with StatefulSets
UPDATE 2018-1-8: Don’t do it this way. Use the k8s support built in to either the official autocluster plugin (for RabbitMQ 3.6.x) or the new built-in peer discovery feature in RabbitMQ 3.7.x+.
UPDATE 2017-3-18: Improved the postStart command based on further testing and refinement.
Since I couldn’t find a blog post showing how to do this, I told myself I should write one up once I figured it out. So here it is.
Kubernetes has a relatively new feature called StatefulSets that is designed to make it easier to run containerized services that are inherently stateful. And RabbitMQ is nothing if not inherently stateful. RabbitMQ clusters double down on the whole statefulness thing. After a few years of experience trying to force RabbitMQ clusters into stateless container infrastructure, I was curious to try embracing the statefulness instead. Paint with the grain and all that.
The first step is getting a Kubernetes (k8s) cluster. If you don’t already have one, minikube is a good option for getting started on your local dev machine.
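For example, on a dev machine (assuming you already have minikube and kubectl installed):

minikube start
kubectl cluster-info # confirm kubectl can reach the cluster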
Once you have a working k8s cluster and can talk to it with kubectl, you’re ready to create the RabbitMQ cluster. First we need to create a new secret to hold our Erlang cookie: kubectl create secret generic rabbitmq-config --from-literal=erlang-cookie=c-is-for-cookie-thats-good-enough-for-me. Then paste the YAML below into a rabbitmq.yaml file and run kubectl create -f rabbitmq.yaml.
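If you want to sanity-check the secret before moving on, you can read it back (the cookie value will show up base64-encoded):

kubectl get secret rabbitmq-config -o yaml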
---
apiVersion: v1
kind: Service
metadata:
  # Expose the management HTTP port on each node
  name: rabbitmq-management
  labels:
    app: rabbitmq
spec:
  ports:
  - port: 15672
    name: http
  selector:
    app: rabbitmq
  type: NodePort # Or LoadBalancer in production w/ proper security
---
apiVersion: v1
kind: Service
metadata:
  # The required headless service for StatefulSets
  name: rabbitmq
  labels:
    app: rabbitmq
spec:
  ports:
  - port: 5672
    name: amqp
  - port: 4369
    name: epmd
  - port: 25672
    name: rabbitmq-dist
  clusterIP: None
  selector:
    app: rabbitmq
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: rabbitmq
spec:
  serviceName: "rabbitmq"
  replicas: 5
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      containers:
      - name: rabbitmq
        image: rabbitmq:3.6.6-management-alpine
        lifecycle:
          postStart:
            exec:
              command:
              - /bin/sh
              - -c
              - >
                if [ -z "$(grep rabbitmq /etc/resolv.conf)" ]; then
                  sed "s/^search \([^ ]\+\)/search rabbitmq.\1 \1/" /etc/resolv.conf > /etc/resolv.conf.new;
                  cat /etc/resolv.conf.new > /etc/resolv.conf;
                  rm /etc/resolv.conf.new;
                fi;
                until rabbitmqctl node_health_check; do sleep 1; done;
                if [[ "$HOSTNAME" != "rabbitmq-0" && -z "$(rabbitmqctl cluster_status | grep rabbitmq-0)" ]]; then
                  rabbitmqctl stop_app;
                  rabbitmqctl join_cluster rabbit@rabbitmq-0;
                  rabbitmqctl start_app;
                fi;
                rabbitmqctl set_policy ha-all "." '{"ha-mode":"exactly","ha-params":3,"ha-sync-mode":"automatic"}'
        env:
        - name: RABBITMQ_ERLANG_COOKIE
          valueFrom:
            secretKeyRef:
              name: rabbitmq-config
              key: erlang-cookie
        ports:
        - containerPort: 5672
          name: amqp
        volumeMounts:
        - name: rabbitmq
          mountPath: /var/lib/rabbitmq
  volumeClaimTemplates:
  - metadata:
      name: rabbitmq
      annotations:
        volume.alpha.kubernetes.io/storage-class: anything
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi # make this bigger in production
The secret sauce is mostly in the postStart command. That does three things:

- Adds a new search domain to /etc/resolv.conf so that the StatefulSet short ordinal hostnames (e.g. rabbitmq-1) resolve to the other pods (see the before/after example below).
- Stops the local RabbitMQ node, joins the cluster, and starts it back up (if it doesn’t already appear to be a member).
- Sets an HA policy so each queue is mirrored across exactly 3 nodes and new mirrors sync automatically.
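To make the first item concrete: assuming the pods run in the default namespace, the sed command rewrites the search line in /etc/resolv.conf roughly like this (the exact domains depend on your cluster’s DNS setup):

# before
search default.svc.cluster.local svc.cluster.local cluster.local

# after
search rabbitmq.default.svc.cluster.local default.svc.cluster.local svc.cluster.local cluster.local

With the extra rabbitmq.default.svc.cluster.local search domain in place, a short hostname like rabbitmq-0 resolves to that pod’s stable DNS name behind the headless service.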
Now run kubectl describe service rabbitmq-management and look up the NodePort. Then connect to the public IP of one of your nodes on that NodePort in a web browser (you may have to open the port in your firewall first) and log in with username guest and password guest. This is the RabbitMQ management UI.
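For example (the port number is assigned by k8s; on minikube, the service subcommand will look up the URL and open it for you):

kubectl describe service rabbitmq-management | grep NodePort
minikube service rabbitmq-management # minikube only: opens the UI in a browser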
You should see all 5 nodes listed if the cluster… clustered.
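You can also check from the command line; all 5 nodes should show up under running_nodes in the output of cluster_status on any pod:

kubectl exec rabbitmq-0 -- rabbitmqctl cluster_status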
Now run kubectl delete pod rabbitmq-2 and watch what happens. The node will go red in the management UI, but k8s will start a new one, and it should re-join the cluster in short order. It will pick the same persistent volume for its /var/lib/rabbitmq directory as the pod you deleted. So as far as it and the cluster know, it is the same node, back in action. Nice.
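If you want to watch that happen, something like this works (with the manifest above, the claims are named rabbitmq-rabbitmq-0 through rabbitmq-rabbitmq-4):

kubectl get pods -w -l app=rabbitmq # watch the replacement pod come up
kubectl get pvc # the claims outlive the pods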