Deploy Docker Swarm on AWS EC2 via CloudFormation templates - Step 7 - Docker Swarm
In this step we will create a Docker Swarm cluster.
This post is part of a thread that includes these steps:
- Network Setup
- Storage
- Roles
- Manager Instance
- Worker Launch Template
- Worker Instances
- Docker Swarm (this post)
- Cleanup
Docker Swarm
Manager Machine
IMPORTANT: On the manager machine, port 2377 (Docker Swarm) must be open and reachable from the other nodes so they can join the swarm
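A quick way to check reachability from a worker node (a minimal sketch; assumes nc is installed, and manager.swift.internal is a hypothetical hostname following the worker-N.swift.internal naming used below -- substitute your manager's private IP or DNS name):
# From a worker node: verify the Swarm management port on the manager is reachable
nc -z -w 3 manager.swift.internal 2377 && echo "port 2377 reachable"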
We are using Docker Swarm. At least one machine must be designated as a "manager". The manager machine is where all cluster management operations are performed, including all stack deployments.
The manager machine on AWS is named manager.
Log in to the manager machine:
./ssh/ssh-manager.sh
Switch user to the worker user:
sudo su - worker
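To confirm the worker user can talk to the Docker daemon (a quick check; assumes the worker user was added to the docker group in an earlier step):
# Should print client and server versions without a permission error
docker version --format '{{.Client.Version}} / {{.Server.Version}}'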
Clean Slate
It is recommended to start from a clean Docker installation.
Warning: This will destroy all your images and containers. There is no way to restore them!
docker stop $(docker ps --all --quiet)
docker rm $(docker ps --all --quiet)
docker rmi $(docker images --quiet)
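If any of these commands complain about a missing argument, there was simply nothing to stop or remove. An alternative sketch using Docker's built-in cleanup (note that docker system prune does not stop running containers, so stop them first):
docker stop $(docker ps --all --quiet)
# Removes stopped containers, unused networks and, with --all, all unused images
docker system prune --all --force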
Docker Swarm
Init
Initialize Docker Swarm. This will make the machine the Swarm manager:
export dev=eth0
# Extract this host's IPv4 address on $dev: the 4th field of `ip -o` output
# is the CIDR address; cut strips the /prefix
export host_ip=`ip -4 -o addr show $dev | perl -lane 'print $F[3]' | cut -d/ -f1`
docker swarm init --advertise-addr $host_ip --default-addr-pool 192.168.0.0/16
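To confirm the swarm is initialized and this node is a manager (a quick check using docker info's format placeholders; LocalNodeState should be active and ControlAvailable true):
docker info --format 'swarm: {{.Swarm.LocalNodeState}}, manager: {{.Swarm.ControlAvailable}}'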
Manager Node
The node on which we ran docker swarm init is now the manager node. The manager node is also the first worker node.
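If you would rather keep workloads off the manager, you can drain it (optional; a sketch run on the manager itself, using the hostname as the node name):
# Stop scheduling new service tasks on this node; use `active` to undo
docker node update --availability drain $(hostname)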
Worker Nodes
To add another worker to the swarm, run this and follow the instructions:
docker swarm join-token worker
To remove a node from the swarm, log in to the node and run:
docker swarm leave
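The manager will still list a departed node with status Down. To delete the stale entry, run this on the manager (worker-2 is a placeholder node name):
docker node rm worker-2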
pssh
mkdir -p ~/nodes
cd ~/nodes
Install parallel-ssh
cd ~/nodes
git clone https://github.com/nplanel/parallel-ssh
cd parallel-ssh
python3 setup.py install --user
Install pssh
cd ~/nodes
git clone https://github.com/lilydjwg/pssh
cd pssh
pip3 install --user .
Simple test:
pssh \
--host=worker-1.swift.internal --option="StrictHostKeyChecking=no" \
--user=$USER --inline 'uptime'
Test that you can execute a simple command on all hosts:
cd ~/nodes
# for all nodes
printf "%s\n" worker-{1..2} > hosts
pssh \
--hosts=$HOME/nodes/hosts --option="StrictHostKeyChecking=no" \
--user=$USER --inline 'echo hi'
Add Nodes to Swarm
Run this on the manager machine. Copy the join command from the output:
docker swarm join-token worker
Paste the join command which should look similar to this:
docker swarm join --token <very-long-token> <ip-address>:2377
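You can also capture just the token and assemble the command yourself (a small sketch; --quiet prints only the token, and $host_ip is the manager address exported during init):
token=$(docker swarm join-token --quiet worker)
echo "docker swarm join --token $token $host_ip:2377"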
Use pssh (parallel ssh) to execute the join command on a large number of nodes:
pssh --hosts=$HOME/nodes/hosts --inline \
'docker swarm join --token <very-long-token> <ip-address>:2377'
Example (but note that your token will be different):
pssh --hosts=$HOME/nodes/hosts --option="StrictHostKeyChecking=no" --inline \
'docker swarm join --token SWMTKN-1-0nvgk6xtg0lcdcrb7a4mow6978g5tftb4wu5big52bbcpajq82-bq2j7uh6kos1s192h42a853so 10.0.10.175:2377'
Verify:
docker node ls
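All nodes should report STATUS Ready. For a compact view (a sketch using docker node ls format placeholders):
docker node ls --format '{{.Hostname}}: {{.Status}} ({{.Availability}})'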
Congratulations!
We are done with Step 7: Docker Swarm.
The next step is Step 8: Cleanup.