From 0 to Docker
Recently, I took some time to learn about Docker.
A single Docker image can be reused to provision many containers. An image can be thought of like a DVD: you can use it on multiple devices and the same movie plays.
A Dockerfile is a text document with the sequential instructions necessary to build a Docker image.
An application along with its dependencies is packaged in a container. A container is a running, paused or stopped instance of an image. Although you can run more than one process in a container, containers are designed to run a single process. Containers are also designed to be ephemeral, and the data stored inside a container is ephemeral as well.
A registry is a repository for images. The most popular example of a registry is Docker Hub. I think of Docker Hub like I think of the NPM registry. Instead of NPM packages you can pull container images.
A tag allows you to add a unique version identifier to an image name.
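For example, pulling a specific tagged version of an image instead of the default latest tag, and tagging a local image for a registry (the `my_app` and `my_registry` names here are illustrative):

```shell
# Pull a specific version rather than whatever "latest" currently points to
docker pull ubuntu:20.04

# Tag a local image for pushing to a registry (repository/name:tag)
docker tag my_app my_registry/my_app:1.0.0
```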
While experimenting with Docker I’ve been using WSL2 on my Windows 10 machine to run Ubuntu 20.04. These are the installation steps I’ve followed to get Docker up and running.
Make sure to start the docker service before running the sample image.
$ sudo service docker start
$ sudo docker run hello-world
This runs one instance (container) of the hello-world image.
I followed the Linux post install steps to avoid having to run sudo every time I run a docker command.
$ sudo groupadd docker
$ sudo usermod -aG docker $USER
Log out and back in for the group change to take effect.
You can use the container’s name in many (maybe all?) of these commands. For the sake of brevity I’ll write container when I mean the container’s ID or name.
To provision a container use the docker run command:
$ docker run image
To name your container instead of using the randomly generated name:
$ docker run --name cool_name image
Note: You can’t have two containers deployed with the same name.
Use the -e option to add environment variables.
$ docker run -e MY_VAR=myValue image
Containers are isolated, so you need to explicitly expose them to the outside world. With the docker run command we can use the --expose option to expose a port. You can also map, or publish, an exposed port to a port on your host system with the -P option or with -p host_port:container_port. I had to use -P in order to properly connect to my Ubuntu localhost from my Windows browser when using nginx.
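For example, to publish nginx's port 80 on host port 8080 (note that the -p mapping is host_port:container_port; 8080 and the name web are arbitrary choices here):

```shell
# Run nginx detached, publishing container port 80 on host port 8080
docker run -d --name web -p 8080:80 nginx

# The server should now respond on the host
curl http://localhost:8080
```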
$ docker run -P nginx
See which containers are running:
$ docker ps
See all containers (running, paused and stopped):
$ docker ps -a
You can use the smallest number of characters that uniquely identifies a container. For example, if you’re only running one container and it has an ID of d64c42228850, you can use just d to identify the container when running docker commands.
Get detailed info about a container:
$ docker inspect container
Some other useful commands for managing containers are listed below.
$ docker pause container
$ docker unpause container
$ docker stop container
Unlike docker stop, the docker kill command will stop the container, but it kills the process instead of gracefully shutting it down.
$ docker kill container
Stopped containers remain in the system and take up space. To remove a container (it must be stopped before you can remove it):
$ docker rm container
To automatically remove the container when it exits:
$ docker run --rm image
To stop all running containers in bash:
$ docker stop $(docker ps -q)
To remove all exited containers in bash:
$ docker rm $(docker ps -q -f status=exited)
If all containers are stopped you can remove them all with one command.
$ docker rm $(docker ps -aq)
The docker exec command executes a command against a running container.
$ docker exec container ls /
To get into an interactive bash shell inside a container:
$ docker exec -it container /bin/bash
This is nice for trying out commands against a container before adding them to a Dockerfile, or for debugging a running container. We can use the exit command or Ctrl + d to disconnect.
To inspect container logs:
$ docker logs container_id
We can also add -f to tail the logs.
To list local images:
$ docker images
Get new images:
$ docker pull image_name
To remove docker images:
$ docker rmi image_name
Note: to remove an image, all associated containers (running, paused or stopped) must first be removed.
To tag images:
$ docker tag image new_image
To push images to a repository:
$ docker login
$ docker push image
We can map a container directory to a directory on the host machine. This will allow data saved to the specified directory in the container to remain in the directory on the host machine when the container is removed. Likewise, changes made on the host machine will be reflected in the container.
$ docker run -itd -v ~/local_relative_dir:/container_dir ubuntu
Note: I had to use the absolute path to get this to work properly with WSL 2.
Another option is to use volume mounts that are managed by Docker. Docker Volumes are the preferred way to persist data. You can mount more than one volume inside a container and you can mount any number of containers to a single volume.
To create a volume:
$ docker volume create my_volume
To list volumes:
$ docker volume ls
To map a volume:
$ docker run -itd -v my_volume:/container_dir ubuntu
To find the path to the volume on your local host machine:
$ docker volume inspect my_volume
In the resulting JSON, the value of “Mountpoint” is the path to the volume. This path can be configured.
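The output looks roughly like this (the timestamp and path will differ on your machine):

```json
[
    {
        "CreatedAt": "2021-01-01T00:00:00Z",
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/lib/docker/volumes/my_volume/_data",
        "Name": "my_volume",
        "Options": {},
        "Scope": "local"
    }
]
```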
A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image.
There are several Dockerfile instruction options.
FROM - Specify the image to base your image off of. Multiple FROM instructions designate a multi-stage docker build.
RUN - Run a Linux command.
ADD - Copy files into the container from the host machine or from a remote URL.
ENV - Set an environment variable.
EXPOSE - Expose a port.
WORKDIR - Set the working directory.
USER - Set the user or group to use when running the container.
VOLUME - Create a mount point with the specified name and mark it as holding externally mounted volumes from the host or other containers.
ENTRYPOINT - Set how to enter, or start, the application.
CMD - Set the default command for running a container. It can also be used to append default arguments to the ENTRYPOINT command if one exists. If you list more than one CMD, only the last CMD takes effect. A command passed inline will override the default in the Dockerfile (docker run image [CMD]).
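A sketch of how ENTRYPOINT and CMD interact (the image contents here are illustrative):

```dockerfile
FROM ubuntu
# ENTRYPOINT is the fixed part of the command
ENTRYPOINT ["echo"]
# CMD supplies default arguments appended to the ENTRYPOINT
CMD ["hello"]
```

Running this image with no arguments executes echo hello; running docker run image goodbye overrides the CMD and executes echo goodbye.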
LABEL - Add metadata.
See the Dockerfile reference documentation for a full list of all the options.
Once we have our Dockerfile we can use docker build to create an automated build that executes the Dockerfile’s instructions in succession. If you run the build command more than once, it will only build the delta between build runs. However, you can use the --no-cache flag to rebuild everything instead.
Here’s a simple example of building an image using a Dockerfile.
Script file (hello-world.sh)
#!/bin/bash
echo "Hey! We did some docker stuff!"
Dockerfile
FROM ubuntu
RUN apt update -y && apt upgrade -y
WORKDIR ~/script
COPY hello-world.sh .
RUN chmod +x hello-world.sh
CMD './hello-world.sh'
$ docker build -t hello_world .
$ docker run hello_world
Multi-stage builds are useful for ending up with smaller image sizes. Here’s an excerpt from the Docker docs for multi-stage builds:
One of the most challenging things about building images is keeping the image size down. Each instruction in the Dockerfile adds a layer to the image, and you need to remember to clean up any artifacts you don’t need before moving on to the next layer. To write a really efficient Dockerfile, you have traditionally needed to employ shell tricks and other logic to keep the layers as small as possible and to ensure that each layer has the artifacts it needs from the previous layer and nothing else.
It was actually very common to have one Dockerfile to use for development (which contained everything needed to build your application), and a slimmed-down one to use for production, which only contained your application and exactly what was needed to run it. This has been referred to as the “builder pattern”. Maintaining two Dockerfiles is not ideal.
With multi-stage builds, you use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base, and each of them begins a new stage of the build. You can selectively copy artifacts from one stage to another, leaving behind everything you don’t want in the final image. To show how this works, let’s adapt the Dockerfile from the previous section to use multi-stage builds.
Here’s an example of cutting down image size using a multi-stage build.
Before, a single-stage build:
FROM golang:alpine
WORKDIR /fizzbuzz
COPY . .
RUN go build -v -o FizzBuzzApp
CMD ["./FizzBuzzApp"]
After, a multi-stage build:
FROM golang:alpine AS build-step
WORKDIR /fizzbuzz
COPY . .
RUN go build -v -o FizzBuzzApp
FROM alpine
WORKDIR /app
COPY --from=build-step /fizzbuzz/FizzBuzzApp .
CMD ["./FizzBuzzApp"]
In addition to multi-stage builds, we can leverage layers to boost performance. Layers are intermediate changes in an image. Each instruction in a Dockerfile creates a new layer. Layers can be cached and reused for a performance benefit.
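For example, in a Node.js app (assuming the conventional package.json layout), copying the dependency manifests and installing before copying the rest of the source means a code change doesn’t invalidate the cached dependency layer:

```dockerfile
FROM node:14
WORKDIR /app
# These layers are rebuilt only when the package files change
COPY package.json package-lock.json ./
RUN npm install
# A source change invalidates the cache only from this point down
COPY . .
CMD ["node", "index.js"]
```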
$ docker network ls
$ docker network create my_network
$ docker network inspect my_network
$ docker network rm my_network
You can’t remove a network with running containers associated with it.
Containers cannot speak across networks, only within them. To run a container within a specified network:
$ docker run -d --network=my_network nginx
Note: Since the container ID is also the host name of the container you can use it to ping, etc.
Containers can communicate via IP addresses or hostnames within a network (unless they have a network type of none). The network types are:
Bridged - Default network created by Docker to allow connection between Docker networks and the host machine.
Host - Network config for the container as if it was the host machine with no port mapping required.
None - No networking outside of the container for true isolation.
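As a sketch, two containers on the same user-defined network can reach each other by container name (the names web and my_network are arbitrary):

```shell
docker network create my_network
docker run -d --name web --network=my_network nginx

# From another container on the same network, "web" resolves as a hostname
docker run --rm --network=my_network busybox ping -c 1 web
```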
Docker Compose is used for managing more than one container.
Compose is a tool for defining and running multi-container Docker applications. With Compose, you use a YAML file to configure your application’s services. Then, with a single command, you create and start all the services from your configuration.
This hands-on example voting application uses Docker Compose.
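As a sketch, a minimal docker-compose.yml for a web service backed by Redis might look like this (the service names and ports are illustrative):

```yaml
version: "3"
services:
  web:
    build: .
    ports:
      - "8080:80"
    depends_on:
      - redis
  redis:
    image: redis:alpine
```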
To get all the images defined in the docker-compose.yml file:
$ docker-compose pull
To build everything in the docker-compose.yml file:
$ docker-compose build
To start Compose and run the entire application:
$ docker-compose up
To run in background:
$ docker-compose up -d
Some other useful commands:
$ docker-compose ps
$ docker-compose logs -f
$ docker-compose down
Unlike stop, the down command also removes the containers and the networks Compose created.
Compose won’t help with things like load balancing. Orchestrators are good for things like this. Container orchestration is the automated process of managing or scheduling the work of individual containers for applications. Some popular orchestrators:
- Amazon ECS
- Docker Swarm
- Red Hat OpenShift
The general strategy for taking an existing application and incorporating Docker looks something like this.
- Identify which base image you need
- Identify which tools/apps/runtimes to install in the container
- Identify what you’ll need to copy to the container (if anything)
- Get the app working
- Then worry about data persistence
- Identify opportunities for configuration (env variables, config file(s), scripts, etc)
- Make the image as small as possible
- Build time optimization (via cached layering, etc)
- Add logging
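As a rough sketch, here’s how several of those steps might come together in a Dockerfile for a Go app (the paths and names are hypothetical):

```dockerfile
# Base image choice: a small build image
FROM golang:alpine AS build
WORKDIR /src
COPY . .
RUN go build -o app

# Keep the final image small: copy only the compiled binary
FROM alpine
WORKDIR /app
COPY --from=build /src/app .
# Opportunity for configuration via environment variables
ENV APP_ENV=production
CMD ["./app"]
```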