Everything You Need To Get Started With Docker
What is Docker, why use it, and how to do it correctly
Maybe you have heard about Docker but don't fully understand its purpose and inner workings. Yet a clear understanding of Docker is necessary if you don't want to waste time and resources.
In this article, I give you the knowledge you need to get started efficiently with Docker, plus resources to explore later, but you will still need practice to get there.
What is Docker?
From the Docker documentation, we learn:
Docker provides the ability to package and run an application in a loosely isolated environment called a container.
That's a great definition of what Docker actually is, but if you're just getting started, it might not be easy to understand. Instead, I would like to describe Docker (and containers) in opposition to virtual machines.
Virtual Machine
Maybe you have worked with virtual machines in the past; they are definitely easier to approach.
They can be seen as software that simulates (or virtualizes) a separate environment on a host machine. Virtual machines can interact with physical components (almost) directly and can be allocated dedicated processors or memory, for example.
To be more precise, virtual machines use a piece of software called a hypervisor to interact with the physical parts of your computer.
In the end, a virtual machine allows you to simulate a complete operating system, separated from everything else. On the other hand, this makes it heavy by nature, as it simulates a complete environment through a hypervisor (which adds overhead, even a lightweight one).
Container
A container (Docker is one provider, but there are alternatives) is slightly different. It achieves the same goal of virtualization but with another strategy.
Instead of building a complete virtual machine, a container can be seen as a package that contains everything needed for your application to work. That often includes a light operating system layer, runtime libraries, and your code.
Containers still run as separate processes but don't use a hypervisor. You can see the separation between containers as softer than that between virtual machines, which also makes them more performant.
To be more precise, instead of virtualizing physical components like virtual machines, containers only virtualize the operating system. They make use of the host operating system, its features, and its resources.
What problems does Docker solve?
Docker is a great solution to one modern challenge: automation. Because containers are lightweight, they can be built and deployed easily. This makes development, testing, and deployment faster, and hence cheaper.
Your applications often need a specific environment to run in. It includes an operating system (with a specific version), runtimes, libraries and frameworks, and sometimes more.
This is where the strength of containers, and Docker, is revealed. It makes it very easy to define the environment you need.
For example, setting up a new local environment becomes extremely fast and easy. Tests can run in the same environment, and so can deployments, which minimizes issues with your infrastructure.
Theory with Docker
To get started, there are three main concepts to understand with Docker:
Dockerfile
Image
Container
Container
Previously, we only talked about containers, which are the final building blocks. At this point, you should understand they are a running piece of software that packages your application with all the dependencies it needs.
Usually, you expose the application running inside your container to the host machine so it becomes usable. But how do we get there?
Image
If a container is a running instance of a package with everything you need, the previous step is to build that package.
Docker comes with the concept of an image, which is exactly what you would expect from a package: a binary with your application and all its dependencies.
Images are built on top of each other. For example, the most basic ones only provide a lightweight operating system like Ubuntu, Debian, or Alpine.
A more advanced Docker image is node, which includes the NodeJS runtime on top of Debian (by default). Typically, you would build your own image on top of this one and include your source code.
At this point, you might have one remaining question:
How do we define the content of an image and build it?
Dockerfile
Docker works with a configuration system based on a text file called a Dockerfile. The first example, taken from the introduction in the Docker documentation, is for a Go application:
# syntax=docker/dockerfile:1
FROM golang:1.20-alpine
WORKDIR /src
COPY . .
RUN go mod download
RUN go build -o /bin/client ./cmd/client
RUN go build -o /bin/server ./cmd/server
ENTRYPOINT [ "/bin/server" ]
The first FROM instruction defines the base image we are building on top of. Those images usually come from Docker Hub, where you can also upload your own images.
Then, the WORKDIR instruction defines the working directory, where we will run commands by default.
COPY is used to copy files from the host machine to the image, usually the source code.
As the name suggests, RUN is responsible for running commands on the image in the process of being built. In this example, go is used to install dependencies and build client and server applications, based on code copied previously.
Finally, ENTRYPOINT defines the default command used when starting a container based on this image. It's necessary, as you usually want to start your application with the container, and a container stops as soon as its main process exits.
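Note that this Dockerfile also builds a client binary. As a quick sketch, assuming the image was built and tagged as go-example (a name made up for this illustration), you could override the default entrypoint at run time to start the client instead of the server:
# go-example is a hypothetical image name used only for this illustration
docker container run --entrypoint /bin/client go-example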
You can find more references in the Docker documentation.
Using Docker
Now that you have a better understanding of Docker, you have to actually write Dockerfiles, build images, run containers, and more!
Setup
At this point, you will have to install the Docker engine and use its CLI.
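To verify the installation, you can run the following commands; the exact output depends on your platform and Docker version:
docker --version
docker info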
Please clone this project, which we will use in the next part. It's a simple NodeJS backend application that answers with "hello world" on the root path (/).
Create a Dockerfile
Inside the cloned repository, I want you to create a file called Dockerfile with the following content:
FROM node:18.18.0
WORKDIR /app
COPY . /app
RUN yarn
EXPOSE 8000
CMD ["yarn", "start"]
In the previous section, we already explained most of the instructions we use in a Dockerfile. There are a few differences here:
We are using an image for NodeJS with version 18.18.0.
We install dependencies with yarn (instead of using go).
We expose port 8000, which documents that the application inside the container listens on it (the port still needs to be published when running the container).
We start the app with yarn start.
Create an image
Now, you can open a console inside the same directory and run the command:
docker image build . -t hello-world
Here, we start with the docker command to access its CLI, image to access actions for images, and build to create the actual image.
It takes one argument: the path to the directory containing the Dockerfile to build into an image, "." for the current directory.
While it's not required, the "-t" flag defines the name (and optionally a :tag) for the generated image. You can have a look at the full documentation for image build as well as all the commands for image.
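For example, you could add an explicit tag to the name; here, 1.0 is just an arbitrary tag chosen for illustration:
# 1.0 is an example tag, not required by the rest of this article
docker image build . -t hello-world:1.0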
List images
Another useful command is to list images, thanks to the image ls command:
docker image ls
It's not the command you will use the most, but keep in mind it exists, mainly if you want to clean up the disk space taken by Docker.
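For example, once you no longer need the hello-world image built above (and no container uses it anymore), you can remove it to reclaim disk space:
docker image rm hello-world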
Introduction to containers
While you can create and start a container separately, a more common practice is to use run to do both at the same time.
We need to define which image to run our container from, but also which port on the host machine maps to which port on the container.
We exposed port 8000 in the Dockerfile; we can reuse it and map it to the same port 8000 on the host machine, thanks to the "-p" flag:
docker container run -p 8000:8000 hello-world
That's it, your container should be up and running. You can open http://localhost:8000 in your browser and should see a "hello world" message.
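If you prefer the command line, and assuming curl is installed on your machine, you can check the same thing with:
curl http://localhost:8000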
List containers
From another console, you can list containers. There are multiple commands to do it; the official one is:
docker container ls
But the truth is, you can also use:
docker container ps
This can be shortened to the most commonly used form:
docker ps
No matter which exact command you are using, you can add the "-a" flag to list all containers, including the stopped ones:
docker ps -a
Stop a container
Among the information displayed when listing containers, you have access to the container ID.
After finding the ID of the currently running container, you can use the stop command:
docker container stop [CONTAINER ID]
Start an existing container
You can use the start command:
docker container start [CONTAINER ID]
Interact with a container
During your day-to-day work with Docker, you often need to interact with your containers. It can be to see what's going on, move files around, etc.
You can use exec to run a single command on a container:
docker container exec [CONTAINER ID] [COMMAND]
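For example, assuming the container from the previous sections is running and a1b2c3 stands for its ID (a made-up placeholder), you could list the files copied into its /app directory:
# replace a1b2c3 with your actual container ID
docker container exec a1b2c3 ls /app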
For more complex use cases, it's possible to keep an interactive console attached to your container.
It's done thanks to the interactive and tty flags (found in the exec documentation). To be more precise, you can open an interactive console using the basic sh command.
The interactive and tty flags can be shortened to "-it":
docker container exec -it [CONTAINER ID] sh
To leave the interactive console, you can use Ctrl + D.
Free disk space
If you are working heavily with Docker, but don't have a huge amount of available disk space, you might use it all.
Keep in mind you have access to a prune command for images and containers, which can come in handy.
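As a quick sketch of what this looks like, the following commands remove stopped containers, dangling images, and (with the last one) most unused Docker resources at once; double-check what each removes before running them:
docker container prune
docker image prune
docker system prune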
Docker Compose
While using Docker by itself is a great start, I bet you will very quickly need Docker Compose.
If Docker allows you to manipulate images and containers, docker-compose makes it possible to manage multiple containers and make them work together.
Setup
From a general perspective, Docker Compose works with a configuration file that defines how your containers interact with each other. If you have Docker installed, you should also have Docker Compose and its CLI.
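You can check that Docker Compose is available with either of the following commands, depending on whether you use the standalone binary or the Docker CLI plugin:
docker-compose version
docker compose version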
For this exercise, please clone a todo backend made with NodeJS. It needs a PostgreSQL database, which means we also need to run migrations with Prisma.
Dockerfile
We will use a slightly different Dockerfile for this example:
FROM node:18.18.0
RUN apt-get update && apt-get install -y tini
WORKDIR /app
COPY . /app
RUN yarn
EXPOSE 8000
ENTRYPOINT ["/usr/bin/tini", "--"]
Here, we are installing and using tini, which acts as a minimal init process inside the container: it forwards signals to your application and reaps zombie processes.
We are not starting the app here, but that can be configured inside the docker-compose file. It's necessary, as starting the app requires access to a PostgreSQL database.
Compose configuration
While there are more advanced features inside Docker Compose (and Docker, such as networks and volumes), we will focus on services.
While services are more complex in reality, you can see each of them as a definition for a container. Here, we need two containers:
One with a PostgreSQL database
Another with our application, which depends on the first
Good news: there is an official postgres image, which can be used directly. We can define the first service with this image:
services:
  postgres:
    image: postgres:alpine
    restart: always
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
    ports:
      - '5432:5432'
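At this stage, assuming this configuration is saved as docker-compose.yml in the repository, you could already start the database service on its own:
docker-compose up postgres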
Then, we can add a service for our hello-world application and configure it with its Dockerfile. It can be done using the build property, which takes the path to the directory containing the Dockerfile:
services:
  postgres:
    image: postgres:alpine
    restart: always
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
    ports:
      - '5432:5432'
  hello-world:
    build: .
Unfortunately, that's not enough; we still need to:
Configure the necessary environment variables.
Map the container port to a host machine port.
Start the application, which needs access to the postgres container.
services:
  postgres:
    image: postgres:alpine
    restart: always
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
    ports:
      - '5432:5432'
  hello-world:
    build: .
    environment:
      - DATABASE_URL=postgres://postgres:postgres@postgres/express-todo
      - PORT=8000
      - NODE_ENV=development
    depends_on:
      - postgres
Here, we configured the environment variables, using the username and password postgres, as defined in the first service.
We also use postgres as the hostname for the database; Docker Compose provides this hostname because the service itself is called postgres.
We set PORT to 8000 and NODE_ENV to development because our application needs them, but the most important part is not environment, it's the depends_on property.
It defines the dependency relationships between services, hence which services are started when another one is, and in which order.
Finally, we can add the startup command:
services:
  postgres:
    image: postgres:alpine
    restart: always
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
    ports:
      - '5432:5432'
  hello-world:
    build: .
    environment:
      - DATABASE_URL=postgres://postgres:postgres@postgres/express-todo
      - PORT=8000
      - NODE_ENV=development
    ports:
      - '8000:8000'
    command: >
      sh -c "yarn prisma migrate deploy
      && yarn build
      && yarn start"
    depends_on:
      - postgres
With this syntax, we can define a command over multiple lines. Here, we are running migrations with Prisma, building the app, and starting it.
Starting services
You can find the complete list of Docker Compose commands in the documentation. What we need right now is simply up:
docker-compose up
Once the application starts, you should be able to access http://localhost:8000/todo, which returns an empty list for now.
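Again, assuming curl is available, you can verify it from the command line:
curl http://localhost:8000/todo
When you are done, docker-compose down stops and removes the containers created by up.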
Good practices
This article was a good introduction, but I definitely recommend you have a look at an article on performance and, more generally, on good practices with Docker.