Stop Using Insecure and Inefficient Dockerfiles

How to improve your Dockerfile (an example for NodeJS)

Far too often, I see terrible Dockerfiles in heavy use. They have a negative impact on security, productivity, and overall cost.

Today, I want to show you how to improve from a basic to a very efficient Dockerfile, step by step.

Getting Started

The starting point is a very short Dockerfile, which has the advantage of being simple while working perfectly fine (a sketch follows the list below). It does the job of:

  • Using the latest version of the official NodeJS image as a base

  • Defining a working directory (/app)

  • Copying source files

  • Installing dependencies

  • Exposing a port (8000)

  • Running the app (yarn start)
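
The original snippet isn’t reproduced here, but a minimal sketch of what such a Dockerfile looks like (assuming a yarn-based NodeJS project):

    # Latest version of the official NodeJS image
    FROM node:latest

    # Working directory for the app
    WORKDIR /app

    # Copy sources, then install dependencies
    COPY . .
    RUN yarn install

    # Expose the port and start the app
    EXPOSE 8000
    CMD ["yarn", "start"]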

But is it efficient? Obviously not. If it were, there would be no point to the article you’re about to read.

Parent Image

Here, there are three important points.

Using a fixed version

Maybe it seems like a good idea to use the latest version of an image as a base (to stay up-to-date), but it’s not. In practice, it’s often the source of issues.

If you use the latest version of an image (the latest tag), you don’t control when it gets updated. Sooner or later, a new version of the image will be published that is incompatible with your app.

Between two builds, your application will suddenly go from working to failing, without any apparent reason.

That’s why we prefer using a fixed version instead of “latest”. But that also means you have the responsibility to update your base images from time to time.
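
For example (the version number is purely illustrative; pin whichever version you have actually tested against):

    # Unpredictable: this tag moves with every new release
    FROM node:latest

    # Predictable: you decide when to upgrade
    FROM node:18.19.0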

Long-Term-Support

Usually, software is released with specific LTS (Long-Term-Support) versions. Creators provide better support for those versions, so they should be preferred.

NodeJS provides a list of LTS versions on its official website.

Alpine

Docker images are built on top of a given distribution, such as Ubuntu or Debian. Among those distributions is Alpine.

It’s known for being very lightweight compared to others and is a huge help in keeping an efficient image.
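
Putting the three points together, the FROM line ends up looking something like this (the exact tag is illustrative; check Docker Hub for the current pinned LTS Alpine tags):

    # A fixed LTS version of NodeJS on a lightweight Alpine base
    FROM node:18.19.0-alpine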

Lower Privileges

The default user is root, which has unlimited access. For security, it’s a better idea to run your app as a user with limited privileges.

Fortunately, with the node image comes a user called… node.
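
Switching to it is a single instruction:

    # Everything after this line runs as the unprivileged node user
    USER node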

Working Directory

The working directory is the directory that instructions such as RUN, COPY, and CMD use by default. It’s a good idea to define a specific one for your app.

The most common practice is to use /usr/src/app.
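
Defining it is also a single instruction:

    # Subsequent RUN, COPY, and CMD instructions execute relative to this directory
    WORKDIR /usr/src/app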

Caching

Docker builds with a caching system, which is often ignored. You can think of each instruction in a Dockerfile as producing a layer, which is cached.

When one layer changes, all subsequent layers are invalidated. When the image is rebuilt, instead of retrieving those layers from the cache, Docker re-executes their commands.

One layer changes frequently: the source code. So it’s best to copy the source code as late as possible.

The most common mistake is to copy the list of dependencies at the same time as the source code, then install the dependencies.

In this case, every time the source code changes and the image is rebuilt, the dependencies will be reinstalled. That’s why we prefer to copy the files defining the dependencies first, then install them, and finally copy the source code.
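
A sketch of that order, again assuming a yarn-based project:

    # Dependency manifests first: this layer rarely changes
    COPY package.json yarn.lock ./

    # Cached until package.json or yarn.lock changes
    RUN yarn install

    # Source code last: only the layers from here on are invalidated by code changes
    COPY . .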

Configure the Working Directory

Previously, we defined a new working directory and set up a user with limited privileges. That means we need to create the directory first and give the node user the necessary access.
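
A sketch of that setup (using the paths chosen above; adjust them to your project):

    # Create the app directory and hand it over to the node user
    RUN mkdir -p /usr/src/app && chown -R node:node /usr/src/app

    WORKDIR /usr/src/app
    USER node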

Update Package List

With Alpine, we aren’t using APT, but APK to manage packages. No matter which tool you use to manage packages, you need to update its list from remote repositories.

That way, you ensure no out-of-date packages are installed, which is essential to avoid packages with security and performance issues.
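
With APK, that’s a single instruction (though, as the “Multiple RUN” section below explains, it shouldn’t sit in a layer of its own):

    # Refresh the package index from the remote repositories
    RUN apk update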

Adding Packages

I recommend using the --no-cache flag to avoid generating a package cache you won’t use but that would still make your image heavier.
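
Conveniently, --no-cache also fetches a fresh package index, so the update and the install collapse into one instruction (Tini, introduced later in this article, serves as the example package):

    # Fetch a fresh index, install the package, and skip writing a local cache
    RUN apk add --no-cache tini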

Multiple RUN

In Docker’s Best practices for writing Dockerfiles, we learn to:

always combine RUN apt-get update with apt-get install in the same RUN statement.

Using apt-get update alone in a RUN statement causes caching issues and subsequent apt-get install instructions to fail.
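
The recommended form, for APT-based images (the package here is just an example):

    # Update and install in the same layer, so the index can never be stale
    RUN apt-get update && apt-get install -y curl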

For more information, I highly recommend you give best practices a read.

Entrypoint

I like to keep a basic Dockerfile that doesn’t start my application directly. Instead, I use Tini, a tiny init process that keeps the container alive, forwards signals properly, and reaps zombie processes.
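
On Alpine, where the APK package installs Tini’s binary at /sbin/tini, the entrypoint looks like this:

    # Tini becomes PID 1: it forwards signals and reaps zombie processes
    ENTRYPOINT ["/sbin/tini", "--"]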

Then, you can either use multi-stage builds to augment your basic stage, or start and configure your application from outside the Dockerfile.

Locally, you can use the Docker CLI. And when you deploy your app, any container management system allows you to configure startup commands.
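
For example, starting the app locally with the command supplied at run time (the image name and command are illustrative):

    docker run --rm -p 8000:8000 my-app yarn start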

That way, you can have a single Dockerfile that is used in different configurations, instead of multiple configurations with barely any difference.

It becomes the source of truth about which environment your app runs in.

Final Result

Now, you might be wondering: what’s the difference?

Apart from better security and faster rebuilds, one of my real-world projects is 1GB lighter with the improved image.
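
For reference, here is a sketch of the improved Dockerfile with all the steps of this article combined (the tag, paths, and port are the illustrative values used above):

    # Fixed LTS version on a lightweight Alpine base
    FROM node:18.19.0-alpine

    # Fresh index, Tini installed, no package cache left behind
    RUN apk add --no-cache tini

    # App directory owned by the unprivileged node user
    RUN mkdir -p /usr/src/app && chown -R node:node /usr/src/app
    WORKDIR /usr/src/app
    USER node

    # Dependencies first, so this layer stays cached across code changes
    COPY --chown=node:node package.json yarn.lock ./
    RUN yarn install

    # Source code last: the most frequently changing layer
    COPY --chown=node:node . .

    EXPOSE 8000

    # Tini as PID 1; the start command (e.g. yarn start) is supplied at run time
    ENTRYPOINT ["/sbin/tini", "--"]

(The --chown flag on COPY keeps the copied files owned by the node user, a detail the earlier sketches glossed over.)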


Do you want to learn more backend skills that you can effectively use in a professional environment?


Cover photo by Thais Morais on Unsplash