Technical Evolution and Practice of Mounting Host Volumes During Docker Build

Keywords: Docker Build | Host Volume Mount | BuildKit | RUN --mount | Multi-stage Build

Abstract: This article traces how mounting host volumes during Docker builds evolved, from the original limitations to native support in BuildKit. It analyzes the inherent constraints of the VOLUME instruction, optimization strategies with multi-stage builds, and the RUN --mount syntax in BuildKit. Code examples demonstrate how to mount cache directories and build context directories during builds, addressing practical scenarios such as package manager cache sharing and private repository access. The article also compares solutions from different historical periods as a technical reference for developers.

Technical Background and Problem Definition

In Docker-based development, mounting host volumes during the build phase is a long-standing requirement. While docker run -v /export:/export mounts a host directory at runtime, for a long time there was no equivalent at build time. The need stems mainly from two practical scenarios: first, sharing package manager cache directories (such as the apt-get download cache) across builds to avoid re-downloading dependencies each time; second, accessing large private code repositories on the host without performing git clone operations with SSH keys inside the Dockerfile.

Nature and Limitations of VOLUME Instruction

Early on, developers often misunderstood the VOLUME instruction, believing it could specify a host directory to mount. In reality, VOLUME only declares mount point paths inside the container and cannot name a source path on the host. With the classic (pre-BuildKit) builder, once a path has been declared with VOLUME, each subsequent RUN command gets a temporary anonymous volume at that path, pre-populated with content from the image and discarded when the command finishes. Only changes to the container filesystem outside the volume are committed to the image; changes made inside the volume by later build steps are not persisted.
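
To make the limitation concrete, here is a minimal sketch that assumes the classic builder. The /export path and the marker file are illustrative only and do not come from any particular project.

```dockerfile
FROM debian:latest
# VOLUME accepts only container-side paths; there is no host:container form
VOLUME /export
# Under the classic builder this write lands in a temporary volume
# and is not committed to the resulting image
RUN echo "built" > /export/marker
```

A container started from the resulting image would be expected to see an empty /export, which is why VOLUME cannot stand in for a build-time host mount.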

This design follows from one of Docker's core principles: portability. Allowing arbitrary host directories to be mounted during builds would compromise image portability, because build results would depend on the state of a particular host's filesystem. Docker's design favors builds that are deterministic and reproducible.

Evolution of Historical Solutions

In early Docker versions, the community attempted to address this issue through third-party tools. The Rocker project was a typical example, providing volume mounting capabilities during builds by extending Dockerfile syntax. Rocker supported mounting reusable volumes, sharing SSH keys, multi-stage builds, and other advanced features, once becoming a popular solution to this problem.

# Rocker syntax example (deprecated)
MOUNT /host/path:/container/path
RUN command_using_mounted_path

However, as the Docker ecosystem matured, the Rocker project announced discontinuation in early 2018. The official explanation stated that the container ecosystem had become sufficiently mature, with many of Rocker's key features now achievable through native Docker build tools or other established tools.

Optimization Strategies with Multi-stage Builds

The multi-stage build feature introduced in Docker 17.05 provided new approaches for optimizing the build process. While it doesn't directly solve host volume mounting, separating build and runtime environments effectively reduces final image size.

# Multi-stage build example
FROM debian:sid AS builder
COPY export /export
RUN compile_command_here >/result.bin

FROM debian:sid
COPY --from=builder /result.bin /result.bin
CMD ["/result.bin"]

In this pattern, the first stage (builder) can contain complete development environments and build tools, even utilizing large dependency libraries. The second stage contains only runtime-essential components, copying necessary build artifacts from the build stage via the COPY --from instruction. While this approach doesn't directly mount host volumes, it indirectly reduces reliance on external mounts by optimizing the build workflow.

Revolutionary Breakthrough with BuildKit

Docker 18.09 integrated BuildKit into docker build as an opt-in builder, removing the main technical barrier to mounts during builds. BuildKit redesigned Docker's build process and introduced pluggable Dockerfile frontends; for this topic, the most important resulting feature is the RUN --mount syntax. RUN --mount was at first available only through the experimental Dockerfile frontend and later became part of the stable docker/dockerfile:1 syntax.

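Enabling BuildKit Builds

To use BuildKit features, make sure Docker is version 18.09 or higher and enable BuildKit via an environment variable (from Docker Engine 23.0 onward BuildKit is the default builder, so this step is unnecessary):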

export DOCKER_BUILDKIT=1
docker build .

If BuildKit is not enabled, using the --mount option fails with an error: "the --mount option requires BuildKit".

Cache Directory Mounting Practice

The RUN --mount syntax supports several mount types (bind, cache, tmpfs, secret, and ssh), with type=cache designed for sharing cache directories between build commands and across builds:

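# syntax = docker/dockerfile:1
FROM debian:latest
# The official Debian image ships an apt hook (docker-clean) that deletes
# downloaded packages after each install; remove it so the cache is retained
RUN rm -f /etc/apt/apt.conf.d/docker-clean
# sharing=locked serializes concurrent builds that use the same apt cache
RUN --mount=target=/var/lib/apt/lists,type=cache,sharing=locked \
    --mount=target=/var/cache/apt,type=cache,sharing=locked \
    apt-get update \
 && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
      git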

This configuration lets apt-get package lists and downloaded packages be shared across builds, which can significantly improve build speed. Similar patterns apply to the package managers of various programming languages, such as Maven's $HOME/.m2 directory or Go's build cache under /root/.cache/go-build.
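
As a sketch of the same pattern for another toolchain, the Dockerfile below caches Go modules and the Go build cache. It assumes the official golang image (where GOPATH is /go) and a Go module at the root of the build context; the image tag and the /out/app output path are illustrative choices, not requirements.

```dockerfile
# syntax = docker/dockerfile:1
FROM golang:1.22
WORKDIR /src
COPY . .
# Module downloads and compiled packages survive across builds,
# but neither cache directory becomes part of an image layer
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    go build -o /out/app .
```

Because cache mounts are not stored in the image, the build must still succeed when a cache is empty, for example on a fresh build host.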

Build Context Directory Mounting

More importantly, RUN --mount supports mounting directories from the build context via type=bind:

# syntax = docker/dockerfile:1
FROM debian:latest
RUN --mount=target=/export,type=bind,source=export \
    process_export_directory_here...

This mounting approach has several important characteristics: first, the source directory must exist in the build context; second, the mount is read-only by default, and even when writes are enabled with the rw option they are discarded at the end of the step and never reach the host; finally, the mount doesn't map back to the build client. These restrictions keep the build process secure and reproducible.
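
The read-only behavior shapes how such a step should be written: read from the mount and write results elsewhere. The following is a minimal sketch, assuming an export directory exists in the build context; the archive name is hypothetical.

```dockerfile
# syntax = docker/dockerfile:1
FROM debian:latest
# Read from the bind mount and write the result to a path outside it,
# so the output is committed to the image layer
RUN --mount=type=bind,source=export,target=/export \
    tar -C /export -czf /export.tar.gz .
```

After this step the image contains /export.tar.gz, while the contents of the mounted directory itself are not stored in any layer.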

Security and Portability Considerations

BuildKit's mounting mechanism was designed with security in mind. Bind mount sources are limited to the build context (or to another build stage or image), never an arbitrary host path chosen by the Dockerfile, which prevents a malicious Dockerfile from accessing sensitive directories on the host. This closes off the scenario in which an attacker tricks a user into building an image in order to steal information from the host filesystem.

At the same time, this restriction upholds Docker's portability principle. Since bind-mounted content comes from the build context, build results don't depend on a specific host environment, so images can be built and run correctly in any Docker-supported environment.

Practical Application Scenario Analysis

Based on BuildKit's mounting capabilities, we can solve the two core problems raised at the outset:

Package manager cache sharing is handled with cache mounts:

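# syntax = docker/dockerfile:1
# Sharing multiple package manager caches
FROM ubuntu:20.04
# Remove the apt hook that deletes downloaded packages, so the cache is kept
RUN rm -f /etc/apt/apt.conf.d/docker-clean
RUN --mount=target=/var/cache/apt,type=cache,sharing=locked \
    --mount=target=/var/lib/apt/lists,type=cache,sharing=locked \
    apt-get update && apt-get install -y python3-pip

RUN --mount=target=/root/.cache/pip,type=cache \
    pip3 install numpy pandas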

Access to a large private repository is handled with a bind mount, provided the repository is placed inside the build context:

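# syntax = docker/dockerfile:1
# Accessing private code in build context
FROM alpine:latest
# The base Alpine image does not include make
RUN apk add --no-cache make
# rw lets make write inside the mount; those writes are discarded after this
# step, so copy any artifacts you need to a path outside /app/src
RUN --mount=target=/app/src,type=bind,source=src,rw \
    cd /app/src && \
    make build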
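
The bind mount also combines with the multi-stage pattern described earlier, so that source code is visible to the compiler without being copied into any layer. This is a sketch under stated assumptions: a hypothetical src/main.c in the build context, the official gcc image, and a program simple enough to link statically.

```dockerfile
# syntax = docker/dockerfile:1
FROM gcc:13 AS builder
# The source is only mounted for this step; the binary is written
# outside the mount so it persists in the builder stage
RUN --mount=type=bind,source=src,target=/src \
    gcc -O2 -static -o /app /src/main.c

FROM debian:stable-slim
# Only the compiled artifact reaches the final image
COPY --from=builder /app /app
CMD ["/app"]
```

The final image then contains neither the compiler nor the private source tree, only the binary copied from the builder stage.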

Technical Evolution Summary

From the lack of support in early Docker versions, through the third-party tool Rocker, to native support in BuildKit, the technical path for mounting host volumes during builds has come full circle. The current BuildKit-based solution balances security, portability, and functionality.

Developers can now confidently use the RUN --mount syntax to optimize build processes, enjoying performance benefits from mounting while ensuring Docker's core design principles remain intact. As the Docker ecosystem continues to evolve, we anticipate more advanced build features will be introduced, further simplifying and optimizing containerized application development workflows.

Copyright Notice: All rights in this article are reserved by the operators of DevGex. Reasonable sharing and citation are welcome; any reproduction, excerpting, or re-publication without prior permission is prohibited.