Docker is a powerful ecosystem that revolutionized software development quite a while ago. If we scope the discussion to build pipelines, leveraging Docker-based agents to offload build workloads comes with a lot of advantages. Black-boxing build environments brings flexibility and scalability, but sacrifices a bit of control: it becomes harder to intervene during the build workflow. And eventually we end up with a mixed feeling: leveraging Docker is great, but it would be even greater if I could build more modular pipelines mixing local processing and containerized processing.
In this post we will showcase how to address hybrid scenarios such as:
- I would like to offload the build workload to Docker, but the image is not the expected outcome. Instead, I would like to retrieve the intermediate build artifacts to shape and ship them on my own.
- I would like to craft an image that involves heterogeneous content which cannot be fully built sequentially by a single containerized pipeline.
Said differently, I would like to navigate back - and ideally forth - inside the Docker build system.
Let’s see how one could achieve such cases.
Multi-stage builds
We have been using multi-stage Docker builds for years now. It is a very powerful tool to sequence a build pipeline while benefiting from the Docker caching algorithm at its best and enforcing sharpened outcome images.
Assume you are building a .NET application. You are likely to end up with a two-stage pipeline, namely `build` and `final` below, and a `COPY` instruction to hand over the artifacts.
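A minimal sketch of such a two-stage `Dockerfile` is shown below; it assumes a framework-dependent publish of a hypothetical `daedalus` project, while the fully optimized restore and publish options are detailed later in this post.

```dockerfile
# build stage: compile and publish with the full SDK
FROM mcr.microsoft.com/dotnet/sdk:7.0-alpine AS build
WORKDIR /app
COPY . .
RUN dotnet publish -c Release -o out

# final stage: only ships the published output, not the SDK
FROM mcr.microsoft.com/dotnet/runtime:7.0-alpine AS final
WORKDIR /app
COPY --from=build /app/out/ .
ENTRYPOINT ["dotnet", "daedalus.dll"]
```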
By doing so, we ensure the `final` image will contain neither the SDK nor the intermediate build artifacts, decreasing both its size and its vulnerability exposure.
Note that you can also scale this approach to combine and aggregate multiple sources. Assume you are now building a modular .NET application. You are likely to end up with a three-stage pipeline, namely `build1`, `build2` and `final` below, and multiple `COPY` instructions to hand over the artifacts.
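The shape of such a pipeline could look like the sketch below, assuming two hypothetical modules `moduleA` and `moduleB` aggregated by the `final` stage.

```dockerfile
# first swimlane
FROM mcr.microsoft.com/dotnet/sdk:7.0-alpine AS build1
WORKDIR /app
COPY moduleA/ .
RUN dotnet publish -c Release -o out

# second swimlane, independent from the first one
FROM mcr.microsoft.com/dotnet/sdk:7.0-alpine AS build2
WORKDIR /app
COPY moduleB/ .
RUN dotnet publish -c Release -o out

# final stage aggregates both outputs
FROM mcr.microsoft.com/dotnet/runtime:7.0-alpine AS final
WORKDIR /app
COPY --from=build1 /app/out/ ./moduleA/
COPY --from=build2 /app/out/ ./moduleB/
# entry point depends on the application layout
ENTRYPOINT ["dotnet", "moduleA/daedalus.dll"]
```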
Once again, the outcome image is tailored for your purpose. By splitting the build in two, you also benefit from Docker caching, which only rebuilds what needs to be rebuilt. Last but not least, you gain parallelization for free thanks to the Docker dependency algorithm: as `build1` and `build2` are independent, they can be processed in parallel.
This approach obviously shines when you have to aggregate heterogeneous content. Materializing dedicated swimlanes allows for contextual optimization.
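As an illustration of such heterogeneous swimlanes, and not the original example, one could imagine mixing a Node.js front end with a .NET back end, each stage picking the base image and caching strategy that fits it best (directory and project names are made up):

```dockerfile
# front-end swimlane: node-specific tooling and caching
FROM node:20-alpine AS build_front
WORKDIR /app
COPY front/package*.json ./
RUN npm ci
COPY front/ .
RUN npm run build

# back-end swimlane: .NET-specific tooling and caching
FROM mcr.microsoft.com/dotnet/sdk:7.0-alpine AS build_back
WORKDIR /app
COPY back/ .
RUN dotnet publish -c Release -o out

# final stage aggregates both worlds
FROM mcr.microsoft.com/dotnet/aspnet:7.0-alpine AS final
WORKDIR /app
COPY --from=build_back /app/out/ .
COPY --from=build_front /app/dist/ ./wwwroot/
```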
Multi-stage building is a Docker best practice and should be applied early in your process, as it greatly optimizes the overall pipeline.
Keep in mind that optimization must be propagated to the inner level as well:
- By materializing costly steps such as dependencies fetching:

```dockerfile
# -- -- -- -- -- -- -- -- -- -- -- -- -- --
FROM mcr.microsoft.com/dotnet/sdk:7.0-alpine AS build
WORKDIR /app

# Restore in dedicated step to benefit from cache
COPY *.csproj .
RUN dotnet restore \
    -r linux-musl-x64

# Shrink artifact
COPY . .
RUN dotnet publish \
    --no-restore \
    -o out \
    -c Release \
    -r linux-musl-x64 \
    --self-contained true \
    -p:PublishSingleFile=true \
    -p:IncludeNativeLibrariesForSelfExtract=true \
    -p:PublishTrimmed=True \
    -p:TrimMode=Link \
    /p:DebugType=None \
    /p:DebugSymbols=false
```
- By leveraging tailored images:

```dockerfile
# -- -- -- -- -- -- -- -- -- -- -- -- -- --
# Start from uncluttered image..
FROM alpine:3.18.4 AS final
WORKDIR /app

# .. comes with a price
# https://learn.microsoft.com/en-us/dotnet/core/install/linux-alpine#dependencies
RUN apk add \
    bash \
    icu-libs \
    krb5-libs \
    libgcc \
    libintl \
    libssl1.1 \
    libstdc++ \
    zlib \
    --no-cache

COPY --from=build /app/out/ .

ENTRYPOINT ["./daedalus"]
```
Buildx engine
Buildx provides extended build capabilities through BuildKit and should be favored over the plain old docker build. As they both share the same core API, the switch is seamless: you only have to move from `docker build` to `docker buildx build`.
Options we are looking at in this post are:
- The output one, which comes with different flavors, including a `local` type to write all result files from an image to a directory on the client side.

```bash
docker buildx build \
  --output docker_out_win \
  .
```
- The target one, which provides the ability to target a specific stage and scope the build pipeline accordingly.

```bash
docker buildx build \
  --target staging \
  .
```
Hybrid scenarios
Artifactory
Here we would like to release not only - or not at all - our Docker image, but also the plain old matching executables, for both Windows and Linux. The idea is still to offload the build workload to a containerized pipeline, but also to fetch the resulting executable artifacts so that we can upload them downstream to an artifact repository, e.g. Artifactory.
We slightly amend our `Dockerfile` to accommodate the new OS needs by leveraging the built-in `ARG` instruction to provide OS switching. A fallback value is provided to ensure we can still build the original pipeline.
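A possible way to wire the switch, assuming a `RUNTIME` build argument (the name is ours) that defaults to the original `linux-musl-x64` runtime identifier:

```dockerfile
FROM mcr.microsoft.com/dotnet/sdk:7.0-alpine AS build
# Fallback value keeps the original pipeline buildable as-is
ARG RUNTIME=linux-musl-x64
WORKDIR /app

COPY *.csproj .
RUN dotnet restore -r $RUNTIME

COPY . .
RUN dotnet publish \
    --no-restore \
    -o out \
    -c Release \
    -r $RUNTIME \
    --self-contained true \
    -p:PublishSingleFile=true
```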
And we now have the ability to provide a specific value from the `buildx` command line through the `--build-arg` option.
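For instance, still assuming the hypothetical `RUNTIME` argument above, targeting Windows executables could look like:

```bash
docker buildx build \
  --build-arg RUNTIME=win-x64 \
  .
```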
With this in place, we introduce an extra stage to perform the artifact extraction. We take the opportunity to use the `scratch` image to do so.
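A sketch of such a stage, assuming the `build` stage still publishes to `/app/out`:

```dockerfile
# scratch keeps the stage empty but for the artifacts we want to extract
FROM scratch AS staging
COPY --from=build /app/out/ .
```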
Here we simply grab the generated executable from the `build` stage and populate the `staging` stage with it.
Now that the prerequisites are set up, we can trigger the desired pipelines from the command line.
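The extraction could then be sketched as one invocation per OS, each combining `--target` and `--output` to write the `staging` content to a local directory (still reusing the hypothetical `RUNTIME` argument):

```bash
# Linux flavor
docker buildx build \
  --target staging \
  --build-arg RUNTIME=linux-musl-x64 \
  --output docker_out_linux \
  .

# Windows flavor
docker buildx build \
  --target staging \
  --build-arg RUNTIME=win-x64 \
  --output docker_out_win \
  .
```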
Now that we have full access to our so-desired executables, we can unfold the downstream pipeline, e.g. upload them to Artifactory through the JFrog CLI.
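A hedged example with the JFrog CLI, assuming an already configured server and a hypothetical `my-repo` target repository:

```bash
# upload the extracted executables to Artifactory
jf rt upload "docker_out_linux/*" my-repo/daedalus/linux/
jf rt upload "docker_out_win/*" my-repo/daedalus/windows/
```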
One could argue that we could have wrapped the JFrog CLI within a container as well and stayed in the Docker ecosystem the whole way. That is right. But sometimes we do not have a choice, or we have simpler ones. Here, as the build pipeline is intended to unfold within a GitHub workflow, we would rather leverage the existing JFrog GitHub Actions to deal with Artifactory publishing.
WASM
Assume now that we would like to explore the WASM ecosystem and decide to build and deploy a .NET application there. The .NET ecosystem provides a useful `Wasi.Sdk` nuget package to easily enrich the build pipeline with this new platform. On the other side, the Docker ecosystem has been enriched to deal with WASM workloads, via the brand-new `platform` and `runtime` parameters.
With this in mind, we craft a two-stage `Dockerfile`:
- `build` generates the WASM version of the .NET executable. As we only have to leverage the `Wasi.Sdk` nuget, we can start from a plain old `mcr.microsoft.com/dotnet/sdk` image.
- `final` uses the WASM artifact to feed the final image through a `COPY` instruction.
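A sketch of such a `Dockerfile`, assuming the project already references the `Wasi.Sdk` package and that the publish step drops a `daedalus.wasm` file in the output directory:

```dockerfile
# build: a regular .NET SDK image is enough, Wasi.Sdk does the WASM heavy lifting
FROM mcr.microsoft.com/dotnet/sdk:7.0 AS build
WORKDIR /app
COPY . .
RUN dotnet publish -c Release -o out

# final: ship only the WASM artifact
FROM scratch AS final
COPY --from=build /app/out/daedalus.wasm /daedalus.wasm
ENTRYPOINT [ "/daedalus.wasm" ]
```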
Once done, we can trigger the pipeline through `buildx`, specifying the WASM platform we are targeting.
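Something along those lines, using the `wasi/wasm` platform identifier:

```bash
docker buildx build \
  --platform wasi/wasm \
  -t daedalus-wasm \
  .
```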
And it miserably fails..
In fact, the `buildx` engine applies the `platform` option to the whole pipeline, which fails as we do not have a matching WASM version of the `mcr.microsoft.com/dotnet/sdk` image. To fix this, we have to chunk the pipeline, triggering the first half with the default platform and the second half with the WASM one. But we also have to hand over the artifact from one chunk to the other. Here it is not only about being able to navigate back from the Docker ecosystem, but about being able to navigate forth as well.
Let’s reshape our `Dockerfile` to accommodate our new needs by adding a brand-new `staging` stage, acting as a middle-man to articulate our chunks.
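A possible shape for the reworked `Dockerfile`: `staging` exposes the WASM artifact so it can be extracted locally, and `final` picks it back up from the local build context (the `wasm_out` directory name is ours):

```dockerfile
# chunk 1: built with the default platform
FROM mcr.microsoft.com/dotnet/sdk:7.0 AS build
WORKDIR /app
COPY . .
RUN dotnet publish -c Release -o out

# middle-man stage: only holds the artifact to extract
FROM scratch AS staging
COPY --from=build /app/out/daedalus.wasm /daedalus.wasm

# chunk 2: built with --platform wasi/wasm, fed from the local context
FROM scratch AS final
COPY wasm_out/daedalus.wasm /daedalus.wasm
ENTRYPOINT [ "/daedalus.wasm" ]
```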
With our pipeline in place, we can sequence it by chaining two `buildx` calls.
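The chaining could then be sketched as follows, reusing the hypothetical `wasm_out` directory as the hand-over point:

```bash
# first half: default platform, extract the WASM artifact locally
docker buildx build \
  --target staging \
  --output wasm_out \
  .

# second half: WASM platform, consume the extracted artifact
docker buildx build \
  --platform wasi/wasm \
  --target final \
  -t daedalus-wasm \
  .
```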
Obviously, you can check that our WASM image is a valid one by exercising it through a dedicated WASM runtime.
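For instance with the WasmEdge runtime shipped with the Docker Desktop WASM technical preview (the runtime name may differ in your setup):

```bash
docker run --rm \
  --runtime=io.containerd.wasmedge.v1 \
  --platform=wasi/wasm \
  daedalus-wasm
```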
Closing
The Docker ecosystem is backed by great tooling. When exploring this ecosystem, one can find new ways to deal with tricky problems. It is not only about using the right tool for the job, but about learning how to combine tools to shape new ones. We showcased here that we can combine both the multi-stage and `buildx` features to smooth the boundaries between the local world and the containerized one, and to enable back and forth navigation. It does not mean we have to move towards this hybrid facility every time. It only means that if we have to, we now know how to operate. And that’s already great. At least for today.