Docker is a powerful ecosystem that revolutionized software development quite a while ago. If we scope the discussion to build pipelines, leveraging Docker-based agents to offload build workloads comes with a lot of advantages. Black-boxing build environments brings flexibility and scalability, but sacrifices a bit of control: it becomes harder to intervene during the build workflow. Eventually we end up with a mixed feeling: leveraging Docker is great, but it would be even greater if I could build more modular pipelines, mixing local processing and containerized one.
In this post we will showcase how to address hybrid scenarios such as:
- I would like to offload the build workload to Docker, but the image is not the expected outcome. Instead, I would like to retrieve the intermediate built artifacts to shape and ship them on my own.
- I would like to craft an image that involves heterogeneous content which cannot be fully built sequentially by a single containerized pipeline.
Said differently, I would like to navigate back - and ideally forth - inside the Docker build system.
Let’s see how one could address such cases.
Multi-stage builds
We have been using multi-stage Docker builds for years now. It is a very powerful tool to sequence a build pipeline while getting the best out of the Docker caching algorithm and enforcing lean outcome images.
Assume you are building a .NET application. You are likely to end up with a two-stage pipeline, namely build and final below, and a COPY instruction to hand the artifacts over.
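A minimal sketch of such a two-stage Dockerfile; the image tags, project layout and daedalus assembly name are illustrative assumptions:

```dockerfile
# Build stage: full SDK, produces the published artifacts
FROM mcr.microsoft.com/dotnet/sdk:7.0 AS build
WORKDIR /app
COPY . .
RUN dotnet publish -c Release -o out

# Final stage: runtime only, fed from the build stage
FROM mcr.microsoft.com/dotnet/runtime:7.0 AS final
WORKDIR /app
COPY --from=build /app/out/ .
ENTRYPOINT ["dotnet", "daedalus.dll"]
```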
By doing so, we ensure the final image contains neither the SDK nor intermediate build artifacts, decreasing both its size and its vulnerability exposure.
Note that you can also scale this approach to combine and aggregate multiple sources. Assume you are now building a modular .NET application. You are likely to end up with a three-stage pipeline, namely build1, build2 and final below, and multiple COPY instructions to hand the artifacts over.
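A sketch of what this could look like, assuming two independent projects named module1 and module2 (hypothetical names) aggregated into a single runtime image:

```dockerfile
# First swimlane
FROM mcr.microsoft.com/dotnet/sdk:7.0 AS build1
WORKDIR /app
COPY module1/ .
RUN dotnet publish -c Release -o out

# Second swimlane, independent from the first one
FROM mcr.microsoft.com/dotnet/sdk:7.0 AS build2
WORKDIR /app
COPY module2/ .
RUN dotnet publish -c Release -o out

# Final image aggregating both builds
FROM mcr.microsoft.com/dotnet/runtime:7.0 AS final
WORKDIR /app
COPY --from=build1 /app/out/ .
COPY --from=build2 /app/out/ ./modules/
ENTRYPOINT ["dotnet", "daedalus.dll"]
```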
Once again, the outcome image is tailored for your purpose. By splitting the build in two, you also benefit from Docker caching, which only rebuilds what needs to be rebuilt. Last but not least, you gain parallelization for free thanks to the Docker dependency algorithm: as build1 and build2 are independent, they can be processed in parallel.
This approach obviously shines when you have to aggregate heterogeneous content. Materializing dedicated swimlanes allows for contextual optimization.
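For instance, a sketch mixing a Node-built front-end with a .NET back-end, each swimlane leveraging its own toolchain image; the front/ and back/ layout and the dist output directory are assumptions:

```dockerfile
# Front-end swimlane, optimized with a Node toolchain
FROM node:20-alpine AS build-front
WORKDIR /front
COPY front/package*.json .
RUN npm ci
COPY front/ .
RUN npm run build

# Back-end swimlane, optimized with the .NET SDK
FROM mcr.microsoft.com/dotnet/sdk:7.0-alpine AS build-back
WORKDIR /app
COPY back/ .
RUN dotnet publish -c Release -o out

# Final image aggregating both swimlanes
FROM mcr.microsoft.com/dotnet/aspnet:7.0-alpine AS final
WORKDIR /app
COPY --from=build-back /app/out/ .
COPY --from=build-front /front/dist/ ./wwwroot/
ENTRYPOINT ["dotnet", "daedalus.dll"]
```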
Multi-stage building is a Docker best practice and should be applied early in your process, as it greatly optimizes the overall pipeline.
Keep in mind that optimization must be propagated to the inner level of each stage as well:
- By materializing costly steps such as dependency fetching:

  ```dockerfile
  # --------------------------------------
  FROM mcr.microsoft.com/dotnet/sdk:7.0-alpine AS build
  WORKDIR /app

  # Restore in dedicated step to benefit from cache
  COPY *.csproj .
  RUN dotnet restore \
      -r linux-musl-x64

  # Shrink artifact
  COPY . .
  RUN dotnet publish \
      --no-restore \
      -o out \
      -c Release \
      -r linux-musl-x64 \
      --self-contained true \
      -p:PublishSingleFile=true \
      -p:IncludeNativeLibrariesForSelfExtract=true \
      -p:PublishTrimmed=True \
      -p:TrimMode=Link \
      /p:DebugType=None \
      /p:DebugSymbols=false
  ```

- By leveraging tailored images:

  ```dockerfile
  # --------------------------------------
  # Start from uncluttered image..
  FROM alpine:3.18.4 AS final
  WORKDIR /app

  # .. comes with a price
  # https://learn.microsoft.com/en-us/dotnet/core/install/linux-alpine#dependencies
  RUN apk add \
      bash \
      icu-libs \
      krb5-libs \
      libgcc \
      libintl \
      libssl1.1 \
      libstdc++ \
      zlib \
      --no-cache

  COPY --from=build /app/out/ .

  ENTRYPOINT ["./daedalus"]
  ```
Buildx engine
Buildx provides extended build capabilities through BuildKit and should be favored over the plain old docker build. As they both share the same core API, the switch is seamless: you only have to move from docker build to docker buildx build.
Options we are looking at in this post are:
- The output one, which comes with different flavors, including a local type to write all result files from an image to a directory on the client side:

  ```bash
  docker buildx build \
    --output docker_out_win \
    .
  ```

- The target one, which provides the ability to target a specific stage and scope the build pipeline accordingly:

  ```bash
  docker buildx build \
    --target staging \
    .
  ```
Hybrid scenarios
Artifactory
Here we would like to release not only - or not at all - our Docker image, but the plain old matching executables, for both Windows and Linux. The idea is still to offload the build workload to a containerized pipeline, but also to fetch the resulting executable artifacts so we can upload them downstream to an artifact repository, e.g., Artifactory.
We slightly amend our Dockerfile to accommodate the new OS needs by leveraging the built-in ARG instruction to provide OS switching. A fallback value is provided to ensure we can still build the original pipeline.
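A sketch of the amended build stage, abridged from the one above and assuming a RUNTIME build argument (the actual argument name may differ) defaulting to the original Linux runtime identifier:

```dockerfile
FROM mcr.microsoft.com/dotnet/sdk:7.0-alpine AS build
# OS switch, with a fallback to the original Linux target
ARG RUNTIME=linux-musl-x64
WORKDIR /app

# Restore in dedicated step to benefit from cache
COPY *.csproj .
RUN dotnet restore -r $RUNTIME

# Shrink artifact
COPY . .
RUN dotnet publish \
    --no-restore \
    -o out \
    -c Release \
    -r $RUNTIME \
    --self-contained true \
    -p:PublishSingleFile=true
```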
And we now have the ability to provide a specific value from the buildx command line through the build-arg option.
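For instance, assuming the RUNTIME argument above, a Windows-flavored build could look like:

```bash
docker buildx build \
  --build-arg RUNTIME=win-x64 \
  .
```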
With this in place, we introduce an extra stage to perform the artifact extraction. We take the opportunity to use the scratch image to do so.
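A minimal sketch of such a stage:

```dockerfile
# Lightweight extraction stage: only carries the build output
FROM scratch AS staging
COPY --from=build /app/out/ .
```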
Here we simply grab the generated executable from the build stage and populate the staging stage with it.
Now that the prerequisites are set up, we can trigger the desired pipelines from the command line.
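Combining the target and output options seen above, the two flavors could be extracted as follows; the output directory names are illustrative:

```bash
# Windows flavor
docker buildx build \
  --build-arg RUNTIME=win-x64 \
  --target staging \
  --output docker_out_win \
  .

# Linux flavor
docker buildx build \
  --build-arg RUNTIME=linux-musl-x64 \
  --target staging \
  --output docker_out_linux \
  .
```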
Now that we have full access to our so-desired executables, we can unfold the downstream pipeline, e.g., upload them to Artifactory through the JFrog CLI.
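As a sketch, the upload could boil down to a couple of JFrog CLI calls; the generic-local repository path below is purely hypothetical and should match your own Artifactory layout:

```bash
# Hypothetical Artifactory repository layout
jf rt upload "docker_out_win/*" generic-local/daedalus/win-x64/
jf rt upload "docker_out_linux/*" generic-local/daedalus/linux-x64/
```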
One could argue that we could have wrapped the JFrog CLI within a container as well and stayed in the Docker ecosystem the whole way. That is right. But sometimes we do not have a choice, or we have simpler ones. Here, as the build pipeline is intended to unfold within a GitHub workflow, we would like to leverage the existing JFrog GitHub Actions to deal with Artifactory publishing.
WASM
Assume now that we would like to explore the WASM ecosystem and decide to build and deploy a .NET application there. The .NET ecosystem provides a useful Wasi.Sdk NuGet package to easily enrich the build pipeline with this new platform. On the other side, the Docker ecosystem has been enriched to deal with WASM workloads, via brand-new platform and runtime parameters.
With this in mind, we craft a two-stage Dockerfile:
- build generates the WASM version of the .NET executable. As we only have to leverage the Wasi.Sdk NuGet package, we can start from a plain old mcr.microsoft.com/dotnet/sdk image.
- final uses the WASM artifact to feed the final image through a COPY instruction.
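A sketch of such a Dockerfile, assuming the project references the Wasi.Sdk package and that the publish output contains a daedalus.wasm file (names are illustrative):

```dockerfile
# build: plain old SDK image, the Wasi.Sdk package referenced
# by the project handles the WASM targeting
FROM mcr.microsoft.com/dotnet/sdk:7.0 AS build
WORKDIR /app
COPY . .
RUN dotnet publish -c Release -o out

# final: WASM-only image
FROM scratch AS final
COPY --from=build /app/out/daedalus.wasm /daedalus.wasm
ENTRYPOINT [ "/daedalus.wasm" ]
```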
Once done, we can trigger the pipeline through buildx, specifying the WASM platform we are targeting.
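Something along those lines, assuming the wasi/wasm platform identifier supported by recent Docker releases and an illustrative image tag:

```bash
docker buildx build \
  --platform wasi/wasm \
  -t daedalus-wasm \
  .
```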
And it fails miserably..
In fact, the buildx engine applies the platform option to the whole pipeline, which fails as there is no matching WASM version of mcr.microsoft.com/dotnet/sdk. To fix this, we have to chunk the pipeline, triggering the first half with the default platform and the second half with the WASM one. But we also have to hand the artifact over from one chunk to the other. Here it is not only about being able to navigate back from the Docker ecosystem, but about being able to navigate forth as well.
Let’s reshape our Dockerfile to accommodate our new needs by adding a brand-new staging stage, acting as a middle-man to articulate our chunks.
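A sketch of the reshaped Dockerfile: the staging stage is meant to be exported locally between the two buildx calls, and the final stage then picks the artifact up from the local build context (the docker_out_wasm directory and daedalus.wasm names are assumptions):

```dockerfile
# First chunk: default platform, builds the WASM artifact
FROM mcr.microsoft.com/dotnet/sdk:7.0 AS build
WORKDIR /app
COPY . .
RUN dotnet publish -c Release -o out

# Middle-man: exported to the client side through --target staging --output
FROM scratch AS staging
COPY --from=build /app/out/ .

# Second chunk: WASM platform, fed from the locally extracted artifact
FROM scratch AS final
COPY docker_out_wasm/daedalus.wasm /daedalus.wasm
ENTRYPOINT [ "/daedalus.wasm" ]
```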
With our pipeline in place, we can sequence it by chaining two buildx calls.
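Assuming the stage and directory names above, the chained calls could look like:

```bash
# First half: default platform, extract the WASM artifact locally
docker buildx build \
  --target staging \
  --output docker_out_wasm \
  .

# Second half: WASM platform, bake the final image from the local artifact
docker buildx build \
  --platform wasi/wasm \
  --target final \
  -t daedalus-wasm \
  .
```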
Obviously, you can check that our WASM image is a valid one by exercising it through a dedicated WASM runtime.
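For instance, with the WasmEdge runtime shipped with Docker Desktop's WASM support (runtime name to be adjusted to your setup):

```bash
docker run \
  --runtime=io.containerd.wasmedge.v1 \
  --platform=wasi/wasm \
  daedalus-wasm
```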
Closing
The Docker ecosystem is backed by great tooling. When exploring it, one can find new ways to deal with tricky problems. It is not only about using the right tool for the job, but about learning how to combine tools to shape new ones. We showcased here that we can combine multi-stage builds and buildx features to smooth the boundaries between the local world and the containerized one, enabling back and forth navigation. It does not mean we have to reach for this hybrid facility every time. It only means that if we have to, we now know how to operate. And that’s already great. At least for today.