Buildkit: local build, caching and image scanning
Exploring advanced Buildkit use cases
Published on 06/11/2022 by igor.kolomiyets in Technical Tips

In the previous articles, Part 1 and Part 2, we discussed using Buildkit to build Docker images in a Jenkins pipeline. The approach described there has one flaw.

Look at the following stage:

stage('Build Docker Image') {
            container('buildkit') {
                sh """
                  buildctl build \
                      --frontend dockerfile.v0 \
                      --local context=. \
                      --local dockerfile=. \
                      --output type=image,name=${image},push=true
                  buildctl build \
                      --frontend dockerfile.v0 \
                      --local context=. \
                      --local dockerfile=. \
                      --output type=image,name=${repository}:${tag},push=true
                """
                milestone(1)
            }
        }

In this case, buildctl is executed twice, and both times it runs the full build, effectively doubling the stage execution time.

Can this be improved?

Yes. Buildkit is quite cache-efficient; we just did not use that ability in the stage described above.

To utilize Buildkit's ability to cache layers, we have to export the cache to the local container filesystem when executing buildctl for the first time, and then import it when running it the second time.

To export layers to the local cache add the following option: --export-cache type=local,dest=/tmp/buildkit/cache.

To import it later add the following option: --import-cache type=local,src=/tmp/buildkit/cache.

If there are more than two calls to the buildctl tool in the pipeline, it makes sense to use both the --export-cache and --import-cache options for all calls except the first one, as in the sketch below.
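
For example, an intermediate call both imports the cache produced by the previous build and re-exports the refreshed cache for the next one. A minimal sketch (the image name is a placeholder):

buildctl build \
    --frontend dockerfile.v0 \
    --local context=. \
    --local dockerfile=. \
    --import-cache type=local,src=/tmp/buildkit/cache \
    --export-cache type=local,dest=/tmp/buildkit/cache \
    --output type=image,name=registry.example.com/app:latest,push=true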

So, to utilize the cache in the example stage above, we have to modify it as follows:

stage('Build Docker Image') {
            container('buildkit') {
                sh """
                  buildctl build \
                      --frontend dockerfile.v0 \
                      --local context=. \
                      --local dockerfile=. \
                      --export-cache type=local,dest=/tmp/buildkit/cache \
                      --output type=image,name=${image},push=true
                  buildctl build \
                      --frontend dockerfile.v0 \
                      --local context=. \
                      --local dockerfile=. \
                      --import-cache type=local,src=/tmp/buildkit/cache \
                      --output type=image,name=${repository}:${tag},push=true
                """
                milestone(1)
            }
        }

There are two other use cases that I would like to discuss as well.

One of them is compiling source code during the Docker image build. Historically, we compile the resulting executable or archive (in the Java world) first and then feed it, one way or another, to a Docker image.

However, nothing is really stopping us from doing this as part of the Docker image build itself. One little problem kept me from using this approach in the pipeline for years.

You see, I would normally have test reports published to the Jenkins job run so that they could be accessed via the Jenkins frontend, and pulling those reports out of the image was not straightforward.

When building an image using plain Docker, I would first have to build the image, stopping at the end of the build stage. Then I would have to create a container from that image and copy the reports, and perhaps other files, out of the image to the host filesystem.

Then I would have to build the image again, to the end this time, to get the final image that would eventually be pushed to the registry, as sketched below.
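
A minimal sketch of that plain-Docker workaround, assuming the multi-stage Dockerfile shown later in this article (image and container names are illustrative):

# Build only up to the intermediate "build" stage
docker build --target build -t myapp-build .

# Create a stopped container from it and copy the reports out
docker create --name extract myapp-build
docker cp extract:/app/build ./build
docker rm extract

# Build again, this time to the end, to get the final image
docker build -t myapp:latest .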

Buildkit simplifies this task. It supports multiple output formats for the resulting image: it can simply be flushed to disk as a directory structure, or stored as an OCI image .tar file. So, in this case, I would prefer to build the image as a local directory and then just copy the files out.
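
Both outputs are selected with the --output option. A minimal sketch of each (destination paths are illustrative):

# Flush the resulting image filesystem to a local directory
buildctl build \
    --frontend dockerfile.v0 \
    --local context=. \
    --local dockerfile=. \
    --output type=local,dest=/tmp/app

# Or store it as an OCI image tarball
buildctl build \
    --frontend dockerfile.v0 \
    --local context=. \
    --local dockerfile=. \
    --output type=oci,dest=/tmp/image.tar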

Let’s imagine that we have the following Dockerfile that builds a Spring Boot application:

FROM eclipse-temurin:17.0.4.1_1-jdk AS source

ARG version="1.0.0"
ARG build_number
ENV BUILD_NUMBER=${build_number}

RUN mkdir /app

COPY .gradle.properties /root/.gradle/gradle.properties
COPY gradle /app/gradle
COPY build.gradle gradle.properties gradlew settings.gradle sonar-project.properties /app/

WORKDIR /app

RUN ./gradlew clean

FROM source AS build
ARG version="1.0.0"

ARG build_number
ENV BUILD_NUMBER=${build_number}

COPY src /app/src
COPY test /app/test

RUN ./gradlew build

FROM eclipse-temurin:17.0.4.1_1-jre AS app

ARG version="1.0.0"
ARG build_number

RUN groupadd -g 1001 devops \
    && useradd -u 1001 -g 1001 devops \
    && mkdir -p /app/config \
    && chown -R devops /app

ENV BUILD_NUMBER=${build_number}

COPY --from=build /app/build/libs/service-${version}.${build_number}.jar /app/service-${version}.${build_number}.jar

RUN ln -s /app/service-${version}.${build_number}.jar /app/service.jar
RUN mkdir -p /app/config \
 && chown -R devops /app
    
EXPOSE 8080
WORKDIR /app
USER devops

FROM app AS debug

CMD ["java", "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005", "-jar", "/app/service.jar"]

FROM app

CMD java $JAVA_OPTS -jar /app/service.jar

So, to get the code built, I first need to run the build to the build target. In this case, the “Build Java Code” stage should look like the following:

container('buildkit') {
            stage('Build Java Code') {
                try {
                    sh """
                        mkdir -p /root/.gradle; cat /etc/.gradle/gradle-new.properties > /root/.gradle/gradle.properties
                        cp /etc/.gradle/gradle-new.properties .gradle.properties
                        chmod 755 ./gradlew
                        buildctl build \
                             --frontend dockerfile.v0 \
                             --local context=. \
                             --local dockerfile=. \
                             --export-cache type=local,dest=/tmp/buildkit/cache \
                             --output type=local,dest=/tmp/app \
                             --opt network=host \
                             --opt build-arg:version=${version} \
                             --opt build-arg:build_number=${env.BUILD_NUMBER} \
                             --opt target=build
                        cp -R /tmp/app/app/build ./
                    """
                } catch (error) {
                    step([$class: 'Mailer',
                        notifyEveryUnstableBuild: true,
                        recipients: emailextrecipients([[$class: 'CulpritsRecipientProvider'],
                                                        [$class: 'DevelopersRecipientProvider']]),
                        sendToIndividuals: true])
                    throw error
                } finally {
                    step([$class: 'JUnitResultArchiver', testResults: 'build/test-results/test/*.xml'])
                    step([$class: 'JacocoPublisher',
                        execPattern: 'build/jacoco/*.exec',
                        classPattern: 'build/classes',
                        sourcePattern: 'src/main/java',
                        exclusionPattern: 'src/test*'
                    ])
                }
            }
}

Note that right after executing buildctl we copy the build directory back to the workspace: cp -R /tmp/app/app/build ./.

Then, in the finally block, we publish the JUnit test results and JaCoCo coverage results to the Jenkins job.

Moreover, we can now run the SonarQube scan from the sonarqube container, as all the necessary files are in the workspace; a sketch of such a stage follows.
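
A minimal sketch of such a stage, assuming the pod has a container with the sonar-scanner CLI and that a SonarQube server connection named 'sonarqube' is configured in Jenkins (both names are illustrative):

stage('SonarQube Scan') {
    // Assumes this pod container has the sonar-scanner CLI on its PATH
    container('sonarqube') {
        // withSonarQubeEnv (SonarQube Scanner plugin) injects the server URL
        // and token; sonar-scanner picks up sonar-project.properties itself
        withSonarQubeEnv('sonarqube') {
            sh 'sonar-scanner'
        }
    }
}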

Note that we are caching layers; hence, if we now run the build to the end, Buildkit won’t compile the code again and will just use the cached layers instead.

The other thing I would like to do in the pipeline is scan the resulting image for vulnerabilities.

Historically, we used the Anchore server to scan images for vulnerabilities. With this approach, the image has to be pushed to the registry prior to scanning, which is not ideal: if we find vulnerabilities in the image, it won’t be used and will just hang in the registry, taking up disk space.

So, ideally, we would like to scan the image locally, prior to pushing it. Moreover, we would like to push images only when they are built from the master branch, but scan them for every branch and every PR.

The rise of Grype lets us scan images locally without hassle, but with the Buildkit approach described earlier we do not have access to the image in question: we build it and push it immediately.

So, to scan the resulting image, we want to build it and store it locally, this time as an OCI image .tar file.

The stage for this will look like the following:

stage('Build Docker Image') {
    try {
        sh """
            buildctl build \
               --frontend dockerfile.v0 \
               --local context=. \
               --local dockerfile=. \
               --export-cache type=local,dest=/tmp/buildkit/cache \
               --import-cache type=local,src=/tmp/buildkit/cache \
               --output type=oci,dest=/tmp/image.tar \
               --opt network=host \
               --opt build-arg:version=${version} \
               --opt build-arg:build_number=${env.BUILD_NUMBER}
        """
    } catch (error) {
        step([$class: 'Mailer',
            notifyEveryUnstableBuild: true,
            recipients: emailextrecipients([[$class: 'CulpritsRecipientProvider'],
                                            [$class: 'RequesterRecipientProvider']]),
            sendToIndividuals: true])
            throw error
    }
}

Note that we use both --export-cache and --import-cache in this stage, and that the resulting image is stored as the /tmp/image.tar file.

Now I can run the Grype scan against the .tar file, as Grype supports OCI image archives. I will use the following stage to scan the image:

stage('Scan Docker Image') {
            try {
                sh """
                    wget https://raw.githubusercontent.com/anchore/grype/main/install.sh
                    chmod 755 install.sh
                    ./install.sh
                    mv bin/grype /usr/local/bin/

                    GRYPE_MATCH_GOLANG_USING_CPES=false \
                          /usr/local/bin/grype \
                                  oci-archive:/tmp/image.tar \
                                  -f high \
                                  --scope all-layers \
                                  -o template \
                                  --file report.html \
                                  -t grype.tmpl \
                                  --only-fixed
                """
            } catch (error) {
                step([$class: 'Mailer',
                    notifyEveryUnstableBuild: true,
                    recipients: emailextrecipients([[$class: 'CulpritsRecipientProvider'],
                                                    [$class: 'RequesterRecipientProvider']]),
                    sendToIndividuals: true])
                    throw error
            } finally {
                publishHTML (target : [allowMissing: false,
                 alwaysLinkToLastBuild: true,
                 reportDir: '',
                 keepAll: true,
                 reportFiles: 'report.html',
                 reportName: 'Grype Scan Report',
                 reportTitles: 'Grype Scan Report'])
            }
            milestone()
}

The above stage assumes that the grype.tmpl file is available in the workspace. This file is required to produce an HTML report from the Grype scan and publish it to the Jenkins job run. More details about using templates with Grype are available here.
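
For reference, here is a minimal sketch of what such a template might look like, using the match fields from Grype's documented CSV template example (the HTML markup itself is illustrative):

<h1>Grype Scan Report</h1>
<table>
  <tr><th>Package</th><th>Version</th><th>Vulnerability</th><th>Severity</th></tr>
  {{- range .Matches}}
  <tr>
    <td>{{.Artifact.Name}}</td>
    <td>{{.Artifact.Version}}</td>
    <td>{{.Vulnerability.ID}}</td>
    <td>{{.Vulnerability.Severity}}</td>
  </tr>
  {{- end}}
</table>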

So now, if all the stages have succeeded and the build in question is for the master branch, we can build and push the resulting images as before.

if (env.BRANCH_NAME == "master") {
            //Release stage is only executed from the 'master' branch
            stage('Tagging Source Code') {
                values = version.tokenize(".")
                def repositoryCommitterEmail = "jenkins@iktech.io"
                def repositoryCommitterUsername = "jenkinsCI"
                sh "git config user.email ${repositoryCommitterEmail}"
                sh "git config user.name '${repositoryCommitterUsername}'"
                sh "git tag -d v${values[0]} || true"
                sh "git push origin :refs/tags/v${values[0]}"
                sh "git tag -d v${values[0]}.${values[1]} || true"
                sh "git push origin :refs/tags/v${values[0]}.${values[1]}"
                sh "git tag -d v${version} || true"
                sh "git push origin :refs/tags/v${version}"sh "git tag -a v${values[0]} -m \"passed CI\""
                sh "git tag -a v${values[0]}.${values[1]} -m \"passed CI\""
                sh "git tag -a v${version} -m \"passed CI\""
                sh "git tag -a v${version}.${env.BUILD_NUMBER} -m \"passed CI\""
                sh "git push --tags"
            }
            milestone()
            stage('Push Docker Image to the registry') {
                container('buildkit') {
                    try {
                        sh """
                            wget https://amazon-ecr-credential-helper-releases.s3.us-east-2.amazonaws.com/0.6.0/linux-amd64/docker-credential-ecr-login -O /usr/local/bin/docker-credential-ecr-login
                            chmod 755 /usr/local/bin/docker-credential-ecr-login
                            mkdir -p /root/.docker
                            cp /tmp/docker/config.json /root/.docker/
                            buildctl build \
                                 --frontend dockerfile.v0 \
                                 --local context=. \
                                 --local dockerfile=. \
                                 --output type=image,name=${image},push=true \
                                 --export-cache type=local,dest=/tmp/buildkit/cache \
                                 --import-cache type=local,src=/tmp/buildkit/cache \
                                 --opt network=host \
                                 --opt build-arg:version=${version} \
                                 --opt build-arg:build_number=${env.BUILD_NUMBER}
                            buildctl build \
                                --frontend dockerfile.v0 \
                                --local context=. \
                                --local dockerfile=. \
                                --output type=image,name=${repository}:latest,push=true \
                                --export-cache type=local,dest=/tmp/buildkit/cache \
                                --import-cache type=local,src=/tmp/buildkit/cache \
                                --opt network=host \
                                --opt build-arg:version=${version} \
                                --opt build-arg:build_number=${env.BUILD_NUMBER}
                        """
                    } catch (error) {
                        step([$class: 'Mailer',
                            notifyEveryUnstableBuild: true,
                            recipients: emailextrecipients([[$class: 'CulpritsRecipientProvider'],
                                                            [$class: 'RequesterRecipientProvider']]),
                            sendToIndividuals: true])
                            throw error
                    }
                }
            }
            stage('Publish Service docker image version to the artifactz.io') {
                publishArtifact name: 'service',
                                description: 'Test Service',
                                type: 'DockerImage',
                                stage: 'Development',
                                flow: 'Simple',
                                version: "${version}.${env.BUILD_NUMBER}"
            }
            stage('Push Service docker image version to the Automated Integration Testing stage') {
                pushArtifact name: 'service', stage: 'Development'
            }
        } else {
            echo 'Skipping release for branch [' + env.BRANCH_NAME + ']. Releases are only executed from the master branch.'
            stage('Notify') {
                node('master') {
                    if(!hudson.model.Result.SUCCESS.equals(currentBuild.getPreviousBuild()?.getResult())) {
                        step([$class: 'Mailer',
                            notifyEveryUnstableBuild: true,
                            recipients: emailextrecipients([[$class: 'CulpritsRecipientProvider'],
                                                            [$class: 'RequesterRecipientProvider']]),
                            sendToIndividuals: true])
                    }
                }
            }
        }

As you can see, Buildkit gives us a lot of flexibility in what we can do in the pipeline.