
Improve release artifact and workload integrity in Kubernetes via a secure software supply chain

Business Problem

A software supply chain typically refers to all the components and processes required to successfully build, distribute, and deploy a product. This is made up of everything from the source code, to the code repos and artifact registries, to the build servers, and to the deployment and operating systems/tools.

Attacks against the supply chain come in a variety of forms, from a direct attack on a company's software build system to the compromise of a third-party dependency. In an infamous attack, hackers infiltrated SolarWinds' build system to inject malicious code into its widely used enterprise management products, enabling severe attacks against SolarWinds' customers. In the case of Log4j, the ubiquitous open-source Java logging framework, a critical remote code execution vulnerability known as Log4Shell (CVE-2021-44228) was discovered in the library itself. This enabled attacks against Log4j users, leading to exfiltrated data, injection of malicious content, and/or takeover of targeted systems.

There is an urgent need to mitigate these risks across the software supply chain by improving security controls. This need has been widely acknowledged by authoritative organizations.

Solution

This document describes a general approach to a Secure Software Supply Chain (SSSC) solution. The solution focuses on improving software supply chain security for containerized workloads deployed to, and operated in, Kubernetes environments. This is achieved through defense-in-depth and trust-but-verify approaches.

Our team has worked with many of Microsoft's largest customers, and collaborated with Azure product engineering, to understand the business and technical requirements for building a secure software supply chain that delivers against real customer requirements.

Solution overview

Ensuring integrity in the software release begins when developers start writing the code. All code commits must be cryptographically signed in trusted development environments. Additionally, known vulnerabilities in included dependencies should be surfaced in the development environment so that developers do not commit those vulnerabilities to the codebase. Policy enforcement within the developer environment can ensure that developers sign their commits and that dependencies with vulnerabilities in specific categories cannot be committed. This ensures that code authors can be cryptographically verified and that the vulnerability risk is well understood.

To ensure that there is a complete understanding of all the components that make up a software release, a component inventory is built as part of the build/release process. This includes all release dependencies, including third party libraries and OS packages. A list of all known component vulnerabilities is added to the inventory, providing a comprehensive understanding of the bits and vulnerabilities that make up the software release. The security team uses this information to design and implement a risk mitigation strategy.

To ensure that the software release, component inventory, and vulnerability collection can't be manipulated by untrusted actors, these software supply chain security artifacts are cryptographically signed in a trusted environment. This ensures that the integrity of all of these artifacts can be cryptographically verified later, even in low-trust environments.
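The sign-then-verify flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the standard library has no certificate-based signing, so HMAC-SHA256 with a shared key stands in for a real asymmetric signature, and the names `SIGNING_KEY`, `sign_artifact`, and `verify_artifact` are hypothetical.

```python
import hashlib
import hmac

# Assumption: a shared secret stands in for a private signing key. Real
# supply chain signing uses asymmetric keys backed by certificates.
SIGNING_KEY = b"demo-key-not-for-production"


def sign_artifact(content: bytes, key: bytes = SIGNING_KEY) -> str:
    """Return a hex signature over the SHA-256 digest of the artifact."""
    digest = hashlib.sha256(content).digest()
    return hmac.new(key, digest, hashlib.sha256).hexdigest()


def verify_artifact(content: bytes, signature: str, key: bytes = SIGNING_KEY) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_artifact(content, key)
    return hmac.compare_digest(expected, signature)


# A toy SBOM is signed once in the trusted environment...
sbom = b'{"components": [{"name": "example-lib", "version": "1.0.0"}]}'
signature = sign_artifact(sbom)

# ...and verified later; any change to the bytes invalidates the signature.
untampered_ok = verify_artifact(sbom, signature)
tampered_ok = verify_artifact(sbom.replace(b"1.0.0", b"6.6.6"), signature)
```

The point of the sketch is the shape of the check: the verifier needs only the artifact bytes, the signature, and trusted key material, which is why verification can happen in a low-trust environment.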

In addition, attestations are used to provide cryptographically verifiable traceability. For example, an attestation can confirm that a component was built by an authorized build runner, backed by a VM running in a known Azure subscription, and signed at a specific date-time. Signatures provide confidence in the integrity of all released artifacts, and attestations provide traceability.
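To make the attestation example more concrete, the following Python sketch models an attestation as a small record and checks its claims against allow-lists. The field names, the allow-list values, and the `traceability_issues` helper are all hypothetical, and the sketch assumes the attestation's own signature has already been verified elsewhere.

```python
from dataclasses import dataclass

# Hypothetical allow-lists; a real system would load these from a policy store.
AUTHORIZED_RUNNERS = {"build-runner-01"}
KNOWN_SUBSCRIPTIONS = {"00000000-0000-0000-0000-000000000000"}


@dataclass(frozen=True)
class BuildAttestation:
    subject_digest: str   # digest of the artifact the claims are about
    build_runner: str     # identity of the runner that produced it
    subscription_id: str  # Azure subscription hosting the runner VM
    signed_at: str        # ISO 8601 timestamp of signing


def traceability_issues(att: BuildAttestation, artifact_digest: str) -> list[str]:
    """Return the reasons an attestation fails to vouch for an artifact."""
    issues = []
    if att.subject_digest != artifact_digest:
        issues.append("attestation refers to a different artifact")
    if att.build_runner not in AUTHORIZED_RUNNERS:
        issues.append("build runner is not authorized")
    if att.subscription_id not in KNOWN_SUBSCRIPTIONS:
        issues.append("subscription is not recognized")
    return issues


attestation = BuildAttestation(
    subject_digest="sha256:abc123",
    build_runner="build-runner-01",
    subscription_id="00000000-0000-0000-0000-000000000000",
    signed_at="2024-01-15T10:30:00Z",
)
good = traceability_issues(attestation, "sha256:abc123")
bad = traceability_issues(attestation, "sha256:def456")
```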

The software supply chain security artifacts contain useful information, but need to be combined with policy enforcement to ensure that all components are properly signed and can run with acceptable risk within the software development and deployment lifecycle. Examples include not allowing a software release to be built when a dependency has a major vulnerability and only allowing signed container images to be deployed to Kubernetes clusters.

It is important to provide developers the observability tools they need to understand the relationships between the software supply chain artifacts and the systems and processes that touch them. An emerging best practice is to enable developers to observe these components and relationships via a knowledge graph. For example, if a developer learns about a new vulnerability identified in a particular release of log4j, she might ask questions like the following: "Which of my recent releases have log4j version 2.12.3 as a dependency?" Another question might be, "Which software releases deployed to which clusters have exposure to CVE-2021-44228?"
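The two questions above can be expressed as simple graph traversals. The Python sketch below uses an in-memory list of edges as a stand-in for a knowledge graph; the release names, cluster names, and relation labels are invented for illustration, and a production system would use a real graph store and query language.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    source: str
    relation: str
    target: str


# Hypothetical graph contents for illustration only.
EDGES = [
    Edge("release:shop-api:1.4.0", "depends_on", "pkg:log4j-core@2.14.1"),
    Edge("release:shop-api:1.5.0", "depends_on", "pkg:log4j-core@2.17.1"),
    Edge("release:billing:2.0.0", "depends_on", "pkg:log4j-core@2.14.1"),
    Edge("pkg:log4j-core@2.14.1", "affected_by", "CVE-2021-44228"),
    Edge("release:shop-api:1.4.0", "deployed_to", "cluster:aks-westeurope-1"),
    Edge("release:billing:2.0.0", "deployed_to", "cluster:aks-eastus-2"),
]


def sources(relation: str, target: str) -> list[str]:
    """All nodes with an edge of the given relation pointing at target."""
    return sorted(e.source for e in EDGES if e.relation == relation and e.target == target)


def targets(source: str, relation: str) -> list[str]:
    """All nodes reachable from source over the given relation."""
    return sorted(e.target for e in EDGES if e.source == source and e.relation == relation)


def releases_using(package: str) -> list[str]:
    """Which releases have this exact package version as a dependency?"""
    return sources("depends_on", package)


def exposed_deployments(cve: str) -> list[tuple[str, str]]:
    """Which releases, deployed to which clusters, are exposed to this CVE?"""
    found = []
    for package in sources("affected_by", cve):
        for release in sources("depends_on", package):
            for cluster in targets(release, "deployed_to"):
                found.append((release, cluster))
    return sorted(found)
```

The second query walks three relationships (vulnerability to package, package to release, release to cluster), which is the kind of multi-hop question that is awkward to answer from isolated SBOM files.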

Value Proposition

By employing the secure software supply chain concepts and components, an organization can ensure that the integrity of its software releases is verifiable from code through to operations. Policies and software supply chain artifacts provide centralized control over risk mitigation, and the knowledge graph provides increased risk assessment insights across the entire software supply chain.

Logical Architecture

Delivering a solution to the business problem described requires many capabilities across the phases of the software supply chain lifecycle. These are illustrated in the logical architecture diagram below.

Lifecycle and capabilities - logical diagram

Develop

Within the develop phase, developers must be supported to ensure the security and integrity of all code, binaries, and configuration that is expected to be included in the software release.

Develop phase capabilities - logical diagram

The development environment encompasses all the tools that the developer uses to write, build and test code. Examples include VS Code, Devcontainers and/or GitHub Codespaces. The development environment uses a code repository to store the code that's written. Additional components, such as libraries, frameworks, container base images, are retrieved from a component registry.

Repeatable and deterministic builds are an important aspect of a secure software supply chain. These ensure that the contents that make up a software release are well known, and any attempts to tamper with the artifacts can be detected. The development ecosystem used within the develop phase must ensure a record of all the dependencies is captured. Examples of this component inventory include application package managers (npm, NuGet, go.mod), OS package managers (winget, apt-get), and container manifests (Dockerfile). Version pinning of components from the component registry is encouraged to ensure that builds remain deterministic.
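As a small illustration of checking for version pinning, the Python sketch below flags dependency lines that are not pinned to an exact version. It assumes a simplified pip-style requirements format (`name==version`), which is not one of the ecosystems listed above and is chosen only to keep the example short; the function name is hypothetical, and real manifests have more syntax than this handles.

```python
def unpinned_requirements(text: str) -> list[str]:
    """Return dependency lines that are not pinned to one exact version."""
    unpinned = []
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        _name, sep, version = line.partition("==")
        # Missing "==", an empty version, or a wildcard all count as unpinned.
        if not sep or not version.strip() or "*" in version:
            unpinned.append(line)
    return unpinned


manifest = """\
requests==2.31.0
flask>=2.0  # a range, not a pin

# tooling
numpy
urllib3==2.*
"""
violations = unpinned_requirements(manifest)
```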

To improve integrity and security, developers use the Signing capability to ensure that all code commits are cryptographically signed in their trusted environment. Signing commits contributes to the trust chain for the code. Signing keys are managed via a certificate store.

In addition, developers use the Scanning capability to look up vulnerabilities for the current version of each dependent component. For example, developers might learn that the current version of a component has an unacceptable vulnerability, and they can immediately mitigate that risk by switching to a version without the vulnerability.

The Policy capability is used to enforce policies available via a policy store. Example policies might include:

  • no code can be committed without signing
  • no code with a non-zero number of critical severity vulnerabilities can be committed
  • component versions must be pinned
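The example policies above could be evaluated together before a commit is accepted. The following Python sketch models that decision; the `CommitCandidate` record and `policy_violations` function are hypothetical, and the inputs (signature status, vulnerability count, unpinned components) are assumed to come from the Signing and Scanning capabilities.

```python
from dataclasses import dataclass, field


@dataclass
class CommitCandidate:
    is_signed: bool
    critical_vulnerabilities: int
    unpinned_components: list[str] = field(default_factory=list)


def policy_violations(commit: CommitCandidate) -> list[str]:
    """Evaluate the three example policies; an empty list means 'allow'."""
    violations = []
    if not commit.is_signed:
        violations.append("commit is not signed")
    if commit.critical_vulnerabilities > 0:
        violations.append(
            f"{commit.critical_vulnerabilities} critical vulnerabilities found"
        )
    if commit.unpinned_components:
        names = ", ".join(sorted(commit.unpinned_components))
        violations.append(f"unpinned components: {names}")
    return violations


clean = policy_violations(CommitCandidate(is_signed=True, critical_vulnerabilities=0))
blocked = policy_violations(
    CommitCandidate(is_signed=False, critical_vulnerabilities=2, unpinned_components=["numpy"])
)
```

Returning every violation, rather than stopping at the first, lets the developer fix all problems in one pass.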

The Knowledge Graph capability provides deeper insights into the software supply chain to ensure that risks can be identified and mitigated in new code releases.

Build

Within the build phase, the processes that create a software release must be supported to ensure the security and integrity of the software release.

Build phase capabilities - logical diagram

The build agent retrieves code from a code repository and pulls down dependencies from their associated component registries. Examples of component registries include the registries for package managers and container registries for container base images. Security and compliance teams may deny access to public component registries and enforce the use of internal/private registries instead. This is to ensure integrity across the entire software supply chain used to build the software release.

The build agent uses the code and components to build a software release. In containerized workloads, the release artifact is typically a container image.

The knowledge graph is used to store traceability details about the software build process and subsequent release.

Build-time policies are enforced through an initial policy gate to ensure that the software release has been built in a secure and compliant manner. An example is a policy that enforces pinning dependency versions for all components.

Security artifacts are built to improve the integrity and security of the software release:

  • The Software Composition Analysis capability is used to build a Software Bill of Materials (SBOM) which is a collection of all the components used to build the software release. This can also include information like version and license.
  • The Attestations capability is used to create any required attestations relevant to the building of the software release.
  • The Scanning capability is used to produce a collection of vulnerabilities for all the associated components.
  • All the security artifacts (SBOM, Attestations, Vulnerability Scans) produced are cryptographically signed using the Signing capability to ensure that any tampering of these artifacts can be detected.
  • Signing keys are managed via a certificate store.
  • The knowledge graph is used to associate each security artifact with the software release.
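The steps in this list can be sketched as follows: build a component inventory, serialize it deterministically, and record its digest against the release. The Python below is a minimal sketch under stated assumptions; the JSON layout is invented for illustration and does not follow any particular SBOM standard, and the `RELEASE_ARTIFACTS` dictionary merely stands in for the knowledge graph association.

```python
import hashlib
import json


def build_sbom(release: str, components: list[dict]) -> str:
    """Serialize a component inventory so equal inputs give equal bytes."""
    ordered = sorted(components, key=lambda c: (c["name"], c["version"]))
    document = {"release": release, "components": ordered}
    return json.dumps(document, sort_keys=True, separators=(",", ":"))


def digest_of(document: str) -> str:
    """Content digest used to reference the artifact unambiguously."""
    return "sha256:" + hashlib.sha256(document.encode("utf-8")).hexdigest()


components = [
    {"name": "requests", "version": "2.31.0", "license": "Apache-2.0"},
    {"name": "flask", "version": "3.0.0", "license": "BSD-3-Clause"},
]
sbom = build_sbom("shop-api:1.4.0", components)

# Hypothetical stand-in for associating artifacts with a release.
RELEASE_ARTIFACTS = {"shop-api:1.4.0": {"sbom": digest_of(sbom)}}
```

Sorting the components and keys before hashing means the digest depends only on the inventory's content, not on the order in which a tool happened to discover the components.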

Build-time policies are enforced through a second policy gate, using the additional information from the security artifacts to ensure that the software release is still secure and compliant. Example policies here include: no release without signed security artifacts, no release if any critical vulnerabilities are detected in components, and no release if any component uses a non-compliant license.
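A second policy gate of this kind could be modeled as a function over the evidence gathered during the build. The Python sketch below is illustrative only: the `ReleaseEvidence` record, the license allow-list, and the artifact names are assumptions, and signature verification is assumed to have happened before the evidence is assembled.

```python
from dataclasses import dataclass

# Hypothetical policy inputs; real values would come from a policy store.
ALLOWED_LICENSES = {"MIT", "Apache-2.0", "BSD-3-Clause"}
REQUIRED_ARTIFACTS = {"sbom", "attestation", "vulnerability-scan"}


@dataclass
class ReleaseEvidence:
    signed_artifacts: set[str]            # artifacts whose signatures verified
    vulnerability_severities: list[str]   # one entry per finding
    component_licenses: dict[str, str]    # component name -> license id


def gate_failures(evidence: ReleaseEvidence) -> list[str]:
    """Return the reasons a release must not be published (empty = pass)."""
    failures = []
    missing = REQUIRED_ARTIFACTS - evidence.signed_artifacts
    if missing:
        failures.append("missing signed artifacts: " + ", ".join(sorted(missing)))
    critical = sum(1 for s in evidence.vulnerability_severities if s.upper() == "CRITICAL")
    if critical:
        failures.append(f"{critical} critical vulnerabilities detected")
    bad = sorted(
        name for name, lic in evidence.component_licenses.items()
        if lic not in ALLOWED_LICENSES
    )
    if bad:
        failures.append("non-compliant licenses: " + ", ".join(bad))
    return failures


passing = gate_failures(ReleaseEvidence(
    signed_artifacts={"sbom", "attestation", "vulnerability-scan"},
    vulnerability_severities=["LOW", "MEDIUM"],
    component_licenses={"flask": "BSD-3-Clause"},
))
failing = gate_failures(ReleaseEvidence(
    signed_artifacts={"sbom"},
    vulnerability_severities=["critical", "HIGH"],
    component_licenses={"flask": "BSD-3-Clause", "somelib": "GPL-3.0-only"},
))
```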

The software release is published to an artifact repository, such as a container registry. The software release is associated with additional release details in the knowledge graph. Examples of additional release details include service (GitHub, Azure DevOps), repository (repo URL), and release commit (git commit). Some implementations may bundle the security artifacts with the software release in the artifact repository, treating the whole as a "package." Other implementations may rely on the security artifacts being referenced via the knowledge graph, with the artifact repository simply storing the software release.

Deploy

Within the deploy phase, a workload that references the software release is deployed to a Kubernetes cluster. Policy enforcement verifies the integrity and checks the security risks for the software release before allowing the deployment to be scheduled.

Deploy phase capabilities - logical diagram

The workload may be provisioned via either push-based (pipelines, control planes) or pull-based (GitOps) deployments, since the policy enforcement runs within the Kubernetes control plane boundary.

Kubernetes enforces a policy gate with policies from a policy store (see the policy capability). Policy enforcement verifies component signatures (see the signing capability) on container images and security artifacts. Policy enforcement may also use information from the artifact repository and/or Knowledge Graph to enforce rules around integrity and security of the workload being deployed.

Kubernetes pulls the container images required by the workload from the artifact repository, and then schedules them to run on a node. A mechanism, which can be local to the Kubernetes cluster or part of the knowledge graph system, uses the knowledge graph functionality to associate the deployed software release with operational details. An example of a relationship stored in the knowledge graph is "which cluster, in which region, is running the software release, which is composed of a workload definition and associated container images".

Operate

Within the operate phase, a regular policy audit process ensures that any changes in the integrity and security of deployed software releases are noted and acted upon.

Operate phase capabilities - logical diagram

Security artifacts like SBOMs and signatures are static and don't change over time. These artifacts ensure that the integrity of a software release can be verified. Artifacts like vulnerability scans are more dynamic and the risks they represent may change over time. For example, new vulnerabilities may be detected in components that were considered safe when first built or used in a software release.

Since new vulnerabilities are constantly emerging, it's critical to regularly monitor existing software releases to identify components for which new vulnerabilities have been identified. This is done with a scheduled job that scans software releases and audits their policy compliance to identify new security risks. New risks are associated with the software release and the identified components within the knowledge graph. These new risk associations in the knowledge graph can be used by the security team to proactively mitigate security risks, or by the deployment policy gates to automatically ensure that no new deployments of the software release will be accepted by the Kubernetes cluster.
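The core of such a scheduled job is a comparison between the components recorded for each release and the latest vulnerability data. The Python sketch below shows that comparison under simplifying assumptions: components are matched by exact name and version, the advisory identifiers and release names are made up, and plain dictionaries stand in for both the vulnerability database and the knowledge graph.

```python
def newly_affected(
    release_components: dict[str, set[tuple[str, str]]],
    advisories: dict[str, set[tuple[str, str]]],
    known_findings: set[tuple[str, str]],
) -> list[tuple[str, str]]:
    """Return (release, advisory) pairs not yet recorded as findings."""
    new = set()
    for release, components in release_components.items():
        for advisory_id, affected in advisories.items():
            # A release is exposed if it shares any component with the advisory.
            if components & affected and (release, advisory_id) not in known_findings:
                new.add((release, advisory_id))
    return sorted(new)


releases = {
    "shop-api:1.4.0": {("log4j-core", "2.14.1"), ("jackson-databind", "2.13.0")},
    "billing:2.0.0": {("log4j-core", "2.17.1")},
}
# Hypothetical advisory feed as seen by today's run of the job.
advisories = {
    "ADVISORY-0001": {("log4j-core", "2.14.1")},
    "ADVISORY-0002": {("jackson-databind", "2.13.0")},
}
# ADVISORY-0001 was already recorded by an earlier run.
known = {("shop-api:1.4.0", "ADVISORY-0001")}

new_findings = newly_affected(releases, advisories, known)
# Releases with new findings could be denied by the deployment policy gate.
blocked_releases = sorted({release for release, _ in new_findings})
```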

The Kubernetes cluster may also use the knowledge graph capability to associate runtime violations with the cluster, deployment, and software release. An example is a security component in Kubernetes identifying an attempt to escalate access privileges in a workload. This could indicate a security risk due to mis-configuration of the workload, or a vulnerability in the container image used by the workload.

Observability

The knowledge graph is used across all lifecycle phases to provide deep insights into the relationships between the software supply chain artifacts, and the systems and processes that touch them. This ensures traceability across the system and allows for a quick and efficient mechanism for identifying risks so that they can be mitigated.

Implementations

You will need to consider the following when using/building implementations for the solution:

  • Tooling Ecosystem - There are two major tooling ecosystem approaches: notation and sigstore/cosign. Each approach makes design decisions that should be understood, because they affect how security artifacts are stored, how their information can be accessed, and how they are signed and verified.
  • Signing Infrastructure - A good understanding of certificates, trust chains, and management of certificates is required since one of the core aspects of a secure software supply chain is signing.
  • Component Registry and Inventory - Package managers and SBOM tools are typically aligned well with a set of coding languages and/or ecosystems. A basic understanding of how each of these tools supports the software environment being used is important.
  • Vulnerability Infrastructure - Vulnerability databases are heavily skewed towards support for Linux only in the OSS world. The vulnerability database support for Windows is typically available in commercial offerings. Ensure a basic understanding of the software release OS requirements and how these may impact how and where you can do vulnerability scanning.
  • Attestations - Attestations are non-trivial to work with in the ecosystem at this time. Sigstore has made some progress in simplifying their use for simple cases. Consider what additional information within the secure software supply chain needs to be attested and consider using the knowledge graph to persist the attestation and its relationship to other artifacts.
  • Artifact Repository - Typically in containerized workloads, the container registry is the artifact repository. Depending on notation or cosign alignment, there are other considerations as to where the security artifacts and signatures will be stored. Depending on the implementation, the software release and security artifacts could be located in different systems. This is something to be aware of when building out the secure supply chain infrastructure.
  • Policy Infrastructure - A basic understanding of the various policy frameworks, ecosystems and infrastructure is required. Each has different approaches to policy language and integrations into various stages of the software supply chain. Consider also if there is a requirement around centralized policy management and the projection of those policies into the various stages of the software supply chain.
  • Knowledge Graph - This capability is in the early stages of being solved in the broader ecosystem and is not yet mature.

Notation-based secure software supply chain in Azure Kubernetes Service (AKS)

This implementation delivers a notation-based secure software supply chain in Azure built on Azure Kubernetes Service (AKS) and Azure Container Registry (ACR).

Notation-based implementation in AKS

Either Azure Pipelines or GitHub Actions can be used in the build phase to create the software release and security artifacts. Microsoft's sbom-tool is used to generate an SBOM from the source code, dependencies, and the container image packages. A vulnerability report is generated using Aqua Security's Trivy tool (only supports Linux workloads). Notation is used to sign the container image and security artifacts with a signing certificate stored in Azure Key Vault. The ORAS tool is used to bundle the signatures, security artifacts, and container image (software release), and then to publish the release bundle to Azure Container Registry.

Policy enforcement is enabled in the deploy phase by using Open Policy Agent's Gatekeeper and the Ratify verification tool. Ratify acts as an external data provider for Gatekeeper and facilitates exposing the appropriate information from the release bundle to the policy engine. Ratify verifies component signatures using a root CA certificate stored in its configuration.

The following policies are enforced:

  1. All images must be signed to ensure workload integrity
  2. All images must have an attached and signed SBOM and vulnerability scan result
  3. Images may only be retrieved from an allowed list of container registries
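The decision logic behind these three policies can be modeled in plain Python. This sketch is not how Gatekeeper or Ratify are configured; it only illustrates the checks, and the `ImageEvidence` record, the registry allow-list, and the artifact names are hypothetical. The signature results are assumed to be supplied by a verifier.

```python
from dataclasses import dataclass, field

# Hypothetical policy inputs for illustration.
ALLOWED_REGISTRIES = {"contoso.azurecr.io"}
REQUIRED_SIGNED_ARTIFACTS = {"sbom", "vulnerability-scan"}


@dataclass
class ImageEvidence:
    image: str                     # image reference from the workload spec
    image_signed: bool             # result of verifying the image signature
    signed_artifacts: set[str] = field(default_factory=set)


def registry_of(image: str) -> str:
    """Extract the registry host, following Docker's naming convention."""
    first, sep, _rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"  # references without a host default to Docker Hub


def admission_denials(evidence: ImageEvidence) -> list[str]:
    """Return the reasons to reject the workload (empty list = admit)."""
    denials = []
    if not evidence.image_signed:
        denials.append("image is not signed")
    missing = REQUIRED_SIGNED_ARTIFACTS - evidence.signed_artifacts
    if missing:
        denials.append("missing signed artifacts: " + ", ".join(sorted(missing)))
    registry = registry_of(evidence.image)
    if registry not in ALLOWED_REGISTRIES:
        denials.append(f"registry not allowed: {registry}")
    return denials


admitted = admission_denials(ImageEvidence(
    image="contoso.azurecr.io/shop/api:1.4.0",
    image_signed=True,
    signed_artifacts={"sbom", "vulnerability-scan"},
))
rejected = admission_denials(ImageEvidence(image="nginx:latest", image_signed=False))
```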
