SOFTWARE SUPPLY CHAIN SECURITY: CI/CD/CT PIPELINES AND SECURITY TOOLS: PART 2

Diagram: The Software Supply Chain Security

This is part two of a two-part blog post on Software Supply Chain Security. If you haven’t read Part 1 yet, I recommend starting there.

Software Scanning Tools

There are a variety of scanning tool capabilities that should be baked into your organization’s CI/CD/CT pipelines.

Static Application Security Testing (SAST)

SAST tools perform analysis of source code and (ideally) dependent packages to identify vulnerabilities.

Some examples of these tools include:

These tools will generate reports. Ideally, the tool has a mechanism to report back to the CI/CD/CT pipeline that invoked it whether any critical or high-severity vulnerabilities were identified. Your organization’s threshold for what counts as a “clean” report may vary.
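As a rough illustration, here is a minimal Python sketch of the kind of gate a pipeline step could apply to a scanner’s JSON report. The report file name and field names are hypothetical; every tool has its own output format.

    import json
    import sys

    # Hypothetical findings report produced by an earlier SAST scan step;
    # the file name and JSON fields will vary by tool.
    REPORT_PATH = "sast-report.json"
    BLOCKING_SEVERITIES = {"critical", "high"}

    with open(REPORT_PATH) as report_file:
        findings = json.load(report_file).get("findings", [])

    blocking = [item for item in findings
                if item.get("severity", "").lower() in BLOCKING_SEVERITIES]

    if blocking:
        print(f"Found {len(blocking)} critical/high finding(s); failing the build.")
        sys.exit(1)  # a non-zero exit code fails the pipeline step
    print("No critical or high findings; proceeding.")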

Tools in this space can often make language style recommendations or enforce coding standards. This can get annoying for developers who aren’t used to the style being enforced, but it can automate the more mundane aspects of code reviews.

Dynamic Application Security Testing (DAST)

Dynamic Application Security Testing (DAST) is a security testing method that identifies vulnerabilities in a running application by simulating attacks such as SQL injection, cross-site scripting, and other common exploits. These types of tools can only detect known vulnerabilities; the catalog of known vulnerabilities typically comes from one of the public vulnerability databases (see below). DAST tools evaluate an application, API, service, etc. from an external perspective, without needing access to the source code. In this way, you simulate what an attacker would do while probing your software system for a weakness that could be exploited.
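To make the “external perspective” concrete, here is a toy Python sketch of the kind of probe a DAST scanner automates at scale against thousands of known attack patterns. The URL is a placeholder, and real scanners do far more than this single reflected-XSS check.

    import requests

    # Placeholder URL for a temporary test deployment of the application.
    TARGET = "https://staging.example.com/search"
    PAYLOAD = "<script>alert('xss-probe')</script>"

    # Probe the running application the way an attacker would: from the outside,
    # with no access to the source code, checking whether the payload comes
    # back unescaped.
    response = requests.get(TARGET, params={"q": PAYLOAD}, timeout=10)

    if PAYLOAD in response.text:
        print("Possible reflected XSS: payload echoed back unescaped.")
    else:
        print("Payload was not reflected unescaped by this endpoint.")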

Ideally, a new build would have a temporary environment spun up, and these tools would be run against it to detect vulnerabilities and issues. These tools usually come in the form of a known-vulnerability scanner.

These tools will generate reports. Ideally, the tool has a mechanism to report back to the CI/CD pipeline that invoked it whether any critical or high-severity vulnerabilities were identified. Your organization’s threshold for what counts as a “clean” report may vary.

Software Composition Analysis (SCA)

This is sometimes referred to as dependency vulnerability tracking. The tool analyzes the packaging tool manifest (for example, package.json for npm) and compares the list of dependent packages (and the dependencies of those packages, and so on) against a list of known vulnerabilities, licensing issues, and security issues. If any vulnerabilities are found, it makes recommendations about which package version to upgrade to. The more sophisticated tools can automatically upgrade the packages. Something like GitHub Dependabot can even open Pull Requests (PRs, or Merge Requests, MRs, for the GitLab community) with the required upgrades. If you are not using a tool like this, there may be some manual steps involved in updating packages, submitting PRs, and approving PRs. However, once the PR is approved, all of the automated pipeline steps described here should happen automatically.
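The core idea can be sketched in a few lines of Python. The advisory data below is illustrative only; a real SCA tool resolves semantic version ranges, walks transitive dependencies, and pulls advisories from maintained vulnerability databases.

    import json

    # Illustrative advisory data; real tools query maintained databases.
    ADVISORIES = {
        "lodash": {"fixed_in": "4.17.21", "issue": "known vulnerability (example)"},
    }

    with open("package.json") as manifest:
        dependencies = json.load(manifest).get("dependencies", {})

    for name, declared_version in dependencies.items():
        advisory = ADVISORIES.get(name)
        if advisory:
            # A real tool compares the resolved version against the advisory's
            # affected range and checks licenses as well.
            print(f"{name} {declared_version}: {advisory['issue']}, "
                  f"recommend upgrading to {advisory['fixed_in']}")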

For licensing issues that are identified, the solution may require a more involved process to identify an alternative package with similar functionality that has the desired licensing terms.

A report of what was updated should be generated and be available for later review.

In order for these tools to be useful (read: functional), large databases of package dependencies, known vulnerabilities, licensing information, etc. must be maintained. Much of this work is addressed by the major opensource packaging tools and the vulnerability tracking databases. But this creates dependencies that these tools’ maintainers have no control over. For a proprietary product that maintains its own information databases, this isn’t an issue, but it is likely much more expensive.

Some examples of SCA tools include:

Software Bill of Materials (SBOM)

A product Bill of Materials in the manufacturing process is the full list of all the materials, components, and parts that are required to produce the finished product. This includes the name, description, quantity, and cost of each item.

The SBOM is the software equivalent of a product bill of materials, minus the cost of each item. In some cases you might be able to include cost, but it can be difficult to describe accurately.

OWASP has defined a standard format for SBOMs called CycloneDX. The Linux Foundation has another SBOM format spec called SPDX. The former focuses on vulnerability tracking; the latter focuses on license compliance. As an example, the GitHub Software Bill of Materials functionality uses the SPDX format.

Building an SBOM for the whole application/platform can be challenging if you are doing it per git repo (and multiple git repos are used). On the other hand, if you are following a mono-repo model and GitOps, you may be able to capture the entire application and infrastructure details in a single SBOM report using an out-of-the-box solution.

Even if you have to stitch an SBOM together from multiple component reports, you’ve got to start somewhere.
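As a sketch of that stitching step, the following Python merges per-repo CycloneDX JSON reports into a single component list. The file names are placeholders, and a real merge would also need to reconcile metadata, dependency graphs, and duplicate components more carefully.

    import glob
    import json

    merged = {"bomFormat": "CycloneDX", "specVersion": "1.5", "components": []}
    seen = set()

    # Assume each repo's pipeline dropped its SBOM into a shared directory.
    for path in glob.glob("sboms/*.cdx.json"):
        with open(path) as sbom_file:
            for component in json.load(sbom_file).get("components", []):
                key = (component.get("name"), component.get("version"))
                if key not in seen:  # naive de-duplication
                    seen.add(key)
                    merged["components"].append(component)

    with open("combined-sbom.cdx.json", "w") as out_file:
        json.dump(merged, out_file, indent=2)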

The SBOM is one of the reports / artifacts that should be available for every release of your software.

Patch Management

First, the scope of patch management goes far beyond the Software Supply Chain for the software your organization develops / maintains. The goal here is not to explore the topic in full, but to cover it as it applies to the Software Supply Chain for software you develop / maintain.

Your organization’s servers and Virtual Machines (VMs) most likely have some type of automated or partially-automated patching tool / solution in place. Likewise, devices (laptops, workstations, phones, etc.) each have their own patching solution. Most of the time there will be some type of staged rollout defined so that not every device gets a potentially bad patch at the same time. The patch management solution may have shipped with the OS, or there may be a larger command and control solution orchestrating it all. Most organizations I’ve worked with have various solutions in place, but typically there is no single global solution coordinating all patching. It’s possible, just not common. Even if an organization does have this in place, the group(s) managing it are probably not involved in the internally-developed software supply chain and how it keeps software patches up to date.

The IT department group that manages servers (if that is a distinct job function at this point) may have a role to play in terms of patching self-hosted tools that make up the Software Supply Chain, but it’s also possible that every capability mentioned in this blog post is part of a SaaS solution or baked into a SaaS CI/CD/CT pipeline solution.

This is more about an organized process / program than about a specific tool.

There will be multiple sources of security patches for your software.

  • Container base image updates. For production builds, use base images that are meant for production and have actually been hardened by the community. Do not use developer images for production software builds.
  • Running the OS package update tool (dnf update, apt-get update && apt-get upgrade, etc.) on the base container image, VM, etc.
  • Package Dependency analysis + recommended upgrades.
  • Language runtime updates (think updates to the JVM or Node.js binaries plus supporting libraries)
  • Vulnerability scanner (DAST) recommendations.
  • Recommended updates from Software Composition Analysis (SCA) tools.
  • Source code scanner (SAST tools) style & security recommendations.
  • Opensource, commercial, or government-sponsored vulnerability databases (such as the NVD, the CISA KEV Catalog, the CVE Program, the Snyk Vulnerability Database, CVEDetails.com, and vuldb.com). Some of these are source databases where vulnerabilities are first cataloged. Others are copies or amalgamations of upstream databases.
  • Others.

Your software development organization should have a monthly process to apply security patches from all of these different sources. Furthermore, there should be a critical patching process that can react within 24 hours for critical patches; this must be limited in scope to the smallest change possible to avoid introducing additional, unexpected problems.

An exception process, managed by an information security team, must be defined to address any situation that cannot satisfy a 30-day (24 hours for critical vulnerabilities / patches) turnaround for patching. This process must include a tracking catalog of granted exceptions and follow-up to ensure issues are addressed in a timely manner. This process ties back to the organization’s risk management.
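A small Python sketch of how those SLA windows might be tracked is shown below. The findings and identifiers are placeholders; in practice this data would come from your scanners and the exception tracking catalog.

    from datetime import datetime, timedelta

    # SLA windows matching the process described above: 24 hours for critical
    # vulnerabilities, 30 days for everything else.
    SLA = {"critical": timedelta(hours=24), "default": timedelta(days=30)}

    # Placeholder findings; real data would come from scanners / SCA tooling.
    findings = [
        {"id": "EXAMPLE-0001", "severity": "critical", "reported": "2024-06-01"},
        {"id": "EXAMPLE-0002", "severity": "medium", "reported": "2024-05-15"},
    ]

    now = datetime.now()
    for finding in findings:
        reported = datetime.strptime(finding["reported"], "%Y-%m-%d")
        due = reported + SLA.get(finding["severity"], SLA["default"])
        status = "OVERDUE - needs an exception" if now > due else "within SLA"
        print(f"{finding['id']} ({finding['severity']}): due {due.date()} - {status}")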

Vulnerability Management

Vulnerability Management is an ongoing, proactive, and pseudo-automated process that keeps all network nodes and software packages in your environment secure against known exploits. Vulnerability Management identifies, assesses, and addresses weaknesses (known and potential). This can help organizations prevent attacks and minimize damage in the event one does occur. The goal is to minimize risk by mitigating as many vulnerabilities as possible; let’s call this the low-hanging fruit.

The goal of vulnerability management is to reduce the organization’s overall risk exposure by mitigating as many vulnerabilities as possible. This can be a challenging task, given the number of potential vulnerabilities and the limited resources available for remediation. Vulnerability management should be a continuous process to keep up with new and emerging threats and changing environments.

How vulnerability management works

A Vulnerability Management program typically includes:

  • Asset / Configuration Management (see above)
  • Vulnerability Scanning (DAST capability mentioned earlier, see above)
  • Patch Management (see above)
  • Security Incident and Event Management (SIEM)
  • Pen testing

This is more about an organized process / program than about a specific tool.

Peer Code Reviews

No matter how much automation your Software Supply Chain pipelines incorporate, there need to be peer reviews of all Pull Requests. Yes, this part is not automated. Do not hand this over to whatever your favorite AI developer tool is. Some of this you might be able to automate with SAST tools enforcing certain standards. But having a human being who understands the context review everything that is going to be merged into the “main” branch is invaluable.

This is not the fun part of the software engineer’s day, but it is one of the more important parts of the job. Take it seriously and spread the responsibility around. The person doing the PR review / approval should have some familiarity with the code being modified. Mature developers will cooperate with this process. The ones who fight you every step of the way may be exactly the ones your organization doesn’t really need.

Release Management

Release Management is about coordinating software releases to satisfy business goals while maintaining quality. Building the release and supporting artifacts should be automated within the CI/CD/CT pipelines.

Your software releases should have a defined version numbering scheme. I like to use an “M.N.Build” format, where M = Major version number, N = Minor version number, and Build = the build number. A unique build number can be obtained in GitHub Actions with ${{ github.run_id }}. The version numbering scheme is arbitrary; consistency is what matters most.
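For example, a build step could assemble the version string from values kept in the repo plus the run identifier GitHub Actions exposes to the job. The major/minor values below are placeholders, and this is only one way to wire it up.

    import os

    # Major / minor would typically live in a version file or tag in the repo;
    # these values are placeholders.
    MAJOR = 2
    MINOR = 4

    # GITHUB_RUN_ID is set automatically inside GitHub Actions; fall back to a
    # local marker when running outside the pipeline.
    build = os.environ.get("GITHUB_RUN_ID", "0-local")

    version = f"{MAJOR}.{MINOR}.{build}"
    print(version)  # e.g. 2.4.9876543210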

A conscious decision must also be made regarding how many concurrently supported release versions will be available. For example, a policy might be that two Major releases (each with a unique minor version) are supported at any given time. Again, this is arbitrary and highly dependent on the use cases and how easy it is to push out software releases.

Typically, a release corresponds to a dedicated branch. We’ll call it a release branch. This is part of the branching strategy. A release could also correspond to a git tag. Again, consistency is the important part of this.

The Source Code Management platform, Artifact Repository, or Container Registry can be used to store the release artifacts. The nature of the artifact depends on the software architecture: web application (self-hosted, on-prem, cloud, SaaS), native desktop app, native mobile app, microservice(s), container images, etc.

Actively maintained releases (currently supported) are targeted in the patching process (see Patch Management).

Software Distribution

How software is distributed depends on the purpose of the software and its architecture (see the Release Management section). Software could be distributed using cloud platform automation tools, package management tools, the automatic update / distribution tools built into mobile platforms or desktop OSes, or maybe just an old-fashioned manual process.

Secure Software Architecture

Diagram: Secure Software Architecture (SPA with API backend)

Automated Testing

Your organization’s Software Supply Chain should include an automated test suite that can be kicked off manually or as part of a CT pipeline.

Standardize on a programmatic testing framework such as:

There are different types of testing that can be done:

  • Unit Testing
  • Integration Testing
  • UI Testing
  • API Testing

All of these test types can be automated.

The goal of the test suite is to produce a more reliable code base and ensure that bugs in core functionality are not introduced with each PR.

Unit Testing

A unit test is a relatively simple (small, easily understood) test that exercises a block of code (possibly a single function) to validate the expected outcome under various possible inputs (both positive and negative scenarios). The developer should produce unit tests at the same time the code is produced. Some schools of thought suggest the unit test should be written before the actual code. The unit tests are meant for internal use and are mainly useful to whoever is developing or maintaining the code.

A unit test’s runtime dependencies should be mocked or stubbed out. There shouldn’t be dependencies on external systems; unit tests check for internal consistency rather than interaction with external systems.

These tests will be done with one of the programmatic test frameworks.
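A minimal example, assuming pytest and a hypothetical pricing-service dependency, looks something like this:

    from unittest.mock import Mock

    # Hypothetical unit under test: it applies a discount fetched from an
    # external pricing service, which we do not want to call in a unit test.
    def apply_discount(price: float, pricing_service) -> float:
        rate = pricing_service.get_discount_rate()
        return round(price * (1 - rate), 2)

    def test_apply_discount_uses_service_rate():
        service = Mock()
        service.get_discount_rate.return_value = 0.10  # stubbed external call
        assert apply_discount(100.0, service) == 90.0

    def test_apply_discount_handles_zero_rate():
        service = Mock()
        service.get_discount_rate.return_value = 0.0
        assert apply_discount(100.0, service) == 100.0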

Integration Testing

Integration tests are end-to-end tests of the application / system / platform, including calls to external systems. Integration tests exercise all API endpoints and all entry points into the application. Creating a proper integration test suite takes much more effort than writing unit tests.

The best approach is to spin up a self-contained environment that can process the integration tests. Stub out any external calls to systems that you aren’t specifically trying to test but are still dependencies.

This is usually done with one of the programmatic test frameworks.

Interactive / UI Testing

This consists of using headless browser testing frameworks (such as Selenium or Playwright) to simulate what a regular user would do in a browser.
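A small sketch using Playwright’s Python bindings is shown below; the URL, selectors, and expected page title are all placeholders for your own application.

    from playwright.sync_api import sync_playwright

    APP_URL = "https://staging.example.com/login"  # placeholder

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(APP_URL)
        page.fill("#username", "test-user")
        page.fill("#password", "test-password")
        page.click("button[type=submit]")
        # Assert that the post-login landing page loaded.
        assert "Dashboard" in page.title()
        browser.close()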

API Testing

Another common form of testing for most modern platforms / applications is to exercise each API endpoint / operational function by simulating an API Consumer. This includes positive and negative test cases using a variety of inputs (data fuzzing). This can be done with JMeter or one of the programmatic test frameworks and an HTTP(S) client library.

Each API endpoint should also include a set of tests that use valid credentials and invalid credentials. If an OAuth2 Access Token is the credential, for example, then this type of testing could include a valid access token, an expired token, a corrupted token, a valid token with the wrong scope, a token with the wrong audience, etc. It is strongly recommended that every endpoint have this set of tests executed against it to validate the authentication + authorization policy that protects your APIs.
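A sketch of such a credential test matrix, assuming pytest and the requests library, might look like the following. The endpoint, tokens, and expected status codes are placeholders; a real suite would mint (or deliberately corrupt) tokens in fixtures.

    import pytest
    import requests

    ENDPOINT = "https://api.example.com/v1/orders"  # placeholder

    SCENARIOS = [
        ("valid token", "VALID_TOKEN", 200),
        ("expired token", "EXPIRED_TOKEN", 401),
        ("corrupted token", "not.a.real.jwt", 401),
        ("wrong scope", "WRONG_SCOPE_TOKEN", 403),
        ("missing token", None, 401),
    ]

    @pytest.mark.parametrize("name,token,expected_status", SCENARIOS)
    def test_endpoint_authorization(name, token, expected_status):
        headers = {"Authorization": f"Bearer {token}"} if token else {}
        response = requests.get(ENDPOINT, headers=headers, timeout=10)
        assert response.status_code == expected_status, name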

Secure Credential Storage

There should be a common mechanism for the pipelines to use to securely store credentials used by the pipeline at runtime. Some examples of this capability are:

  • GitHub Secrets
  • HashiCorp Vault

This can often be built into the build / deploy automation tooling — such as with GitHub Secrets.

If a Secure Credential Storage solution external to the build pipeline platform is used, then there is the question of how one securely accesses the credential storage itself. Where do you store the username and password used to log into credential storage? There is no one-size-fits-all answer. In the case of GitHub, one could store secrets in an Azure Key Vault and authenticate to Azure Active Directory using Workload Identity Federation. Many of the products in this space support something similar, but not all. The mechanics of this authentication layer must be one of the criteria used to choose the right solution for your use case. If you wait until after the product decision has been made, you will likely end up with a messy situation that is less than ideal.
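As one hedged example of that pattern, a pipeline step using Azure’s Python SDKs might look roughly like this. The vault URL and secret name are placeholders, and the federated identity setup happens outside this code.

    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    VAULT_URL = "https://my-pipeline-vault.vault.azure.net"  # placeholder

    # On a runner configured for Workload Identity Federation, the credential
    # chain can resolve a token without any long-lived stored password.
    credential = DefaultAzureCredential()
    client = SecretClient(vault_url=VAULT_URL, credential=credential)

    deploy_token = client.get_secret("deploy-api-token").value  # placeholder name
    # Use the secret; never print or log it.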

Build Process Audit Capability (SIEM)

The CI/CD/CT pipelines and all supporting tools should maintain logs that allow an audit or information security team to reconstruct all activities. All logs should be recorded and pushed to a central logging platform.

A complete set of artifacts generated by the various capabilities should be generated and stored for every build / deploy / test cycle (every pipeline run).

Important events should be captured and forwarded to a Security Information and Event Management (SIEM) platform. The SIEM is basically a logging platform for interesting security details. It consolidates the organization’s security information and events in real (or near) time.

For CI/CD/CT pipelines and the Software Supply Chain, it serves the same purpose.

For some organizations, the SIEM and central logging are the same system.
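As a sketch, a pipeline step could forward a structured event to the SIEM’s HTTP collector. The collector URL, token variable, and payload schema below are placeholders, since every SIEM defines its own ingestion format.

    import json
    import os
    from datetime import datetime, timezone

    import requests

    # Placeholder collector endpoint and token; most SIEMs expose some form of
    # HTTP event collector with its own payload schema and auth scheme.
    SIEM_URL = os.environ.get("SIEM_COLLECTOR_URL", "https://siem.example.com/ingest")
    SIEM_TOKEN = os.environ.get("SIEM_TOKEN", "")

    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "source": "ci-pipeline",
        "action": "artifact_published",
        "repo": os.environ.get("GITHUB_REPOSITORY", "unknown"),
        "run_id": os.environ.get("GITHUB_RUN_ID", "unknown"),
        "actor": os.environ.get("GITHUB_ACTOR", "unknown"),
    }

    requests.post(
        SIEM_URL,
        headers={"Authorization": f"Bearer {SIEM_TOKEN}"},
        data=json.dumps(event),
        timeout=10,
    )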

Opensource Software Security / Management

Virtually every software project one will encounter today uses Open Source Software (OSS) to some extent. If your goal is to avoid OSS, that could be challenging in this day and age. Most projects I’ve encountered pull in millions of lines of OSS code through package dependencies.

Every OSS package, project, etc. is going to have a license that describes what can be done with it. Every organization should have IT and legal in agreement regarding which OSS licenses are acceptable for the organization. I’m not a lawyer and I’m not going to comment further on this piece. The SAST and SCA products will often have functionality to manage and track the licenses used by each OSS component and all of its dependencies.

The other side of using OSS software securely is keeping dependencies up-to-date with the latest known secure versions of package dependencies. This ties back to your SAST, DAST, and SCA tools plus Patch management — see the earlier discussion on each of these.

If you wanted to go further, you could do proper code reviews of all the OSS dependencies for your project. This quickly leads to trying to minimize the number of OSS dependencies to make the effort manageable. The fewer dependencies, the easier this effort, and securing your code base, will be. I haven’t had many clients / projects where this was seriously considered. When it did come up, it was for security products or embedded / IoT devices.

I’ve also thought about a third-party service that provides an NPM, Maven, or other package manager repository containing packages that have already been reviewed by that company. This would likely be a daunting task that requires ongoing maintenance. Between the fees that would have to be charged to customers and the liability involved in missing something, this probably doesn’t make for a viable business model. I haven’t seen anyone offering such a service, but it sure would make things easier.

Source of Base Images

Not everything is containerized these days, but many applications are. You could be using raw Kubernetes, OpenShift, Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS), Google Kubernetes Engine (GKE), Podman, Docker Swarm, or similar. You could be using Docker containers or Open Container Initiative (OCI) images. Regardless of the details, you need to make some conscious decisions about where you will be getting base container images from.

Common sources of base images include DockerHub and Red Hat Universal Base Images. There are also third parties that offer curated sets of base images; if it is within your organization’s budget, that may be a good idea.

Regardless of where your base images come from, you should always do the following:

  • Pull base images from trusted sources.
  • For something like DockerHub, you must use well-known, trusted sources. Most of the major Linux distributions publish an official set of container images.
  • Use a production-ready base image that has a minimum set of tools and features included.
  • You can use a “development” image for dev / testing activities, then use a minimal / production image for the final build.
  • Scan the base image for vulnerabilities using a container image scanning service that comes with most of the major container registry service providers.
  • Run the OS package update tool on the container image to pull in updates that have been released by the distribution, but not yet pulled into the image.
  • Update the images at least once per month as part of the patching process.

If scans of your base image come back with thousands of vulnerabilities and recommended fixes / upgrades, use another base image. Preferably, one that is meant for production use and maintained by someone who is actually trying.

Identity (Single Sign On for Developers)

All tools used in the development process and Software Supply Chain should require individual user identities that are authenticated through a Federated Single Sign On (FSSO) mechanism. If you use Google Workspace as your email and groupware provider, then use Google to authenticate developer and admin users to each tool. Furthermore, Multi-Factor Authentication (MFA) should be required to access each of these tools. Again, this can be centrally managed at the organization’s IdP.

Identity (system-to-system communication)

Some of the tools listed here will be managed / hosted / SaaS solutions; some will be self-hosted in the cloud or in on-prem data centers. Regardless of where the tool is, there is likely to be communication between the CI/CD/CT pipelines and some type of backend / runner that requires authentication. Maybe there is an API key, OAuth2 Client Credentials, a username / password (don’t use an individual human user’s credentials), an X.509 client certificate, or something similar. All the best practices for this type of communication apply here (a small sketch of the OAuth2 Client Credentials case follows the list below).

  • Don’t log secrets.
  • Use secure credential storage mechanisms.
  • Don’t reuse credentials.
  • Keep access for these credentials limited to the minimum needed to accomplish the task (Principle of Least Privilege).
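Here is the OAuth2 Client Credentials sketch referenced above, in Python with the requests library. The token endpoint and scope are placeholders, and the client secret is read from secure credential storage via an environment variable rather than being hard-coded or logged.

    import os

    import requests

    TOKEN_URL = "https://auth.example.com/oauth2/token"  # placeholder

    response = requests.post(
        TOKEN_URL,
        data={
            "grant_type": "client_credentials",
            "client_id": os.environ["PIPELINE_CLIENT_ID"],
            "client_secret": os.environ["PIPELINE_CLIENT_SECRET"],
            "scope": "deploy:write",  # placeholder scope
        },
        timeout=10,
    )
    response.raise_for_status()
    access_token = response.json()["access_token"]
    # Use the token for the system-to-system call; do not print or log it.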

Developer / DevOps Engineer Device Security

The device (most likely a laptop or workstation) where development and DevOps work is performed should be managed by the organization, with a balance struck between security lockdown and usability for the development community. The diagram above calls out having the following services / capabilities deployed on the developer’s device:

  • Endpoint security solution
  • Virus / malware scanner
  • Encrypted file system
  • Firewall
  • File system integrity monitoring
  • Audit logging
  • VPN Client (if applicable)
  • Centrally-managed identity

Most of these capabilities can be implemented with a single commercial package. Most of them can also be accomplished with a combination of several OSS tools on Windows or Linux.

On a couple of projects, instead of working directly on the laptop, developers ran a VM locally where all of the development activity was performed.

Summary

I haven’t seen two organizations do the software supply chain the same way. The security details are certainly unique as well. If you have the available budget, you can often get many of the supply chain capabilities and supply chain security capabilities from a single vendor (or a relatively short list). It’s possible to build most of these software supply chain security capabilities using OSS software, but it will result in a complex pipeline with a bunch of moving parts.

Feel free to post any comments or suggestions below.

In a follow-up post, we will explore expanding this model to include AI / LLM supply chain security capabilities.

AI / GenAI / ChatGPT / etc were not used to generate this article.

Diagram created with Lucid.App.
