Google Online Security Blog: June 2021

Announcing a unified vulnerability schema for open source

June 24, 2021

Posted by Oliver Chang, Google Open Source Security team and Russ Cox, Go team
In recent months, Google has launched several efforts to strengthen open-source security on multiple fronts. One important focus is improving how we identify and respond to known security vulnerabilities without doing extensive manual work. It is essential to have a precise common data format to triage and remediate security vulnerabilities, particularly when communicating about risks to affected dependencies—it enables easier automation and empowers consumers of open-source software to know when they are impacted and make security fixes as soon as possible.

We released the Open Source Vulnerabilities (OSV) database in February with the goal of automating and improving vulnerability triage for developers and users of open source software. This initial effort was bootstrapped with a dataset of a few thousand vulnerabilities from the OSS-Fuzz project. Implementing OSV to communicate precise vulnerability data for hundreds of critical open-source projects proved the success and utility of the format, and garnered feedback to help us improve the project; for example, we dropped the Cloud API key requirement, making the database even easier to access by more users. The community response also showed that there was broad interest in extending the effort further.

Today, we’re excited to announce a new milestone in expanding OSV to several key open-source ecosystems: Go, Rust, Python, and DWF. This expansion unites and aggregates four important vulnerability databases, giving software developers a better way to track and remediate the security issues that affect them. Our effort also aligns with the recent US Executive Order on Improving the Nation’s Cybersecurity, which emphasized the need to remove barriers to sharing threat information in order to strengthen national infrastructure. This expanded shared vulnerability database marks an important step toward creating a more secure open-source environment for all users.

A simple, unified schema for describing vulnerabilities precisely

As with open source development, vulnerability databases in open source follow a distributed model, with many ecosystems and organizations creating their own database. Since each uses their own format to describe vulnerabilities, a client tracking vulnerabilities across multiple databases must handle each completely separately. Sharing of vulnerabilities between databases is also difficult.

The Google Open Source Security team, Go team, and the broader open-source community have been developing a simple vulnerability interchange schema for describing vulnerabilities that’s designed from the beginning for open-source ecosystems. After starting work on the schema a few months ago, we requested public feedback and received hundreds of comments. We have incorporated the input from readers to arrive at the current schema:

{

"id": string,

"modified": string,

"published": string,

"withdrawn": string,

"aliases": [ string ],

"related": [ string ],

"package": {

"ecosystem": string,

"name": string,

"purl": string,

"summary": string,

"details": string,

"affects": [ {

"ranges": [ {

"type": string,

"repo": string,

"introduced": string,

"fixed": string

} ],

"versions": [ string ]

} ],

"references": [ {

"type": string,

"url": string

} ],

"ecosystem_specific": { see spec },

"database_specific": { see spec },

}

This new vulnerability schema aims to address some key problems with managing vulnerabilities in open source. We found that there was no existing standard format which:

Enforces version specification that precisely matches naming and versioning schemes used in actual open source package ecosystems. For instance, matching a vulnerability such as a CVE to a package name and set of versions in a package manager is difficult to do in an automated way using existing mechanisms such as CPEs.
Can be used to describe vulnerabilities in any open source ecosystem, while not requiring ecosystem-dependent logic to process them.
Is easy to use by both automated systems and humans.

With this schema we hope to define a format that all vulnerability databases can export. A unified format means that vulnerability databases, open source users, and security researchers can easily share tooling and consume vulnerabilities across all of open source. This means a more complete view of vulnerabilities in open source for everyone, as well as faster detection and remediation times resulting from easier automation.

The current state

The vulnerability schema spec has gone through several iterations, and we are inviting further feedback as it gets closer to finalized. A number of public vulnerability databases today are already exporting this format, with more in the pipeline:

Go vulnerability database for Go packages
Rust advisory database for Cargo packages
Python advisory database for PyPI packages
DWF database for vulnerabilities in the Linux kernel and other popular software
OSS-Fuzz database for vulnerabilities in C/C++ software found by OSS-Fuzz

The OSV service has also aggregated all of these vulnerability databases, which are viewable at our web UI. They can also be queried with a single command via the same existing APIs:

curl -X POST -d \

'{"commit": "a46c08c533cfdf10260e74e2c03fa84a13b6c456"}' \

"https://api.osv.dev/v1/query"

curl -X POST -d \

'{"version": "2.4.1", "package": {"name": "jinja2", "ecosystem": "PyPI"}}' \

"https://api.osv.dev/v1/query"

Automating vulnerability database maintenance

Producing quality vulnerability data is also difficult. In addition to OSV’s existing automation, we built more automation tools for vulnerability database maintenance, and used these tools to bootstrap the community Python advisory database. This automation takes existing feeds, accurately matches them to packages, and generates entries containing precise, validated version ranges with minimal human intervention. We plan to extend this tooling to other ecosystems for which there is no existing vulnerability database, or little support for ongoing database maintenance.

Get involved

Thank you to all the open source developers who have provided feedback and adopted this format. We’re continuing to work with open source communities to develop this further and earn more widespread adoption in all ecosystems. If you are interested in adopting this format, we’d appreciate any feedback on our public spec.

Get ready for the 2021 Google CTF

June 18, 2021

Posted by Kristoffer Janke, Information Security Engineer

Are you ready for no sleep, no chill and a lot of hacking? Our annual Google CTF is back!

The competition kicks off on Saturday July 17 00:00:01 AM UTC and runs through Sunday July 18 23:59:59 UTC. Teams can register at http://goo.gle/ctf.

Just like last year, the top 16 teams will qualify for our Hackceler8 speed run and the chance to take home a total of $30,301.70 in prize money.

As we reminisce on last years event, we’d be remiss if we didn’t recognize our 2020 winning teams:

Plaid Parliament of Pwning
I Use Bing
pasten
The Flat Network Society

We are eager to see if they can defend their leet status. For those interested, we have published all 2020 Hackceler8 videos for your viewing pleasure here.

Whether you’re a seasoned CTF player or just curious about cyber security and ethical hacking, we want you to join us. Sign up to learn skills, meet new friends in the security community and even watch the pros in action. For the latest announcements, see g.co/ctf, subscribe to our mailing list or follow us on @GoogleVRP. See you there!

P.S. Curious about last year’s Google CTF challenges? We open-sourced them here.

Introducing SLSA, an End-to-End Framework for Supply Chain Integrity

June 16, 2021

Posted Kim Lewandowski, Google Open Source Security Team & Mark Lodato, Binary Authorization for Borg Team

Supply chain integrity attacks—unauthorized modifications to software packages—have been on the rise in the past two years, and are proving to be common and reliable attack vectors that affect all consumers of software. The software development and deployment supply chain is quite complicated, with numerous threats along the source ➞ build ➞ publish workflow. While point solutions do exist for some specific vulnerabilities, there is no comprehensive end-to-end framework that both defines how to mitigate threats across the software supply chain, and provides reasonable security guarantees. There is an urgent need for a solution in the face of the eye-opening, multi-billion dollar attacks in recent months (e.g. SolarWinds, Codecov), some of which could have been prevented or made more difficult had such a framework been adopted by software developers and consumers.

Our proposed solution is Supply chain Levels for Software Artifacts (SLSA, pronounced “salsa”), an end-to-end framework for ensuring the integrity of software artifacts throughout the software supply chain. It is inspired by Google’s internal “Binary Authorization for Borg” which has been in use for the past 8+ years and is mandatory for all of Google's production workloads. The goal of SLSA is to improve the state of the industry, particularly open source, to defend against the most pressing integrity threats. With SLSA, consumers can make informed choices about the security posture of the software they consume.

How SLSA helps

SLSA helps to protect against common supply chain attacks. The following image illustrates a typical software supply chain and includes examples of attacks that can occur at every link in the chain. Each type of attack has occurred over the past several years and, unfortunately, is increasing as time goes on.

	Threat	Known example	How SLSA could have helped
A	Submit bad code to the source repository	Linux hypocrite commits: Researcher attempted to intentionally introduce vulnerabilities into the Linux kernel via patches on the mailing list.	Two-person review caught most, but not all, of the vulnerabilities.
B	Compromise source control platform	PHP: Attacker compromised PHP’s self-hosted git server and injected two malicious commits.	A better-protected source code platform would have been a much harder target for the attackers.
C	Build with official process but from code not matching source control	Webmin: Attacker modified the build infrastructure to use source files not matching source control.	A SLSA-compliant build server would have produced provenance identifying the actual sources used, allowing consumers to detect such tampering.
D	Compromise build platform	SolarWinds: Attacker compromised the build platform and installed an implant that injected malicious behavior during each build.	Higher SLSA levels require stronger security controls for the build platform, making it more difficult to compromise and gain persistence.
E	Use bad dependency (i.e. A-H, recursively)	event-stream: Attacker added an innocuous dependency and then updated the dependency to add malicious behavior. The update did not match the code submitted to GitHub (i.e. attack F).	Applying SLSA recursively to all dependencies would have prevented this particular vector, because the provenance would have indicated that it either wasn’t built from a proper builder or that the source did not come from GitHub.
F	Upload an artifact that was not built by the CI/CD system	CodeCov: Attacker used leaked credentials to upload a malicious artifact to a GCS bucket, from which users download directly.	Provenance of the artifact in the GCS bucket would have shown that the artifact was not built in the expected manner from the expected source repo.
G	Compromise package repository	Attacks on Package Mirrors: Researcher ran mirrors for several popular package repositories, which could have been used to serve malicious packages.	Similar to above (F), provenance of the malicious artifacts would have shown that they were not built as expected or from the expected source repo.
H	Trick consumer into using bad package	Browserify typosquatting: Attacker uploaded a malicious package with a similar name as the original.	SLSA does not directly address this threat, but provenance linking back to source control can enable and enhance other solutions.

What is SLSA

In its current state, SLSA is a set of incrementally adoptable security guidelines being established by industry consensus. In its final form, SLSA will differ from a list of best practices in its enforceability: it will support the automatic creation of auditable metadata that can be fed into policy engines to give "SLSA certification" to a particular package or build platform. SLSA is designed to be incremental and actionable, and to provide security benefits at every step. Once an artifact qualifies at the highest level, consumers can have confidence that it has not been tampered with and can be securely traced back to source—something that is difficult, if not impossible, to do with most software today.

SLSA consists of four levels, with SLSA 4 representing the ideal end state. The lower levels represent incremental milestones with corresponding incremental integrity guarantees. The requirements are currently defined as follows.

SLSA 1 requires that the build process be fully scripted/automated and generate provenance. Provenance is metadata about how an artifact was built, including the build process, top-level source, and dependencies. Knowing the provenance allows software consumers to make risk-based security decisions. Though provenance at SLSA 1 does not protect against tampering, it offers a basic level of code source identification and may aid in vulnerability management.

SLSA 2 requires using version control and a hosted build service that generates authenticated provenance. These additional requirements give the consumer greater confidence in the origin of the software. At this level, the provenance prevents tampering to the extent that the build service is trusted. SLSA 2 also provides an easy upgrade path to SLSA 3.

SLSA 3 further requires that the source and build platforms meet specific standards to guarantee the auditability of the source and the integrity of the provenance, respectively. We envision an accreditation process whereby auditors certify that platforms meet the requirements, which consumers can then rely on. SLSA 3 provides much stronger protections against tampering than earlier levels by preventing specific classes of threats, such as cross-build contamination.

SLSA 4 is currently the highest level, requiring two-person review of all changes and a hermetic, reproducible build process. Two-person review is an industry best practice for catching mistakes and deterring bad behavior. Hermetic builds guarantee that the provenance’s list of dependencies is complete. Reproducible builds, though not strictly required, provide many auditability and reliability benefits. Overall, SLSA 4 gives the consumer a high degree of confidence that the software has not been tampered with.

More details on these proposed levels can be found in the GitHub repository, including the corresponding Source and Build/Provenance requirements. We are open to feedback and suggestions for changes on these requirements.

Proof of Concept

Today, we are releasing a proof of concept for SLSA 1 provenance generator (repo, marketplace). This will allow a user to create and upload provenance alongside their build artifacts, thereby achieving SLSA 1. To use it, add the following snippet to your workflow:

- name: Generate provenance

uses: slsa-framework/github-actions-demo@v0.1

with:

artifact_path: <path-to-artifact/directory>

Going forward, we plan to work with popular source, build, and packaging platforms to make it as easy as possible to reach higher levels of SLSA. These plans include generating provenance automatically in build systems, propagating provenance natively in package repositories, and adding security features across the major platforms. Our long-term goal is to raise the security bar across the industry so that the default expectation is higher-level SLSA security standards, with minimal effort on the part of software producers.

Summary

SLSA is a practical framework for end-to-end software supply chain integrity, based on a model proven to work at scale in one of the world’s largest software engineering organizations. Achieving the highest level of SLSA for most projects may be difficult, but incremental improvements recognized by lower SLSA levels will already go a long way toward improving the security of the open source ecosystem.

We look forward to working with the community on refining the levels as we begin adopting SLSA for our own open source projects. If you are a project maintainer and interested in trying to adopt and provide feedback on SLSA, please reach out or come join the discussions taking place in the OpenSSF Digital Identity Attestation Working Group.

Check out the Know, Prevent, Fix post to read more about Google’s overall approach to open source security.

Security Blog

Announcing a unified vulnerability schema for open source

Get ready for the 2021 Google CTF

Introducing SLSA, an End-to-End Framework for Supply Chain Integrity

Labels

Archive

Feed