Google Open Source Blog: security


Common Expressions For Portable Policy and Beyond

Tuesday, June 18, 2024

I am thrilled to introduce Common Expression Language (CEL), a simple expression language that's great for writing small snippets of logic that run anywhere with the same semantics and evaluate in nanoseconds to microseconds. The launch of cel.dev marks a major milestone in the growth and stability of the language.

It powers several well known products such as Kubernetes where it's used to protect against costly production misconfigurations:

object.spec.replicas <= 5

Cloud IAM uses CEL to enable fine-grained authorization:

request.path.startsWith("/finance")

Versatility

CEL is both open source and openly governed, making it well adopted both inside and outside Google. CEL is used by a number of large tech companies, either internally or as part of their public product offering. As a reflection of this open governance, cel.dev has been launched to share more about the language and the community around it.

So, what makes CEL a good choice for these applications? Why is CEL unique or different from Amazon's Cedar Policy or Open Policy Agent's Rego? These are great questions, and common ones (no pun intended):

Highly optimized evaluation O(ns) - O(μs)
Portable with stacks supported in C++, Java, and Go
Thousands of conformance tests ensure consistent behavior across stacks
Supports extension and subsetting

Subsetting is crucial for preserving predictable compute / memory impacts, and it only exists in CEL. As any latency-critical service maintainer will tell you, it's vital to have a clear understanding of the compute and memory implications of any new feature. Imagine you've chosen an expression language and validated that its functionality meets your security, serving, and scaling goals, but after launch, an update to the library introduces new functionality which can't be disabled and leaves your product vulnerable to attack. Your alternatives are to fork the library and accept the maintenance costs, introduce custom validation logic which is likely to be insufficient to prevent abuse, or redesign your service. CEL-supported subsetting allows you to ensure that what was true at the initial product launch will remain true until you decide to expose more of its functionality to customers.

Cedar Policy language was developed by Amazon. It is open source, implemented in Rust, and offers formal verification. Formal verification allows Cedar to validate policy correctness at authoring time. CEL is not just a policy language, but a broader expression language. It is used within larger policy projects in a way that allows users to augment their existing systems rather than adopt an entirely new one.

Formal verification often has challenges scaling to large amounts of data and is based on the belief that a formal proof of behavior is sufficient evidence of compliance. However, regulations and policies in natural language are often ambiguous. This means that logical translations of these policies are inherently ambiguous as well. CEL's approach to this challenge of compliance and reasoning about potentially ambiguous behaviors is to support evaluation over partial data at a very high speed and with minimal resources.

CEL is fast enough to be used in networking policies and expressive enough to be used in detailed application policies. Having a greater coverage of use cases improves your ability to reason about behavior holistically. But, what if you don't have all the data yet to know exactly how the policy will behave? CEL supports partial evaluation using four-valued logic which makes it possible to determine cases which are definitely allowed, denied, or where policy behavior is conditional on additional inputs. This allows for what-if analysis against historical data as well as against speculative data from new use cases or proposed policy changes.

Open Policy Agent's Rego is also open source, implemented in Golang and based on Datalog, which makes it possible to offer proofs similar to Cedar's. However, the language is much more powerful than Cedar, and more powerful than CEL. This expressiveness means that OPA Rego is fantastic for non-latency-critical, single-tenant solutions, but difficult to safely embed in existing offerings.

Four-valued Logic

CEL uses commutative logical operators that can render a true, false, error, or unknown status. This is a scalable alternative to formal verification and the expressiveness of Datalog. Four-valued logic allows CEL to evaluate over a partial set of inputs to deliver either a definitive result or communicate that more information is necessary to complete the evaluation.

What is four-valued logic?

True, false, and error outcomes are considered definitive: no additional information will change the outcome of the expression. Consider the following case:

1/0 != 0 && false

In traditional programming languages, this expression would be an error; however, in CEL the outcome is false.

Now consider the following case where an input variable, called unknown_var is marked as unknown:

unknown_var && true

The outcome of this expression is UNKNOWN{unknown_var} indicating that once the variable is known, the evaluation can be safely completed. An unknown indicates what was missing, and alerts the user to fix the outcome with more information. This technique both borrows from and extends SQL three-valued predicate logic which uses TRUE, FALSE, and NULL with commutative logical operators. From a CEL perspective, the error state is akin to SQL NULL that arises when there is an absence of information.
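
The two cases above can be reproduced with a small model of a commutative AND over four kinds of values. This is only a sketch, not CEL's implementation: the `Error` and `Unknown` classes and the `logical_and` function are hypothetical names, and the rule that an unknown takes precedence over an error when both appear is an assumption that the text above does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Error:
    """Hypothetical stand-in for an evaluation error such as 1/0."""
    message: str


@dataclass(frozen=True)
class Unknown:
    """Hypothetical stand-in for a result that depends on missing inputs."""
    variables: frozenset


def logical_and(left, right):
    """Commutative AND: the result does not depend on operand order."""
    # A definitive False on either side decides the outcome,
    # even if the other side is an error or unknown.
    if left is False or right is False:
        return False
    if left is True and right is True:
        return True
    # Assumption: unknowns outrank errors, and their variables are merged.
    unknowns = [v for v in (left, right) if isinstance(v, Unknown)]
    if unknowns:
        merged = frozenset().union(*(u.variables for u in unknowns))
        return Unknown(merged)
    # Otherwise at least one operand is an Error; propagate it.
    return left if isinstance(left, Error) else right


# Models `1/0 != 0 && false`: the left operand is an error, the result is False.
division_case = logical_and(Error("division by zero"), False)

# Models `unknown_var && true`: the result stays unknown until the variable is supplied.
unknown_case = logical_and(Unknown(frozenset({"unknown_var"})), True)
```

Because the function checks both operands before deciding, swapping the arguments yields the same result, which is the property that lets evaluation proceed over partial inputs.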

CEL compatibility with SQL

CEL leverages SQL semantics to ensure that it can be seamlessly translated to SQL. SQL optimizers perform significantly better over large data sets, making it possible to evaluate over data at rest. Imagine trying to scale a single expression evaluation over tens of millions of records. Even if the expression evaluates within a single microsecond, the evaluation would still take tens of seconds. The more complex the expression, the greater the latency. SQL excels at this use case, so translation from CEL to SQL was an important factor in the design in order to unlock the possibility of performant policy checks both online with CEL and offline with SQL.
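
The scaling estimate above is simple arithmetic, sketched here with illustrative numbers. The record count of 50 million and the function name are assumptions chosen only to make "tens of millions" concrete, and the calculation assumes the evaluations run one after another.

```python
def sequential_evaluation_seconds(record_count, microseconds_per_evaluation):
    """Total wall time if every record is evaluated one after another."""
    total_microseconds = record_count * microseconds_per_evaluation
    return total_microseconds / 1_000_000


# Illustrative: 50 million records at 1 microsecond each.
estimate = sequential_evaluation_seconds(50_000_000, 1)
```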

Thank you CEL Community!

We’re proud to announce cel.dev as a major milestone in the maturity and adoption of the language, and we look forward to working with you to make CEL the best building block for writing latency-critical, portable logic. Feel free to contact us at cel-lang-discuss@googlegroups.com

By Tristan Swadell – Senior Staff Software Engineer

OSV and helping developers fix known vulnerabilities

Tuesday, April 2, 2024

In 2021, we launched the OSV project with a goal of enabling easy management of known vulnerabilities in open source software dependencies. To achieve this, we started by building an open source, comprehensive database (https://osv.dev) that accurately describes all known OSS vulnerabilities in the easy-to-use OpenSSF OSV Schema.

Over time, we worked with numerous open source communities to adopt the OSV Schema (totalling over 24 ecosystems), and introduced open source tools like our API and OSV-Scanner to directly make this database useful to developers.
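
As a rough illustration of how a developer might call the API, the sketch below builds a query for one package version and posts it to the OSV.dev query endpoint. The helper names are hypothetical, the package in the usage note is only an example, and error handling is omitted.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_query(name, ecosystem, version):
    """Build the JSON body for a single package-version lookup."""
    return {"package": {"name": name, "ecosystem": ecosystem}, "version": version}


def query_osv(name, ecosystem, version, timeout=10):
    """Return the IDs of known vulnerabilities affecting the given version."""
    body = json.dumps(build_query(name, ecosystem, version)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        result = json.load(response)
    # The response carries no "vulns" key when nothing matches.
    return [vuln["id"] for vuln in result.get("vulns", [])]
```

Calling `query_osv("lodash", "npm", "4.17.20")` would perform a network request, so it is left out of the block itself.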

The OSV project takes a very developer-focused approach to vulnerability management, as we realize that day-to-day developers are often the ones who bear the burden of managing dependency updates and triaging vulnerabilities in their dependencies.

Today the OSV team is excited to announce some exciting updates to the work we’ve been doing, and share how the OSV project as a whole helps developers with vulnerability management today.

Announcing guided remediation

Developers are often faced with an overwhelming number of vulnerabilities reported against their dependencies. To tackle this, we’re announcing a tool as part of OSV-Scanner to enable developers to both interactively and automatically prioritize and fix the vulnerabilities that matter in an easy way.

The basic usage of the tool provides a simple command for developers to run which will automatically fix as many vulnerabilities as possible by upgrading their project’s dependencies.

For developers who need or want finer control over vulnerability remediation, the tool also provides the more advanced interactive mode. In the interactive mode, developers can preview and make informed decisions on which packages to upgrade or which vulnerabilities they want to prioritize based on metrics such as vulnerability severity, dependency depth, or dependency type.

Filtering by all these advanced metrics is also available via CLI flags for running the tool non-interactively, which enables integration of guided remediation into automated workflows. For example, developers can connect the tool with their CI/test pipelines to determine the set of non-breaking dependency upgrades.

Currently, the guided remediation tool supports npm package.json and package-lock.json dependencies, but we’ll be adding support for more ecosystems in the future.

Check out our detailed documentation for more information or if you would like to try it out for yourself!

OSV-Scanner GitHub action

We’ve also recently launched the OSV-Scanner GitHub Action, which provides an easy way for developers to integrate vulnerability scanning using OSV.dev into their CI/CD pipelines. This is currently being used by TensorFlow and Flutter to provide continuous scanning of their dependencies.

Our GitHub Action can be configured to do the following:

Regular vulnerability scan workflow. A common use case is to set a schedule to regularly scan the repository, with the workflow failing if a new vulnerability is found. Another use case can be to block release workflows if a vulnerability is found.

Trigger a differential vulnerability scan to run when a pull request is opened. This workflow can determine if your changes introduce new vulnerabilities and can be configured to block pull requests when the action fails. Enabling just this feature can allow you to stop new vulnerabilities from being introduced, while not breaking your existing workflows.

Head over to our documentation to see a quick and easy guide on how to get started integrating the OSV-Scanner action into your GitHub repository.

Other OSV features

Guided remediation and GitHub Action support form one piece of our goal of making vulnerability management easier.

OSV also provides a broad suite of features:

Support for 11 language ecosystems with 19 lockfile formats

Support for C/C++ vulnerability management. C/C++ brings with it a unique set of challenges for dealing with known vulnerabilities in dependencies

Support for license scanning to detect license compliance issues

Reachability analysis to reduce false positives

Govulncheck integration to enable reachability analysis of Go vulnerabilities
Experimental Rust call analysis to enable reachability analysis of Rust vulnerabilities

What’s next?

We still have a lot more exciting work planned! A remaining challenge for dealing with known vulnerabilities in dependencies is remediation and dealing with false positives. Much of our work is focused on improving data quality and providing accurate and actionable results that lead to easy remediation.

These include:

Iterating on guided remediation by addressing user feedback and adding support for additional ecosystems.

Improving container scanning. OSV-Scanner has so far focused on source repository scanning. One important gap we aim to fill is to provide better support for container scanning, in a way that provides actionable and useful remediation guidance, while minimizing false positives.

Continuing to improve matching and data quality. A continuing focus for OSV-Scanner is making sure that our scanning is comprehensive and accurate. Accuracy is especially important for us, as one of our core goals is to minimize false positives and vulnerability noise for developers at the receiving end of the scanners through things such as reachability analysis.

Interested in using OSV in your project? Check out our OSV-Scanner and OSV.dev documentation for how to get started. Please share any feedback or bugs you encounter via our GitHub issue tracker.

By Michael Kedar, Rex Pan, and Oliver Chang – Google Open Source Security Team

Optimizing gVisor filesystems with Directfs

Tuesday, June 20, 2023

gVisor is a sandboxing technology that provides a secure environment for running untrusted code. In our previous blog post, we discussed how gVisor performance improves with a root filesystem overlay. In this post, we'll dive into another filesystem optimization that was recently launched: directfs. It gives gVisor’s application kernel (the Sentry) secure direct access to the container filesystem, avoiding expensive round trips to the filesystem gofer.

Origins of the Gofer

gVisor is used internally at Google to run a variety of services and workloads. One of the challenges we faced while building gVisor was providing remote filesystem access securely to the sandbox. gVisor’s strict security model and defense in depth approach assumes that the sandbox may get compromised because it shares the same execution context as the untrusted application. Hence the sandbox cannot be given sensitive keys and credentials to access Google-internal remote filesystems.

To address this challenge, we added a trusted filesystem proxy called a "gofer". The gofer runs outside the sandbox, and provides a secure interface for untrusted containers to access such remote filesystems. For architectural simplicity, gofers were also used to serve local filesystems as well as remote.

Gofer process intermediates filesystem operations

Isolating the Container Filesystem in runsc

When gVisor was open sourced as runsc, the same gofer model was copied over to maintain the same security guarantees. runsc was configured to start one gofer process per container which serves the container filesystem to the sandbox over a predetermined protocol (now LISAFS). However, a gofer adds a layer of indirection with significant overhead.

This gofer model (built for remote filesystems) brings very few advantages for the runsc use-case, where all the filesystems served by the gofer (like rootfs and bind mounts) are mounted locally on the host. The gofer directly accesses them using filesystem syscalls.

Linux provides some security primitives to effectively isolate local filesystems. These include mount namespaces, pivot_root and detached bind mounts¹. Directfs is a new filesystem access mode that uses these primitives to expose the container filesystem to the sandbox in a secure manner. The sandbox’s view of the filesystem tree is limited to just the container filesystem. The sandbox process is not given access to anything mounted on the broader host filesystem. Even if the sandbox gets compromised, these mechanisms provide additional barriers to prevent broader system compromise.

Directfs

In directfs mode, the gofer still exists as a cooperative process outside the sandbox. As usual, the gofer enters a new mount namespace, sets up appropriate bind mounts to create the container filesystem in a new directory and then pivot_root(2)s into that directory. Similarly, the sandbox process enters new user and mount namespaces and then pivot_root(2)s into an empty directory to ensure it cannot access anything via path traversal. But instead of making RPCs to the gofer to access the container filesystem, the sandbox requests the gofer to provide file descriptors to all the mount points via SCM_RIGHTS messages. The sandbox then directly makes file-descriptor-relative syscalls (e.g. fstatat(2), openat(2), mkdirat(2), etc) to perform filesystem operations.
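
The FD handoff and the file-descriptor-relative access described above can be sketched in a few lines. This is not gVisor's code: it runs both roles in one process over a socketpair, the function names are hypothetical, and it sets up no namespaces or pivot_root, so it shows only the SCM_RIGHTS transfer and an openat(2)-style call with O_NOFOLLOW. It assumes Linux and Python 3.9 or newer for `socket.send_fds` and `socket.recv_fds`.

```python
import os
import socket
import tempfile


def send_directory_fd(sock, path):
    """'Gofer' role: open a directory and donate its FD via SCM_RIGHTS."""
    dir_fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        socket.send_fds(sock, [b"mount"], [dir_fd])
    finally:
        os.close(dir_fd)  # the receiver holds its own duplicate


def receive_directory_fd(sock):
    """'Sandbox' role: receive one FD from the peer."""
    _data, fds, _flags, _addr = socket.recv_fds(sock, 1024, 1)
    return fds[0]


def read_relative(dir_fd, name):
    """Open `name` relative to dir_fd, refusing to follow a final symlink."""
    fd = os.open(name, os.O_RDONLY | os.O_NOFOLLOW, dir_fd=dir_fd)
    with os.fdopen(fd, "rb") as handle:
        return handle.read()


def demo():
    gofer_end, sandbox_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with tempfile.TemporaryDirectory() as root, gofer_end, sandbox_end:
        with open(os.path.join(root, "hello.txt"), "wb") as handle:
            handle.write(b"hi")
        send_directory_fd(gofer_end, root)
        mount_fd = receive_directory_fd(sandbox_end)
        try:
            return read_relative(mount_fd, "hello.txt")
        finally:
            os.close(mount_fd)
```

Note that O_NOFOLLOW only guards the final path component; in the design described above, confinement to the container filesystem comes from the namespaces and pivot_root, which this sketch leaves out.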

Earlier, when the gofer performed all filesystem operations, we could deny all these syscalls in the sandbox process using seccomp. But with directfs enabled, the sandbox process's seccomp filters need to allow the usage of these syscalls. Most notably, the sandbox can now make openat(2) syscalls (which allow path traversal), but with certain restrictions: O_NOFOLLOW is required, there is no access to procfs, and there are no directory FDs from the host. We also had to give the sandbox the same privileges as the gofer (for example CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH), so it can perform the same filesystem operations.

It is noteworthy that only the trusted gofer provides FDs (of the container filesystem) to the sandbox. The sandbox cannot walk backwards (using ‘..’) or follow a malicious symlink to escape out of the container filesystem. In effect, we've decreased our dependence on the syscall filters to catch bad behavior, but correspondingly increased our dependence on Linux's filesystem isolation protections.

Performance

Making RPCs to the gofer for every filesystem operation adds a lot of overhead to runsc. Hence, avoiding gofer round trips significantly improves performance. Let's find out what this means for some of our benchmarks. We will run the benchmarks using our newly released systrap platform on bind mounts (as opposed to rootfs). This would simulate more realistic use cases because bind mounts are extensively used while configuring filesystems in containers. Bind mounts also do not have an overlay (like the rootfs mount), so all operations go through goferfs / directfs mount.

Let's first look at our stat micro-benchmark, which repeatedly calls stat(2) on a file.

Stat benchmark improvement with directfs

The stat(2) syscall is more than 2x faster! However, since this is not representative of real-world applications, we should not extrapolate these results. So let's look at some real-world benchmarks.

We see a 12% reduction in the absolute time to run these workloads and 17% reduction in Ruby load time!

Conclusion

The gofer model in runsc was overly restrictive for accessing host files. We were able to leverage existing filesystem isolation mechanisms in Linux to bypass the gofer without compromising security. Directfs significantly improves performance for certain workloads. This is part of our ongoing efforts to improve gVisor performance. You can learn more about gVisor at gvisor.dev. You can also use gVisor in GKE with GKE Sandbox. Happy sandboxing!

¹Detached bind mounts can be created by first creating a bind mount using mount(MS_BIND) and then detaching it from the filesystem tree using umount(MNT_DETACH).

By Ayush Ranjan, Software Engineer – Google

gVisor improves performance with root filesystem overlay

Friday, April 21, 2023

Overview

Container technology is an integral part of modern application ecosystems, making container security an increasingly important topic. Since containers are often used to run untrusted, potentially malicious code it is imperative to secure the host machine from the container.

A container's security depends on its security boundaries, such as user namespaces (which isolate security-related identifiers and attributes), seccomp rules (which restrict the syscalls available), and Linux Security Module configuration. Popular container management products like Docker and Kubernetes relax these and other security boundaries to increase usability, which means that users need additional container security tools to provide a much stronger isolation boundary between the container and the host.

The gVisor open source project, developed by Google, provides an OCI compatible container runtime called runsc. It is used in production at Google to run untrusted workloads securely. Runsc (run sandbox container) is compatible with Docker and Kubernetes and runs containers in a gVisor sandbox. gVisor sandbox has an application kernel, written in Golang, that implements a substantial portion of the Linux system call interface. All application syscalls are intercepted by the sandbox and handled in the user space kernel.

Although gVisor does not introduce large fixed overheads, sandboxing does add some performance overhead to certain workloads. gVisor has made several improvements recently that help containerized applications run faster inside the sandbox, including an improvement to the container root filesystem, which we will dive deeper into.

Costly Filesystem Access in gVisor

gVisor uses a trusted filesystem proxy process (“gofer”) to access the filesystem on behalf of the sandbox. The sandbox process is considered untrusted in gVisor’s security model. As a result, it is not given direct access to the container filesystem and its seccomp filters do not allow filesystem syscalls.

In gVisor, the container rootfs and bind mounts are configured to be served by a gofer.

When the container needs to perform a filesystem operation, it makes an RPC to the gofer which makes host system calls and services the RPC. This is quite expensive due to:

RPC cost: This is the cost of communicating with the gofer process, including process scheduling, message serialization and IPC system calls.

To ameliorate this, gVisor recently developed a purpose-built protocol called LISAFS which is much more efficient than its predecessor.
gVisor is also experimenting with giving the sandbox direct access to the container filesystem in a secure manner. This would essentially nullify RPC costs as it avoids the gofer being in the critical path of filesystem operations.

Syscall cost: This is the cost of making the host syscall which actually accesses/modifies the container filesystem. Syscalls are expensive, because they perform context switches into the kernel and back into userspace.

To help with this, gVisor heavily caches the filesystem tree in memory. So operations like stat(2) on cached files are serviced quickly. But other operations like mkdir(2) or rename(2) still need to make host syscalls.

Container Root Filesystem

In Docker and Kubernetes, the container’s root filesystem (rootfs) is based on the filesystem packaged with the image. The image’s filesystem is immutable. Any change a container makes to the rootfs is stored separately and is destroyed with the container. This way, the image’s filesystem can be shared efficiently with all containers running the same image. This is different from bind mounts, which allow containers to access the bound host filesystem tree. Changes to bind mounts are always propagated to the host and persist after the container exits.

Docker and Kubernetes both use the overlay filesystem by default to configure container rootfs. Overlayfs mounts are composed of one upper layer and multiple lower layers. The overlay filesystem presents a merged view of all these filesystem layers at its mount location and ensures that lower layers are read-only while all changes are held in the upper layer. The lower layer(s) constitute the “image layer” and the upper layer is the “container layer”. When the container is destroyed, the upper layer mount is destroyed as well, discarding the root filesystem changes the container may have made. Docker’s overlayfs driver documentation has a good explanation.
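
To make the layering rules concrete, here is a toy in-memory model of an overlay with one upper layer and any number of read-only lower layers. It is only a sketch of the semantics: the `Overlay` class, its methods, and the file names are hypothetical, and real overlayfs works on directories and inodes rather than dictionaries. Deletions are modeled with a whiteout marker in the upper layer, which is how overlayfs hides a lower-layer file without modifying it.

```python
WHITEOUT = object()  # marker in the upper layer that hides a lower-layer file


class Overlay:
    def __init__(self, lowers):
        self.lowers = lowers  # list of dicts, topmost lower layer first
        self.upper = {}       # all changes land here

    def read(self, path):
        """Return file data from the topmost layer that has the path."""
        if path in self.upper:
            data = self.upper[path]
            if data is WHITEOUT:
                raise FileNotFoundError(path)
            return data
        for layer in self.lowers:
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        """Writes never touch the lower layers."""
        self.upper[path] = data

    def remove(self, path):
        """Hide the path with a whiteout instead of deleting from a lower layer."""
        self.read(path)  # raises FileNotFoundError if the path is not visible
        self.upper[path] = WHITEOUT

    def listdir(self):
        """Merged view of all layers, minus whiteouts."""
        names = set()
        for layer in self.lowers:
            names.update(layer)
        for name, data in self.upper.items():
            if data is WHITEOUT:
                names.discard(name)
            else:
                names.add(name)
        return sorted(names)


image_layer = {"a.txt": b"from image", "b.txt": b"from image"}
rootfs = Overlay([image_layer])
rootfs.write("a.txt", b"changed by container")
rootfs.write("c.txt", b"created by container")
rootfs.remove("b.txt")
```

Discarding `rootfs.upper` returns the view to the original image contents, which mirrors what happens when the container layer is destroyed.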

Rootfs Configuration Before

Let’s consider an example where the image has files foo and baz. The container overwrites foo and creates a new file bar. The diagram below shows how the root filesystem used to be configured in gVisor. We used to go through the gofer and access/mutate the overlaid directory on the host. It also shows the state of the host overlay filesystem.

Opportunity! Sandbox Internal Overlay

Given that the upper layer is destroyed with the container and that it is expensive to access/mutate a host filesystem from the sandbox, why keep the upper layer on the host at all? Instead we can move the upper layer into the sandbox.

The idea is to overlay the rootfs using a sandbox-internal overlay mount. We can use a tmpfs upper (container) layer and a read-only lower layer served by the gofer client. Any changes to rootfs would be held in tmpfs (in-memory). Accessing/mutating the upper layer would not require any gofer RPCs or syscalls to the host. This really speeds up filesystem operations on the upper layer, which contains newly created or copied-up files and directories.

Using the same example as above, the following diagram shows what the rootfs configuration would look like using a sandbox-internal overlay.

Rootfs configuration in gVisor with internal overlay

Host-Backed Overlay

The tmpfs mount by default will use the sandbox process’s memory to back all the file data in the mount. This can cause sandbox memory usage to blow up and exhaust the container’s memory limits, so it’s important to store all file data from tmpfs upper layer on disk. We need to have a tmpfs-backing “filestore” on the host filesystem. Using the example from above, this filestore on the host will store file data for foo and bar.

This would essentially flatten all regular files in tmpfs into one host file. The sandbox can mmap(2) the filestore into its address space. This allows it to access and mutate the filestore very efficiently, without incurring gofer RPCs or syscalls overheads.
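
The idea of mutating a host file through a memory mapping can be shown with a small sketch. It does not reflect how gVisor lays out its filestore: the file name, offset, and size are arbitrary assumptions, and the helper names are hypothetical.

```python
import mmap
import os
import tempfile


def write_through_mapping(path, offset, data, size=4096):
    """Map a host file into memory and modify it with plain memory writes."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)  # the mapping needs a non-empty file
        with mmap.mmap(fd, size) as mapping:
            # After the mapping exists, this store needs no write(2) call.
            mapping[offset:offset + len(data)] = data
            mapping.flush()
    finally:
        os.close(fd)


def demo():
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "filestore")
        write_through_mapping(path, 128, b"foo contents")
        with open(path, "rb") as handle:
            handle.seek(128)
            return handle.read(12)
```

Once the mapping is established, reads and writes are ordinary memory accesses, which is why this path avoids per-operation RPC and syscall costs.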

Self-Backed Overlay

In Kubernetes, you can set local ephemeral storage limits. The upper layer of the rootfs overlay (writeable container layer) on the host contributes towards this limit. The kubelet enforces this limit by traversing the entire upper layer, stat(2)-ing all files and summing up their stat.st_blocks*block_size. If we move the upper layer into the sandbox, then the host upper layer is empty and the kubelet will not be able to enforce these limits.
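
The accounting walk described above can be approximated as follows. This is an illustration rather than the kubelet's actual code: the function name is hypothetical, and it relies on the Linux convention that `st_blocks` counts 512-byte units.

```python
import os
import tempfile

BLOCK_SIZE = 512  # st_blocks is reported in 512-byte units on Linux


def disk_usage_bytes(root):
    """Sum the allocated size of every file under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # lstat so that a symlink is measured itself, not its target.
            info = os.lstat(os.path.join(dirpath, name))
            total += info.st_blocks * BLOCK_SIZE
    return total


def demo():
    with tempfile.TemporaryDirectory() as upper_layer:
        empty_usage = disk_usage_bytes(upper_layer)
        with open(os.path.join(upper_layer, "data"), "wb") as handle:
            handle.write(b"x" * 8192)
        return empty_usage, disk_usage_bytes(upper_layer)
```

A walk like this only sees files that exist under the directory it scans, so an upper layer kept entirely inside the sandbox would contribute nothing to the total.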

To address this issue, we introduced “self-backed” overlays, which create the filestore in the host upper layer. This way, when the kubelet scans the host upper layer, the filestore will be detected and its stat.st_blocks should be representative of the total file usage in the sandbox-internal upper layer. It is also important to hide this filestore from the containerized application to avoid confusing it. We do so by creating a whiteout in the sandbox-internal upper layer, which blocks this file from appearing in the merged directory.

The following diagram shows what the rootfs configuration looks like in gVisor today.

Rootfs configuration in gVisor with self-backed internal overlay

Performance Gains

Let’s look at some filesystem-intensive workloads to see how rootfs overlay impacts performance. These benchmarks were run on a gLinux desktop with KVM platform.

Micro Benchmark

Linux Test Project provides a fsstress binary. This program performs a large number of filesystem operations concurrently, creating and modifying a large filesystem tree of all sorts of files. We ran this program on the container's root filesystem. The exact usage was:

sh -c "mkdir /test && time fsstress -d /test -n 500 -p 20 -s 1680153482 -X -l 10"

You can use the -v flag (verbose mode) to see what filesystem operations are being performed.

The results were astounding! Rootfs overlay reduced the time to run this fsstress program from 262.79 seconds to 3.18 seconds! However, note that such microbenchmarks are not representative of real-world applications and we should not extrapolate these results to real-world performance.

Real-world Benchmark

Build jobs are very filesystem-intensive workloads. They read a lot of source files, compile them, and write out binaries and object files. Let’s consider building the abseil-cpp project with bazel. Bazel performs a lot of filesystem operations in rootfs, specifically in bazel’s cache located at ~/.cache/bazel/.

This is representative of the real-world because many other applications also use the container root filesystem as scratch space due to the handy property that it disappears on container exit. To make this more realistic, the abseil-cpp repo was attached to the container using a bind mount, which does not have an overlay.

When measuring performance, we care about reducing the sandboxing overhead and bringing gVisor performance as close as possible to unsandboxed performance. Sandboxing overhead can be calculated using the formula overhead = (s-n)/n where ‘s’ is the amount of time taken to run a workload inside gVisor sandbox and ‘n’ is the time taken to run the same workload natively (unsandboxed). The following graph shows that rootfs overlay halved the sandboxing overhead for abseil build!
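
The overhead formula is easy to apply directly. The timings below are made-up numbers chosen only to show the calculation; they are not measurements from the abseil build.

```python
def sandboxing_overhead(sandboxed_seconds, native_seconds):
    """overhead = (s - n) / n, where s is sandboxed time and n is native time."""
    if native_seconds <= 0:
        raise ValueError("native time must be positive")
    return (sandboxed_seconds - native_seconds) / native_seconds


# Hypothetical timings: a workload that takes 100 s natively.
before = sandboxing_overhead(150.0, 100.0)  # 50% overhead
after = sandboxing_overhead(125.0, 100.0)   # 25% overhead, i.e. halved
```

Halving the overhead is therefore not the same as halving the run time: in this made-up case the sandboxed run drops from 150 s to 125 s.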

The impact of rootfs overlay on sandboxing overhead for abseil build

Conclusion

Rootfs overlay in gVisor substantially improves performance for many filesystem-intensive workloads, so that developers no longer have to make large tradeoffs between performance and security. We recently made this optimization the default in runsc. This is part of our ongoing efforts to improve gVisor performance. You can learn more about gVisor at gvisor.dev. You can also use gVisor in GKE with GKE Sandbox. Happy sandboxing!

By Ayush Ranjan, Software Engineer, gVisor

Supporting DDR4 and DDR5 RDIMMs in open source DRAM security testing framework

Thursday, February 16, 2023

In 2021, Google and Antmicro introduced a platform for testing DRAM memory chips against the Rowhammer vulnerability, an unfortunate side effect of the physical shrinking of memory chips. The platform was developed to propose a radical improvement over the “security through obscurity” approach that was predominant in the industry, as both Antmicro and Google believe that the open source approach to mitigating security threats is a way towards accelerating developments in the field.

The framework was originally developed in the context of securing consumer-facing devices, using off-the-shelf Digilent Arty (DDR3, Xilinx Series7 FPGA) and Xilinx ZCU104 (DDR4, Xilinx UltraScale+ FPGA) boards, then followed by a dedicated open hardware board from Antmicro that allowed work on custom LPDDR4 modules. The framework has since helped discover a new attack method named Blacksmith and continues to provide valuable insights into how the security of both edge device and data center memory can be improved.

In constant development since then, the project has welcomed two more major elements to the ecosystem: one enables testing of DDR4 Registered Dual In-Line Memory Modules (RDIMMs), commonly used in data centers, and the other supports the newer DDR5 standard. The project continues to provide useful data.

Memory testing for data center use cases

To extend the Rowhammer tester support from consumer-facing devices to shared-compute data center infrastructure, Antmicro developed the data center DRAM tester board. We adapted this open source hardware-test platform from the original LPDDR4 board to enable Rowhammer and other memory security experiments with DDR4 RDIMMs using a fully configurable, open source FPGA-based DDR controller.

The data center DRAM Xilinx Kintex-7 FPGA based test board features:

DDR4 RDIMM connector
676 pins FPGA (compared to the 484 for the LPDDR version)
RJ45 Gigabit Ethernet
Micro-USB console
HDMI output connector
JTAG programming connector
MicroSD card slot
12 MBytes QSPI Flash memory
HyperRAM—external DRAM memory that can be used as an FPGA cache

Photo of the Antmicro data center DRAM Xilinx Kintex-7 FPGA based tester board

It’s worth mentioning that DDR4 RDIMMs (as opposed to the custom LPDDR4 modules designed for the original project) are generic and available off-the-shelf. This makes it easier for security researchers to get started with data center memory security research compared to edge devices using LPDDR.

The Data Center DRAM Tester board design has now been upgraded into revision 1.2, which brings new features for implementing even more complex DRAM testing scenarios. The 1.2 boards support a Power over Ethernet (PoE) supply option so the board can act as a standalone network device with data exchange and power-cycling done over a single Ethernet cable. This simplifies integration of the board in DRAM testing clusters and custom runners capable of doing hardware-in-the-loop testing.

The new revision of the board will support hot-swapping of the DRAM module under test, which should speed up testing of multiple DRAM modules without the need to power-cycle the tester. Finally, the new revision of the board will include power-measurement circuitry so it will be possible to compare the peak and average power consumption of DRAM while working with different DRAM refresh scenarios.

We are also working on a custom enclosure design suitable for desktop and networked installations.

Extending open source testing to DDR5

With DDR5 quickly becoming the new standard for data center memory, Antmicro and Google’s Platforms teams also set out to develop a platform capable of interfacing with DDR5 memories, again directly from a low-cost FPGA without a dedicated hard block. The resulting DDR5 tester platform follows the structure of the data center DDR4 tester, while expanding the functionality of the Serial Presence Detection, which monitors the power supply states and system health, and adjusting the circuitry for a nominal IO voltage of 1.1V.

Data center DRAM testing is part of Google’s and Antmicro’s belief in security through transparency. Both hyperscalers and a growing number of organizations who operate their own data centers increasingly embrace this perspective, and there is great value in providing them with a scalable, customizable, commercially supported open source platform that will help in collaborative research and mitigation of emerging security issues.

Rowhammer attacks, security threats, and countermeasures remain an active research area. With Google, Antmicro continues to adapt the Rowhammer test platform to the most recent developments, giving researchers and memory vendors more sophisticated methods for testing the state-of-the-art memories used in data centers. This work stems from and complements other open source activities the companies jointly lead as members of RISC-V International and CHIPS Alliance, aimed at making the hardware ecosystem more open, secure and collaborative. If you’re interested in open source solutions for DRAM security testing and memory controller development, or more broadly, FPGA and ASIC design and verification, don’t hesitate to reach out to Antmicro at contact@antmicro.com.

By Michael Gielda – Antmicro

Sigstore project announces general availability and v1.0 releases

Tuesday, October 25, 2022

Today, the Sigstore community announced the general availability of their free, community-operated certificate authority and transparency log services. In addition, two of Sigstore’s foundational projects, Fulcio and Rekor, published v1.0 releases denoting a commitment to API stability. Google is proud to celebrate these open source community milestones. 🎉

Sigstore is a standard for signing, verifying, and protecting open source software. With increased industry attention being given to software supply chain security, including the recent Executive Order on Cybersecurity, the ability to know and trust where software comes from has never been more important. Sigstore simplifies and automates the complex parts of digitally signing software—making this more accessible and trustworthy than ever before.

Beginning in 2020 as an open source collaboration between Red Hat and Google, the Sigstore project has grown into a vendor-neutral, community operated and designed project that is part of the Open Source Security Foundation (OpenSSF). The ecosystem has also continued to grow, spanning multiple package managers and ecosystems. If you now download a new release from open source projects like Python or Kubernetes, you’ll see that it has been signed with Sigstore.

Google is an active, contributing member of the Sigstore community. In addition to upstream code contributions, Google has contributed in several other ways:

Core Sigstore services are built on Google-supported open source technologies, including Go, gRPC, Trillian and Certificate Transparency contributions.
We’re a diamond sponsor of this year’s SigstoreCon.
As part of Google’s commitment to advancing cybersecurity, we’re supporting foundations, such as the OpenSSF, by pledging $100 million to fix open source vulnerabilities and oversee open source security priorities.

We are part of a larger open source community helping develop and run Sigstore, and welcome new adopters and contributors! To learn more about getting started using Sigstore, the project documentation helps guide you through the process of signing and verifying your software. To get started contributing, several individual repositories within the Sigstore GitHub organization use “good first issue” labels to give a hint of approachable tasks. The project maintains a Slack community (use the invite to join) and regularly holds community meetings.

By Dave Lester – Google Open Source Programs Office, and Bob Callaway – Google Open Source Security Team

A new resource for coordinated vulnerability disclosure in open source projects

Wednesday, February 17, 2021

One of the joys of open source is the freedom it gives you to create: contributors get to build the projects they want how they want; it’s up to them. Of course, blank slates don’t come with directions, which makes more niche areas of software development and management a challenge for contributors. Vulnerability disclosure is one of those areas.

Google doesn’t restrict its open source work to one team. Instead, we teach any and all Googlers about open source: how to release, how to contribute, how to use, and, in general, how to be a good open source citizen. This approach scales well, and gives people the knowledge to be lifelong open source community members. This includes sharing knowledge about open source security, a topic that isn’t new, but is finally getting the industry attention it deserves.

The intimidating blank slate and a lack of time for contributors to develop policies means many open source projects have no documented vulnerability reporting information, much less a plan for how to handle and disclose a reported vulnerability. We recently updated our guidance for coordinated vulnerability disclosure in open source projects that come out of Google and have published it in hopes that other projects will find this helpful for their project security practices.

The new guide has three sections:

Setting up your vulnerability management “infrastructure”: The work you’ll want to do before an issue is reported.
The vulnerability response process: Includes a runbook for when your project receives an issue.
Templates: From SECURITY.MD to a public disclosure outline, all the communication pieces you need to handle an issue.

It’s a myth that if a project hasn’t received a vulnerability report yet, it doesn’t need a disclosure policy. It’s also a myth that you need to be “a security person” to implement a vulnerability disclosure policy. A successful coordinated vulnerability disclosure frequently comes down to good process management and clear, thoughtful communication. You don’t have to be an expert in operating systems capabilities to understand how a reporter manipulated it to cause an account privilege escalation through your project. A predetermined policy, some templates, and a well-executed runbook will take you through discovering, patching, and disclosing most kinds of vulnerabilities.

Coordinated Vulnerability Disclosure in Open Source Projects

Vulnerability disclosure is part of "Fix" in the "Know, Prevent, Fix" framework we proposed recently for open source vulnerability management. In today’s industry, with all of our supply chain dependencies, improving open source project security in even one project can have a multiplying effect. Vulnerability disclosure is a key aspect of that overall security posture. Our hope is that projects will take this guide, remix and adapt it to their projects, and share their changes with others so we can collectively increase our open source security.

By Anne Bertucio, Google Open Source

Updates on the Tsunami Security Scanning Engine

Wednesday, February 10, 2021

Several months ago, we open sourced the Tsunami security scanner: a false-positive-free infrastructure scanning engine focusing on high severity, actively exploited vulnerabilities. Today, we are releasing the first major update for Tsunami.

In the last few months, we have done a lot of work in the background to prepare Tsunami for the next step and focused on the following:

Vulnerability research: To keep Tsunami's detection capabilities up to date, we kicked off various projects to research the exploitation of vulnerabilities in the wild. We will soon publish more information about our initiatives in this space, so stay tuned.

New detection capabilities: Based on our research, we have added 15 new detector plugins to Tsunami for actively exploited vulnerabilities.

Continuous Integration pipeline for our open-source builds: We set up a CI/CD pipeline that automatically mirrors and tests changes between our internal version management system and the open source repository. This will enable us to easily merge internal and external contributions.

Test bed for end-to-end testing: This summer we hosted an intern (Yuxin Wu), who built and open-sourced a test bed for Tsunami. The test bed can automatically deploy arbitrary versions of off-the-shelf software based on Docker images. We are using the test bed to automatically check whether a Tsunami detector works for all vulnerable versions of a piece of software and keeps functioning for future versions.

Web application fingerprinting: We added web application fingerprinting capabilities to Tsunami. Tsunami now detects popular off-the-shelf web applications. This information can be used by Tsunami for more precise and less intrusive vulnerability verification. Furthermore, it enables security teams to build a software inventory based on Tsunami scans. We'll keep working on refining our fingerprinting approach and extending our fingerprinting database.

Today, we are releasing the new detectors and the fingerprinting capabilities. You can find the new detectors and the web fingerprinter in our plugin repository.

If you are adopting Tsunami within your organization and if you have questions or would like to contribute, feel free to contact us at any time at tsunami-scanner@google.com.

By Guoli Ma, Claudio Criscione & Sebastian Lekies, Vulnerability Management Team

Google joins the Rust Foundation

Monday, February 8, 2021

Droidstacean: Rust mascot Ferris, with Android mascot color/aspects

Droidstacean by Ivan Lozano, based on a design by Karen Rustad Tölva.

Rust is a systems programming language that combines low-level control over performance with modern language features and a focus on memory safety. Memory safety has been an enduring challenge for software developers, particularly those working on systems programs. Google has begun using Rust in settings where memory safety and performance are key considerations, including in key Android systems.

The Rust Core Team recently completed its work to build a new home for Rust: The Rust Foundation. Building on Google’s longstanding investments in C/C++ and the compilers and toolchains, we are delighted to announce our membership in the Rust Foundation. We look forward to participating more in the Rust community, in particular working across the industry on key issues including interoperability with C++, coordinating security reviews and decreasing the costs of crate updates, and continuing to grow our investments in existing Rust projects.

Memory safety security defects frequently threaten device safety, especially for applications and operating systems. For example, on Android, we’ve found that more than half of the security vulnerabilities we addressed in 2019 resulted from memory safety bugs. And this is despite significant efforts from Google and other contributors to the Android Open Source Project to either invest in or invent a variety of technologies, including AddressSanitizer, improved memory allocators, and numerous fuzzers and other code checking tools. Rust has proven effective at providing an additional layer of protection beyond even these tools in many other settings, including browsers, games, and even key libraries. We are excited to expand both our usage of Rust at Google and our contributions to the Rust Foundation and Rust ecosystem.

Today, some examples of projects where Google is either already using Rust or contributing to the Rust ecosystem include:

Operating system modules in Android, including Bluetooth and Keystore 2.0
Low-level projects, such as the crosvm virtual machine monitor and drivers (alternative to QEMU) used in ChromeOS
Contributing to open source projects that we use and that use Rust, such as the Mercurial source code control system
Firmware for FIDO security key support

And, there are many additional projects that are evaluating the use of Rust for new libraries or products. Some examples include:

The software internationalization project, ICU4X
Parts of the new experimental operating system, Fuchsia
Research on GPU font rendering

We are also excited to support key Rust projects and their maintainers, such as:

Adding Rust code to curl
Working with ISRG to add a Rust TLS module to the Apache HTTP Server Project

We can’t wait to work across the industry to contribute to and support existing projects and libraries as well as help build out key areas such as C++ interoperability and security review.

By Lars Bergstrom, Director of Engineering, Android Platform Programming Languages

Launching OSV - Better vulnerability triage for open source

Friday, February 5, 2021

We are excited to launch OSV (Open Source Vulnerabilities), our first step towards improving vulnerability triage for developers and consumers of open source software. The goal of OSV is to provide precise data on where a vulnerability was introduced and where it got fixed, thereby helping consumers of open source software accurately identify if they are impacted and then make security fixes as quickly as possible. We have started OSV with a data set of fuzzing vulnerabilities found by the OSS-Fuzz service. The OSV project evolved from our recent efforts to improve vulnerability management in open source (the "Know, Prevent, Fix" framework).

Vulnerability management can be painful for both consumers and maintainers of open source software, with tedious manual work involved in many cases.

For consumers of open source software, it is often difficult to map a vulnerability such as a Common Vulnerabilities and Exposures (CVE) entry to the package versions they are using. This comes from the fact that versioning schemes in existing vulnerability standards (such as Common Platform Enumeration (CPE)) do not map well with the actual open source versioning schemes, which are typically versions/tags and commit hashes. The result is missed vulnerabilities that affect downstream consumers.

Similarly, it is time consuming for maintainers to determine an accurate list of affected versions or commits across all their branches for downstream consumers after a vulnerability is fixed, in addition to the process required for publication. Unfortunately, many open source projects, including ones that are critical to modern infrastructure, are under resourced and overworked. Maintainers don't always have the bandwidth to create and publish thorough, accurate information about their vulnerabilities even if they want to.

These challenges result in open source consumers not incorporating important security fixes promptly. OSV aims to:

Reduce the work required by maintainers to publish vulnerabilities, and
Improve the accuracy of vulnerability queries for downstream consumers by providing precise vulnerability metadata in an easy-to-query database (complementing existing vulnerability databases).

Automation

OSV aims to simplify the vulnerability reporting process for an open source package maintainer by accurately determining the list of affected versions and commits. This requires providing both the commits that introduce and fix the bugs. If that information is not available, OSV requires providing a reproduction test case and steps to generate an application build, and then it performs bisection to find these commits in an automated fashion. OSV takes care of the rest of the analysis to figure out impacted commit ranges (accounting for cherry picks) and versions/tags.
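
The bisection step described above can be sketched as two binary searches over an ordered commit history. This is only an illustration under stated assumptions, not OSV's actual implementation: the names `first_true`, `bisect_bug` and `reproduces` are hypothetical, the history is assumed to be linear (oldest to newest), and the bug is assumed to be present in one contiguous range of commits around a commit already known to be bad.

```python
from typing import Callable, Optional, Sequence, Tuple


def first_true(items: Sequence[str], predicate: Callable[[str], bool]) -> int:
    """Smallest index i with predicate(items[i]) true, or len(items) if none.

    Binary search: only valid when predicate is false for a prefix of
    items and true for the rest.
    """
    lo, hi = 0, len(items)
    while lo < hi:
        mid = (lo + hi) // 2
        if predicate(items[mid]):
            hi = mid
        else:
            lo = mid + 1
    return lo


def bisect_bug(
    commits: Sequence[str],
    known_bad: int,
    reproduces: Callable[[str], bool],
) -> Tuple[str, Optional[str]]:
    """Return (introducing commit, fixing commit or None if still unfixed)."""
    # Up to known_bad: the test case does not reproduce, then it does.
    introduced = first_true(commits[: known_bad + 1], reproduces)
    # From known_bad on: it reproduces, then (maybe) stops reproducing.
    tail = commits[known_bad:]
    fixed = first_true(tail, lambda c: not reproduces(c))
    return commits[introduced], (tail[fixed] if fixed < len(tail) else None)


# Toy history: the bug is present in c3..c6, and c5 is the reported bad commit.
history = [f"c{i}" for i in range(10)]
result = bisect_bug(history, 5, lambda c: 3 <= int(c[1:]) < 7)
print(result)
```

In a real setting `reproduces` would build the application at a given commit and run the reproduction test case, which is why the number of evaluations matters: each search needs only a logarithmic number of builds.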

OSV automates the triage workflow for an open source package consumer by providing an API to query for vulnerabilities. A typical OSV workflow for a package consumer looks like the picture above:

A package consumer sends a query to OSV with a package version or commit hash as input.

# Query by commit hash
curl -X POST -d \
  '{"commit": "6879efc2c1596d11a6a6ad296f80063b558d5e0f"}' \
  "https://api.osv.dev/v1/query?key=$API_KEY"

# Query by package version
curl -X POST -d \
  '{"version": "1.0.0", "package": {"name": "pkg", "ecosystem": "pypi"}}' \
  "https://api.osv.dev/v1/query?key=$API_KEY"

OSV looks up the set of vulnerabilities affecting that particular version and returns a list of vulnerabilities impacting the package. The vulnerability metadata is returned in a machine-readable JSON format.
The package consumer uses this information to either cherry-pick security fixes (based on precise fix metadata) or update to a later version.
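
The same two queries can be assembled programmatically. The sketch below uses only the Python standard library and mirrors the request bodies and endpoint from the curl examples; the helper names `build_query` and `build_request` are hypothetical, and the `key` query parameter is taken from the examples above rather than from the API documentation. The request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

OSV_QUERY_URL = "https://api.osv.dev/v1/query"  # endpoint from the curl examples


def build_query(
    commit: Optional[str] = None,
    version: Optional[str] = None,
    name: Optional[str] = None,
    ecosystem: Optional[str] = None,
) -> dict:
    """Build a query body for either a commit hash or a package version."""
    if commit is not None:
        if version or name or ecosystem:
            raise ValueError("query by commit or by package version, not both")
        return {"commit": commit}
    if version is None or name is None or ecosystem is None:
        raise ValueError("version, name and ecosystem are all required")
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}


def build_request(query: dict, api_key: str) -> urllib.request.Request:
    """Wrap the query in a POST request (assumes the key is a URL parameter)."""
    url = OSV_QUERY_URL + "?" + urllib.parse.urlencode({"key": api_key})
    return urllib.request.Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


by_commit = build_query(commit="6879efc2c1596d11a6a6ad296f80063b558d5e0f")
by_version = build_query(version="1.0.0", name="pkg", ecosystem="pypi")
request = build_request(by_version, api_key="EXAMPLE_KEY")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; the JSON response can then be decoded with `json.load`.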

Ongoing work

OSV currently provides access to thousands of vulnerabilities from 380+ critical OSS projects integrated with OSS-Fuzz. We are planning to work with open source communities to extend OSV with data from various language ecosystems (e.g. NPM, PyPI) and work out a pipeline for package maintainers to submit vulnerabilities with minimal work.

Our goal with OSV is to rethink and promote better, scalable vulnerability tracking for open source. In an ideal world, vulnerability management should be done closer to the actual open source development process, aided by automated infrastructure. Projects that depend on open source should be promptly notified when a vulnerability is reported, and should be able to adopt fixes quickly.

You can access the OSV website and documentation at https://osv.dev. You can explore the open source repo or contribute to the project on GitHub, and join the mailing list to stay up to date with OSV and share your thoughts on vulnerability tracking.

By Oliver Chang and Kim Lewandowski, Google Security Team

Assess the security of Google Kubernetes Engine (GKE) with InSpec for GCP

Monday, January 25, 2021

We are excited to announce that the GKE CIS 1.1.0 Benchmark InSpec profile is now available on GitHub under an open source software license. It allows you to assess Google Kubernetes Engine (GKE) clusters against security controls recommended by CIS. You can validate the security posture of your GKE clusters using Chef InSpec™ by assessing their compliance against the Center for Internet Security (CIS) 1.1.0 benchmark for GKE.

Validating security compliance of GKE

GKE is a popular platform to run containerized applications. Many organizations have selected GKE for its scalability, self-healing, observability and integrations with other services on Google Cloud. Developer agility is one of the most compelling arguments for moving to a microservices architecture on Kubernetes, introducing configuration changes at a faster pace and demanding security checks as part of the development lifecycle.

Validating the security settings of your GKE cluster is a complex challenge and requires an analysis of multiple layers within your Cloud infrastructure:

GKE is a managed service on GCP, with controls to tweak the cluster’s behaviour which have an impact on its security posture. These Cloud resource configurations can be configured and audited via Infrastructure-as-Code (IaC) frameworks such as Terraform, the gcloud command line or the Google Cloud Console.
Application workloads are deployed on GKE by interacting via the Kubernetes (K8S) API. Kubernetes resources such as pods, deployments and services are often deployed from yaml templates using the command line tool kubectl.
Kubernetes uses configuration files (such as the kube-proxy and kubelet file) typically in yaml format which are stored on the nodes’ file system.

InSpec for auditing GKE

InSpec is a popular DevSecOps framework that checks the configuration state of resources in virtual machines and containers, on Cloud providers such as Google Cloud, AWS, and Microsoft Azure. The InSpec GCP resource pack 1.8 (InSpec-GCP) provides a consistent way to audit GCP resources and can be used to validate the attributes of a GKE cluster against a desired state declared in code. We previously released a blog post on how to validate your Google Cloud resources with InSpec-GCP against compliance profiles such as the CIS 1.1.0 benchmark for GCP.

While you can use the InSpec-GCP resource pack to define the InSpec controls to validate resources against the Google Cloud API, it does not directly allow you to validate configurations of other relevant layers such as Kubernetes resources and config files on the nodes. Luckily, the challenge to audit Kubernetes resources with InSpec has already been solved by the inspec-k8s resource pack. Further, files on nodes can be audited using remote access via SSH. All together, we can validate the security posture of GKE holistically using the inspec-gcp and inspec-k8s resource packs as well as controls using the InSpec file resource executed in an ssh session.
Running the CIS for GKE compliance profile with InSpec

With the GKE CIS 1.1.0 Benchmark InSpec Profile we have implemented the security controls to validate a GKE cluster against the recommended settings on GCP resource level, Kubernetes API level and file system level. The repository is split into three profiles (inspec-gke-cis-gcp, inspec-gke-cis-k8s and inspec-gke-cis-ssh), since each profile requires a different “target”, or -t parameter when run using the InSpec command line. For ease of use, a wrapper script run_profiles.sh has been provided in the root directory of the repository with the purpose of running all three profiles and storing the reports in the dedicated folder reports.
The script requires the cluster name (-c), ssh username (-u), private key file for ssh authentication (-k), cluster region or zone (-r or -z) and InSpec input file as required by the inspec.yml files in each profile (-i). As an example, the following line will run all three profiles to validate the compliance of cluster inspec-cluster in zone us-central1-a:

./run_profiles.sh -c inspec-cluster \
                  -u konrad \
                  -k /home/konrad/.ssh/google_compute_engine \
                  -z us-central1-a \
                  -i inputs.yml
Running InSpec profile inspec-gke-cis-gcp ...

Profile: InSpec GKE CIS 1.1 Benchmark (inspec-gke-cis-gcp)
Version: 0.1.0
Target: gcp://<service account used for InSpec>

<lots of InSpec output omitted>

Profile Summary: 16 successful controls, 10 control failures, 2 controls skipped
Test Summary: 18 successful, 11 failures, 2 skipped
Stored report in reports/inspec-gke-cis-gcp_report.
Running InSpec profile inspec-gke-cis-k8s …

Profile: InSpec GKE CIS 1.1 Benchmark (inspec-gke-cis-k8s)
Version: 0.1.0
Target: kubernetes://<IP address of K8S endpoint>:443

<lots of InSpec output omitted>

Profile Summary: 9 successful controls, 1 control failure, 0 controls skipped
Test Summary: 9 successful, 1 failure, 0 skipped
Stored report in reports/inspec-gke-cis-k8s_report.
Running InSpec profile inspec-gke-cis-ssh on node <cluster node 1> ...

Profile: InSpec GKE CIS 1.1 Benchmark (inspec-gke-cis-ssh)
Version: 0.1.0
Target: ssh://<username>@<cluster node 1>:22

<lots of InSpec output omitted>

Profile Summary: 10 successful controls, 5 control failures, 1 control skipped
Test Summary: 12 successful, 6 failures, 1 skipped
Stored report in reports/inspec-gke-cis-ssh_<cluster node 1>_report.
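
When the wrapper script is driven from automation, its argument list can be assembled in code. The sketch below only builds the command from the flags described above (`-c`, `-u`, `-k`, `-r` or `-z`, `-i`); the function name `run_profiles_command` is hypothetical, and requiring exactly one of zone or region is an assumption based on the "-r or -z" wording.

```python
from typing import List, Optional


def run_profiles_command(
    cluster: str,
    ssh_user: str,
    ssh_key: str,
    input_file: str,
    zone: Optional[str] = None,
    region: Optional[str] = None,
    script: str = "./run_profiles.sh",
) -> List[str]:
    """Build the argument list for the wrapper script without running it."""
    # Assumption: the script expects a zone or a region, never both.
    if (zone is None) == (region is None):
        raise ValueError("pass exactly one of zone or region")
    location = ["-z", zone] if zone is not None else ["-r", region]
    return [
        script,
        "-c", cluster,
        "-u", ssh_user,
        "-k", ssh_key,
        *location,
        "-i", input_file,
    ]


command = run_profiles_command(
    cluster="inspec-cluster",
    ssh_user="konrad",
    ssh_key="/home/konrad/.ssh/google_compute_engine",
    input_file="inputs.yml",
    zone="us-central1-a",
)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the root directory of the repository.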

Analyze your scan reports

Once the wrapper script has completed successfully you should analyze the JSON or HTML reports to validate the compliance of your GKE cluster. One way to perform the analysis is to upload the collection of JSON reports of a single run from the reports folder to the open source InSpec visualization tool Heimdall Lite (GitHub) by the Mitre Corporation. An example of a compliance dashboard is shown below:

Try it yourself and run the GKE CIS 1.1.0 Benchmark InSpec profile in your Google Cloud environment! Clone the repository and follow the CLI example in the Readme file to run the InSpec profiles against your GKE clusters. We also encourage you to report any issues you find on GitHub, suggest additional features, and contribute to the project using pull requests. You can also read our previous blog post on using InSpec-GCP for compliance validations of your GCP environment.
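
For a quick look at a JSON report without a dashboard, the per-control results can be tallied in a few lines. This sketch assumes the report follows the layout of InSpec's JSON reporter (a top-level `profiles` list, each with `controls`, each with `results` carrying a `status` of `passed`, `failed` or `skipped`); the function name, the sample data and the control IDs are hypothetical, and the roll-up rule (any failed result fails the control) is a simplification.

```python
from collections import Counter
from typing import Any, Dict


def summarize_controls(report: Dict[str, Any]) -> Counter:
    """Count controls per outcome in a parsed InSpec JSON report."""
    summary: Counter = Counter()
    for profile in report.get("profiles", []):
        for control in profile.get("controls", []):
            statuses = {r.get("status") for r in control.get("results", [])}
            if "failed" in statuses:
                summary["failed"] += 1      # any failed test fails the control
            elif "passed" in statuses:
                summary["passed"] += 1
            else:
                summary["skipped"] += 1     # only skipped results, or none
    return summary


# Hypothetical sample in the assumed layout; real reports contain more fields.
sample_report = {
    "profiles": [
        {
            "name": "inspec-gke-cis-gcp",
            "controls": [
                {"id": "example-1", "results": [{"status": "passed"}]},
                {"id": "example-2", "results": [{"status": "passed"},
                                                {"status": "failed"}]},
                {"id": "example-3", "results": [{"status": "skipped"}]},
            ],
        }
    ]
}

summary = summarize_controls(sample_report)
print(dict(summary))
```

A report file from the reports folder could be loaded with `json.load` and passed to the same function.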

By Bakh Inamov, Security Specialist Engineer and Konrad Schieban, Infrastructure Cloud Consultant