Use C++11 atomics

This CL moves the Chrome code base to C++11 atomics instead of inline
assembly for the toolchains that support C++11's <atomic>. The patch is
constructed to modify as little code as possible, and will be followed up with
other patches to clean up the rest of the code base.

This change should allow compilers to emit better code when dealing with
atomics (LLVM has recently seen substantial improvements thanks to morisset's
work), make the code more portable, and eventually let us delete a bunch of
code. In subsequent changes, atomicity will become a property of the memory
location instead of individual accesses, making the code less error-prone and
easier to analyze.

This patch stems from a fix I recently made to V8 to build under NaCl and
PNaCl. This was gating V8's C++11 migration: they needed to move to the
LLVM-based NaCl toolchain, which didn't work with atomicops' inline assembly. V8
currently uses the __sync_* primitives for the NaCl/PNaCl build, which imply
sequential consistency and are therefore suboptimal. Before doing further work I
want to fix Chrome's own headers, and trickle these changes to V8.
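
As an illustration, here is a minimal sketch (not from this CL; names are
illustrative) of why the __sync_* builtins are suboptimal: they always imply
sequentially-consistent semantics, whereas <atomic> lets the caller pick the
weakest ordering that is still correct.

  #include <atomic>

  int relaxed_increment(std::atomic<int>* counter) {
    // A statistics counter needs no ordering: on weakly-ordered targets this
    // can compile to a plain atomic add with no barrier.
    return counter->fetch_add(1, std::memory_order_relaxed);
  }

  int sync_increment(int* counter) {
    // The builtin is always sequentially consistent, so the compiler must
    // emit the stronger (and slower) barriers whether they are needed or not.
    return __sync_fetch_and_add(counter, 1);
  }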

Future changes:
 * The atomicops headers are copied to 8 locations [0] in the Chrome code
   base, and not all of them are up to date. They should all be updated in
   their individual repositories (some are in third_party).
 * The signature of these functions should be updated to take an atomic
   location instead of a plain pointer argument, and all call sites should be
   updated; there are currently 127 such calls [1] in chromium/src in
   non-implementation and non-test files. See the sketch after this list.
 * Atomic operations should be used directly.
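
A hypothetical sketch of the signature change mentioned above (the names and
exact shape are illustrative, not decided by this CL): the location itself
becomes atomic, so the type system keeps atomic and non-atomic accesses from
mixing.

  #include <atomic>
  #include <stdint.h>

  typedef int32_t Atomic32;

  // Today: atomicity is a property of each access through a plain pointer.
  Atomic32 NoBarrier_Load(volatile const Atomic32* ptr);

  // After the migration: atomicity is a property of the memory location.
  inline Atomic32 NoBarrier_Load(const std::atomic<Atomic32>* location) {
    return location->load(std::memory_order_relaxed);
  }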

A few things worth noting:
 * The atomicops_internals_portable.h file contains the main implementation, and
   has extra notes on implementation choices.
 * The CL removes x86 CPU feature detection from the portable implementation:
    - Because the headers are copied in a few places and the x86 CPU feature
      detection uses a static global that parses cpuid, the
      atomicops_internals_x86_gcc.cc file was only built once, but its struct
      interface was declared external and used in protobuf and
      tcmalloc. Removing the struct from Chrome's own base makes the linker sad
      because of those two uses. This has two implications:
       # Don't specialize atomics for SSE2 versus non-SSE2. This is a non-issue
         since Chrome requires a minimum of SSE2, and the code is therefore
         faster (it skips a branch).
       # The code doesn't detect the AMD Opteron Rev E memory barrier bug from
         AMD errata 147 [2]. This bug affects Opterons 1xx/2xx/8xx that shipped
         in 2003-2004. In general compilers should work around this bug, not
         library code. GCC has had this workaround since 2009 [3], but it is
         still an open bug in LLVM [4]. See comment #27 on the code review:
         this CPU is used by 0.02% of Chrome users, and for the bug to manifest
         the race would have to occur on a compiler without the workaround.
         This also makes the code faster by skipping a branch.
 * The CL moves the declaration of the x86 CPU features struct to atomicops.h,
   and matches its fields with the ones duplicated across the code base. This
   is transitional: it fixes a link failure because tcmalloc relied on the
   x86_gcc.{h,cc} declaration and implementation, and it fixes an ODR
   violation where the struct didn't always have 3 members (and the other
   members were sometimes accessed, uninitialized). See the sketch after this
   list.
 * tsan already rewrites all atomics to its own primitives, so its header is
   now unnecessary. The implementation takes care to detect and error out if
   that turns out not to be true.
 * MemoryBarrier works around a libstdc++ bug in older versions [5]. The
   workaround is exactly the non-buggy code for libstdc++ (call out to the
   compiler's builtin).
 * The assembly generated by GCC 4.8 and LLVM 3.6 for x86-32/x86-64/ARM when
   using this portable implementation can be found in [6].
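
The ODR violation mentioned above is worth spelling out. A minimal sketch
(hypothetical file names) of the hazard: two translation units see different
definitions of the same struct, so code compiled against the larger
definition reads past the end of the smaller object.

  // In Chrome's base (the definition this CL removes from x86_gcc.h):
  struct AtomicOps_x86CPUFeatureStruct {
    bool has_amd_lock_mb_bug;
  };

  // In a copied header linked by tcmalloc/protobuf:
  struct AtomicOps_x86CPUFeatureStruct {
    bool has_amd_lock_mb_bug;
    bool has_sse2;        // Uninitialized if the object was defined with the
    bool has_cmpxchg16b;  // one-member layout above.
  };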

[0]: find . -name "atomicops.h"
  ./base/atomicops.h
  ./v8/src/base/atomicops.h
  ./native_client_sdk/src/libraries/sdk_util/atomicops.h
  ./third_party/webrtc/base/atomicops.h
  ./third_party/tcmalloc/chromium/src/base/atomicops.h
  ./third_party/tcmalloc/vendor/src/base/atomicops.h
  ./third_party/re2/util/atomicops.h
  ./third_party/protobuf/src/google/protobuf/stubs/atomicops.h
[1]: git grep Barrier_ | grep -v atomicops | grep -v unittest | wc -l
[2]: http://support.amd.com/us/Processor_TechDocs/25759.pdf
[3]: https://gcc.gnu.org/ml/gcc-patches/2009-10/txt00046.txt
[4]: http://llvm.org/bugs/show_bug.cgi?id=5934
[5]: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51038
[6]: https://code.google.com/p/chromium/issues/detail?id=423074

TEST=ninja -C out/Release base_unittests
TEST=trybots
BUG=246514
BUG=234215
BUG=420970
BUG=423074
R= dvyukov@google.com, thakis@chromium.org, glider@chromium.org, hboehm@google.com, morisset@google.com, willchan@chromium.org

Review URL: https://codereview.chromium.org/636783002

Cr-Commit-Position: refs/heads/master@{#299485}
diff --git a/base/BUILD.gn b/base/BUILD.gn
index 52f6c03..cb8820c 100644
--- a/base/BUILD.gn
+++ b/base/BUILD.gn
@@ -78,7 +78,7 @@
     "atomicops.h",
     "atomicops_internals_gcc.h",
     "atomicops_internals_mac.h",
-    "atomicops_internals_tsan.h",
+    "atomicops_internals_portable.h",
     "atomicops_internals_x86_gcc.cc",
     "atomicops_internals_x86_gcc.h",
     "atomicops_internals_x86_msvc.h",
diff --git a/base/atomicops.h b/base/atomicops.h
index 84be8c0..833e1704 100644
--- a/base/atomicops.h
+++ b/base/atomicops.h
@@ -28,8 +28,11 @@
 #ifndef BASE_ATOMICOPS_H_
 #define BASE_ATOMICOPS_H_
 
+#include <cassert>  // Small C++ header which defines implementation specific
+                    // macros used to identify the STL implementation.
 #include <stdint.h>
 
+#include "base/base_export.h"
 #include "build/build_config.h"
 
 #if defined(OS_WIN) && defined(ARCH_CPU_64_BITS)
@@ -137,28 +140,66 @@
 }  // namespace subtle
 }  // namespace base
 
-// Include our platform specific implementation.
-#if defined(THREAD_SANITIZER)
-#include "base/atomicops_internals_tsan.h"
-#elif defined(OS_WIN) && defined(COMPILER_MSVC) && defined(ARCH_CPU_X86_FAMILY)
-#include "base/atomicops_internals_x86_msvc.h"
-#elif defined(OS_MACOSX)
-#include "base/atomicops_internals_mac.h"
-#elif defined(OS_NACL)
-#include "base/atomicops_internals_gcc.h"
-#elif defined(COMPILER_GCC) && defined(ARCH_CPU_ARMEL)
-#include "base/atomicops_internals_arm_gcc.h"
-#elif defined(COMPILER_GCC) && defined(ARCH_CPU_ARM64)
-#include "base/atomicops_internals_arm64_gcc.h"
-#elif defined(COMPILER_GCC) && defined(ARCH_CPU_X86_FAMILY)
-#include "base/atomicops_internals_x86_gcc.h"
-#elif defined(COMPILER_GCC) && \
-      (defined(ARCH_CPU_MIPS_FAMILY) || defined(ARCH_CPU_MIPS64_FAMILY))
-#include "base/atomicops_internals_mips_gcc.h"
-#else
-#error "Atomic operations are not supported on your platform"
+// The following x86 CPU features are used in atomicops_internals_x86_gcc.h, but
+// this file is duplicated inside of Chrome: protobuf and tcmalloc rely on the
+// struct being present at link time. Some parts of Chrome can currently use the
+// portable interface whereas others still use the GCC one. The include guards
+// the same as in atomicops_internals_x86_gcc.cc.
+#if defined(__i386__) || defined(__x86_64__)
+// This struct is not part of the public API of this module; clients may not
+// use it.  (However, it's exported via BASE_EXPORT because clients implicitly
+// do use it at link time by inlining these functions.)
+// Features of this x86.  Values may not be correct before main() is run,
+// but are set conservatively.
+struct AtomicOps_x86CPUFeatureStruct {
+  bool has_amd_lock_mb_bug; // Processor has AMD memory-barrier bug; do lfence
+                            // after acquire compare-and-swap.
+  // The following fields are unused by Chrome's base implementation but are
+  // still used by copies of the same code in other parts of the code base. This
+  // causes an ODR violation, and the other code is likely reading invalid
+  // memory.
+  // TODO(jfb) Delete these fields once the rest of the Chrome code base doesn't
+  //           depend on them.
+  bool has_sse2;            // Processor has SSE2.
+  bool has_cmpxchg16b;      // Processor supports cmpxchg16b instruction.
+};
+BASE_EXPORT extern struct AtomicOps_x86CPUFeatureStruct
+    AtomicOps_Internalx86CPUFeatures;
 #endif
 
+// Try to use a portable implementation based on C++11 atomics.
+//
+// Some toolchains support C++11 language features without supporting library
+// features (recent compiler, older STL). Whitelist libstdc++ and libc++ that we
+// know will have <atomic> when compiling C++11.
+#if ((__cplusplus >= 201103L) &&                            \
+     ((defined(__GLIBCXX__) && (__GLIBCXX__ > 20110216)) || \
+      (defined(_LIBCPP_VERSION) && (_LIBCPP_STD_VER >= 11))))
+#  include "base/atomicops_internals_portable.h"
+#else  // Otherwise use a platform specific implementation.
+#  if defined(THREAD_SANITIZER)
+#    error "Thread sanitizer must use the portable atomic operations"
+#  elif (defined(OS_WIN) && defined(COMPILER_MSVC) && \
+         defined(ARCH_CPU_X86_FAMILY))
+#    include "base/atomicops_internals_x86_msvc.h"
+#  elif defined(OS_MACOSX)
+#    include "base/atomicops_internals_mac.h"
+#  elif defined(OS_NACL)
+#    include "base/atomicops_internals_gcc.h"
+#  elif defined(COMPILER_GCC) && defined(ARCH_CPU_ARMEL)
+#    include "base/atomicops_internals_arm_gcc.h"
+#  elif defined(COMPILER_GCC) && defined(ARCH_CPU_ARM64)
+#    include "base/atomicops_internals_arm64_gcc.h"
+#  elif defined(COMPILER_GCC) && defined(ARCH_CPU_X86_FAMILY)
+#    include "base/atomicops_internals_x86_gcc.h"
+#  elif (defined(COMPILER_GCC) && \
+         (defined(ARCH_CPU_MIPS_FAMILY) || defined(ARCH_CPU_MIPS64_FAMILY)))
+#    include "base/atomicops_internals_mips_gcc.h"
+#  else
+#    error "Atomic operations are not supported on your platform"
+#  endif
+#endif   // Portable / non-portable includes.
+
 // On some platforms we need additional declarations to make
 // AtomicWord compatible with our other Atomic* types.
 #if defined(OS_MACOSX) || defined(OS_OPENBSD)
diff --git a/base/atomicops_internals_portable.h b/base/atomicops_internals_portable.h
new file mode 100644
index 0000000..b25099f
--- /dev/null
+++ b/base/atomicops_internals_portable.h
@@ -0,0 +1,227 @@
+// Copyright (c) 2014 The Chromium Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style license that can be
+// found in the LICENSE file.
+
+// This file is an internal atomic implementation, use atomicops.h instead.
+//
+// This implementation uses C++11 atomics' member functions. The code base is
+// currently written assuming atomicity revolves around accesses instead of
+// C++11's memory locations. The burden is on the programmer to ensure that all
+// memory locations accessed atomically are never accessed non-atomically (tsan
+// should help with this).
+//
+// TODO(jfb) Modify the atomicops.h API and user code to declare atomic
+//           locations as truly atomic. See the static_assert below.
+//
+// Of note in this implementation:
+//  * All NoBarrier variants are implemented as relaxed.
+//  * All Barrier variants are implemented as sequentially-consistent.
+//  * Compare exchange's failure ordering is always the same as the success one
+//    (except for release, which fails as relaxed): using a weaker ordering is
+//    only valid under certain uses of compare exchange.
+//  * Acquire store doesn't exist in the C11 memory model; it is instead
+//    implemented as a relaxed store followed by a sequentially consistent
+//    fence.
+//  * Release load doesn't exist in the C11 memory model; it is instead
+//    implemented as sequentially consistent fence followed by a relaxed load.
+//  * Atomic increment is expected to return the post-incremented value, whereas
+//    C11 fetch add returns the previous value. The implementation therefore
+//    needs to increment twice (which the compiler should be able to detect and
+//    optimize).
+
+#ifndef BASE_ATOMICOPS_INTERNALS_PORTABLE_H_
+#define BASE_ATOMICOPS_INTERNALS_PORTABLE_H_
+
+#include <atomic>
+
+namespace base {
+namespace subtle {
+
+// This implementation is transitional and maintains the original API for
+// atomicops.h. This requires casting memory locations to the atomic types, and
+// assumes that the API and the C++11 implementation are layout-compatible,
+// which isn't true for all implementations or hardware platforms. The static
+// assertion should detect this issue, were it to fire then this header
+// shouldn't be used.
+//
+// TODO(jfb) If this header manages to stay committed then the API should be
+//           modified, and all call sites updated.
+typedef volatile std::atomic<Atomic32>* AtomicLocation32;
+static_assert(sizeof(*(AtomicLocation32) nullptr) == sizeof(Atomic32),
+              "incompatible 32-bit atomic layout");
+
+inline void MemoryBarrier() {
+#if defined(__GLIBCXX__)
+  // Work around libstdc++ bug 51038 where atomic_thread_fence was declared but
+  // not defined, leading to the linker complaining about undefined references.
+  __atomic_thread_fence(std::memory_order_seq_cst);
+#else
+  std::atomic_thread_fence(std::memory_order_seq_cst);
+#endif
+}
+
+inline Atomic32 NoBarrier_CompareAndSwap(volatile Atomic32* ptr,
+                                         Atomic32 old_value,
+                                         Atomic32 new_value) {
+  ((AtomicLocation32)ptr)
+      ->compare_exchange_strong(old_value,
+                                new_value,
+                                std::memory_order_relaxed,
+                                std::memory_order_relaxed);
+  return old_value;
+}
+
+inline Atomic32 NoBarrier_AtomicExchange(volatile Atomic32* ptr,
+                                         Atomic32 new_value) {
+  return ((AtomicLocation32)ptr)
+      ->exchange(new_value, std::memory_order_relaxed);
+}
+
+inline Atomic32 NoBarrier_AtomicIncrement(volatile Atomic32* ptr,
+                                          Atomic32 increment) {
+  return increment +
+         ((AtomicLocation32)ptr)
+             ->fetch_add(increment, std::memory_order_relaxed);
+}
+
+inline Atomic32 Barrier_AtomicIncrement(volatile Atomic32* ptr,
+                                        Atomic32 increment) {
+  return increment + ((AtomicLocation32)ptr)->fetch_add(increment);
+}
+
+inline Atomic32 Acquire_CompareAndSwap(volatile Atomic32* ptr,
+                                       Atomic32 old_value,
+                                       Atomic32 new_value) {
+  ((AtomicLocation32)ptr)
+      ->compare_exchange_strong(old_value,
+                                new_value,
+                                std::memory_order_acquire,
+                                std::memory_order_acquire);
+  return old_value;
+}
+
+inline Atomic32 Release_CompareAndSwap(volatile Atomic32* ptr,
+                                       Atomic32 old_value,
+                                       Atomic32 new_value) {
+  ((AtomicLocation32)ptr)
+      ->compare_exchange_strong(old_value,
+                                new_value,
+                                std::memory_order_release,
+                                std::memory_order_relaxed);
+  return old_value;
+}
+
+inline void NoBarrier_Store(volatile Atomic32* ptr, Atomic32 value) {
+  ((AtomicLocation32)ptr)->store(value, std::memory_order_relaxed);
+}
+
+inline void Acquire_Store(volatile Atomic32* ptr, Atomic32 value) {
+  ((AtomicLocation32)ptr)->store(value, std::memory_order_relaxed);
+  MemoryBarrier();
+}
+
+inline void Release_Store(volatile Atomic32* ptr, Atomic32 value) {
+  ((AtomicLocation32)ptr)->store(value, std::memory_order_release);
+}
+
+inline Atomic32 NoBarrier_Load(volatile const Atomic32* ptr) {
+  return ((AtomicLocation32)ptr)->load(std::memory_order_relaxed);
+}
+
+inline Atomic32 Acquire_Load(volatile const Atomic32* ptr) {
+  return ((AtomicLocation32)ptr)->load(std::memory_order_acquire);
+}
+
+inline Atomic32 Release_Load(volatile const Atomic32* ptr) {
+  MemoryBarrier();
+  return ((AtomicLocation32)ptr)->load(std::memory_order_relaxed);
+}
+
+#if defined(ARCH_CPU_64_BITS)
+
+typedef volatile std::atomic<Atomic64>* AtomicLocation64;
+static_assert(sizeof(*(AtomicLocation64) nullptr) == sizeof(Atomic64),
+              "incompatible 64-bit atomic layout");
+
+inline Atomic64 NoBarrier_CompareAndSwap(volatile Atomic64* ptr,
+                                         Atomic64 old_value,
+                                         Atomic64 new_value) {
+  ((AtomicLocation64)ptr)
+      ->compare_exchange_strong(old_value,
+                                new_value,
+                                std::memory_order_relaxed,
+                                std::memory_order_relaxed);
+  return old_value;
+}
+
+inline Atomic64 NoBarrier_AtomicExchange(volatile Atomic64* ptr,
+                                         Atomic64 new_value) {
+  return ((AtomicLocation64)ptr)
+      ->exchange(new_value, std::memory_order_relaxed);
+}
+
+inline Atomic64 NoBarrier_AtomicIncrement(volatile Atomic64* ptr,
+                                          Atomic64 increment) {
+  return increment +
+         ((AtomicLocation64)ptr)
+             ->fetch_add(increment, std::memory_order_relaxed);
+}
+
+inline Atomic64 Barrier_AtomicIncrement(volatile Atomic64* ptr,
+                                        Atomic64 increment) {
+  return increment + ((AtomicLocation64)ptr)->fetch_add(increment);
+}
+
+inline Atomic64 Acquire_CompareAndSwap(volatile Atomic64* ptr,
+                                       Atomic64 old_value,
+                                       Atomic64 new_value) {
+  ((AtomicLocation64)ptr)
+      ->compare_exchange_strong(old_value,
+                                new_value,
+                                std::memory_order_acquire,
+                                std::memory_order_acquire);
+  return old_value;
+}
+
+inline Atomic64 Release_CompareAndSwap(volatile Atomic64* ptr,
+                                       Atomic64 old_value,
+                                       Atomic64 new_value) {
+  ((AtomicLocation64)ptr)
+      ->compare_exchange_strong(old_value,
+                                new_value,
+                                std::memory_order_release,
+                                std::memory_order_relaxed);
+  return old_value;
+}
+
+inline void NoBarrier_Store(volatile Atomic64* ptr, Atomic64 value) {
+  ((AtomicLocation64)ptr)->store(value, std::memory_order_relaxed);
+}
+
+inline void Acquire_Store(volatile Atomic64* ptr, Atomic64 value) {
+  ((AtomicLocation64)ptr)->store(value, std::memory_order_relaxed);
+  MemoryBarrier();
+}
+
+inline void Release_Store(volatile Atomic64* ptr, Atomic64 value) {
+  ((AtomicLocation64)ptr)->store(value, std::memory_order_release);
+}
+
+inline Atomic64 NoBarrier_Load(volatile const Atomic64* ptr) {
+  return ((AtomicLocation64)ptr)->load(std::memory_order_relaxed);
+}
+
+inline Atomic64 Acquire_Load(volatile const Atomic64* ptr) {
+  return ((AtomicLocation64)ptr)->load(std::memory_order_acquire);
+}
+
+inline Atomic64 Release_Load(volatile const Atomic64* ptr) {
+  MemoryBarrier();
+  return ((AtomicLocation64)ptr)->load(std::memory_order_relaxed);
+}
+
+#endif  // defined(ARCH_CPU_64_BITS)
+}  // namespace subtle
+}  // namespace base
+
+#endif  // BASE_ATOMICOPS_INTERNALS_PORTABLE_H_
diff --git a/base/atomicops_internals_tsan.h b/base/atomicops_internals_tsan.h
deleted file mode 100644
index 24382fd9..0000000
--- a/base/atomicops_internals_tsan.h
+++ /dev/null
@@ -1,186 +0,0 @@
-// Copyright (c) 2012 The Chromium Authors. All rights reserved.
-// Use of this source code is governed by a BSD-style license that can be
-// found in the LICENSE file.
-
-// This file is an internal atomic implementation for compiler-based
-// ThreadSanitizer. Use base/atomicops.h instead.
-
-#ifndef BASE_ATOMICOPS_INTERNALS_TSAN_H_
-#define BASE_ATOMICOPS_INTERNALS_TSAN_H_
-
-#include <sanitizer/tsan_interface_atomic.h>
-
-namespace base {
-namespace subtle {
-
-inline Atomic32 NoBarrier_CompareAndSwap(volatile Atomic32* ptr,
-                                         Atomic32 old_value,
-                                         Atomic32 new_value) {
-  Atomic32 cmp = old_value;
-  __tsan_atomic32_compare_exchange_strong(ptr, &cmp, new_value,
-      __tsan_memory_order_relaxed, __tsan_memory_order_relaxed);
-  return cmp;
-}
-
-inline Atomic32 NoBarrier_AtomicExchange(volatile Atomic32* ptr,
-                                         Atomic32 new_value) {
-  return __tsan_atomic32_exchange(ptr, new_value,
-      __tsan_memory_order_relaxed);
-}
-
-inline Atomic32 Acquire_AtomicExchange(volatile Atomic32* ptr,
-                                       Atomic32 new_value) {
-  return __tsan_atomic32_exchange(ptr, new_value,
-      __tsan_memory_order_acquire);
-}
-
-inline Atomic32 Release_AtomicExchange(volatile Atomic32* ptr,
-                                       Atomic32 new_value) {
-  return __tsan_atomic32_exchange(ptr, new_value,
-      __tsan_memory_order_release);
-}
-
-inline Atomic32 NoBarrier_AtomicIncrement(volatile Atomic32* ptr,
-                                          Atomic32 increment) {
-  return increment + __tsan_atomic32_fetch_add(ptr, increment,
-      __tsan_memory_order_relaxed);
-}
-
-inline Atomic32 Barrier_AtomicIncrement(volatile Atomic32* ptr,
-                                        Atomic32 increment) {
-  return increment + __tsan_atomic32_fetch_add(ptr, increment,
-      __tsan_memory_order_acq_rel);
-}
-
-inline Atomic32 Acquire_CompareAndSwap(volatile Atomic32* ptr,
-                                       Atomic32 old_value,
-                                       Atomic32 new_value) {
-  Atomic32 cmp = old_value;
-  __tsan_atomic32_compare_exchange_strong(ptr, &cmp, new_value,
-      __tsan_memory_order_acquire, __tsan_memory_order_acquire);
-  return cmp;
-}
-
-inline Atomic32 Release_CompareAndSwap(volatile Atomic32* ptr,
-                                       Atomic32 old_value,
-                                       Atomic32 new_value) {
-  Atomic32 cmp = old_value;
-  __tsan_atomic32_compare_exchange_strong(ptr, &cmp, new_value,
-      __tsan_memory_order_release, __tsan_memory_order_relaxed);
-  return cmp;
-}
-
-inline void NoBarrier_Store(volatile Atomic32* ptr, Atomic32 value) {
-  __tsan_atomic32_store(ptr, value, __tsan_memory_order_relaxed);
-}
-
-inline void Acquire_Store(volatile Atomic32* ptr, Atomic32 value) {
-  __tsan_atomic32_store(ptr, value, __tsan_memory_order_relaxed);
-  __tsan_atomic_thread_fence(__tsan_memory_order_seq_cst);
-}
-
-inline void Release_Store(volatile Atomic32* ptr, Atomic32 value) {
-  __tsan_atomic32_store(ptr, value, __tsan_memory_order_release);
-}
-
-inline Atomic32 NoBarrier_Load(volatile const Atomic32* ptr) {
-  return __tsan_atomic32_load(ptr, __tsan_memory_order_relaxed);
-}
-
-inline Atomic32 Acquire_Load(volatile const Atomic32* ptr) {
-  return __tsan_atomic32_load(ptr, __tsan_memory_order_acquire);
-}
-
-inline Atomic32 Release_Load(volatile const Atomic32* ptr) {
-  __tsan_atomic_thread_fence(__tsan_memory_order_seq_cst);
-  return __tsan_atomic32_load(ptr, __tsan_memory_order_relaxed);
-}
-
-inline Atomic64 NoBarrier_CompareAndSwap(volatile Atomic64* ptr,
-                                         Atomic64 old_value,
-                                         Atomic64 new_value) {
-  Atomic64 cmp = old_value;
-  __tsan_atomic64_compare_exchange_strong(ptr, &cmp, new_value,
-      __tsan_memory_order_relaxed, __tsan_memory_order_relaxed);
-  return cmp;
-}
-
-inline Atomic64 NoBarrier_AtomicExchange(volatile Atomic64* ptr,
-                                         Atomic64 new_value) {
-  return __tsan_atomic64_exchange(ptr, new_value, __tsan_memory_order_relaxed);
-}
-
-inline Atomic64 Acquire_AtomicExchange(volatile Atomic64* ptr,
-                                       Atomic64 new_value) {
-  return __tsan_atomic64_exchange(ptr, new_value, __tsan_memory_order_acquire);
-}
-
-inline Atomic64 Release_AtomicExchange(volatile Atomic64* ptr,
-                                       Atomic64 new_value) {
-  return __tsan_atomic64_exchange(ptr, new_value, __tsan_memory_order_release);
-}
-
-inline Atomic64 NoBarrier_AtomicIncrement(volatile Atomic64* ptr,
-                                          Atomic64 increment) {
-  return increment + __tsan_atomic64_fetch_add(ptr, increment,
-      __tsan_memory_order_relaxed);
-}
-
-inline Atomic64 Barrier_AtomicIncrement(volatile Atomic64* ptr,
-                                        Atomic64 increment) {
-  return increment + __tsan_atomic64_fetch_add(ptr, increment,
-      __tsan_memory_order_acq_rel);
-}
-
-inline void NoBarrier_Store(volatile Atomic64* ptr, Atomic64 value) {
-  __tsan_atomic64_store(ptr, value, __tsan_memory_order_relaxed);
-}
-
-inline void Acquire_Store(volatile Atomic64* ptr, Atomic64 value) {
-  __tsan_atomic64_store(ptr, value, __tsan_memory_order_relaxed);
-  __tsan_atomic_thread_fence(__tsan_memory_order_seq_cst);
-}
-
-inline void Release_Store(volatile Atomic64* ptr, Atomic64 value) {
-  __tsan_atomic64_store(ptr, value, __tsan_memory_order_release);
-}
-
-inline Atomic64 NoBarrier_Load(volatile const Atomic64* ptr) {
-  return __tsan_atomic64_load(ptr, __tsan_memory_order_relaxed);
-}
-
-inline Atomic64 Acquire_Load(volatile const Atomic64* ptr) {
-  return __tsan_atomic64_load(ptr, __tsan_memory_order_acquire);
-}
-
-inline Atomic64 Release_Load(volatile const Atomic64* ptr) {
-  __tsan_atomic_thread_fence(__tsan_memory_order_seq_cst);
-  return __tsan_atomic64_load(ptr, __tsan_memory_order_relaxed);
-}
-
-inline Atomic64 Acquire_CompareAndSwap(volatile Atomic64* ptr,
-                                       Atomic64 old_value,
-                                       Atomic64 new_value) {
-  Atomic64 cmp = old_value;
-  __tsan_atomic64_compare_exchange_strong(ptr, &cmp, new_value,
-      __tsan_memory_order_acquire, __tsan_memory_order_acquire);
-  return cmp;
-}
-
-inline Atomic64 Release_CompareAndSwap(volatile Atomic64* ptr,
-                                       Atomic64 old_value,
-                                       Atomic64 new_value) {
-  Atomic64 cmp = old_value;
-  __tsan_atomic64_compare_exchange_strong(ptr, &cmp, new_value,
-      __tsan_memory_order_release, __tsan_memory_order_relaxed);
-  return cmp;
-}
-
-inline void MemoryBarrier() {
-  __tsan_atomic_thread_fence(__tsan_memory_order_seq_cst);
-}
-
-}  // namespace base::subtle
-}  // namespace base
-
-#endif  // BASE_ATOMICOPS_INTERNALS_TSAN_H_
diff --git a/base/atomicops_internals_x86_gcc.cc b/base/atomicops_internals_x86_gcc.cc
index 3f47458a..c21e96d 100644
--- a/base/atomicops_internals_x86_gcc.cc
+++ b/base/atomicops_internals_x86_gcc.cc
@@ -10,15 +10,11 @@
 
 #include "base/atomicops.h"
 
-// This file only makes sense with atomicops_internals_x86_gcc.h -- it
-// depends on structs that are defined in that file.  If atomicops.h
-// doesn't sub-include that file, then we aren't needed, and shouldn't
-// try to do anything.
-#ifdef BASE_ATOMICOPS_INTERNALS_X86_GCC_H_
-
 // Inline cpuid instruction.  In PIC compilations, %ebx contains the address
 // of the global offset table.  To avoid breaking such executables, this code
 // must preserve that register's value across cpuid instructions.
+//
+// The include guards are the same as in atomicops.h.
 #if defined(__i386__)
 #define cpuid(a, b, c, d, inp) \
   asm("mov %%ebx, %%edi\n"     \
@@ -39,7 +35,10 @@
 // if we haven't been initialized yet, we're probably single threaded, and our
 // default values should hopefully be pretty safe.
 struct AtomicOps_x86CPUFeatureStruct AtomicOps_Internalx86CPUFeatures = {
-  false,          // bug can't exist before process spawns multiple threads
+  false, // bug can't exist before process spawns multiple threads
+  false, // Chrome requires SSE2, but for transition assume not and initialize
+         // this properly.
+  false, // cmpxchg16b isn't present on early AMD64 CPUs.
 };
 
 namespace {
@@ -81,6 +80,12 @@
   } else {
     AtomicOps_Internalx86CPUFeatures.has_amd_lock_mb_bug = false;
   }
+
+  // edx bit 26 is SSE2, which we use to tell us whether we can use mfence.
+  AtomicOps_Internalx86CPUFeatures.has_sse2 = ((edx >> 26) & 1);
+
+  // ecx bit 13 indicates whether the cmpxchg16b instruction is supported
+  AtomicOps_Internalx86CPUFeatures.has_cmpxchg16b = ((ecx >> 13) & 1);
 }
 
 class AtomicOpsx86Initializer {
@@ -96,5 +101,3 @@
 }  // namespace
 
 #endif  // if x86
-
-#endif  // ifdef BASE_ATOMICOPS_INTERNALS_X86_GCC_H_
diff --git a/base/atomicops_internals_x86_gcc.h b/base/atomicops_internals_x86_gcc.h
index 7386fab..69eacdb 100644
--- a/base/atomicops_internals_x86_gcc.h
+++ b/base/atomicops_internals_x86_gcc.h
@@ -7,20 +7,6 @@
 #ifndef BASE_ATOMICOPS_INTERNALS_X86_GCC_H_
 #define BASE_ATOMICOPS_INTERNALS_X86_GCC_H_
 
-#include "base/base_export.h"
-
-// This struct is not part of the public API of this module; clients may not
-// use it.  (However, it's exported via BASE_EXPORT because clients implicitly
-// do use it at link time by inlining these functions.)
-// Features of this x86.  Values may not be correct before main() is run,
-// but are set conservatively.
-struct AtomicOps_x86CPUFeatureStruct {
-  bool has_amd_lock_mb_bug; // Processor has AMD memory-barrier bug; do lfence
-                            // after acquire compare-and-swap.
-};
-BASE_EXPORT extern struct AtomicOps_x86CPUFeatureStruct
-    AtomicOps_Internalx86CPUFeatures;
-
 #define ATOMICOPS_COMPILER_BARRIER() __asm__ __volatile__("" : : : "memory")
 
 namespace base {
diff --git a/base/base.gypi b/base/base.gypi
index 85f4cf4..7a980ac 100644
--- a/base/base.gypi
+++ b/base/base.gypi
@@ -81,7 +81,7 @@
           'atomicops.h',
           'atomicops_internals_gcc.h',
           'atomicops_internals_mac.h',
-          'atomicops_internals_tsan.h',
+          'atomicops_internals_portable.h',
           'atomicops_internals_x86_gcc.cc',
           'atomicops_internals_x86_gcc.h',
           'atomicops_internals_x86_msvc.h',