Go Concurrency Primitives - atomic | Chen Shungen - Golang Developer / Technology Explorer

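In the earlier posts on how Mutex, WaitGroup, and the other concurrency primitives are implemented, you will have noticed that they all rely on the atomic operations in the sync/atomic package. Atomic operations are the lowest-level building block of concurrent programming: lighter than a lock and faster than a channel, they suit high-performance concurrency control in specific scenarios. This post is devoted to atomic.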

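1. What is an atomic operation?

An atomic operation is one that cannot be interrupted. From the point of view of other goroutines it has either completed or not yet started; they never observe a half-finished intermediate state.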

Ordinary operation (count++, three steps):     Atomic operation (atomic.AddInt64, one step):

  goroutine A      goroutine B                   goroutine A      goroutine B
      │                │                             │                │
      │ read count=5   │                             │                │
      │                │ read count=5                │ AddInt64(+1)   │
      │ count=5+1=6    │                             │ (indivisible)  │
      │ write count=6  │ count=5+1=6                 │                │ AddInt64(+1)
      │                │ write count=6               │                │ (indivisible)
      │                │                             │                │
      Result: 6 (one +1 is lost) 💀                  Result: 7 (correct) ✅

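Why can atomic operations guarantee this? Because the CPU supports them directly in hardware:

┌──────────────────────────────────────────────────────────────────┐
│  How atomic operations are implemented at the CPU level          │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│  x86/amd64:                                                      │
│    LOCK instruction prefix (e.g. LOCK CMPXCHG)                   │
│    XCHG with a memory operand implies LOCK semantics             │
│                                                                  │
│  ARM:                                                            │
│    LDREX / STREX (Load-Link / Store-Conditional)                 │
│                                                                  │
│  MIPS:                                                           │
│    LL / SC (Load-Linked / Store-Conditional)                     │
│                                                                  │
│  In common: the hardware makes the whole read-modify-write       │
│  appear indivisible to other cores, by locking the cache line    │
│  (LOCK) or by failing the store on conflict so that software     │
│  retries the sequence (LL/SC)                                    │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘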

2. Operations provided by the atomic package

Go's sync/atomic package provides five classic kinds of atomic operation (Go 1.23 additionally added bitwise And and Or):

┌─────────────────────────────────────────────────────────────────────────┐
│                  sync/atomic: five kinds of operations                  │
├──────────────────┬──────────────────────────────────────────────────────┤
│  Add             │ Atomic add/subtract, returns the new value           │
│                  │ AddInt32, AddInt64, AddUint32, AddUint64, AddUintptr │
├──────────────────┼──────────────────────────────────────────────────────┤
│  Load            │ Atomic read                                          │
│                  │ LoadInt32, LoadInt64, LoadUint32, LoadUint64, ...    │
├──────────────────┼──────────────────────────────────────────────────────┤
│  Store           │ Atomic write                                         │
│                  │ StoreInt32, StoreInt64, StoreUint32, ...             │
├──────────────────┼──────────────────────────────────────────────────────┤
│  Swap            │ Atomic exchange, returns the old value               │
│                  │ SwapInt32, SwapInt64, SwapUint32, ...                │
├──────────────────┼──────────────────────────────────────────────────────┤
│  CompareAndSwap  │ Swaps only if the old value matches; reports success │
│                  │ CompareAndSwapInt32, CompareAndSwapInt64, ...        │
└──────────────────┴──────────────────────────────────────────────────────┘

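Add: atomic addition and subtraction

var counter int64

// Many goroutines increment concurrently
var wg sync.WaitGroup
for i := 0; i < 10000; i++ {
    wg.Add(1)
    go func() {
        defer wg.Done()
        atomic.AddInt64(&counter, 1) // atomic +1
    }()
}
wg.Wait()
fmt.Println(counter) // always prints 10000; the plain read is safe only because wg.Wait() orders it after every Add

How do you subtract? Add a negative number:

atomic.AddInt64(&counter, -1)  // atomic -1

// For unsigned types, rely on two's complement:
atomic.AddUint32(&ucount, ^uint32(0))  // equivalent to -1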
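
The fragments above can be assembled into a complete program. The following is a minimal sketch; the helper names countConcurrently and decrementUnsigned are made up for this illustration and are not part of sync/atomic.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countConcurrently (hypothetical helper) starts n goroutines that each
// add 1 to a shared counter and returns the value once all have finished.
func countConcurrently(n int) int64 {
	var counter int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&counter, 1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&counter)
}

// decrementUnsigned (hypothetical helper) subtracts 1 from an unsigned
// value by adding ^uint32(0), the two's-complement encoding of -1.
func decrementUnsigned(start uint32) uint32 {
	u := start
	return atomic.AddUint32(&u, ^uint32(0))
}

func main() {
	fmt.Println(countConcurrently(10000)) // 10000
	fmt.Println(decrementUnsigned(5))     // 4
}
```

Replacing atomic.AddInt64 with a plain counter++ in this sketch would make the result nondeterministic, and go run -race would report the data race.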

Load / Store: atomic reads and writes

var flag int32

// goroutine A: atomic write
atomic.StoreInt32(&flag, 1)

// goroutine B: atomic read
if atomic.LoadInt32(&flag) == 1 {
    fmt.Println("flag is set")
}
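
A common use of this Store/Load pair is publishing data through a flag. The sketch below assumes a hypothetical mailbox type: the writer fills in a plain field and then stores the flag, and the reader only touches the field after it has loaded the flag as set. Under the Go memory model, an atomic Load that observes the Store is ordered after everything the writer did before that Store.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// mailbox is a hypothetical one-shot handoff: payload is a plain field,
// ready is only ever accessed through sync/atomic.
type mailbox struct {
	payload string
	ready   int32
}

// publish writes the payload first, then raises the flag.
func (m *mailbox) publish(msg string) {
	m.payload = msg
	atomic.StoreInt32(&m.ready, 1)
}

// wait yields until the flag is observed, then reads the payload.
func (m *mailbox) wait() string {
	for atomic.LoadInt32(&m.ready) == 0 {
		runtime.Gosched()
	}
	return m.payload
}

func main() {
	var m mailbox
	go m.publish("hello")
	fmt.Println(m.wait()) // hello
}
```

The polling loop is only for illustration; in real code a channel or sync.WaitGroup is usually a better way to wait.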

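Why are Load and Store needed? Can't you just read and write the variable directly?

A plain read or write of an int64 is not atomic on a 32-bit system:

  int64 = high 32 bits + low 32 bits

  goroutine A writes:              goroutine B reads:
  ┌──────────┬──────────┐
  │ write hi │          │          ┌──────────┬──────────┐
  │          │ write lo │    ←──   │ read hi  │ read lo  │
  └──────────┴──────────┘          └──────────┴──────────┘
                                    may see: new high half + old low half 💀

On a 64-bit system, reading or writing an aligned int64 is usually a single machine instruction, but Go does not guarantee this for plain accesses: an unsynchronized concurrent read and write is still a data race, and the compiler is free to reorder or cache it. Using atomic is the portable, correct approach.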

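CompareAndSwap (CAS): atomic compare-and-swap

CAS is the most powerful atomic operation; many lock-free data structures are built on it:

// CAS: if the current value == old, set it to new; reports whether the swap happened
swapped := atomic.CompareAndSwapInt64(&value, old, new)

Typical usage, a lock-free retry loop:

// Atomically double value
for {
    old := atomic.LoadInt64(&value)
    updated := old * 2 // named "updated" to avoid shadowing the builtin new
    if atomic.CompareAndSwapInt64(&value, old, updated) {
        break // CAS succeeded, exit
    }
    // CAS failed (another goroutine changed value), retry
}

CAS flow:

  ┌─ read the current value old
  │
  ├─ compute the new value new = f(old)
  │
  └─ CAS(addr, old, new)
      │
      ├─ *addr == old ──→ *addr = new, return true ✅
      │
      └─ *addr != old ──→ return false ❌ (someone else changed it, retry)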
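
The same load, compute, CAS pattern works for any update that depends on the current value. As one more sketch, here is an atomic "store if larger"; storeMax and concurrentMax are hypothetical helpers written for this post, not functions from sync/atomic.

```go
package main

import (
	"fmt"
	"math"
	"sync"
	"sync/atomic"
)

// storeMax raises *addr to v when v is larger, retrying if another
// goroutine changed *addr between the load and the CAS.
func storeMax(addr *int64, v int64) {
	for {
		old := atomic.LoadInt64(addr)
		if v <= old {
			return // the current value is already at least v
		}
		if atomic.CompareAndSwapInt64(addr, old, v) {
			return
		}
	}
}

// concurrentMax feeds every value to storeMax from its own goroutine.
func concurrentMax(values []int64) int64 {
	highest := int64(math.MinInt64)
	var wg sync.WaitGroup
	for _, v := range values {
		wg.Add(1)
		go func(v int64) {
			defer wg.Done()
			storeMax(&highest, v)
		}(v)
	}
	wg.Wait()
	return atomic.LoadInt64(&highest)
}

func main() {
	fmt.Println(concurrentMax([]int64{3, 42, 7, -5, 19})) // 42
}
```

Note the early return when v <= old: a CAS loop does not always have to write, it only has to make its decision against a value it read atomically.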

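3. Atomic types in Go 1.19+

Go 1.19 introduced a set of atomic types that are safer and easier to use. Only atomic.Pointer[T] is generic, and atomic.Value is older than the rest (it has existed since Go 1.4) but is listed here for completeness:

┌──────────────────────────────────────────────────────────────┐
│             Atomic types (Go 1.19+ unless noted)             │
├──────────────────────────────────────────────────────────────┤
│  atomic.Bool          atomic boolean                         │
│  atomic.Int32         atomic int32                           │
│  atomic.Int64         atomic int64                           │
│  atomic.Uint32        atomic uint32                          │
│  atomic.Uint64        atomic uint64                          │
│  atomic.Uintptr       atomic uintptr                         │
│  atomic.Pointer[T]    atomic pointer (generic)               │
│  atomic.Value         any value, as interface{} (Go 1.4+)    │
└──────────────────────────────────────────────────────────────┘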

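Comparing the old and new styles:

// Old style: functions that take a pointer
var counter int64
atomic.AddInt64(&counter, 1)
v := atomic.LoadInt64(&counter)

// New style (Go 1.19+): methods, safer
var counter atomic.Int64
counter.Add(1)
v := counter.Load()

Advantages of the new types:

┌────────────────────────────────────────────────────────────────┐
│  Problems with the older function-style API                    │
├────────────────────────────────────────────────────────────────┤
│  • Every call needs a pointer to a plain variable              │
│  • Nothing stops that variable being read or written directly  │
│  • int64/uint64 need manual 8-byte alignment on 32-bit CPUs    │
├────────────────────────────────────────────────────────────────┤
│  Improvements in the newer method-style API                    │
├────────────────────────────────────────────────────────────────┤
│  • Methods on the value itself; no pointer to get wrong        │
│  • The type itself marks the variable as atomic                │
│  • Plain reads/writes do not compile; go vet flags copies      │
└────────────────────────────────────────────────────────────────┘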
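
To show the method-style API on more than a counter, here is a small sketch that combines atomic.Bool and atomic.Int64. The worker type and its methods are invented for this example; only the Load, Add, and CompareAndSwap methods come from sync/atomic (Go 1.19 or later).

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// worker is a hypothetical type used only for this illustration.
type worker struct {
	closed  atomic.Bool
	handled atomic.Int64
}

// Handle counts one unit of work unless the worker has been closed.
func (w *worker) Handle() bool {
	if w.closed.Load() {
		return false
	}
	w.handled.Add(1)
	return true
}

// Close reports true only for the call that actually flipped the flag,
// so cleanup guarded by it runs at most once.
func (w *worker) Close() bool {
	return w.closed.CompareAndSwap(false, true)
}

func main() {
	var w worker
	fmt.Println(w.Handle())       // true
	fmt.Println(w.Close())        // true
	fmt.Println(w.Close())        // false
	fmt.Println(w.Handle())       // false
	fmt.Println(w.handled.Load()) // 1
}
```

The zero value of each atomic type is ready to use, which is why var w worker needs no constructor.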

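atomic.Value: storing a value of any type

atomic.Value can atomically store and load a value of any type, and is commonly used for hot-reloading configuration:

var config atomic.Value

// Initialize
config.Store(map[string]string{
    "host": "localhost",
    "port": "8080",
})

// goroutine A: atomically replace the configuration
config.Store(map[string]string{
    "host": "prod.example.com",
    "port": "443",
})

// goroutine B: atomically read the configuration
cfg := config.Load().(map[string]string)
fmt.Println(cfg["host"])

Note: every value stored in an atomic.Value must have the same concrete type. Whatever type the first Store uses, all later Stores must use too, otherwise Store panics; Store(nil) panics as well. Treat a stored map as read-only and publish changes by storing a fresh map, because mutating it in place would race with readers.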

atomic.Pointer[T]: a type-safe atomic pointer

type Config struct {
    Host string
    Port int
}

var configPtr atomic.Pointer[Config]

// Atomic update
configPtr.Store(&Config{Host: "localhost", Port: 8080})

// Atomic read (Load returns nil if nothing has been stored yet)
cfg := configPtr.Load()
fmt.Println(cfg.Host) // no type assertion needed

4. In practice: a CAS-based lock-free stack

A lock-free concurrent stack implemented with CAS:

type node[T any] struct {
    value T
    next  *node[T]
}

type LockFreeStack[T any] struct {
    top atomic.Pointer[node[T]]
}

// Push links a new node in front of the current top and publishes it with CAS.
func (s *LockFreeStack[T]) Push(value T) {
    n := &node[T]{value: value}
    for {
        old := s.top.Load()
        n.next = old
        if s.top.CompareAndSwap(old, n) {
            return // CAS succeeded
        }
        // CAS failed, retry
    }
}

// Pop removes the top node; the bool is false when the stack is empty.
func (s *LockFreeStack[T]) Pop() (T, bool) {
    for {
        old := s.top.Load()
        if old == nil {
            var zero T
            return zero, false // stack is empty
        }
        if s.top.CompareAndSwap(old, old.next) {
            return old.value, true // CAS succeeded
        }
        // CAS failed, retry
    }
}

Push(3):                            Pop():

  top → [2] → [1] → nil              top → [3] → [2] → [1] → nil
                                       │
  new = [3]                            │ CAS(top, [3], [2])
  new.next = top                       │
  CAS(top, [2], [3])                   top → [2] → [1] → nil
                                       return 3
  top → [3] → [2] → [1] → nil
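
To see the stack under concurrent load, the sketch below repeats the type from above and adds a hypothetical driver, pushThenDrain, that pushes 1..n from n goroutines and then pops everything. If no push were lost, the count and the sum must both match.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type node[T any] struct {
	value T
	next  *node[T]
}

type LockFreeStack[T any] struct {
	top atomic.Pointer[node[T]]
}

func (s *LockFreeStack[T]) Push(value T) {
	n := &node[T]{value: value}
	for {
		old := s.top.Load()
		n.next = old
		if s.top.CompareAndSwap(old, n) {
			return
		}
	}
}

func (s *LockFreeStack[T]) Pop() (T, bool) {
	for {
		old := s.top.Load()
		if old == nil {
			var zero T
			return zero, false
		}
		if s.top.CompareAndSwap(old, old.next) {
			return old.value, true
		}
	}
}

// pushThenDrain (hypothetical driver) pushes 1..n concurrently, then pops
// until the stack is empty, returning how many values came out and their sum.
func pushThenDrain(n int) (count, sum int) {
	var s LockFreeStack[int]
	var wg sync.WaitGroup
	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			s.Push(v)
		}(i)
	}
	wg.Wait()
	for {
		v, ok := s.Pop()
		if !ok {
			return count, sum
		}
		count++
		sum += v
	}
}

func main() {
	fmt.Println(pushThenDrain(1000)) // 1000 500500
}
```

The order in which values come out depends on scheduling, so the check uses the count and the sum rather than a specific sequence.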

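5. atomic vs Mutex: which one to choose?

┌────────────────────────┬──────────────────────────────────┬────────────────────────────────────────┐
│ Feature                │ atomic                           │ Mutex                                  │
├────────────────────────┼──────────────────────────────────┼────────────────────────────────────────┤
│ Performance            │ ✅ Very high (hardware-level)    │ Moderate (may involve the scheduler)   │
│ Suitable operations    │ Simple read/write/add/subtract   │ Arbitrarily complex critical sections  │
│ Compound operations    │ ❌ One variable per operation    │ ✅ Multiple steps as one atomic unit   │
│ Code complexity        │ Low (simple cases)               │ Low (general purpose)                  │
│ Lock-free structures   │ ✅ CAS retry loops               │ ❌ Requires a lock                     │
└────────────────────────┴──────────────────────────────────┴────────────────────────────────────────┘

How to choose:

Simple counters and flags               ──→ atomic ✅
Atomically replacing configuration      ──→ atomic.Value / atomic.Pointer ✅
Keeping several variables consistent    ──→ Mutex ✅
Complex critical-section logic          ──→ Mutex ✅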
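
The "several variables" case is where atomics stop helping. In this sketch (the accounts type and runTransfers driver are hypothetical), two balances must always add up to the same total. Making each field atomic would not be enough, since a reader could run between the debit and the credit, so a Mutex covers both updates and the read.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// accounts is a hypothetical pair of balances whose sum must stay constant.
type accounts struct {
	mu   sync.Mutex
	a, b int
}

// transfer moves amount from a to b as one indivisible step.
func (x *accounts) transfer(amount int) {
	x.mu.Lock()
	defer x.mu.Unlock()
	x.a -= amount
	x.b += amount
}

// total reads both balances under the same lock.
func (x *accounts) total() int {
	x.mu.Lock()
	defer x.mu.Unlock()
	return x.a + x.b
}

// runTransfers moves n units from a to b one at a time while n readers
// check the invariant. It returns the final balances and whether every
// reader saw a total of n.
func runTransfers(n int) (a, b int, consistent bool) {
	acc := &accounts{a: n}
	var violated atomic.Bool // a single flag is a good fit for an atomic
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(2)
		go func() {
			defer wg.Done()
			acc.transfer(1)
		}()
		go func() {
			defer wg.Done()
			if acc.total() != n {
				violated.Store(true)
			}
		}()
	}
	wg.Wait()
	return acc.a, acc.b, !violated.Load()
}

func main() {
	fmt.Println(runTransfers(1000)) // 0 1000 true
}
```

The example uses both tools on purpose: the Mutex protects the two-field invariant, and the atomic.Bool handles the single independent flag.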

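6. The three most common pitfalls with atomic

Pitfall 1: alignment of 64-bit atomic operations on 32-bit systems

On 32-bit systems (386, ARM, 32-bit MIPS), the 64-bit atomic functions require their operand to be 8-byte aligned, otherwise they panic:

type Bad struct {
    flag bool        // 1 byte, followed by padding
    counter int64    // ❌ at offset 4 on 32-bit: 4-byte aligned, but not 8-byte aligned
}

type Good struct {
    counter int64    // ✅ first field; the first word of an allocated struct is 64-bit aligned
    flag bool
}

Tip: the Go 1.19+ atomic.Int64 and atomic.Uint64 types take care of alignment internally, so prefer them.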
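
If you want to see where the compiler actually places a field, unsafe.Offsetof reports it. The sketch below uses two hypothetical structs with a leading bool: one with a plain int64, one with atomic.Int64. The plain field's offset depends on the platform, so no expected value is shown for it; the atomic.Int64 field should land on a multiple of 8 everywhere.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"unsafe"
)

// rawCounter mirrors the Bad layout: a small field before a plain int64.
type rawCounter struct {
	flag    bool
	counter int64 // offset is platform-dependent
}

// typedCounter uses atomic.Int64, which carries its own 8-byte alignment.
type typedCounter struct {
	flag    bool
	counter atomic.Int64
}

// offsets returns the byte offset of the counter field in each struct.
func offsets() (raw, typed uintptr) {
	var r rawCounter
	var t typedCounter
	return unsafe.Offsetof(r.counter), unsafe.Offsetof(t.counter)
}

func main() {
	raw, typed := offsets()
	fmt.Println("plain int64 offset:", raw)
	fmt.Println("atomic.Int64 offset:", typed)
	fmt.Println("atomic.Int64 8-byte aligned:", typed%8 == 0)
}
```

Building the same program with GOARCH=386 is a quick way to compare the two layouts on a 32-bit target.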

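Pitfall 2: mixing atomic and non-atomic operations

var counter int64

// goroutine A
atomic.AddInt64(&counter, 1) // atomic operation

// goroutine B
fmt.Println(counter) // ❌ plain read, not atomic!

Once a variable is accessed with atomic operations, every access that can run concurrently with another must also be atomic. Mixing the two is a data race.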

Pitfall 3: CAS spinning without backoff

Under heavy contention, a bare CAS spin loop burns CPU:

// ❌ busy-spins under heavy contention
for !atomic.CompareAndSwapInt64(&v, old, updated) {
    old = atomic.LoadInt64(&v)
    updated = f(old)
}

// ✅ call runtime.Gosched() to yield the CPU
for !atomic.CompareAndSwapInt64(&v, old, updated) {
    runtime.Gosched() // give other goroutines a chance to run
    old = atomic.LoadInt64(&v)
    updated = f(old)
}
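
Yielding on every failed attempt can be more than is needed, since most conflicts resolve on the next try. One possible refinement, sketched here, is to retry a few times before calling runtime.Gosched(). The helper updateWithBackoff and the threshold spinsBeforeYield are assumptions made for this example, and the right threshold depends on the workload.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// spinsBeforeYield is an arbitrary threshold chosen for illustration.
const spinsBeforeYield = 4

// updateWithBackoff applies f to *addr with a CAS loop. After a few
// failed attempts it yields the processor before each further retry.
func updateWithBackoff(addr *int64, f func(int64) int64) int64 {
	for attempt := 0; ; attempt++ {
		old := atomic.LoadInt64(addr)
		updated := f(old)
		if atomic.CompareAndSwapInt64(addr, old, updated) {
			return updated
		}
		if attempt >= spinsBeforeYield {
			runtime.Gosched()
		}
	}
}

// addFromManyGoroutines adds delta to a shared value from n goroutines.
func addFromManyGoroutines(n int, delta int64) int64 {
	var v int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			updateWithBackoff(&v, func(old int64) int64 { return old + delta })
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&v)
}

func main() {
	fmt.Println(addFromManyGoroutines(100, 2)) // 200
}
```

For a plain addition like this one, atomic.AddInt64 is simpler and needs no loop; the CAS loop earns its keep only when f is something Add cannot express.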

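7. Practical advice

Use atomic.Int64 for simple counters: it is usually considerably faster than a Mutex, but benchmark your own workload before relying on a specific ratio
Use atomic.Value or atomic.Pointer[T] for configuration hot reloads: lock-free reads, atomic replacement on write
On Go 1.19+, prefer the atomic types: safer and clearer than the function-style API
Don't overuse lock-free structures: under heavy contention a CAS spin loop can be slower than a Mutex
Guard each variable either entirely with atomics or entirely with a lock: mixing the two is a data race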

# Two commands worth adding to CI
go vet ./...
go test -race ./...

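atomic is the lowest-level tool in Go concurrency: the higher-level Mutex, WaitGroup, and Once are all built on top of it. In performance-sensitive code, using atomic directly gives you hardware-level concurrency performance. It only suits simple atomic operations, though; complex critical-section logic is still better left to a Mutex.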