Atomic writes to nearby single-byte variables

Suppose, on a multiprocessor machine, there are two global variables A and B, each one byte in size, located near each other in memory, and two CPUs executing the following code.

CPU 1:

read A
calculate new value
write A

CPU 2:

read B
calculate new value
write B

Looking only at what tends to happen physically, we would expect this to be incorrect without explicit locking: A and B could be in the same cache line, and CPU 1 has to read the entire cache line, change the value of a single byte, and write the line back. If CPU 2 does its own read-modify-write of the cache line in between, the update to B could be lost. (I'm assuming the order in which A and B are updated doesn't matter; I'm only concerned with making sure neither update is lost.)

But x86 guarantees this code is okay. On x86, a write to a single variable is non-atomic only if that variable is misaligned or larger than the CPU word size.
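
The condition in the previous paragraph can be written down as a small check. This is only a sketch: the helper name `fits_aligned_word` is made up for illustration, and "CPU word size" is taken to be `sizeof(void*)`, which is an assumption rather than anything x86 defines. The check is deliberately conservative. A one-byte variable such as A or B always passes it, since every address is a multiple of 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns true when an object of `size` bytes at
// address `p` is naturally aligned (address is a multiple of its size)
// and no larger than a machine word.
// Assumption: the machine word is sizeof(void*) bytes wide.
bool fits_aligned_word(const void* p, std::size_t size) {
    const auto addr = reinterpret_cast<std::uintptr_t>(p);
    return size != 0 && size <= sizeof(void*) && addr % size == 0;
}
```

For example, `fits_aligned_word(&A, 1)` holds for any single byte `A`, while a 4-byte access starting one byte into an 8-byte-aligned buffer does not pass.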

Does an x86 CPU automatically carry out extra locking on the front-side bus so that such individual variable updates work correctly without explicit locking?

2 Answers

#1

This code is correct because of the cache coherency protocol. When CPU 1 modifies the cache line, that line becomes Invalid in CPU 2's cache, so CPU 2 cannot write B and must wait until it has re-acquired the line (see https://en.wikipedia.org/wiki/MESIF_protocol for the state machine).

So no updates are lost, and no bus locks required.

#2

The code is correct because the standard provides the following guarantee (1.7.3):

Two or more threads of execution can access separate memory locations without interfering with each other.
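
A minimal sketch of the scenario from the question, assuming the quoted guarantee is the one from the C++ standard. The names `Pair` and `adjacent_bytes_keep_all_updates` are invented for this illustration. Each thread performs the read, calculate, write sequence on its own byte only, and the two bytes sit next to each other so they will almost certainly share a cache line. The members are `volatile` only to keep the compiler from collapsing the loop into a single store; it is not what makes the code correct.

```cpp
#include <cassert>
#include <thread>

// Two one-byte variables placed side by side, standing in for A and B.
struct Pair {
    volatile unsigned char a = 0;
    volatile unsigned char b = 0;
};

// Hypothetical helper: runs `iterations` read-modify-write cycles on each
// byte from a separate thread and reports whether every update survived.
bool adjacent_bytes_keep_all_updates(unsigned iterations) {
    Pair p;
    auto bump = [iterations](volatile unsigned char& byte) {
        for (unsigned i = 0; i < iterations; ++i) {
            byte = static_cast<unsigned char>(byte + 1);  // read, calculate, write
        }
    };
    std::thread t1([&] { bump(p.a); });  // plays the role of CPU 1
    std::thread t2([&] { bump(p.b); });  // plays the role of CPU 2
    t1.join();
    t2.join();
    // Each counter wraps modulo 256, so compare against the wrapped count.
    const unsigned char expected = static_cast<unsigned char>(iterations);
    return p.a == expected && p.b == expected;
}
```

A call such as `assert(adjacent_bytes_keep_all_updates(1000000));` should pass. Note that this only covers threads touching separate variables; two threads doing read-modify-write on the same byte would still need an atomic operation or a lock.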

It is possible that the variables share the same cache line. That may lead to false sharing: each core invalidates the other cores' copies of the cache line when it writes, and other cores that access the same cache line then have to fetch their data from further up the memory hierarchy.

That will slow things down, but from a correctness point of view, false sharing is irrelevant since separate memory locations can still be accessed without synchronization.
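
If the slowdown matters, one common way to avoid false sharing is to give each variable its own cache line. The sketch below assumes a 64-byte line, which is typical for current x86 processors but is not guaranteed by the language; `kAssumedCacheLine`, `PaddedByte`, and `Counters` are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: cache lines are 64 bytes. Adjust for the target CPU.
constexpr std::size_t kAssumedCacheLine = 64;

// One byte of payload, aligned (and therefore padded) to a full line.
struct PaddedByte {
    alignas(kAssumedCacheLine) unsigned char value = 0;
};

// A and B now start on different cache lines.
struct Counters {
    PaddedByte a;
    PaddedByte b;
};

static_assert(sizeof(PaddedByte) == kAssumedCacheLine,
              "each padded byte should occupy exactly one assumed cache line");
static_assert(sizeof(Counters) == 2 * kAssumedCacheLine,
              "the two bytes should not share an assumed cache line");
```

This trades memory for speed: two bytes of data now take 128 bytes. It changes nothing about correctness, which was already guaranteed without the padding.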
