arm gcc: acquire-release without volatile?

Question

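I'm trying to use a shared index to indicate that data has been written to a shared circular buffer. Is there an efficient way to do this on ARM (arm gcc 9.3.1 with -O3) without using the discouraged volatile keyword?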

The following C functions work fine on x86:

void Test1(int volatile* x) { *x = 5; }
void Test2(int* x) { __atomic_store_n(x, 5, __ATOMIC_RELEASE); }

Both compile to the same, equally efficient code on x86:

0000000000000000 <Test1>:
   0:   c7 07 05 00 00 00       movl   $0x5,(%rdi)
   6:   c3                      retq   
   7:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
   e:   00 00 

0000000000000010 <Test2>:
  10:   c7 07 05 00 00 00       movl   $0x5,(%rdi)
  16:   c3                      retq

On ARM, however, the __atomic builtin generates a data memory barrier, while volatile does not:

00000000 <Test1>:
   0:   2305            movs    r3, #5
   2:   6003            str     r3, [r0, #0]
   4:   4770            bx      lr
   6:   bf00            nop

00000000 <Test2>:
   0:   2305            movs    r3, #5
   2:   f3bf 8f5b       dmb     ish
   6:   6003            str     r3, [r0, #0]
   8:   4770            bx      lr
   a:   bf00            nop

How can I avoid the memory barrier (or a similar inefficiency) while also avoiding volatile?

Answer 1

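A volatile assignment is not a release store. If a plain store with no ordering guarantees (which is what the volatile version gives you) is all you want, use __ATOMIC_RELAXED. (Or better, stdatomic.h or std::atomic_ref with memory_order_relaxed.)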
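As a minimal sketch of the relaxed alternative, the two functions below show the GCC builtin form and the C11 stdatomic.h form. The function names are made up for illustration. On ARM, both are expected to compile to a plain str with no dmb, much like Test1, though that is worth confirming on your own toolchain.

```c
#include <stdatomic.h>

/* GCC/Clang builtin: an atomic store with no ordering, on a plain int.
   (Hypothetical function name, for illustration only.) */
void store_relaxed_builtin(int *x)
{
    __atomic_store_n(x, 5, __ATOMIC_RELAXED);
}

/* C11 equivalent; the object must be declared as atomic_int. */
void store_relaxed_c11(atomic_int *x)
{
    atomic_store_explicit(x, 5, memory_order_relaxed);
}
```

Note that a relaxed store does not order the preceding buffer writes before the index update, so on its own it is not enough to publish data to another thread.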

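dmb ISHST is at least a StoreStore barrier, so in asm you can get release semantics with respect to earlier stores, but not earlier loads. That is not sufficient for std::memory_order_release (aka __ATOMIC_RELEASE), so there is no way to get the compiler to use it for you. (None of the operations or fences in https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html map to it.)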
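If StoreStore ordering really is all you need, one way to get it by hand is inline asm. The following is only a sketch: store_after_prior_stores is a hypothetical helper, it assumes an ARMv7-A or later target for the dmb ishst path, and it steps outside the C memory model, so it orders earlier stores only, not earlier loads.

```c
/* Hypothetical helper: make earlier stores (NOT earlier loads) visible
   before the store to *x. */
static inline void store_after_prior_stores(int *x, int value)
{
#if defined(__ARM_ARCH) && __ARM_ARCH >= 7 && __ARM_ARCH_PROFILE == 'A'
    /* StoreStore barrier; the "memory" clobber also stops the compiler
       from moving memory accesses across this point. */
    __asm__ volatile("dmb ishst" ::: "memory");
#else
    /* Fallback on other targets: a release fence, stronger than needed. */
    __atomic_thread_fence(__ATOMIC_RELEASE);
#endif
    __atomic_store_n(x, value, __ATOMIC_RELAXED);
}
```

This still emits a barrier instruction, so whether it beats dmb ish in practice depends on the CPU and should be measured.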

With -mcpu=cortex-a53 or another ARMv8 CPU, stl is available as a release store, even in AArch32 state. So use that to avoid the expensive dmb ish full barrier for release stores or acquire loads: https://godbolt.org/z/1hzvGMbon

# GCC -O2 -mcpu=cortex-a53      (or -march=armv8-a)
Test2(int*):
        movs    r3, #5
        stl     r3, [r0]       // release store
        bx      lr
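Applied to the shared circular buffer from the question, a release store on the index paired with an acquire load on the reader side might look like the sketch below. It assumes a single producer and a single consumer; ring_t, RING_SIZE, ring_push and ring_pop are hypothetical names, not taken from the question.

```c
#include <stdbool.h>

#define RING_SIZE 8u  /* hypothetical capacity; must be a power of two */

typedef struct {
    int data[RING_SIZE];
    unsigned head;  /* written only by the producer */
    unsigned tail;  /* written only by the consumer */
} ring_t;

/* Producer: write the element, then publish it with a release store. */
bool ring_push(ring_t *r, int value)
{
    unsigned head = __atomic_load_n(&r->head, __ATOMIC_RELAXED);
    unsigned tail = __atomic_load_n(&r->tail, __ATOMIC_ACQUIRE);
    if (head - tail == RING_SIZE)
        return false;  /* full */
    r->data[head % RING_SIZE] = value;
    __atomic_store_n(&r->head, head + 1, __ATOMIC_RELEASE);
    return true;
}

/* Consumer: the acquire load of head pairs with the producer's release
   store, so the element written before it is visible here. */
bool ring_pop(ring_t *r, int *out)
{
    unsigned tail = __atomic_load_n(&r->tail, __ATOMIC_RELAXED);
    unsigned head = __atomic_load_n(&r->head, __ATOMIC_ACQUIRE);
    if (head == tail)
        return false;  /* empty */
    *out = r->data[tail % RING_SIZE];
    __atomic_store_n(&r->tail, tail + 1, __ATOMIC_RELEASE);
    return true;
}
```

When built for an ARMv8 target as described above, the release stores here would be expected to use stl rather than dmb ish followed by str.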
