movaps vs. movups in GCC: how does it decide?

Question

I recently investigated a segfault in a piece of software compiled with GCC 8. The code looked as follows (this is just a sketch):

struct Point
{
  int64_t x, y;
};

struct Edge
{
  // some other fields
  // ...
  Point p; // <- at offset `0xC0`

  Edge(const Point &p) : p(p) {}
};

Edge *create_edge(const Point &p)
{
  void *raw_memory = my_custom_allocator(sizeof(Edge));
  return new (raw_memory) Edge(p);
}

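The key point here is that my_custom_allocator() returns a pointer to unaligned memory. The code crashes because, in order to copy the original point p into the field Edge::p of the new object, the compiler used a movdqu/movaps pair in the [inlined] constructor code.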

movdqu 0x0(%rbp), %xmm1    ; read the original object at `rbp`
...
movaps %xmm1, 0xc0(%rbx)   ; store it into the new `Edge` object at `rbx` - crash!

At first, everything seems clear here: the memory is not properly aligned, so movaps crashes. My fault.

But is it?

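Trying to reproduce the problem on Godbolt, I observed that GCC 8 actually attempts to handle this fairly smartly. When it is sure that the memory is properly aligned, it uses movaps, just like in my code. This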

#include <new>
#include <cstdlib>

struct P { unsigned long long x, y; };

unsigned char buffer[sizeof(P) * 100];

void *alloc()
{
  return buffer;
}

void foo(const P& s)
{
  void *raw = alloc();
  new (raw) P(s);
}
results in
foo(P const&):
    movdqu  xmm0, XMMWORD PTR [rsi]
    movaps  XMMWORD PTR buffer[rip], xmm0
    ret
https://godbolt.org/z/a3uSid
But when it is not sure, it uses movups. For example, if I "hide" the definition of the allocator in the above example, it opts for movups in the same code:

foo(P const&):
    push    rbx
    mov     rbx, rdi
    call    alloc()
    movdqu  xmm0, XMMWORD PTR [rbx]
    movups  XMMWORD PTR [rax], xmm0
    pop     rbx
    ret

https://godbolt.org/z/cNKe5A

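So, if it is supposed to behave that way, why does it use movaps in the software I mentioned at the beginning of this post? In my case the implementation of my_custom_allocator() is not visible to the compiler at the point of the call, which is why I would expect GCC to opt for movups.

What other factors might be at play here? Is this a bug in GCC? How can I force GCC to use movups, preferably everywhere?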


Answer 1

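Since the Edge struct has an alignment requirement determined by the compiler, the compiler is free to assume that every object of that type is properly aligned. If your custom allocator does not return a pointer to properly aligned memory, then using an object at that address results in undefined behavior.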
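
To make this concrete, here is a minimal sketch of one way to keep the allocation within the rules: check the pointer against alignof(Edge) and obtain the memory from an alignment-aware allocation function. Several things here are assumptions and not part of the original code: the alignas(16) on Edge and the other_fields array stand in for the elided "other fields" (Point on its own would typically require only 8-byte alignment), and is_aligned(), aligned_allocator() and destroy_edge() are hypothetical helpers. std::aligned_alloc requires C++17 and is not available on every platform.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

struct Point
{
  std::int64_t x, y;
};

// Stand-in for the Edge struct from the question. The alignas(16) and the
// other_fields array are assumptions that model the elided "other fields".
struct alignas(16) Edge
{
  unsigned char other_fields[0xC0];
  Point p; // at offset 0xC0, as in the question

  explicit Edge(const Point &p) : p(p) {}
};

// Hypothetical helper: true if ptr is a multiple of alignment (a power of two).
bool is_aligned(const void *ptr, std::size_t alignment)
{
  return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

// Hypothetical replacement for my_custom_allocator() that honors alignment.
void *aligned_allocator(std::size_t size, std::size_t alignment)
{
  // std::aligned_alloc expects size to be a multiple of alignment.
  std::size_t rounded = (size + alignment - 1) / alignment * alignment;
  return std::aligned_alloc(alignment, rounded);
}

Edge *create_edge(const Point &p)
{
  void *raw_memory = aligned_allocator(sizeof(Edge), alignof(Edge));
  // Refuse to construct the object in missing or misaligned storage.
  if (raw_memory == nullptr || !is_aligned(raw_memory, alignof(Edge)))
    return nullptr;
  return new (raw_memory) Edge(p);
}

void destroy_edge(Edge *e)
{
  if (e == nullptr)
    return;
  e->~Edge();
  std::free(e); // memory from std::aligned_alloc is released with std::free
}
```

With storage aligned to alignof(Edge), the compiler's assumption holds and an aligned store such as movaps into Edge::p is valid. A cheaper alternative during debugging is to keep the existing allocator and only add the is_aligned() check, which turns the silent undefined behavior into a detectable failure.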
