C++ parameter-passing optimization for a small, simple class

Question

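I have a simple struct A that is a type-safe wrapper around std::int8_t. It has been reduced here to a minimal reproducible example.

For performance reasons, when compiling with -O3 the struct should get the same optimizations as std::int8_t. That does not happen when the struct is passed as an argument, as in this example: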

#include <cstdint>

using in = std::int8_t;

template<typename T>
struct A {
    T value;
};

template <typename T>
inline constexpr auto operator +( A <T> lhs, A <T> rhs)
noexcept -> decltype(lhs.value + rhs.value)
{ return lhs.value + rhs.value; }

template <typename T>
inline constexpr auto operator -( A <T> lhs, A <T> rhs)
noexcept -> decltype(lhs.value - rhs.value)
{ return lhs.value - rhs.value; }

template <typename T>
inline constexpr auto operator *( A <T> lhs, A <T> rhs)
noexcept -> decltype(lhs.value * rhs.value)
{ return lhs.value * rhs.value; }

auto h(A<in> a, A<in> b){
    return (a*b)*(a-b);
}

auto g(in a, in b){
    return (a*b)*(a-b);
}

Compared with g(std::int8_t, std::int8_t), the function h(A<std::int8_t>, A<std::int8_t>) carries the overhead of two extra movsx instructions:

h(A<signed char>, A<signed char>):                            
        movsx   eax, dil  <-- diff
        movsx   ecx, sil  <-- diff
        mov     edx, ecx
        imul    edx, eax
        sub     eax, ecx
        imul    eax, edx
        ret
g(signed char, signed char):                                  
        mov     eax, esi
        imul    eax, edi
        sub     edi, esi
        imul    eax, edi
        ret

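Is the problem related to some size or layout constraint that standard C++ places on structs? Please explain the theory behind it.
If one exists, please provide a portable or compiler-specific way to optimize A.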

Note: h == g for every std::int*_t other than std::int8_t and std::int16_t (i.e. for std::int32_t and std::int64_t).

Answer 1

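Clang uses an undocumented extension to the x86-64 System V calling convention in which narrow integer arguments are extended to 32 bits by the caller. Apparently this applies only to primitive types, not to struct wrappers. GCC makes its own function calls compatible with this extension, but does not rely on it for incoming arguments: it will use movsx in both functions. See "Is a sign or zero extension required when adding a 32bit offset to a pointer for the x86-64 ABI?"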

https://godbolt.org/z/ssYqrhr1n shows that AArch64 clang has to use sxtb in both versions, so this calling-convention extension appears to be target-specific.

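Your functions return int (because you used an auto return type, and narrow integer types are implicitly promoted to int). If the inputs are not already guaranteed to be sign-extended to int width, the int8_t inputs have to be sign-extended in order to get the correct int result from operations on the promoted values.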
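To make the promotion argument concrete, here is a minimal sketch. The helper names expr and sign_extend_low_byte are hypothetical, introduced only for illustration, and the code assumes a mainstream target where int is 32 bits. It checks that int8_t arithmetic is typed as int, and that treating an un-extended byte as if it were already an int gives a different int result.

```cpp
#include <cstdint>
#include <type_traits>

// Arithmetic on two int8_t operands is performed in int (integer promotion),
// so an `auto` return type deduces int rather than int8_t.
static_assert(std::is_same<decltype(std::int8_t{} * std::int8_t{}), int>::value,
              "int8_t * int8_t is computed as int");

// Assumption: int is 32 bits, so int32_t operands are not promoted further.
static_assert(std::is_same<decltype(std::int32_t{} * std::int32_t{}), std::int32_t>::value,
              "int32_t operands are not promoted when int is 32 bits");

// Hypothetical helper: the expression from g() and h(), on already-promoted values.
constexpr int expr(int a, int b) { return (a * b) * (a - b); }

// Hypothetical helper modelling movsx: keep only the low 8 bits of a 32-bit
// register and reinterpret them as a signed byte. The uint8_t -> int8_t
// conversion is modular on mainstream compilers (and guaranteed since C++20).
constexpr int sign_extend_low_byte(std::uint32_t reg) {
    return static_cast<std::int8_t>(static_cast<std::uint8_t>(reg));
}

// The byte 0xFF means -1 as int8_t. With sign extension the int result is 6;
// reading the register as 255 instead gives a different int result.
static_assert(expr(sign_extend_low_byte(0xFFu), 2) == 6, "sign-extended input");
static_assert(expr(0xFF, 2) != 6, "un-extended input changes the int result");
```

This is why a callee that cannot trust the upper register bits needs the two movsx instructions as long as the return type is int.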

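If you make h() or g() return int8_t, clang is smart enough to optimize away the sign extension, because it knows that the low bits of operations like add / sub / mul do not depend on the high bits of the temporaries; only operations like division and right shift do. (See "Is a sign or zero extension required when adding a 32bit offset to a pointer for the x86-64 ABI?")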

int8_t h(A<in> a, A<in> b){
    return (a*b)*(a-b);
}

Godbolt

// x86-64 clang 17 -O3
h(A<signed char>, A<signed char>):
        mov     eax, esi
        mul     dil
        sub     dil, sil
        mul     dil
        ret

// AArch64 clang
h(A<signed char>, A<signed char>):
        mul     w8, w1, w0
        sub     w9, w0, w1
        mul     w0, w8, w9
        ret
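The property that lets the compiler drop the sign extension here can be checked directly. The following is a minimal sketch with hypothetical helper names (low_byte_of_expr, low_byte_ignores_high_bits, low_byte_of_div), assuming a target where int is 32 bits so that std::uint32_t arithmetic wraps modulo 2^32. It fills the upper 24 bits of both inputs with arbitrary data and compares the low byte of the result.

```cpp
#include <cstdint>

// The expression from h()/g(), evaluated in unsigned 32-bit arithmetic so that
// wraparound is well defined, then truncated to the low 8 bits.
constexpr std::uint8_t low_byte_of_expr(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint8_t>((a * b) * (a - b));
}

// Hypothetical check: put arbitrary "garbage" into bits 8..31 of both inputs
// and confirm that the low byte of the result is unchanged.
inline bool low_byte_ignores_high_bits(std::uint8_t a, std::uint8_t b,
                                       std::uint32_t garbage) {
    const std::uint32_t dirty_a = (garbage << 8) | a;
    const std::uint32_t dirty_b = (~garbage << 8) | b;
    return low_byte_of_expr(a, b) == low_byte_of_expr(dirty_a, dirty_b);
}

// Division behaves differently: the low byte of a quotient can depend on the
// high bits of the dividend, so the inputs would have to be extended first.
constexpr std::uint8_t low_byte_of_div(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint8_t>(a / b);
}
```

For example, low_byte_of_div(6, 3) and low_byte_of_div(0x106, 3) differ, whereas low_byte_ignores_high_bits holds for any inputs because multiplication and subtraction modulo 2^32 are consistent with the same operations modulo 2^8.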

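As @463035818_is_not_an_ai pointed out, you may actually want the template functions to return A<T>, rather than just a raw integer type such as int; that would also avoid the need for sign extension.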
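A minimal sketch of that suggestion follows, assuming the wrapper should narrow each promoted result back to T. That narrowing is a design choice of this sketch, not something from the question's code, and it changes the semantics: results now wrap to 8 bits instead of being returned as a full int. I have not checked the generated assembly for this exact variant.

```cpp
#include <cstdint>

template <typename T>
struct A {
    T value;
};

// Each operator narrows the promoted int result back to T and re-wraps it.
// The int -> int8_t conversion is modular on mainstream compilers
// (and guaranteed to be so since C++20).
template <typename T>
constexpr A<T> operator*(A<T> lhs, A<T> rhs) noexcept {
    return A<T>{static_cast<T>(lhs.value * rhs.value)};
}

template <typename T>
constexpr A<T> operator-(A<T> lhs, A<T> rhs) noexcept {
    return A<T>{static_cast<T>(lhs.value - rhs.value)};
}

// h() now returns A<std::int8_t>, so only the low 8 bits of the result matter.
constexpr A<std::int8_t> h(A<std::int8_t> a, A<std::int8_t> b) noexcept {
    return (a * b) * (a - b);
}
```

Because only the low 8 bits of every intermediate value are kept, the compiler no longer has to care about the upper bits of the incoming registers, for the same reason as in the int8_t-returning version above.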
