About reinterpret_cast<>


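I have recently been reading a lot about reinterpret_cast, because I want to make sure I am using it correctly and not accidentally invoking undefined behavior. I feel that cppreference and this excellent article on strict aliasing have gotten me 95% of the way there, but I would like to clarify my understanding of what is and is not UB.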

Suppose I have a struct:

struct __attribute__((packed)) SimpleStruct {
    uint32_t a = 0;
    uint8_t b = 1;
    int16_t c = 2;
    uint8_t d[5] = {0, 1, 2, 3, 4};
};

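I have used the __attribute__((packed)) directive to ensure that no padding bytes are used, at the expense of performance/optimization. According to the standard, examining the byte representation of an object through a reinterpret_cast to unsigned char * is allowed and is not UB: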

unsigned char *bytes_of_simple_struct = reinterpret_cast<unsigned char *>(&simple_struct);

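Now, here is the part I want to clarify. I believe that modifying the bytes of the struct through this pointer is allowed and is not UB (assuming you respect the size of the object):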

static_assert(sizeof(simple_struct) == 12);
bytes_of_simple_struct[0] = 0x1U;

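Now, I understand that the value of simple_struct.a depends on the endianness of the system. But accessing simple_struct.a after the byte modification is still defined behavior, correct? As long as I have not modified the bytes into an invalid representation of the type they make up, the behavior should still be defined.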
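As a minimal sketch of this point, the following writes one byte of a uint32_t through an unsigned char * and reads the value back. The names set_first_byte, is_little_endian and demo_endianness are hypothetical and not part of the original question, and the sketch assumes a plain little-endian or big-endian platform.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: overwrite the lowest-addressed byte of `value`
// through an unsigned char pointer and return the modified value.
std::uint32_t set_first_byte(std::uint32_t value, unsigned char byte)
{
    unsigned char *bytes = reinterpret_cast<unsigned char *>(&value);
    bytes[0] = byte;  // access through unsigned char is permitted by the aliasing rule
    return value;     // every bit pattern is a valid uint32_t, so this read is fine
}

// Hypothetical helper: true when the lowest-addressed byte is the least significant one.
bool is_little_endian()
{
    const std::uint32_t probe = 1;
    return *reinterpret_cast<const unsigned char *>(&probe) == 1;
}

// Usage sketch: the numeric result depends on byte order, but reading it is well defined.
void demo_endianness()
{
    const std::uint32_t result = set_first_byte(0, 0x01);
    assert(result == (is_little_endian() ? 0x00000001u : 0x01000000u));
}
```

Only the numeric result differs between platforms; the read itself is the same on both.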

Conversely, if my struct had a bool instead:

struct __attribute__((packed)) SimpleStruct {
    bool a_bool = false;
    uint8_t b = 1;
    int16_t c = 2;
    uint8_t d[5] = {0, 1, 2, 3, 4};
};

and I then did something like this:

bytes_of_simple_struct[0] = 0xFFU;
assert(simple_struct.a_bool == false);

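that would invoke UB, because I have now modified the underlying byte of a_bool so that it is no longer a valid representation of type bool. Basically, as long as any byte modification still respects the rules about which bytes can represent each type, the behavior should be defined? For the basic numeric types you can essentially set the bytes to anything (whether that is useful is another matter), because any byte value is a valid uint8_t, any two bytes are a valid uint16_t, and so on...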

Is my understanding correct?
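To make the bool case concrete, here is a minimal sketch that inspects the byte behind a bool without ever reading it as a bool. The names has_valid_bool_byte and demo_bool_byte are hypothetical. The assumption that only 0x00 and 0x01 are valid holds on common ABIs (for example GCC and Clang on x86-64) and is not something the standard itself spells out.

```cpp
#include <cassert>
#include <cstring>

static_assert(sizeof(bool) == 1, "sketch assumes a one-byte bool");

// Hypothetical helper: report whether the byte stored in a bool object is
// 0x00 or 0x01. Only the object representation is copied out; the bool
// itself is never read, so an invalid byte does not trigger UB here.
bool has_valid_bool_byte(const bool &flag)
{
    unsigned char raw;
    std::memcpy(&raw, &flag, sizeof raw);
    return raw == 0x00 || raw == 0x01;
}

// Usage sketch: corrupt the byte, detect it, then repair it before any read.
void demo_bool_byte()
{
    bool flag = false;
    assert(has_valid_bool_byte(flag));

    const unsigned char bad = 0xFF;
    std::memcpy(&flag, &bad, sizeof bad);
    assert(!has_valid_bool_byte(flag));  // inspected as bytes, never read as bool

    const unsigned char good = 0x01;     // assumed representation of true
    std::memcpy(&flag, &good, sizeof good);
    assert(has_valid_bool_byte(flag));
}
```

Checking the byte first means a corrupted bool can be detected and repaired before anything reads it as a bool.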

c++ reinterpret-cast
1 Answer

For anyone who may find this helpful, here is a summary of what I was missing about undefined behavior, in the form of some commented code:

#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <cassert>

struct __attribute__((packed)) SimpleStruct
{
  bool a_bool = false;
  uint8_t b = 1;
  int16_t c = 2;
  uint8_t d[6] = { 0, 1, 2, 3, 4, 5 };
};

int
main ()
{

    SimpleStruct simple_struct{};

    // Ensuring padding has indeed been removed from the struct with __attribute__((packed)) 
    static_assert (sizeof (simple_struct) == 10);

    // Defined behavior: casting to unsigned char * (or std::byte * since C++17) to view the byte representation of an object is allowed
    unsigned char *bytes_of_simple_struct =
        reinterpret_cast<unsigned char *>(&simple_struct);
    for (size_t i = 0; i < sizeof(simple_struct); i++)
    {
        printf("Byte %zu of struct: %02X\n", i, bytes_of_simple_struct[i]);
    }
    
    // Defined behavior, using memcpy() to copy bytes into an object is allowed
    static_assert(sizeof(bool) == 1);
    uint8_t byte_array[sizeof(SimpleStruct)] = {
        0x00U, // Critical this is either 0x00U or 0x01U, the only two valid byte representations for type bool
        0x00U,
        0x00U,
        0x00U,
        0x00U,
        0x00U,
        0x00U,
        0x00U,
        0x00U,
        0x00U,
    };
    static_assert(sizeof(simple_struct) == sizeof(byte_array));
    memcpy(&simple_struct, byte_array, sizeof(simple_struct));
    assert(simple_struct.b == 0);
    
    // This write does not violate strict aliasing: the aliasing rule permits accessing
    // (reading or modifying) any object through a glvalue of type char, unsigned char,
    // or std::byte. The standard's wording for indexing into an object representation is
    // known to be underspecified, but mainstream compilers treat this as well defined.
    bytes_of_simple_struct[1] = 0xFFU;
    assert(simple_struct.b == 0xFFU);
    
    // Another way to assign a single byte to bytes_of_simple_struct[1], this time with memcpy.
    // memcpy does not interpret simple_struct as any type; it simply copies bytes from one
    // memory location to another, which is defined for trivially copyable types such as
    // SimpleStruct.
    unsigned char a_byte = 0xFFU;
    memcpy(bytes_of_simple_struct + 1, &a_byte, sizeof(a_byte));
    assert(simple_struct.b == 0xFFU);
    
    // However, extreme care must be taken to ensure that the byte representation of the
    // type being copied into is still valid post memcpy(). If the byte representation isn't
    // valid, undefined behavior still occurs. For example:
    a_byte = 0xFFU;
    memcpy(bytes_of_simple_struct, &a_byte, sizeof(a_byte));
    
    // On common ABIs a bool is only ever stored as byte 0x0 or 0x1. After copying 0xFF into
    // the storage of a_bool, reading simple_struct.a_bool invokes undefined behavior.
    // For example, these assertions both pass compiled with GCC 13.2!
    assert(simple_struct.a_bool != false);
    assert(simple_struct.a_bool != true);
    
}
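
One way to act on the "extreme care" warning in the code above is to validate the incoming bytes before copying them into the struct. The sketch below is an assumption-laden illustration, not part of the original answer: load_simple_struct and demo_validated_load are hypothetical names, __attribute__((packed)) is a GCC/Clang extension, and 0x01 is assumed to be the stored representation of true.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Same layout as the struct above; packed is a GCC/Clang extension.
struct __attribute__((packed)) SimpleStruct
{
    bool a_bool = false;
    std::uint8_t b = 1;
    std::int16_t c = 2;
    std::uint8_t d[6] = {0, 1, 2, 3, 4, 5};
};
static_assert(sizeof(SimpleStruct) == 10, "expected no padding");
static_assert(sizeof(bool) == 1, "sketch assumes a one-byte bool");

// Hypothetical helper: copy raw bytes into `out` only if the byte that will
// become a_bool is 0x00 or 0x01. Returns false and leaves `out` untouched otherwise.
bool load_simple_struct(SimpleStruct &out,
                        const unsigned char (&raw)[sizeof(SimpleStruct)])
{
    if (raw[0] > 0x01)  // a_bool is the first member, so it lives at offset 0
        return false;
    std::memcpy(&out, raw, sizeof out);
    return true;
}

// Usage sketch: a valid buffer is accepted, an invalid one is rejected.
void demo_validated_load()
{
    SimpleStruct s{};
    const unsigned char good[sizeof(SimpleStruct)] = {0x01, 0x2A};  // remaining bytes are zero
    const unsigned char bad[sizeof(SimpleStruct)] = {0xFF};

    assert(load_simple_struct(s, good));
    assert(s.a_bool == true);
    assert(s.b == 0x2A);

    assert(!load_simple_struct(s, bad));  // rejected before any byte is copied
    assert(s.a_bool == true);             // previous contents are preserved
}
```

Because the check happens on the unsigned char buffer, no bool with an invalid representation is ever created, so no later read of a_bool can invoke undefined behavior through this path.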
