Representation of the long long type in memory

Question (votes: 5, answers: 4)

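I want to extract bytes from an 8-byte type, something like char func(long long number, size_t offset), so that for offset n I get the n-th byte (0 <= n <= 7). While doing this, I realized that I don't know how an 8-byte variable is actually represented in memory, and I hope you can help me figure it out. I started by writing a short Python script that prints numbers whose bytes are each filled with an A (ASCII value 65):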

sumx = 0
for x in range(8):
    sumx += (ord('A')*256**x)
    print('x {} sumx {}'.format(x,sumx))

The output is:

x 0 sumx 65
x 1 sumx 16705
x 2 sumx 4276545
x 3 sumx 1094795585
x 4 sumx 280267669825
x 5 sumx 71748523475265
x 6 sumx 18367622009667905
x 7 sumx 4702111234474983745

It seems to me that each number is a run of As followed by 0s. Next, I wrote a short C++ program to extract the n-th byte:

#include <iostream>
#include <array>

char func0(long long number, size_t offset)
{
  offset <<= 3;
  return (number & (0x00000000000000FF << offset)) >> offset;
}

char func1(long long unsigned number, size_t offset)
{
  char* ptr = (char*)&number;
  return ptr[offset];
}

int main()
{
  std::array<long long,8> arr{65,16705,4276545,1094795585,280267669825,71748523475265,18367622009667905,4702111234474983745};
  for (int i = 0; i < arr.size(); i++)
    for (int j = 0; j < sizeof(long long unsigned); j++)
      std::cout << "char " << j << " in number " << i << " (" << arr[i] << ") func0 " << func0(arr[i], j) << " func1 " << func1(arr[i], j) << std::endl;
  return 0;
}

Here is the program output (note the difference starting from the 5th byte):

~ # g++ -std=c++11 prog.cpp -o prog; ./prog
char 0 in number 0 (65) func0 A func1 A
char 1 in number 0 (65) func0  func1
char 2 in number 0 (65) func0  func1
char 3 in number 0 (65) func0  func1
char 4 in number 0 (65) func0  func1
char 5 in number 0 (65) func0  func1
char 6 in number 0 (65) func0  func1
char 7 in number 0 (65) func0  func1
char 0 in number 1 (16705) func0 A func1 A
char 1 in number 1 (16705) func0 A func1 A
char 2 in number 1 (16705) func0  func1
char 3 in number 1 (16705) func0  func1
char 4 in number 1 (16705) func0  func1
char 5 in number 1 (16705) func0  func1
char 6 in number 1 (16705) func0  func1
char 7 in number 1 (16705) func0  func1
char 0 in number 2 (4276545) func0 A func1 A
char 1 in number 2 (4276545) func0 A func1 A
char 2 in number 2 (4276545) func0 A func1 A
char 3 in number 2 (4276545) func0  func1
char 4 in number 2 (4276545) func0  func1
char 5 in number 2 (4276545) func0  func1
char 6 in number 2 (4276545) func0  func1
char 7 in number 2 (4276545) func0  func1
char 0 in number 3 (1094795585) func0 A func1 A
char 1 in number 3 (1094795585) func0 A func1 A
char 2 in number 3 (1094795585) func0 A func1 A
char 3 in number 3 (1094795585) func0 A func1 A
char 4 in number 3 (1094795585) func0  func1
char 5 in number 3 (1094795585) func0  func1
char 6 in number 3 (1094795585) func0  func1
char 7 in number 3 (1094795585) func0  func1
char 0 in number 4 (280267669825) func0 A func1 A
char 1 in number 4 (280267669825) func0 A func1 A
char 2 in number 4 (280267669825) func0 A func1 A
char 3 in number 4 (280267669825) func0 A func1 A
char 4 in number 4 (280267669825) func0  func1 A
char 5 in number 4 (280267669825) func0  func1
char 6 in number 4 (280267669825) func0  func1
char 7 in number 4 (280267669825) func0  func1
char 0 in number 5 (71748523475265) func0 A func1 A
char 1 in number 5 (71748523475265) func0 A func1 A
char 2 in number 5 (71748523475265) func0 A func1 A
char 3 in number 5 (71748523475265) func0 A func1 A
char 4 in number 5 (71748523475265) func0  func1 A
char 5 in number 5 (71748523475265) func0  func1 A
char 6 in number 5 (71748523475265) func0  func1
char 7 in number 5 (71748523475265) func0  func1
char 0 in number 6 (18367622009667905) func0 A func1 A
char 1 in number 6 (18367622009667905) func0 A func1 A
char 2 in number 6 (18367622009667905) func0 A func1 A
char 3 in number 6 (18367622009667905) func0 A func1 A
char 4 in number 6 (18367622009667905) func0  func1 A
char 5 in number 6 (18367622009667905) func0  func1 A
char 6 in number 6 (18367622009667905) func0  func1 A
char 7 in number 6 (18367622009667905) func0  func1
char 0 in number 7 (4702111234474983745) func0 A func1 A
char 1 in number 7 (4702111234474983745) func0 A func1 A
char 2 in number 7 (4702111234474983745) func0 A func1 A
char 3 in number 7 (4702111234474983745) func0 A func1 A
char 4 in number 7 (4702111234474983745) func0  func1 A
char 5 in number 7 (4702111234474983745) func0  func1 A
char 6 in number 7 (4702111234474983745) func0  func1 A
char 7 in number 7 (4702111234474983745) func0 A func1 A

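This code has two functions. func1 returns the expected values. I thought func0 should return the same values as func1, but it doesn't, and I'm not sure why. Basically, I think of an 8-byte type as an array of 8 bytes, and func1 clearly shows that this is the case. I'm not sure why using a shift to reach the n-th byte doesn't work, and I'm not sure I fully understand how an 8-byte variable is laid out in memory.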

c++ long-integer
4 Answers
5
votes

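The problem is that in the code

 0x00000000000000FF << offset

the left-hand operand 0xFF is just an int (no matter how many leading zeros you write), and shifting it left yields an int as well, so the result is limited to the width of int. Shifting by an amount greater than or equal to that width is undefined behavior, so the code is not portable.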

Using this instead:

 0xFFull << offset

fixes the problem, because the ull suffix tells the compiler to treat the literal as unsigned long long.

Of course, as another answer points out, (number >> (offset * 8)) & 0xFF is simpler and works too.


8
votes

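You are making a very simple operation very complicated. You don't even need to think about endianness, because you don't need to access the memory representation of a long long just to get one of its bytes.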

Getting the n-th byte is just a matter of masking away all the other bytes and converting the value to unsigned char, like this:

unsigned char nth_byte(unsigned long long int value, int n)
{
  //Assert that n is on the range [0, 8)
  value = value >> (8 * n);   //Move the desired byte into the first byte.
  value = value & 0xFF;      //Mask away everything that isn't the first byte.
  return static_cast<unsigned char>(value); //Return the first byte.
}

2
votes

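The problem in func0 is that your hexadecimal literal, although written with 8 bytes' worth of digits, is interpreted as an int because no suffix specifies a wider type. Using 0xffULL (0xff as unsigned long long) instead of 0x00000000000000ff should give you what you want.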

The clue is that it works fine for the first 32 bits and then falls over. I'm at a loss to explain where the seventh A comes from, though.


2
votes

A safe way to inspect the underlying memory representation of a variable is to use memcpy to copy it into a char array (see: C aliasing rules and memcpy):

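#include <cstddef>
#include <cstring>
#include <iostream>

// Returns the byte stored at position offs in memory order.
char get_char(long long num, std::size_t offs)
{
    char array[sizeof(long long)];

    // Copy the object representation into a byte array.
    std::memcpy(array, &num, sizeof(long long));

    return array[offs];
}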

Then, for the following example:

int main()
{
    long long var = 0x7766554433221100;

    for (size_t idx = 0; idx < sizeof(long long); ++idx)
        std::cout << '[' << idx << ']' << '=' << std::hex << static_cast<int>(get_char(var, idx)) << '\n';
}

On a little-endian system, we get:

[0]=0    
[1]=11    
[2]=22   
[3]=33    
[4]=44    
[5]=55    
[6]=66    
[7]=77

On a big-endian system, we get:

[0]=77    
[1]=66    
[2]=55   
[3]=44    
[4]=33    
[5]=22    
[6]=11    
[7]=0

https://en.wikipedia.org/wiki/Endianness

https://godbolt.org/z/xrPMVw
