Why do od and my C++ code read bytes in a different order than hex editors display them?

Question

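I noticed some odd behavior: od -H and Vim's hex view (open the file and run :%!xxd) display the same data in different byte orders. I wrote some C++ code that dumps the first uint32_t from a file, and its byte order matches od's output rather than what the hex editor shows:

dump.cc: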

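#include <cstdio>
#include <iostream>
#include <stdexcept>
#include <vector>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <string>

// Reads the whole file into a byte buffer; throws on any I/O error.
std::vector<uint8_t> ReadFile(const std::string &filename) {
  FILE *file = fopen(filename.c_str(), "rb");
  if (file == NULL) {
    throw std::runtime_error("Error opening file: " + filename);
  }

  // Seek to the end to learn the file size, then go back to the start.
  fseek(file, 0L, SEEK_END);
  long file_size = ftell(file);  // ftell returns long, and -1 on failure
  if (file_size < 0) {
    fclose(file);
    throw std::runtime_error("Error getting size of file: " + filename);
  }
  rewind(file);

  std::vector<uint8_t> buffer(static_cast<size_t>(file_size));
  size_t bytes_read = fread(buffer.data(), 1, buffer.size(), file);
  fclose(file);
  if (bytes_read != buffer.size()) {
    throw std::runtime_error("Error reading file: " + filename);
  }
  return buffer;
}

int main(int argc, char **argv) {
  if (argc != 2) {
    std::cerr << "usage: dump FILE" << std::endl;
    return EXIT_FAILURE;
  }
  const char *filename = argv[1];
  const std::vector<uint8_t> buf = ReadFile(filename);
  if (buf.size() < sizeof(uint32_t)) {
    std::cerr << "file is shorter than 4 bytes" << std::endl;
    return EXIT_FAILURE;
  }

  // memcpy copies the first four bytes in file order; which integer value
  // they represent depends on the byte order of the host.
  uint32_t first_int;
  memcpy(&first_int, buf.data(), sizeof(uint32_t));
  std::cout << std::hex << first_int << std::endl;

  return EXIT_SUCCESS;
}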

Compile and run:

$ g++ ./dump.cc -o dump
$ ./dump ./dump.cc
636e6923

By comparison, here are the first two lines of od -H output:

$ od -H ./dump.cc | head -n 2
0000000          636e6923        6564756c        73633c20        6f696474
0000020          69230a3e        756c636e        3c206564        74736f69

Vim, on the other hand, shows:

00000000: 2369 6e63 6c75 6465 203c 6373 7464 696f  #include <cstdio
00000010: 3e0a 2369 6e63 6c75 6465 203c 696f 7374  >.#include <iost

I also opened the file in a hex editor application, and it shows the bytes in the same order as Vim:

 0    23 69 6e 63 6c 75 64 65 20 3c 63 73 74 64 69 6f 3e 0a 23 69
20    6e 63 6c 75 64 65 20 3c 69 6f 73 74 72 65 61 6d 3e 0a 23 69

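Why do od and my code show a different byte order? How can I make my code read the data in the same byte order that these hex editors display?

I'm on macOS 14 on Apple Silicon, but I see the same behavior on Ubuntu running under WSL on Windows 11 on x86.

Thanks in advance.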

Answer 1

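Vim and your hex editor work at the byte level: they display the bytes in the order in which they appear in the file.

od interprets the byte sequence. The -H option reads four bytes at a time and interprets them as a 32-bit (four-byte) int. The bytes of an int can be laid out in memory in different orders (much like writing on paper left to right or right to left). There are essentially two layouts:

BIG ENDIAN: bytes are stored in memory from the most significant byte to the least significant.
LITTLE ENDIAN: bytes are stored in memory from the least significant byte to the most significant.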
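
To see which layout a given machine uses, one can copy a known 32-bit value into a byte array and look at the order of the bytes. A minimal C++ sketch follows; the helper names BytesInMemory and HostIsLittleEndian are made up for illustration and are not part of the question's code.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the bytes of `value` in the order in which
// they sit in this machine's memory.
std::array<uint8_t, 4> BytesInMemory(uint32_t value) {
  std::array<uint8_t, 4> bytes{};
  std::memcpy(bytes.data(), &value, sizeof(value));
  return bytes;
}

// Hypothetical helper: true when the least significant byte is stored first,
// that is, when the host is little-endian.
bool HostIsLittleEndian() {
  return BytesInMemory(0x636e6923u)[0] == 0x23;
}
```

On a little-endian host, BytesInMemory(0x636e6923) yields 23 69 6e 63, which is the order in which the file stores those bytes. On a big-endian host it yields 63 6e 69 23.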

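The file begins with the bytes 23 69 6e 63, but because your platform is little-endian, the first byte is the least significant one, so the int is 0x63*256^3 + 0x6e*256^2 + 0x69*256^1 + 0x23*256^0 = 0x636e6923.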
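
If the goal is for the code to produce a value that prints in the same order as the hex editor, one option is to assemble the integer from the individual bytes with shifts instead of using memcpy, so that the result no longer depends on the host's byte order. This is a sketch rather than the only way to do it; LoadLE32 and LoadBE32 are hypothetical helper names.

```cpp
#include <cstdint>

// Hypothetical helper: treats p[0] as the least significant byte
// (little-endian interpretation), regardless of the host's byte order.
uint32_t LoadLE32(const uint8_t *p) {
  return static_cast<uint32_t>(p[0]) |
         (static_cast<uint32_t>(p[1]) << 8) |
         (static_cast<uint32_t>(p[2]) << 16) |
         (static_cast<uint32_t>(p[3]) << 24);
}

// Hypothetical helper: treats p[0] as the most significant byte
// (big-endian interpretation), regardless of the host's byte order.
uint32_t LoadBE32(const uint8_t *p) {
  return (static_cast<uint32_t>(p[0]) << 24) |
         (static_cast<uint32_t>(p[1]) << 16) |
         (static_cast<uint32_t>(p[2]) << 8) |
         static_cast<uint32_t>(p[3]);
}
```

For the bytes 23 69 6e 63, LoadLE32 returns 0x636e6923 (what od -H and the memcpy version print on a little-endian machine), while LoadBE32 returns 0x23696e63, which reads in the same order as the hex editor. In dump.cc, this would mean calling LoadBE32(buf.data()) in place of the memcpy.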

To make od print the file byte by byte, use od -tx1.
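
The same byte-by-byte view can be produced in C++ by formatting each byte separately instead of grouping four bytes into an integer. A rough sketch, where HexBytes is a made-up helper name:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats `size` bytes as two-digit hex values separated
// by spaces, in the order in which they appear in `data`.
std::string HexBytes(const uint8_t *data, size_t size) {
  std::string out;
  char hex[3];  // two hex digits plus the terminating NUL
  for (size_t i = 0; i < size; ++i) {
    std::snprintf(hex, sizeof(hex), "%02x", static_cast<unsigned>(data[i]));
    if (i != 0) {
      out += ' ';
    }
    out += hex;
  }
  return out;
}
```

Because each byte is formatted on its own, the output does not depend on the host's byte order. For the first four bytes of dump.cc it gives 23 69 6e 63, matching Vim and the hex editor.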
