Why doesn't this simple code compile consistently?

Question

The following code compiles on g++, clang, and Visual Studio:

#define HEX(hex_)  0x##hex_
int main()
{
    return HEX(BadC0de);
}

So does this variant, which uses C++14 digit separators:

    return HEX(1'Bad'C0de);

But this does not compile on g++ or clang (it does work on Visual Studio):

#define HEX(hex_)  0x##hex_
int main()
{
    return HEX(A'Bad'C0de);
}

g++ output:

<source>:4:1: warning: multi-character character constant [-Wmultichar]
    4 |     return HEX(A'Bad'C0de);
      | ^  
<source>: In function 'int main()':
<source>:4:17: error: expected ';' before user-defined character literal
    4 |     return HEX(A'Bad'C0de);
      |                 ^~~~~~~~~
<source>:1:25: note: in definition of macro 'HEX'
    1 | #define HEX(hex_)   0x##hex_
      |                         ^~~~
<source>:4:17: error: unable to find character literal operator 'operator""C0de' with 'int' argument
    4 |     return HEX(A'Bad'C0de);
      |                 ^~~~~~~~~
<source>:1:25: note: in definition of macro 'HEX'
    1 | #define HEX(hex_)   0x##hex_
      |                         ^~~~

Update: interestingly, the preprocessor output is

    return 0xA'Bad'C0de;

and that does compile, so evidently the standalone preprocessor behaves differently here from the integrated one.

This also fails on g++/clang, but with different errors:

    return HEX(Bad'C0de);

g++ output:

<source>:4:19: warning: missing terminating ' character
    4 |     return HEX(Bad'C0de);
      |                   ^
<source>:5:2: error: unterminated argument list invoking macro "HEX"
    5 | }
      |  ^
<source>: In function 'int main()':
<source>:4:12: error: 'HEX' was not declared in this scope
    4 |     return HEX(Bad'C0de);
      |            ^~~
<source>:4:15: error: expected ';' at end of input
    4 |     return HEX(Bad'C0de);
      |               ^
      |               ;
<source>:4:15: error: expected '}' at end of input
<source>:3:1: note: to match this '{'
    3 | {
      | ^

Update: in this case, the preprocessor stops before it has finished parsing the HEX() argument.

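I would like to believe this is a g++ bug, but given how badly non-conforming the Visual Studio preprocessor has been historically, that may be wishful thinking. In fact, the last program not only fails on g++, it also triggers an internal compiler error on Visual Studio (at least on godbolt.org)!

msvc output: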

<source>(4): error C2001: newline in constant
<source>(4): fatal error C1057: unexpected end of file in macro expansion
Internal Compiler Error in Z:\opt\compiler-explorer\windows\19.00.24210\bin\amd64\cl.exe. You will be prompted to send an error report to Microsoft later.
INTERNAL COMPILER ERROR in 'Z:\opt\compiler-explorer\windows\19.00.24210\bin\amd64\cl.exe'
    Please choose the Technical Support command on the Visual C++
    Help menu, or open the Technical Support help file for more information

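Naively, I would expect every compiler to pass all of the text through macro replacement before trying to interpret what it means (it is a preprocessor, after all!); only after the ## concatenation would I expect the meaning of the tokens to be examined. (Yes, I know that some basic parsing takes place to match parentheses, brackets, and so on, so that commas inside them don't split arguments, but I would not expect that to extend to any other language construct.)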

Does the standard say anything about these programs? Are they ill-formed in some way, or are they legal and the compilers at fault?

Answer 1

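The problem here comes from the fact that 0xA'Bad'C0de is a single preprocessing token, but A'Bad'C0de is not: it is three preprocessing tokens (A, 'Bad', and C0de), and the token-pasting operator ## is defined to paste only two adjacent tokens. For this case to work, the tokenization stage would have to depend on which macros have been defined and what they might do.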
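The token boundaries described above can be checked at compile time. The following is a minimal sketch, assuming a C++14 compiler; the leading-zero spelling HEX(0A'Bad'C0de) is an assumption of this sketch, not something the question uses. It relies on the fact that an argument starting with a digit lexes as one pp-number, so 0x can be pasted onto it. (g++'s diagnostic in the question suggests it treats 'Bad'C0de as one user-defined character literal rather than two tokens; either way, A'Bad'C0de is more than one token.)

```cpp
// Sketch: 0x##hex_ only works when the argument is ONE preprocessing token.
#define HEX(hex_) 0x##hex_

// BadC0de is a single identifier, so 0x ## BadC0de forms 0xBadC0de.
static_assert(HEX(BadC0de) == 0x0BADC0DE, "identifier argument pastes cleanly");

// 1'Bad'C0de starts with a digit, so it lexes as a single pp-number.
static_assert(HEX(1'Bad'C0de) == 0x1BADC0DE, "pp-number argument pastes cleanly");

// Assumption of this sketch: a leading 0 turns the problematic argument
// into a single pp-number as well, without changing the value.
static_assert(HEX(0A'Bad'C0de) == 0xABADC0DEu, "leading digit keeps it one token");

// Written directly in the source, the literal is one pp-number from the start,
// which is consistent with the preprocessed output compiling.
static_assert(0xA'Bad'C0de == 0xABADC0DEu, "no pasting needed");
```

This compiles as a translation unit on its own (for example with -std=c++14 -c); the static_asserts are the whole check.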

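Fixing this would require a major change to the specification. The preprocessor would need to track directly adjacent preprocessing tokens separately from tokens that are not directly adjacent (those with whitespace or comments between them), and the ## operator would have to be allowed to paste additional directly adjacent tokens when that makes sense.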
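Short of a specification change, one possible workaround is to keep the digits inside a string literal, which is always a single token, and convert it at compile time. This is only a sketch under stated assumptions: parse_hex is a hypothetical helper invented for illustration, not a standard or library function, and it requires C++14 constexpr rules.

```cpp
// Hypothetical helper: converts hex digits (with optional ' separators)
// from a string literal into a value at compile time.
constexpr unsigned long long parse_hex(const char* s) {
    unsigned long long value = 0;
    for (; *s != '\0'; ++s) {
        const char c = *s;
        if (c == '\'') continue;  // skip digit separators
        unsigned digit = 0;
        if (c >= '0' && c <= '9')      digit = c - '0';
        else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        else throw "invalid hex digit";  // rejects bad input during constant evaluation
        value = value * 16 + digit;
    }
    return value;
}

// The string is one token, so a leading letter causes no lexing trouble.
static_assert(parse_hex("A'Bad'C0de") == 0xABADC0DEull, "letter-leading input works");
static_assert(parse_hex("BadC0de") == 0xBADC0DEull, "plain digits work too");
```

The trade-off is that the result is an unsigned long long rather than a literal whose type follows the usual integer-literal rules, and overflow is not checked.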

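This would still run into problems with something like HEX(A'B): how can you tell when the ) should be part of a multi-character character constant token rather than ending the macro argument list?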
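The interaction between character literals and the macro argument list can be seen with a well-formed input. The sketch below assumes a hypothetical stringizing helper named STR; it shows that once a ' opens a character literal, a ) inside it belongs to that literal and does not close the argument list. That matches the "missing terminating ' character" and "unterminated argument list" diagnostics quoted in the question for HEX(Bad'C0de).

```cpp
// Hypothetical helper: turns its argument into a string literal.
#define STR(x) #x

// The ) inside ')' is part of the character literal, so the macro receives
// the whole token ')' and the argument list is closed by the second ).
static_assert(sizeof(STR(')')) == 4, "three characters plus the terminator");
static_assert(STR(')')[1] == ')', "the parenthesis is inside the argument");
```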
