GNU as: puts works but printf does not

Question

This is the code I am currently working with:

# file-name: test.s
# 64-bit GNU as source code.
    .global main

    .section .text
main:
    lea message, %rdi
    push %rdi
    call puts

    lea message, %rdi
    push %rdi
    call printf

    push $0
    call _exit

    .section .data
message: .asciz "Hello, World!"

Compile command: gcc test.s -o test

Revision 1:

    .global main
    .section .text
main:
    lea message, %rdi
    call puts

    lea message, %rdi
    call printf

    mov $0, %rdi
    call _exit

    .section .data
message: .asciz "Hello, World!"

Final revision (works):

    .global main
    .section .text
main:
    lea message, %rdi
    call puts

    mov $0, %rax
    lea message, %rdi
    call printf

    # flush stdout buffer.
    mov $0, %rdi
    call fflush

    # put newline to offset PS1 prompt when the program ends.  
    # - ironically, doing this makes the flush above redundant and can be removed.
    # - The call to  fflush is retained for display and 
    #      to keep the block self contained.  
    mov $'\n', %rdi
    call putchar

    mov $0, %rdi
    call _exit

    .section .data
message: .asciz "Hello, World!"

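I am struggling to understand why the call to puts succeeds but the call to printf results in a segmentation fault.

Can somebody explain this behaviour and how printf is intended to be called?

Thanks in advance.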

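Summary:

printf takes the format string in %rdi and, because it is variadic, expects %al to hold the number of vector registers used for floating-point arguments (0 here).
printf output is not visible until a newline is written to stdout or fflush(0) is called.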

Answer 1

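puts implicitly appends a newline, and stdout is line-buffered (by default on terminals). So the text from printf may just be sitting in the buffer. Your call to _exit(2) does not flush buffers, because it is the exit_group(2) system call, not the exit(3) library function. (See my version of your code below.)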
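The buffering difference can be reproduced from C. The sketch below is only an illustration and not part of the original code: bytes_printed is a hypothetical helper that runs printf in a child process with stdout redirected to a pipe, then counts the bytes that reach the parent. It assumes a POSIX system with default stdio buffering. Note that a pipe makes stdout fully buffered rather than line-buffered, but the effect for text without a trailing newline is the same.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: print "Hello, World!" (no newline) in a child whose
 * stdout is a pipe, end the child with _exit() or exit(), and return the
 * number of bytes the parent receives. Returns -1 if setup fails. */
long bytes_printed(int use_underscore_exit)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    fflush(NULL);                 /* keep the child from inheriting pending output */
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {               /* child */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        printf("Hello, World!");  /* no newline: text stays in the stdio buffer */
        if (use_underscore_exit)
            _exit(0);             /* leaves immediately; the buffer is discarded */
        exit(0);                  /* exit(3) flushes stdio buffers first */
    }

    close(fds[1]);                /* parent: read until the child's end closes */
    long total = 0;
    char buf[64];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

Under those assumptions, bytes_printed(1) is expected to report 0 bytes because the child left through _exit, while bytes_printed(0) is expected to report all 13 bytes because exit flushed the buffer.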

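Your call to printf(3) is also not quite right, because you did not zero %al before calling a var-args function with no FP arguments. (Good catch @RossRidge, I missed that.) xor %eax,%eax is the best way to do that. %al will be non-zero (from the return value of puts()), which is presumably why printf segfaults. I tested on my system, and printf does not seem to mind when the stack is misaligned (which it is, since you pushed twice before calling it, unlike puts).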

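(Update: newer versions of glibc will segfault in printf with a misaligned RSP even when AL=0, because gcc makes more use of SSE to load or store 16 bytes at a time, taking advantage of the alignment the ABI guarantees. See the scanf example and how to avoid it.)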

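Also, you do not need any push instructions in that code. The first argument goes in %rdi. The first 6 integer arguments go in registers, the 7th and later on the stack. You are also neglecting to pop the stack after the functions return, which only works because your function never tries to return after messing up the stack.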
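As a rough C-level illustration of that convention (a sketch, assuming the x86-64 System V ABI; format_many is a hypothetical name), the call below has more than six integer arguments. The comments describe where the ABI places each one; compiling with gcc -S shows what the compiler actually emits, which may differ if it optimizes the call.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: format five integers into buf and return the
 * number of characters snprintf reports. */
int format_many(char *buf, size_t size)
{
    /* Under the x86-64 System V ABI:
     *   buf, size, format string -> %rdi, %rsi, %rdx
     *   1, 2, 3                  -> %ecx, %r8d, %r9d
     *   4, 5                     -> pushed on the stack
     * snprintf is variadic and no floating-point arguments are passed,
     * so the caller also sets %al to 0 before the call. */
    return snprintf(buf, size, "%d %d %d %d %d", 1, 2, 3, 4, 5);
}
```

With a buffer of at least 10 bytes, format_many fills it with "1 2 3 4 5" and returns 9.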

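The ABI does require aligning the stack by 16B, and a push is one way to do that. On recent Intel CPUs with a stack engine it can actually be more efficient than sub $8, %rsp, and it takes fewer bytes. (See the x86-64 SysV ABI, and other links in the x86 tag wiki.)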

Improved code:

.text
.global main
main:
    lea     message(%rip), %rdi     # or  mov $message, %edi  if you don't need the code to be position-independent: default code model has all labels in the low 2G, so you can use shorter 32bit instructions
    push    %rbx              # align the stack for another call
    mov     %rdi, %rbx        # save for later
    call   puts

    xor     %eax,%eax         # %al = 0 = number of FP args for var-args functions
    mov     %rbx, %rdi        # or mov %ebx, %edi  in a non-PIE executable, since the pointer is known to be pointing to static storage which will be in the low 2GiB
    call   printf

    # optionally putchar a '\n', or include it in the string you pass to printf

    #xor    %edi,%edi    # exit with 0 status
    #call  exit          # exit(3) does an fflush and other cleanup

    pop     %rbx         # restore caller's rbx, and restore the stack

    xor     %eax,%eax    # return 0 from main is equivalent to exit(0)
    ret

    .section .rodata     # constants should go in .rodata
message: .asciz "Hello, World!"

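lea message(%rip), %rdi is cheap, and doing it twice would take fewer instructions than the two mov instructions that use %rbx. But since we needed to adjust the stack by 8B to strictly follow the ABI's 16B alignment guarantee, we might as well do it by saving a call-preserved register. mov reg,reg is very cheap and small, so taking advantage of the call-preserved register is natural.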

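Modern distros now build PIE executables by default, so pointers are 64-bit even for static storage. You need RIP-relative LEA, and 64-bit operand size to copy them. See "How to load address of function or label into register" for how this compares with mov $message, %edi in a non-PIE executable. There is never a reason to use lea message, %rdi with a 32-bit absolute addressing mode; only ever use RIP-relative LEA or mov-immediate.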
