Hello, world in assembly language with Linux system calls?

  1. I know that int 0x80 triggers an interrupt on Linux, but I don't understand how this code works. Does it return anything?

  2. What does $ - msg stand for?

global _start

section .data
    msg db "Hello, world!", 0x0a
    len equ $ - msg

section .text
_start:
    mov eax, 4
    mov ebx, 1
    mov ecx, msg
    mov edx, len
    int 0x80 ;What is this?
    mov eax, 1
    mov ebx, 0
    int 0x80 ;and what is this?
linux assembly x86 nasm system-calls
1 Answer

"How does $ work in NASM, exactly?" explains how $ - msg gets NASM to calculate the string length for you as an assemble-time constant, instead of hard-coding it.
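As a rough analogy in C (an illustration only, not part of the NASM answer): sizeof on an array is also evaluated at build time, so the length never has to be typed in by hand. The names msg, MSG_LEN and msg_len below are made up for this sketch.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */

/* Raw bytes with no terminating 0, like NASM's:
 *     msg db "Hello, world!", 0x0a                      */
const char msg[] = {'H', 'e', 'l', 'l', 'o', ',', ' ',
                    'w', 'o', 'r', 'l', 'd', '!', '\n'};

/* Like `len equ $ - msg`: the compiler works this out while building.
 * It is a constant, not a variable stored in memory. */
enum { MSG_LEN = sizeof msg };

/* Checked at compile time: 13 characters plus the newline. */
static_assert(MSG_LEN == 14, "unexpected message length");

/* Hypothetical accessor so other code can read the constant. */
size_t msg_len(void)
{
    return MSG_LEN;
}
```

If the string is edited, MSG_LEN follows automatically, which is the same benefit len equ $ - msg gives in NASM.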


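I originally wrote the rest of this for SO Docs (topic ID: 1164, example ID: 19078), rewriting a basic, less-well-commented example by @runner. This looks like a better place to put it than as part of my answer to another question, where I had previously moved it after the SO Docs experiment ended.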


Making a system call is done by putting arguments into registers, then running int 0x80 (32-bit mode) or syscall (64-bit mode). See "What are the calling conventions for UNIX & Linux system calls on i386 and x86-64" and "The Definitive Guide to Linux System Calls".

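Think of int 0x80 as a way to "call" into the kernel, across the user/kernel privilege boundary. The kernel does its work according to the values that were in registers when int 0x80 executed, then eventually returns. The return value is in EAX.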

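When execution reaches the kernel's entry point, it looks at EAX and dispatches to the right system call based on the call number in EAX. Values from the other registers are passed as function args to the kernel's handler for that system call. (For example, eax = 4 / int 0x80 gets the kernel to call its sys_write kernel function, which implements the POSIX write system call.)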
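To make the dispatch-by-number idea concrete from C, here is a minimal sketch that goes through glibc's generic syscall() function rather than a hand-written int 0x80. It assumes Linux with glibc, and raw_write is a hypothetical helper name. One difference from the raw instruction: the kernel reports errors as a negative errno value in the return register, while syscall() converts that to -1 and sets errno.

```c
#define _GNU_SOURCE        /* asks glibc to declare syscall() */
#include <sys/syscall.h>   /* SYS_write: the call number for this architecture */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: hand the call number plus three arguments to the
 * kernel, the same job the mov / int 0x80 sequence does by hand.
 * Returns the number of bytes written, or -1 with errno set. */
long raw_write(int fd, const void *buf, unsigned long len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

Calling raw_write(1, "Hello, world!\n", 14) from main should return 14 when stdout is writable, which is the value the asm version gets back in EAX. SYS_write expands to 4 when building 32-bit x86 code and to 1 for x86-64, so the C source stays the same while the number changes.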

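See also "What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?". That answer includes a look at the asm in the kernel entry point that is "called" by int 0x80. (It also applies to 32-bit user space, not just to 64-bit user space, where you shouldn't be using int 0x80.)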


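If you don't already know low-level Unix systems programming, you might want to just write functions in asm that take args and return a value (or update arrays via a pointer arg) and call them from C or C++ programs. Then you only have to worry about learning how to handle registers and memory, without also learning the POSIX system-call API and the ABI for using it. That also makes it very easy to compare your code with compiler output for a C implementation. Compilers usually do a pretty good job of making efficient code, but are rarely perfect.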

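libc provides wrapper functions for system calls, so compiler-generated code would call write rather than invoking the system call directly with int 0x80 (or sysenter, if you care about performance). (In x86-64 code, use syscall for the 64-bit ABI.) See also syscalls(2).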

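System calls are documented in section 2 of the manual, like write(2). See the NOTES section for differences between the libc wrapper function and the underlying Linux system call. Note that the wrapper for sys_exit is _exit(2), not the exit(3) ISO C function that flushes stdio buffers and does other cleanup first. There's also an exit_group system call that ends all threads. exit(3) actually uses that, because there's no downside in a single-threaded process.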
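The exit(3) versus _exit(2) difference is easy to observe from C. In the sketch below, bytes_delivered is a hypothetical helper and the pipe setup is only scaffolding for the demonstration: it forks a child that leaves text in a stdio buffer and then terminates one way or the other, and the parent counts what actually came through the pipe.

```c
#define _POSIX_C_SOURCE 200809L   /* fdopen(), fork(), pipe() declarations */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns how many bytes reached the pipe after the
 * child ended with exit(3) (use_libc_exit != 0) or _exit(2) (== 0).
 * Returns -1 if the pipe or the fork could not be set up. */
long bytes_delivered(int use_libc_exit)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    fflush(NULL);                 /* keep pending parent output out of the child */
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {               /* child process */
        close(fds[0]);
        FILE *out = fdopen(fds[1], "w");   /* fully buffered: a pipe is not a terminal */
        if (out == NULL)
            _exit(127);
        fputs("Hello, world!\n", out);     /* stays in the stdio buffer for now */
        if (use_libc_exit)
            exit(0);              /* flushes stdio buffers, then terminates */
        _exit(0);                 /* terminates immediately, buffer is discarded */
    }

    close(fds[1]);                /* parent keeps only the read end */
    char buf[64];
    long total = 0;
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

bytes_delivered(1) should report 14, because exit(3) flushes the buffer before the process ends, while bytes_delivered(0) should report 0, because _exit(2) goes straight to the kernel and the buffered text is lost.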

This code makes 2 system calls:

  • sys_write(1, "Hello, World!\n", sizeof(...));
  • sys_exit(0);

I commented it heavily (to the point where it starts to obscure the actual code without syntax highlighting). This is an attempt to point things out to total beginners, not how you should normally comment your code.

    section .text             ; Executable code goes in the .text section
    global _start             ; The linker looks for this symbol to set the process entry point, so execution starts here
    ;;; A name followed by a colon defines a symbol.  The global _start directive modifies it so it's a global symbol,
    ;;; not just one that we can CALL or JMP to from inside the asm.
    ;;; Note that _start isn't really a "function".  You can't return from it, and the kernel passes argc, argv, and env
    ;;; differently than main() would expect.
    _start:
        ;;; write(1, msg, len);
        ; Start by moving the arguments into registers, where the kernel will look for them
        mov     edx,len       ; 3rd arg goes in edx: buffer length
        mov     ecx,msg       ; 2nd arg goes in ecx: pointer to the buffer
        ; Set output to stdout (goes to your terminal, or wherever you redirect or pipe)
        mov     ebx,1         ; 1st arg goes in ebx: Unix file descriptor. 1 = stdout, which is normally connected to the terminal.

        mov     eax,4         ; system call number (from SYS_write / __NR_write from unistd_32.h).
        int     0x80          ; generate an interrupt, activating the kernel's system-call handling code.
                              ; 64-bit code uses a different instruction, different registers, and different call numbers.
        ;; eax = return value, all other registers unchanged.

        ;;; Second, exit the process.  There's nothing to return to, so we can't use a ret instruction
        ;;; (like we could if this was main() or any function with a caller).
        ;;; If we don't exit, execution continues into whatever bytes are next in the memory page,
        ;;; typically leading to a segmentation fault because the padding 00 00 decodes to add [eax],al.

        ;;; _exit(0);
        xor     ebx,ebx       ; first arg = exit status = 0.  (will be truncated to 8 bits).
                              ; Zeroing registers is a special case on x86, and mov ebx,0 would be less efficient.
        ;; Leaving out the zeroing of ebx would mean we exit(1), i.e. with an error status, since ebx still holds 1 from earlier.
        mov     eax,1         ; put __NR_exit into eax
        int     0x80          ; Execute the Linux function

    section .rodata           ; Section for read-only constants

        ;; msg is a label, and in this context doesn't need to be msg:.  It could be on a separate line.
        ;; db = Data Bytes: assemble some literal bytes into the output file.
    msg db 'Hello, world!',0xa    ; ASCII string constant plus a newline (0xa = 10)

        ;; No terminating zero byte is needed, because we're using write(), which takes a buffer + length
        ;; instead of an implicit-length string.
        ;; To make this a C string that we could pass to puts or strlen, we'd need a terminating 0 byte. (e.g. "...", 0xa, 0)

    len equ $ - msg           ; Define an assemble-time constant (not stored by itself in the output file,
                              ; but will appear as an immediate operand in insns that use it).
                              ; Calculate len = string length: subtract the address of the start
                              ; of the string from the current position ($).
        ;; Equivalently, we could have put a str_end: label after the string and done len equ str_end - msg

Notice that we don't store the string length in data memory anywhere. It's an assemble-time constant, so it's more efficient to have it as an immediate operand than as a load. We could also have pushed the string data onto the stack with a series of push imm32 instructions (4 bytes of string per push), but bloating the code size too much isn't a good thing.

On Linux, you can save this file as Hello.asm and build a 32-bit executable from it with these commands:

    nasm -felf32 Hello.asm                          # assemble as 32-bit code.  Add -Worphan-labels -g -Fdwarf for debug symbols and warnings
    gcc -static -nostdlib -m32 Hello.o -o Hello     # link without CRT startup code or libc, making a static binary

See this answer for more details on building assembly into 32 or 64-bit static or dynamically linked Linux executables, for NASM/YASM syntax or GNU AT&T syntax with GNU as directives. (Key point: make sure to use -m32 or equivalent when building 32-bit code on a 64-bit host, or you will have confusing problems at run time.)

You can trace its execution with strace to see the system calls it makes:

    $ strace ./Hello
    execve("./Hello", ["./Hello"], [/* 72 vars */]) = 0
    [ Process PID=4019 runs in 32 bit mode. ]
    write(1, "Hello, world!\n", 14Hello, world!
    )         = 14
    _exit(0)                                = ?
    +++ exited with 0 +++

Compare this with the trace for a dynamically linked process (like gcc makes from hello.c, or from running strace /bin/ls) to get an idea of just how much happens under the hood for dynamic linking and C library startup.

The trace on stderr and the regular output on stdout both go to the terminal here, so they interfere in the line with the write system call. You can redirect or trace to a file if you care. Notice how this lets us easily see the syscall return values without having to add code to print them, and is actually even easier than using a regular debugger (like gdb) to single-step and look at eax. See the bottom of the x86 tag wiki for gdb asm tips. (The rest of the tag wiki is full of links to good resources.)

The x86-64 version of this program would be extremely similar, passing the same args to the same system calls, just in different registers and with syscall instead of int 0x80. See the bottom of "What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?" for a working example of writing a string and exiting in 64-bit code.


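Related: a write-up on the smallest binary file you can run that just makes an exit() system call. It is about minimizing the binary size, not the source size or even just the number of instructions that actually run.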
