Backtrace doesn't work in GDB but works in LLDB

Question

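I am trying to debug a Node.js core dump as an experiment. My main goal is to simulate a production issue and inspect the V8 stack trace in the core dump. So I deliberately wrote some Node.js code that performs constant string concatenation in a forked Node.js process until it hits the memory limit, gets aborted, and dumps core.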

So far so good. But after installing GDB and giving it the dump file, the file loads, yet when I run bt it shows only the following output.

$ gdb /usr/local/bin/node -c core.f8f32091796c.node.1700129772.28
GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-alpine-linux-musl".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/local/bin/node...
[New LWP 28]
[New LWP 30]
[New LWP 29]
[New LWP 33]
[New LWP 31]
[New LWP 32]
[New LWP 34]
Core was generated by `/usr/local/bin/node template.js'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fa1a8f5a3f2 in setjmp () from /lib/ld-musl-x86_64.so.1
[Current thread is 1 (LWP 28)]
(gdb) bt
#0  0x00007fa1a8f5a3f2 in setjmp () from /lib/ld-musl-x86_64.so.1
#1  0x00007fa1a8f5a54d in raise () from /lib/ld-musl-x86_64.so.1
#2  0x00007fa1a8f5b9a9 in ?? () from /lib/ld-musl-x86_64.so.1
#3  0x00007fa1a8faae98 in ?? () from /lib/ld-musl-x86_64.so.1
#4  0x0000000000000000 in ?? ()
(gdb) info sharedlibrary
From                To                  Syms Read   Shared Object Library
0x00007fa1a8e317d0  0x00007fa1a8ebab61  Yes (*)     /usr/lib/libstdc++.so.6
0x00007fa1a8d5c2f0  0x00007fa1a8d6c551  Yes (*)     /usr/lib/libgcc_s.so.1
0x00007fa1a8f29070  0x00007fa1a8f70761  Yes (*)     /lib/ld-musl-x86_64.so.1
(*): Shared library is missing debugging information.
(gdb) info frame
Stack level 0, frame at 0x7ffce26c2038:
 rip = 0x7fa1a8f5a3f2 in setjmp; saved rip = 0x7fa1a8f5a54d
 called by frame at 0x7ffce26c2050
 Arglist at 0x7ffce26c2030, args:
 Locals at 0x7ffce26c2030, Previous frame's sp is 0x7ffce26c2040
 Saved registers:
  rip at 0x7ffce26c2038

If I load the same dump file with LLDB, it works as expected and shows the following output.

(lldb) target create "/usr/local/bin/node" --core "core.f8f32091796c.node.1700129772.28"
Core file '/dump/core.f8f32091796c.node.1700129772.28' (x86_64) was loaded.

(lldb) bt
* thread #1, name = 'node', stop reason = signal SIGABRT
  * frame #0: 0x00007fa1a8f5a3f2 ld-musl-x86_64.so.1`__setjmp + 118
    frame #1: 0x00007fa1a8f5a54d ld-musl-x86_64.so.1`raise + 64
    frame #2: 0x00007fa1a8f30f25 ld-musl-x86_64.so.1`abort + 14
    frame #3: 0x00005641a6ef5e55 node`node::Abort() + 37
    frame #4: 0x00005641a6e00d27 node`node::OnFatalError(char const*, char const*) + 283
    frame #5: 0x00005641a70ed0e2 node`v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) + 82
    frame #6: 0x00005641a70ed46f node`v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) + 847
    frame #7: 0x00005641a72bf365 node`v8::internal::Heap::FatalProcessOutOfMemory(char const*) + 21
...
...

Both run in the same container. Here is the version information for all the tools used.

$ cat /etc/*release
3.14.3
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.14.3
PRETTY_NAME="Alpine Linux v3.14"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://bugs.alpinelinux.org/"

$ lldb --version
lldb version 11.1.0

$ gdb --version
GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

$ node --version
v16.13.0

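I want to get this working with GDB; I know llnode exists. So I dug into all the related concepts, such as stack unwinding, .eh_frame, debug frames, etc., and even compiled and installed musl-dbg from the Alpine repository, but without success. GDB complains about a CRC mismatch. I don't think that is the root cause, though, because LLDB works without any debug packages installed.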
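The CRC mismatch mentioned above most likely comes from GDB's .gnu_debuglink check, which compares a CRC-32 stored in the stripped binary with the checksum of the separate debug file. As a minimal sketch, assuming that checksum is the standard CRC-32 that Python's zlib.crc32 computes, the debug file's checksum could be calculated as follows. The function name debuglink_crc32 and the path in the usage comment are illustrative and not part of the original post.

```python
import zlib


def debuglink_crc32(path, chunk_size=1 << 20):
    """Return the CRC-32 of a file's full contents, reading it in chunks."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            # zlib.crc32 accepts the running checksum as its second argument.
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


# Usage (hypothetical path to the separate debug file):
# print(hex(debuglink_crc32("/usr/lib/debug/lib/ld-musl-x86_64.so.1.debug")))
```

The result can then be compared with the last four bytes of the .gnu_debuglink section of the stripped library, which readelf -x .gnu_debuglink prints as a hex dump (little-endian on x86_64).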

Update 1

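I upgraded the container to node:20.10-alpine3.18 and installed the latest GDB version available on that Alpine release (13.1), to make sure this is not a bug in GDB 10.2. I reproduced a fresh core dump, but nothing changed: GDB still cannot show the backtrace, while LLDB can.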

Update 2

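Since I had upgraded the Alpine version, I thought I might find a matching musl-dbg package to get rid of the CRC mismatch error, so I gave it one last chance and finally installed the matching package. To my surprise, it now works for those particular frames, although it still cannot show some frames the way LLDB does. Here is part of the working GDB output.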

(gdb) bt
#0  __restore_sigs (set=set@entry=0x7fffdc487220) at ./arch/x86_64/syscall_arch.h:40
#1  0x00007ff282f41e5c in raise (sig=sig@entry=6) at src/signal/raise.c:11
#2  0x00007ff282f14fa8 in abort () at src/exit/abort.c:11
#3  0x000055d41a8c3e86 in node::Abort() ()
#4  0x000055d41a7865cd in node::OOMErrorHandler(char const*, v8::OOMDetails const&) ()
#5  0x000055d41aaf8010 in v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) ()
#6  0x000055d41aaf82ff in v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) ()
#7  0x000055d41ad28917 in v8::internal::Heap::FatalProcessOutOfMemory(char const*) ()
#8  0x000055d41ad41d61 in v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) ()
...
...

It is good to see it working, but the main question remains: why can LLDB show all the frames without debug symbols, while GDB cannot?

node.js linux gdb v8 lldb
1 Answer

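I went down the rabbit hole to find out why it doesn't work. Basically, GDB uses a prologue-based unwinding strategy here but computes the frames incorrectly, probably because musl-libc functions such as setjmp are written in pure assembly. In the end, I wrote a custom unwinder for musl frames, and it worked. Here is the code. I also wrote a blog series explaining the problem and the solution; unfortunately, it is not in English.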

import re
import gdb
from gdb.unwinder import Unwinder


def debug(pc, current_rsp, offset, addr, frame_id, func):
    print('=============debug===========')
    print('{:<20}:{:<8}'.format('function',func))
    print('{:<20}:{:<8}'.format('pc', str(pc)))
    print('{:<20}:{:<8}'.format('current_rsp', str(current_rsp)))
    print('{:<20}:{:<8}'.format('offset', str(offset)))
    print('{:<20}:{:<8}'.format('return address', hex(addr)))
    print('{:<20}:{:<8}'.format('frame_id', str(frame_id)))

# Pointer type used to read 8-byte values from the stack.
u64_ptr = gdb.lookup_type('unsigned long long').pointer()

class FrameID:
    def __init__(self, sp, pc):
        self.sp = sp
        self.pc = pc

    def __str__(self):
        return f'sp: {self.sp}, pc: {self.pc}'

class MuslUnwinder(Unwinder):
    def __init__(self):
        super().__init__("musl_unwinder")

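    def is_musl_frame(self, pc):
        # "info symbol" reports the object file that contains pc;
        # for musl frames that text includes "musl".
        obj = gdb.execute("info symbol 0x%x" % int(pc), from_tty=False, to_string=True)
        return "musl" in obj

    def dereference(self, adr):
        # Read the 8-byte value stored at address adr.
        deref = gdb.parse_and_eval("0x%x" % int(adr)).cast(u64_ptr).dereference()
        return deref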

    def __call__(self, pending_frame):
        pc = pending_frame.read_register("pc")
        if not self.is_musl_frame(pc):
            return None
        asm = gdb.execute("disassemble 0x%x" % pc, False, True)
        lines = asm.splitlines()
        func = None
        args_bytes = 0
        locals_bytes = 0
        rbp_bytes = 0

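        # Scan the function's disassembly (AT&T syntax) for its prologue.
        for line in lines:
            m = re.match(r'Dump of assembler code for function (.*):', line)
            if m:
                func = m.group(1)
            elif re.match(r'.*push[ ]*%', line):
                # Each register push moves rsp down by 8 bytes.
                args_bytes += 8
                if "rbp" in line:
                    rbp_bytes += 8
            elif m := re.match(r'.*sub[ ]*\$0x([A-Fa-f0-9]+),%rsp', line):
                # "sub $N,%rsp" reserves N bytes for locals; stop scanning here.
                locals_bytes = int(m.group(1), 16)
                break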

        # Bytes between the current rsp and the slot holding the return address.
        offset = locals_bytes + args_bytes
        current_rsp = pending_frame.read_register("rsp")
        current_rbp = pending_frame.read_register("rbp")
        # Caller's rsp: just above the return-address slot.
        rsp = current_rsp + offset + 8
        return_addr = self.dereference(current_rsp + offset)
        frame_id = FrameID(rsp, pc)

        unwind_info = pending_frame.create_unwind_info(frame_id)
        unwind_info.add_saved_register("rsp", rsp)
        unwind_info.add_saved_register("rip", return_addr)

        if rbp_bytes > 0:
            saved_rbp = self.dereference(current_rsp+locals_bytes+rbp_bytes)
            unwind_info.add_saved_register("rbp", saved_rbp)
        else:
            unwind_info.add_saved_register("rbp", current_rbp)

        if gdb.parameter("verbose"):
            debug(pc, current_rsp, offset, return_addr, frame_id, func)

        return unwind_info

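# The regexes in MuslUnwinder expect AT&T syntax.
gdb.execute('set disassembly-flavor att')
# Register globally (locus None), replacing any unwinder with the same name.
gdb.unwinder.register_unwinder(None, MuslUnwinder(), replace=True)
# Drop cached frames so the next backtrace uses the new unwinder.
gdb.invalidate_cached_frames()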
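
The prologue scan inside __call__ can also be tried outside GDB. The following is a minimal standalone sketch of the same heuristic, not the answer's exact code: the function name return_address_offset, the sample disassembly, and the stack pointer value are all made up for illustration. It counts 8 bytes per push, adds the size of the "sub $N,%rsp" adjustment, and uses the total as the distance from rsp to the return address.

```python
import re

PUSH_RE = re.compile(r".*push\s*%(\w+)")
SUB_RSP_RE = re.compile(r".*sub\s*\$0x([0-9A-Fa-f]+),%rsp")


def return_address_offset(asm_lines):
    """Return (offset, pushes_rbp) for AT&T-syntax disassembly lines.

    offset is the number of bytes between rsp (once the prologue has run)
    and the stack slot that holds the return address.
    """
    pushed_bytes = 0
    locals_bytes = 0
    pushes_rbp = False
    for line in asm_lines:
        if m := PUSH_RE.match(line):
            pushed_bytes += 8  # every push moves rsp down by 8 bytes
            if m.group(1) == "rbp":
                pushes_rbp = True
        elif m := SUB_RSP_RE.match(line):
            locals_bytes = int(m.group(1), 16)
            break  # treat the stack adjustment as the end of the prologue
    return pushed_bytes + locals_bytes, pushes_rbp


# Made-up disassembly in the layout GDB prints with "set disassembly-flavor att".
sample = [
    "Dump of assembler code for function example:",
    "   0x0000000000001000 <+0>:\tpush   %rbp",
    "   0x0000000000001001 <+1>:\tpush   %rbx",
    "   0x0000000000001002 <+2>:\tsub    $0x18,%rsp",
    "   0x0000000000001006 <+6>:\tmov    %rdi,%rbx",
    "End of assembler dump.",
]

offset, pushes_rbp = return_address_offset(sample)
rsp = 0x7FFC00001000  # hypothetical current stack pointer
print(hex(rsp + offset))      # return-address slot: 0x7ffc00001028
print(hex(rsp + offset + 8))  # caller's rsp: 0x7ffc00001030
```

Like the unwinder above, this sketch scans the whole function and therefore assumes the prologue has already finished executing at the current pc; a frame stopped in the middle of its prologue would get the wrong offset.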