libbpf / blazesym

blazesym is a library for address symbolization and related tasks

License: BSD 3-Clause "New" or "Revised" License

Rust 94.93% C 5.04% Shell 0.03%

blazesym's Introduction

blazesym

blazesym is a library that can be used to symbolize addresses. Address symbolization is a common problem in tracing contexts, for example, where users want to reason about functions by name, but low level components report only the "raw" addresses (e.g., in the form of stacktraces).

In addition to symbolization, blazesym also provides APIs for the reverse operation: looking up addresses from symbol names. That can be useful, for example, for configuring breakpoints or tracepoints.

The library aims to provide a "batteries-included" experience. That is to say, it tries to do the expected thing by default. When offering such convenience comes at the cost of performance, we aim to provide advanced APIs that allow for runtime configuration of the corresponding features.

blazesym supports a variety of formats, such as DWARF, ELF, Breakpad, and Gsym (see below for an up-to-date list).

The library is written in Rust and provides a first class C API. This crate adheres to Cargo's semantic versioning rules. At a minimum, it builds with the most recent Rust stable release minus five minor versions ("N - 5"). E.g., assuming the most recent Rust stable is 1.68, the crate is guaranteed to build with 1.63 and higher.

Status

blazesym is at the core of Meta's internal continuous profiling solution, where it handles billions of symbolization requests per day.

The library is being actively worked on, with a major goal being stabilization of the API surface. Feel free to contribute with discussions, feature suggestions, or code contributions!

As alluded to above, the library provides support for a variety of formats. For symbolization specifically, the following table lays out what features each format supports and whether blazesym can currently use this feature:

Format	Feature	Supported by format?	Supported by blazesym?
Breakpad	symbol size	✔️	✔️
	source code location information	✔️	✔️
	inlined function information	✔️	✔️
ELF	symbol size	✔️	✔️
	source code location information	✖️	✖️
	inlined function information	✖️	✖️
DWARF	symbol size	✔️	✔️
	source code location information	✔️	✔️
	inlined function information	✔️	✔️
Gsym	symbol size	✔️	✔️
	source code location information	✔️	✔️
	inlined function information	✔️	✔️
Ksym	symbol size	✖️	✖️
	source code location information	✖️	✖️
	inlined function information	✖️	✖️

Here is a rough roadmap of currently planned features (in no particular order):

OS Support

The library's primary target operating system is Linux (it should work on all semi-recent kernel versions and distributions).

macOS is not actively supported at this point (though it may work), but we would be happy to incorporate pull requests to fix any potential shortcomings.

Windows is supported for file-based symbolization (i.e., using one of the Breakpad, ELF, or Gsym symbolization sources). Standalone address normalization as well as process or kernel symbolization are not supported.

Build & Use

blazesym requires a standard Rust toolchain and can be built using the Cargo project manager (e.g., cargo build).

Rust

Consumption from a Rust project should happen via Cargo.toml:

[dependencies]
blazesym = "=0.2.0-rc.1"

For a quick set of examples please refer to the examples/ folder. Please refer to the documentation for a comprehensive explanation of individual types and functions.

C

The companion crate blazesym-c provides the means for interfacing with the library from C. Please refer to its README for usage details.

Command-line

The library also comes with a command line interface for quick experimentation and debugging. You can run it directly from the repository, e.g.:

cargo run -p blazecli -- symbolize elf --path /lib64/libc.so.6 00000000000caee1

Please refer to its README as well as the help text for additional information and usage instructions.

Statically linked binaries for various target triples are available on-demand here.

blazesym's Issues

Use `/tmp/perf-[pid].map`

perf uses data from /tmp/perf-[pid].map files to symbolize stack traces -- this is required for JIT-ed languages to update the mappings dynamically. There are tools for such languages for generating these files, e.g. jcmd <java-pid> Compiler.perfmap or perf-map-agent for Java. It would be great if blazesym used these perf map files when symbolizing addresses for a process.
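
As a rough illustration of what consuming such a file could involve, here is a minimal sketch. It assumes the usual perf map layout of one `START SIZE NAME` entry per line, with START and SIZE in hexadecimal without a `0x` prefix. The names `PerfMapEntry`, `parse_perf_map`, and `lookup` are made up for this example and are not part of blazesym.

```rust
/// One entry of a perf map file: a code range and the name covering it.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PerfMapEntry {
    pub start: u64,
    pub size: u64,
    pub name: String,
}

/// Parse a single `START SIZE NAME` line. START and SIZE are hexadecimal
/// without a `0x` prefix (an assumption); NAME may itself contain spaces.
pub fn parse_perf_map_line(line: &str) -> Option<PerfMapEntry> {
    let mut parts = line.trim_end().splitn(3, ' ');
    let start = u64::from_str_radix(parts.next()?, 16).ok()?;
    let size = u64::from_str_radix(parts.next()?, 16).ok()?;
    let name = parts.next()?.to_string();
    Some(PerfMapEntry { start, size, name })
}

/// Parse the contents of a map file, skipping malformed lines, and return
/// the entries sorted by start address.
pub fn parse_perf_map(contents: &str) -> Vec<PerfMapEntry> {
    let mut entries: Vec<_> = contents.lines().filter_map(parse_perf_map_line).collect();
    entries.sort_by_key(|e| e.start);
    entries
}

/// Find the entry whose range `[start, start + size)` contains `addr`.
pub fn lookup(entries: &[PerfMapEntry], addr: u64) -> Option<&PerfMapEntry> {
    // Index of the first entry starting beyond `addr`; the candidate precedes it.
    let idx = entries.partition_point(|e| e.start <= addr);
    let entry = entries.get(idx.checked_sub(1)?)?;
    (addr - entry.start < entry.size).then_some(entry)
}
```

Because JIT compilers can append new entries while the process runs, a real integration would likely have to re-read the file and decide how to treat overlapping ranges; the sketch ignores both concerns.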

Migrate from `log` to `tracing`

We likely want to migrate over to tracing, as it provides spans, which help in providing necessary context to log statements. For our intents and purposes, both are pretty much source compatible and so the switch should be close to trivial (even more so given how few log statements we have to begin with).

Tests are failing in release mode

Here is a summary of running cargo test --release on c78091f:

running 25 tests
test dwarf::debug_info::tests::test_parse_abbrev ... ok
test dwarf::debug_info::tests::test_parse_cu_abbrevs ... ok
test dwarf::tests::test_decode_leb128 ... ok
test dwarf::tests::test_decode_words ... ok
test dwarf::tests::test_debug_info_parse_symbols ... FAILED
test dwarf::tests::test_run_debug_line_stmts_1 ... ok
test dwarf::tests::test_dwarf_resolver ... FAILED
test dwarf::tests::test_parse_aranges_elf ... FAILED
test dwarf::tests::test_run_debug_line_stmts_2 ... ok
test dwarf::tests::test_parse_debug_line_elf ... FAILED
test elf::tests::test_elf64_parser ... ok
test elf_cache::tests::test_cache ... ok
test elf::tests::test_elf_header_sections ... ok
test ksym::tests::ksym_resolver_load_find ... ignored, system-dependent; may fail
test ksym::tests::find_addresses_ksym ... ok
test tests::hello_world_stack ... ok
test elf::tests::test_elf64_find_address ... FAILED
test tests::load_symbolfilecfg_process ... ok
test elf::tests::test_elf64_symtab ... FAILED
test dwarf::debug_info::tests::test_unititer ... ok
test dwarf::tests::test_dwarf_find_addr_regex ... ok
test ksym::tests::find_addresses_ksym_exhaust ... ok
test ksym::tests::ksym_cache ... ok
test tests::load_symbolfilecfg_invalid_kernel ... ok
test tests::load_symbolfilecfg_processkernel ... ok

failures:

---- dwarf::tests::test_debug_info_parse_symbols stdout ----
thread 'dwarf::tests::test_debug_info_parse_symbols' panicked at 'assertion failed: result.is_ok()', src/dwarf.rs:1511:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- dwarf::tests::test_dwarf_resolver stdout ----
thread 'dwarf::tests::test_dwarf_resolver' panicked at 'index out of bounds: the len is 0 but the index is 0', src/dwarf.rs:929:27

---- dwarf::tests::test_parse_aranges_elf stdout ----
Custom { kind: NotFound, error: "Does not found the give section: .debug_aranges" }
thread 'dwarf::tests::test_parse_aranges_elf' panicked at 'assertion failed: r.is_ok()', src/dwarf.rs:1479:9

---- dwarf::tests::test_parse_debug_line_elf stdout ----
Custom { kind: NotFound, error: "Does not found the give section: .debug_line" }
thread 'dwarf::tests::test_parse_debug_line_elf' panicked at 'assertion failed: r.is_ok()', src/dwarf.rs:1355:9

---- elf::tests::test_elf64_find_address stdout ----
thread 'elf::tests::test_elf64_find_address' panicked at 'index out of bounds: the len is 7901 but the index is 7901', src/elf.rs:761:15

---- elf::tests::test_elf64_symtab stdout ----
thread 'elf::tests::test_elf64_symtab' panicked at 'index out of bounds: the len is 7901 but the index is 7901', src/elf.rs:761:15


failures:
    dwarf::tests::test_debug_info_parse_symbols
    dwarf::tests::test_dwarf_resolver
    dwarf::tests::test_parse_aranges_elf
    dwarf::tests::test_parse_debug_line_elf
    elf::tests::test_elf64_find_address
    elf::tests::test_elf64_symtab

test result: FAILED. 18 passed; 6 failed; 1 ignored; 0 measured; 0 filtered out; finished in 0.08s

I sincerely hope we can agree that tests shouldn't fail depending on the build type used, and that we should look into these issues and fix them.

Use after free in C APIs

I added the following test:

    /// Make sure that we can symbolize an address using DWARF information.
    #[test]
    fn symbolize_dwarf() {
        let symbolizer = unsafe { blazesym_new() };
        let () = unsafe { blazesym_free(symbolizer) };

        let test_dwarf = Path::new(&env!("CARGO_MANIFEST_DIR"))
            .join("data")
            .join("test-dwarf.bin");
        let test_dwarf_c = CString::new(test_dwarf.to_str().unwrap()).unwrap();

        let elf_src = ManuallyDrop::new(blazesym_ssc_elf {
            file_name: test_dwarf_c.as_ptr(),
            base_address: 0,
        });
        let srcs = [blazesym_sym_src_cfg {
            src_type: blazesym_src_type::BLAZESYM_SRC_T_ELF,
            params: blazesym_ssc_params { elf: elf_src },
        }];

        let addrs = [0x2000100];
        let result = unsafe {
            blazesym_symbolize(
                symbolizer,
                srcs.as_ptr(),
                srcs.len() as u32,
                addrs.as_ptr(),
                addrs.len(),
            )
        };

        let () = unsafe { blazesym_result_free(result) };
        let () = unsafe { blazesym_free(symbolizer) };
    }

Running it with ASAN enabled we see:

running 1 test
=================================================================
==1288337==ERROR: AddressSanitizer: heap-use-after-free on address 0x602000004010 at pc 0x55923d677ce2 bp 0x7fe3076fdf90 sp 0x7fe3076fdf88
READ of size 8 at 0x602000004010 thread T1
    #0 0x55923d677ce1 in blazesym_symbolize src/c_api.rs:452:31
    #1 0x55923d784a0f in blazesym::c_api::tests::symbolize_dwarf::h03f432b61e089263 src/c_api.rs:981:13
    #2 0x55923d62b3f6 in blazesym::c_api::tests::symbolize_dwarf::_$u7b$$u7b$closure$u7d$$u7d$::h3ae6da5a73188884 src/c_api.rs:961:26
    #3 0x55923d86e01b in core::ops::function::FnOnce::call_once::h541f9c807cb09f7c /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/core/src/ops/function.rs:250:5
    #4 0x55923d8f314e in core::ops::function::FnOnce::call_once::h9bda4eaab5290b1f /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/core/src/ops/function.rs:250:5
    #5 0x55923d8f314e in test::__rust_begin_short_backtrace::h23e123a039e26489 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/test/src/lib.rs:656:18
    #6 0x55923d8c4a0b in test::run_test::_$u7b$$u7b$closure$u7d$$u7d$::he950e05d6f1dbd00 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/test/src/lib.rs:647:30
    #7 0x55923d8c4a0b in core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h95303f0da586eff0 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/core/src/ops/function.rs:250:5
    #8 0x55923d8f21a9 in _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h32c17ce88d7259e0 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/alloc/src/boxed.rs:1988:9
    #9 0x55923d8f21a9 in _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h84f8fe1fd670d150 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/core/src/panic/unwind_safe.rs:271:9
    #10 0x55923d8f21a9 in std::panicking::try::do_call::h5642e33b54428bd9 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/std/src/panicking.rs:483:40
    #11 0x55923d8f21a9 in std::panicking::try::h963b9f810a28d717 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/std/src/panicking.rs:447:19
    #12 0x55923d8f21a9 in std::panic::catch_unwind::h069fcf3237999cbc /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/std/src/panic.rs:140:14
    #13 0x55923d8f21a9 in test::run_test_in_process::hd0cafc116ecbdf2f /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/test/src/lib.rs:679:27
    #14 0x55923d8f21a9 in test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::h57f67fc984194f0e /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/test/src/lib.rs:573:39
    #15 0x55923d8bf100 in test::run_test::run_test_inner::_$u7b$$u7b$closure$u7d$$u7d$::h20c3d9912fbc2df2 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/test/src/lib.rs:600:37
    #16 0x55923d8bf100 in std::sys_common::backtrace::__rust_begin_short_backtrace::h12f8af0e65ead28c /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/std/src/sys_common/backtrace.rs:121:18
    #17 0x55923d8c4a9a in std::thread::Builder::spawn_unchecked_::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h2bb3045f9353ab08 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/std/src/thread/mod.rs:558:17
    #18 0x55923d8c4a9a in _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h8a0ac497bae5b92b /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/core/src/panic/unwind_safe.rs:271:9
    #19 0x55923d8c4a9a in std::panicking::try::do_call::h93401540f615a976 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/std/src/panicking.rs:483:40
    #20 0x55923d8c4a9a in std::panicking::try::h71e8fea4583ef8bd /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/std/src/panicking.rs:447:19
    #21 0x55923d8c4a9a in std::panic::catch_unwind::hee7a8bad17b8acf3 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/std/src/panic.rs:140:14
    #22 0x55923d8c4a9a in std::thread::Builder::spawn_unchecked_::_$u7b$$u7b$closure$u7d$$u7d$::hd210af67f877171f /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/std/src/thread/mod.rs:557:30
    #23 0x55923d8c4a9a in core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hec93153cee8c3bd9 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/core/src/ops/function.rs:250:5
    #24 0x55923dfea912 in _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h3c13af4dc753235c /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/alloc/src/boxed.rs:1988:9
    #25 0x55923dfea912 in _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h0cec7c2c35746e2e /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/alloc/src/boxed.rs:1988:9
    #26 0x55923dfea912 in std::sys::unix::thread::Thread::new::thread_start::heea3e3f3ff9e2de9 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/std/src/sys/unix/thread.rs:108:17
    #27 0x7fe30a88cdcc in start_thread /usr/src/debug/glibc-2.35-22.fc36.x86_64/nptl/pthread_create.c:442:8
    #28 0x7fe30a91262f in __GI___clone3 /usr/src/debug/glibc-2.35-22.fc36.x86_64/misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

0x602000004010 is located 0 bytes inside of 8-byte region [0x602000004010,0x602000004018)
freed by thread T1 here:
    #0 0x55923d5e8242 in __interceptor_free /rustc/llvm/src/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:52:3
    #1 0x55923d890b6a in alloc::alloc::dealloc::h36ce8fc304361093 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/alloc/src/alloc.rs:113:14
    #2 0x55923d890b6a in _$LT$alloc..alloc..Global$u20$as$u20$core..alloc..Allocator$GT$::deallocate::h0df795ebd87c5c41 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/alloc/src/alloc.rs:250:22
    #3 0x55923d6ee94a in alloc::alloc::box_free::h73f4d5af9b053ddd /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/alloc/src/alloc.rs:348:9
    #4 0x55923d87c04e in core::ptr::drop_in_place$LT$alloc..boxed..Box$LT$blazesym..c_api..blazesym$GT$$GT$::h6b79c4145a77090f /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/core/src/ptr/mod.rs:490:1
    #5 0x55923d8969f6 in core::mem::drop::h86e6f3e12804895b /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/core/src/mem/mod.rs:980:24
    #6 0x55923d675ea2 in blazesym_free src/c_api.rs:340:9
    #7 0x55923d62b3f6 in blazesym::c_api::tests::symbolize_dwarf::_$u7b$$u7b$closure$u7d$$u7d$::h3ae6da5a73188884 src/c_api.rs:961:26
    #8 0x55923d8f314e in core::ops::function::FnOnce::call_once::h9bda4eaab5290b1f /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/core/src/ops/function.rs:250:5
    #9 0x55923d8f314e in test::__rust_begin_short_backtrace::h23e123a039e26489 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/test/src/lib.rs:656:18

previously allocated by thread T1 here:
    #0 0x55923d5e836e in malloc /rustc/llvm/src/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:69:3
    #1 0x55923d8902cb in alloc::alloc::alloc::h7c9c48dc58dd3295 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/alloc/src/alloc.rs:95:14
    #2 0x55923d8902cb in alloc::alloc::Global::alloc_impl::h65603d18c52e6bf0 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/alloc/src/alloc.rs:177:73
    #3 0x55923d88f874 in _$LT$alloc..alloc..Global$u20$as$u20$core..alloc..Allocator$GT$::allocate::he00423b073c74452 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/alloc/src/alloc.rs:237:9
    #4 0x55923d88f874 in alloc::alloc::exchange_malloc::h30e81eaf2f21a912 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/alloc/src/alloc.rs:326:18
    #5 0x55923d675cb5 in alloc::boxed::Box$LT$T$GT$::new::h8f21fe67477aa81a /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/alloc/src/boxed.rs:219:9
    #6 0x55923d675cb5 in blazesym_new src/c_api.rs:280:17
    #7 0x55923d783fdd in blazesym::c_api::tests::symbolize_dwarf::h03f432b61e089263 src/c_api.rs:962:35
    #8 0x55923d62b3f6 in blazesym::c_api::tests::symbolize_dwarf::_$u7b$$u7b$closure$u7d$$u7d$::h3ae6da5a73188884 src/c_api.rs:961:26
    #9 0x55923d8f314e in core::ops::function::FnOnce::call_once::h9bda4eaab5290b1f /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/core/src/ops/function.rs:250:5
    #10 0x55923d8f314e in test::__rust_begin_short_backtrace::h23e123a039e26489 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/test/src/lib.rs:656:18

Thread T1 created by T0 here:
    #0 0x55923d5d144c in __interceptor_pthread_create /rustc/llvm/src/llvm-project/compiler-rt/lib/asan/asan_interceptors.cpp:208:3
    #1 0x55923dfea77d in std::sys::unix::thread::Thread::new::h2860800939ea9971 /rustc/d7948c843de94245c794e8c63dd4301a78bb5ba3/library/std/src/sys/unix/thread.rs:87:19

SUMMARY: AddressSanitizer: heap-use-after-free src/c_api.rs:452:31 in blazesym_symbolize
Shadow bytes around the buggy address:
  0x0c047fff87b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff87c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff87d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff87e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff87f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c047fff8800: fa fa[fd]fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8810: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8820: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8830: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8840: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8850: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==1288337==ABORTING
error: test failed, to rerun pass `--lib`

Caused by:
  process didn't exit successfully: `target/x86_64-unknown-linux-gnu/debug/deps/blazesym-e4814ec934f8e72f symbolize_dwarf` (exit status: 1)

Test failures when `/proc/kallsyms` is not present

Seeing the following test failures when /proc/kallsyms is not present:

failures:

---- ksym::tests::ksym_cache stdout ----
thread 'ksym::tests::ksym_cache' panicked at 'assertion failed: resolver.is_ok()', src/ksym.rs:301:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- tests::load_symbolfilecfg_invalid_kernel stdout ----
fail to load the kernel image /dev/null
thread 'tests::load_symbolfilecfg_invalid_kernel' panicked at 'assertion failed: signatures.iter().any(|x| x.contains(\"KernelResolver\"))', src/lib.rs:1034:9

---- tests::load_symbolfilecfg_processkernel stdout ----
fail to load the kernel image /usr/lib/debug/boot/vmlinux-5.10.161-gentoo
thread 'tests::load_symbolfilecfg_processkernel' panicked at 'assertion failed: signatures.iter().any(|x| x.contains(\"KernelResolver\"))', src/lib.rs:1014:9


failures:
    ksym::tests::ksym_cache
    tests::load_symbolfilecfg_invalid_kernel
    tests::load_symbolfilecfg_processkernel

unsoundness in `extract_string` function

pub fn extract_string(raw: &[u8], off: usize) -> Option<&str> {
    let mut end = off;

    if off >= raw.len() {
        return None;
    }
    while raw[end] != 0 {
        end += 1;
    }
    let blk = raw[off..end].as_ptr() as *mut u8;
    let r = unsafe { String::from_raw_parts(blk, end - off, end - off) };
    let ret = Some(unsafe { &*(r.as_str() as *const str) }); // eliminate lifetime
    r.into_bytes().leak();
    ret
}

The function's implementation seems broken. First off, it should document safety invariants upheld in unsafe blocks with SAFETY comments. For String::from_raw_parts, these read:

    /// # Safety
    ///
    /// This is highly unsafe, due to the number of invariants that aren't
    /// checked:
    ///
    /// * The memory at `buf` needs to have been previously allocated by the
    ///   same allocator the standard library uses, with a required alignment of exactly 1.
    /// * `length` needs to be less than or equal to `capacity`.
    /// * `capacity` needs to be the correct value.
    /// * The first `length` bytes at `buf` need to be valid UTF-8.
    ///
    /// Violating these may cause problems like corrupting the allocator's
    /// internal data structures. For example, it is normally **not** safe to
    /// build a `String` from a pointer to a C `char` array containing UTF-8
    /// _unless_ you are certain that array was originally allocated by the
    /// Rust standard library's allocator.
    ///
    /// The ownership of `buf` is effectively transferred to the
    /// `String` which may then deallocate, reallocate or change the
    /// contents of memory pointed to by the pointer at will. Ensure
    /// that nothing else uses the pointer after calling this
    /// function.

Evaluating these:

The memory at buf needs to have been previously allocated by the same allocator the standard library uses, with a required alignment of exactly 1.
- this is with almost 100% certainty violated: the memory has not been allocated with the standard allocator; it's also not communicated to callers in any way. So at best this function is unsound and a ticking time bomb, but most likely plain undefined behavior.
capacity needs to be the correct value
- Questionable; could be the case but unlikely. Again, unsoundness at the very least as it's certainly not ensured/enforced anywhere.
The first length bytes at buf need to be valid UTF-8
- not ensured anywhere. Again, callers may randomly uphold this invariant, but in all likelihood it's violated in all but a few places that got lucky. It's also not communicated to callers in any way.
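
For comparison, a version without any `unsafe` could look like the sketch below. It keeps the signature from the snippet above but, as an assumption about the desired behavior, returns `None` when no NUL terminator follows `off` or when the bytes are not valid UTF-8, instead of reading out of bounds or skipping validation.

```rust
/// Extract the NUL-terminated string starting at `off` in `raw`.
///
/// Returns `None` if `off` is out of bounds, if no NUL terminator follows,
/// or if the bytes before the terminator are not valid UTF-8.
pub fn extract_string(raw: &[u8], off: usize) -> Option<&str> {
    // `get` performs the bounds check that the original did by hand.
    let tail = raw.get(off..)?;
    // Stop at the first NUL byte instead of indexing past the end.
    let len = tail.iter().position(|&b| b == 0)?;
    // The result borrows from `raw`, so no lifetime tricks are needed.
    std::str::from_utf8(&tail[..len]).ok()
}
```

The returned `&str` shares the lifetime of `raw`, so no `String` has to be constructed and then leaked.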

inspect: Provide access to all discovered symbols

As it stands, the inspect API and its Inspector type specifically only support targeted symbol lookup by name. For some use cases it may be necessary to know all available symbols in advance. We should extend the API to support such use cases.

Test `ksym::tests::ksym_resolver_load_find` is failing

Test ksym::tests::ksym_resolver_load_find is failing for me:

---- ksym::tests::ksym_resolver_load_find stdout ----
thread 'ksym::tests::ksym_resolver_load_find' panicked at 'assertion failed: `(left == right)`
  left: `"bpf_prog_6deef7357e7b4530"`,
 right: `"__kstrtab_gen_pool_first_fit_align"`', src/ksym.rs:201:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I am at 7bfc553, but the issue seems to have been there since the very first commit.

Support symbol demangling

We should evaluate whether to add support for demangling symbols from languages such as C++ or Rust as part of the symbolization process. It's not entirely clear whether it makes sense to integrate that into the library, given that it could basically be done entirely as a close-to-trivial post-processing step (with the likes of https://crates.io/crates/symbolic-demangle), but it may be beneficial to do so if we aim to be a completely batteries-included solution.

Support DWARF v5

See subject. Seems to be not implemented currently:

blazesym/src/dwarf/debug_info.rs

Line 1040 in 442f295

todo!(); // BlazeSym supports only v4 so far.

Support/honor mini debug information?

We should consider whether we want/need to support mini debug information. Potentially relevant: iovisor/bcc#4250

Cache demangling step

Context

The demangling step is done on the fly after retrieving the symbols from the cache.
This implies extra CPU costs when requesting demangled symbols.

Proposed solution

Add a demangled symbol entry within the caching structure.
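
One way to picture the proposal is a cache entry that stores the demangled name next to the mangled one and fills it in on first use. The sketch below is only an illustration: `SymCache` and `CachedSym` are hypothetical types, the cache is keyed by address purely for simplicity, and the demangler is passed in as a closure so that the example does not depend on any particular demangling crate.

```rust
use std::collections::HashMap;

/// A cached symbol; `demangled` is filled in on first request.
struct CachedSym {
    mangled: String,
    demangled: Option<String>,
}

/// Hypothetical address-to-symbol cache that memoizes demangled names.
pub struct SymCache<F>
where
    F: Fn(&str) -> String,
{
    syms: HashMap<u64, CachedSym>,
    demangle: F,
}

impl<F> SymCache<F>
where
    F: Fn(&str) -> String,
{
    pub fn new(demangle: F) -> Self {
        Self { syms: HashMap::new(), demangle }
    }

    /// Store a symbol under `addr`; demangling is deferred.
    pub fn insert(&mut self, addr: u64, mangled: &str) {
        let sym = CachedSym { mangled: mangled.to_string(), demangled: None };
        self.syms.insert(addr, sym);
    }

    /// Return the demangled name, running the demangler at most once per entry.
    pub fn demangled(&mut self, addr: u64) -> Option<&str> {
        let sym = self.syms.get_mut(&addr)?;
        if sym.demangled.is_none() {
            sym.demangled = Some((self.demangle)(&sym.mangled));
        }
        sym.demangled.as_deref()
    }
}
```

The trade-off is extra memory per cached symbol in exchange for skipping repeated demangling work.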

Incomplete ELF support

It appears as if the ELF parser does not handle all possible values of e_shstrndx correctly:

https://github.com/libbpf/blazesym/blob/master/src/elf/parser.rs#L192-L207

ELF(5) states:

       e_shstrndx
              This  member  holds the section header table index of the entry associ‐
              ated with the section name string table.  If the file  has  no  section
              name string table, this member holds the value SHN_UNDEF.

              If  the  index  of  section name string table section is larger than or
              equal to SHN_LORESERVE (0xff00), this member holds SHN_XINDEX  (0xffff)
              and  the real index of the section name string table section is held in
              the sh_link member of the initial entry in section header table.   Oth‐
              erwise, the sh_link member of the initial entry in section header table
              contains the value zero.

So we may have to special case SHN_XINDEX.
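
The special case described in the man page excerpt can be sketched as a small helper. The function name `shstrndx` and its shape are assumptions for illustration; the constant values come from the ELF specification, and the parser would have to read section header 0 first in order to supply its `sh_link`.

```rust
/// Special section indices from the ELF specification.
const SHN_UNDEF: u16 = 0;
const SHN_XINDEX: u16 = 0xffff;

/// Resolve the index of the section name string table.
///
/// `e_shstrndx` comes from the ELF header; `first_shdr_link` is the
/// `sh_link` member of section header 0. Returns `None` if the file has
/// no section name string table.
pub fn shstrndx(e_shstrndx: u16, first_shdr_link: u32) -> Option<u32> {
    match e_shstrndx {
        SHN_UNDEF => None,
        // The real index did not fit into 16 bits and lives in `sh_link`.
        SHN_XINDEX => Some(first_shdr_link),
        idx => Some(u32::from(idx)),
    }
}
```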

ELF symbolization of `__libc_init_first` from glibc doesn't work anymore

Until 1b68789, blazesym successfully symbolized an address from __libc_init_first when symbolizing an ELF executable, but starting with that commit it no longer does.

Steps to reproduce:

Compile the following simple C program (I used gcc 11.3.0 or clang 14.0.0 on Ubuntu 22.04.1) and launch it:

int main() {
    while (1) {}
}

Build profile.c from libbpf-bootstrap which at the moment uses blazesym v0.2.0-alpha.2 (i.e. it doesn't include 1b68789). Run profile for a second. The output will contain something like:

COMM: test.out (pid=10139) @ CPU 1
No Kernel Stack
Userspace:
  0 [<0000556769c1c131>] main+0x556769c1b008
  1 [<00007f3953829d90>] __libc_init_first+0x7f3953800090

Update blazesym submodule in libbpf-bootstrap to 1b68789 (or any later commit), rebuild profile.c and run it for a second again. Now the output will contain something like this:

COMM: test.out (pid=11843) @ CPU 2
No Kernel Stack
Userspace:
  0 [<000055e729872131>] main+0x55e729871008
  1 [<00007f4d50e29d90>]

As you can see, the second frame (00007f4d50e29d90), whose counterpart was symbolized to __libc_init_first+0x7f3953800090 before, no longer gets symbolized.

Program aborted by allocation failed when running after 30min.

I followed the documentation to use blazesym's C API to create some libbpf tools. One of them occasionally crashes with an allocation failure; the error message looks like this:

memory allocation of 2656 bytes failed
Aborted

I checked my C code but found no memory leaks, and this message seems to be printed by Rust. So I suspect it comes from this line:

blazesym/src/lib.rs

Line 1188 in 47afb3d

alloc(Layout::from_size_align(buf_size + mem::size_of::<u64>(), 8).unwrap());

According to https://moshg.github.io/rust-std-ja/std/alloc/trait.GlobalAlloc.html#errors, alloc may fail when memory is exhausted, which would then cause the abort. Am I right? Would it be more suitable to check for a null pointer and let the user handle this situation?
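
A null check around an allocation like the one quoted could be sketched as follows. `try_alloc_result` and `free_result` are hypothetical names, not blazesym functions. Note that `std::alloc::alloc` itself reports failure by returning a null pointer; the "memory allocation of N bytes failed" message is what Rust's default allocation error handler prints before aborting, and that handler is what standard collections such as `Vec` and `Box` invoke on failure.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::mem;
use std::ptr::NonNull;

/// Allocate `buf_size` bytes plus a `u64` header with 8-byte alignment,
/// reporting failure to the caller instead of aborting.
pub fn try_alloc_result(buf_size: usize) -> Option<(NonNull<u8>, Layout)> {
    let size = buf_size.checked_add(mem::size_of::<u64>())?;
    let layout = Layout::from_size_align(size, 8).ok()?;
    // SAFETY: `layout` has a non-zero size because it includes the header.
    let ptr = unsafe { alloc(layout) };
    // A null pointer signals allocation failure; turn it into `None`.
    NonNull::new(ptr).map(|ptr| (ptr, layout))
}

/// Release memory obtained from `try_alloc_result`.
///
/// # Safety
/// `ptr` and `layout` must come from the same successful call to
/// `try_alloc_result`, and `ptr` must not be used afterwards.
pub unsafe fn free_result(ptr: NonNull<u8>, layout: Layout) {
    unsafe { dealloc(ptr.as_ptr(), layout) }
}
```

A C-facing function built on this could then return NULL to its caller when `try_alloc_result` yields `None`.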

ELF 32 bit support

We may want to add ELF 32 bit support.

Support inlined function lookup

We should support symbolization of inlined functions, information of which may be contained in DWARF from what I gather.

Support split debug information

We should support split debug information.

Perform fewer steps on "local" system for remote symbolization

We may want to move the virtual-address to normalized-address translation from the local system to the remote. Access to map_files may not always be possible and this is arguably unnecessary work done on the local system (which is generally assumed to be resource constrained).
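
For context, the translation in question can be sketched as below, under the assumption that "normalized address" means an offset into the mapped file. `MapsEntry`, `parse_maps_line`, and `normalize` are illustrative names only; the sketch reads the start, end, and offset fields of a `/proc/<pid>/maps` line and ignores the rest.

```rust
/// The parts of a `/proc/<pid>/maps` entry needed for normalization.
#[derive(Debug, PartialEq, Eq)]
pub struct MapsEntry {
    pub start: u64,
    pub end: u64,
    pub file_offset: u64,
}

/// Parse the address range and file offset from one maps line, e.g.
/// `7f3953828000-7f39539bd000 r-xp 00028000 08:01 1234 /usr/lib/libc.so.6`.
pub fn parse_maps_line(line: &str) -> Option<MapsEntry> {
    let mut fields = line.split_whitespace();
    let (start, end) = fields.next()?.split_once('-')?;
    let _perms = fields.next()?;
    let offset = fields.next()?;
    Some(MapsEntry {
        start: u64::from_str_radix(start, 16).ok()?,
        end: u64::from_str_radix(end, 16).ok()?,
        file_offset: u64::from_str_radix(offset, 16).ok()?,
    })
}

/// Translate a virtual address inside `entry` into an offset in the backing file.
pub fn normalize(entry: &MapsEntry, addr: u64) -> Option<u64> {
    if addr < entry.start || addr >= entry.end {
        return None;
    }
    Some(addr - entry.start + entry.file_offset)
}
```

Shipping the raw address together with the relevant maps entry would let the remote side perform the `normalize` step itself.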

Kernel callstacks not show correct

Hi, I am using version 885b4d5297c1633444317d439b312626920b5486 from libbpf-bootstrap with the C API. When I set the configuration like this:

src.src_type = SRC_T_KERNEL;
src.params.kernel.kallsyms = NULL;
src.params.kernel.kernel_image = NULL;

It shows like this:

fail to load the kernel image /usr/lib/debug/boot/vmlinux-4.18.0-22-generic
  0 [<ffffffff9dabc7a1>]
  1 [<ffffffff9e3f46ac>]
  2 [<ffffffff9e3f8989>]
  3 [<ffffffff9e3f8a73>]
  4 [<ffffffff9dcb6836>]
  5 [<ffffffff9dcb77d6>]
  6 [<ffffffff9dcb8473>]
  7 [<ffffffff9da042aa>]
  8 [<ffffffff9e400088>]

/proc/kallsyms exists, but /usr/lib/debug/boot/vmlinux-4.18.0-22-generic doesn't. I also know that on some machines, such as Xavier, there is no kernel image at all. When I try to run without setting src.params.kernel.kernel_image, it crashes.

It seems that kernel symbol translation failed. Can it run when only a kallsyms path is provided?

Add more context to errors

With df959af we have a custom error type that allows for additional context to be provided by the library (e.g., on top of system errors such as ENOENT). We should make use of this capability to improve the quality of error messages in general. Basically, when there is a reason to believe that a low-level error is not particularly useful without context, add some via ErrorExt::context or ErrorExt::with_context. Here is an example:

blazesym/src/dwarf/units.rs

Lines 89 to 94 in cd6456c

let dw_unit = sections.unit(header).with_context(|| {
    format!(
        "failed to retrieve DWARF unit for unit header @ {}",
        format_offset(header.offset())
    )
})?;

On this path it would probably make sense to add further context higher up the stack so that users know exactly which DWARF file the error occurred on.

Among other things, doing so could help with error reports such as:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error: No such file or directory (os error 2)', benches/symbolize.rs:70:10
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

This task is a good opportunity to become familiar with the code base and it can be done in stages (every improvement helps).
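
To illustrate the kind of file-level context meant here without depending on blazesym's own error type, the following self-contained sketch defines a stand-in. `ContextError`, the `WithContext` trait, and `read_debug_file` are all hypothetical; the library's actual `ErrorExt` API may differ in its details.

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Hypothetical error type pairing a context message with its cause.
#[derive(Debug)]
pub struct ContextError {
    context: String,
    source: io::Error,
}

impl fmt::Display for ContextError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.context, self.source)
    }
}

impl Error for ContextError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Stand-in for an extension trait that attaches lazily built context.
pub trait WithContext<T> {
    fn with_context<C, F>(self, f: F) -> Result<T, ContextError>
    where
        C: Into<String>,
        F: FnOnce() -> C;
}

impl<T> WithContext<T> for Result<T, io::Error> {
    fn with_context<C, F>(self, f: F) -> Result<T, ContextError>
    where
        C: Into<String>,
        F: FnOnce() -> C,
    {
        // The closure only runs on the error path, so the success path pays nothing.
        self.map_err(|source| ContextError { context: f().into(), source })
    }
}

/// Read a file, naming it in the error so that a bare ENOENT becomes actionable.
pub fn read_debug_file(path: &str) -> Result<Vec<u8>, ContextError> {
    std::fs::read(path).with_context(|| format!("failed to read DWARF file `{}`", path))
}
```

With context attached at this level, an error like the one above would name the file that could not be opened rather than only reporting the OS error.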

Discrepancy between data availability checks, data reads, and size calculations

From https://github.com/libbpf/blazesym/blob/50408a3b55c341ef954465a678ee334956c6399c/src/dwarf/debug_info.rs#L257-261

DW_FORM_block4 => {
    if 2 <= data.len() {                                             // Wrong length check
        let bytes = decode_uword(data);                              // Correct read on insufficiently sized data can cause panics
        let fullsize = bytes as usize + 2;                           // Wrong size calculation
        if fullsize <= data.len() {
            Some((AttrValue::Bytes(&data[2..fullsize]), fullsize))   // Wrong buffer inference causing data corruption here and down the line because of wrong reported length
        } else {
            None
        }
    } else {
        None
    }
}

This is a textbook example of why I have been pushing to fix the current design, in which one function reads a number of bytes and the caller then adjusts pointers, offsets, and sizes based on implicit knowledge. The number 2 is hard-coded three times here, and all three are wrong: DW_FORM_block4 uses a 4-byte length prefix. That width should appear in exactly one line of code (either explicitly or implicitly). Had it been copied and pasted incorrectly there, a test would notice and it would be clear that there is a single problem rather than land mines scattered all over the place. As it stands, some inputs blow up right away, if at all, while others only surface down the line as data corruption, if they get noticed at all. Such unnecessary bugs could not happen had we given proper API design some thought.
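
One possible shape for such an API is a set of readers that advance the input slice themselves, so that the width of the length prefix lives in a single place. The sketch below is an illustration under assumptions, not the crate's code: the helper names `read_bytes`, `read_u32`, and `read_block4` are made up for this example, and little-endian data is assumed.

```rust
/// Split `n` bytes off the front of `data` and advance the slice past them.
/// Returns `None` (leaving `data` untouched) if fewer than `n` bytes remain.
fn read_bytes<'a>(data: &mut &'a [u8], n: usize) -> Option<&'a [u8]> {
    let slice: &'a [u8] = *data;
    if n > slice.len() {
        return None;
    }
    let (head, tail) = slice.split_at(n);
    *data = tail;
    Some(head)
}

/// Read a little-endian `u32` and advance the slice by four bytes.
fn read_u32(data: &mut &[u8]) -> Option<u32> {
    let b = read_bytes(data, 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Read a `DW_FORM_block4` value: a 4-byte length followed by that many
/// bytes. The caller never computes offsets; the slice is advanced here.
fn read_block4<'a>(data: &mut &'a [u8]) -> Option<&'a [u8]> {
    let len = read_u32(data)?;
    // `u32` to `usize` is lossless on 32- and 64-bit targets.
    read_bytes(data, len as usize)
}
```

Because each reader both checks availability and consumes what it read, the length check, the read, and the size calculation cannot drift apart the way they did in the snippet above.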

Compare performance to `llvm-symbolizer`

We should capture some numbers on how blazesym compares to llvm-symbolizer in terms of performance (symbolization time + memory usage).

Niceify `Debug` representation of various types

Now that we migrated to tracing (but not only for that), we should make more of an attempt to niceify our Debug impls. E.g., currently a bunch of _non_exhaustive attributes show up, and types such as our normalization meta data are printed as Binary(Binary {...}), which seems like duplicate information.

A good start would be to look at everything that shows up in traces directly and either looks weird or could reasonably be shortened:

$ RUST_LOG=trace RUST_LOG_SPAN_EVENTS=full cargo test -- --nocapture
> 2023-06-08T17:28:42.272177Z  INFO symbolize{src=Process(Process { pid: Slf, _non_exhaustive: () }) addrs=[93873688344336, 93873692388576]}: blazesym::symbolize::symbolizer: close time.busy=391ms time.idle=34.5µs

But note that in some cases adjusting the Debug impl may not be the right call, if otherwise useful information is lost. Instead, it may be more appropriate to just print individual fields instead of relying on Debug altogether (check e.g., 7222551). Make a judgement call.

This issue specifically is mostly about the Debug impls themselves, though, which could show up in traces in the future or may just be accessible from user code.

Support DWARF v3

I just played around attempting to test DWARF v2 support. It turns out at least on gcc, that basically implies usage of extensions from later standards. So what ends up happening is that when a binary is created with v2 DWARF, it contains v3 bits and we don't support that.

There exists the -gstrict-dwarf flag, but it seemingly does nothing (?). The bottom line is that supporting DWARF v2 but explicitly not v3 could still mean that we have big gaps in what we actually support.

I didn't see the above issue with clang (it did not appear to be emitting v3 bits when asked for v2), though there we see other issues...

Support `debuginfod`

We should consider supporting debuginfod (https://sourceware.org/elfutils/Debuginfod.html) for pulling binaries/debug information based on build IDs (and whatever else there may be).

Invalid source files are not handled properly

When I create a symbolizer, pointing it to a non-existent source, it will just silently not do anything.

  let file = Path::new("/blahblahblubb").to_path_buf();  // File does not actually exist
  let features = [
      SymbolizerFeature::DebugInfoSymbols(true),
      SymbolizerFeature::LineNumberInfo(true),
  ];
  let sources = [SymbolSrcCfg::Elf {
    file_name: file,
    base_address: 0,
  }];
  let symbolizer = BlazeSymbolizer::new_opt(&features).unwrap();

  let results = symbolizer
      .symbolize(&sources, &[XXXX]);
  // here results will just always be empty; no indication that a source was not even present

We most definitely want to inform users about an invalid (non-existent) file name. I can't come up with a single reason why it would be sane to treat a non-existent file as expected and just silently do the wrong thing.

Provide better error messages in ResolverMap

ResolverMap passes errors returned by resolvers on to callers unchanged, which makes them difficult to diagnose. Adding file names and other information to the error messages would be useful.

Infer build ID section based on section type

We should address the following TODO:

blazesym/src/normalize/normalizer.rs

Lines 66 to 70 in d4ff9f0

/// Attempt to read an ELF binary's build ID.
// TODO: Currently look up is always performed based on section name, but there
// is also the possibility of iterating notes and checking checking
// Elf64_Nhdr.n_type for NT_GNU_BUILD_ID, specifically.
fn read_build_id(path: &Path) -> Result<Option<Vec<u8>>> {

Should be a straightforward and isolated change that is easy to test.
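
As a rough sketch of what iterating notes could look like, the function below walks the raw contents of a note section or segment and picks the first entry whose type is NT_GNU_BUILD_ID. It assumes little-endian data and 4-byte alignment of the name and descriptor fields (some note segments use 8-byte alignment, which is not handled here). `find_build_id` and the helpers are names invented for this example.

```rust
/// Note type identifying a GNU build ID (`NT_GNU_BUILD_ID` in `elf.h`).
const NT_GNU_BUILD_ID: u32 = 3;

/// Round `n` up to the next multiple of four; `None` on overflow.
fn align4(n: usize) -> Option<usize> {
    Some(n.checked_add(3)? & !3)
}

/// Read a little-endian `u32` at `offset`, if in bounds.
fn read_u32_le(data: &[u8], offset: usize) -> Option<u32> {
    let b = data.get(offset..offset.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Walk the notes in `notes` and return the descriptor (the build ID bytes)
/// of the first GNU build ID note. Returns `None` if there is no such note
/// or if the data is truncated.
fn find_build_id(notes: &[u8]) -> Option<&[u8]> {
    let mut offset = 0usize;
    while offset < notes.len() {
        // Note header: n_namesz, n_descsz, n_type (three 32-bit words).
        let namesz = read_u32_le(notes, offset)? as usize;
        let descsz = read_u32_le(notes, offset + 4)? as usize;
        let n_type = read_u32_le(notes, offset + 8)?;

        let name_start = offset + 12;
        let desc_start = name_start.checked_add(align4(namesz)?)?;
        let name = notes.get(name_start..name_start.checked_add(namesz)?)?;
        let desc = notes.get(desc_start..desc_start.checked_add(descsz)?)?;

        if n_type == NT_GNU_BUILD_ID && name == &b"GNU\0"[..] {
            return Some(desc);
        }
        // Advance to the next note, skipping descriptor padding.
        offset = desc_start.checked_add(align4(descsz)?)?;
    }
    None
}
```

Checking the note name in addition to the type guards against notes from other vendors that happen to reuse the same type value.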

inspect: Proper file offset reporting for DWARF

Currently file offsets are not reported correctly when using DWARF. DWARF does not carry file offset information from what I understand, but we can bridge the gap by looking up the address of the symbol and then mapping that address to a file offset. The problem is that we need the complete ELF file for doing so. Even without true split-DWARF support, there is no requirement a DWARF file contain ELF bits necessary for running, so we may need to require additional information (path to the ELF file) from the user.

enhance blazecli with more functionality

blazecli is a command line interface to blazesym. Please refer to https://github.com/libbpf/blazesym/tree/main/cli.

Currently we have only hooked up whatever functionality we needed so far (which was symbolization of ELF and in a process). That means there is plenty of other stuff that we can add. Some examples (list is not necessarily exhaustive):

~~Easiest would be the addition of the symbolize dwarf sub-command. Should be not much more difficult than 03c0174~~
We want an inspect sub-command that exposes the functionality of the similarly named blazesym module
Similarly, normalization functionality would be a nice addition.
Build ID reporting would be nice to have as well, see https://docs.rs/blazesym/0.2.0-alpha.4/blazesym/helper/fn.read_elf_build_id.html

Support breakpad format?

We may want to consider adding support for breakpad files (also).

Unnecessary rebuilds in build script

I noticed that our benchmark files are repeatedly generated every time we benchmark. Given the size of the artifacts used, that process takes a while. It's also not expected, but it's not clear why exactly that is (it's possible that the dependencies we have cannot properly be expressed in build scripts at all). We should see if that can be fixed so that we only re-generate those files if something changed.

Add `*_sorted` variants of symbolization APIs

We should add *_sorted variants similar to what we have for the address normalization APIs

blazesym/src/normalize/normalizer.rs

Lines 358 to 378 in 21498b6

/// Normalize `addresses` belonging to a process.
///
/// Normalize all `addrs` in a given process. The `addrs` array has to
/// be sorted in ascending order or an error will be returned.
///
/// Unknown addresses are not normalized. They are reported as
/// [`Unknown`] meta entries in the returned [`NormalizedUserAddrs`]
/// object. The cause of an address to be unknown (and, hence, not
/// normalized), could have a few reasons, including, but not limited
/// to:
/// - user error (if a bogus address was provided)
/// - they belonged to an ELF object that has been unmapped since the
///   address was captured
///
/// The process' ID should be provided in `pid`. To normalize addresses of the
/// calling processes, `0` can be provided as a sentinel for the current
/// process' ID.
///
/// Normalized addresses are reported in the exact same order in which the
/// non-normalized ones were provided.
pub fn normalize_user_addrs_sorted(

to the symbolization logic here:

blazesym/src/symbolize/symbolizer.rs

Lines 208 to 210 in 21498b6

/// Symbolize the given list of user space addresses in the provided
/// process.
fn symbolize_user_addrs(&self, addrs: &[Addr], pid: Pid) -> Result<Vec<Vec<SymbolizedResult>>> {

That would mean introducing and implementing such a new function, adding tests, and exposing it through C bindings.
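
A minimal sketch of how a sorted and an unsorted variant could relate is shown below. Everything here is hypothetical: `symbolize_sorted` and `symbolize` are placeholder names, the "symbolization" merely formats the address, and errors are plain strings. The point is only the pattern of validating sortedness in the `*_sorted` variant and having the convenience variant sort, delegate, and restore the caller's order.

```rust
type Addr = u64;

/// Check that `addrs` is sorted in ascending order.
fn is_sorted(addrs: &[Addr]) -> bool {
    addrs.windows(2).all(|w| w[0] <= w[1])
}

/// Hypothetical `*_sorted` variant: rejects unsorted input up front.
/// The "symbolization" is a placeholder that just formats the address.
fn symbolize_sorted(addrs: &[Addr]) -> Result<Vec<String>, String> {
    if !is_sorted(addrs) {
        return Err("addresses are not sorted in ascending order".to_string());
    }
    Ok(addrs.iter().map(|addr| format!("sym@{:#x}", addr)).collect())
}

/// Unsorted convenience variant implemented on top of the sorted one.
fn symbolize(addrs: &[Addr]) -> Result<Vec<String>, String> {
    // Pair each address with its original position, then sort by address.
    let mut indexed: Vec<(usize, Addr)> = addrs.iter().copied().enumerate().collect();
    indexed.sort_by_key(|&(_, addr)| addr);

    let sorted: Vec<Addr> = indexed.iter().map(|&(_, addr)| addr).collect();
    let symbols = symbolize_sorted(&sorted)?;

    // Scatter the results back into the caller's original order.
    let mut results = vec![String::new(); addrs.len()];
    for ((idx, _), sym) in indexed.into_iter().zip(symbols) {
        results[idx] = sym;
    }
    Ok(results)
}
```

Callers that already hold sorted addresses can then skip the sort and the reordering pass by calling the `*_sorted` variant directly.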

Support remote symbolization

Symbolization is a potentially resource intensive process and it may not be feasible to perform it on the very system where addresses are recorded. Embedded devices, for example, with limited disk space and CPU capacity, cannot afford to perform symbolization on the device itself. Debug information can be large, so its disk space usage would be prohibitive and it is unlikely to be stored on the device. In addition, the process of symbolization is likely to impact other running applications negatively, to take excessive amounts of time, or both.

For that and other reasons, we'd like to support remote (or off-device) symbolization. The below (preliminary) API proposal fleshes out the idea somewhat.

The local side normalizes a list of addresses using the normalize_addresses function:

pub type Address = usize;

mod address_meta {
    use super::*;

    /// A GNU build ID.
    type BuildId = String;


    /// Meta information about a Linux kernel address.
    #[derive(Clone, Debug)]
    pub struct Kernel {
        /// The kernel's release string (i.e., roughly what `uname -r` reports).
        ///
        /// This is a free-form string.
        pub release: String,
        /// The kernel binary's build ID, if available.
        pub build_id: Option<BuildId>,
        /// The struct is non-exhaustive and open to extension.
        #[doc(hidden)]
        pub _non_exhaustive: (),
    }


    /// Meta information about a Linux kernel module address.
    #[derive(Clone, Debug)]
    pub struct KernelModule {
        /// The name of the kernel module.
        pub name: String,
        /// The kernel module's version string.
        ///
        /// This is a free-form string. It may resemble bits of `modinfo`'s
        /// `vermagic` field.
        pub version: String,
        /// The kernel's release string (i.e., roughly what `uname -r` reports).
        ///
        /// This is a free-form string.
        pub kernel_release: String,
        /// The kernel module's build ID, if available.
        pub build_id: Option<BuildId>,
        /// The struct is non-exhaustive and open to extension.
        #[doc(hidden)]
        pub _non_exhaustive: (),
    }

    /// Meta information about a user space binary (executable or shared object).
    #[derive(Clone, Debug)]
    pub struct Binary{
        /// The canonical absolute path to the binary, including its name.
        pub path: PathBuf,
        /// The binary's build ID, if available.
        pub build_id: Option<BuildId>,
        /// The struct is non-exhaustive and open to extension.
        #[doc(hidden)]
        pub _non_exhaustive: (),
    }

    /// Meta information about an address that could not be determined to be
    /// belonging to a specific component.
    #[derive(Clone, Debug)]
    pub struct Unknown {
        /// The struct is non-exhaustive and open to extension.
        #[doc(hidden)]
        pub _non_exhaustive: (),
    }
}


/// Meta information for an address.
#[derive(Clone, Debug)]
#[non_exhaustive]
pub enum AddressMeta {
    Kernel(address_meta::Kernel),
    KernelModule(address_meta::KernelModule),
    Binary(address_meta::Binary),
    Unknown(address_meta::Unknown),
}


/// A type capturing normalized addresses along with captured meta data.
#[derive(Clone, Debug)]
pub struct NormalizedAddresses {
    /// Normalized addresses along with an index into `meta` for retrieval of
    /// the corresponding [`AddressMeta`] information.
    addresses: Vec<(Address, usize)>,
    /// Meta information about the normalized addresses.
    meta: Vec<AddressMeta>,
}


/// Normalize `addresses` belonging to either a process or the kernel.
///
/// If the provided addresses belong to a process, its PID should be provided in
/// `pid`. For kernel addresses, `pid` may be `None`.
///
/// Normalized addresses are reported in the exact same order in which the
/// non-normalized ones were provided.
pub fn normalize_addresses<A>(addresses: A, pid: Option<u32>) -> Result<NormalizedAddresses, Error>
where
    A: IntoIterator<Item = Address>,
{
    // ...
}

The resulting normalized addresses together with information about their owners have to be conveyed to the remote for the actual symbolization to happen. The transfer of this information is outside of blazesym's purview and a responsibility of the user. For Rust users, we will provide serde derives for convenient serialization & deserialization.

On the remote system, blazesym's existing BlazeSymbolizer can be used to perform the symbolization using the newly added symbolize_normalized method:

/// A trait for resolving meta information for an address to a [`SymResolver`] to
/// use for the actual symbolization.
pub trait AddressMetaResolver {
    /// The type of [symbol resolver](SymResolver) returned by the
    /// `resolve_address_meta` method.
    type Resolver: SymResolver;

    /// Resolve the provided [`AddressMeta`] to a [symbol resolver](SymResolver) to use.
    fn resolve_address_meta(&self, address_meta: &AddressMeta) -> Result<Self::Resolver, Error>;
}

/// BlazeSymbolizer provides an interface to symbolize addresses with
/// a list of symbol sources.
pub struct BlazeSymbolizer {
   // ...
}

impl BlazeSymbolizer {
    // ...

    /// Symbolize a list of normalized addresses with associated meta
    /// information.
    ///
    /// Please refer to [`normalize_addresses`] for information on how to
    /// normalize addresses.
    ///
    /// The function returns one `Vec<SymbolizedResult>` for each address passed
    /// in, in the order they were passed in. Multiple `SymbolizedResult`
    /// candidates may be present in case an address is ambiguous owing to
    /// compiler optimizations.
    pub fn symbolize_normalized<R>(
        &self,
        addresses: &NormalizedAddresses,
        address_meta_resolver: R,
    ) -> Result<Vec<Vec<SymbolizedResult>>, Error>
    where
        R: AddressMetaResolver,
    {
        // ...
    }
}

The API should also allow us to enable debuginfod support, by having an implementor of AddressMetaResolver that speaks the corresponding protocol and fetches debug information from a service using it.

Move C API into separate crate?

I haven't prototyped any of it or thought it through completely, but I think we should consider moving everything making up the C API into a separate (in-workspace) crate, blazesym-c (or similar). That would allow us to version the two independently and while I don't think we will make use of that capability anytime soon, it seems cleaner and potentially useful down the line.
Focusing the C bits elsewhere has other benefits such as easily linkable documentation, reduced compile time of the main crate, etc.

Remove `symbolize::Builder::enable_debug_syms()`?

https://docs.rs/blazesym/0.2.0-alpha.6/blazesym/symbolize/struct.Builder.html#method.enable_debug_syms

This entire property seems rather...useless. In the current design "debug_syms" are only available for ELF (if present). We should probably move it into the Elf source object instead. But also we had some thoughts on decoupling ELF & DWARF and making the latter a first class thing, which may make more sense for split DWARF cases.

Correctly handle `STT_GNU_IFUNC` ELF symbols

See discussion here.

https://sourceware.org/glibc/wiki/GNU_IFUNC and https://www.airs.com/blog/archives/403 have some information.

I haven't fully read through everything, but this may be an easy task. I suspect that the loader does all the magic for us and we'd just need to make sure to handle STT_GNU_IFUNC in addition to STT_FUNC, but we'd need to double check.

DWARF source code line reporting not working

Source code line reporting does not appear to be working properly for DWARF. Assuming #67, I add

--- tests/blazesym.rs
+++ tests/blazesym.rs
@@ -54,4 +54,5 @@ fn symbolize_dwarf() {

     let result = results.first().unwrap();
     assert_eq!(result.symbol, "factorial");
+    assert_eq!(result.line_no, 7);
 }

And the assertion fails:

---- symbolize_dwarf stdout ----
thread 'symbolize_dwarf' panicked at 'assertion failed: `(left == right)`
  left: `0`,
 right: `7`', tests/blazesym.rs:56:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Yet, the file seems to contain the proper information:

$ llvm-dwarfdump --name=factorial data/test-dwarf.bin
data/test-dwarf.bin:    file format elf64-x86-64

0x000000a3: DW_TAG_subprogram
              DW_AT_external    (true)
              DW_AT_name        ("factorial")
              DW_AT_decl_file   ("data/test-stable-addresses.c")
              DW_AT_decl_line   (7)
              DW_AT_decl_column (0x01)
              DW_AT_prototyped  (true)
              DW_AT_type        (0x000000d3 "unsigned int")
              DW_AT_low_pc      (0x0000000002000100)
              DW_AT_high_pc     (0x000000000200012b)
              DW_AT_frame_base  (DW_OP_call_frame_cfa)
              DW_AT_GNU_all_tail_call_sites     (true)
              DW_AT_sibling     (0x000000d3)

Support `DebugFission` format?

See https://gcc.gnu.org/wiki/DebugFission. Unclear whether it's more than split DWARF and whether it's covered by gimli already. Needs some more research.

Edit: As per a discussion, DebugFission seems to be the "codename" for dwo-style split DWARF. DebugFission-dwp then is for dwp-style split DWARF.

CU's start address should be in strict order: [6] 0x84700 [7] 0x84700

When using this library's C version, I hit this panic:

thread '<unnamed>' panicked at 'CU's start address should be in strict order: [6] 0x84700 [7] 0x84700', src/dwarf.rs:802:9
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
fatal runtime error: failed to initiate panic, error 5
Aborted

It seems to be caused by

blazesym/src/dwarf.rs

Line 898 in 47afb3d

"CU's start address should be in strict order: [{}] 0x{:x} [{}] 0x{:x}",

How can this be solved?

version: 885b4d5

API for single address symbolization

We may want to expose an API for symbolizing a single address. It would be more convenient than dealing with a Vec in some cases and it would not necessitate an allocation.

cc @salvatorebenedetto

Audit and minimize unsafe usage

If we want to make any claims about memory safety of the crate, we need to have confidence in the unsafe functionality we use, as it escapes most guarantees that Rust otherwise provides (including memory safety and data race freedom). Otherwise we may end up getting hammered with CVEs, which would all but destroy any hopes of getting wider traction with the community.

This is obviously a longer term effort, but I think we roughly should:

identify the unsafe operations we perform (I've started that with #39)
minimize their number with some transformations (e.g., in #34 we simply replaced existing functionality with safe counterparts)
audit them and add SAFETY comments as to why we think they are safe

Symbolization of addresses in APKs

This is already covered in the "roadmap": we would like to support symbolization of addresses in Android APK files.

Support pre-populating caches (pre-parsing data structures etc.)

Currently we only populate caches as part of the symbolization (or inspection) process. However, it may be beneficial to add support for pre-populating caches, so that symbolization (or inspection...) can be as fast as possible, even on the first run. This is more of an advanced use case, but it could be useful in some setups.

A bit of a complication may be that for DWARF at least, parsing of data structures is on-demand. So we only cache data structures necessary to satisfy a specific request. But with pre-population there is no request. So presumably we'd need to parse everything. But I'd think that's fine.

Unsoundness in C enum handling

The handling of all enums in our C API bindings is unsound. E.g.,

blazesym/src/c_api.rs

Lines 303 to 310 in 9fe143a

match x.feature {
    blazesym_feature_name::LINE_NUMBER_INFO => {
        SymbolizerFeature::LineNumberInfo(unsafe { x.params.enable })
    }
    blazesym_feature_name::DEBUG_INFO_SYMBOLS => {
        SymbolizerFeature::DebugInfoSymbols(unsafe { x.params.enable })
    }
}

C code could set any value, including those not representable by Rust. Discussed in many places (while also being common sense...), among others here: rust-lang/rust#36927
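
One common way to avoid this class of problem is to let the discriminant cross the FFI boundary as a plain integer and convert it to a Rust enum with an explicit, fallible check. The sketch below assumes a simplified layout: `RawFeature`, `FeatureName`, `convert`, and the numeric values 0 and 1 are all invented for illustration and do not mirror the crate's actual C API types.

```rust
use std::convert::TryFrom;

/// Rust-side feature selector. It is only ever produced by the checked
/// conversion below, never read directly from foreign memory.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum FeatureName {
    LineNumberInfo,
    DebugInfoSymbols,
}

/// Hypothetical C-facing struct. Both fields are plain integers, for which
/// every bit pattern is a valid value, so C code cannot create an invalid
/// Rust value just by writing to them.
#[repr(C)]
struct RawFeature {
    feature: u32,
    enable: u8,
}

impl TryFrom<u32> for FeatureName {
    /// The unrecognized raw value is handed back to the caller.
    type Error = u32;

    fn try_from(raw: u32) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(FeatureName::LineNumberInfo),
            1 => Ok(FeatureName::DebugInfoSymbols),
            other => Err(other),
        }
    }
}

/// Validate a raw feature coming from C before using it on the Rust side.
fn convert(raw: &RawFeature) -> Result<(FeatureName, bool), u32> {
    let name = FeatureName::try_from(raw.feature)?;
    // Treat any non-zero byte as `true` rather than reading a Rust `bool`,
    // which would only be valid for the bit patterns 0 and 1.
    Ok((name, raw.enable != 0))
}
```

An unknown value then surfaces as an ordinary error that the C API can report, instead of undefined behavior inside a `match`.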

List supported features

We should have some kind of outline of what features we support and which ones are planned/work in progress in the README, so that users don't have to rely on trial and error to figure that out (and/or comb through the code).

Unsoundness in `ElfParser::get_all_program_headers()` function

blazesym/src/elf/parser.rs

Lines 566 to 575 in fde2052

pub fn get_all_program_headers(&self) -> Result<&[Elf64_Phdr], Error> {
    self.ensure_phdrs()?;

    let phdrs = unsafe {
        let me = self.backobj.as_ptr();
        let phdrs_ref = (*me).phdrs.as_mut().unwrap();
        phdrs_ref
    };
    Ok(phdrs)
}

The function creates a mutable reference from a shared context. That means we can end up with multiple "exclusive" references to the same object, which is not allowed.
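
For lazily initialized data that is handed out from `&self`, a write-once cell is one safe alternative. The following is a sketch under assumptions rather than a proposed patch: `Parser`, `Phdr`, and `parse_phdrs` are simplified stand-ins for the real parser, and `std::cell::OnceCell` requires Rust 1.70 or later, which may conflict with the crate's minimum supported Rust version.

```rust
use std::cell::OnceCell;

/// Simplified stand-in for a parsed program header.
#[derive(Debug, PartialEq)]
struct Phdr {
    offset: u64,
}

/// Simplified stand-in for the ELF parser.
struct Parser {
    raw: Vec<u64>,
    /// Parsed lazily on first access; written at most once.
    phdrs: OnceCell<Vec<Phdr>>,
}

impl Parser {
    fn new(raw: Vec<u64>) -> Self {
        Parser { raw, phdrs: OnceCell::new() }
    }

    /// Placeholder for the actual (fallible) parsing step.
    fn parse_phdrs(&self) -> Result<Vec<Phdr>, String> {
        if self.raw.is_empty() {
            return Err("no program headers".to_string());
        }
        Ok(self.raw.iter().map(|&offset| Phdr { offset }).collect())
    }

    /// Return the program headers, parsing them on first use. Only shared
    /// references are ever created, so no aliasing rules are violated.
    fn program_headers(&self) -> Result<&[Phdr], String> {
        if let Some(phdrs) = self.phdrs.get() {
            return Ok(phdrs.as_slice());
        }
        let parsed = self.parse_phdrs()?;
        Ok(self.phdrs.get_or_init(|| parsed).as_slice())
    }
}
```

Since the cell can only be written once, references returned by earlier calls stay valid for as long as the parser is borrowed, with no `unsafe` involved.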

Design multi-threading support

We should come up with a way to support multi-threading where beneficial, to speed things up. See #93 for when we removed earlier support for that in DWARF parsing code, which wasn't accessible by users.

README snippets seem to be outdated

As mentioned in #27, it appears as if the snippets illustrating the crate's API in the README are broken. We should probably update them to work against the new API and make sure to test everything in CI.
