m4b / goblin
An impish, cross-platform binary parsing crate, written in Rust
License: MIT License
This will make them accessible to consumers who don't use std (because why not), and will also remove the warnings for those who use std but not endian_fd (binary loaders, dryad), and will also make the API more usable in general.
Neither function verifies that fileoff and fileoff+filesize are within the bytes range, which may result in a panic for invalid files.
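A minimal sketch of the missing check, with a hypothetical function name and signature (not goblin's API): validate the range before slicing, so malformed files yield an Err instead of a panic.

```rust
// Hypothetical names, not goblin's API: the point is returning Err on
// out-of-range instead of slicing unchecked and panicking.
fn segment_data(bytes: &[u8], fileoff: u64, filesize: u64) -> Result<&[u8], String> {
    let start = fileoff as usize;
    let end = start
        .checked_add(filesize as usize)
        .ok_or_else(|| "fileoff + filesize overflows".to_string())?;
    bytes
        .get(start..end) // None instead of a panic when out of bounds
        .ok_or_else(|| format!("range {}..{} outside of len {}", start, end, bytes.len()))
}

fn main() {
    let bytes = [0u8; 16];
    assert!(segment_data(&bytes, 8, 8).is_ok());
    assert!(segment_data(&bytes, 8, 9).is_err()); // would panic if sliced blindly
}
```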
Due to the log crate, std is required. To fix this, remove the use_std feature from log.
The idea is that in a non-pure setting, you'll typically always want endian_fd reading, which means we can use the generic Reader trait from std.
This will enable unit testing the endian fd readers by passing Cursor'd byte arrays.
We may be able to drop the no_endian_fd feature flag if this pans out the way I think it can.
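A sketch of the testing idea, assuming a reader generic over std::io::Read (the function here is made up, not goblin's actual trait bounds):

```rust
use std::io::{Cursor, Read};

// Hypothetical reader generic over std::io::Read (not goblin's actual
// trait bounds): unit tests can feed it a Cursor over an in-memory
// byte array instead of a real file descriptor.
fn read_u32_le<R: Read>(r: &mut R) -> std::io::Result<u32> {
    let mut buf = [0u8; 4];
    r.read_exact(&mut buf)?;
    Ok(u32::from_le_bytes(buf))
}

fn main() {
    // No file needed: a Cursor'd byte array stands in for the fd.
    let mut cursor = Cursor::new(vec![0x7f, b'E', b'L', b'F']);
    assert_eq!(read_u32_le(&mut cursor).unwrap(), 0x464c457f); // ELF magic, LE
}
```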
Similar to ELF's symbol iterator, we have a Result on invocation, and elements are result-less, because we know the size beforehand.
Import {
    name: "ORDINAL 0",
    dll: "WS2_32.dll",
    ordinal: 0,
    offset: 62264,
    rva: 0,
    size: 4
},
this ("ORDINAL 0") is basically a stupid hack, but it makes working with Import much nicer. Alternatively, we could make it an enum, but I dunno, that just annoys me for some reason.
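For comparison, the enum alternative mentioned above might look like this (names are made up, not goblin's API):

```rust
// Hypothetical enum alternative to the "ORDINAL 0" placeholder string
// (names made up, not goblin's API): the two import cases become
// explicit instead of being encoded into a synthetic name.
#[derive(Debug, PartialEq)]
enum ImportName<'a> {
    Name(&'a str),
    Ordinal(u16),
}

fn main() {
    let imports = [ImportName::Ordinal(0), ImportName::Name("WSAStartup")];
    for import in &imports {
        match import {
            ImportName::Name(n) => println!("by name: {}", n),
            ImportName::Ordinal(o) => println!("by ordinal: {}", o),
        }
    }
}
```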
I was investigating my options for parsing EXE files to determine what environment to auto-fill in my experimental game launcher (i.e., DOSBox, Wine, Wine+qemu-user, Mono, etc.), and I managed to trigger some panics in goblin.
====================
TESTING WITH GOBLIN:
====================
unknown magic: ./hello_owatcom_com.com
Parse error: ./hello_owatcom_os2v2.exe => Invalid magic number: 0x1
pe: ./hello_pacific.exe
Parse error: ./hello_owatcom_dos.upx.exe => requested range [309100590..309100594) from object of len 6881
Parse error: ./hello_owatcom_dos4g.exe => Invalid magic number: 0x1
Parse error: ./hello_owatcom_windows.exe => Invalid magic number: 0x0
pe: ./hello_mingw32.exe
pe: ./hello_csharp_exe_itanium.exe
pe: ./hello_owatcom_win95.exe
Parse error: ./hello_owatcom_dos4g.upx.exe => Invalid magic number: 0x1
elf: ./hello_gcc.x86
unknown magic: ./hello_djgpp.upx.coff.exe
unknown magic: ./hello_owatcom_com.upx.com
unknown magic: ./hello_dev86.upx.com
unknown magic: ./hello_dev86.com
pe: ./hello_mingw64.exe
Parse error: ./hello_djgpp.exe => Invalid magic number: 0x0
PANICKED on hello_mingw32.upx.exe
Parse error: ./hello_owatcom_dos.exe => Invalid magic number: 0x20
PANICKED on hello_owatcom_win95.upx.exe
pe: ./hello_owatcom_nt.exe
Parse error: ./hello_owatcom_win386.exe => Invalid magic number: 0x0
Parse error: ./hello_djgpp.upx.exe => Invalid magic number: 0x0
Parse error: ./hello_owatcom_dos4gnz.exe => Invalid magic number: 0x1
PANICKED on hello_mingw64.upx.exe
Parse error: ./hello_owatcom_os2.exe => Invalid magic number: 0x0
pe: ./hello_csharp_exe_arm.exe
pe: ./hello_csharp_exe_x64.exe
pe: ./hello_csharp_exe_x86.exe
elf: ./hello_gcc.x86_64
PANICKED on hello_owatcom_nt.upx.exe
Parse error: ./hello_pacific.upx.exe => requested range [309100590..309100594) from object of len 4527
Here's a backtrace which appears to represent all of the panics:
ssokolow@monolith test_exes [rusty-core] % RUST_BACKTRACE=1 ./pe-test/target/debug/goblin-test hello_owatcom_nt.upx.exe
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', /checkout/src/libcore/option.rs:329
stack backtrace:
0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
at /checkout/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
1: std::sys_common::backtrace::_print
at /checkout/src/libstd/sys_common/backtrace.rs:71
2: std::panicking::default_hook::{{closure}}
at /checkout/src/libstd/sys_common/backtrace.rs:60
at /checkout/src/libstd/panicking.rs:355
3: std::panicking::default_hook
at /checkout/src/libstd/panicking.rs:371
4: std::panicking::rust_panic_with_hook
at /checkout/src/libstd/panicking.rs:549
5: std::panicking::begin_panic
at /checkout/src/libstd/panicking.rs:511
6: std::panicking::begin_panic_fmt
at /checkout/src/libstd/panicking.rs:495
7: rust_begin_unwind
at /checkout/src/libstd/panicking.rs:471
8: core::panicking::panic_fmt
at /checkout/src/libcore/panicking.rs:69
9: core::panicking::panic
at /checkout/src/libcore/panicking.rs:49
10: <core::option::Option<T>>::unwrap
at /checkout/src/libcore/macros.rs:21
11: goblin::pe::import::SyntheticImportDirectoryEntry::parse
at /home/ssokolow/.cargo/registry/src/github.com-1ecc6299db9ec823/goblin-0.0.10/src/pe/import.rs:125
12: goblin::pe::import::ImportData::parse
at /home/ssokolow/.cargo/registry/src/github.com-1ecc6299db9ec823/goblin-0.0.10/src/pe/import.rs:158
13: goblin::pe::PE::parse
at /home/ssokolow/.cargo/registry/src/github.com-1ecc6299db9ec823/goblin-0.0.10/src/pe/mod.rs:80
14: goblin::parse
at /home/ssokolow/.cargo/registry/src/github.com-1ecc6299db9ec823/goblin-0.0.10/src/lib.rs:276
15: goblin_test::run
at ./pe-test/src/goblin.rs:17
16: goblin_test::main
at ./pe-test/src/goblin.rs:36
17: __rust_maybe_catch_panic
at /checkout/src/libpanic_unwind/lib.rs:98
18: std::rt::lang_start
at /checkout/src/libstd/panicking.rs:433
at /checkout/src/libstd/panic.rs:361
at /checkout/src/libstd/rt.rs:57
19: main
20: __libc_start_main
21: <unknown>
While this renders it unsuitable for my project (the mere fact that Goblin is capable of dying at an unwrap, when the other PE parser I've tried so far simply used Result to indicate a parse failure, means that using it in my project would cause me more worry than simply writing my own MZ/NE/PE parser with Nom), I thought you'd want to know so you can fix the problem for others.
If you want to re-create my test binaries, the source materials are in the test_exes folder of ssokolow/game_launcher, and build.sh contains instructions for the simplest, easiest way to install the requisite packages on a *buntu Linux 14.04 LTS machine like mine.
To reiterate what build.sh says, all compilers are optional, so producing just the binaries which caused panics here should only require apt-get install upx-ucl mingw-w64 and then downloading and unpacking OpenWatcom.
This deeply annoys me:
let peek = goblin::peek(&mut fd)?;
if let Hint::Unknown(magic) = peek {
    println!("unknown magic: {:#x}", magic)
} else {
    let bytes = { let mut v = Vec::new(); fd.read_to_end(&mut v)?; v };
    match peek {
        Hint::Elf(_) => {
I think there's an architectural problem and an ergonomics problem here.
There shouldn't be an Unknown variant - we already peeked to make sure the magic is good! - everything else is just a parse error for that respective file format. So, what I want is to have my cake and eat it too: I want the peek to ensure the magic is correct, route to the correct binary parser, and return this result, without the Unknown variant and without temporary allocations (or else the full fd read is passed through).
@philipc @endeav0r you seem to be using goblin::Object as clients; does this bother you?
Anyone who happens to be watching/reading this, I'm open to proposals for how to fix this / make it nicer.
Afaics, being flexible w.r.t. the bytes + reading is going to be tricky; the first thing that comes to mind is some kind of closure style or an inout, like:
// has no Unknown variant, and is also totally me just randomly typing stuff
let object: Option<Result<Object>> = Object::parse_and_fill(fd, &mut bytes);
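One way the routing could work, as a totally hypothetical sketch (none of these names are goblin's): peek reads the magic, rewinds, and reports unknown magic before anything is allocated.

```rust
use std::io::{self, Read, Seek, SeekFrom};

// Hypothetical sketch of a peek that routes without an Unknown
// variant: unknown magic is surfaced as None up front, so the parse
// step only ever deals with known formats.
#[derive(Debug, PartialEq)]
enum Kind { Elf, Pe }

fn peek_kind<R: Read + Seek>(fd: &mut R) -> io::Result<Option<Kind>> {
    let mut magic = [0u8; 4];
    fd.read_exact(&mut magic)?;
    fd.seek(SeekFrom::Start(0))?; // rewind so a parser sees all bytes
    Ok(match &magic {
        [0x7f, b'E', b'L', b'F'] => Some(Kind::Elf),
        [b'M', b'Z', _, _] => Some(Kind::Pe),
        _ => None, // unknown magic: caller bails, nothing was allocated
    })
}

fn main() -> io::Result<()> {
    let mut fd = io::Cursor::new(vec![0x7f, b'E', b'L', b'F', 0, 0]);
    assert_eq!(peek_kind(&mut fd)?, Some(Kind::Elf));
    Ok(())
}
```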
This would be a breaking change, but I think it's important to get right sooner rather than later.
Hello,
I ran into a problem while parsing the notes in a core dump of my test program. The error comes from bad alignment: my alignment is equal to 0.
match alignment {
    4 => bytes.gread_with::<Nhdr32>(offset, ctx.le)?.into(),
    // this is a guess; i haven't seen gcc/clang compilers emit 64-bit notes, and i don't have any non gcc/clang compilers
    8 => bytes.gread_with::<Nhdr64>(offset, ctx.le)?.into(),
    _ => return Err(error::Error::Malformed(format!("Notes has unimplemented alignment requirement: {:#x}", alignment)))
}
But readelf works perfectly fine and parses the core dump correctly. I looked into the source code of readelf and found this:
/* NB: Some note sections may have alignment value of 0 or 1. gABI
specifies that notes should be aligned to 4 bytes in 32-bit
objects and to 8 bytes in 64-bit objects. As a Linux extension,
we also support 4 byte alignment in 64-bit objects. If section
alignment is less than 4, we treate alignment as 4 bytes. */
if (align < 4)
align = 4;
else if (align != 4 && align != 8)
{
warn (_("Corrupt note: alignment %ld, expecting 4 or 8\n"),
(long) align);
return FALSE;
}
As I see it, the match should be like this:
match alignment {
    0 ... 4 => bytes.gread_with::<Nhdr32>(offset, ctx.le)?.into(),
    // this is a guess; i haven't seen gcc/clang compilers emit 64-bit notes, and i don't have any non gcc/clang compilers
    8 => bytes.gread_with::<Nhdr64>(offset, ctx.le)?.into(),
    _ => return Err(error::Error::Malformed(format!("Notes has unimplemented alignment requirement: {:#x}", alignment)))
}
Hello.
Is it possible to generate a file that can be correctly parsed by both PE and ELF readers, even if the executable does nothing, like int main() {}?
@m4b is this something you would be interested in having in goblin?
So I've been holding off on error libraries, but Failure genuinely looks exciting and cool.
I am ok with dynamic allocation, since the parser allocates already, and parsing binaries generally won't be in a hot loop, and if it is, it'll likely be dwarfed by io reads anyway.
This might also be an opportunity to provide better error messages, because errors in goblin aren't so great. But this is mostly because scroll's error messages suck pretty hard (though that's because it supports no-std).
For gimli-rs/object#45, we need to be able to parse all the formats when using #[no_std]. Currently, a lot of the parsing requires allocation. Completely avoiding allocations is too much work for now, and probably not a good use of time. And anyway, we can still use the alloc crate with no_std (but this currently requires building with nightly).
So I propose we add an alloc feature that is midway between no_std and std. This feature will cover everything that uses allocations, and so mostly what will be left in std will be things that use std::fs or std::io.
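A minimal sketch of the kind of split this implies (the function names and string-table logic here are assumptions for illustration, not goblin's code): borrow-only parsing needs just core, collecting results needs alloc, and anything touching the filesystem would stay behind std.

```rust
// Sketch only: names and split are assumptions, not goblin's code.
// core-only: borrow a NUL-terminated name out of a strtab slice.
fn name_at(strtab: &[u8], offset: usize) -> Option<&[u8]> {
    let rest = strtab.get(offset..)?;
    let end = rest.iter().position(|&b| b == 0)?;
    Some(&rest[..end])
}

// needs `alloc` (or `std`): collects borrowed names into a Vec.
fn all_names(strtab: &[u8]) -> Vec<&[u8]> {
    (0..strtab.len())
        .filter(|&i| i == 0 || strtab[i - 1] == 0) // entry starts
        .filter_map(|i| name_at(strtab, i))
        .filter(|n| !n.is_empty())
        .collect()
}

fn main() {
    let strtab = b"\0main\0printf\0";
    assert_eq!(name_at(strtab, 1), Some(&b"main"[..]));
    assert_eq!(all_names(strtab), vec![&b"main"[..], &b"printf"[..]]);
}
```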
I've done enough to verify that this approach works, so assuming this is acceptable, now I just need to clean it up and submit a PR.
One question I have is about the endian_fd feature. What exactly is this meant to cover? The readme says it 'parses according to the endianness in the binary', but it doesn't cover code such as https://github.com/m4b/goblin/blob/master/src/elf/section_header.rs#L490. Currently it gates the entire mach/pe/archive formats, and it also requires std. I want to relax this restriction as part of adding the alloc feature, but I'm not sure which parts of mach/pe/archive should still require it.
Similar to symbols, add relocation iterators, for maximum laziness. This will give immediate perf results for clients reading large binaries but not needing the relocations.
Some well-formed PEs may have an export data directory with an empty table of NumberOfFunctions (as well as NumberOfNames, AddressOfFunctions, etc.), for example the Export Data Directory of apisetschema.dll (a DLL for API redirection from Windows 7) on my Windows 7 SP1 machine:
But the PE parser of goblin will refuse such a PE, because of some checks, for example when parsing the AddressOfNames:
let name_pointer_table_offset = &mut utils::find_offset_or(export_directory_table.name_pointer_rva as usize, sections, &format!("Cannot map export_directory_table.name_pointer_rva ({:#x}) into offset", export_directory_table.name_pointer_rva))?;
Because export_directory_table.name_pointer_rva will be zero, the enclosing function parse returns immediately with an Err(_).
The same goes for the AddressOfNameOrdinals and AddressOfFunctions tables. A quick-and-dirty fix might be checking the values of export_directory_table.number_of_name_pointers and export_directory_table.address_table_entries against 0 before parsing the offsets of these tables:
let mut export_name_pointer_table: ExportNamePointerTable = Vec::with_capacity(number_of_name_pointers);
let mut export_ordinal_table: ExportOrdinalTable = Vec::with_capacity(number_of_name_pointers);
if number_of_name_pointers > 0 {
    let name_pointer_table_offset = &mut utils::find_offset_or(export_directory_table.name_pointer_rva as usize, sections, &format!("Cannot map export_directory_table.name_pointer_rva ({:#x}) into offset", export_directory_table.name_pointer_rva))?;
    for _ in 0..number_of_name_pointers {
        export_name_pointer_table.push(bytes.gread_with(name_pointer_table_offset, scroll::LE)?);
    }
    let export_ordinal_table_offset = &mut utils::find_offset_or(export_directory_table.ordinal_table_rva as usize, sections, &format!("Cannot map export_directory_table.ordinal_table_rva ({:#x}) into offset", export_directory_table.ordinal_table_rva))?;
    for _ in 0..number_of_name_pointers {
        export_ordinal_table.push(bytes.gread_with(export_ordinal_table_offset, scroll::LE)?);
    }
}
let mut export_address_table: ExportAddressTable = Vec::with_capacity(address_table_entries);
if address_table_entries > 0 {
    let export_address_table_offset = utils::find_offset_or(export_directory_table.export_address_table_rva as usize, sections, &format!("Cannot map export_directory_table.export_address_table_rva ({:#x}) into offset", export_directory_table.export_address_table_rva))?;
    let export_end = export_rva + size;
    let offset = &mut export_address_table_offset.clone();
    for _ in 0..address_table_entries {
        let rva: u32 = bytes.gread_with(offset, scroll::LE)?;
        if utils::is_in_range(rva as usize, export_rva, export_end) {
            export_address_table.push(ExportAddressTableEntry::ForwarderRVA(rva));
        } else {
            export_address_table.push(ExportAddressTableEntry::ExportRVA(rva));
        }
    }
}
Clients should never have to import, and implement error routes for, scroll when they use goblin (unless of course they also use scroll, but that is orthogonal). It is, in effect, an internal library within goblin.
Comparing goblin's implementation and my current working memory of dyld, I have two observations:
1. LC_MAIN and LC_UNIXTHREAD both provide entrypoint locations, but the environments provided by each are not interchangeable. (See dyld.cpp, dyldStartup.s.) Last time I needed to know the entrypoint of a Mach-O executable, I had to know which kind it was; struct MachO should probably retain this distinction.
2. What about the LC_UNIXTHREAD thread state? You know, the arch-specific thread states which are not currently handled by goblin? Good news: they don't matter in the slightest. dyld uses only the instruction pointer, and the rest are entirely discarded. This makes them unusable in practice, and thus they're always zero.
It would pass out references to Syms, which are byte-casted from a backing &[u8], and have a new -> Result API (which validates the bounds). We'd need two for elf32 and elf64, probably; or we can initialize as 32 or 64 bit (or pass a container context, which has less boolean blindness). This will be tricky and annoying, I think, due to type name punning stuff, so it's probably easiest to just add two typed versions and re-export them.
Similar to strtab, I would also want it to implement Index, so it can be literally drop-in replaced in code that previously used a &[Sym].
Another approach could just provide a newtype wrapper on &[Sym] that validates the backing bytes and a provided count, and then Derefs to a &[Sym] so we get indexing for free.
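A sketch of that newtype approach, with a placeholder Sym (not goblin's type): validation happens once in the constructor, and Deref gives slice indexing for free.

```rust
use std::ops::Deref;

// Placeholder Sym, not goblin's: just enough to show the pattern.
struct Sym { value: u64 }

// Newtype over a validated slice: bounds are checked once at
// construction, then Deref hands out &[Sym] so indexing, iteration,
// and the rest of the slice API come for free.
struct Symtab<'a>(&'a [Sym]);

impl<'a> Symtab<'a> {
    fn new(syms: &'a [Sym], count: usize) -> Result<Symtab<'a>, String> {
        if count > syms.len() {
            return Err(format!("count {} exceeds backing slice of {}", count, syms.len()));
        }
        Ok(Symtab(&syms[..count]))
    }
}

impl<'a> Deref for Symtab<'a> {
    type Target = [Sym];
    fn deref(&self) -> &[Sym] { self.0 }
}

fn main() {
    let backing = [Sym { value: 1 }, Sym { value: 2 }];
    let symtab = Symtab::new(&backing, 2).unwrap();
    assert_eq!(symtab[1].value, 2); // slice indexing via Deref
    assert_eq!(symtab.len(), 2);    // slice methods too
    assert!(Symtab::new(&backing, 3).is_err());
}
```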
Lots of options.
If we want to get fancy-pantsy, we might be able to generify it over both ELF and Mach symbols, and have it return references via something like:
get::<Symbol>(index) -> Result<&Symbol>
but that might not be worth the effort.
Hey @m4b, I saw you added structure definitions for 32-bit ELF files. Is there any plan to implement Elf::from_fd for 32 bits, too? I only had a brief look at it, but it looks like you'd need to duplicate most of the code for the 64-bit ELF files.
tests, more, add them.
Many overflow issues should disappear once upgraded to latest scroll
/cc @sanxiyn
I wrote some code to read the PDB70 info out of a PE using goblin, to implement the equivalent of symstore.exe. I found that CodeviewPDB70DebugInfo::signature wasn't really that useful as raw bytes, since it's intended to be a GUID in little-endian byte order. I wound up pulling in byteorder and writing a little function like:
fn sig_to_uuid(sig: &[u8; 16]) -> Result<Uuid, Error> {
    let mut rdr = Cursor::new(sig);
    Ok(Uuid::from_fields(rdr.read_u32::<LittleEndian>()?,
                         rdr.read_u16::<LittleEndian>()?,
                         rdr.read_u16::<LittleEndian>()?,
                         &sig[8..])?)
}
...but it seems likely that anyone touching this data would need the same thing. Since you're already using scroll here, it ought to be trivial to do this. I don't think the uuid crate is a particularly big dependency (and it's no-std by default; you have to enable the use_std feature explicitly).
What about rlib support, used by Rust itself? It's mostly an ar archive afaik, and the ar crate supports the Linux variant of it (no support for the macOS version for some reason).
This will essentially make the entire crate zero-allocation, lazy, and parallelizable.
See the lazy_transducer documentation for information if anyone wants to tackle this.
I've recently embedded C into a Rust project that needed the dl_iterate_phdr() interface.
From the OpenBSD manual:
SYNOPSIS
     #include <link.h>

     int
     dl_iterate_phdr(int (*callback)(struct dl_phdr_info *, size_t, void *),
         void *data);

DESCRIPTION
     The dl_iterate_phdr() function iterates over all shared objects loaded
     into a process's address space, calling callback for each shared object,
     passing it information about the object's program headers and the data
     argument.
The interface is somewhat portable, but there are slightly different semantics across platforms.
I would have written my interfacing code in Rust, but decided against it due to the need to define (and keep in sync) the required ELF structs.
Since goblin is fundamentally concerned with these structures, I wonder if goblin would be a good place to implement a Rust interface to dl_iterate_phdr()?
Thanks
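For reference, a minimal sketch of what the FFI surface might look like (Linux/glibc only; the dl_phdr_info fields are deliberately elided since this callback only counts entries, and a real binding would mirror the full struct from <link.h>):

```rust
use std::os::raw::{c_int, c_void};

// Opaque stand-in: a real binding would mirror struct dl_phdr_info.
#[repr(C)]
struct DlPhdrInfo { _opaque: [u8; 0] }

extern "C" {
    // glibc exports this from libc; signature from the man page.
    fn dl_iterate_phdr(
        callback: unsafe extern "C" fn(*mut DlPhdrInfo, usize, *mut c_void) -> c_int,
        data: *mut c_void,
    ) -> c_int;
}

// Callback: just count the loaded shared objects.
unsafe extern "C" fn count(_info: *mut DlPhdrInfo, _size: usize, data: *mut c_void) -> c_int {
    *(data as *mut usize) += 1;
    0 // 0 means "keep iterating"
}

fn main() {
    let mut n: usize = 0;
    unsafe { dl_iterate_phdr(count, &mut n as *mut usize as *mut c_void) };
    println!("{} loaded objects", n);
    assert!(n >= 1); // at least the main executable itself
}
```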
Currently it's tied to std, which forces non-std consumers to do without the strtab (redox, for example). This isn't much of an issue for most people, but in the future it will just be nice to have the strtab not tied to std (which it theoretically shouldn't need to be).
In goblin::mach::symbols::Nlist, the str_x field has type usize. Since this represents an external format, it should have a host-independent type.
For comparison, LLVM's corresponding data structure uses u32 for this field: https://github.com/llvm-mirror/llvm/blob/4604874612fa292ab4c49f96aedefdf8be1ff27e/include/llvm/BinaryFormat/MachO.h#L964
When using goblin with error chains, if I want to match on the errors from things like elf.dynstrtab.get(), I need to handle the scroll::error::Error error. This would require me to also include the scroll crate.
It would be nice if this became a goblin error instead, so I don't need to explicitly include another crate to handle this error.
Nonstandard configurations need CI coverage so people (like me) don't break them by accident.
make api runs:
cargo build --no-default-features
cargo build --no-default-features --features="std"
cargo build --no-default-features --features="elf32"
cargo build --no-default-features --features="elf32 elf64"
cargo build --no-default-features --features="elf32 elf64 std"
cargo build --no-default-features --features="elf32 elf64 endian_fd"
cargo build --no-default-features --features="archive"
cargo build --no-default-features --features="mach64"
cargo build --no-default-features --features="mach32"
cargo build --no-default-features --features="mach64 mach32"
cargo build --no-default-features --features="pe32"
cargo build --no-default-features --features="pe32 pe64"
cargo build
Is this the right list for Travis to build? Should Travis call make api, or should this get inlined into .travis.yml?
Should cargo test work in any configuration besides --default-features? It doesn't now, but it could be fixed and tested going forward.
load_command::CommandVariant::LoadWeakDylib isn't handled, which may result in a panic when retrieving imports.
Hmm, ok, filing an issue since I don't quite understand the interaction here...
If I compile goblin 8bbbcf5 directly with cargo +nightly build --no-default-features --features alloc, it works fine.
If I try to compile the object crate with the goblin dependency declared as follows:
[dependencies.goblin]
git = "https://github.com/m4b/goblin"
default-features = false
features = ["alloc", "endian_fd", "elf32", "elf64", "mach32", "mach64", "pe32", "pe64", "archive"]
Using cargo +nightly build --no-default-features or cargo +nightly build -v --no-default-features --features goblin/alloc, I get:
Compiling goblin v0.0.16 (https://github.com/m4b/goblin#8bbbcf5d)
error[E0432]: unresolved import `alloc::btree_map`
--> /Users/gz/.cargo/git/checkouts/goblin-d0c041c0a85ca4ca/8bbbcf5/src/archive/mod.rs:15:12
|
15 | use alloc::btree_map::BTreeMap;
| ^^^^^^^^^ Could not find `btree_map` in `alloc`
error: aborting due to previous error
For more information about this error, try `rustc --explain E0432`.
error: Could not compile `goblin`.
To learn more, run the command again with --verbose.
The following diff gz@a6369ac fixes the problem.
It's time to get Real Serious ™️ and add logging to goblin.
There are a number of places in goblin where things have "gone wrong", but not enough that we shouldn't parse.
Refactoring dyn to check the index of DT_NEEDED, to fix a bug found while fuzzing (ref #27), we should continue parsing, but the client only receives a None; this is fine for clients, but we may also want to know why we received None, hence a warn! would be appropriate.
There are also many times when debugging, e.g., #28, that I just need to see the execution state at the point of failure, which is precisely what debug! is for.
So: extern the log crate; use debug! for extremely verbose stuff, info! maybe, and warn! where the binary is malformed in some way, e.g., as in the first example. See #14.
This might be more tedious. We should also revert the archive header name back to the original byte array, to keep repr(C), which should allow the implementation to be derive(Pwrite, Pread).
E.g.:
impl fmt::Debug for Rel {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let sym = r_sym(self.r_info);
        let typ = r_type(self.r_info);
        write!(f,
               "r_offset: {:x} r_typ: {} r_sym: {}",
               self.r_offset,
               typ,
               sym)
    }
}
It should use a debug_struct so it can pretty print, etc. It just looks crappy now :/
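For comparison, a sketch of the same impl using debug_struct, with standalone stand-ins for Rel and the r_sym/r_type helpers (ELF64 packing: symbol in the high 32 bits, type in the low):

```rust
use std::fmt;

// Stand-ins for the real types/helpers, just to show the pattern.
struct Rel { r_offset: u64, r_info: u64 }

fn r_sym(info: u64) -> u32 { (info >> 32) as u32 }
fn r_type(info: u64) -> u32 { info as u32 }

impl fmt::Debug for Rel {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        f.debug_struct("Rel")
            .field("r_offset", &format_args!("{:#x}", self.r_offset))
            .field("r_sym", &r_sym(self.r_info))
            .field("r_type", &r_type(self.r_info))
            .finish()
    }
}

fn main() {
    let rel = Rel { r_offset: 0x1000, r_info: (5u64 << 32) | 7 };
    println!("{:#?}", rel); // pretty-prints one field per line
}
```

With debug_struct, the alternate flag ({:#?}) comes for free, so nested structures indent properly instead of the hand-rolled single-line output.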
greadelf can produce data like:
Displaying notes found in: .note.ABI-tag
Owner Data size Description
GNU 0x00000010 NT_GNU_ABI_TAG (ABI version tag)
OS: Linux, ABI: 2.6.24
Displaying notes found in: .note.gnu.build-id
Owner Data size Description
GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring)
Build ID: 42f22997b0796cdd2f49d3f3bd148081b8fe2845
Displaying notes found in: .note.gnu.gold-version
Owner Data size Description
GNU 0x00000009 NT_GNU_GOLD_VERSION (gold version)
Version: gold 1.11
I want this data -- particularly NT_GNU_BUILD_ID -- for collating with external debugging data.
My understanding is that the linker (usually?) consolidates all the note sections into a single PT_NOTE segment, and that the segment remains parseable even if the section headers are stripped.
I think the PT_NOTE segment is a series of target-endian structs like:
struct Note<'a> {
    namesz: u32,
    descsz: u32,
    type: u32,
    name: &'a [u8], // NUL terminated string, where `namesz` includes the terminator
    // padding such that namesz + padding % 4 == 0
    desc: &'a [u8], // arbitrary data of length `descsz`
    // padding such that descsz + padding % 4 == 0
}
The meaning of type depends on name, meaning that if I want to determine that value, I need to find a note having both name == b"GNU\0" and type == NT_GNU_BUILD_ID == 3.
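A sketch of that search under the assumptions above (native little-endian, 4-byte padding; not goblin's API):

```rust
use std::convert::TryInto;

// Sketch, not goblin's API: walk a PT_NOTE segment assuming native
// little-endian and 4-byte padding, returning the desc of the note
// whose name and type both match.
fn align4(n: usize) -> usize { (n + 3) & !3 }

fn find_note<'a>(seg: &'a [u8], name: &[u8], typ: u32) -> Option<&'a [u8]> {
    let mut off = 0;
    while off + 12 <= seg.len() {
        let get = |i: usize| u32::from_le_bytes(seg[off + i..off + i + 4].try_into().unwrap());
        let (namesz, descsz, ntype) = (get(0) as usize, get(4) as usize, get(8));
        let name_start = off + 12;
        let desc_start = name_start + align4(namesz); // name padded to 4
        let next = desc_start + align4(descsz);       // desc padded to 4
        if next > seg.len() {
            return None; // malformed: sizes run past the segment
        }
        if &seg[name_start..name_start + namesz] == name && ntype == typ {
            return Some(&seg[desc_start..desc_start + descsz]);
        }
        off = next;
    }
    None
}

fn main() {
    // One synthetic note: name "GNU\0", type 3 (NT_GNU_BUILD_ID).
    let mut seg = Vec::new();
    seg.extend(&4u32.to_le_bytes()); // namesz, includes the NUL
    seg.extend(&4u32.to_le_bytes()); // descsz
    seg.extend(&3u32.to_le_bytes()); // type
    seg.extend(b"GNU\0");            // already 4-byte aligned
    seg.extend(&[0xde, 0xad, 0xbe, 0xef]); // fake build id
    assert_eq!(find_note(&seg, b"GNU\0", 3), Some(&[0xde, 0xad, 0xbe, 0xef][..]));
}
```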
For the simple case, where you're reading a binary of native endianness, it would be nice if the API allowed doing this without copying memory. @nrc has a very simple elf parser that uses his zero crate to achieve this. The API winds up looking like:
pub fn parse_header<'a>(input: &'a [u8]) -> Header<'a>
Your current implementation is very close, but it does clone the resulting Header:
Line 110 in 7d21ea4
It would be even nicer if the API returned an error for non-native endianness so consumers could fall back to from_fd_endian.
Tracking issue for fuzz-related stuff.
We'll start using a corpus now. In particular, I'd like to see the PE and Mach backends fuzzed extensively; I'm sure they have more bugs.
/cc @sanxiyn
I'm getting panics on this line while trying to parse a particular executable. Unfortunately, this executable is proprietary so I can't share it, and I don't know enough about the PE format to understand what's going on here.
I did however write a script to run through the PE executables on my machine, which found that a random GDAL distribution I had laying around includes a curl.exe that causes the exact same panic. curl is something I can share, so steps to reproduce are:
$ wget -q https://s3.willglynn.com/goblin/curl.exe
$ RUST_BACKTRACE=1 cargo run --example rdr curl.exe
Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
Running `target/debug/examples/rdr curl.exe`
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', src/libcore/option.rs:335
stack backtrace:
0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
1: std::panicking::default_hook::{{closure}}
2: std::panicking::default_hook
3: std::panicking::rust_panic_with_hook
4: std::panicking::begin_panic
5: std::panicking::begin_panic_fmt
6: rust_begin_unwind
7: core::panicking::panic_fmt
8: core::panicking::panic
9: <core::option::Option<T>>::unwrap
10: goblin::pe::import::ImportLookupTableEntry::parse
11: goblin::pe::import::SyntheticImportDirectoryEntry::parse
12: goblin::pe::import::ImportData::parse
13: goblin::pe::PE::parse
14: goblin::parse
15: rdr::run
16: rdr::main
17: __rust_maybe_catch_panic
18: std::rt::lang_start
19: main
$ git rev-parse --short HEAD
1595f19
Come to think of it, I bet my original executable statically links libcurl, so even though these executables are from totally different environments, that might be a common thread.
See #14.
When we are building the crate / running tests in Fedora, we use release mode. Do you care about overflow? For example, you can trigger one by replacing the 8 bytes at 0x20 (e_phoff) of an ELF-64 binary with 0xff repeated 8 times.
$ RUST_BACKTRACE=1 cargo run --example rdr -- elf
thread 'main' panicked at 'attempt to add with overflow', /home/thomas/.cargo/registry/src/github.com-1ecc6299db9ec823/scroll-0.5.0/src/greater.rs:140
stack backtrace:
10: <[u8] as scroll::greater::TryOffsetWith<Ctx>>::try_offset
at /home/thomas/.cargo/registry/src/github.com-1ecc6299db9ec823/scroll-0.5.0/src/greater.rs:140
11: scroll::greater::Gread::gread_with
at /home/thomas/.cargo/registry/src/github.com-1ecc6299db9ec823/scroll-0.5.0/src/greater.rs:69
12: goblin::elf::program_header::std::ProgramHeader::parse
at src/elf/program_header.rs:142
13: goblin::elf::impure::Elf::parse
at src/elf/mod.rs:156
14: goblin::parse
at src/lib.rs:273
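The trigger described above can be sketched in memory (the 64-byte buffer is just a fake ELF64-header-sized image for illustration):

```rust
use std::convert::TryInto;

// Sketch of the corruption: stamp 0xff over the 8 bytes of e_phoff
// at offset 0x20 of an ELF64 image.
fn corrupt_phoff(elf: &mut [u8]) {
    elf[0x20..0x28].copy_from_slice(&[0xff; 8]); // e_phoff = u64::MAX
}

fn main() {
    let mut image = vec![0u8; 64]; // just an ELF64-header-sized buffer
    image[..4].copy_from_slice(b"\x7fELF");
    corrupt_phoff(&mut image);
    let e_phoff = u64::from_le_bytes(image[0x20..0x28].try_into().unwrap());
    // Any parser that computes e_phoff + x without checked_add overflows.
    assert_eq!(e_phoff, u64::MAX);
}
```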
Goblin assumes that the integers inside the headers have the same byte order as the current platform. This obviously fails when reading ELF files meant for architectures with different endianness. Do you have any plans to add (or accept PRs for) byte-order-aware reading of headers?
Like incoming mach parser, add "zero-copy" implementation to elf.
E.g.:
https://github.com/m4b/goblin/blob/better_mach/src/mach/mod.rs#L21-L33
This will require removal of the try_from API for taking an owned fd and creating the struct, as well as updating the Strtab's to use the lifetime of the Elf struct, and a couple more optimizations we can perform.
In addition, it might be nice to add Exports, Imports, and relocations for lazy parsing, but that can be a future issue.
e.g., EHDR_SIZE -> SIZEOF_EHDR
Right now some of the debug prints are hard to read because they spit out byte arrays as arrays instead of something more readable. In particular this affects things like section names.
I'm trying to use goblin as the object file loader in the gimli crate. Basically all it needs to do is parse the ELF/Mach-O header and return the data for sections with a given name.
However, I'm having trouble getting Mach-O sections, because Segment::sections() returns sections that have the lifetime of the segment, instead of the data. That is, Segment::sections() is defined as:
impl<'a> Segment<'a> {
    pub fn sections<'b>(&'b self) -> error::Result<Vec<Section<'b>>> {
        ...
    }
}
but I want:
impl<'a> Segment<'a> {
    pub fn sections<'b>(&'b self) -> error::Result<Vec<Section<'a>>> {
        ...
    }
}
Fixing this will probably require changing the scroll::Gread trait, but I'm having too much trouble understanding how that works to be able to fix it myself.
For reference, here's how I'm trying to call it:
fn macho_get_section<'a>(macho: &mach::MachO<'a>, section_name: &str) -> Option<&'a [u8]> {
    let segment_name = "__DWARF";
    let section_name = macho_translate_section_name(section_name);
    for segment in &*macho.segments {
        if let Ok(name) = segment.name() {
            if name == segment_name {
                if let Ok(sections) = segment.sections() {
                    for section in sections {
                        if section_name == parse_section_name(&section.sectname[..]) {
                            return Some(section.data);
                        }
                    }
                }
            }
        }
    }
    None
}
---- iter_symbols stdout ----
thread 'iter_symbols' panicked at 'called `Result::unwrap()` on an `Err` value: Malformed("LoadCommandHeader: LC_UNKNOWN size: 1207959552 has size larger than remainder of binary: 8464")', src/libcore/result.rs:906:4
note: Run with `RUST_BACKTRACE=1` for a backtrace.
---- parse_sections stdout ----
thread 'parse_sections' panicked at 'called `Result::unwrap()` on an `Err` value: Malformed("LoadCommandHeader: LC_UNKNOWN size: 1207959552 has size larger than remainder of binary: 8464")', src/libcore/result.rs:906:4
It can parse them without dying, but the filename delimiters are different (it uses #), so it doesn't really parse them at all.