A limitation of the --wrap method is that it cannot intercept calls to a function made from inside the same compilation unit in which that function is defined. This can be solved by moving to-be-intercepted functions into their own .cpp files, but that hurts the readability of the original code, requires changes to the original code when adding wrapped functions in the testsuite, and makes the wrapping more likely to break when the original code is changed. So I'd like to see this limitation lifted. I've been digging into the ld sources to find a way around this, and this issue presents a couple of options for this. Below is a more generic writeup for handling wrapping, not necessarily limited to the PowerFake usage.
How ld does linking
In a gross oversimplification of the tremendously complex linking process, here's what ld does when it links an executable. I've mostly looked at linking elf object files into elf executables (specifically elf64-x86-64), but I think most of this will be similar for other targets as well (and the elf target is what we're using in almost all cases anyway).
Resolving symbols
- For each object file in turn, each global (exported) symbol in the file's symbol table is considered (elf_link_add_object_symbols()) and entered into a global (in-memory) symbol table (that starts out empty). Note that the symbol table also contains undefined symbol entries, for symbols that are referenced but not defined.
- This starts by looking up any existing symbol by the same name in the global symbol table (_bfd_elf_merge_symbol()). If the name is not in the global table yet, it is simply added. If it is already there, the existing symbol and the new symbol are merged (_bfd_elf_merge_symbol() and _bfd_generic_link_add_one_symbol()). This merging handles things like a strong symbol replacing a weak symbol (or a weak symbol being discarded when there is already a strong symbol), a strong symbol replacing a previously undefined symbol, raising an error when trying to merge two strong symbols, etc.
- In any case, the resulting symbol is associated with the symbol table entry in the current file. This resulting symbol may be the same symbol (typical for strong symbols, or weak symbols that are not defined yet), but it can also point to a different symbol (typical for an undefined symbol entry which was already defined in a previous object file).
- For example, consider processing an UNDEF foo() entry first (from a file that references but does not define foo()). This creates a new UNDEF foo() entry in the global symbol table, which is associated with the entry in the current file's symbol table. Then, in another file that actually defines foo(), a DEF foo() entry is processed. This looks in the global table, finds the existing UNDEF entry, and merges it with the new DEF entry by overwriting the existing UNDEF entry with the new DEF entry. This also causes the (UNDEF) entry in the first file to be associated with this new, merged DEF symbol, so it can be found later.
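The merge rules above can be sketched as a toy model. This is not how ld is implemented; the names Kind, LinkError, merge and add_symbol are made up for illustration, and the model only covers the three cases mentioned in the text (undefined, weak, strong).

```python
from enum import Enum


class Kind(Enum):
    # Ordered by "strength": a higher value replaces a lower one on merge.
    UNDEF = 0
    WEAK = 1
    DEF = 2  # strong definition


class LinkError(Exception):
    """Raised when two strong definitions of the same name collide."""


def merge(existing, new):
    """Return the (kind, file) entry that survives merging two entries."""
    if existing[0] is Kind.DEF and new[0] is Kind.DEF:
        raise LinkError("multiple definition")
    # The stronger entry wins; on a tie the existing entry is kept.
    return new if new[0].value > existing[0].value else existing


def add_symbol(table, name, kind, file):
    """Enter one file symbol into the toy global table and return the result."""
    entry = (kind, file)
    table[name] = merge(table[name], entry) if name in table else entry
    return table[name]


table = {}
add_symbol(table, "foo", Kind.UNDEF, "main.o")  # reference seen first
add_symbol(table, "foo", Kind.DEF, "foo.o")     # definition overwrites UNDEF
print(table["foo"])
```

Processing the UNDEF entry before the DEF entry mirrors the example in the list above: the global entry for foo ends up as the definition from the second file.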
Resolving relocations
- When you write a function call in the code, the compiler generates a dummy instruction, e.g. call 0x0. It then also leaves an instruction for the linker, saying "Put the address of the global symbol foo at this address (i.e. after the call)". This instruction is called a relocation.
- In the relocation, "the address of the global symbol foo" is not so explicit. Instead, a relocation refers to an entry in the file's symbol table.
- If the compilation unit does not define foo() itself, the symbol table contains an UNDEF foo() entry. After resolving symbols, as described above, the relocation is resolved by looking at the associated file symbol table entry, which has an associated global symbol table entry, which is now a DEF foo() entry that tells the linker where foo() is actually defined.
- If the compilation unit does define foo() itself, the symbol table contains a DEF foo() entry. In most cases, the associated global symbol table entry will be this same DEF foo() entry, so the linker resolves relocations for foo() to the definition in this file itself. There can be exceptions, e.g. when foo() is weakly defined, then the actual foo() to be used by the relocation can still be in a different file.
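As a rough illustration of the chain "relocation, file symbol table entry, global symbol table entry", here is a toy model. The input shape and the names GlobalSym, link and resolve_relocation are assumptions made for this sketch; weak symbols are left out to keep it short.

```python
class GlobalSym:
    """One entry in the toy global symbol table, updated in place on merge."""

    def __init__(self, name):
        self.name = name
        self.defined_in = None  # None means the entry is still UNDEF


def link(objects):
    """Resolve symbols for a list of (filename, {symbol: is_defined}) pairs.

    Returns {filename: {symbol: GlobalSym}}, i.e. the association between
    each file symbol table entry and its global entry.
    """
    table = {}
    assoc = {}
    for filename, symbols in objects:
        assoc[filename] = {}
        for name, is_defined in symbols.items():
            entry = table.setdefault(name, GlobalSym(name))
            if is_defined:
                if entry.defined_in is not None:
                    raise ValueError(f"multiple definition of {name}")
                entry.defined_in = filename  # UNDEF becomes DEF in place
            assoc[filename][name] = entry
    return assoc


def resolve_relocation(assoc, filename, name):
    """Follow file entry -> global entry to find the defining file."""
    target = assoc[filename][name].defined_in
    if target is None:
        raise ValueError(f"undefined reference to {name}")
    return target


objects = [
    ("main.o", {"main": True, "foo": False}),  # calls foo(), does not define it
    ("foo.o", {"foo": True}),                  # defines foo()
]
assoc = link(objects)
print(resolve_relocation(assoc, "main.o", "foo"))
```

Because the global entry is updated in place, the UNDEF entry from main.o automatically sees the definition that foo.o supplies later.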
Implementing --wrap
To implement the --wrap option, the linker changes the second step in the resolving symbols procedure. When it "looks up any existing symbol by the same name in the global symbol table", and the name to look up was passed to --wrap, it actually does a lookup for __wrap_<name> instead. Similarly, when it has to look up __real_<name>, it looks up <name> instead. Simple, but quite powerful.
Except that it can only do this for UNDEF entries in the file's symbol table. Consider what would happen otherwise: There is a DEF foo() entry in one file. The global table lookup uses __wrap_foo instead, and finds an existing entry DEF __wrap_foo(), which is the wrapper to be used. Trying to merge these two entries will fail, since both are strong definitions. If you would instead discard the DEF foo() (as if it were weak) and associate the DEF __wrap_foo() entry with the file symbol table entry, then relocations (calls to foo()) in the same compilation unit would correctly resolve to __wrap_foo(). However, you would have discarded the original foo() entry, so you can no longer access it through __real_foo().
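The name rewriting described above, including the restriction to UNDEF entries, can be sketched as a single function. This is a simplified model, not the ld source; lookup_name and its parameters are hypothetical names.

```python
def lookup_name(name, is_undef, wrapped):
    """Return the name used for the global table lookup of a file symbol.

    name     -- symbol name as it appears in the file's symbol table
    is_undef -- True for an UNDEF entry, False for a definition
    wrapped  -- set of names that were passed to --wrap
    """
    if not is_undef:
        return name  # definitions are never redirected
    if name in wrapped:
        return "__wrap_" + name
    prefix = "__real_"
    if name.startswith(prefix) and name[len(prefix):] in wrapped:
        return name[len(prefix):]
    return name


wrapped = {"foo"}
print(lookup_name("foo", True, wrapped))         # reference from another file
print(lookup_name("__real_foo", True, wrapped))  # reference from the wrapper
print(lookup_name("foo", False, wrapped))        # entry in the defining file
```

The last call shows the limitation this writeup is about: in the compilation unit that defines foo(), the file symbol table entry is a DEF, so calls made through it keep resolving to the original foo().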
I guess the linker could have also (in addition to looking up __wrap_foo() and associating the result with the current entry) put the original DEF foo() entry in the global symbol table, without associating that entry with any symbol table entry in the current file (but putting it out there to be associated with other entries in other files later), but maybe this didn't seem relevant, or maybe this has a ton of unexpected side effects (the linker is horrendously complex after all, I'm just showing the supersimplified version of it here).
Diverting local calls
This analysis does suggest two possible ways you can still divert a call to a locally defined function to some other function:
- Make the locally defined function weak, so it can be overridden by something else (without using --wrap)
- Make the relocations point to an UNDEF foo() symbol that can be wrapped, and have a second DEF foo() symbol with the actual definition. This is essentially what you do when you move the definition of foo() into its own source file, but I think this could be done inside a single .o file as well.
In addition, Greg Carter shows that you can also use e.g. -Wl,--defsym,foo=__wrap_foo as a linker option to forcibly replace the foo function with the wrapper. I haven't fully investigated how this works in the linker internals, but I believe that this approach loses access to the original symbol (unless you duplicate it under another name by modifying the object files, as suggested below), so I haven't investigated this option much further.
How to wrap local calls
So, how can you then actually achieve --wrap for local calls? The above suggests some ingredients; below I'll mix those into a couple of different (but similar) approaches.
A downside of all the approaches below is that they require inspecting and modifying the object files in the build. A solution where you would not need to inspect the object files at all (or maybe just the object files that contain the wrappers, to get a list of wrapped functions) and handle everything by just adding compiler and/or linker flags would be ideal, but I haven't been able to figure out a way to allow this.
1. Weakening symbols, without --wrap
- Define the original symbols as weak, or use objcopy --weaken-symbol to do so after compilation (to prevent having to modify the original source files).
- Define the wrapper using the same name as the original, so it will replace the original.
- Do not use --wrap anymore.
- This is essentially what Peter Huewe proposes here, except they also suggest using --globalize-symbol to ensure the symbol is global. I think that's only needed for non-exported symbols (i.e. globals with the static keyword), and making those global might end up creating conflicts that were not previously present, so this must be done with care.
- Con: This loses access to the original function, so you cannot really "wrap" an existing symbol, only replace it.
- Con: Must know which symbol is the original and which is the wrapper (to not modify the wrapper).
- Con: If the original function has weak versions, the processed .o files can no longer be used to link the original executable, without adding the wrappers (since now all versions of the original name are weak, so a different version might be chosen than before making the one strong version weak).
2. Weakening symbols and adding a __real_ version, without --wrap
- Like above: Use objcopy --weaken-symbol to mark the original as weak.
- Use objcopy --add-symbol to add a new symbol called __real_<name> with the same address (i.e. pointing to the same bit of compiled code). This new symbol should ideally be a perfect copy (same section, size, visibility, flags, weakness, etc.), and the copy should be made in all compilation units that have the function (except where the wrapper is defined), so that if the existing symbol already has multiple copies (e.g. a weak and strong version), it will still resolve as without these changes.
- Like above: Define the wrapper using the same name as the original, so it will replace the original.
- Call __real_<name> from the wrapper to access the original.
- This is essentially what Javier Escalada proposes here, except they do not make a perfect copy of the symbol.
- Con: Wrapper is named the same as original, which can be confusing.
- Con: Build process must know where wrappers are defined (to not modify the wrapper).
- Con: Unsure if objcopy can create a proper identical copy, so might require building a custom command or script to parse and modify the elf file.
- Con: If the original function has weak versions, the processed .o files can no longer be used to link the original executable, without adding the wrappers (since now all versions of the original name are weak, so a different version might be chosen than before making the one strong version weak).
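To make the intended symbol table change for this option concrete, here is a toy model that operates on a list of dicts rather than on a real object file. The dict layout and the name weaken_and_alias are made up for this sketch; it says nothing about whether objcopy can actually produce a perfect copy.

```python
def weaken_and_alias(symtab, name):
    """Model option 2 on a toy symbol table (a list of dicts).

    Every definition of `name` gets a copy called __real_<name> that keeps
    the original strength, and the original entry is then made weak.
    The input list is left untouched.
    """
    result = [dict(sym) for sym in symtab]
    for sym in list(result):  # iterate over a snapshot, append to result
        if sym["name"] == name and sym["kind"] in ("DEF", "WEAK"):
            result.append({**sym, "name": "__real_" + name})
            sym["kind"] = "WEAK"
    return result


# Toy object file that defines foo(); must not be applied to the file
# that defines the wrapper, which uses the same name.
original = [{"name": "foo", "kind": "DEF", "code": "foo_body"}]
for sym in weaken_and_alias(original, "foo"):
    print(sym)
```

In this model the wrapper would be a strong foo in another file, which overrides the weakened entry, while __real_foo still points at the original code.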
3. Splitting symbols, with --wrap
- For each definition of the original symbol, add an identical copy under the same name, and modify the original one to be an UNDEF instead. This results in two symbols by the same name, where relocations point to the UNDEF one and the DEF one points to the implementation, allowing --wrap to be used as normal.
- Con: It does not seem objcopy can do this, so this probably requires building a custom command or script to parse and modify the elf file.
- Con: Having a duplicate name in the symbol table of an .o file might confuse tools? The base specification for the ELF format does not discuss uniqueness of names in the symbol table at all, it seems.
- Pro: Wrapper is named differently, making it potentially clearer.
- Pro: Build process can treat all object files equally, since wrappers can be distinguished from the original by their name.
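The split can be illustrated on a toy symbol table, again a plain list of dicts rather than a real elf file. The layout and the name split_symbol are assumptions for this sketch; the index of an entry stands in for what a relocation refers to.

```python
def split_symbol(symtab, name):
    """Model option 3 on a toy symbol table (a list of dicts).

    Each DEF of `name` is turned into an UNDEF at the same index, so
    relocations that refer to that index now hit a wrappable reference.
    An identical DEF copy is appended to keep the actual definition.
    The input list is left untouched.
    """
    result = [dict(sym) for sym in symtab]
    for sym in list(result):  # iterate over a snapshot, append to result
        if sym["name"] == name and sym["kind"] == "DEF":
            result.append(dict(sym))  # copy still carries the code
            sym["kind"], sym["code"] = "UNDEF", None
    return result


original = [
    {"name": "foo", "kind": "DEF", "code": "foo_body"},
    {"name": "caller", "kind": "DEF", "code": "caller_body"},
]
for index, sym in enumerate(split_symbol(original, "foo")):
    print(index, sym)
```

After the split, index 0 (where local calls to foo() point) is an UNDEF entry, which is exactly the kind of entry that --wrap is able to redirect.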
4. Reimplementing --wrap
- For each definition of the original symbol, add an identical copy named __real_<name> and replace the original entry with an UNDEF __wrap_<name> (so that existing relocations now point to the wrapper).
- For each undefined reference to the original symbol, also replace that with an UNDEF __wrap_<name> entry.
- Con: It does not seem objcopy can do this, so this probably requires building a custom command or script to parse and modify the elf file.
- Pro: Wrapper is named differently, making it potentially clearer.
- Pro: Build process can treat all object files equally, since wrappers can be distinguished from the original by their name.
- Pro: No duplicate names in the .o file symbol table.
- Pro: No dependency on the linker's --wrap implementation.
- Con: The processed .o files can no longer be used to link the original executable, without adding the wrappers.
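A toy version of this rewrite, on the same kind of list-of-dicts symbol table, might look as follows. The layout and the name rewrite_for_wrap are hypothetical; a real tool would have to edit the elf symbol table itself.

```python
def rewrite_for_wrap(symtab, name):
    """Model option 4 on a toy symbol table (a list of dicts).

    Definitions of `name` are copied to __real_<name>, and every entry for
    `name` (defined or not) becomes UNDEF __wrap_<name> at the same index,
    so existing relocations now point at the wrapper.
    The input list is left untouched.
    """
    result = [dict(sym) for sym in symtab]
    for sym in list(result):  # iterate over a snapshot, append to result
        if sym["name"] != name:
            continue
        if sym["kind"] == "DEF":
            result.append({**sym, "name": "__real_" + name})
        sym.update(name="__wrap_" + name, kind="UNDEF", code=None)
    return result


defining_file = [{"name": "foo", "kind": "DEF", "code": "foo_body"}]
calling_file = [{"name": "foo", "kind": "UNDEF", "code": None}]
print(rewrite_for_wrap(defining_file, "foo"))
print(rewrite_for_wrap(calling_file, "foo"))
```

Entries named __wrap_foo or __real_foo are not touched by this function, which matches the point above that all object files, including the one holding the wrapper, can be processed in the same way.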
I'm inclined to further investigate the last option (if we need to modify .o files with custom tooling anyway, we might as well do the entire wrap thing ourselves, which might also simplify things because we no longer need to comply with the linker's requirements on __wrap_ naming). However, the fact that these files are no longer usable as part of a regular build is a bit annoying, and might make option 3 more suitable (if it works; a quick manual edit using https://elfy.io suggests it does). Option 1 is not feasible, since it does not allow calling the original function, and option 2 feels a bit fragile when it comes to existing weak functions.