
Comments (5)

appujee avatar appujee commented on May 23, 2024

Should we create a similar bug for binutils?

from abi-aa.

MaskRay avatar MaskRay commented on May 23, 2024

I think some binutils aarch64 maintainers watch this repository. This issue tracker may be the best place to discuss the possible change. The owners of this repository have a better idea of which binutils aarch64 maintainers should be tagged :)


nsz-arm avatar nsz-arm commented on May 23, 2024

This prevents PLT hooking at runtime (not just lazy binding). For example, glibc's ld.so supports LD_PROFILE and LD_AUDIT even with bind-now (in which case the PLT GOT is not read-only). I think there are also external tools that hook PLTs, such as ltrace (although I think it only places breakpoints on the PLT, which still means it has to know the PLT layout). So this is not an obvious change (it might need additional ELF marking).


smithp35 avatar smithp35 commented on May 23, 2024

From the perspective of just the sysvabi64 document I'd like to write down what the minimal requirements are for the PLT sequences. I think that the two extreme approaches are:

  • A minimalist documentation of the calling conventions for PLT[0] and PLT[N], with no requirements on the instruction sequences and no requirement for a uniform size of PLT[N]. This would maximise the freedom for linker implementers, but make things harder for disassembly/debugging tools, as they would need to handle more possible implementations.
  • A maximalist documentation of the calling conventions and at least some of the heuristics used by disassemblers/debuggers to recognise PLT entries. This has the opposite properties of the first.

I think our approach so far has tended towards the maximalist, as we've provided some dynamic tags such as DT_AARCH64_BTI_PLT and DT_AARCH64_PAC_PLT: https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst#642dynamic-section

I personally tend towards the minimalist approach in the specifications to provide more freedom for implementers who may choose different trade-offs.

I'm wondering if there is a set of properties that we could represent with a combination of dynamic tags so that we don't have to keep introducing new ones. For example:

  • DT_AARCH64_PLT_SIZE = <size of each PLT entry, with a special value for variable sized entries>
  • DT_AARCH64_NOLAZY =
  • DT_AARCH64_PAC_PLT = <PLT entry expects .got.plt entries to be signed>
  • DT_AARCH64_OS_PLT = <PLT has OS specific behaviour described by one or more tags between DT_LOOS and DT_HIOS>

I think DT_AARCH64_BTI_PLT may not be required as DT_AARCH64_PLT_SIZE should work for that purpose.
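
As a rough illustration of how a tool might consume such a combination of tags, here is a minimal Python sketch. It is not part of any specification. The tags are keyed by name because the proposed ones have no assigned values. The helper names (`PltProperties`, `describe_plt`), the choice of 0 as the "variable sized" marker, the 16-byte default, and the reading of DT_AARCH64_NOLAZY as "lazy binding is not used" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: the "special value for variable sized entries" is 0.
# The proposal does not say what the value would be.
VARIABLE_SIZED = 0

# Assumption: with no size tag, entries use the classic 16-byte
# (four-instruction) sequence.
DEFAULT_ENTRY_SIZE = 16


@dataclass(frozen=True)
class PltProperties:
    entry_size: Optional[int]  # None means variable-sized entries
    lazy_binding: bool         # False when DT_AARCH64_NOLAZY is present
    signed_got_plt: bool       # True when DT_AARCH64_PAC_PLT is present
    os_specific: bool          # True when DT_AARCH64_OS_PLT is present


def describe_plt(dynamic: Mapping[str, int]) -> PltProperties:
    """Summarise PLT properties from a {tag name: d_val} mapping."""
    size = dynamic.get("DT_AARCH64_PLT_SIZE", DEFAULT_ENTRY_SIZE)
    return PltProperties(
        entry_size=None if size == VARIABLE_SIZED else size,
        lazy_binding="DT_AARCH64_NOLAZY" not in dynamic,
        signed_got_plt="DT_AARCH64_PAC_PLT" in dynamic,
        os_specific="DT_AARCH64_OS_PLT" in dynamic,
    )
```

For example, `describe_plt({"DT_AARCH64_PLT_SIZE": 24, "DT_AARCH64_NOLAZY": 0})` would report fixed 24-byte entries with lazy binding disabled, so a disassembler could step through the PLT without pattern-matching instruction sequences.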

For this particular case we have documented the calling convention for PLT[0], but not for PLT[N], which is strongly implied by PLT[0]. I think this could be made explicit by stating that when lazy loading is permitted, ip0 (x16) contains the address of the .got.plt entry corresponding to PLT[N], and that its contents are unspecified otherwise.
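
To make the x16 convention concrete, here is a small Python sketch of the arithmetic behind the conventional four-instruction PLT[N] sequence (`adrp x16` / `ldr x17, [x16, #lo12]` / `add x16, x16, #lo12` / `br x17`). The function names and the addresses used in the usage note are made up for illustration, and the sketch assumes that conventional sequence rather than any particular linker's output.

```python
PAGE_SIZE = 0x1000          # adrp works in 4 KiB pages
PAGE_MASK = PAGE_SIZE - 1


def plt_n_operands(plt_entry_addr: int, got_plt_entry_addr: int) -> tuple[int, int]:
    """Return (page delta for adrp, low 12 bits for ldr/add)."""
    page_delta = (got_plt_entry_addr & ~PAGE_MASK) - (plt_entry_addr & ~PAGE_MASK)
    lo12 = got_plt_entry_addr & PAGE_MASK
    return page_delta, lo12


def x16_at_branch(plt_entry_addr: int, got_plt_entry_addr: int) -> int:
    """Model the value of x16 when `br x17` executes.

    Assumes adrp is the first instruction of the entry, so its PC is
    plt_entry_addr.
    """
    page_delta, lo12 = plt_n_operands(plt_entry_addr, got_plt_entry_addr)
    x16 = (plt_entry_addr & ~PAGE_MASK) + page_delta  # adrp x16, page
    x16 += lo12                                       # add x16, x16, #lo12
    return x16
```

With a PLT entry at 0x10440 and its .got.plt slot at 0x30018, `x16_at_branch(0x10440, 0x30018)` evaluates to 0x30018, the address of the slot. That is the value PLT[0] relies on to work out which entry is being resolved.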

As an aside there are other possibilities for the PLT entries that could help:

  • For small ELF files where the .got.plt is within 1 MiB of the .plt, adr and ldr could be used to save an adrp.
  • For BTI, it doesn't have to be embedded in the PLT entry itself. There could be a thunk/stub for every indirect branch to a PLT entry that direct branches to the PLT, which does not have a BTI. This reduces the number of BTIs at the expense of a slightly more expensive sequence for indirect calls.
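
The range condition in the first bullet can be sketched as a simple check. `adr` encodes a signed 21-bit byte offset relative to its own address, so it reaches from -1 MiB to just under +1 MiB. The function names below are hypothetical, and the sketch assumes the `adr` would be the first instruction of each PLT entry.

```python
ADR_MIN = -(1 << 20)       # -1 MiB, inclusive
ADR_MAX = (1 << 20) - 1    # +1 MiB - 1 byte, inclusive


def adr_reaches(plt_entry_addr: int, got_plt_entry_addr: int) -> bool:
    """True if an adr at plt_entry_addr can form got_plt_entry_addr."""
    offset = got_plt_entry_addr - plt_entry_addr
    return ADR_MIN <= offset <= ADR_MAX


def all_entries_reach(pairs: list[tuple[int, int]]) -> bool:
    """True if every (PLT entry, .got.plt slot) pair is within adr range.

    A linker would presumably need this to hold for every entry before
    choosing the shorter sequence for a uniform-sized PLT.
    """
    return all(adr_reaches(plt, got) for plt, got in pairs)
```

A linker taking this approach would only know the answer after layout, so the check belongs late in the link, once the .plt and .got.plt addresses are final.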


Wilco1 avatar Wilco1 commented on May 23, 2024

A larger issue is that PLTs have become less efficient with the added BTI, which makes them 20 bytes and causes them to span multiple fetch blocks. In principle we don't need BTI in PLTs. To reduce PLT uses we could always create function addresses via a GOT load. Canonical PLTs still need a BTI. An extra thunk just containing BTI would add significant overhead, so the thunk needs to load the address from the GOT and branch (making them non-lazy).

It would be feasible to remove the ADD x16 by slightly changing the PLT: the default (unlinked) GOT entry could point to a branch associated with the PLT entry, which then branches to PLT[0]. So x16 contains a unique address relating to the PLT entry. The branch can be at the beginning/end of the PLT sequence (e.g. ADRP/LDR/BR/B) or placed after all PLTs (which would allow using BTI for canonical PLTs).

