Comments (17)

smithp35 commented on June 29, 2024

OK I see where you are coming from. The safest assumption is that what is not callee-saved by the callee must be caller-saved. That would indeed imply that p0~p15 would need saving when calling a regular function.

I'll reopen this as I think more work is needed here.

from abi-aa.

kunalspathak commented on June 29, 2024

.NET issue that describes SVE support: dotnet/runtime#93095

smithp35 commented on June 29, 2024

I agree that the sentence can be clarified.

Assuming the confusion hasn't been resolved already, there are a couple of other parts of the document that may help with parsing the text:
https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#22terms-and-abbreviations

Routine, subroutine
A fragment of program to which control can be transferred that, on completing its task, returns control to its caller at an instruction following the call. Routine is used for clarity where there are nested calls: a routine is the caller and a subroutine is the callee.

https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#3scope

Obligations on the called routine to preserve the program state of the caller across the call.

Combining this with the original text, the "it" is referring to the callee.

a74nh commented on June 29, 2024

Combining this with the original text, the "it" is referring to the callee.

Thanks! That was the conclusion we came to after reading around the issue elsewhere. But it would be nice for it to be clearer.

smithp35 commented on June 29, 2024

#267 to update wording.

kunalspathak commented on June 29, 2024

Just to be clear, here is my understanding. @rsandifo-arm @smithp35 - please correct if I missed anything.

Terminology

  • callee-save: Registers that should be saved/restored by the callee in its prolog/epilog
  • caller-save: Registers that should be saved/restored by the caller around the call it makes
  • sve method: Method that takes sve/predicate arguments or returns sve/predicate result
  • regular method: Method that neither takes sve/predicate arguments nor returns an sve/predicate result
A()
{
prolog:
   save callee-save registers
   ...
   ...
   save caller-save registers
   B();
   restore caller-save registers
   ...   
   ...
epilog:
   restore callee-save registers
}
  • A to B, read it as method A calls method B
  • prolog/epilog of A: callee-save of method A
  • before/after call to B: caller-save by method A

Float/Scalable registers

| Scenario# | A to B | prolog/epilog of A | before/after call to B |
|-----------|--------|--------------------|------------------------|
| 1 | regular to regular | bottom 64-bits v8-v15 [1] | v0-v7, v16-v31, top 64-bits v8-v15 [1] |
| 2 | regular to sve | bottom 64-bits v8-v15 [1] | z0-z7, z24-z31 |
| 3 | sve to regular | z8-z23 | v0-v7, v16-v31, top 64-bits v8-v15 [1] |
| 4 | sve to sve | z8-z23 | z0-z7, z24-z31 |

[1] This is the same specification we have for NEON and is only applicable when the registers are in use or live.

Predicate registers

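| Scenario# | A to B | prolog/epilog of A | before/after call to B |
|-----------|--------|--------------------|------------------------|
| 1 | regular to regular | NA | p0-p15 |
| 2 | regular to sve | NA | p0-p3 |
| 3 | sve to regular | p4-p15 | p0-p15 |
| 4 | sve to sve | p4-p15 | p0-p3 |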

smithp35 commented on June 29, 2024

I'm going to use the official terminology of caller-save instead of callee_trash.

Just to be sure, apologies if this was already clear, caller-save and callee-save are more like responsibilities to save than they are requirements to save. For example a callee only needs to save a callee-save register if it uses the register. A caller only needs to save a caller-save register before a call if there is a live value in the register that the caller needs to access after the call.
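The "responsibility, not requirement" point can be sketched as a set intersection: of the registers a callee may clobber, the caller only has to preserve those holding a value it reads after the call. This is an illustrative sketch, not text from the AAPCS64; the function name `registers_to_spill` and the register-name strings are hypothetical.

```python
def registers_to_spill(caller_save, live_across_call):
    """Return the caller-save registers the caller actually has to preserve.

    caller_save: registers the callee is not obliged to preserve.
    live_across_call: registers holding values the caller reads after the call.
    """
    return set(caller_save) & set(live_across_call)


# Hypothetical situation: the callee may clobber p0-p3, but only p1 holds a
# value the caller still needs afterwards. p5 is live but not caller-save here.
caller_save = {"p0", "p1", "p2", "p3"}
live = {"p1", "p5"}
print(sorted(registers_to_spill(caller_save, live)))  # ['p1']
```

The same filter applies on the callee side: a callee-save register only needs a prolog/epilog save if the callee actually writes to it.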

This is my reading of the document. I'm not an SVE expert like @rsandifo-arm, so if I've got this wrong please go with his answer/correction rather than mine. I'm more of a linker person than a compiler person.

I found it easier to describe when not considering the different call scenarios as there is only a caller and a callee and the responsibilities of the caller don't change if the callee is sve or regular.

| Function type | callee-save | caller-save |
|---------------|-------------|-------------|
| regular | bottom 64-bits of v8-v15 | v0-v7, v16-v31, top 64-bits of v8-v15 |
| sve | z8-z23 | v0-v7, v16-v31 (*) |

(*) z16-z23 are extensions of v16-v23 so these are both callee and caller saved.

| Function type | callee-save | caller-save |
|---------------|-------------|-------------|
| regular | - | p0-p3 |
| sve | p4-p15 | p0-p3 |

I do hope I've got this right, if I haven't and it isn't a silly mistake then we may need more clarifications.

kunalspathak commented on June 29, 2024

I found it easier to describe when not considering the different call scenarios

That's how I wanted it to be, but I wanted to be explicit about the situation. For example, in your table, for the "regular" function type, under "caller-save", the way I interpret it is: if a "regular" function is a caller, which registers does it need to save/restore across a function call? But that depends on what type of function it is calling. If it is calling a "regular" function, it needs to save/restore v0-v7, v16-v31, and the top 64-bits of v8-v15, but if it is calling an sve function, it only needs to save/restore z0-z7, z24-z31, because the sve function (which will be the callee in this case) is responsible for preserving z8~z23. The same goes for the other combinations.

Also, for the "regular" function type, if it is calling a "regular" function, then it should save/restore the entire p0~p15, while if it is calling an "sve" function, it should preserve just p0~p3, because p4~p15 will be preserved by the "sve" function (which is the callee in this case).

Note: When I say caller should preserve across function call, I mean only the registers that are live across the call. So, in my table, out of the registers mentioned in "callee-trash" column, only the registers that are live across the call will be preserved by the caller.

I do hope I've got this right

I feel the same :)

kunalspathak commented on June 29, 2024

then we may need more clarifications

Regardless of whether we have got this right or not, I think the document needs a clear way of stating these requirements, something equivalent to how we are presenting this information in the tables. A lot of time is being spent by multiple people trying to interpret a couple of lines of the document.

pmsjt commented on June 29, 2024

Functions without SVE types in the signature don't have to save any SVE state. If they had to, then existing functions would not be legal anymore. The only things functions without SVE types in the signature must worry about are:

  • If they want to preserve SVE state across calls they make, they may need to save it. This will depend on whether the callee takes SVE parameters or not. Callees with SVE parameters will, themselves, preserve a lot of registers, so the caller may not need to save anything. When calling a function that doesn't have SVE types in the signature, you must assume all SVE state will be trashed.
  • If they need stack-bound local SVE variables, either because the function uses more SVE variables than there are registers, or because some variable has its address taken, then you must allocate space on the stack for them.
  • The only thing they might have to save in the prolog and restore in the epilog is D8->D15 (lower 64bits of Neon registers Q8->Q15). This is not new - this is the existing Neon callee-saved rule, but the compiler must take into consideration that using Z8->Z15 means D8->D15 will be affected. If a function that doesn't have SVE types in the signature uses SVE but avoids Z8->Z15 then it doesn't have to save anything in the prolog or restore it in the epilog.
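The last bullet can be sketched as a small helper: given the Z registers a function without SVE types in its signature writes to, list the D registers its prolog/epilog must save and restore. This is only an illustration of the rule as described in this comment; the function name `prolog_saves_for_regular_function` is hypothetical, and the sketch assumes that writing Zn overwrites Dn and that only D8-D15 are callee-saved for such a function.

```python
def prolog_saves_for_regular_function(z_registers_used):
    """D registers a function without SVE types in its signature must save.

    z_registers_used: numbers of the Z registers the function body writes to.
    Assumption for this sketch: writing Zn also overwrites Dn (its low 64
    bits), and only D8-D15 are callee-saved under the existing Neon rule.
    """
    numbers = sorted(set(z_registers_used))
    return [f"d{n}" for n in numbers if 8 <= n <= 15]


# Uses z0, z9, z12 and z20: only d9 and d12 fall in the callee-saved range.
print(prolog_saves_for_regular_function({0, 9, 12, 20}))  # ['d9', 'd12']

# Avoids z8-z15 entirely: nothing to save in the prolog.
print(prolog_saves_for_regular_function({0, 1, 24}))  # []
```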

tannergooding commented on June 29, 2024

the responsibilities of the caller don't change if the callee is sve or regular.

There is a lot of nuance here and it is easy for developers to miss considerations.

A callee x is responsible for saving (typically in the prologue) and restoring (typically in the epilogue) the callee-save set of its own calling convention a.

A caller x is also responsible for saving (typically before the call) and restoring (typically after the call) the caller-save set of the calling convention b for callee y.

Thus, if conventions a and b match (sve x->sve y -or- regular x->regular y), then this is relatively simple as you only have to consider the context of the individual methods x and y because the callee-save for a is the inverse mask to the caller-save for a

However, if conventions a and b do not match (sve x->regular y -or- regular x->sve y), then the caller-save set becomes more interesting as the callee-save for a will typically not be an inverse of the caller-save for b. Instead, they will have a union of some registers. This means that the caller x must also consider any registers that are disjoint.

The simplest example of this is that for a regular call, none of P0-P15 are considered callee-save. Thus a regular method is free to trash any and all predicate registers without consideration. However, P4-P15 are considered callee-save for an sve call, and thus an sve method must save P4-P15 if they are used.

What this means is that for regular x->regular y, x is free to trash any predicate registers. If it has predicate registers that need to remain "live" across the call to y, it must save/restore them.

For sve x->sve y, x is free to trash P0-P3, but must save and restore P4-P15 if they are used. It must only save P0-P3 across the call to y if they need to remain live.

However, for regular x->sve y the sets differ and x now only has to save P0-P3 because y must be saving/restoring P4-P15.

It gets very interesting for sve x->regular y however, because the regular call (y) is free to trash any of P0-P15. This means that not only does x need to save the normal set of P0-P3 if it's using them and needs them to remain live across the call, it must also assume that y will trash P4-P15 and is now responsible for saving them across the call boundary (because any prior sve caller could itself be using them and expected x to have saved them).
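The four predicate-register cases above can be sketched by combining two lookups: what x saves in its own prologue/epilogue depends on x's convention, while what x must assume is trashed by the call depends on y's convention. This is a rough illustration of the reasoning in this comment rather than wording from the AAPCS64; the names `CALLEE_SAVE_PREDICATES`, `scenario` and `describe` are hypothetical.

```python
ALL_PREDICATES = frozenset(f"p{n}" for n in range(16))

# Callee-save predicate sets per convention, as described in this thread.
CALLEE_SAVE_PREDICATES = {
    "regular": frozenset(),
    "sve": frozenset(f"p{n}" for n in range(4, 16)),
}


def scenario(x_kind, y_kind):
    """Return (saved in x's prologue/epilogue if used, saved by x around the call if live)."""
    prolog_epilog = CALLEE_SAVE_PREDICATES[x_kind]
    around_call = ALL_PREDICATES - CALLEE_SAVE_PREDICATES[y_kind]
    return prolog_epilog, around_call


def describe(registers):
    """Format a set of predicate registers as a range (assumes it is contiguous)."""
    numbers = sorted(int(name[1:]) for name in registers)
    if not numbers:
        return "none"
    return f"p{numbers[0]}-p{numbers[-1]}"


for x_kind in ("regular", "sve"):
    for y_kind in ("regular", "sve"):
        saved_by_x, spilled_by_x = scenario(x_kind, y_kind)
        print(f"{x_kind} x -> {y_kind} y: prologue/epilogue {describe(saved_by_x)}, "
              f"around call {describe(spilled_by_x)}")
```

For sve x -> regular y this reports p4-p15 in the prologue/epilogue and p0-p15 around the call, which is the case singled out above.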

smithp35 commented on June 29, 2024

Thanks for the additional points. This has somewhat spiralled from the meaning of "it" :-) in a couple of sentences. I'll discuss with my colleagues to see if there is a better way of describing this.

kunalspathak commented on June 29, 2024

I have updated #266 (comment) to use the terminology of "caller-save" instead of "callee-trash".

smithp35 commented on June 29, 2024

Looking at the table that you have updated, I think it is best not to try to enumerate the callee-save registers and caller-save registers in the same table.

The callee-save registers are a requirement for a function to preserve the values of registers across the call, so that the values of these registers on entry to the function are the same as the values on return. This requirement is invariant of the caller, or whether there are any calls at all. This looks right in your table.

The set of caller-save registers is determined per call (a function could call both regular and sve functions). They are the registers that are not guaranteed to be preserved by the function being called (registers not in the callee-saves of the function being called).

| Function Type | Callee-saves |
|---------------|--------------|
| regular | bottom 64-bits v8-v15 |
| SVE | z8-z23, p4-p15 |

| Called function type | Caller Save registers for call |
|----------------------|--------------------------------|
| regular | All registers not in {bottom 64-bits of v8-v15} * |
| sve | All registers not in {z8-z23, p4-p15} |

* In practice this means all SVE state including predicate registers, it is going to be hard to work out that SVE values are always going to be within bottom 64-bits of v8-v15.
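The "all registers not in the callee-saves" rule for vector registers can be sketched by tracking each register in two parts, since a regular function only preserves the bottom 64 bits of v8-v15. This is a simplified model for illustration, not the AAPCS64 wording: the "lo64"/"upper" split and the names `CALLEE_SAVE_VECTOR`, `caller_save_for_call` and `fully_preserved_registers` are assumptions of this sketch.

```python
# Each vector register is split into two parts for bookkeeping:
#   "lo64"  = bits 0-63 (Dn)
#   "upper" = every bit above that (the rest of Vn/Zn)
PARTS = ("lo64", "upper")
ALL_VECTOR_PARTS = frozenset((n, part) for n in range(32) for part in PARTS)

# Callee-saves per function type, following the first table above.
CALLEE_SAVE_VECTOR = {
    "regular": frozenset((n, "lo64") for n in range(8, 16)),
    "sve": frozenset((n, part) for n in range(8, 24) for part in PARTS),
}


def caller_save_for_call(called_kind):
    """Register parts not guaranteed to survive a call to called_kind."""
    return ALL_VECTOR_PARTS - CALLEE_SAVE_VECTOR[called_kind]


def fully_preserved_registers(called_kind):
    """Register numbers whose every bit survives a call to called_kind."""
    clobbered = {n for n, _ in caller_save_for_call(called_kind)}
    return sorted(set(range(32)) - clobbered)


print(fully_preserved_registers("regular"))  # []
print(fully_preserved_registers("sve"))      # [8, 9, ..., 23]
```

Under this model no Z register is fully preserved across a call to a regular function, which matches the footnote: a caller holding live SVE values has to treat all of them as caller-save for that call.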

I've got more registers that need to be saved when calling regular functions than your table entries for caller-save.

Hope I haven't made any mistakes, I'm hoping that we can find the right wording to improve the AAPCS over the next few weeks.

kunalspathak commented on June 29, 2024

All registers not in {bottom 64-bits of v8-v15} *

I assume that includes p0-p15 (might be better to clarify)

smithp35 commented on June 29, 2024

I've edited my * comment to "In practice this means all SVE state including predicate registers". Hopefully that should cover it.

kunalspathak commented on June 29, 2024

I've got more registers that need to be saved when calling regular functions than your table entries for caller-save.

Yes, I realized it and have updated #266 (comment) accordingly.
