Git Product home page Git Product logo

Comments (23)

marhel avatar marhel commented on August 29, 2024

Well, if you look at the cycle counts in the M68000UM, you'll see several different tables in various formats, with footnotes and so on. So generating comprehensible and compact output would be a lot of work I think.

Also the cycle counts are not tabulated anywhere in memory, but are just parameters to macros, and hard coded as Cycles() return values from op-funs. So there's work to be done even if you wanted to output something in a very flat format such as;

add.b.w Dn, Dn 4
add.b.w Dn, An 8
add.b.w Dn, (An) 12

or even simpler, by just listing the function name (which encodes the same info) and the cycle count...

from r68k.

emoon avatar emoon commented on August 29, 2024

Yeah I guess so.. just trying to figure out if there is a way to automate the process :)

from r68k.

marhel avatar marhel commented on August 29, 2024

If we need/want the cycle counts in the disassember metadata about each operation, we could consider moving cycle counts to the op_entry macro, or a separate table, which has the side effect of making this data available for other purposes (like yours) as well. But even with a table, some cycle counts, like for shifts, depend on actual parameters and need some kind of footnote anyway.

from r68k.

emoon avatar emoon commented on August 29, 2024

I have actually thought about adding cycle counts to Capstone also.

from r68k.

emoon avatar emoon commented on August 29, 2024

Ok I got a crazy idea now :)

Take this table https://github.com/ezrec/vasm/blob/master/cpus/m68k/opcodes.h

And from it generate a big source file, compile it and then run it through r68k and collect all the timings :)

from r68k.

marhel avatar marhel commented on August 29, 2024

I... don't quite follow. :) Perhaps due to looking at that table on my phone. What would be the difference compared to recording the output from a run through all opcodes? Also, just executing an op will not generate all possible timings (shift, trap cases, etc)

from r68k.

emoon avatar emoon commented on August 29, 2024

In this case I would have slightly more control and know if there are ranges for the op (for example lsl is #1-8) Also I'm not interested in a complete table but mostly when you do regular program (trap isn't often used in 'regular' code)

from r68k.

marhel avatar marhel commented on August 29, 2024

Performance analysis / optimization support?

from r68k.

emoon avatar emoon commented on August 29, 2024

This would be a table you use when investigating code or writing new code.

from r68k.

emoon avatar emoon commented on August 29, 2024

I could of course do it manually but figured it would be easier to generate it :)

from r68k.

emoon avatar emoon commented on August 29, 2024

Here is a better looking example (text itself needs to be done manually of course)

https://gist.github.com/emoon/5d19bdcee4ff3ca8bcf030b4347d7e1d

from r68k.

marhel avatar marhel commented on August 29, 2024

Ooh, very nice! I'm seeing a Dash docset ;)

from r68k.

emoon avatar emoon commented on August 29, 2024

Yeah I think that would be nice. I guess the question is how much info should one include? In my case I just want what it does and cycles but for more complete opcode and the bits needs to be there also.

from r68k.

emoon avatar emoon commented on August 29, 2024

I figured out a way to do this. It's a bit manual but I take help of by vasm by calling it also :) here is the code https://github.com/emoon/68k_cycle_table_gen It actually doesn't use r68k at all so I just define it myself and use the assembler to verify which instructions that are legal. Right now it will print out something like this for add

     Running `target/debug/68k_cycle_table_gen`
| add.w    | Dn | An | (An) | (An)+ | -(An) | d(An) | d(An,Dn) | xxx.W | xxx.L |
|----------|----|----|------|-------|-------|-------|----------|-------|-------|
| Dn       | 4  | 8  |  12  |  12   |  14   |  16   |    18    |  16   |  20   |
| An       | 4  | 8  |  *   |   *   |   *   |   *   |    *     |   *   |   *   |
| (An)     | 8  | 12 |  *   |   *   |   *   |   *   |    *     |   *   |   *   |
| (An)+    | 8  | 12 |  *   |   *   |   *   |   *   |    *     |   *   |   *   |
| -(An)    | 10 | 14 |  *   |   *   |   *   |   *   |    *     |   *   |   *   |
| d(An)    | 12 | 16 |  *   |   *   |   *   |   *   |    *     |   *   |   *   |
| d(An,Dn) | 14 | 18 |  *   |   *   |   *   |   *   |    *     |   *   |   *   |
| xxx.W    | 12 | 16 |  *   |   *   |   *   |   *   |    *     |   *   |   *   |
| xxx.L    | 16 | 20 |  *   |   *   |   *   |   *   |    *     |   *   |   *   |
| d(Pc)    | 12 | 16 |  *   |   *   |   *   |   *   |    *     |   *   |   *   |
| d(Pc,Dn) | 14 | 18 |  *   |   *   |   *   |   *   |    *     |   *   |   *   |
| #xxx     | 8  | 12 |  16  |  16   |  18   |  20   |    22    |  20   |  24   |

from r68k.

emoon avatar emoon commented on August 29, 2024

Just a small update on this.

I generate this now: https://gist.github.com/emoon/5d19bdcee4ff3ca8bcf030b4347d7e1d

So the way it works:

  1. Descriptions has been taken from the 68k manuals. May be copyright but I doubt anyone really cares so :)
  2. I generate permutations of of possible combinations of instructions pairs according to the specs here https://github.com/emoon/68k_documentation_gen/blob/master/src/main.rs#L263
  3. Generate those to files then using vasm to compile the files (in parallel using Rayon)
  4. I gather up all the output files (generated assembly) for the cases the compiler said it was ok and generate a new file of valid (binary) 68k code.
  5. I then use Musashi to emulate the code and report back the cycle count for each instruction and then I feed that into the table.

The table needs work and some things are missing (asr/lsr/ror/etc) but still fairly good.

I went with Musashi as I didn't know the state of r68k as a crate. Is that possible to use now? Some of these numbers generated in the tables doesn't look fully correct (jsr) but it might be me doing something wrong also.

from r68k.

marhel avatar marhel commented on August 29, 2024

Nifty!

Yes, r68k has been converted to a lib now (the CPU now lives in r68k-emu, the assembler, disassembler and SRecord support in r68k-tools, and some common constants in r68k-common).

There's still some work to do regarding creation of a new instance of r68k (a Core instance), as we still need to explicitly insert the real instruction set instead of a fake testing one, and I'd like to inject the memory implementation as a type parameter. This is currently hardcoded to be LoggingMem which is really only practical for testing (as it records each and every memory access) but should allow any implementation of the AddressBus trait. Especially since I think memory mapped devices should be implemented behind an implementor of that trait. I don't want the user to have to specify too many type parameters either, that wouldn't be ergonomic. I really haven't looked at the lib from the perspective of a user enough.

But once the r68k assembler is completed, I guess you could acheive the same thing without needing to shell out to vasm. But it's also a "one time thing" I guess, once the tables are good enough, they're done.

from r68k.

emoon avatar emoon commented on August 29, 2024

Yeah I guess you are right that this is more of a "one time thing" It's kinda weird that Musashi reports 4 cycles for move.l #x,dn but maybe I'm doing something wrong :)

from r68k.

marhel avatar marhel commented on August 29, 2024

Ah, I think I know what your issue is! Vasm has a few optimizations, and I just tested and it seems that it optimizes move.l #x,Dn to moveq #x,Dn as long as #x fits in a byte. You'd want to use the -no-opt flag to vasm to avoid such surprises.

from r68k.

emoon avatar emoon commented on August 29, 2024

Ah yes :) that is very likely the case. Thanks for pointing that out.

from r68k.

marhel avatar marhel commented on August 29, 2024

One could also imagine a 'snippet cycle cost analyzer' feature in the assembler, where you give it a snippet, and it steps that through the emulator, and outputs a listing file annotated with the cycle cost of each line, as well as a total cost. Trivial in the linear case at least, but less so with loops and branches.

Also, the emulator could potentially have some kind of performance instrumentation execution mode so that it could run a piece of code, including loops or anything, and then output cycle counts and how many times it hit each line, how many times it took a branch (or not) etc. This would help finding hotspots and where most of the time is spent in a loop and so on.

from r68k.

emoon avatar emoon commented on August 29, 2024

Yeah I'm planning to code a new assembler at some point. It will likely have some feature like that. For loops you can try to detect "basic blocks" (meaning loops that and what goes into them) and show an approximate cost (excluding external calls as if you have jsr it can count for the call cost but not what goes into it)

Something I want in the assembler is register allocation. A simple allocator can be found over here https://github.com/deplinenoise/deluxe68 it's quite manual but still fairly useful when writing assembly code.

from r68k.

marhel avatar marhel commented on August 29, 2024

Mm, but at some point along that line you end up with C :)

And then when the translation from the language to the assembly becomes non-trivial/non-obvious, instruction cycle counts starts becoming less useful. Cost per basic block might still be a useful metric, even though you don't really care about the individual instructions in the block anymore - unless you need to figure out the reason for an extreme block cost.

In my own high-level goals, where r68k is a piece of a larger puzzle, an more featureful assembler isn't that important (I need to use existing tools, or I will reinvent too many wheels along the way), but it was just too much fun not to do a simple one now :)

from r68k.

emoon avatar emoon commented on August 29, 2024

Mm, but at some point along that line you end up with C :)

Depends. Deluxe68 for example doesn't emit any register spilling on the stack unless you tell it to. If you have too many registers it will fail to compile and you will have to deal with it.
In C you can have any number of variables and hope the compiler figures it out. Here you write code without any surprises.

Also depending if you write to 000 or 060 you may care more or less. 000 for example is quite slow so you need to take quite good care that an likely want to know the cycle count in detail. 060 is a different beast where you can pair instructions to have them execute in one cycle (has it has two integer pipes)

For me when writing assembly I want to be in full control but it's a bit tedious to keep track in the head all the time in which register you put a variable and having even a basic register allocator helps with that and also prevents you from using a register you though would be free.

from r68k.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.