Comments (12)

kemenaran commented on May 12, 2024

As for the reverse-engineering process, I usually proceed like this:

Using BGB (for dynamic analysis)

  1. Compile the game (make all). This builds the game from (a) the (partially) disassembled sources and (b) the dumped binary banks. It produces game.gbc (a compiled ROM identical to the original) and, more importantly, game.map, which holds the debug symbols.
  2. Open game.gbc in the BGB emulator, which has a nice debugger.
  3. Open the debugger, and jump to the 0000:0150 address. You'll see a function named Start. BGB knows the name of this function from the debug symbols.
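
To find a debugger address such as 0000:0150 in the ROM file itself, the bank:address pair can be converted to a file offset. The sketch below assumes the usual Game Boy layout: 16 KiB ROM banks, with bank 0 fixed at 0000-3FFF and the selected bank mapped at 4000-7FFF. `rom_offset` is a hypothetical helper, not part of this repository.

```python
BANK_SIZE = 0x4000  # assumption: 16 KiB per ROM bank


def rom_offset(bank: int, addr: int) -> int:
    """Map a bank:address pair (as shown in a debugger) to a ROM file offset."""
    if not 0x0000 <= addr < 0x8000:
        raise ValueError("address is outside the ROM area (0000-7FFF)")
    if addr < BANK_SIZE:
        # 0000-3FFF is assumed to always show bank 0, whatever bank is selected.
        return addr
    # 4000-7FFF shows the selected bank.
    return bank * BANK_SIZE + (addr - BANK_SIZE)


if __name__ == "__main__":
    print(hex(rom_offset(0x00, 0x0150)))
    print(hex(rom_offset(0x2F, 0x4000)))
```

Under these assumptions, bank 2F starts at file offset BC000, which is consistent with the bank_2F_BC000.bin file name used in this repository.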

From there, the goal is to pick a function or a memory location and understand what it does, so that we can label it in the disassembled code.

  1. Pick an instruction that calls a function (for instance call label_0A43).
  2. Jump to this function (e.g. by placing a breakpoint).
  3. Understand what it does. For this you can either:
     • read the assembly,
     • see which memory locations it reads or changes,
     • observe the values changing in the memory viewer while the game runs,
     • use the debugger to replace the function with a nop, and see what changes in the game.
  4. Open the assembly source (src/main.asm, or src/bank1.asm) or memory map (src/constants/*), and label the code or memory whose purpose you identified.
  5. Rinse and repeat.

Using awake (for static analysis)

Awake is a static Game Boy assembly explorer, specially tuned for exploring ZeldaGB and ZeldaDX. Although it is still at an experimental stage, it can identify functions and loops, and makes it easy to jump from function to function.

I'm currently writing some improvements to this tool, so that it can read debug symbols (otherwise no functions are labelled) and label functions from within the explorer. So this is still experimental.

from ladx-disassembly.

kemenaran commented on May 12, 2024

@Ayymoose @Drenn1 @Xkeeper0 btw, I just merged some improvements to this project:

  • Added these "Disassembling HOWTOs" to the README.md file (plus some additional information);
  • Simplified the src directory organization: it should be easier to follow how sources are laid out.

This should make it easier to understand how to contribute. If the project structure or tools weren't that clear before, please take another look!

kemenaran commented on May 12, 2024

Link's Awakening is quite large, it seems :)

All 61 banks seem to be holding code or data. Plus there are some additional graphics and routines for the DX edition.

Ayymoose commented on May 12, 2024

Thanks for your reply. I was also wondering why there is so much dead code in the disassembly. For example, in bank0.asm there are quite a few labels whose sole purpose is to execute nop instructions. Also, in bank20.asm, the starting label at the top is perforated with nops between instructions. Is this to implement some kind of delay, or to wait for some hardware register to increment, or something like that? I'm just very curious.

Stewmath commented on May 12, 2024

That's not actually code, it's data being misinterpreted as code. Not that I know what the data represents, though. Separating the data from the code is one of the more time-consuming tasks. There may be specialized emulators that can help with that, but personally I didn't go with that approach in my disassembly. (It makes little difference if the final goal is to label everything by hand, anyway.)

The files in the "disassembled-banks" folder are just for reference anyway; they're not actually being assembled.
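
As a rough illustration of why data can show up as nops: the byte 0x00 decodes as nop on the Game Boy CPU, so any zero-filled data or padding disassembles into nop instructions. The sketch below lists long runs of zero bytes in a bank dump. `zero_runs` and the `min_len` threshold are hypothetical choices for this example, and a run of zeros is only a hint that a region may be data, not proof.

```python
def zero_runs(data: bytes, min_len: int = 8):
    """Return (start, length) for each run of 0x00 bytes at least min_len long."""
    runs = []
    start = None  # index where the current run of zeros began, if any
    for i, b in enumerate(data):
        if b == 0:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i - start))
            start = None
    # Handle a run that extends to the end of the data.
    if start is not None and len(data) - start >= min_len:
        runs.append((start, len(data) - start))
    return runs


if __name__ == "__main__":
    sample = b"\x3e\x01" + b"\x00" * 10 + b"\xc9"
    for start, length in zero_runs(sample):
        print(f"{length} zero bytes at offset {start:#06x}")
```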

Ayymoose commented on May 12, 2024

I thought it was actual code so I forked this repository and tried to decipher the disassembly myself. I assumed that the data would not be mixed in with code. Why does this happen?

Also could you tell me your approach for analysing the disassembly or how you started out this whole project? For me, I basically whipped out the Z80 opcode table and memory map from the manual and tried to follow through the code (main.asm) at the top but it quickly gets confusing because some labels are referenced but I cannot find the code that follows in any of the files.

Xkeeper0 commented on May 12, 2024

There's no way to tell the difference between code and data. The Game Boy will start executing from a certain location; the only way to tell is to actually track what code is being run or to "guess" at possible execution paths, both of which have pitfalls in that you might run into a glitch or unexpected path that throws everything off.

E: It's also possible that some ROM data is actually both, for example using some block of code for pseudo-randomness.

kemenaran commented on May 12, 2024

Actually there are some ways to tell the difference between code and data.

First we can assume most banks contain either code or data (and not mixed content). This is not always true, but it helps. Then we can attempt to convert a whole binary bank to PNG, and see if we recognize sprites in the resulting picture. For this you can use the gfx.py script in this repository.

  1. Take a binary bank you want to look at from bin/banks (dumped from the original ROM), for instance bank 2F.
  2. Copy it somewhere, and rename it to add a .2bpp file extension.
  3. Run gfx.py to convert it to png: ./gfx.py png bank_2F_BC000.bin.2bpp
  4. Look at the resulting bank_2F_BC000.bin.png.

If you recognize pictures and sprites in the resulting PNG picture, congratulations, you found a gfx data bank! But if it all looks garbled, this is probably a code bank, or a bank that contains other data (like dungeon maps or enemy stats).
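
For readers curious about what such a conversion involves, here is a minimal sketch of decoding a single tile from the standard Game Boy 2bpp format: 16 bytes per 8x8 tile, two bytes per row, the first byte holding the low bit of each pixel and the second the high bit. This is not the code of gfx.py; `decode_tile` is a hypothetical helper written for illustration.

```python
def decode_tile(tile: bytes):
    """Decode one 16-byte 2bpp tile into 8 rows of 8 color indices (0-3)."""
    if len(tile) != 16:
        raise ValueError("a 2bpp tile is exactly 16 bytes")
    rows = []
    for y in range(8):
        low, high = tile[2 * y], tile[2 * y + 1]
        row = []
        for x in range(8):
            bit = 7 - x  # bit 7 is the leftmost pixel of the row
            row.append((((high >> bit) & 1) << 1) | ((low >> bit) & 1))
        rows.append(row)
    return rows


if __name__ == "__main__":
    # First row: leftmost pixel uses color 3, the rest color 0.
    tile = bytes([0x80, 0x80] + [0x00] * 14)
    for row in decode_tile(tile):
        print("".join(str(c) for c in row))
```

A bank holding graphics tends to show recognizable shapes when its tiles are decoded this way and laid out in a grid, while code or other data usually looks like noise.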

If you recognize pictures, you can move the PNG file into the src/gfx directory. Then edit the main.asm file to say, in effect: "To compile bank 2F, instead of importing the binary bank from bin/banks/bank_2F_BC000.bin, use the data from src/gfx/bank_2F.png". When compiling the ROM, the Makefile converts the PNG file back to a 2bpp binary file and injects it into the ROM.

Once this is done, you can start splitting this large PNG file into smaller fragments, sprite by sprite. Have a look at src/gfx to see some already-extracted sprites (and how much work still needs to be done :) ).

Another possibility is to read existing documentation about the banks. See for instance this bank-map, made by @devdri.

Stewmath commented on May 12, 2024

It is, of course, possible to differentiate between code and data. You just can't rely on a computer to do it for you. Not 100%.

Ayymoose commented on May 12, 2024

Thanks for all your replies; they have cleared some things up for me. I was wondering if any of you have identified the contents of all 61 banks?

kemenaran commented on May 12, 2024

I was wondering if any of you have identified the contents of all 61 banks?

Not everything yet, but it's coming together! (See the updated ROM map.)

I labeled and extracted all the graphics banks. What remains is to identify the data (dungeons, etc.), the rest of the code, and the audio files.

kemenaran commented on May 12, 2024

I'm closing this issue, as the original question was answered. Thanks for the discussion!
