wcdatool's Introduction

Watcom Disassembly Tool (wcdatool)

Tool to aid disassembling DOS applications created with the Watcom toolchain.

Donations

I'm striving to become a full-time developer of Free and open-source software (FOSS). Donations help me achieve that goal and are highly appreciated.

Watcom Toolchain

Many DOS applications of the 90s, especially games, were developed using the Watcom toolchain. Examples are DOOM, Warcraft, Syndicate, Mortal Kombat, just to name a few.

Most end users have probably never heard of Watcom, but might remember applications displaying a startup banner reading something like this: DOS/4G(W) Protected Mode Run-time [...]. DOS/4G(W) was a popular DOS extender bundled with the Watcom toolchain, allowing DOS applications to run in 32-bit protected mode.

Nowadays, the Watcom toolchain is open source and lives on as Open Watcom / Open Watcom v2 Fork.

Why create another disassembly tool?

The idea for this tool emerged when I discovered that one of my all-time favorite games, Mortal Kombat, was mainly written in Assembler (more or less directly ported from the arcade version) and was released unstripped (i.e. the executable contains debug symbols). I tried using various decompilation/disassembly tools on it, only to discover that none seemed to be capable of dealing with the specifics of Watcom-based applications.

Thus, I began writing my own tool. What originally started out as mkdecomptool specifically for Mortal Kombat is now the general-purpose Watcom Disassembly Tool (wcdatool).

Note that while wcdatool performs the tasks it is designed for quite well, it is not intended to compete with or replace high-end tools like IDA Pro or Ghidra.

Current state and future development

Wcdatool is a work in progress. You can tell from looking at the source code - there are tons of TODO, TESTING, FIXME, etc. markers flying around. It is also relatively slow, as performance has not been the main focus (Cython might be utilized in the future to increase performance).

Nevertheless, it works quite well in its current state - you'll get a well-readable, reasonably structured disassembly output (objdump format, Intel syntax). Check out issues #9 and #11 for games other than Mortal Kombat that wcdatool worked nicely for thus far. Please note that wcdatool works best when used on executables that contain debug symbols. If you come across other unstripped Watcom-based DOS applications that may be used for further testing and development, please let me know.

The next major goal is to cleanly rewrite the disassembler module and transition from static code disassembly to execution flow tracing (e.g. the Mortal Kombat 2 executable contains code within its data object, which is neither discovered nor processed with the current approach).
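
To illustrate the difference between the two approaches, here is a minimal sketch on a made-up toy program. It is not wcdatool's code and does not reflect how the planned rewrite will work; the instruction table, the addresses and the function names `linear_sweep` and `trace_flow` are all hypothetical. A static sweep decodes everything inside the code object (including data that merely looks like code) and never visits the data object, while flow tracing follows branch targets from the entry point and therefore also reaches code stored elsewhere.

```python
from collections import deque

# Hypothetical toy program: address -> (mnemonic, size in bytes, branch target or None).
# Addresses 0x00-0x0F stand in for a code object, 0x20 and above for a data object.
TOY_PROGRAM = {
    0x00: ("mov", 2, None),
    0x02: ("jz", 2, 0x08),      # conditional jump: target and fall-through are both reachable
    0x04: ("call", 3, 0x20),    # call into code that lives inside the "data object"
    0x07: ("ret", 1, None),
    0x08: ("jmp", 2, 0x04),     # unconditional jump: no fall-through
    0x0A: ("(bad)", 1, None),   # data bytes inside the code object, never branched to
    0x20: ("add", 2, None),     # code hidden in the data object
    0x22: ("ret", 1, None),
}


def linear_sweep(program, start, end):
    """Static approach: decode every item located in [start, end), reachable or not."""
    return sorted(addr for addr in program if start <= addr < end)


def trace_flow(program, entry):
    """Flow tracing: start at the entry point and follow branches and fall-throughs."""
    visited = set()
    pending = deque([entry])
    while pending:
        addr = pending.popleft()
        while addr in program and addr not in visited:
            visited.add(addr)
            mnemonic, size, target = program[addr]
            if target is not None:
                pending.append(target)   # explore the branch target later
            if mnemonic in ("jmp", "ret"):
                break                    # execution does not continue past these
            addr += size                 # fall through to the next instruction
    return sorted(visited)
```

In this toy setup, `linear_sweep(TOY_PROGRAM, 0x00, 0x10)` includes the bogus item at 0x0A and misses the routine at 0x20, whereas `trace_flow(TOY_PROGRAM, 0x00)` skips 0x0A and finds 0x20 via the call. Real x86 code adds complications this sketch ignores, such as indirect jumps and jump tables.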

Output sample

Output sample for Fatal Racing (FATAL.EXE) - the left side shows the reconstructed source files, the right side shows a portion of formatted disassembly:

Screenshot

How to use it

There are multiple ways to use wcdatool, but the following instructions should get you started. Don't let the amount of information provided below discourage you; the tool is easier to use than it might seem. The instructions assume that you are using Linux. For Windows users, the easiest way to go is to use Windows Subsystem for Linux (WSL):

  1. Requirements:

    Wcdatool: Python (>=3.6.0), wdump (part of Open Watcom v2), objdump (part of binutils)
    (both wdump and objdump need to be accessible via PATH)

    Open Watcom v2: gcc -or- clang (for 64-bit builds), DOSEMU -or- DOSBox (for wgml utility)
    (only relevant if Open Watcom v2 is built from sources; the project also provides pre-compiled binaries)

  2. Clone wcdatool's repository (-or- download and extract a release):

    # git clone https://github.com/fonic/wcdatool.git
    
  3. Download, build and install Open Watcom v2 (-or- download and install pre-compiled binaries):

    # cd wcdatool/OpenWatcom
    # ./1_download.sh
    # ./2_build.sh
    # ./3_install_linux.sh /opt/openwatcom /opt/bin/openwatcom
    

    NOTE: these scripts are provided for convenience, they are not part of the Open Watcom v2 project itself

  4. Copy the executables to be disassembled to wcdatool/Executables, e.g. for Mortal Kombat:

    # cp <source-dir>/MK1.EXE wcdatool/Executables
    # cp <source-dir>/MK2.EXE wcdatool/Executables
    # cp <source-dir>/MK3.EXE wcdatool/Executables
    

    NOTE: file names of executables are used to locate corresponding object hint files (see step 5)

  5. Create/update object hint files in wcdatool/Hints (optional; skip when just getting started):

    Object hints may be used to manually affect the disassembly process (e.g. force decoding of certain regions as code/data, specify data decoding mode, define data structs, add comments). Please refer to included object hint files for Mortal Kombat, Fatal Racing and Pac-Man VR for details regarding capabilities and syntax.

    NOTE: hint files must be stored as wcdatool/Hints/<name-of-executable>.txt (case-sensitive, e.g. wcdatool/Executables/MK1.EXE -> wcdatool/Hints/MK1.EXE.txt) to be picked up automatically by the included scripts

  6. Let wcdatool process all provided executables (for the example executables listed in step 4, this will take ~3min. and generate ~1.5GB worth of data):

    # wcdatool/Scripts/process-all-executables.sh
    

    -or- Let wcdatool process a single executable:

    # wcdatool/Scripts/process-single-executable.sh <name-of-executable>
    

    -or- Run wcdatool manually (use --help to display detailed usage information or see below):

    # python wcdatool/Wcdatool/wcdatool.py -od wcdatool/Output -wao wcdatool/Hints/<name-of-executable>.txt wcdatool/Executables/<name-of-executable>
    

    NOTE: it is completely normal and expected for wcdatool to produce LOTS of warnings; ignore those when just getting started (see step 8 for details)

  7. Have a look at the results in wcdatool/Output:

    • File <name-of-executable>_zzz_log.txt contains log messages (same as console output, but without coloring/formatting)
    • Files <name-of-executable>_disasm_object_x_disassembly_plain.asm contain plain disassembly
    • Files <name-of-executable>_disasm_object_x_disassembly_formatted.asm contain formatted disassembly
    • Folder <name-of-executable>_modules contains formatted disassembly split into separate files (this attempts to reconstruct the application's original source files if corresponding debug information is available)

    NOTE: if you are new to assembler/assembly language, check out this x86 Assembly Guide

  8. Refine the output by analyzing the disassembly, updating the object hints and re-running wcdatool (i.e. loop steps 5-8):

    • Identify and add hints for regions in code objects that are actually data (look for ; misplaced item comments, (bad) assembly instructions and labels with ; access size comments)
    • Identify and add hints for regions in data objects that are actually code (look for call/jmp instructions in code objects with fixup targets pointing to data objects)
    • Check section Possible object hints of wcdatool's console output / log file for suggestions (not guaranteed to be correct, but likely a good starting point)
    • The ultimate goal here is to eliminate all (or at least most) warnings issued by wcdatool. Each warning points out a region of the disassembly that currently seems flawed and therefore requires further attention/investigation. Note that there is a cascading effect at work (e.g. a region of data that is falsely interpreted as code may produce bogus branches, leading to further warnings), thus warnings should be tackled one (or a few) at a time from first to last, with wcdatool re-runs in between

    NOTE: this is by far the most time-consuming part, but crucial to achieve good and clean results (!)
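
To get a rough idea of how much refinement is left between re-runs, the markers mentioned in step 8 can be counted per output file. The following is a minimal sketch, not part of wcdatool: the function names are hypothetical, and it assumes that the marker strings appear literally in the formatted disassembly files named as described in step 7.

```python
from collections import Counter
from pathlib import Path

# Marker strings taken from step 8; how exactly they are laid out within a
# line of wcdatool's output is an assumption, so only substring matching is used.
MARKERS = ("; misplaced item", "(bad)", "; access size")


def count_markers(lines, markers=MARKERS):
    """Count how many lines contain each marker string."""
    counts = Counter({marker: 0 for marker in markers})
    for line in lines:
        for marker in markers:
            if marker in line:
                counts[marker] += 1
    return counts


def scan_output_dir(output_dir, pattern="*_disassembly_formatted.asm"):
    """Return {file name: Counter} for each formatted disassembly file found."""
    results = {}
    for path in sorted(Path(output_dir).glob(pattern)):
        # errors="replace" keeps the scan going if a file contains odd bytes
        with path.open(errors="replace") as handle:
            results[path.name] = count_markers(handle)
    return results
```

Calling `scan_output_dir("wcdatool/Output")` after each run and comparing the counts gives a simple progress indicator. It does not replace reading the warnings in the log file, which remain the authoritative source.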

Wcdatool usage information

Usage: wcdatool.py [-wde|--wdump-exec PATH] [-ode|--objdump-exec PATH]
                   [-wdo|--wdump-output PATH] [-wao|--wdump-addout PATH]
                   [-od|--output-dir PATH] [-cm|--color-mode VALUE]
                   [-id|--interactive-debugger] [-is|--interactive-shell]
                   [-h|--help] FILE

Tool to aid disassembling DOS applications created with the Watcom toolchain.

Positionals:
  FILE                            Path to input executable to disassemble
                                  (.exe file)

Options:
  -wde PATH, --wdump-exec PATH    Path to wdump executable (default: 'wdump')
  -ode PATH, --objdump-exec PATH  Path to objdump executable (default:
                                  'objdump')
  -wdo PATH, --wdump-output PATH  Path to file containing pre-generated wdump
                                  output to read/parse instead of running
                                  wdump
  -wao PATH, --wdump-addout PATH  Path to file containing additional wdump
                                  output to read/parse (mainly used for object
                                  hints)
  -od PATH, --output-dir PATH     Path to output directory for storing
                                  generated content (default: '.')
  -cm VALUE, --color-mode VALUE   Enable color mode (choices: 'auto', 'true',
                                  'false') (default: 'auto')
  -id, --interactive-debugger     Drop to interactive debugger before exiting
                                  to allow inspecting internal data structures
  -is, --interactive-shell        Drop to interactive shell before exiting to
                                  allow inspecting internal data structures
  -h, --help                      Display usage information (this message)
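
For anyone who prefers scripting the manual invocation from step 6, here is a minimal sketch of a helper that checks the prerequisites and assembles the command line. It only uses options listed in the usage text above; the function names, the directory layout passed via `root` and the choice to omit -wao when no hint file exists are assumptions made for illustration, not behavior of the included scripts.

```python
import shutil
import sys
from pathlib import Path


def find_missing_tools(tools=("wdump", "objdump")):
    """Return the tools that cannot be found via PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def hint_path_for(executable, hints_dir):
    """Map e.g. Executables/MK1.EXE to Hints/MK1.EXE.txt (name is case-sensitive)."""
    return Path(hints_dir) / (Path(executable).name + ".txt")


def build_command(executable, root="wcdatool"):
    """Assemble the argument list for a manual wcdatool run."""
    root = Path(root)
    command = [
        sys.executable,
        str(root / "Wcdatool" / "wcdatool.py"),
        "-od", str(root / "Output"),
    ]
    hints = hint_path_for(executable, root / "Hints")
    if hints.is_file():
        # Assumption: pass -wao only if a hint file actually exists
        command += ["-wao", str(hints)]
    command.append(str(executable))
    return command
```

A typical use would be to abort if `find_missing_tools()` returns a non-empty list, and otherwise hand the result of `build_command("wcdatool/Executables/MK1.EXE")` to `subprocess.run`. If wdump or objdump are installed outside of PATH, the -wde and -ode options can be appended to the list in the same way.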

How to contact me

If you want to get in touch with me, give feedback, ask questions or simply need someone to talk to, please open an Issue here on GitHub. Make sure to leave an email address if you prefer personal/private contact.

Last updated: 08/12/23

wcdatool's Issues

Question about recompiling

So is this code in a state to be recompiled once it has been decompiled? I'm not really an assembler programmer so I can't immediately tell.

Was surprised to see the output had the line numbers/binary code as well-- is there a way to disable that? I assume you can't recompile the code with that in the code.

Also, I've noticed comments about "bad code" or "misplaced" code-- I assume this is something your script does? Would that affect trying to recompile the code?

Broker pipe error

Running any of the shell scripts causes a broken pipe error in VS Code:

[21644:0115/132121.563:ERROR:broker_win.cc(56)] Error reading broker pipe: The pipe has been ended. (0x6D)
[6072:0115/132122.561:ERROR:broker_win.cc(56)] Error reading broker pipe: The pipe has been ended. (0x6D)
[9756:0115/132123.310:ERROR:broker_win.cc(56)] Error reading broker pipe: The pipe has been ended. (0x6D)

How to disassemble Pac-Man VR

So I have a non-MK app that I'm trying to decompile, I managed through IDA to identify that it was in fact compiled through Watcom. I've tried multiple tools to try to decompile it into comprehensible code, but to no avail. Some lines ARE in fact clearly readable in other tools, but there's a lot of garbage in between those sections, and I don't know how to clean that up. Did some searching and lo and behold, I found your project and was overjoyed that something was created specifically for Watcom applications, but I just can't for the life of me get it running, and the install instructions are incredibly confusing.

For example, you never specified where to put objdump. I tried to run your included BAT files for compiling Open Watcom 2.0 from source, but that didn't seem to work correctly either. I tried transferring the files I already had from a separate installation of OW 1.9 from the official site, but that didn't seem to do anything either. Just would really like some clarification on how to actually install/use this, thanks.

How to disassemble Fatal Racing / Whiplash

Hi :)

This looks really interesting as I am trying to RE Fatal Racing/Whiplash and there were WATCOM references when running strings on FATAL.EXE.

It did look promising but it crashed

FATAL.EXE_zzz_log.txt

The error isn't in the log file but here is a snippet copied from the terminal:

Building data maps for data objects:
Building initial data maps from object and modules...
Data map for object 2 has 1 entries
Extending data maps based on size information in structure...
Traceback (most recent call last):
  File "/home/deevus/projects/reverse-engineering/wcdatool/Scripts/../Wcdatool/wcdatool.py", line 241, in <module>
    retval = main()
  File "/home/deevus/projects/reverse-engineering/wcdatool/Scripts/../Wcdatool/wcdatool.py", line 210, in main
    disasm = disassemble_objects(wdump, fixrel, cmd_args.objdump_exec, outfile_template)
  File "/home/deevus/projects/reverse-engineering/wcdatool/Wcdatool/modules/main_disassembler.py", line 2290, in disassemble_objects
    insert_data_map_item(object["data map"], OrderedDict([("start", sitem["start"]), ("end", sitem["end"]), ("type", type_), ("mode", mode), ("source", "structure")]))
  File "/home/deevus/projects/reverse-engineering/wcdatool/Wcdatool/modules/main_disassembler.py", line 1372, in insert_data_map_item
    raise TypeError("ins_item[\"%s\"] must be type %s or None, not %s" % (key, key_type.__name__, type(ins_item[key]).__name__))
TypeError: ins_item["end"] must be type int or None, not NoneType

Hello fellow assembler...

I was told about this and thought it was cool after spending the last 5 years working directly with TMS34010 assembly code and writing a few huge MK revisions for arcades (mortalkombatplus.com).

I'm not sure this would be of help for my arcade projects, but maybe if I ever decide to port MK Trilogy over to Midway's Wolf Unit. :)

Freezes when disassembling MKTRIL.exe?

Piqued my interest when I saw in your readme that this could disassemble MK Trilogy for DOS in a readable format. When I try to run it on MKTRIL.EXE, it goes for about a minute or two and then gets stuck at "Generating plain disassembly for object 1...". I can see in task manager nothing is going on because the CPU usage for python goes from 30% to 0%. Is the current version of this not able to disassemble MK Trilogy? Is it something you can fix?

edit: note that I'm using the original EXE that comes on the MK Trilogy CD, not the modified community patch or gog versions.

edit2: I do see objdump.exe tasks stopping and starting. I also see Windows Defender anti virus running at 25-30% CPU. Wondering if it's completely grinding Python to a halt for some reason.
