Reko is a binary decompiler.

Home Page: https://uxmal.github.io/reko

License: GNU General Public License v2.0

reko's Introduction

reko - a general purpose decompiler.

Join the chat at https://gitter.im/uxmal/reko

Reko (Swedish: "decent, obliging") is a decompiler for machine code binaries. This project is freely available under the GNU General Public License.

The project consists of front ends, a core decompiler engine, and back ends to help it achieve its goals. A command-line front end, a Windows GUI, and an ASP.NET front end exist at the time of writing. The decompiler engine receives inputs from the front ends in the form of either individual executable files or decompiler project files. Reko project files contain additional information about a binary file that is helpful to the decompilation process or for formatting the output. The decompiler engine then proceeds to analyze the input binary.


Byte map view of a loaded ARM binary executable

Decompiled view of a loaded ARM binary executable

Reko has the ambition of supporting decompilation of various processor architectures and executable file formats with minimal user intervention. For a complete list, see the supported binaries page.

Please note that many software licenses prohibit decompilation or other reverse engineering of their machine code binaries. Use this decompiler only if you have legal rights to decompile the binary (for instance if the binary is your own.)

Downloading Reko

Official releases are published every few months on GitHub and SourceForge. Users who can't or won't build Reko themselves can download the output of the Cirrus CI integration builder or the GitHub Actions integration builder. Naturally, you can also build the project from the sources: see "Hacking" below.

Installing Reko

The following prerequisite software must be installed on your machine first:

Download an appropriate installer and run it on the target machine.

After installation, you can proceed by either downloading binaries directly from the integration build server, or by building Reko from sources (see Hacking below).

Documentation

To get acquainted with Reko's various features, you can read the user's guide. If you're interested in the internal workings of the project, see the wiki.

Getting support

You can report any issues you encounter or ask any Reko-related question on the issue tracker. You can also try the Reko Gitter.im chatroom. Reko is built by volunteers in their spare time, so adjust your response-time expectations accordingly.

Hacking

To build reko, start by cloning https://github.com/uxmal/reko. You can use an IDE or the command line to build the solution file Reko-decompiler.sln. Reko requires the .NET 6.0 SDK to compile. If you are an IDE user, use a recent version of Visual Studio 2022. If you wish to build using the command line, use the command

dotnet msbuild -p:Platform={platform} -p:Configuration={config} -v:m -t:build_solution -m ./src/BuildTargets/BuildTargets.csproj

Replace {config} with either Debug or Release, and {platform} with x64 or x86.

Note: please let us know if you still are not able to compile, so we can help you fix the issue.

If you're interested in contributing code, see the road map for areas to explore. The Wiki has more information about the Reko project's internal workings. Please consult the style guide.

Warnings and errors related to WiX

You will receive warnings or errors when loading the solution in Visual Studio if you haven't installed the WiX toolset on your development machine. You can safely ignore the warnings; the WiX toolset is only used when making MSI installer packages. You will not need to build an installer if you're already able to compile the project: the build process copies all the necessary files into a single directory. If you do want to build an MSI installer with the WiX toolchain, you can download it here: http://wixtoolset.org/releases/

Errors related to CMake in Visual Studio

Depending on what you do, Visual Studio might try to rebuild NativeProxy, which depends on CMake. You can either install CMake and make sure it's added to your PATH, or disable the project in Visual Studio.

Having CMake installed as part of Visual Studio is sufficient to run msbuild from the Developer Command Prompt but not when building from inside VS, unless you've added that to your global PATH. Installing CMake externally allows you to add it to PATH during the installation.

NOTE: there is an issue in certain versions of Visual Studio that can manifest itself when loading the project. You'll notice it if Visual Studio is stuck "Running Background Tasks" and won't let you build the project. A workaround is to right click the "NativeProxy" project in the solution explorer and choose "Unload Project". The project will then be able to load and build correctly. This issue doesn't occur when building from the command line.

How do I start Reko?

The solution folder Drivers contains the executables that act as user interfaces. The subdirectory WindowsDecompiler contains the GUI client for the Windows Forms user interface. The subdirectory AvaloniaShell contains the GUI client for the cross-platform Avalonia user interface (still under construction). CmdLine is a command line driver.

Recent versions

See the release log for the latest releases.

reko's People

Contributors

a2intl, blindmatrix, chostelet, claunia, ermshiperete, gbody, gitter-badger, gregoral, kalmalyzer, lukas-dresel, lumitoluma, mefisto94, mewmew, mjunix, nemerle, ptomin, samb, shandianchengzi, slartibardfast, smx-smx, starwort, superusercode, teaalltr, throwaway96, uxmal, vladrassokhin, wesinator, wildptr

reko's Issues

Providing additional information about entities (Procedure/Block/Instruction/Memory) ?

Considering that heuristics are used in many places, and the fact that sometimes humans actually "know better" 😄, what would be a good way of passing usable additional information to the engine?

Non-exhaustive list of things we might want to do manually:

  • Marking a procedure as does_not_return
  • Marking a procedure as library_code, so that its code will not be decompiled
  • Adding procedure preconditions (assume ds==0xC000, assume sp==bp, etc.)
  • Adding block preconditions (essentially the same as procedure preconditions)
  • Adding jump targets (when automated indirect jump resolution fails)
  • Marking a piece of undecoded binary as data (primitive and complex types)
  • Marking a piece of undecoded binary as code (essentially adding the given address to the scanner queue)
  • Marking data/code as undecoded in cases where the scanner made mistakes

This should help solve #9

TextView should autoscroll when selecting

  1. Go to code view or disassembly view, both of which use TextView.
  2. Click in the middle of the screen.
  3. Drag the mouse downwards to the bottom of the screen.

Result: nothing happens.

Expected: autoscrolling to occur.

Reko doesn't seem to identify assignments correctly

I have written a rather trivial example that only performs assignments to variables of different data type sizes, to check whether Reko now identifies them correctly. However, it seems that it's not recognizing the code in the function at all, and it returns different errors for different build configurations of the sample (Debug/Release). It seems that Reko has problems dealing with debug versions of C/C++ applications compiled with VS for some reason. It could be due to the extra code added for the debug version.

38 compile error in VS 2012 (pro & express)

downloaded a zip snapshot today

accepted "update" option. select "build solution". 38 compiler errors. I think you need better build instructions?

Error 13 The type or namespace name 'NUnit' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 23
Error 14 The type or namespace name 'TestFixtureAttribute' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 31
Error 15 The type or namespace name 'TestFixture' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 31
Error 16 The type or namespace name 'TestAttribute' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 42
Error 17 The type or namespace name 'Test' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 42
Error 18 The type or namespace name 'TestAttribute' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 49
Error 19 The type or namespace name 'Test' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 49
Error 20 The type or namespace name 'TestAttribute' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 56
Error 21 The type or namespace name 'Test' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 56
Error 22 The type or namespace name 'TestAttribute' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 63
Error 23 The type or namespace name 'Test' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 63
Error 24 The type or namespace name 'TestAttribute' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 71
Error 25 The type or namespace name 'Test' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 71
Error 26 The type or namespace name 'TestAttribute' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 79
Error 27 The type or namespace name 'Test' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 79
Error 28 The type or namespace name 'TestAttribute' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 89
Error 29 The type or namespace name 'Test' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 89
Error 30 The type or namespace name 'TestAttribute' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 97
Error 31 The type or namespace name 'Test' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 97
Error 32 The type or namespace name 'TestAttribute' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 105
Error 33 The type or namespace name 'Test' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 105
Error 34 The type or namespace name 'TestAttribute' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 113
Error 35 The type or namespace name 'Test' could not be found (are you missing a using directive or an assembly reference?) C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\MsPrintfFormatParser.Tests.cs 113
Error 42 Metadata file 'C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\bin\Debug\Reko.Environments.Win32.dll' could not be found C:\Users\beebob\Desktop\reko-master\src\ImageLoaders\MzExe\CSC
Error 62 Metadata file 'C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\bin\Debug\Reko.Environments.Win32.dll' could not be found C:\Users\beebob\Desktop\reko-master\src\ImageLoaders\OdbgScript\CSC
Error 63 Metadata file 'C:\Users\beebob\Desktop\reko-master\src\ImageLoaders\MzExe\bin\Debug\Reko.ImageLoaders.MzExe.dll' could not be found C:\Users\beebob\Desktop\reko-master\src\ImageLoaders\OdbgScript\CSC
Error 65 Metadata file 'C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\bin\Debug\Reko.Environments.Win32.dll' could not be found C:\Users\beebob\Desktop\reko-master\src\WindowsItp\CSC
Error 66 Metadata file 'C:\Users\beebob\Desktop\reko-master\src\ImageLoaders\MzExe\bin\Debug\Reko.ImageLoaders.MzExe.dll' could not be found C:\Users\beebob\Desktop\reko-master\src\WindowsItp\CSC
Error 67 Metadata file 'C:\Users\beebob\Desktop\reko-master\src\ImageLoaders\OdbgScript\bin\Debug\Reko.ImageLoaders.OdbgScript.dll' could not be found C:\Users\beebob\Desktop\reko-master\src\WindowsItp\CSC
Error 68 error C2275: 'LPSTR' : illegal use of this type as an expression C:\Users\beebob\Desktop\reko-master\src\Samples\win32api\win32api.c 8
Error 69 error C2146: syntax error : missing ';' before identifier 'next' C:\Users\beebob\Desktop\reko-master\src\Samples\win32api\win32api.c 8
Error 70 error C2065: 'next' : undeclared identifier C:\Users\beebob\Desktop\reko-master\src\Samples\win32api\win32api.c 8
Error 72 error C2065: 'next' : undeclared identifier C:\Users\beebob\Desktop\reko-master\src\Samples\win32api\win32api.c 9
Error 73 error C2100: illegal indirection C:\Users\beebob\Desktop\reko-master\src\Samples\win32api\win32api.c 9
Error 74 error MSB3073: The command "nmake /f makefile /C /D Configuration=Debug" exited with code 2. C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\Microsoft.MakeFile.Targets 38
Error 75 Metadata file 'C:\Users\beebob\Desktop\reko-master\src\Environments\Win32\bin\Debug\Reko.Environments.Win32.dll' could not be found C:\Users\beebob\Desktop\reko-master\src\UnitTests\CSC
Error 76 Metadata file 'C:\Users\beebob\Desktop\reko-master\src\ImageLoaders\MzExe\bin\Debug\Reko.ImageLoaders.MzExe.dll' could not be found C:\Users\beebob\Desktop\reko-master\src\UnitTests\CSC
Error 77 Metadata file 'C:\Users\beebob\Desktop\reko-master\src\ImageLoaders\MzExe\bin\Debug\Reko.ImageLoaders.MzExe.dll' could not be found C:\Users\beebob\Desktop\reko-master\src\Drivers\MonoDecompiler\CSC

Type reconstruction fails in function signature

When decompiling fib.exe, we get:

void fn00401050(word32 dwArg00, word32 * dwArg04, word32 dwArg08)
{
    byte * ecx_18 = dwArg04;
    byte * edi_21 = dwArg04 - 0x00000001 + dwArg08;
    if (dwArg04 <u edi_21)
    {
        uint32 esi_26 = dwArg00;
        do
        {
            uint32 edx_33 = SLICE(esi_26 *u 0xCCCCCCCD, word32, 32) >>u 0x00000003;
            *ecx_18 = (byte) (esi_26 - edx_33 * 0x00000005 * 0x00000002) + 0x30;
            ecx_18 = ecx_18 + 1;
            esi_26 = edx_33;
        }
        while (ecx_18 <u edi_21);
    }
    *ecx_18 = 0x00;
    return;
}

The dwArg04 parameter should be a byte *, not a word32 *
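
To see why a byte pointer is the natural type here, it helps to read what the loop computes. The sketch below is a Python transcription of the decompiled body, not Reko code; the names div10_magic and digits_reversed are made up for illustration. Multiplying by 0xCCCCCCCD and keeping bits 35 and up of the 64-bit product is the usual compiler idiom for unsigned division by 10, so each iteration stores one ASCII decimal digit (a single byte) through the pointer, followed by a terminating NUL.

```python
MASK32 = 0xFFFFFFFF

def div10_magic(n):
    """Unsigned 32-bit division by 10 as the compiler emitted it:
    take the high 32 bits of n * 0xCCCCCCCD, then shift right by 3."""
    return ((n * 0xCCCCCCCD) >> 32) >> 3

def digits_reversed(value, length):
    """Mirror of fn00401050: emit up to length - 1 ASCII digits,
    least significant first, then a terminating NUL byte."""
    n = value & MASK32
    out = bytearray()
    for _ in range(max(length - 1, 0)):
        q = div10_magic(n)
        out.append(((n - q * 5 * 2) & 0xFF) + 0x30)  # n % 10, as an ASCII digit
        n = q
    out.append(0x00)
    return bytes(out)

print(digits_reversed(1234, 5))  # b'4321\x00'
```

Every store in the loop writes exactly one byte, which is the evidence type reconstruction would need in order to prefer byte * over word32 * for the second argument.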

NE loader doesn't add entry points automatically

The recently added New Executable loader is capable of parsing the entry point section of the NE header, but it doesn't do anything with the information. It should pass the entry points on to the Reko scanner. For DLLs we mustn't forget the WEP. Most of the groundwork has already been laid; I just ran out of time last week.

M68K embedded coprocessor instructions

68020 F-Line instructions are used for coprocessor communication. The cpid embedded inside the given instruction can be:
0 - MMU
1 - 68881/2 FPU
2-7 - hardware dependent

Thus all those instructions potentially interact with different pieces of hardware, so things are getting a bit hairy now 😄

When encountering an F-line instruction, reko should:

  • query the Hardware definition from the current Environment,
  • get the appropriate Disassembler for the given coprocessor ID,
  • use the retrieved disassembler to decode the instruction.

To make things even more 'fun', depending on the coprocessor, some instructions use an arbitrary number of additional instruction words. From the cpScc docs:

If the coprocessor requires additional information to evaluate the condition, the instruction
can include extension words to provide this information. The number of these extension
words, which follow the word containing the coprocessor condition selector field, is
determined by the coprocessor design.

I think for now we'll have to just mark those as illegal ?
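
The dispatch described above could look roughly like the following. This is a minimal Python sketch, not Reko's actual API: the function names and the table of disassemblers are hypothetical. It relies only on the documented F-line layout, where bits 15-12 of the opcode word are 1111 and bits 11-9 hold the coprocessor ID.

```python
# Hypothetical table: coprocessor ID -> name of the disassembler to use.
# Only IDs 0 and 1 are fixed by the text above; 2-7 depend on the hardware.
KNOWN_COPROCESSORS = {0: "MMU", 1: "68881/2 FPU"}

def coprocessor_id(opcode_word):
    """Return the cpid of an F-line opcode word, or None if not F-line."""
    if (opcode_word >> 12) & 0xF != 0xF:
        return None
    return (opcode_word >> 9) & 0x7

def select_disassembler(opcode_word, disassemblers=KNOWN_COPROCESSORS):
    """Pick a disassembler by cpid; unknown coprocessors decode as illegal."""
    cpid = coprocessor_id(opcode_word)
    if cpid is None:
        raise ValueError("not an F-line instruction")
    return disassemblers.get(cpid, "illegal")

print(select_disassembler(0xF200))  # 68881/2 FPU
print(select_disassembler(0xFE00))  # illegal
```

Falling back to "illegal" for unknown IDs matches the interim suggestion above; an environment that knows its hardware could supply a fuller table.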

Function code preview bug

Here are the steps to reproduce the bug

  1. Click on any function resolved by Reko (for simplicity, choose the first one in the list).
  2. In the code preview window, try to drill down into another function called from within this function. If there is none, move to another function in the tree view and try step 2 again.
  3. Now go back and click on the function you chose in step 1 in the functions tree view on the left side. You will notice that the code preview window is not updated; you have to choose another function and then go back to the function chosen in step 1 in order to see the code output again.

disable next phase button after user edits

After a user edit - especially after changing a function signature - Reko will be in an inconsistent state. It is not easy to reconcile this inconsistent state, as it would require an analysis from scratch.

Instead, after a user edit, the next phase button will be disabled. Restarting the decompilation should keep all user edits and start the process anew.

Dead code elimination removes stack allocated structures incorrectly

The following code fragment

Foo_struct_t s;
s.x = 3;
bar(&s);

is decompiled to

fn0010000 (fp - 0x0000001C);

Dead code elimination removes the assignment because it can't prove that it is live. To fix this Reko must defer dead code elimination until type analysis determines what arguments the called function has and then that knowledge is propagated to the callee.
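
The failure can be reproduced on a toy statement list. The Python sketch below is an illustration only: the tuple-based IR and the function name are invented, and the "conservative" mode (treating any stack slot whose address is passed to a call as used) is a simpler alternative to the fix proposed above, not that fix itself.

```python
def eliminate_dead_stores(stmts, treat_escapes_as_uses):
    """Backward pass over straight-line code.

    stmts holds ("store", slot, value), ("load", slot) or
    ("call", name, [slots whose address is passed]).
    A store is kept only if its slot may be read later.
    """
    live = set()
    kept = []
    for stmt in reversed(stmts):
        if stmt[0] == "store":
            if stmt[1] in live:
                kept.append(stmt)
                live.discard(stmt[1])
            # otherwise the store looks dead and is dropped
        elif stmt[0] == "load":
            live.add(stmt[1])
            kept.append(stmt)
        else:  # call
            if treat_escapes_as_uses:
                live.update(stmt[2])
            kept.append(stmt)
    kept.reverse()
    return kept

code = [("store", "s.x", 3), ("call", "bar", ["s.x"])]
print(eliminate_dead_stores(code, False))  # store removed: the bug
print(eliminate_dead_stores(code, True))   # store kept
```

The conservative mode keeps more stores than strictly necessary, which is the trade-off that deferring the pass until the callee's arguments are known would avoid.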

C2Xml - Provide a way for specifying additional attributes

It seems it would be advantageous to extend the accepted ANSI C syntax by adding C++11's [[]] attributes.
I have built a simple utility that can produce a list of augmented Amiga library prototypes; it would be great if this information could be passed to reko.
As an example:

//...............
[[reko_location(reg_D0)]] ULONG SetExcept([[reko_location(reg_D0)]] ULONG newSignals,[[reko_location(reg_D1)]] ULONG signalMask);
[[reko_location(reg_D0)]] APTR SetFunction([[reko_location(reg_A1)]] struct Library * library,[[reko_location(reg_A0.W)]] LONG funcOffset,[[reko_location(reg_D0)]] APTR funcEntry);
[[reko_location(reg_D0)]] struct Interrupt * SetIntVector([[reko_partial_location(reg_D0,"0:4")]] ULONG intNumber,[[reko_location(reg_A1)]]  struct Interrupt * interrupt);
[[reko_location(reg_D0)]] ULONG SetSignal([[reko_location(reg_D0)]] ULONG newSignals,[[reko_location(reg_D1)]] ULONG signalMask);
[[reko_location(reg_D0)]] ULONG SetSR([[reko_location(reg_D0)]] ULONG newSR,[[reko_location(reg_D1)]]  ULONG mask);
[[reko_partial_location(reg_D0,"0:8")]] BYTE SetTaskPri([[reko_location(reg_A1)]] struct Task * task,[[reko_partial_location(reg_D0,"0:8")]] LONG priority);
//...............

Only disassemble scanned bytes

Currently disassembly view will try to display all bytes as instructions ( sometimes causing crashes ), maybe we could display all bytes as byte definition instructions until they are marked as code by the scanner ?

Preparing for release of reko v0.5.2.0

Folks,

With all the new features that have come in the past month, it's soon time to make a release. It's going to be a minor release (0.5.1 => 0.5.2) but will have things like:

  • Improved code structuring code (CFG => if/while/select)
  • Enhanced support for platform details (specifically AmigaOS, but more generally, a framework for users specifying their target environments in finer detail than what comes "out of the box")
  • Support for specifying out-of-band characteristics using C++11 attributes in header files.
  • Various bugfixes.

I wanted to poll y'all to see if there are any other features/bugfixes that you'd want tackled for this release.

Condition code handling needs rethought

Reko fails the following sequence

test esi,esi 
jbe Foo

which after SSA transformation and alias expansion turns into

SZO = esi & esi
CZ = SZO
C = 0
CZ = C
branch Test (CZ, ULE)

And expression propagation turns this to

branch (ffalse)

Arrow navigation walks off the memory viewer

  1. Scroll to the middle of the memory viewer.
  2. Use the up arrow key to move the selection up to the top of the view.
  3. Press the arrow key again.

Result:
Selection disappears from screen

Expected:
memory viewer to scroll one line, showing the selection.

"Next Phase" is confusing, consider changing the text as phases progress

The "Next Phase" button on the toolbar is confusing as it isn't clear what it will do. Consider changing the text of the button depending on what phase you're in.

To do this, the QueryStatus() mechanism needs to be expanded to accept a new text string. Suggestion: use a new enumeration, MenuStatus.SetText, which when set looks at the Text property of the MenuStatus and uses that to update the UI element.

Project format needs to change to accommodate user-provided signatures

Realistically, most users will be providing signature definitions in a C-like syntax:

int foo([reko:reg("eax")] arg1)

The current project file format doesn't support this. It assumes all signature definitions are fine-grained expressions in XML:

<signature><ret><reg>eax</reg></ret><arg name="arg1"><reg>eax</reg></arg></signature>

No rational human will put up with that in the long run.

To do: extend the file format to support both kinds of definitions.

Add a simple .gitattributes file and normalize all sources line endings ?

This should help somewhat with running tests on Linux, since some files have Windows line endings and use raw literals, thus embedding \r\n in the expected output (not sure of the proper C# nomenclature for those @" ... " things 😄)

something like :

*        text=auto
*.cs     text diff=csharp
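
The effect being worked around can be shown in a few lines. This Python sketch is only an illustration of the mismatch, with made-up sample strings: an expected value that was checked out with CRLF endings does not compare equal to output generated with LF endings until the line endings are normalized.

```python
def normalize_newlines(text):
    """Collapse Windows line endings to LF before comparing."""
    return text.replace("\r\n", "\n")

# Expected output as it would be embedded from a file with CRLF endings.
expected = "void foo()\r\n{\r\n}\r\n"
# Output produced on a platform whose newline is LF.
actual = "void foo()\n{\n}\n"

print(expected == actual)                      # False
print(normalize_newlines(expected) == actual)  # True
```

Normalizing in the repository via .gitattributes avoids having to do this in every test.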

Implement text range selection in TextView

Recall that the RichEdit Control was unsuitable for reko, and Windows Forms has no other way of displaying rich text. TextView is a poor man's version of GtkTextView which seems to serve adequately. The problem is that there is no way for users to mark a text selection. This precludes copying snippets of code, which is a fairly reasonable expectation. Today reko cheeses out by providing a "Copy All" verb, but it certainly could be better.

To implement this, TextView needs the usual assortment of Selection range plumbing: where the selection starts (the anchor), where it ends. A TextSpan must be able to perform hit-detection based on an X-coordinate and return the index position within the span that was selected. TextView needs a CurrentSelection that returns a TextSelection object consisting of all the TextSpans, both partially and fully selected. TextSpans, when rendering, must know what portion of them is selected and what is not so that they render colors correctly.
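
A rough shape for that plumbing, as a Python sketch rather than the eventual C# design: all class, field and function names here are hypothetical, and a monospaced font is assumed so that hit-detection is a simple division. Positions are (span index, character index) pairs.

```python
from dataclasses import dataclass

@dataclass
class TextSpan:
    text: str
    x: float           # left edge of the span, in pixels
    char_width: float  # assumes a monospaced font

    def hit_test(self, px):
        """Character index within the span nearest to x-coordinate px."""
        index = int((px - self.x) / self.char_width + 0.5)
        return max(0, min(len(self.text), index))

def selected_parts(spans, anchor, caret):
    """Return (span index, start, end) for each span touched by the
    selection, whichever direction the user dragged in."""
    start, end = sorted([anchor, caret])
    parts = []
    for i in range(start[0], end[0] + 1):
        lo = start[1] if i == start[0] else 0
        hi = end[1] if i == end[0] else len(spans[i].text)
        parts.append((i, lo, hi))
    return parts

spans = [TextSpan("mov ", 0, 8), TextSpan("eax", 32, 8), TextSpan(",ebx", 56, 8)]
caret = (1, spans[1].hit_test(41))
parts = selected_parts(spans, (0, 2), caret)
print("".join(spans[i].text[lo:hi] for i, lo, hi in parts))  # "v e"
```

Each (start, end) pair is also what a span would need at render time to decide which portion to draw in the selection colors.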

finish win16 API description

Today Reko has no knowledge of the win16 API. It needs a metadata file to support friendly names for the Windows API calls the subject program is making

Discussion: how aggressive should reko be with constant propagation?

@halsten gave me a binary that was decompiled by reko in an amusing way. In pseudo-C, we have:

int foo() {
    return 0;
}

and the call site:

    // other code elided
    global_var = foo();

The compiler settings @halsten used generated the following code:

_foo proc
    xor eax,eax
    ret
    endp

and the call site:

    ;; other code elided
    call _foo
    mov ds:[0x12341234],eax

When reko does its global "registers trashed" analysis, it discovers that at the end of _foo, the EAX register is always 0. This constant 0 is then propagated to all call sites of _foo, so that they become (in asm format, reko uses its internal RTL format as you may be aware):

    call _foo
    mov dword ptr ds:[0x12341234],0   ; <-- Constant 0 propagated

After the registers modified pass, reko performs a liveness analysis to see what values are live-in and live-out of procedures. Since all the callers of _foo will have a 0 constant propagated, EAX will never be live-out from _foo. Therefore, reko assumes that EAX is dead at the exit of _foo, and changes the code to:

_foo proc
    ret
    endp

When reko later translates to C we get

void foo() {
}

and

    foo();
    g_dw12341234 = 0;

To an external observer, the behavior of the decompiled program is the same as the original program; they both modify the same global variable and set it to the same value. However, the decompiled code is not the same as the source code.

So why does reko do this aggressive constant propagation? One of the binaries I have been training on -- which alas I can't share due to legal encumbrances -- is an x86 real-mode binary which does this:

some_func proc
    ;;; lots of stuff not relevant to the discussion
    push 0x1234
    pop ds    ; Set DS to 0x1234
    ret

and the (only) calling function looks something like this:

calling_func proc
    push 0x1234
    pop ds
    ;; lots of code using DS = 0x1234
    push cs
    pop ds              ; DS no longer is 0x1234
    ;; code using altered value of DS
    call some_func
    ;;; after the call, DS is back to 0x1234
    ;; code that relies on DS being 0x1234 follows
    ;;...then later, a computed goto appears
    mov bx,[some_global_var]
    add bx,bx
    jmp ds:[bx+0x5678]   ;; jump to the bx'th offset

Rather than the usual PUSH DS // stuff // POP DS idiom for saving and restoring a register value, the programmers relied on some_func returning a constant value in DS, and a few instructions later there is a computed goto where the entries in the jump table are referred to using DS.

If reko didn't know that some_func returned a constant 0x1234 in DS, it couldn't resolve the jump table -- it would basically have to give up or ask the oracle (read "user") for help. By doing the constant propagation, reko can resolve the jump table by itself.

The question is whether reko should be doing the aggressive constant propagation, given that it may result in the decompiled files not matching the original source. My position is that as long as the program's behavior is still correct, it's OK to do this. So much information is lost in the compilation process that you will never get 100% source recovery anyway.

If a user really wants to preserve the original signature, they can override this either in the GUI or in the *.dcproj file. That is, they can force the signature of _foo in the first example to be int foo(), and reko should oblige by generating the expected

int foo() {
    return 0;
}

I'd like to hear what other people think of this. Is my position acceptable here, or should reko behave differently?

Capstone integration

Several people have voiced an interest in integrating Reko with Capstone.NET. Alas, I don't have the time for this just now, but it seems to me that it would be a perfect way to get to learn the codebase. A while back I created a branch, called "capstone", which should be used for this kind of integration.

Let's use this thread to communicate on how this would be done.

Implement the reko console

Reko has a console window that does nothing. This would be a great place to implement a scripting language to allow repetitive tasks to be carried out without using UI gestures.

Some ideas for a scripting language:
<address> scan [<name>] [<reg>=<value>]*
Scans a procedure starting at <address>, giving it an optional name and optionally a set of assumed register / value pairs.

<address1>[,<address2>] type <type-specification>
<address1>[l <byte-length>] type <type-specification>
Set the type of the global variable at <address1>-<address2> to <type-specification>. If the type is finite in size i.e. not an array of undetermined size, there is no need to specify address2 or a length.

[<address1>[,<address2>]] search <byte pattern>
Find all addresses within the optional address range that match the byte pattern. This could be used to feed scan; i.e.
0134,0200 search(#55 8b EC#) | scan
would find a list of addresses that start with the stereotypical push ebp; mov ebp, esp signature and feed them to the procedure scanner.

<address1>[,<address2>][l <length>] wb #00#
<address1>[,<address2>][l <length>] wh #0000#
<address1>[,<address2>][l <length>] ww #00000000#
<address1>[,<address2>][l <length>] wq #0000000000000000#
Writes the byte (halfword, word, doubleword) into the address range specified.
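
To make the search verb concrete, here is a minimal Python sketch of how a #...# byte pattern could be parsed and matched over a loaded image. It is not part of Reko; the function names, the base-address parameter and the sample image are assumptions made for the example, and address2 is treated as inclusive.

```python
def parse_pattern(token):
    """Turn a '#55 8b EC#' token into the bytes b'\\x55\\x8b\\xec'."""
    if len(token) < 2 or not (token.startswith("#") and token.endswith("#")):
        raise ValueError("byte patterns must be wrapped in #...#")
    return bytes.fromhex(token[1:-1])

def search(image, base, pattern, start=None, end=None):
    """Return every address in [start, end] where pattern occurs.

    image is the raw bytes of a segment loaded at address base.
    """
    lo = 0 if start is None else max(start - base, 0)
    hi = len(image) if end is None else min(end - base + 1, len(image))
    hits = []
    pos = image.find(pattern, lo, hi)
    while pos != -1:
        hits.append(base + pos)
        pos = image.find(pattern, pos + 1, hi)
    return hits

image = bytes.fromhex("90 55 8b ec 90 90 55 8b ec c3")
hits = search(image, 0x0134, parse_pattern("#55 8b EC#"), 0x0134, 0x0200)
print([hex(a) for a in hits])  # ['0x135', '0x13a']
```

The resulting address list is the kind of value that the pipe in the example above would hand on to scan.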

Real-mode MSDOS code - switch reconstruction ?

While trying out reko on one of dcc's test files (BENCHFN.EXE) I've encountered a known problem with BuildMapping.

Namely, when trying to reconstruct a 'switch', the engine tries to look up the index table contents from the expression:

mov     bl, BYTE PTR[bx+0x4F9]

Now in 16-bit mode, 0x4F9 is actually the data-segment-relative address of the index table (ds:0x4F9).
Shouldn't 0x4F9 be disassembled/interpreted as a ((ds << 4) + 0x4F9) expression, and converted into a proper Address instance only if the segment value is 'known', i.e.

  • proven to be constant on all paths reaching the given instruction, or
  • provided by the user at the instruction/block/function level?
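To make the arithmetic concrete, here is a small Python sketch of the real-mode address computation. The helper names are hypothetical and are not Reko APIs. The 20-bit mask models an 8086-style wraparound at 1 MiB, which is an assumption; a machine with the A20 line enabled would not wrap.

```python
def linear_address(segment, offset):
    """Real-mode address: (segment << 4) + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


def resolve_table_address(ds, offset):
    """Return the linear address of ds:offset, or None when ds is unknown.

    `ds` is None when the segment value has neither been proven constant
    nor supplied by the user, so the expression must stay symbolic.
    """
    if ds is None:
        return None
    return linear_address(ds & 0xFFFF, offset & 0xFFFF)


# With a hypothetical ds of 0x1000, the index table at ds:0x4F9
# resolves to linear address 0x104F9.
table = resolve_table_address(0x1000, 0x4F9)
```

Returning None for an unknown segment mirrors the question above: the lookup only becomes a concrete address once the segment value is established.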

General mechanism for handling delay slots in architectures like SPARC, MIPS

Reko needs a way to cope with delay slots like those in SPARC and MIPS processors. The RtlClass enumeration has the Delay member that these architectures should use when rewriting instructions to RTL. The next step is to make the Scanner aware of delay slots and act appropriately when it encounters such instructions.
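One way to picture what "acting appropriately" could mean is the toy Python sketch below, which hoists the delay slot instruction ahead of the transfer it follows. This is only an illustration under strong assumptions and is not how Reko's Scanner works: `Instr` and `linearize` are made-up names, annulled delay slots are ignored, and the slot instruction is assumed not to modify anything the branch reads.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    addr: int
    text: str
    delayed: bool = False  # True for a transfer that has a delay slot


def linearize(instrs):
    """Return instruction texts in effective execution order.

    A delayed transfer and the instruction in its slot are swapped, so the
    slot instruction appears before the control transfer takes effect.
    """
    out = []
    i = 0
    while i < len(instrs):
        ins = instrs[i]
        if ins.delayed and i + 1 < len(instrs):
            out.append(instrs[i + 1].text)  # delay slot executes first
            out.append(ins.text)            # then the transfer happens
            i += 2
        else:
            out.append(ins.text)
            i += 1
    return out


# Hypothetical MIPS fragment: the move in the delay slot sets up the
# argument for the call.
code = [Instr(0x00, "jal f", True), Instr(0x04, "move $a0,$s0"), Instr(0x08, "nop")]
ordered = linearize(code)
```

A real implementation would also have to handle a slot instruction that changes a register the branch condition depends on, for example by evaluating the condition into a temporary first.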

Decompiled window should scroll and allow navigation

Add a scroll bar to the decompiler window, allowing scrolling about.
Support the use of keyboard to scroll the window as well.
Right-clicking on a line in the decompiler window should at least show "Mark procedure entry", and possibly also "Show memory", which would navigate to the memory control.

Code preview coloring for resolved functions is not working

Functions resolved by Reko are no longer colored when viewed in the code preview window; this was lost with this update. Can we please get it back, and also implement coloring for the output code? For example, could I have the ability to choose a different coloring scheme, as in VS?

Change code view to display procedure metadata

Today the procedure view only displays IR or finished C code. It would be beneficial if the real estate on the right side were better used. I am a fan of direct editing and avoiding modal dialogs wherever possible.

Imagine the code view as a two pane view. Left side will be code as today. Right side will have a tab control with at least the following tabs:

  • procedure name and signature (editable). Renaming procedures should be "free" since the procedure name doesn't affect the workings of Reko. Changing the procedure signature does have consequences: you will need to restart Reko after changing procedure signatures.
  • Registers used / modified. This is lower level information. Obviously changing Registers will affect the procedure signature and will also require a restart.
  • procedure characteristics such as [[noreturn]]

doesn't document where the applications are built to

"A command-line, a Windows GUI, and a ASP.NET front end exist at the time of writing. "

So where do they actually get built to? Please update the main description to indicate.

Or better yet, just provide pre-built binaries for people who just want to see what the tool is capable of, without spending an hour installing VS + trying to compile

How to add Platform options ?

I'd like to make use of additional information when ?constructing? Platform instances:

For example, adding option to select different kickstart versions when using AmigaOSPlatform.
Depending on the kickstart version, reko should load different "exec.funcs" definitions, type libraries etc.

I think this would require a change in the way ImageLoaders and UI work:

  • ImageLoader creates Platform instance
  • Ui checks if Platform instance provides options
  • if yes
    • Ui shows a Platform options dialog filled with default values
    • User changes the options/accepts them/ options are applied.

Another option would be to add a Platform setting to the menu?

Things we might want to add in far off future:

Selectable hardware definition modules:

  • MSDOS-era PC hardware, e.g. a VGA module, which would extend the memory map with the video-memory area, add smart translators for OUT/IN instructions, provide video interrupt conversion routines (INT 0x10), etc.
  • Base board configuration modules ( assign selected peripheral to given IO space )

OS module configuration:

  • Selecting OS version ( AmigaOS kickstart revision etc)

TextView should support annotation

An annotation would be visible as a little icon next to the annotated item. When clicked, the annotation should open up and display an editor for annotations. For disassembly window, these could be:

  • register value assumptions ("assume eax is 0x3000 at this point" )
  • register / memory type assumptions ("assume eax is of type int (*__cdecl)(double, char *)" )
  • General comments, which should be emitted in the final C code.

Heuristics scanning roadplan

Reko almost always fails to recover all of the code of a binary. Being able to do so in general would be tantamount to being able to distinguish code from data, and therefore to solving the Halting Problem. Even though this is impossible, Reko can make a good effort at recovering as much as it can automatically to save human users the bother of wallowing through code dumps.

In Reko, the scanner is the stage which discovers code and rewrites it to an architecture-neutral RTL form. Today the scanner is recursive descent, with some hacks to manage jump tables. It should be possible to substitute another scanner that does more analysis of the binary to try to find other code that it can't prove to be reachable using the original recursive descent.

Heuristic scanning is the term I'm using for this approach. I'm writing a new scanner implementation, in some parts based on Static Disassembly of Obfuscated Binaries.

Some of this is already in place, but now I need to implement an efficient Trie to collect statistics on the binary that has been scanned so far, or even use statistics from a corpus of existing binaries. Once the trie is implemented, and the corresponding InstructionComparer, it can collect statistics on commonly encountered basic block and procedure beginnings. These will be used in the implementation of the heuristic determination of function locations.
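The kind of statistics described above can be sketched with a simple counting trie. The Python below is only an illustration, not the planned implementation: `InstructionTrie` is a made-up name, and plain normalized strings stand in for whatever the InstructionComparer would treat as equal instructions.

```python
class TrieNode:
    __slots__ = ("count", "children")

    def __init__(self):
        self.count = 0       # number of sequences passing through this node
        self.children = {}   # instruction key -> TrieNode


class InstructionTrie:
    """Counts how often each instruction prefix has been seen."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, sequence):
        """Record one instruction sequence, e.g. the start of a basic block."""
        node = self.root
        for key in sequence:
            node = node.children.setdefault(key, TrieNode())
            node.count += 1

    def count(self, prefix):
        """Return how many recorded sequences begin with `prefix`."""
        node = self.root
        for key in prefix:
            node = node.children.get(key)
            if node is None:
                return 0
        return node.count


# Hypothetical procedure beginnings with operands already normalized.
trie = InstructionTrie()
trie.add(["push ebp", "mov ebp,esp", "sub esp,imm"])
trie.add(["push ebp", "mov ebp,esp", "push ebx"])
trie.add(["xor eax,eax", "ret"])
```

A heuristic scanner could then score a candidate address by how frequently the instructions found there match a common prefix in the trie.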

Assigning the correct function addresses

So far, reko doesn't recognize the function's EP for a PE file correctly. In most PE files the EP is at 401000 by default; however, reko doesn't respect that and assigns function addresses by its own logic. I am unaware of how it does this internally. But I expect that if I choose function 401000 from the left side panel in the GUI, I should be taken to the main EP of the executable instead of having to drill down, which takes a lot of time.

Import stubs should be marked as not callable

Currently reko will sometimes want to decompile the import stubs of a PE executable. This leads to extra code that really shouldn't show up. Reko should mark it as "non-decompilable", or library code, and not emit it.

Improve numeric formatting

Currently reko emits numeric constants as 0x[nn...n], where [nn...n] is a string of hexadecimal digits whose length always equals the number of nybbles that fit in the datum size in question. That means that an 8-bit byte being assigned looks like this:

c = 0x01

while a 64-bit word assignment looks like this

c = 0x0000000000000001

which makes it hard to read.

We still want to maintain this form of output in the intermediate code dumps as it makes diagnostics easier, but the final output should look more like what a C programmer would have written. Therefore:

  • For character constants, 8- and 16-bit, reko should emit a character constant according to C rules:
c1 = 'a';  // instead of 0x61
c2 = '\n'; // instead of 0x0A
c3 = '\xFF'; // instead of 0xFF
wc4 = L'\uFEFF'; // instead of 0xFEFF
  • For numeric values, use an entropic measurement to see if the numeric representation has less entropy in hexadecimal rather than decimal, and if so, emit them in hex. Unsigned numbers will have a bias towards hexadecimal while signed numbers will have a bias toward decimal.
w1 = 0x7FFFFFFF;
w2 = 10000;  // not 0x2710
w3 = 1000;   // not 0x3E8
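The entropy idea above can be tried out with a few lines of Python. This is a sketch under assumptions, not Reko's algorithm: it uses the per-character Shannon entropy of the digit string as the measurement, resolves ties in favor of decimal, and exposes the signed/unsigned preference as a hypothetical `hex_bias` parameter.

```python
import math
from collections import Counter


def digit_entropy(digits):
    """Shannon entropy, in bits per character, of a digit string."""
    counts = Counter(digits)
    n = len(digits)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def format_constant(value, hex_bias=0.0):
    """Render a non-negative integer in hex or decimal.

    Hex is chosen when its digit string has lower entropy than the decimal
    one. A negative `hex_bias` favors hex (e.g. for unsigned values) and a
    positive one favors decimal (e.g. for signed values).
    """
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    dec = str(value)
    hexd = format(value, "X")
    if digit_entropy(hexd) + hex_bias < digit_entropy(dec):
        return "0x" + hexd
    return dec


samples = [format_constant(v) for v in (0x7FFFFFFF, 10000, 1000)]
```

With these choices the three values from the list come out as 0x7FFFFFFF, 10000 and 1000, because 7FFFFFFF is far more repetitive than 2147483647, while 10000 and 1000 are more repetitive than 2710 and 3E8.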

Mips PE Executable

Cannot decompile mips exe.
Error message: unsupported machine type 0x0166 in PE header.

I read somewhere that reko supports MIPS partially. Is that true?

Wrong link in the 'About' window

In the 'About' window there is a link:
https://github/uxmal/reko
(github without .com)
Right link:
https://github.com/uxmal/reko
