lifting-bits / mcsema

Framework for lifting x86, amd64, aarch64, sparc32, and sparc64 program binaries to LLVM bitcode

Home Page: https://www.trailofbits.com/expertise/mcsema

License: GNU Affero General Public License v3.0

CMake 8.05% C++ 53.57% Shell 1.19% C 5.14% Python 31.01% GDB 0.53% Makefile 0.06% Batchfile 0.08% Dockerfile 0.37%
x86 x86-64 aarch64 llvm llvm-ir llvm-bitcode ida binary-analysis sparc sparc64

mcsema's People

Contributors

aiethel, alessandrogario, artemdinaburg, burntfalafel, computerality, dbwodlf3, dguido, ekilmer, erupmi, fkil, garretreece, hugin, josh2059, krx, kumarak, kylemiles, meme, memto, mewmew, mike-myers-tob, moshekaplan, pgoodman, sdasgup3, sineaggi, thestr4ng3r, tkmru, vanhauser-thc, volpino, yu-chenchang, yuki256

mcsema's Issues

Failures on i386 (ubuntu 14.04)

Artem,

Thanks for those fixes, we have now got bitcode for both zlib and thttpd.

Not sure if this interests you, but 64 of the build tests fail:

[==========] 502 tests from 1 test case ran. (40422 ms total)
[  PASSED  ] 438 tests.
[  FAILED  ] 64 tests, listed below:
[  FAILED  ] ModuleTest.ABS_F
[  FAILED  ] ModuleTest.ENTER
[  FAILED  ] ModuleTest.F2XM1
[  FAILED  ] ModuleTest.FABS
[  FAILED  ] ModuleTest.FBLD
[  FAILED  ] ModuleTest.FBSTP
[  FAILED  ] ModuleTest.FCHS
[  FAILED  ] ModuleTest.FCLEX
[  FAILED  ] ModuleTest.FCMOVB
[  FAILED  ] ModuleTest.FCMOVBE
[  FAILED  ] ModuleTest.FCMOVE
[  FAILED  ] ModuleTest.FCMOVNB
[  FAILED  ] ModuleTest.FCMOVNBE
[  FAILED  ] ModuleTest.FCMOVNE
[  FAILED  ] ModuleTest.FCMOVNU
[  FAILED  ] ModuleTest.FCMOVU
[  FAILED  ] ModuleTest.FCOM
[  FAILED  ] ModuleTest.FCOMIP_STFr
[  FAILED  ] ModuleTest.FCOMI_STFr
[  FAILED  ] ModuleTest.FCOMP
[  FAILED  ] ModuleTest.FCOMPP
[  FAILED  ] ModuleTest.FCOMP_F32m
[  FAILED  ] ModuleTest.FCOMP_F64m
[  FAILED  ] ModuleTest.FCOMP_STFr
[  FAILED  ] ModuleTest.FCOM_F32m
[  FAILED  ] ModuleTest.FCOM_F64m
[  FAILED  ] ModuleTest.FCOM_STFr
[  FAILED  ] ModuleTest.FCOS
[  FAILED  ] ModuleTest.FDECSTP
[  FAILED  ] ModuleTest.FFREE
[  FAILED  ] ModuleTest.FICOMP_16m
[  FAILED  ] ModuleTest.FICOMP_32m
[  FAILED  ] ModuleTest.FICOM_16m
[  FAILED  ] ModuleTest.FICOM_32m
[  FAILED  ] ModuleTest.FINCSTP
[  FAILED  ] ModuleTest.FINIT
[  FAILED  ] ModuleTest.FISTTP_16m
[  FAILED  ] ModuleTest.FISTTP_32m
[  FAILED  ] ModuleTest.FISTTP_64m
[  FAILED  ] ModuleTest.FLDENV
[  FAILED  ] ModuleTest.FLDL2E
[  FAILED  ] ModuleTest.FLDL2T
[  FAILED  ] ModuleTest.FLDLG2
[  FAILED  ] ModuleTest.FLDLN2
[  FAILED  ] ModuleTest.FLDPI
[  FAILED  ] ModuleTest.FNCLEX
[  FAILED  ] ModuleTest.FNINIT
[  FAILED  ] ModuleTest.FNOP
[  FAILED  ] ModuleTest.FNSAVE
[  FAILED  ] ModuleTest.FPATAN
[  FAILED  ] ModuleTest.FPREM
[  FAILED  ] ModuleTest.FPREM1
[  FAILED  ] ModuleTest.FRSTOR
[  FAILED  ] ModuleTest.FSAVE
[  FAILED  ] ModuleTest.FSINCOS
[  FAILED  ] ModuleTest.FSQRT
[  FAILED  ] ModuleTest.FTST
[  FAILED  ] ModuleTest.FUCOMIP_STFr
[  FAILED  ] ModuleTest.FUCOMI_STFr
[  FAILED  ] ModuleTest.FXAM
[  FAILED  ] ModuleTest.FXRSTOR
[  FAILED  ] ModuleTest.FXSAVE
[  FAILED  ] ModuleTest.FXTRACT
[  FAILED  ] ModuleTest.LEAVE

64 FAILED TESTS

[100%] Built target run_testSemantics

Zlib issue on i386.

After producing the control flow graph with IDA Pro, cfg_to_bc crashes.

The troublesome command is:

cfg_to_bc -i zlib.cfg -driver=mcsema_main,main,2,return,C -o zlib.bc

The assertion failure is:

cfg_to_bc: /home/cib/Repositories/mcsema/mc-sema/../llvm-3.5/include/llvm/ADT/SmallVector.h:145: const T& llvm::SmallVectorTemplateCommon<T, <template-parameter-1-2> >::operator[](unsigned int) const [with T = llvm::MCOperand; <template-parameter-1-2> = void; llvm::SmallVectorTemplateCommon<T, <template-parameter-1-2> >::const_reference = const llvm::MCOperand&]: Assertion `begin() + idx < end()' failed.
make: *** [ida_mcsema] Aborted (core dumped)

The cfg file in question is here:

http://www.csl.sri.com/~iam/zlib.cfg.zip

stderr causing problems.

We are trying to lift a simple helloworld.c example on Ubuntu 14.0 i386 that prints to stderr rather than the implicit stdout.

bitcode_from_cfg/cfg_to_bc -i version0.cfg -driver=mcsema_main,main,2,return,C -o version0.bc
Already have driver for: main
Inserted function: sub_8000000
inserting global data section named data_0x800007d
error:
Line: 90
File: /home/cib/Repositories/mcsema/mc-sema/cfgToLLVM/x86Instrs_MOV.h
Could not find external: stderr

Is there something we can do to make progress here?

We are still trying to figure out where and why our lifted bitcode is crashing with SIGSEGV.

Build error Windows 7 x86 VS2015 x86

Hi,
I am trying to compile mcsema using CMake on Windows 7 (x86) with the Visual C++ compiler version 19.00.23506 (VS2015). Unfortunately, I get the following error:

[97%] Built target cfgToLLVM
[98%] Built target peToCFG
[98%] Built target pe-parser-library
NMAKE : fatal error U1073: don't know how to make 'boost\lib\boost_filesystem--mt-gd-1_52.lib' Stop.
NMAKE : fatal error U1077: '"C:\Program Files\Microsoft Visual Studio 14.0\VC\BIN\nmake.exe"' : return code '0x2' Stop.
NMAKE : fatal error U1077: '"C:\Program Files\Microsoft Visual Studio 14.0\VC\BIN\nmake.exe"' : return code '0x2' Stop.

I also looked in mcsema/build/boost/lib and saw that the compiled boost_filesystem library is named boost_filesystem-vc-mt-gd-1_52.lib, which does not match the name NMAKE is looking for.
Any help, please?

Can't use IDA to generate .cfg file

I can't use IDA to generate the cfg file; it always shows this error, but bin_descend.exe works fine.

When I test dll_test_6.bat, bin_descend.exe crashes, and IDA doesn't work either.

How can I solve this?

demo_maze.sh not working on Ubuntu 14.04 LTS

Hello,
I am running into problems when trying mcsema on C code that I compiled myself, and also with demo_maze.sh (the sailboat demo works fine). The errors come from the bin_descend program.

Here's a complete log of the error:

$ ./demo_maze.sh
Using bin_descend to recover CFG
Disassembly not guided by outside facts.
Use :'../../build/mc-sema/bin_descend//bin_descend-p <protobuff>' to feed information to guide the disassembly
Disassembly not guided by outside facts.
Use: -p <protobuff>' to feed information to guide the disassembly
Looking at Object File section: .interp
Found symbol: .interp in .interp
Looking at Object File section: .init
Found symbol: .init in .init
Found symbol: _init in .init
Looking at Object File section: .plt
Found symbol: .plt in .plt
Looking at Object File section: .text
Found symbol: .text in .text
Found symbol: deregister_tm_clones in .text
Found symbol: register_tm_clones in .text
Found symbol: __do_global_dtors_aux in .text
Found symbol: frame_dummy in .text
Found symbol: __libc_csu_fini in .text
Found symbol: __x86.get_pc_thunk.bx in .text
Found symbol: draw in .text
Found symbol: __libc_csu_init in .text
Found symbol: _start in .text
Found symbol: main in .text
Looking at Object File section: .fini
Found symbol: .fini in .fini
Found symbol: _fini in .fini
Looking at Object File section: .rodata
Found symbol: .rodata in .rodata
Found symbol: _IO_stdin_used in .rodata
Found symbol: _fp_hw in .rodata
Looking at Object File section: .eh_frame_hdr
Found symbol: .eh_frame_hdr in .eh_frame_hdr
Looking at Object File section: .eh_frame
Found symbol: .eh_frame in .eh_frame
Found symbol: __FRAME_END__ in .eh_frame
Looking at Object File section: .jcr
Found symbol: .jcr in .jcr
Found symbol: __JCR_LIST__ in .jcr
Found symbol: __JCR_END__ in .jcr
Looking at Object File section: .got
Found symbol: .got in .got
Looking at Object File section: .got.plt
Found symbol: .got.plt in .got.plt
Found symbol: _GLOBAL_OFFSET_TABLE_ in .got.plt
Looking at Object File section: .data
Found symbol: .data in .data
Found symbol: data_start in .data
Found symbol: _edata in .data
Found symbol: __data_start in .data
Found symbol: __dso_handle in .data
Found symbol: maze in .data
Found symbol: __TMC_END__ in .data
Looking at Object File section: .bss
Found symbol: .bss in .bss
Found symbol: completed.6590 in .bss
Found symbol: _end in .bss
Found symbol: __bss_start in .bss
addDataEntryPoints: looking for entry points in: .interp
addDataEntryPointsFromSectionBounds are: 8048154 to 8048167
addDataEntryPoints: skipping non-data section: .init
addDataEntryPoints: skipping non-data section: .plt
addDataEntryPoints: skipping non-data section: .text
addDataEntryPoints: skipping non-data section: .fini
addDataEntryPoints: looking for entry points in: .rodata
addDataEntryPointsFromSectionBounds are: 80488e8 to 804894d
addDataEntryPoints: looking for entry points in: .eh_frame_hdr
addDataEntryPointsFromSectionBounds are: 8048950 to 8048974
addDataEntryPoints: looking for entry points in: .eh_frame
addDataEntryPointsFromSectionBounds are: 8048974 to 8048a04
addDataEntryPoints: looking for entry points in: .jcr
addDataEntryPointsFromSectionBounds are: 8049f10 to 8049f14
addDataEntryPoints: looking for entry points in: .got
addDataEntryPointsFromSectionBounds are: 8049ffc to 804a000
addDataEntryPoints: looking for entry points in: .got.plt
addDataEntryPointsFromSectionBounds are: 804a000 to 804a024
addDataEntryPoints: looking for entry points in: .data
addDataEntryPointsFromSectionBounds are: 804a024 to 804a079
addDataEntryPoints: looking for entry points in: .bss
addDataEntryPointsFromSectionBounds are: 804a07c to 804a080
We have 1 entry points
Calling getFunc on: 8048560
getFunc: Starting at 0x8048560
getFunc: toVisit size is: 1
Processing block: block_0x8048560
8048560:    pushl   %ebp
8048561:    movl    %esp, %ebp
8048563:    pushl   %ebx
8048564:    pushl   %edi
8048565:    pushl   %esi
8048566:    subl    $156, %esp
804856c:    movl    12(%ebp), %eax
804856f:    movl    8(%ebp), %ecx
8048572:    movl    $0, %edx
8048577:    movl    $28, %esi
804857c:    leal    -72(%ebp), %edi
804857f:    leal    134520876, %ebx
Adding local data ref to: 804a02c
8048585:    movl    $0, -16(%ebp)
804858c:    movl    %ecx, -20(%ebp)
804858f:    movl    %eax, -24(%ebp)
8048592:    movl    $0, -44(%ebp)
8048599:    movl    $1, -28(%ebp)
80485a0:    movl    $1, -32(%ebp)
80485a7:    movl    -28(%ebp), %eax
80485aa:    imull   $11, -32(%ebp), %ecx
80485b1:    addl    %ecx, %ebx
80485b3:    movb    $88, (%ebx,%eax)
80485b7:    movl    $0, (%esp)
80485be:    movl    %edi, 4(%esp)
80485c2:    movl    $28, 8(%esp)
80485ca:    movl    %edx, -76(%ebp)
80485cd:    movl    %esi, -80(%ebp)
80485d0:    calll   -645
find_import_name: Doing extra deref
Adding: 0x8048350 as target because its a call target
Symbol not found, maybe a local call
relocate_addr: Relocation lives in: .text
relocate_addr: Offset is: 0
relocate_addr: Could not find reloc ref for: 80485d1
Could not relocate addr for local call at: 80485d0
Assuming address should not be relocated
Found local call to: 8048350
Adding: 0x8048350 as target because its a non-relocateable internal call
80485d5:    movl    %eax, -84(%ebp)
80485d8:    cmpl    $28, -44(%ebp)
80485df:    jge 594
Adding block: 8048837
Adding block: 80485e5
Processing block: block_0x80485e5
80485e5:    movl    -28(%ebp), %eax
80485e8:    movl    %eax, -36(%ebp)
80485eb:    movl    -32(%ebp), %eax
80485ee:    movl    %eax, -40(%ebp)
80485f1:    movl    -44(%ebp), %eax
80485f4:    movsbl  -72(%ebp,%eax), %eax
80485f9:    movl    %eax, %ecx
80485fb:    subl    $114, %ecx
80485fe:    movl    %eax, -88(%ebp)
8048601:    movl    %ecx, -92(%ebp)
8048604:    jg  45
Adding block: 8048637
Adding block: 804860a
Processing block: block_0x804860a
804860a:    jmp 0
Adding block: 804860f
Processing block: block_0x804860f
804860f:    movl    -88(%ebp), %eax
8048612:    subl    $97, %eax
8048615:    movl    %eax, -96(%ebp)
8048618:    je  97
Adding block: 804867f
Adding block: 804861e
Processing block: block_0x804861e
804861e:    jmp 0
Adding block: 8048623
Processing block: block_0x8048623
8048623:    movl    -88(%ebp), %eax
8048626:    subl    $100, %eax
8048629:    movl    %eax, -100(%ebp)
804862c:    je  93
Adding block: 804868f
Adding block: 8048632
Processing block: block_0x8048632
8048632:    jmp 104
Adding block: 804869f
Processing block: block_0x804869f
804869f:    leal    134514931, %eax
Adding local data ref to: 80488f3
80486a5:    movl    %eax, (%esp)
80486a8:    calll   -845
find_import_name: Doing extra deref
Adding: 0x8048360 as target because its a call target
Symbol not found, maybe a local call
relocate_addr: Relocation lives in: .text
relocate_addr: Offset is: 0
relocate_addr: Could not find reloc ref for: 80486a9
Could not relocate addr for local call at: 80486a8
Assuming address should not be relocated
Found local call to: 8048360
Adding: 0x8048360 as target because its a non-relocateable internal call
80486ad:    leal    134514971, %ecx
Adding local data ref to: 804891b
80486b3:    movl    %ecx, (%esp)
80486b6:    movl    %eax, -112(%ebp)
80486b9:    calll   -862
find_import_name: Doing extra deref
Adding: 0x8048360 as target because its a call target
Symbol not found, maybe a local call
relocate_addr: Relocation lives in: .text
relocate_addr: Offset is: 0
relocate_addr: Could not find reloc ref for: 80486ba
Could not relocate addr for local call at: 80486b9
Assuming address should not be relocated
Found local call to: 8048360
Adding: 0x8048360 as target because its a non-relocateable internal call
80486be:    movl    $4294967295, %ecx
80486c3:    movl    $4294967295, (%esp)
80486ca:    movl    %eax, -116(%ebp)
80486cd:    movl    %ecx, -120(%ebp)
80486d0:    calll   -837
find_import_name: Doing extra deref
Adding: 0x8048390 as target because its a call target
Symbol not found, maybe a local call
relocate_addr: Relocation lives in: .text
relocate_addr: Offset is: 0
relocate_addr: Could not find reloc ref for: 80486d1
Could not relocate addr for local call at: 80486d0
Assuming address should not be relocated
Found local call to: 8048390
Adding: 0x8048390 as target because its a non-relocateable internal call
80486d5:    leal    134520876, %eax
Adding local data ref to: 804a02c
80486db:    movl    -28(%ebp), %ecx
80486de:    imull   $11, -32(%ebp), %edx
80486e5:    addl    %edx, %eax
80486e7:    movsbl  (%eax,%ecx), %eax
80486eb:    cmpl    $35, %eax
80486f0:    jne 64
Adding block: 8048736
Adding block: 80486f6
Processing block: block_0x80486f6
80486f6:    leal    134514982, %eax
Adding local data ref to: 8048926
80486fc:    movl    %eax, (%esp)
80486ff:    calll   -932
find_import_name: Doing extra deref
Adding: 0x8048360 as target because its a call target
Symbol not found, maybe a local call
relocate_addr: Relocation lives in: .text
relocate_addr: Offset is: 0
relocate_addr: Could not find reloc ref for: 8048700
Could not relocate addr for local call at: 80486ff
Assuming address should not be relocated
Found local call to: 8048360
Adding: 0x8048360 as target because its a non-relocateable internal call
8048704:    leal    134514992, %ecx
Adding local data ref to: 8048930
804870a:    leal    -72(%ebp), %edx
804870d:    movl    %ecx, (%esp)
8048710:    movl    %edx, 4(%esp)
8048714:    movl    %eax, -124(%ebp)
8048717:    calll   -956
find_import_name: Doing extra deref
Adding: 0x8048360 as target because its a call target
Symbol not found, maybe a local call
relocate_addr: Relocation lives in: .text
relocate_addr: Offset is: 0
relocate_addr: Could not find reloc ref for: 8048718
Could not relocate addr for local call at: 8048717
Assuming address should not be relocated
Found local call to: 8048360
Adding: 0x8048360 as target because its a non-relocateable internal call
804871c:    movl    $1, %ecx
8048721:    movl    $1, (%esp)
8048728:    movl    %eax, -128(%ebp)
804872b:    movl    %ecx, -132(%ebp)
8048731:    calll   -934
find_import_name: Doing extra deref
Adding: 0x8048390 as target because its a call target
Symbol not found, maybe a local call
relocate_addr: Relocation lives in: .text
relocate_addr: Offset is: 0
relocate_addr: Could not find reloc ref for: 8048732
Could not relocate addr for local call at: 8048731
Assuming address should not be relocated
Found local call to: 8048390
Adding: 0x8048390 as target because its a non-relocateable internal call
8048736:    leal    134520876, %eax
Adding local data ref to: 804a02c
804873c:    movl    -28(%ebp), %ecx
804873f:    imull   $11, -32(%ebp), %edx
8048746:    addl    %edx, %eax
8048748:    movsbl  (%eax,%ecx), %eax
804874c:    cmpl    $32, %eax
8048751:    je  84
Adding block: 80487ab
Adding block: 8048757
Processing block: block_0x8048757
8048757:    cmpl    $2, -32(%ebp)
804875e:    jne 59
Adding block: 804879f
Adding block: 8048764
Processing block: block_0x8048764
8048764:    leal    134520876, %eax
Adding local data ref to: 804a02c
804876a:    movl    -28(%ebp), %ecx
804876d:    imull   $11, -32(%ebp), %edx
8048774:    addl    %edx, %eax
8048776:    movsbl  (%eax,%ecx), %eax
804877a:    cmpl    $124, %eax
804877f:    jne 26
Adding block: 804879f
Adding block: 8048785
Processing block: block_0x8048785
8048785:    cmpl    $0, -28(%ebp)
804878c:    jle 13
Adding block: 804879f
Adding block: 8048792
Processing block: block_0x8048792
8048792:    cmpl    $11, -28(%ebp)
8048799:    jl  12
Adding block: 80487ab
Adding block: 804879f
Processing block: block_0x804879f
804879f:    movl    -36(%ebp), %eax
80487a2:    movl    %eax, -28(%ebp)
80487a5:    movl    -40(%ebp), %eax
80487a8:    movl    %eax, -32(%ebp)
80487ab:    movl    -36(%ebp), %eax
80487ae:    cmpl    -28(%ebp), %eax
80487b1:    jne 55
Adding block: 80487ee
Adding block: 80487b7
Processing block: block_0x80487b7
80487b7:    movl    -40(%ebp), %eax
80487ba:    cmpl    -32(%ebp), %eax
80487bd:    jne 43
Adding block: 80487ee
Adding block: 80487c3
Processing block: block_0x80487c3
80487c3:    leal    134515011, %eax
Adding local data ref to: 8048943
80487c9:    movl    %eax, (%esp)
80487cc:    calll   -1137
find_import_name: Doing extra deref
Adding: 0x8048360 as target because its a call target
Symbol not found, maybe a local call
relocate_addr: Relocation lives in: .text
relocate_addr: Offset is: 0
relocate_addr: Could not find reloc ref for: 80487cd
Could not relocate addr for local call at: 80487cc
Assuming address should not be relocated
Found local call to: 8048360
Adding: 0x8048360 as target because its a non-relocateable internal call
80487d1:    movl    $4294967294, %ecx
80487d6:    movl    $4294967294, (%esp)
80487dd:    movl    %eax, -136(%ebp)
80487e3:    movl    %ecx, -140(%ebp)
80487e9:    calll   -1118
find_import_name: Doing extra deref
Adding: 0x8048390 as target because its a call target
Symbol not found, maybe a local call
relocate_addr: Relocation lives in: .text
relocate_addr: Offset is: 0
relocate_addr: Could not find reloc ref for: 80487ea
Could not relocate addr for local call at: 80487e9
Assuming address should not be relocated
Found local call to: 8048390
Adding: 0x8048390 as target because its a non-relocateable internal call
80487ee:    leal    134520876, %eax
Adding local data ref to: 804a02c
80487f4:    movl    -28(%ebp), %ecx
80487f7:    imull   $11, -32(%ebp), %edx
80487fe:    addl    %edx, %eax
8048800:    movb    $88, (%eax,%ecx)
8048804:    calll   -857
Adding: 0x80484b0 as target because its a call target
Symbol not found, maybe a local call
relocate_addr: Relocation lives in: .text
relocate_addr: Offset is: 0
relocate_addr: Could not find reloc ref for: 8048805
Could not relocate addr for local call at: 8048804
Assuming address should not be relocated
Found local call to: 80484b0
Adding: 0x80484b0 as target because its a non-relocateable internal call
8048809:    movl    $1, %eax
804880e:    movl    -44(%ebp), %ecx
8048811:    addl    $1, %ecx
8048817:    movl    %ecx, -44(%ebp)
804881a:    movl    $1, (%esp)
8048821:    movl    %eax, -144(%ebp)
8048827:    calll   -1212
find_import_name: Doing extra deref
Adding: 0x8048370 as target because its a call target
Symbol not found, maybe a local call
relocate_addr: Relocation lives in: .text
relocate_addr: Offset is: 0
relocate_addr: Could not find reloc ref for: 8048828
Could not relocate addr for local call at: 8048827
Assuming address should not be relocated
Found local call to: 8048370
Adding: 0x8048370 as target because its a non-relocateable internal call
804882c:    movl    %eax, -148(%ebp)
8048832:    jmp -607
Adding block: 80485d8
Processing block: block_0x80485d8
80485d8:    cmpl    $28, -44(%ebp)
80485df:    jge 594
Adding block: 8048837
Adding block: 80485e5
Processing block: block_0x8048837
8048837:    leal    134515011, %eax
Adding local data ref to: 8048943
804883d:    movl    %eax, (%esp)
8048840:    calll   -1253
find_import_name: Doing extra deref
Adding: 0x8048360 as target because its a call target
Symbol not found, maybe a local call
relocate_addr: Relocation lives in: .text
relocate_addr: Offset is: 0
relocate_addr: Could not find reloc ref for: 8048841
Could not relocate addr for local call at: 8048840
Assuming address should not be relocated
Found local call to: 8048360
Adding: 0x8048360 as target because its a non-relocateable internal call
8048845:    movl    -16(%ebp), %ecx
8048848:    movl    %eax, -152(%ebp)
804884e:    movl    %ecx, %eax
8048850:    addl    $156, %esp
8048856:    popl    %esi
8048857:    popl    %edi
8048858:    popl    %ebx
8048859:    popl    %ebp
804885a:    retl
Processing block: block_0x80487ee
80487ee:    leal    134520876, %eax
Adding local data ref to: 804a02c
80487f4:    movl    -28(%ebp), %ecx
80487f7:    imull   $11, -32(%ebp), %edx
80487fe:    addl    %edx, %eax
8048800:    movb    $88, (%eax,%ecx)
8048804:    calll   -857
Adding: 0x80484b0 as target because its a call target
Symbol not found, maybe a local call
relocate_addr: Relocation lives in: .text
relocate_addr: Offset is: 0
relocate_addr: Could not find reloc ref for: 8048805
Could not relocate addr for local call at: 8048804
Assuming address should not be relocated
Found local call to: 80484b0
Adding: 0x80484b0 as target because its a non-relocateable internal call
8048809:    movl    $1, %eax
804880e:    movl    -44(%ebp), %ecx
8048811:    addl    $1, %ecx
8048817:    movl    %ecx, -44(%ebp)
804881a:    movl    $1, (%esp)
8048821:    movl    %eax, -144(%ebp)
8048827:    calll   -1212
find_import_name: Doing extra deref
Adding: 0x8048370 as target because its a call target
Symbol not found, maybe a local call
relocate_addr: Relocation lives in: .text
relocate_addr: Offset is: 0
relocate_addr: Could not find reloc ref for: 8048828
Could not relocate addr for local call at: 8048827
Assuming address should not be relocated
Found local call to: 8048370
Adding: 0x8048370 as target because its a non-relocateable internal call
804882c:    movl    %eax, -148(%ebp)
8048832:    jmp -607
Adding block: 80485d8
Processing block: block_0x80487ab
80487ab:    movl    -36(%ebp), %eax
80487ae:    cmpl    -28(%ebp), %eax
80487b1:    jne 55
Adding block: 80487ee
Adding block: 80487b7
Processing block: block_0x8048736
8048736:    leal    134520876, %eax
Adding local data ref to: 804a02c
804873c:    movl    -28(%ebp), %ecx
804873f:    imull   $11, -32(%ebp), %edx
8048746:    addl    %edx, %eax
8048748:    movsbl  (%eax,%ecx), %eax
804874c:    cmpl    $32, %eax
8048751:    je  84
Adding block: 80487ab
Adding block: 8048757
Processing block: block_0x804868f
804868f:    movl    -28(%ebp), %eax
8048692:    addl    $1, %eax
8048697:    movl    %eax, -28(%ebp)
804869a:    jmp 54
Adding block: 80486d5
Processing block: block_0x80486d5
80486d5:    leal    134520876, %eax
Adding local data ref to: 804a02c
80486db:    movl    -28(%ebp), %ecx
80486de:    imull   $11, -32(%ebp), %edx
80486e5:    addl    %edx, %eax
80486e7:    movsbl  (%eax,%ecx), %eax
80486eb:    cmpl    $35, %eax
80486f0:    jne 64
Adding block: 8048736
Adding block: 80486f6
Processing block: block_0x804867f
804867f:    movl    -28(%ebp), %eax
8048682:    addl    $4294967295, %eax
8048687:    movl    %eax, -28(%ebp)
804868a:    jmp 70
Adding block: 80486d5
Processing block: block_0x8048637
8048637:    movl    -88(%ebp), %eax
804863a:    subl    $115, %eax
804863d:    movl    %eax, -104(%ebp)
8048640:    je  41
Adding block: 804866f
Adding block: 8048646
Processing block: block_0x8048646
8048646:    jmp 0
Adding block: 804864b
Processing block: block_0x804864b
804864b:    movl    -88(%ebp), %eax
804864e:    subl    $119, %eax
8048651:    movl    %eax, -108(%ebp)
8048654:    jne 69
Adding block: 804869f
Adding block: 804865a
Processing block: block_0x804865a
804865a:    jmp 0
Adding block: 804865f
Processing block: block_0x804865f
804865f:    movl    -32(%ebp), %eax
8048662:    addl    $4294967295, %eax
8048667:    movl    %eax, -32(%ebp)
804866a:    jmp 102
Adding block: 80486d5
Processing block: block_0x804866f
804866f:    movl    -32(%ebp), %eax
8048672:    addl    $1, %eax
8048677:    movl    %eax, -32(%ebp)
804867a:    jmp 86
Adding block: 80486d5
getFunc: Function recovery complete for  func at 8048560
Calling getFunc on: 8048370
getFunc: Starting at 0x8048370
getFunc: toVisit size is: 1
Processing block: block_0x8048370
8048370:    jmpl    *134520852
find_import_name: Doing extra deref
Found a possible jump table!
Not a jump table: no relocation in JMP32m
Heristic jumptable processing couldn't parse jumptable
pointing to: 0x8048370
    jmpl    *134520852
0614547ebf1eb9af7a6ebb0913b4782c
Failure to make module: Generic error: Line: 914
File: /home/federico/git/mcsema/mc-sema/bin_descend/cfg_recover.cpp
Unable to resolve jump.
Failed to open file demo_maze.cfg
Could not process input module: demo_maze.cfg
../../build/llvm-3.5/bin/opt: demo_maze.bc: error: Could not open input file: No such file or directory
../../build/llvm-3.5/bin/llc: demo_maze_opt.bc: error: Could not open input file: No such file or directory
clang: error: no such file or directory: 'demo_maze.o'
./demo_maze.sh: line 24: ./demo_maze_out.exe: No such file or directory
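
The addresses in the log above can be cross-checked by hand. The following is a minimal sketch, assuming the operand printed after calll is the rel32 displacement of a 5-byte near call and that the section bounds printed by addDataEntryPointsFromSectionBounds are half-open ranges. The names call_target, section_of and SECTION_BOUNDS are illustrative and do not come from bin_descend. It is only an aid for reading the log, not a diagnosis of the failure.

```python
from typing import Optional

# A near call with a 32-bit displacement is 5 bytes long:
# one opcode byte (E8) followed by the 4-byte displacement.
CALL_REL32_LENGTH = 5


def call_target(insn_addr: int, displacement: int) -> int:
    """Return the absolute target of a 32-bit near relative call."""
    return (insn_addr + CALL_REL32_LENGTH + displacement) & 0xFFFFFFFF


# Bounds copied from the addDataEntryPointsFromSectionBounds lines in the log.
# Treating the end address as exclusive is an assumption.
SECTION_BOUNDS = {
    ".rodata": (0x80488E8, 0x804894D),
    ".got": (0x8049FFC, 0x804A000),
    ".got.plt": (0x804A000, 0x804A024),
    ".data": (0x804A024, 0x804A079),
    ".bss": (0x804A07C, 0x804A080),
}


def section_of(addr: int) -> Optional[str]:
    """Return the name of the section containing addr, or None."""
    for name, (start, end) in SECTION_BOUNDS.items():
        if start <= addr < end:
            return name
    return None


if __name__ == "__main__":
    # "80485d0: calll -645" followed by "Found local call to: 8048350"
    print(hex(call_target(0x80485D0, -645)))
    # Operand of the failing "jmpl *134520852"
    print(hex(134520852), section_of(134520852))
    # Operand of "leal 134520876, %ebx" ("Adding local data ref to: 804a02c")
    print(hex(134520876), section_of(134520876))
```

Under these assumptions the computed call targets match the "Found local call to" lines, and the memory operand of the failing jmpl falls inside the .got.plt range, which may indicate an indirect jump through a GOT slot rather than a jump table.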

File Type Identification

It looks like you're using a lookup table that maps file extensions to file types. I poke around at malware sometimes, and I generally keep those files without an extension so that I don't run them accidentally. Similarly, when compiling our own code we get a.out, and ".out" isn't in the table.

Can you incorporate a basic magic-bytes check on the file to determine the file type, instead of relying on the extension?
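
A minimal sketch of the kind of magic-bytes check requested above. The function names identify_bytes and identify_file and the returned labels are hypothetical and are not part of mcsema; the signatures themselves are the standard leading bytes of ELF, MZ (DOS/PE) and Mach-O files. Formats without a fixed signature, such as raw COFF objects, would still need another heuristic.

```python
from typing import Optional

# (signature, label) pairs, checked against the first bytes of the file.
MAGIC_SIGNATURES = [
    (b"\x7fELF", "ELF"),
    (b"MZ", "PE/DOS (MZ)"),
    (b"\xfe\xed\xfa\xce", "Mach-O 32-bit (big-endian)"),
    (b"\xce\xfa\xed\xfe", "Mach-O 32-bit (little-endian)"),
    (b"\xfe\xed\xfa\xcf", "Mach-O 64-bit (big-endian)"),
    (b"\xcf\xfa\xed\xfe", "Mach-O 64-bit (little-endian)"),
]

# Longest signature we need to read from disk.
HEADER_LENGTH = max(len(sig) for sig, _ in MAGIC_SIGNATURES)


def identify_bytes(header: bytes) -> Optional[str]:
    """Return a label for the given leading bytes, or None if unknown."""
    for signature, label in MAGIC_SIGNATURES:
        if header.startswith(signature):
            return label
    return None


def identify_file(path: str) -> Optional[str]:
    """Identify a file by its leading bytes, ignoring its extension."""
    with open(path, "rb") as handle:
        return identify_bytes(handle.read(HEADER_LENGTH))
```

With a check like this, a.out or an extension-less sample would be classified by content, and the extension table could remain as a fallback when no signature matches.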

Segmentation Fault

Hi,

I am trying to get an LLVM bitcode file from x86 object code so that I can recompile the bitcode for x86-64 or ARM and run it there, but the resulting executable fails with a segmentation fault. The bitcode works fine when compiled for x86. These are the steps I follow:

A very simple test code:

$ cat test.c
int main() {
return -5;
}

I compile it with the following command to get the object file. (I have to use the -m32 option, because without it the bin_descend tool cannot process the file. Since the main purpose is to get LLVM bitcode, which is target independent, I think this should not matter, but I may be wrong.)
$ clang -ggdb -m32 -c -o test.o test.c

Using the bin_descend tool to get the CFG:
$ bin_descend -d -entry-symbol=main -i=test.o

The command to get the .bc file from the CFG:
$ cfg_to_bc -i test.cfg -driver=main,main,0,return,C -o test.bc

Optimizing the code:
$ opt -O3 test.bc -o test_opt.bc

Now I compile test_opt.bc using llc for the x86-64 target:
$ llc test_opt.bc -o test_opt.s -march=x86-64

Then I use the following command to generate the executable:
$ clang -ggdb test_opt.s

On executing it, I get a segmentation fault:

$ ./a.out
Segmentation fault (core dumped)

When I debug it with gdb, I get this output:

(gdb) b main
Breakpoint 1 at 0x4005d0: file test_opt.s, line 9.
(gdb) r
Starting program: /home/mayur/mcsema/mc-sema/tests/mayur/a.out

Breakpoint 1, main () at test_opt.s:9
9 pushq %r14
(gdb) n
12 pushq %rbx
(gdb) n
15 subq $24, %rsp
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
main () at test_opt.s:37
37 movl $0, (%rax)
(gdb)

I am surprised to see pushq %r14 in the assembly, since I thought r14 was an ARM register. Still, execution gets past that instruction and fails at: movl $0, (%rax)

Can someone please point out what is going wrong, or whether I am using the commands incorrectly? Also, why is bin_descend only able to create a CFG from 32-bit object files? Are 64-bit object files not supported yet?

build error Ubuntu 14.04 x86_64

Hi,
I get this error during the build. Any help, please?

c++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-4.8/README.Bugs> for instructions.
make[2]: *** [mc-sema/validator/x86_64/testSemantics/CMakeFiles/testSemantics.dir/testSemantics.auto.cpp.o] Error 4
make[1]: *** [mc-sema/validator/x86_64/testSemantics/CMakeFiles/testSemantics.dir/all] Error 2
make: *** [all] Error 2

imulq support, x86_64?

Tripped over this in cfg_to_bc with an x86_64 binary:

...
doCallPCExtern paramCount  : 1 : strlen
1 : 2be7780
Unsupported!
401092  imulq   $6, -32(%rbp), %rax

Is this a known limitation with 64-bit, or an indicator of a problem somewhere else? (I am tracking the thttpd branch.)

kernel32_dll demo error

Windows 7 32-bit Ultimate KN

Steps 1 and 2 of the instrument script complete without error, but step 3 fails.

This is the error:

File "C:\Program Files\IDA 6.6\python\idaapi.py", line 601, in IDAPython_ExecScript
execfile(script, g)
File "../../../build/mc-sema/bin_descend/Release//get_cfg.py", line 1108, in <module>
recoverCfg(eps, outf, args.exports_are_apis)
File "../../../build/mc-sema/bin_descend/Release//get_cfg.py", line 744, in recoverCfg
raise Exception("Could not locate entry symbol: {0}".format(name))
Exception: Could not locate entry symbol: DllEntryPoint

Correctly initialize FPU state

When auto-generating a driver, initialize the FPU flags from the current FPU state. After a FINIT the FPU is set to native FPU precision (0x3), but we initialize it to 0x0 (single precision).
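
A minimal sketch of the mismatch described above, assuming the "FPU flags" in question are the precision-control field (bits 8 and 9) of the x87 control word. The helper precision_control and the constant names are illustrative and are not taken from the mcsema sources. The value 0x037F is the control word that FINIT loads according to the Intel manuals.

```python
# x87 control word loaded by FINIT/FNINIT (Intel SDM default).
FINIT_CONTROL_WORD = 0x037F

# Meaning of the 2-bit precision-control (PC) field.
PRECISION_NAMES = {
    0b00: "single precision (24-bit significand)",
    0b01: "reserved",
    0b10: "double precision (53-bit significand)",
    0b11: "double extended precision (64-bit significand)",
}


def precision_control(control_word: int) -> int:
    """Extract the precision-control field (bits 8-9) of an x87 control word."""
    return (control_word >> 8) & 0b11


if __name__ == "__main__":
    native = precision_control(FINIT_CONTROL_WORD)
    print(hex(native), PRECISION_NAMES[native])
    # A driver that zero-initializes the field ends up at single precision.
    print(hex(0x0), PRECISION_NAMES[0x0])
```

The field extracted from the FINIT default is 0x3, while a zero-initialized field selects single precision, which is the discrepancy the issue describes.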

Porting to latest LLVM

Please let me know if there is any plan to port this project to the latest LLVM.

Output differs from BUILDING.md when running demo1.sh

After following the release build steps in BUILDING.md for Linux, I ran:

jlu@jlu-G551JM:~/Public/mcsema/mc-sema/tests$ ./demo1.sh
Using bin_descend to recover CFG
Disassembly not guided by outside facts.
Use :'../../build/mc-sema/bin_descend//bin_descend-p <protobuff>' to feed information to guide the disassembly
low : 0 high : 5
low : 0 high : 5
Disassembly not guided by outside facts.
Use: -p <protobuff>' to feed information to guide the disassembly
TT x86 : i386-unknown-unknown
Looking at Object File section: .text
Found symbol: .text in .text
Found symbol: filler in .text
Found symbol: start in .text
addDataEntryPoints: skipping non-data section: .text
We have 1 entry points
Calling getFunc on: 1
getFunc: Starting at 0x1
getFunc: toVisit size is: 1
Processing block: block_0x1
1: addl $1, %eax
4: retl
getFunc: Function recovery complete for func at 1
Looking up target...
Reading module ...
TT x86 : i386-unknown-unknown
Deserializing functions...
Creating module...
Setting target...
Done setting target
Adding external funcs...
Adding external data...
Adding internal data...
Adding entry points...
Returning modue...
Setting initial triples...
Looking at entry points...
Already have driver for: start
Getting LLVM module...
Converting to LLVM...
Inserted function: sub_1
Adding entry point: demo1_entry
Doing post analysis passes...
in : doPostAnalysis
registering passes
debugging this script\n
CC is clang
./demo1.sh: line 24: clang: command not found
./demo1.sh: line 25: ./demo_driver1.exe: No such file or directory

This differs from the output shown in BUILDING.md from the second line onward. What is going on?

bin_descend: False positive JUMPTABLE entry.

The current jump table algorithm relies heavily on relocation information to find the first and last entity of a table. The add jump table code path is triggered if the operand of a jmp instruction is a relocatable memory location in an executable section (ref: handlePossibleJumpTable). As instructions referring to jump tables may use either positive or negative indexing, both directions are tested to find the beginning and end of the table (ref: addJmpTableEntries with -4 and +4 increments). Each location is tested using the same strategy; if the location is relocatable and if it is in an executable section, mark it as a JUMPTABLE entry. The algorithm works perfectly for code referring to jump tables stored in non-executable sections and its simplicity is quite beautiful.

A problem arises when a jmp instruction is immediately followed by the jump table it refers to. The encoding of the jmp instruction ends with an address (the address of the jump table) that is relocatable and located in an executable section. If there is no alignment padding (e.g. 0xCC or similar) between the jmp instruction and its jump table, the current algorithm incorrectly adds the location holding the jump table address (the jmp operand) as a JUMPTABLE entry. Once the target of that entry is parsed as code, the recursive descent disassembler terminates, because it cannot interpret the jump table bytes as assembly instructions.

Below are two examples, one which the algorithm is able to handle correctly and one which gives a false positive JUMPTABLE entry.

True positive

bin_descend output:

Found a relocation at: 0x477c23, pointing to: 0x477c28
Detect branch via memory, relocation handled later
Found a possible jump table!
Jump table search ending, can't relocate address: 477c24
Added: 0 functions to jmptable
Added JMPTABLE entry [477c28] => 477c60
Added JMPTABLE entry [477c2c] => 477c5c
Added JMPTABLE entry [477c30] => 477c4c
Added JMPTABLE entry [477c34] => 477c38
Jump table search ending, can't relocate address: 477c38
Added: 3 functions to jmptable
Adding block via jmptable: 477c60
Adding block via jmptable: 477c5c
Adding block via jmptable: 477c4c
Adding block via jmptable: 477c38

Assembly:

; The jump table address doesn't directly precede the jump table as there is a
; 0x90 padding byte in between. This is enough to correctly handle the jump
; table.
.text:00477C20                 jmp     ds:off_477C28[edx*4]
.text:00477C27                 db  90h
.text:00477C28 off_477C28      dd offset loc_477C60
.text:00477C2C                 dd offset loc_477C5C
.text:00477C30                 dd offset loc_477C4C
.text:00477C34                 dd offset loc_477C38

False positive

bin_descend output:

Found a relocation at: 0x477cbc, pointing to: 0x477cc0
Detect branch via memory, relocation handled later
Found a possible jump table!
Added JMPTABLE entry [477cbc] => 477cc0 (*** FALSE POSITIVE ***)
Jump table search ending, can't relocate address: 477cb8
Added: 1 functions to jmptable
Added JMPTABLE entry [477cc0] => 477cfe
Added JMPTABLE entry [477cc4] => 477cf8
Added JMPTABLE entry [477cc8] => 477ce8
Added JMPTABLE entry [477ccc] => 477cd0
Jump table search ending, can't relocate address: 477cd0
Added: 3 functions to jmptable
Adding block via jmptable: 477cc0 (*** FALSE POSITIVE ***)
Adding block via jmptable: 477cfe
Adding block via jmptable: 477cf8
Adding block via jmptable: 477ce8
Adding block via jmptable: 477cd0

Assembly:

; BUG: incorrectly adds the address 477CC0 as a JUMPTABLE entry. It is however
; part of the instruction and should therefore not be added.
.text:00477CB9                 jmp     ds:off_477CC0[edx*4]
.text:00477CC0 off_477CC0      dd offset loc_477CFE
.text:00477CC4                 dd offset loc_477CF8
.text:00477CC8                 dd offset loc_477CE8
.text:00477CCC                 dd offset loc_477CD0
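
One possible guard is to reject any candidate slot that overlaps the bytes of the jmp instruction itself. The sketch below models the bidirectional scan in Python rather than in bin_descend's C++, and it is only an illustration of the idea, not the project's code or an agreed fix. `scan_jump_table`, its parameters, and the relocation set are hypothetical; the addresses are taken from the false-positive listing above, the jmp is assumed to be 7 bytes long, and the executable-section check is omitted because everything here lives in .text.

```python
def scan_jump_table(table_start, relocs, insn_start, insn_len, guard=True):
    """Collect 4-byte table slots around table_start, walking both directions.

    relocs is the set of addresses that carry a relocation. With guard=True,
    slots that overlap the jmp instruction's own bytes are rejected.
    """
    insn_end = insn_start + insn_len

    def is_entry(addr):
        if addr not in relocs:
            return False
        # Proposed guard: this slot is the jmp operand, not table data.
        if guard and insn_start <= addr < insn_end:
            return False
        return True

    entries = []
    addr = table_start - 4          # negative indexing
    while is_entry(addr):
        entries.append(addr)
        addr -= 4
    addr = table_start              # positive indexing
    while is_entry(addr):
        entries.append(addr)
        addr += 4
    return sorted(entries)


if __name__ == "__main__":
    # Relocations from the false-positive example: jmp operand plus 4 slots.
    relocs = {0x477CBC, 0x477CC0, 0x477CC4, 0x477CC8, 0x477CCC}
    for guard in (False, True):
        found = scan_jump_table(0x477CC0, relocs, 0x477CB9, 7, guard=guard)
        print(guard, [hex(a) for a in found])
```

Without the guard the scan reproduces the extra [477cbc] entry; with it only the four real slots remain. The true-positive case is unaffected, since the padding byte already stops the backward walk at 477c24.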

translate_CPUID32() returns nothing

I got the latest version of McSema, and my compiler complains that translate_CPUID32() should return InstTransResult but actually returns nothing. I fixed it by just returning "EndBlock". Is that correct?

Failed to deserialize protobuf module

Hi,

I tried to use the linked_elf_test example but I get the following error when translating the CFG to bitcode:

cfg_to_bc -mtriple=i686-pc-linux-gnu -i linked_elf.cfg -driver=mcsema_main,main,2,return,C -o linked_elf.bc
Looking up target...
Reading module ...
 TT x86 : i386-unknown-unknown 
[libprotobuf ERROR /mcsema-master/mc-sema/protobuf-2.5.0/src/google/protobuf/message_lite.cc:123] Can't parse message of type "Module" because it is missing required fields: external_funcs[0].is_weak
Failed to deserialize protobuf module
Returning modue...
Could not process input module: linked_elf.cfg

I used IDA Pro to get the CFG. However, it seems that it cannot handle the printf() call. If I remove the (extern) printf call, everything runs fine.

I run everything on Linux except the IDA Pro part (converting linked_elf to a CFG), which I run on Windows.

Build failed on Windows

I get this error almost at the end of the process (97%). I followed the BUILDING.md tutorial.

Linking CXX executable cfg_to_bc.exe
LINK : fatal error LNK1104: cannot open file 'boost_program_options-vc120-mt-s-1_52.lib'
LINK failed. with 1104
NMAKE : fatal error U1077: '"C:\Program Files (x86)\CMake\bin\cmake.exe"' : return code '0xffffffff'
Stop.
NMAKE : fatal error U1077: '"C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\BIN\nmake.exe"' : return code '0x2'
Stop.
NMAKE : fatal error U1077: '"C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\BIN\nmake.exe"' : return code '0x2'
Stop.

Any idea about how to fix it?

Failing to build on

On Ubuntu 15.04, the build fails with the following error:

Pin app terminated abnormally due to signal 4.
mc-sema/validator/x86_64/valTest/CMakeFiles/tests.out.dir/build.make:49: recipe for target 'mc-sema/validator/x86_64/valTest/CMakeFiles/tests.out' failed
make[2]: *** [mc-sema/validator/x86_64/valTest/CMakeFiles/tests.out] Error 132
CMakeFiles/Makefile2:7509: recipe for target 'mc-sema/validator/x86_64/valTest/CMakeFiles/tests.out.dir/all' failed

Any clue?

Jump table size exception

Hi,
I am trying to disassemble thttpd 2.26 built using clang 3.8 on Ubuntu 14.04. Running get_cfg.py with -march=x86, I get an Exception: Jump Table Not Size 4.

Starting insn at: 406ead
        inst: movsx   ecx, byte ptr [r8+rdx]
        Bytes: [65, 15, 190, 12, 16]
        inst: mov     eax, 2
        Bytes: [184, 2, 0, 0, 0]
        inst: jmp     ds:off_40F510[rsi*8]; switch jump
        Bytes: [255, 36, 245, 16, 245, 64, 0]
Jump table size not 4!

I tried ignoring this exception, but that leads to out-of-order symbols when calling cfg_to_bc, so it is obviously not a viable workaround.
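
The bytes in the log already show why a size-4 assumption does not hold here: the SIB byte of the jmp encodes a scale of 8, so each table slot is 8 bytes wide, and the listing uses 64-bit registers. Below is a small sketch that decodes just this one addressing form. `decode_indirect_jmp` is a hypothetical helper written for illustration, not a function from get_cfg.py.

```python
def decode_indirect_jmp(raw):
    """Decode `FF /4` jmp with a [index*scale + disp32] memory operand.

    Returns (scale, index_register_number, disp32). Only the no-base SIB
    form without prefixes is handled; anything else raises ValueError.
    """
    if len(raw) != 7:
        raise ValueError("expected opcode + ModRM + SIB + disp32 (7 bytes)")
    opcode, modrm, sib = raw[0], raw[1], raw[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0xFF or reg != 4 or mod != 0 or rm != 4:
        raise ValueError("not a jmp through a SIB-addressed memory operand")
    scale = 1 << (sib >> 6)        # top two bits: 1, 2, 4 or 8
    index = (sib >> 3) & 7         # 6 is esi/rsi
    base = sib & 7
    if base != 5:                  # with mod == 0, base 5 means disp32 only
        raise ValueError("expected the no-base, disp32 form")
    disp = int.from_bytes(bytes(raw[3:7]), "little")
    return scale, index, disp


if __name__ == "__main__":
    scale, index, disp = decode_indirect_jmp([255, 36, 245, 16, 245, 64, 0])
    print(scale, index, hex(disp))
```

For the bytes above this yields a scale of 8, index register 6 (rsi), and a displacement of 0x40f510, which matches off_40F510[rsi*8] in the disassembly.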

Unsupported operation on i386

Artem,

We are still endeavoring to lift out two simple examples to bitcode: zlib and thttpd.
We are using the current version of mcsema:

ce10281.

We are using IDA Pro to construct the control flow graphs as you advised.

With thttpd we fail with:

Have a no-op at: 0x805404a
Instruction is: 1 bytes long
Representation: nop
Unsupported!
805404b orpd %xmm0, %xmm1
1825
Failure to convert to LLVM module!

We generated the binary using the Franz multicompiler. If you need the binary, it is here:

http://www.csl.sri.com/~iam/thttpd0.zip

For zlib we crash, and will post a separate issue.

Cheers, Ian & Bruno.

how to transform elf binary in linux x86_64 to llvm IR

Does the Binary File Parsing subproject aim to parse ELF binaries to LLVM IR?
I read the build document and built it on my computer (Ubuntu 14.04.2).
I looked at the demos, and the first useful command in each of them is:

nasm -f elf64 -o demo_test1.o demo_test1.asm ( maybe *.c )

This produces an object file from an asm/C file, which is not what I am after.
I want to know how to get LLVM IR from a given ELF binary on Linux x86_64.

Better support for external functions

Right now, the support for external functions requires loading a document that contains a list of symbols, the number of arguments, whether or not it is noreturn, and its calling convention. This format constrains the external functions that can be supported, and also means that time must be spent trying to build the mapping lists.

I started looking into fixing this stuff after finding that printf ignores the existence of floating-point arguments, rendering most floating point programs hopelessly untestable. Supporting floating point requires keeping track of vector registers and integer register counts separately at the very least (assuming you don't care about long double, which is its own can of worms that I personally don't care much about). Varargs functions are substantially more complicated, though, because one of the uses of the function argument count is to know how many arguments to pass, and varargs by definition doesn't specify this data.

I know how to implement support for the vector registers for fixed-argument functions, but supporting it on variable-argument functions is more difficult. I also ran into some test cases that made clear that the number of arguments for printf is likely to be too low unless you put it absurdly high. Building an inline asm shim for varargs code is of course possible, but obviously limits the naturalness and analyzability of the resulting code. In theory, it should be possible to use some analysis to know how many arguments are in use; however, such analysis is not possible at call time. My thoughts are to generate an inline asm shim for varargs code and then use a later pass to eliminate that shim where possible.

LLVM already doesn't try to map directly to the C ABI in all cases, which eases the task of generating ABI information. Thus, every LLVM function can be specified in ABI terms as a mapping of each argument to a register or stack slot location, with a few metadata blobs (caller/callee cleanup, varargs, noreturn). It should thus be fairly trivial to make a tool that takes LLVM IR and builds definitions from it (which means stddefs for libc and libstdc++ could be fairly easily autogenerated).
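
To make the "mapping of each argument to a register or stack slot" idea concrete, here is a rough Python sketch for fixed-argument functions under the System V AMD64 convention, where integer and floating-point arguments are counted separately. `ExternalFunction`, `assign_locations`, and the "int"/"float" kinds are all hypothetical names for illustration, not an existing mcsema format. Structs, long double, and varargs are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List

# System V AMD64 argument registers, in allocation order.
INT_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
VEC_REGS = [f"xmm{i}" for i in range(8)]


@dataclass
class ExternalFunction:
    """Hypothetical per-function ABI record."""
    name: str
    arg_locations: List[str] = field(default_factory=list)
    varargs: bool = False
    noreturn: bool = False
    caller_cleanup: bool = True


def assign_locations(arg_kinds):
    """Map a list of 'int'/'float' kinds to registers or stack slots.

    Stack slots are written as 'stack+N', a byte offset from the first
    stack-passed argument.
    """
    locations = []
    n_int = n_vec = stack_off = 0
    for kind in arg_kinds:
        if kind not in ("int", "float"):
            raise ValueError(f"unsupported argument kind: {kind}")
        if kind == "int" and n_int < len(INT_REGS):
            locations.append(INT_REGS[n_int])
            n_int += 1
        elif kind == "float" and n_vec < len(VEC_REGS):
            locations.append(VEC_REGS[n_vec])
            n_vec += 1
        else:
            locations.append(f"stack+{stack_off}")
            stack_off += 8
    return locations


if __name__ == "__main__":
    ldexp = ExternalFunction("ldexp", assign_locations(["float", "int"]))
    print(ldexp)
```

A record like this replaces the single argument count with explicit locations, so a double argument no longer consumes an integer register slot.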

Error with cfg_to_bc after recovering cfg with bin_descend

The following test case used to run fine with bin_descend + cfg_to_bc, but after a recent git pull I am hitting the issue below.

Test case

#include<stdio.h>
int main() {

  printf("test ");
  return 0;

}

$ gcc -O0 test_0_1.c -m64 -c -o test.o
$ mcsema/build/mc-sema/bin_descend/bin_descend -march=x86-64 -d -i=test.o -func-map=../../utils/std_defs.txt -entry-symbol=main

Disassembly not guided by outside facts.
Use :'/home/dsand/Github/mcsema/build/mc-sema/bin_descend/bin_descend-p <protobuff>' to feed information to guide the disassembly
low : 0 high : 1a
low : 0 high : 1a
populateReloMap: Relo mapping: [.text] -> [.rela.text]
low : 0 high : 0
low : 0 high : 0
low : 0 high : 0
low : 0 high : 0
low : 0 high : 6
low : 1a high : 20
low : 0 high : 38
low : 20 high : 58
populateReloMap: Relo mapping: [.eh_frame] -> [.rela.eh_frame]
Disassembly not guided by outside facts.
Use: -p <protobuff>' to feed information to guide the disassembly
 TT x86-64 : x86_64-unknown-unknown 
Looking at Object File section: .text
Found symbol: .text in .text
Found symbol: main in .text
Looking at Object File section: .data
Found symbol: .data in .data
Looking at Object File section: .bss
Found symbol: .bss in .bss
Looking at Object File section: .rodata
Found symbol: .rodata in .rodata
Looking at Object File section: .eh_frame
Found symbol: .eh_frame in .eh_frame
addDataEntryPoints: skipping non-data section: .text
addDataEntryPoints: looking for entry points in: .data
addDataEntryPointsFromSectionBounds are: 0 to 0
addDataEntryPoints: looking for entry points in: .bss
addDataEntryPointsFromSectionBounds are: 0 to 0
addDataEntryPoints: looking for entry points in: .rodata
addDataEntryPointsFromSectionBounds are: 1a to 20
addDataEntryPoints: looking for entry points in: .eh_frame
addDataEntryPointsFromSectionBounds are: 20 to 58
addDataEntryPointsFromSection: Looking at relocation at: 40
relocate_addr: Relocation lives in: .eh_frame
relocate_addr: Offset is: 20
relocate_addr: Relocation lives in: .rela.eh_frame
    getRelocForAddr: Testing 40 vs. 40
relocate_addr: Looking at relocation type: R_X86_64_PC32
.text
relocate_addr: Relocation symbol is: .text
relocate_addr: Address of symbol is: 0
relocate_addr: Symbol is in: [.text], base is: 0
relocate_addr: Final addr is: 0
addDataEntryPointsFromSection: Adding data entry point for: sub_0
We have 1 entry points
Calling getFunc on: 0
getFunc: Starting at 0x0
getFunc: toVisit size is: 1
Processing block: block_0x0
0:  pushq   %rbp
1:  movq    %rsp, %rbp
4:  movl    $0, %edi
decodeBlock: have reloc at: 5
    find_import_for_addr: Testing 5 vs. 5
Found symbol named: .rodata
Address for .rodata is: 0
Skipping symbol since its probably not an import! 2
    find_import_for_addr: Testing 5 vs. f
relocate_addr: Relocation lives in: .text
relocate_addr: Offset is: 0
relocate_addr: Relocation lives in: .rela.text
    getRelocForAddr: Testing 5 vs. 5
relocate_addr: Looking at relocation type: R_X86_64_32
relocate_addr: Original bytes are: 0
.rodata
relocate_addr: Relocation symbol is: .rodata
relocate_addr: Address of symbol is: 0
relocate_addr: Symbol is in: [.rodata], base is: 1a
relocate_addr: Final addr is: 1a
Found a relocation at: 0x5, pointing to: 0x1a
Adding data reference to 0x1a
9:  movl    $0, %eax
e:  callq   0
decodeBlock: have reloc at: f
    find_import_for_addr: Testing f vs. 5
    find_import_for_addr: Testing f vs. f
Found symbol named: printf
Address for printf is: ffffffffffffffff
Adding external code ref: printf
DEBUG : decodeBlock, callpcrel
External call to: printf
13: movl    $0, %eax
18: popq    %rbp
19: retq
getFunc: Function recovery complete for  func at 0
Section: .data
    Minimum: 0
    Maximum: 0
Section: .bss
    Minimum: 0
    Maximum: 0
Section: .rodata
    Minimum: 1a
    Maximum: 20
Adding data section: 1a - 20
Section: .eh_frame
    Minimum: 20
    Maximum: 58
    Found relocation at: 40
relocate_addr: Relocation lives in: .eh_frame
relocate_addr: Offset is: 20
relocate_addr: Relocation lives in: .rela.eh_frame
    getRelocForAddr: Testing 40 vs. 40
relocate_addr: Looking at relocation type: R_X86_64_PC32
.text
relocate_addr: Relocation symbol is: .text
relocate_addr: Address of symbol is: 0
relocate_addr: Symbol is in: [.text], base is: 0
relocate_addr: Final addr is: 0
processDataSection: Recovered function symbol from data section: sub_0
Adding data section: 20 - 58
dumpData : base 40, size, 4

$ /mc-sema/bitcode_from_cfg//cfg_to_bc -ignore-unsupported -mtriple=x86_64-pc-linux-gnu -i test.cfg -o test.bc -driver=mcsema_main,main,raw,return,C

Looking up target...
Reading module ...
 TT x86-64 : x86_64-unknown-unknown 
[libprotobuf ERROR /home/dsand/Github/mcsema/mc-sema/protobuf-2.5.0/src/google/protobuf/message_lite.cc:123] Can't parse message of type "Module" because it is missing required fields: external_funcs[0].is_weak
Failed to deserialize protobuf module
Returning modue...
Could not process input module: test.cfg

Fix FPU Precision

Per the manual, make FPU precision control only applicable to:

FADD, FADDP, FIADD, FSUB, FSUBP, FISUB, FSUBR, FSUBRP, FISUBR, FMUL, FMULP, FIMUL, FDIV, FDIVP, FIDIV, FDIVR, FDIVRP, FIDIVR, and FSQRT
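
A minimal sketch of how a translator could gate the rounding step on this list. The set contents come from the list above; `uses_precision_control` is a hypothetical helper name, not existing mcsema code.

```python
# x87 instructions whose results are subject to the precision-control field,
# as listed in the issue text above.
PRECISION_CONTROLLED = frozenset(
    """
    FADD FADDP FIADD FSUB FSUBP FISUB FSUBR FSUBRP FISUBR
    FMUL FMULP FIMUL FDIV FDIVP FIDIV FDIVR FDIVRP FIDIVR FSQRT
    """.split()
)


def uses_precision_control(mnemonic: str) -> bool:
    """True if results of this mnemonic should be rounded per the PC field."""
    return mnemonic.upper() in PRECISION_CONTROLLED


if __name__ == "__main__":
    for m in ("fadd", "FSQRT", "FSIN", "FLD"):
        print(m, uses_precision_control(m))
```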

bin_descend failures on linux

We tried processing two "simple" executables built from source.

thttpd

zlib

On both examples bin_descend eventually fails (after puffing up the
map files appropriately). In the case of thttpd we run into
errors like:

Found a possible jump table!
Not a jump table: no relocation in JMP32m
Heristic jumptable processing couldn't parse jumptable
pointing to: 0x8052c97
jmpl *134573844(,%eax,4)
846874d52249a7bbebc53474a5ef6f11
Failure to make module: Generic error: Line: 912

while with zlib (executable being minigzip) we end up with errors like:

Unsupported!
8048a4c nopl (%eax)
1700
Failure to convert to LLVM module!

We tried different compilers, including gcc-4.8.2, clang-3.2, and clang-3.4,
but to no avail.

We are running this on Ubuntu 14.04.

cfg_to_bc doesn't support rep

Using git commit a2592d7 (from branch feature_i386_runtime ).

Command: cfg_to_bc -mtriple=i686-pc-linux-gnu -i demo_bomb.cfg -driver=phase1_entry,phase_1,1,return,C -o demo_bomb.bc

Input: demo_bomb.cfg.zip

Output:

Looking up target...
Reading module ...
TT x86 : i386-unknown-unknown
Deserializing functions...
Deserializing functions...
Deserializing functions...
Deserializing functions...
Deserializing functions...
Deserializing functions...
Deserializing functions...
Deserializing functions...
Deserializing functions...
Deserializing functions...
Deserializing data...
Deserializing data...
Deserializing data...
Deserializing data...
Deserializing data...
Deserializing data...
Deserializing data...
Deserializing data...
Deserializing data...
Deserializing data...
Deserializing data...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Deserializing externs...
Creating module...
Setting target...
Done setting target
Adding external funcs...
Adding external data...
Adding internal data...
Adding entry points...
Returning modue...
Setting initial triples...
Looking at entry points...
Already have driver for: phase_1
Getting LLVM module...
Converting to LLVM...
Inserted function: sub_8048b20
Inserted function: sub_80494fc
Inserted function: sub_8048ee8
Inserted function: sub_8048e94
Inserted function: sub_804952c
Inserted function: sub_8049030
Inserted function: sub_8049018
Inserted function: sub_80491fc
Inserted function: sub_80491b0
Inserted function: sub_804917c
inserting global data section named data_0x80480f4
inserting global data section named data_0x80485e0
inserting global data section named data_0x80485e8
inserting global data section named data_0x8048600
inserting global data section named data_0x8049600
inserting global data section named data_0x804ade0
inserting global data section named data_0x804b484
inserting global data section named data_0x804b508
inserting global data section named data_0x804b510
inserting global data section named data_0x804b518
inserting global data section named data_0x804b640
doCallPCtarget address : 8049030
doCallPCtarget address : 80494fc
doCallPCtarget address : 8049030
doCallPCtarget address : 8048ee8
doCallPCtarget address : 80491fc
doCallPCtarget address : 80494fc
doCallPCtarget address : 8048e94
doCallPCtarget address : 8048e94
doCallPCtarget address : 80494fc
doCallPCtarget address : 804952c
doCallPCtarget address : 804952c
doCallPCtarget address : 8048e94
doCallPCtarget address : 8048e94
doCallPCtarget address : 8049030
doCallPCtarget address : 8048ee8
doCallPCtarget address : 8049018
doCallPCtarget address : 8049018
doCallPCtarget address : 80491b0
error:
Line: 695
File: /home/user/Desktop/mcsema/mc-sema/cfgToLLVM/x86Instrs_String.cpp
NIY

Build Error Ubuntu 16.04

I'm getting the following error on Ubuntu 16.04 with clang 3.9, gcc 5.4 as well as gcc 4.8.

E:Unable to load /mcsema/mc-sema/validator/x86_64/valTest/../valTool/val.so: /mcsema/mc-sema/validator/x86_64/valTest/../valTool/val.so: undefined symbol: _ZN10LEVEL_BASE9StringDecB5cxx11Emjc

Other parts of the build process all finished correctly.

compiling mcsema compatible applications

I want to test mcsema on a somewhat more useful application than the hello-world-like test files. However, mcsema always runs into unsupported instructions or other errors (I use IDA Pro for CFG generation).

Is there some way to compile a project with specific compiler flags such that no unsupported instructions are emitted by my compiler?

Investigate not using struct.regs

Instead of using a global register context, struct.regs, it might be possible to pass registers directly as arguments to translated functions. This may lead to faster translated code.
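
As a toy model of the two styles (in Python, not LLVM IR), consider a function lifted from `addl $1, %eax; retl`. `Regs`, `sub_1_context`, and `sub_1_args` are hypothetical names; flag updates are ignored and only one register is modeled.

```python
from dataclasses import dataclass


@dataclass
class Regs:
    """Stand-in for a shared register context such as struct.regs."""
    eax: int = 0
    ebx: int = 0
    ecx: int = 0
    edx: int = 0


def sub_1_context(regs: Regs) -> None:
    """Context style: every register access goes through the shared struct."""
    regs.eax = (regs.eax + 1) & 0xFFFFFFFF


def sub_1_args(eax: int) -> int:
    """Argument style: the live register comes in and goes out as a value."""
    return (eax + 1) & 0xFFFFFFFF


if __name__ == "__main__":
    regs = Regs(eax=41)
    sub_1_context(regs)
    print(regs.eax, sub_1_args(41))
```

In the second form the function touches no memory at all, which is the kind of code an optimizer can keep in registers; the cost is that the lifter must know which registers are live at function boundaries.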

More Standard Definitions

Hey,

Tried using this again today. I think I was playing around with this maybe a year or so ago. bin_descend still segfaults on me. get_cfg works; however, cfg_to_bc gives the following error:

Could not find external function: __isoc99_scanf

I checked the standard defs file and it's not in there. That said, I also couldn't find any documentation on how to add to that file when need be. Unfortunately, so far I have not been able to have a single successful lifting to LLVM-IR with this tool.
