
Dynamic Instrumentation Tool Platform

License: Other


dynamorio instrumentation linux windows profiling analysis-framework toolkit cache-simulator simulator binary-analysis

dynamorio's Introduction

DynamoRIO

About DynamoRIO

DynamoRIO is a runtime code manipulation system that supports code transformations on any part of a program, while it executes. DynamoRIO exports an interface for building dynamic tools for a wide variety of uses: program analysis and understanding, profiling, instrumentation, optimization, translation, etc. Unlike many dynamic tool systems, DynamoRIO is not limited to insertion of callouts/trampolines and allows arbitrary modifications to application instructions via a powerful IA-32/AMD64/ARM/AArch64 instruction manipulation library. DynamoRIO provides efficient, transparent, and comprehensive manipulation of unmodified applications running on stock operating systems (Windows, Linux, or Android) and commodity IA-32, AMD64, ARM, and AArch64 hardware. Mac OSX support is in progress.

Existing DynamoRIO-based tools

DynamoRIO is the basis for some well-known external tools:

The Arm Instruction Emulator (ArmIE)
WinAFL, the Windows fuzzing tool, as an instrumentation and code coverage engine
The fine-grained profiler for ARM DrCCTProf
The portable and efficient framework for fine-grained value profilers VClinic

Tools built on DynamoRIO and available in the release package include:

The memory debugging tool Dr. Memory
The tracing and analysis framework drmemtrace with multiple tools that operate on both online (with multi-process support) and offline instruction and memory address traces:
- The cache simulator drcachesim
- TLB simulation
- Reuse distance
- Reuse time
- Opcode mix
- Function call tracing
The legacy processor emulator drcpusim
The "strace for Windows" tool drstrace
The code coverage tool drcov
The library tracing tool drltrace
The memory address tracing tool memtrace (drmemtrace's offline traces are faster with more surrounding infrastructure, but this is a simpler starting point for customized memory address tracing)
The memory value tracing tool memval
The instruction tracing tool instrace (drmemtrace's offline traces are faster with more surrounding infrastructure, but this is a simpler starting point for customized instruction tracing)
The basic block tracing tool bbbuf
The instruction counting tool inscount
The dynamic fuzz testing tool Dr. Fuzz
The disassembly tool drdisas
And more, including opcode counts, branch instrumentation, etc.: see API samples

Building your own custom tools

DynamoRIO's powerful API abstracts away the details of the underlying infrastructure and allows the tool builder to concentrate on analyzing or modifying the application's runtime code stream. API documentation is included in the release package and can also be browsed online. Slides from our past tutorials are also available.

Downloading DynamoRIO

DynamoRIO is available free of charge as a binary package for both Windows and Linux. DynamoRIO's source code is available primarily under a BSD license.

Obtaining Help

Use the discussion list to ask questions.

To report a bug, use the issue tracker.


dynamorio's Issues

build: automatically work around typedef conflicts

From [email protected] on February 16, 2009 14:41:58

We don't build on RHEL3:

If typedef conflicts arise (such as on RHEL3), currently you'll have to manually resolve by defining one of our DR_DO_NOT_DEFINE_* defines (DR_DO_NOT_DEFINE_uint, DR_DO_NOT_DEFINE_ushort, etc.).

Eventually we'll have a pre-make step that automates this.

Using a pre-build configure step is going to be the best way around this: issue #19
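
As an illustration of the manual workaround described above, here is a minimal sketch of the guard pattern. Only the DR_DO_NOT_DEFINE_* macro names come from this issue; the two "headers" are inlined stand-ins and widen() is a hypothetical helper, so this is not a copy of the real DynamoRIO headers.

```c
/* Stand-in for a system header that already defines `uint`
 * (assumption: this mimics the kind of clash seen on RHEL3). */
typedef unsigned int uint;

/* The client opts out of the duplicate typedef *before* including the
 * tool header.  The macro name is the one mentioned in this issue. */
#define DR_DO_NOT_DEFINE_uint

/* Stand-in for the tool header: each convenience typedef is guarded,
 * so defining the matching macro suppresses it. */
#ifndef DR_DO_NOT_DEFINE_uint
typedef unsigned int uint;
#endif
#ifndef DR_DO_NOT_DEFINE_ushort
typedef unsigned short ushort;
#endif

/* Hypothetical helper, present only so both typedefs are exercised. */
static uint widen(ushort v)
{
    return (uint)v + 1u;
}
```

On the compiler command line the same opt-out would normally be spelled -DDR_DO_NOT_DEFINE_uint, which is the step a configure pass could automate.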

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=18

support client threads

From [email protected] on February 24, 2009 10:56:06

this was PR 222812

as an alternative (and superset) to custom nudges PR 200067
we should add support for creating new transparent threads that can then
poll files, etc.

from discussion:
communication restrictions: DR or client can't just wait on thread
client thread is less restricted than client executing in app threads:
when it's not communicating w/ rest of client or DR, it can share
libraries w/ app.
usage: communication in ("push"): seems like custom nudge (PR 200067) can
do the same work (can loop in nudge routine: make sure synch routines
consider native: thread will kill itself before returning to code
cache) w/o expectations of supporting many threads + communication
between them.
does it need synch? for push usage, say for adaptive instrumentation,
calling dr_flush may be all it needs to do.
2nd usage: sideline: for that need to revive decode_fragment,
replace_fragment, etc.

so, moving this case to later scheduling and considering it to mainly cover
sideline-type parallel-analysis uses

adding file wait is part of PR 202946

Tim in PR 200067:

I've run into some issues with this. The easiest thing is that we're not
well setup for a polling thread with no yield, sleep, or wait support,
though that part is relatively easy to fix. The more problematic case is
around synchronization issues. We can't consider the thread native, native
threads only grab DR or client locks during short bounded defined periods
that we can detect when we suspend (syscall interception code etc.) and are
fine the rest of the time. A polling nudge thread on the other hand is in
DR or Client code all the time, there is no good place to suspend it that
we can easily detect.

If we ignore client locks and fix cases like PR 225020, then perhaps we
could consider it ok to suspend the thread if it was in client dll code (as
opposed to DR or ntdll via DR with special case handling of yield/sleep).
For client locks we could try tracking them via our lock api routines, but
that starts getting pretty messy. Alternatively we could suspend the nudge
thread last, presumably at that point we wouldn't care about client locks
anymore since they couldn't block anyone. Alternatively, we could try not
suspending the nudge thread (at least if we were only targeting flush)
since it's not using the cache. That would work if synch_all users were
only using it to handle in cache threads, but some users (including flush
it looks like) are using it to break (or at least not hold) locks, do
unsafe data structure modifications etc. which we'd have to verify against
all our API routines.

I think the only really workable thing above is to special case the nudge
thread and suspend it last (so even if it's holding a client lock it can't
block anyone else) and only when it's in the client dll code (225020 will
be fixed soon and I don't know of any other problematic cases like that).
Either that or not supporting using a nudge as a polling thread (i.e. synch
is blocked till the nudge thread finishes).

post-nudge feature:
What remains to be done is to get the sideline threads to use the
client-owned thread synchronization support from the nudge work. Easiest is
probably starting the sideline thread via an internal nudge (though we'll
need an argument for that: xref PR 231295).

thread creation status from t222812-etc-minifeatures tree:

I put my preliminary stuff, which was tested, under CLIENT_SIDELINE for
now: dr_create_client_thread(), dr_terminate_client_thread(),
cleanup_and_terminate_client_thread, dr_thread_yield()
didn't fully test native treatment with thread doing syscalls, etc.
what about our check_sole_thread() checks
NYI for linux for now: can borrow from create_thread() in x86/sideline.c,
with a different stack freeing model: will need to tweak
cleanup_and_terminate_client_thread to be os-neutral
FIXME: provide cond vars, other thread utils?
xref discussion above; here I had dr_thread_yield() which doesn't
really solve anything.
code comments:
- FIXME PR 210591: transparency issues:
- 1. All dlls will be notified of thread creation by DLL_THREAD_ATTACH
- 2. The thread will show up in the list of threads accessed by
- NtQuerySystemInformation's SystemProcessesAndThreadsInformation
  structure.
- FIXME PR 202669: if the client leaves reservation space we should have
- the stack auto-expand.
what will a stack overflow be reported as?

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=41

attach injection on Linux

From [email protected] on February 24, 2009 10:01:18

this was PR 204490

focusing on Linux as Windows has issues that are best solved with a kernel
driver, unless we think we can rely on backward decoding heuristics or
don't mind losing control for a while

we do have some issues on Linux:

ability to suspend threads before we have control of them: we will
probably rely on ptrace (xref issue #37 )
determining the state of sharing of CLONE_* among threads: should
probably use a modify-and-observe approach

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=38

suite/tests/security-common/retnonexisting.c's fault not caught by SEH on x64

From [email protected] on February 18, 2009 20:41:50

suite/tests/security-common/retnonexisting.c's fault not caught by SEH on
x64 due undoubtedly to violating the stringent SEH64 rules

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=29

add ACKNOWLEDGEMENTS file to list past and present contributors

From [email protected] on February 12, 2009 16:40:13

We should add an ACKNOWLEDGEMENTS file to list past and present contributors.
Though the past VMware, Determina, MIT, and HP developers do not have any
copyright ownership anymore we should still acknowledge their contributions.

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=7

build: add pre-build configure or cmake for local configuration and other benefits

From [email protected] on February 16, 2009 14:42:31

split from issue #2 : see notes there on why we never needed this for past
proprietary products

I would lean toward CMake instead of autoconf: http://www.cmake.org/

We could achieve the following with cmake:

configure build for local toolchain tools: currently we have some checks
for -T vs -dT, gcc -fvisibility, etc., but we don't build on RHEL3
( issue #18 ) due to typedef conflicts, and in the future if we port to
other platforms we'll want to add more features like HAVE_PROC_MAPS.
add dependencies back in to core/ (USEDEP=0 right now due to "make
clear" bugs with the .dep files: PR 207890, PR 214218)
build in parallel (PR 209902)
reduce Windows building time due to cygpath invocations: move them all
to configure time
possibly: hook our tests up to the cmake-integrated CTest and CDash
platform-independent regression testing and reporting engine?

Some issues:

devs have to install cmake
but, available on yum, apt-get repos so painless on linux
not clear cmake will support our mix of cygwin paths but microsoft compiler

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=19

misc core bugs hit in x64 regression and -opt_memory regression and other tests

From [email protected] on February 16, 2009 19:02:18

in the process of getting regression suites working a little better after
years of no QA I hit a number of bugs in the core. listing them all here
to document them. I'm fixing them all in one shot instead of filing
separately:

erroneous x64 assert in dr_get_process_id()
x64 dr_file_{tell,seek} return -1 when successful
=> should return expected value from llseek_syscall()
-core_units (-opt_memory) assert from addr16 optimization
to handle jecxz/loop in cbr.c we need to set the target of the
long-taken instr added at the end of the sequence, so
instr_convert_short_meta_jmp_to_long() now returns that
avoid stack overflow in "-opt_memory -loglevel 4" beyond this stack
(showing frame size here):
<decode_fragment> sub $0x324,%esp
<decode_fragment_exact> sub $0x34,%esp
<fragment_recreate_with_linkstubs> sub $0x70,%esp
<common_disassemble_fragment> sub $0x2a0,%esp
<disassemble_fragment> sub $0x34,%esp
<shift_links_to_new_fragment> sub $0x390,%esp
<emit_fragment_common> sub $0x59c,%esp
<emit_fragment_as_replacement> sub $0x28,%esp
<end_and_emit_trace> sub $0x160,%esp
<monitor_cache_enter> sub $0x34c,%esp
sub $0x13c,%esp
added -O to debug (should still be pretty debuggable) to get big
reductions:
<decode_fragment> sub $0x12c,%esp
<decode_fragment_exact> sub $0x20,%esp
<fragment_recreate_with_linkstubs> sub $0x3c,%esp
<common_disassemble_fragment> sub $0xcc,%esp
<disassemble_fragment> sub $0x28,%esp
<shift_links_to_new_fragment> sub $0xcc,%esp
<emit_fragment_common> sub $0xdc,%esp
<emit_fragment_as_replacement> sub $0x18,%esp
<end_and_emit_trace> sub $0x7c,%esp
<monitor_cache_enter> sub $0x7c,%esp
sub $0xdc,%esp
with -opt_memory we should not try to lazy-link across signal delivery
(shows up as signal0001 assert: link.c:3396 stub != NULL)
handle_post_sigprocmask() was using reg values instead of values
stored pre-syscall causing common/segfault to crash in DR code
to handle execve, I'm putting libdrpreload.so into the same
lib32/{debug,release} directory as libdynamorio.so.
this must have been broken for a while: in internal tests we used to
have exports/x86_linux_dbg/ that held both libraries which is one
reason it was not noticed, b/c our regression tests were not run with
the release layout.

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=23

libutil/, tools/, and api/ should put build results in top-level build/ and exports/

From [email protected] on February 12, 2009 14:36:44

this was PR 196865
should move pieces of core/Makefile to top-level Makefile to shared output
dirs and keep source dirs uncluttered

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=6

perf: two-layer ibl hashtable with inner fixed-size

From [email protected] on February 21, 2009 10:52:11

We should try a two-layer scheme for rets: a 256-entry table updated on
every call, with collision chaining at the target.

See Ole's paper: http://engweb.vmware.com/~agesen/wbia2006.pdf

summary of scheme:

2 level return lookup hashtable
  1st level fixed size, direct-mapped 256 entries
    no cmp for empty or for collision (cmp at target)
  2nd level full table                               
@ every call prime the first level w/ a store        
  mov after_call_frag_prefx => table_slot            
@ return                                             
  spill eax, ecx                                     
  ret addr -> ecx                                    
  movzx cl -> eax                                    
  jmp *table_base + eax                              
@ after_call_frag_prefx                              
  lea compare ecx to after_call_addr                 
  => miss: spill flags, full ret ibl                 
     hit: continue (common case no eflags and small dcache footprint)
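
The scheme above can be modeled in plain C to see how hits and misses fall out. This is only a simulation sketch under stated assumptions: the struct, the function names, and the linear-search "full table" are all hypothetical, and the NULL test stands in for pointing empty slots at the miss path (the real scheme has no compare for empty).

```c
#include <stddef.h>
#include <stdint.h>

#define L1_SLOTS 256

/* One after-call fragment: the application return address it stands
 * for, plus a stand-in for its code cache entry point. */
typedef struct {
    uintptr_t app_ret_addr;
    int       fragment_id;
} after_call_t;

static const after_call_t *l1_table[L1_SLOTS]; /* 1st level, direct-mapped */
static unsigned l1_hits, l1_misses;

/* "@ every call": prime the first level with an unconditional store.
 * No compare and no chaining; a colliding entry is simply overwritten. */
static void on_call(const after_call_t *after_call)
{
    l1_table[after_call->app_ret_addr & 0xff] = after_call;
}

/* Stand-in for the full second-level ibl table (linear search here). */
static const after_call_t *full_lookup(const after_call_t *all, size_t n,
                                       uintptr_t ret_addr)
{
    for (size_t i = 0; i < n; i++)
        if (all[i].app_ret_addr == ret_addr)
            return &all[i];
    return NULL;
}

/* "@ return": index by the low byte of the return address (the movzx cl
 * step), go to that slot, and do the compare at the target. */
static int on_return(const after_call_t *all, size_t n, uintptr_t ret_addr)
{
    const after_call_t *slot = l1_table[ret_addr & 0xff];
    if (slot != NULL && slot->app_ret_addr == ret_addr) {
        l1_hits++;                      /* common case: no full lookup */
        return slot->fragment_id;
    }
    l1_misses++;                        /* collision or empty: full ret ibl */
    const after_call_t *found = full_lookup(all, n, ret_addr);
    return found != NULL ? found->fragment_id : -1;
}
```

Two nested calls whose return addresses share a low byte evict each other here, which is the case the collision prefix in the notes is meant to catch.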

discussion notes:

thread-shared full ret ibl table
thread-private 256-entry table: but then w/ shared fragments need
spill + extra instrs on call store?
stick whole table in TEB
or use own segment (PR 208009)
or try shared table: on uniprocessor may work fine
no call inlining
mark after-call as FRAG_X, propagate to trace if 1st bb there
insert collision prefix if fragment starts w/ FRAG_X
need to manage hardcoded code cache addr versus cache deletion
hack to combine w/ linking:
- put in unreachable jmp to after-call addr, after jmp to callee
- when unlinked, put table-empty addr there
- when linked, put code cache addr there
selfprot: how allow write to table?
if through segment and ds has limit below it then could protect

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=31

early and comprehensive injection on Linux

From [email protected] on February 24, 2009 11:25:45

for LD_PRELOAD we'll start by using -z initfirst. for that we need libc
independence ( issue #48 /PR 206369) and to directly read our env vars off the
stack.
we should also directly read the elf aux vector (PR 289138).
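
To make the "read it off the stack" idea concrete: on Linux the initial process stack holds the environment pointer array, a NULL terminator, and then the ELF auxiliary vector, so both can be walked without calling into libc. The sketch below is Linux-only and assumes the environment array has not been relocated (e.g. by setenv); find_auxv(), auxv_lookup(), and env_lookup() are hypothetical helper names, not DynamoRIO routines.

```c
#include <elf.h>    /* AT_NULL, AT_PAGESZ, Elf{32,64}_auxv_t */
#include <link.h>   /* ElfW() selects the native ELF class */
#include <stddef.h>

extern char **environ;  /* points at the initial envp array at startup */

/* The aux vector begins right after the NULL that terminates envp. */
static const ElfW(auxv_t) *find_auxv(char **envp)
{
    while (*envp != NULL)
        envp++;
    return (const ElfW(auxv_t) *)(envp + 1);
}

/* Scan the aux vector for one entry; returns 1 and fills *value if found. */
static int auxv_lookup(const ElfW(auxv_t) *auxv, unsigned long type,
                       unsigned long *value)
{
    for (; auxv->a_type != AT_NULL; auxv++) {
        if (auxv->a_type == type) {
            *value = auxv->a_un.a_val;
            return 1;
        }
    }
    return 0;
}

/* getenv() replacement that uses no libc string routines. */
static const char *env_lookup(char **envp, const char *name)
{
    for (; *envp != NULL; envp++) {
        const char *e = *envp, *n = name;
        while (*n != '\0' && *e == *n) {
            e++;
            n++;
        }
        if (*n == '\0' && *e == '=')
            return e + 1;   /* value starts after the '=' */
    }
    return NULL;
}
```

A caller would use something like auxv_lookup(find_auxv(environ), AT_PAGESZ, &pagesz); an -z initfirst library would have to obtain the envp pointer itself rather than rely on environ being set up.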

xref issue #37 /PR 248204: ptrace injection

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=47

custom trace interface expansion: full control over traces

From [email protected] on February 24, 2009 11:28:08

this was PR 262400: custom trace interface expansion: full control over traces

user requests for more control over trace creation:

full control over trace heads (not just DR trace heads + additional
client trace heads)
build but don't execute (to evaluate profiling cost)
other than NET (take in bb tag sequence? or full instrlist)

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=48

build: gcc 4.1.2 warnings with new -O

From [email protected] on February 24, 2009 09:43:39

after r20 swapped debug build to -O, now with gcc 4.1.2 we have warnings on
all strings that are longer than the 50-char STAT_NAME_MAX_LEN:
In file included from dynamo.c:320:
./lib/statsx.h: In function 'statistics_init':
./lib/statsx.h:180: warning: value computed is not used
./lib/statsx.h:181: warning: value computed is not used
...

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=35

gcc 4.3.0 bug: ASSERT on all x64 apps: "reachability contraints not satisfiable"

From [email protected] on February 15, 2009 17:11:00

ASSERT with gcc 4.3.0 on all x64 apps: "reachability contraints not
satisfiable"

SYSLOG_ERROR: Application hello64 (1982). Internal Error Internal
DynamoRIO Error: heap.c:474 heap_allowable_region_start <=
must_reach_region_start && "PR 215395 reachability contraints not satisfiable"

This is due to gcc 4.3.0, even at -O0, considering "ptr - const < ptr" (in
REACHABLE_32BIT_END()) to always be true, which is an invalid optimization
as it ignores underflow. I have a simple test app that shows the problem
very clearly, inlined below with objdump -dS. I don't see any gcc options
to affect this optimization.

The disturbing aspect is I don't know what other bugs this gcc behavior is
going to cause: certainly throughout win32/os.c we watch for overflow via
"if (pb + mbi.RegionSize < pb)".

The toolchain compiler gcc 4.1.2 does not have the buggy behavior of course.

/work/dr/test/gcc-underflow-bug.c:
#include <stdio.h>

int main()
{
unsigned char *ptr1 = (unsigned char *) 0x0000000071542000;
unsigned char *ptr3 = ptr1 - 0x80000000; /* 0xfffffffff1542000 */
if (ptr3 < ptr1)
printf("ptr3 less than\n");
if ((ptr1 - 0x80000000) < ptr1)
printf("temp less than\n");
}

#if 0
gcc 4.3.0 -O0:
int main()
{
4004cc: 55 push %rbp
4004cd: 48 89 e5 mov %rsp,%rbp
4004d0: 48 83 ec 20 sub $0x20,%rsp
unsigned char *ptr1 = (unsigned char *) 0x0000000071542000;
4004d4: 48 c7 45 f0 00 20 54 movq $0x71542000,-0x10(%rbp)
4004db: 71
unsigned char *ptr3 = ptr1 - 0x80000000; /* 0xfffffffff1542000 */
4004dc: 48 8b 45 f0 mov -0x10(%rbp),%rax
4004e0: 48 05 00 00 00 80 add $0xffffffff80000000,%rax
4004e6: 48 89 45 f8 mov %rax,-0x8(%rbp)
if (ptr3 < ptr1)
4004ea: 48 8b 45 f8 mov -0x8(%rbp),%rax
4004ee: 48 3b 45 f0 cmp -0x10(%rbp),%rax
4004f2: 73 0a jae 4004fe <main+0x32>
printf("ptr3 less than\n");
4004f4: bf 08 06 40 00 mov $0x400608,%edi
4004f9: e8 ba fe ff ff callq 4003b8 <puts@plt>
if ((ptr1 - 0x80000000) < ptr1)
printf("temp less than\n");
4004fe: bf 17 06 40 00 mov $0x400617,%edi
400503: e8 b0 fe ff ff callq 4003b8 <puts@plt>
}
400508: c9 leaveq
400509: c3 retq

/build/toolchain/lin32/gcc-4.1.2-2/bin/x86_64-linux-gcc:
if ((ptr1 - 0x80000000) < ptr1)
4004ca: 48 8b 45 f0 mov -0x10(%rbp),%rax
4004ce: 48 05 00 00 00 80 add $0xffffffff80000000,%rax
4004d4: 48 3b 45 f0 cmp -0x10(%rbp),%rax
4004d8: 73 0a jae 4004e4 <main+0x4c>
printf("temp less than\n");
4004da: bf db 05 40 00 mov $0x4005db,%edi
4004df: e8 d4 fe ff ff callq 4003b8 <puts@plt>
}
#endif
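
For context on why the compiler feels entitled to do this: C leaves pointer arithmetic that steps outside an object undefined, so "ptr - const < ptr" can be assumed true. One common way to keep wraparound checks well defined is to do the arithmetic on uintptr_t instead. The helpers below are a sketch of that approach with hypothetical names; they are not necessarily the workaround that was applied for this issue.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if base - delta would wrap below address zero.  Unsigned integer
 * comparison is fully defined, so the compiler cannot fold it away. */
static bool sub_would_underflow(const void *base, uintptr_t delta)
{
    return (uintptr_t)base < delta;
}

/* True if base + size would wrap past the top of the address space:
 * the integer form of the "if (pb + mbi.RegionSize < pb)" idiom. */
static bool add_would_overflow(const void *base, uintptr_t size)
{
    return size > UINTPTR_MAX - (uintptr_t)base;
}
```

With the values from the test app, sub_would_underflow((void *)0x71542000, 0x80000000) is true, matching the intent of the check in REACHABLE_32BIT_END().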

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=14

file Issues on existing lists of bugs and features

From [email protected] on February 11, 2009 16:05:23

I have some long lists of bugs and features that should be transferred into
this Issue tracker. In particular, anything referenced in the code via
either a Determina or VMware bug database number, so future developers can
go look at the details.

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=3

qin.zhao/2009/i1-post-syscall2

From [email protected] on February 16, 2009 22:27:33

stats:
110 diff lines

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=25

CRASH x64 drinject stack mis-alignment: instr_create_restore_from_dc_via_reg()

From [email protected] on February 15, 2009 15:49:30

Just running release build x64 drinject on
suite/tests/client-interface/strace I hit a somewhat non-deterministic
crash: only happens within test harness. The problem is that drinject is
not aligning the stack to 16 for the 3 calls it makes. The crash shows up
like this:

(adc.158): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
dynamorio!instr_create_restore_from_dc_via_reg+0x10c:
000000007107fb2c 0f28442420 movaps xmm0,xmmword ptr [rsp+20h] ss:000000000012fa68=0000000800000128000000001f009006
0:000> kn

Child-SP RetAddr Call Site

00 000000000012fa48 0000000071066448
dynamorio!instr_create_restore_from_dc_via_reg+0x10c
[d:\derek\opensource\dynamorio\core\x86\instr.c @ 4818]
01 000000000012fa98 0000000071068ab4
dynamorio!emit_fcache_enter_common+0x128
[d:\derek\opensource\dynamorio\core\x86\emit_utils.c @ 3173]
02 000000000012fc78 000000007105b103
dynamorio!emit_fcache_enter_shared+0x14
[d:\derek\opensource\dynamorio\core\x86\emit_utils.c @ 3825]
03 000000000012fcb8 000000007105b7c0 dynamorio!shared_gencode_init+0x113
[d:\derek\opensource\dynamorio\core\x86\arch.c @ 308]
04 000000000012fd48 000000007102470b dynamorio!arch_init+0x10
[d:\derek\opensource\dynamorio\core\x86\arch.c @ 523]
05 000000000012fd78 0000000071090562 dynamorio!dynamorio_app_init+0x10b
[d:\derek\opensource\dynamorio\core\dynamo.c @ 463]
06 000000000012fde8 0000000071093a00 dynamorio!auto_setup+0x22
[d:\derek\opensource\dynamorio\core\x86\x86_code.c @ 144]
07 000000000012fe28 0000000000000000 dynamorio!dynamo_auto_start+0x10
0:000> r
rax=0000000000000000 rbx=000000001f602270 rcx=ffffffffffffffff
rdx=0000000000000068 rsi=ffffffffffffffff rdi=ffffffffffffffff
rip=000000007107fb2c rsp=000000000012fa48 rbp=0000000000000000
 r8=000000001f5d2111  r9=0000000000000128 r10=0000000000000000
r11=000000000012fbb8 r12=0000000000000000 r13=0000000000000000
r14=000000001f601570 r15=000000001f602280
iopl=0 nv up ei pl zr na po nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010246
dynamorio!instr_create_restore_from_dc_via_reg+0x10c:
000000007107fb2c 0f28442420 movaps xmm0,xmmword ptr [rsp+20h] ss:000000000012fa68=0000000800000128000000001f009006
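
The faulting operand in the dump above is rsp+20h = 0x12fa68, which is only 8-byte aligned, and movaps faults on any memory operand that is not 16-byte aligned. A small sketch of the arithmetic follows; the helper names are hypothetical and this is not the actual drinject fix.

```c
#include <stdbool.h>
#include <stdint.h>

#define STACK_ALIGNMENT 16u

/* Round a stack pointer down to the next 16-byte boundary (the stack
 * grows down, so rounding down never overwrites live data above it). */
static uintptr_t align_stack_down(uintptr_t sp)
{
    return sp & ~(uintptr_t)(STACK_ALIGNMENT - 1);
}

/* True if addr is usable as a movaps memory operand. */
static bool movaps_operand_ok(uintptr_t addr)
{
    return (addr & (STACK_ALIGNMENT - 1)) == 0;
}
```

Plugging in the register dump: movaps_operand_ok(0x12fa48 + 0x20) is false, while the same offset from align_stack_down(0x12fa48) = 0x12fa40 is fine.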

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=13

distinguish external and internal drinject

From [email protected] on February 24, 2009 10:39:55

We build drinject for the release package with EXTERNAL_INJECTOR=1 to
provide a simplified interface. I wasn't sure which to build from the
top-level Makefile: maybe both, and we should rename one of them to avoid
confusion?

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=39

support clients using standard libraries: STL in particular

From [email protected] on February 19, 2009 13:52:10

This is a key feature to make it easier to write clients.
This was PR 200203.

If the library is linked statically and has all of its calls to other
libraries and system calls wrapped (and those calls do not interact with
other user-mode resources) then it should be safe to use.

We already provide __wrap_malloc, __wrap_realloc, and __wrap_free so
clients can easily use ld's -wrap feature to override malloc, realloc, and
free at link time. In practice this may be sufficient for many libraries,
though there are no guarantees unless we wrap all calls to libc and all raw
system calls.
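
For readers who have not used the linker's wrap feature: with GNU ld, --wrap=malloc makes undefined references to malloc resolve to __wrap_malloc (and __real_malloc resolve to the original). The sketch below shows the shape of such a replacement. The bump arena is purely a hypothetical stand-in so the example is self-contained; it is not how DynamoRIO's own __wrap_malloc is implemented.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical private heap standing in for the tool's allocator. */
static _Alignas(16) unsigned char arena[4096];
static size_t arena_used;

/* Receives every malloc() call in objects linked with --wrap=malloc. */
void *__wrap_malloc(size_t size)
{
    size_t aligned = (size + 15u) & ~(size_t)15u;  /* keep 16-byte alignment */
    if (aligned < size || aligned > sizeof(arena) - arena_used)
        return NULL;                               /* overflow or arena full */
    void *p = &arena[arena_used];
    arena_used += aligned;
    return p;
}

/* A bump allocator cannot free; a real replacement would return the
 * block to the private heap here. */
void __wrap_free(void *ptr)
{
    (void)ptr;
}
```

Assuming a gcc driver, a client would be linked with something like -Wl,--wrap=malloc -Wl,--wrap=free; calls made inside a prebuilt static library are only redirected if that library's objects go through the same link step.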

There are still issues with easily creating a version of STL libraries that
are PIC but static. It would be nice if we provided instructions for how
to do so, since the provided static libraries on more modern Linux distros
do not work properly. We got this to work only on a 32-bit system with
provided libraries (see api/docs/intro.dox sec_extlibs); on other systems
we either hit link-time or run-time errors.

Xref http://www.govirtual.org/message/1110

There are also issues with STL libraries making system calls that hit our
vsyscall hook (I put in a debug build check that then suggests
-sysenter_is_int80 as a workaround) or bypass our handling.

It would also be nice to have STL support on Windows.

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=30

ASSERT suite/tests/runall/detach_test.exe: x86\emit_utils.c:6924 after_shared_syscall_code_ex(dcontext _IF_X64(mode)) < pc && nxt_pc < pc

From [email protected] on February 16, 2009 17:52:51

short regression run #1 with -code_api:
runlog.001:> <Application
d:\derek\dr\regression\cleanup\suite\tests\runall\detach_test.exe (2164).
Internal Error Internal DynamoRIO Error: x86\emit_utils.c:6924
after_shared_syscall_code_ex(dcontext _IF_X64(mode)) < pc && nxt_pc < pc

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=22

derek.bruening/2009/i13-i14-x64-bugs

From [email protected] on February 15, 2009 18:56:07

Fixed some x64 bugs:

issue #13: CRASH x64 drinject stack mis-alignment: instr_create_restore_from_dc_via_reg()

fixed by padding prior to any assumptions on TOS

issue #14: underflow bug in gcc 4.3.0

workaround for now
also fixed 64-bit warnings in strace client that show up with VS2005 compiler

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=15

targeted injection on Linux via ptrace

From [email protected] on February 24, 2009 09:59:58

we should implement targeted injection a la drinject on Linux, via ptrace,
as an alternative when LD_PRELOAD is unavailable (due to security policies)

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=37

tests: suite/tests/* object files and results/ should go in top-level build/

From [email protected] on February 15, 2009 15:43:32

xref issue #6

it would be nice to clean up /suite/tests/* object files to go into
something like /build/tests/

we could put the results/ dir there as well

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=12

optimize and shrink clean call sequences

From [email protected] on February 24, 2009 11:01:08

With a post optimization pass, or perhaps more clever use of existing
passes, we could do much better on calling convention (PR 250976) and xsp
(PR 307242) conflicting args.

We should also really consider inlining client callees (PR 218907), since
clean calls for 64-bit are enormous (71 instrs/264 bytes for 2-arg x64;
26 instrs/99 bytes for x86) and we could avoid all the xmm saves and replace
pushf w/ lahf.

Xref PR 264138, PR 302107, PR 306394.

We can scan the callee and see which registers and flags do not need to be
saved.
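
A toy model of that callee scan, to show the kind of saving at stake. Everything here is an assumption made for illustration: the TOY_* register bits, the toy_instr_t record, and the idea that each instruction carries a precomputed "writes" mask. A real pass would derive this from decoded instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* One bit per piece of state a clean call might have to preserve.
 * These names are made up for the sketch, not DynamoRIO identifiers. */
enum {
    TOY_XAX    = 1u << 0,
    TOY_XCX    = 1u << 1,
    TOY_XDX    = 1u << 2,
    TOY_EFLAGS = 1u << 3,
    TOY_XMM    = 1u << 4,
    TOY_ALL    = (1u << 5) - 1
};

typedef struct {
    const char *text;    /* disassembly, for humans only */
    uint32_t    writes;  /* bitmask of state this instruction clobbers */
} toy_instr_t;

/* Union of everything the callee may clobber: only this needs saving. */
static uint32_t callee_clobbers(const toy_instr_t *instrs, size_t count)
{
    uint32_t mask = 0;
    for (size_t i = 0; i < count; i++)
        mask |= instrs[i].writes;
    return mask;
}

/* Number of save/restore pairs implied by a clobber mask. */
static unsigned saves_needed(uint32_t mask)
{
    unsigned n = 0;
    for (mask &= TOY_ALL; mask != 0; mask &= mask - 1)
        n++;    /* clears the lowest set bit each iteration */
    return n;
}
```

For a callee that only increments a counter in memory, the mask is just TOY_EFLAGS, so one save replaces the five a conservative clean call would emit in this model.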

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=42

tests: set up nightly regression tests

From [email protected] on February 14, 2009 10:30:29

If we can procure cycles on machines somewhere we should set up nightly or
bi-nightly regression tests.

Our tests at Determina used to take over 24 hours. After issue #8 prunes
them to target just the API they should get shorter.

For a first pass we should have:

64-bit Linux of some sort: can run 32-bit tests here
32-bit Windows XP
64-bit Windows Vista

Due to missing WOW64 follow-children (issue to be filed: xref issue #3 ) we
can't do full 32-bit testing on a 64-bit Windows machine.

Ideally we should have at least one machine with some flavor of Windows NT,
Windows 2000, Windows XP, Windows 2003, and Windows Vista, in each of
32-bit and for the later ones 64-bit, along with several flavors of Linux,
and both AMD and Intel processors of several varieties. We can use virtual
machines for nearly all our tests but it would be nice to have nightly
performance tests on native machines.

We should also have at least one 64-bit Linux and one 64-bit Windows
machine that can run a pre-commit regression test on demand for use by
developers who do not have one of each flavor locally available.

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=11

perf: ibl opts: cmp-vs-literal, no collision cmp

From [email protected] on February 21, 2009 10:55:05

this was PR 215263

recording some things that other systems use to achieve performance on
ibs, and that we may want to try

driven by CGO07 paper "Evaluating Indirect Branch Handling
Mechanisms in Software Dynamic Translation Systems" though that's on
Strata where they cheat on returns and so don't consider them in their
study

sieve == expand table into cmp-vs-literal, trading code for data
footprint (HDTrans)
inlined cmps (like my cgo03): 1st 2 targets for call* good, but jmp*
needs profiling and in fact 1st 2 there is worse
not in that paper but in a vmware paper: no collision check inline,
instead jmp to resolution code, if no conflict jmp straight there

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=32

symbol table lookup support for clients

From [email protected] on February 24, 2009 11:12:45

this was PR 243532

I'm sure plenty of users would like runtime symbol table lookup, though for
most uses post-processing (if simply presenting a source code line to a user,
or passing to an offline tool) or pre-processing would probably work.

Xref our own internal closest-exported-symbol lookup in module.c, and the
very old linux/symtab.c.

For Linux we can get runtime symbol access more easily than for Windows.
We may want to split it off into a separate isolated library. Xref client
utility library discussions.

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=44

handle execve of different-architecture child: 32 to 64, 64 to 32

From [email protected] on February 16, 2009 17:42:26

we'll need to decide from the parent which lib32 vs lib64 to use

xref the analogue on Windows: PR 254193: [x64] inject into
different-architecture child: x64 to WOW64, WOW64 to x64

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=20

update makefiles to build with local tools instead of /build/toolchain

From [email protected] on February 11, 2009 15:46:52

The makefiles are currently assuming that the VMware toolchain is present.
For a proprietary product, IMHO having the toolchain standardized and
either in the repository or on a server is the right way to go, in order to
properly build old versions.

However, for an open source project we want to support building on as wide
a variety of toolchains as possible, and we don't really need to fix bugs
in old versions (though we could support that by listing not only minimum
versions but also maximum in our Makefiles). (Plus, we can't exactly
commit, say, the Windows DDK into our repository.) So instead of only
supporting a single version, long-term we should try to expand support.
There are many issues there as different versions of a compiler bring up
different subtle issues.

Here is a short list of what we currently require:

Linux:

Require gcc built to target both 32-bit and 64-bit
Require gas built to target both 32-bit and 64-bit
Require gas >= 2.18.50 for -msyntax=intel -mmnemonic=intel
Require gas >= 2.16 for fxsaveq
Require gcc >= 3.4 for -fvisibility: though we do have a workaround

Windows:

Require cl for intrinsics
Require cl >= 14.0 to target both 32-bit and 64-bit and for C99 macro varargs
Require cygwin perl and other cygwin tools

Both:

Require doxygen >= 1.5.1, 1.5.3+ better
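
Checking minimum (and possibly maximum) tool versions like those listed above comes down to comparing dotted version strings. A minimal sketch, assuming purely numeric components; `version_compare` is a hypothetical helper, not something in the current makefiles:

```c
#include <assert.h>
#include <stdlib.h>

/* Compares dotted numeric versions such as "2.18.50" and "2.16".
 * Returns <0, 0, or >0 like strcmp.  Missing components count as 0. */
int
version_compare(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' || *b != '\0') {
        char *end_a, *end_b;
        unsigned long va = strtoul(a, &end_a, 10);
        unsigned long vb = strtoul(b, &end_b, 10);
        if (va != vb)
            return va < vb ? -1 : 1;
        /* Stop on anything that is not digits optionally followed by '.'. */
        if ((end_a == a && *a != '\0') || (end_b == b && *b != '\0'))
            break;
        a = (*end_a == '.') ? end_a + 1 : end_a;
        b = (*end_b == '.') ? end_b + 1 : end_b;
    }
    return 0;
}
```

For example, `version_compare(found_gas, "2.18.50") >= 0` would gate the Intel-syntax flags. Comparing component by component matters because a plain string compare would rank "3.10" below "3.4".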

We can split future work (such as switching to cmake, getting version #s
from top level instead of being hardcoded in certain places, parallelizing
the build, re-enabling dependency checking, etc.) off into separate Issues
but for now this issue covers getting things working for the initial
developers (we'll reduce the priority once that's covered).

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=2

build: version number should propagate from top-level Makefile/makezip.sh

From [email protected] on February 16, 2009 14:39:49

split from issue #2: today the version # is in multiple places. they should
all come from a single top-level define, if possible:

figure out whether we're back-compat and update *_COMPATIBLE_VERSION
core/lib/dr_api.h and core/x86/instrument.h and core/x86/instrument.c
api/docs/footer.html
api/docs/release.dox
tools/DRgui/DynamoRIO.rc
top-level Makefile
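
On the C side, one way to derive everything from a single define is preprocessor stringification. This is only a sketch: the `SKETCH_` macro names and the numbers are placeholders, and the non-C files in the list above (html, dox, rc) would still need the build system to substitute the value:

```c
#include <string.h>

/* Hypothetical single top-level definition; the numbers are placeholders. */
#define SKETCH_VERSION_MAJOR 1
#define SKETCH_VERSION_MINOR 3
#define SKETCH_VERSION_PATCH 2

/* Two-level expansion so the macro values, not their names, are stringified. */
#define SKETCH_STR_(x) #x
#define SKETCH_STR(x) SKETCH_STR_(x)

/* Everything below derives from the three numbers above. */
#define SKETCH_VERSION_STRING                                      \
    SKETCH_STR(SKETCH_VERSION_MAJOR) "." SKETCH_STR(SKETCH_VERSION_MINOR) \
        "." SKETCH_STR(SKETCH_VERSION_PATCH)
#define SKETCH_VERSION_NUMBER \
    (SKETCH_VERSION_MAJOR * 10000 + SKETCH_VERSION_MINOR * 100 + SKETCH_VERSION_PATCH)

const char *
sketch_version_string(void)
{
    return SKETCH_VERSION_STRING;
}

/* Returns nonzero if a version string found elsewhere matches the define. */
int
sketch_version_matches(const char *other)
{
    return other != NULL && strcmp(other, SKETCH_VERSION_STRING) == 0;
}
```

The three numbers could be passed in from the top-level Makefile with -D so that the header itself carries no hardcoded version.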

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=17

perf: vtune results: jecxz is bad: experiment w/ jecxz-less ibl

From [email protected] on February 21, 2009 11:00:26

this was PR 212436

latest renewal of vtune analysis: perfctr evidence points at jecxz causing
problems, though in branch misprediction, surprisingly.
we have a list of cases to file from our last analysis; following up on only
one of them here.
two efficient ways to eliminate jecxz:

PR 208795/PR 212049: spill eax,flags and use cmp,jne, w/ ecx spill off
hit path to offset eax/flags preservation
PR 211955: recreate eflags on miss before go to ibl or on hit if live
post-cmp

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=33

port libutil/ and tools/DRview.c to 64-bit

From [email protected] on February 18, 2009 20:31:49

this was PR 244209

in addition to being useful as a standalone tool, this is needed for the
runall tests (xref issue #16)

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=28

support thread-private versus shared on individual thread basis

From [email protected] on February 24, 2009 11:15:47

this was PR 336588

the idea is to support configuring individual threads to use only
thread-private caches while the rest use thread-shared, to avoid memory
costs while allowing efficient thread-private instrumentation

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=45

handle pre-thread-init and post-exit signals

From [email protected] on February 18, 2009 14:38:22

with NPTL (POSIX) a SYS_kill signal is delivered to a single thread chosen
arbitrarily among those that aren't blocking it (each thread has its own
signal mask). this means a thread can receive a signal during our thread
initialization, before we've set up its dcontext and other structures, in
which case we die.

we also have problems when a signal comes in during process exit after
we've removed our handler: the app then dies ungracefully. technically
that could happen natively too but is much less likely since it exits more
quickly. this is less of an issue for release build and is more of an
annoyance for itimer testing.

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=26

port all suite/tests/* to x64

From [email protected] on February 16, 2009 09:34:03

this was PR 262902: [x64] port suite/tests/* to x64

not all of the tests have been ported. the most work involves
externalizing the inline asm, but in the past I put in infrastructure to at
least have it in the same file.

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=16

update runregression for new open-source setup

From [email protected] on February 13, 2009 20:33:18

we need to remove the spec2000 references from "make runregression" and
suite/runregression

the set of tests run should probably be tweaked to reflect the current
priorities

it would be nice to have output sent outside of the source tree: xref PR
196865 and issue #6

and of course there are plenty of cleanups and enhancements that our tests
need, and we should set up some nightly regressions on machines somewhere,
but those will be filed as other cases

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=8

HANG -thin_client any app

From [email protected] on February 16, 2009 19:29:23

-thin_client no longer works
might be nice to revive and keep it working
removed from short regressions for now

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=24

add client event for client or DR fault

From [email protected] on February 24, 2009 14:35:01

this was PR 200414

we'll want to add an event for faults occurring in DR or client code so that
a tool can report such failures in a customized way or try to recover from
client faults.

note we still don't have a good plan for SEH across client/DR boundary -
e.g. when a client passes a NULL pointer to a DR routine we'll currently
take the blame

we already provide dr_safe_{read,write}.
xref issue #51/PR 198875 on adding try/except.

we already provide an exception event on Windows and a signal event on
Linux (PR 304708) for application faults (both of which do not handle
faults in DR or a client, deferring to this case).

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=50

client support for persistent and process-shared caches

From [email protected] on February 24, 2009 10:51:55

this was PR 305329

We should explore pcache support now before the 0.9.6 release to determine
whether we can add it later while preserving backward compatibility.
I think we can:

need client id => we already have it
need to store in pcache, and make part of pcache namespace for
simultaneous pcaches.
for multiple clients, put all ids in namespace, so won't load unless
exact same combination is present on subsequent run.

also add event callback when loading pcache => client can decide whether
to load based on whether already has desired instru in pcache. thus,
want client version # as well (else, client has to bump id#?). could
specify separately when do persist nudge and so not change the dr_init
signature, or make an event callback if DR needs to ask for it. for
multiple clients, if any one of them rejects it, don't use it.

problem: the client id is really just for the current instance of DR and is
specified by the user; it's so the client can request its options, etc.
it's not a persistent tool identifier.

so how will a client know whether its instrumentation is present in a
pcache? do we want two types of identifiers: one that's in the
namespace, for simultaneous pcaches, and another that's in the pcache
header that the client can check for versioning? what if for pcaches we
added two event callbacks: get_client_id() and get_client_version().
they're ints, or maybe 64-bit ints. we use the 1st in the pcache
namespace, and store the 2nd in the pcache header for retrieval when we
load. a client that doesn't supply them isn't allowed to persist. when
DR creates a pcache it will call these for each client so it can label
the pcache with which tools were active when it was created.
problem: confusing having this new notion of id vs the client_id_t used
in dr_init.

=> solution: use client library identifying hash (similar to what we use
for app modules for pcaches and hot patches: checksum + timestamp + size)
in the namespace. if the client is rebuilt, pcaches will have to be
re-generated: that shouldn't be a big deal, and can avoid bugs where the
client dev thinks a change shouldn't affect persistence but really it does.
DR_EMIT_DO_NOT_PERSIST flag so can control on individual fragment
basis => addition only
query whether ilist/fragment is persistable => addition only
control over where pcaches are stored and who they are shared with;
also over when to persist
=> drdeploy/global options, plus maybe API call to freeze/persist
relocation:
- app relocation: client non-meta code: if set translation fields right,
  our own relocation support should cover it
- client meta code relocation: require client to use position-independent
  code? else, how point client at its code later to fix up? make it
  provide a relocation table? in any case should all be additions.
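
To make the namespace idea concrete, here is a sketch of the "exact same combination of clients" check using the checksum + timestamp + size identity mentioned above. The struct layout and function names are hypothetical, and treating client order as significant is an assumption of this sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical identity of a client library: the fields named in the text. */
typedef struct {
    uint32_t checksum;
    uint32_t timestamp;
    uint64_t size;
} client_ident_t;

static int
ident_equal(const client_ident_t *a, const client_ident_t *b)
{
    return a->checksum == b->checksum && a->timestamp == b->timestamp &&
        a->size == b->size;
}

/* Returns 1 only if the pcache was produced with exactly the same client
 * libraries, in the same order, as are loaded now. */
int
pcache_clients_match(const client_ident_t *stored, size_t num_stored,
                     const client_ident_t *loaded, size_t num_loaded)
{
    size_t i;
    assert(stored != NULL || num_stored == 0);
    assert(loaded != NULL || num_loaded == 0);
    if (num_stored != num_loaded)
        return 0;
    for (i = 0; i < num_stored; i++) {
        if (!ident_equal(&stored[i], &loaded[i]))
            return 0;
    }
    return 1;
}
```

A rebuilt client changes at least its timestamp, so the check fails and the pcache is regenerated, which is the behavior argued for above.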

We'll want to finish some internal pieces first:

PR 214016/9581, PR 214084/9649: relocation support
PR 270739: [x64] need vendor pcache flag for lcall,ljmp rex.w differences
PR 214155/9720: offline publisher + two-level hierarchy of sharing
PR 215036/10601, PR 215277/10842: two-phase, lazy md5
perf: PR 215260/10825: measure -no_persist_map_rw_separate prefetch impact
perf: PR 210308/5836: don't want private bbs not being in ibl to
cause perf problems!
PR 206574/2096 (xref PR 215247/10812 winlogon synchall issues)

We also have some performance issues (xref PR 326610):

PR 213262: support -no_indirect_stubs w/ -coarse_units, to avoid losing perf
PR 326815: [perf] -coarse_units slow on gcc and gap

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=40

build: define release package build env; set up nightly regression

From [email protected] on February 24, 2009 09:50:32

for a release package we want maximum binary compatibility

we should have a release build setup as a nightly regression to avoid
breaking it (like in issue #35)

I've been using gcc 4.1.2 with glibc 2.2.5

With more recent toolchain but running on RHEL4 I hit:

bin/drdeploy -client docs/html/samples/build32/bbsize.so 0x1 "" ls
bin/drdeploy: line 242: 1432 Floating point exception$*

Not sure exactly which fp emulation is being assumed there.

If we use too recent a glibc (2.7+) we end up pulling in:

nm rel-1.3.2/linux/dynamorio/lib32/release/libdynamorio.so | grep GLIBC
U __isoc99_sscanf@@GLIBC_2.7

And then on an older system we hit:

bin/drdeploy -client docs/html/samples/build32/bbsize.so 0x1 "" ls
ls: /lib/tls/libc.so.6: version `GLIBC_2.7' not found (required by
/home/derek/rel-1.3.2/dynamorio/lib32/release/libdynamorio.so)

We could avoid the __isoc99_sscanf by defining _GNU_SOURCE, though if it is
defined everywhere we hit issue #34: we could probably get it to work by
defining it just around stdio.h. But it is better to have an older setup in
general, for floating-point support, etc.
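
A nightly check could scan the nm output for symbol versions newer than the chosen baseline. The sketch below handles one line of output at a time; the function names are hypothetical and only the major.minor part of the version is compared:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Extracts the version from an nm line such as
 * "U __isoc99_sscanf@@GLIBC_2.7".  Returns 1 on success, 0 otherwise. */
int
parse_glibc_requirement(const char *line, int *major, int *minor)
{
    const char *tag;
    assert(line != NULL && major != NULL && minor != NULL);
    tag = strstr(line, "GLIBC_");
    if (tag == NULL)
        return 0;
    return sscanf(tag, "GLIBC_%d.%d", major, minor) == 2;
}

/* Returns 1 if the symbol needs a glibc newer than max_major.max_minor. */
int
exceeds_glibc_baseline(const char *line, int max_major, int max_minor)
{
    int major, minor;
    if (!parse_glibc_requirement(line, &major, &minor))
        return 0;
    return major > max_major || (major == max_major && minor > max_minor);
}
```

Feeding each line of `nm libdynamorio.so` through `exceeds_glibc_baseline` and failing the build on any hit would catch the __isoc99_sscanf case before a release goes out.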

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=36

APP CRASH calc.exe with client api/samples/cbr

From [email protected] on February 16, 2009 17:50:30

api/samples/cbr (after instr_convert_short_meta_jmp_to_long() now
returns long-taken and we avoid a reachability assert) causes app crash on
calc when starting up: works fine with -disable_traces so probably
decode_fragment() mis-interpreting the client's jecxz-expansion (xref PR
266292)?

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=21

REG_* enum name conflict with sys/ucontext.h

From [email protected] on February 24, 2009 09:39:38

this was PR 371339

if we build a client that includes ucontext.h, usually via signal.h, with
g++ we hit a name conflict:

g++ -m32 -g3 -DLINUX -DSHOW_RESULTS -DX86_32 -fPIC -shared -nostartfiles
-I/work/opensource/dr/exports/include signal.c -o signal.so
In file included from /usr/include/signal.h:353,
from signal.c:40:
/usr/include/sys/ucontext.h:168: error: conflicting declaration ‘REG_EDI’
/work/opensource/dr/exports/include/dr_ir_opnd.h:73: error: ‘REG_EDI’ has a
previous declaration as ‘ REG_EDI’
/usr/include/sys/ucontext.h:170: error: conflicting declaration ‘REG_ESI’
/work/opensource/dr/exports/include/dr_ir_opnd.h:72: error: ‘REG_ESI’ has a
previous declaration as ‘ REG_ESI’
/usr/include/sys/ucontext.h:172: error: conflicting declaration ‘REG_EBP’
/work/opensource/dr/exports/include/dr_ir_opnd.h:71: error: ‘REG_EBP’ has a
previous declaration as ‘ REG_EBP’
/usr/include/sys/ucontext.h:174: error: conflicting declaration ‘REG_ESP’
/work/opensource/dr/exports/include/dr_ir_opnd.h:70: error: ‘REG_ESP’ has a
previous declaration as ‘ REG_ESP’
/usr/include/sys/ucontext.h:176: error: conflicting declaration ‘REG_EBX’
/work/opensource/dr/exports/include/dr_ir_opnd.h:69: error: ‘REG_EBX’ has a
previous declaration as ‘ REG_EBX’
/usr/include/sys/ucontext.h:178: error: conflicting declaration ‘REG_EDX’
/work/opensource/dr/exports/include/dr_ir_opnd.h:68: error: ‘REG_EDX’ has a
previous declaration as ‘ REG_EDX’
/usr/include/sys/ucontext.h:180: error: conflicting declaration ‘REG_ECX’
/work/opensource/dr/exports/include/dr_ir_opnd.h:67: error: ‘REG_ECX’ has a
previous declaration as ‘ REG_ECX’
/usr/include/sys/ucontext.h:182: error: conflicting declaration ‘REG_EAX’
/work/opensource/dr/exports/include/dr_ir_opnd.h:66: error: ‘REG_EAX’ has a
previous declaration as ‘ REG_EAX’

we hit the same conflict in the core if we build with gcc with _GNU_SOURCE
defined

not clear how to solve w/o renaming our enums, which would break many
existing clients
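
If renaming does turn out to be the only way, one possible shape is a prefixed enum plus opt-in aliases for existing clients. This is a sketch of that option only, not a decision: the `DR_REG_` prefix and the `SKETCH_REG_ENUM_COMPATIBILITY` switch are assumptions made for illustration:

```c
#include <assert.h>

/* Assumed for illustration: a project-specific prefix keeps the enum out of
 * the REG_* namespace that sys/ucontext.h also uses. */
enum {
    DR_REG_EAX,
    DR_REG_ECX,
    DR_REG_EDX,
    DR_REG_EBX,
    DR_REG_ESP,
    DR_REG_EBP,
    DR_REG_ESI,
    DR_REG_EDI
};

/* Hypothetical opt-in switch: a client that does not include ucontext.h
 * defines this to keep using the old names. */
#define SKETCH_REG_ENUM_COMPATIBILITY 1

#ifdef SKETCH_REG_ENUM_COMPATIBILITY
#    define REG_EAX DR_REG_EAX
#    define REG_ECX DR_REG_ECX
#    define REG_EDX DR_REG_EDX
#    define REG_EBX DR_REG_EBX
#    define REG_ESP DR_REG_ESP
#    define REG_EBP DR_REG_EBP
#    define REG_ESI DR_REG_ESI
#    define REG_EDI DR_REG_EDI
#endif

/* Returns nonzero if reg names one of the eight registers above. */
int
sketch_reg_is_gpr32(int reg)
{
    assert(DR_REG_EAX < DR_REG_EDI);
    return reg >= DR_REG_EAX && reg <= DR_REG_EDI;
}
```

The aliases only help clients that never pull in ucontext.h. A client that does include it would have to leave the switch off and move to the prefixed names, so this softens the break rather than avoiding it.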

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=34

decoding: non-optimal encodings

From [email protected] on February 14, 2009 10:13:53

non-optimal encodings:

we require user to specify OPSZ_0 to get c1: o/w we use Ib
we do document this in some INSTR_CREATE macros
we do not use 41 90 but instead 41 87 c0 since complex and we don't trust
that all processors treat 41 90 as a non-nop

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=10

i4-make-review

From [email protected] on February 12, 2009 11:20:53

issue #4: set up automated code reviews

moved diff automation from /core/Makefile to /Makefile
replaced scp with svn commit to /reviews///
replaced email with log message for auto-Issue
removed diff numbers
removed manually adding new files to the diff since svn does that for us
removed per-file commit message support

stats:
570 diff lines
Makefile                   |  120 +++++++++++++++++++++++++++---!
Makefile.custom_build      |   26 !!!!!!
core/Makefile              |  173 -------------------------------------------!!
core/Makefile.custom_build |   80 --------------------
make/compiler.mk           |    3
5 files changed, 107 insertions(+), 260 deletions(-), 35 modifications(!)

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=5

decoding: decoder max should match MAX_INSTR_LENGTH

From [email protected] on February 14, 2009 10:10:53

Today we use a MAX_INSTR_LENGTH define of 17 in various places yet our own
decoder does not stop at 17. The two should match so we can allocate a
buffer and decode from it. Since it is a conservative estimate the maximum
should probably be higher since some processors (at least historically) have
allowed 18, 20, and even (so I've heard, never seen) 24-byte instructions.
The AMD manual claims 15 is the limit, and 17 used to be the typically
assumed limit, but the actual limit at which a processor raises #GP is
variable. We shouldn't make it too high though since we don't want to read
off onto the next page when not necessary.

Example of our decoder on a 23-byte instr (this raises #GP on my Q9300):

0x080494a8 66 66 66 66 66 66 66 data16 nop 0x00000000(%eax,%eax,1)
66 66 66 66 66 66 66 66 0f 1f 84 00 00 00 00 00
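
To show what "the decoder stops at the max" would mean for the example above, here is a toy decoder that understands only this one long-nop form and refuses to read past a caller-supplied limit. It is a sketch for discussion, not DR's decoder, and the function name is made up:

```c
#include <assert.h>
#include <stddef.h>

/* Returns the length of a "66* 0f 1f 84 <sib> <disp32>" long nop, or 0 if the
 * bytes are not that form or the instruction would exceed max_len bytes.
 * Never reads at or beyond buf[max_len] or buf[avail]. */
size_t
toy_decode_long_nop(const unsigned char *buf, size_t avail, size_t max_len)
{
    size_t limit = avail < max_len ? avail : max_len;
    size_t i = 0;
    assert(buf != NULL);
    while (i < limit && buf[i] == 0x66)
        i++; /* skip data16 prefixes */
    /* Need opcode (2) + modrm (1) + sib (1) + disp32 (4) = 8 more bytes. */
    if (limit - i < 8)
        return 0;
    if (buf[i] != 0x0f || buf[i + 1] != 0x1f || buf[i + 2] != 0x84)
        return 0;
    return i + 8;
}
```

With a limit of 17 the 23-byte sequence above is rejected as invalid. With a limit of 24 it decodes as 23 bytes. A buffer sized to the same constant is then always large enough for anything the decoder accepts.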

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=9

libc independence on Linux

From [email protected] on February 24, 2009 11:22:37

I did not yet remove these libc routines:

worried about, but for now only at init:
U dlclose@@GLIBC_2.0
U dlopen@@GLIBC_2.1
used only at init:
U fclose@@GLIBC_2.1
U feof@@GLIBC_2.0
U fgets@@GLIBC_2.0
U fopen@@GLIBC_2.1
U get_nprocs_conf@@GLIBC_2.0
used only for debugging but will likely be problematic:
U getchar@@GLIBC_2.0
U abort@@GLIBC_2.0
U execlp@@GLIBC_2.0
U remove@@GLIBC_2.0
do not use system calls (I did not verify though, just assuming):
U __environ@@GLIBC_2.0
U __udivdi3@@GLIBC_2.0
U __umoddi3@@GLIBC_2.0
U backtrace@@GLIBC_2.1
U backtrace_symbols_fd@@GLIBC_2.1
U dlerror@@GLIBC_2.0
U dlsym@@GLIBC_2.0
U getenv@@GLIBC_2.0
U memcpy@@GLIBC_2.0
U memmove@@GLIBC_2.0
U memset@@GLIBC_2.0
U rindex@@GLIBC_2.0
U setenv@@GLIBC_2.0
U sqrt@@GLIBC_2.0
U sscanf@@GLIBC_2.0
U stderr@@GLIBC_2.0
U stdout@@GLIBC_2.0
U strcasecmp@@GLIBC_2.0
U strchr@@GLIBC_2.0
U strcmp@@GLIBC_2.0
U strncasecmp@@GLIBC_2.0
U strncat@@GLIBC_2.0
U strncmp@@GLIBC_2.0
U strncpy@@GLIBC_2.0
U strstr@@GLIBC_2.0
U strtoul@@GLIBC_2.0
U tolower@@GLIBC_2.0

Note that we actually call dlclose at process exit which is potentially a
more fragile point than during init. Xref PR 308654 for a crash when we
call dlclose.

xref PR 204554: early injection on Linux, which relies on this case, and
for which we need to access env vars directly
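
Accessing env vars directly means walking the NAME=value array ourselves instead of calling getenv. A minimal sketch, assuming we can get hold of the envp array from the initial stack; `sketch_getenv` is a hypothetical name, and the routine deliberately calls nothing from libc:

```c
#include <stddef.h>

/* Returns the value for name in a NULL-terminated "NAME=value" array such as
 * the envp passed at process start, or NULL if absent.  Uses no libc calls. */
const char *
sketch_getenv(char *const *envp, const char *name)
{
    size_t i;
    if (envp == NULL || name == NULL || name[0] == '\0')
        return NULL;
    for (; *envp != NULL; envp++) {
        const char *entry = *envp;
        for (i = 0; name[i] != '\0' && entry[i] == name[i]; i++)
            ; /* advance while the entry matches the name */
        if (name[i] == '\0' && entry[i] == '=')
            return entry + i + 1;
    }
    return NULL;
}
```

Requiring the '=' right after the name keeps a lookup of "HOM" from matching "HOME=...".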

xref glibc issues with binary compatibility: issue #36

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=46

instrument_post_syscall() should be called after post_system_call() processes the real result

From [email protected] on February 11, 2009 13:53:44

I threw in the syscall API too quickly it seems: setting the
mcontext/result post-syscall should be after DR handles the syscall and
should be considered only a cosmetic result for fooling the app.

xref PR 207947 on syscall API feature

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=1

can't use getppid() from new thread to find caller of clone with NPTL

From [email protected] on February 18, 2009 14:54:34

something that wasn't updated properly to handle NPTL: today we call
getppid() from the child to look data up in the parent's dcontext. with
NPTL that does not give us the caller of SYS_clone. plus, the caller could
have exited: ideally the parent should set up a global data structure for
passing info.

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=27

set up automated code reviews

From [email protected] on February 12, 2009 11:02:16

we're going to use the commit-diff-to-svn approach.
this case covers adapting core/Makefile's "make diffsend" set of rules to
use the new approach.

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=4

simultaneous 32-bit and 64-bit library support for x64 DR controlling WOW64 app

From [email protected] on February 24, 2009 14:14:20

Today 32-bit DR can control the 32-bit parts of a WOW64 app but to see all
of the code including the emulation layer we want 64-bit DR able to run the
whole mixed-mode app. Some of the capabilities here also apply to Linux
mixed-mode apps but those are much, much rarer.

here is my list of cases that will eventually be separately filed here:

PR 240257: support 32-bit clients on WOW64? how mix 32 and 64 bit code?
like Pin, give stream to separate clients? or give 32-bit code to
64-bit client?
PR 253431: [wow64] simultaneous 32-bit and 64-bit dll support in 64-bit DR
PR 314367: re-enable x64 DR controlling WOW64 process once it works
PR 272553: [x64] late injection must switch from kernel32 to ntdll for
wow64 children
PR 271317: preserve cs changes from far ctis and iret
PR 283895: [x64][correctness][performance] for x86 code use separate
x86 ibl tables and compacted or separate tls
PR 283152: support high bit preservation across mode changes
PR 284029: [x64] support syscalls in x86 mode
TODO: reg_spill_dcontext_offs(reg_id_t reg):
/* Use REG_E?? instead of REG_X?? to eventually support 32-bit code spills in
 * mixed 64-bit/32-bit execution. */
PR 269595: WOW64 context translation failing when at our own
post-syscall point
PR 254193: [x64] inject into different-architecture child: x64 to
WOW64, WOW64 to x64
=> long-term we'll only support 64-bit-DR in WOW64 following (PR 253431)
PR 253943: [x64] support sysenter
PR 255555: [x64] 32-bit drinject options for launching 64-bit exe
how know if ilist is 32-bit: instr_get_x86_mode() on each instr, or
can assume if 1st then whole is same? shouldn't matter for most ops:
IR is rich enough and cross-platform enough
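
On the last question in the list above, the conservative answer is to check every instr rather than trusting the first. A sketch with stand-in types: `toy_instr_t` and its `is_x86_mode` field are hypothetical and stand in for calling instr_get_x86_mode() on each real instr:

```c
#include <stddef.h>

/* Hypothetical stand-ins for DR's instruction and list types. */
typedef struct toy_instr {
    int is_x86_mode; /* nonzero: 32-bit (x86) instr under a 64-bit DR */
    struct toy_instr *next;
} toy_instr_t;

typedef enum {
    ILIST_EMPTY,
    ILIST_ALL_X86,
    ILIST_ALL_X64,
    ILIST_MIXED
} ilist_mode_t;

/* Walks the whole list and reports whether every instr shares one mode. */
ilist_mode_t
toy_ilist_mode(const toy_instr_t *first)
{
    const toy_instr_t *in;
    if (first == NULL)
        return ILIST_EMPTY;
    for (in = first->next; in != NULL; in = in->next) {
        /* Normalize with ! so any nonzero value counts as x86 mode. */
        if (!in->is_x86_mode != !first->is_x86_mode)
            return ILIST_MIXED;
    }
    return first->is_x86_mode ? ILIST_ALL_X86 : ILIST_ALL_X64;
}
```

The walk is linear in the list length, so caching the result per ilist would be an option if it ever showed up in profiles.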

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=49

auto-inline instrumentation code

From [email protected] on February 24, 2009 11:02:28

auto-inline client calls when possible, with auto-saving of the minimal app
state for the instrumentation code in question

we could eliminate the arg push/pop and func prologue/epilogue and do
simple inlining without much implementation work

xref issue #42 (PR 307874): optimize and shrink clean call sequences based
on callee analysis, short of full inlining

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=43