drace's Introduction

DRace

DRace is a data-race detector for Windows applications which uses DynamoRIO to dynamically instrument a binary at runtime. The binary under test needs no preparation, such as prior instrumentation. While the detector should work with all binaries that use the Windows synchronization API, we focus on applications written in C and C++. Experimental support for hybrid applications containing native and Dotnet parts is implemented as well.

For best results, we recommend providing debug symbols of the application under test along with the binary. Without this information, callstacks cannot be fully symbolized (exported functions only) and user-level synchronization cannot be detected.

Dependencies

Runtime

When using the pre-built releases, only DynamoRIO is required.

Using the DRace Race Detector

Run the detector as follows:

drrun.exe -no_follow_children -c drace-client.dll <detector parameter> -- application.exe <app parameter>
# see limitations for -no_follow_children option
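The command line has three parameter groups: options for DynamoRIO (before `-c`), options for DRace (after the client DLL), and the application with its own arguments (after `--`). The following C++ sketch assembles such an argument list. The helper name `build_drrun_args` is hypothetical and not part of DRace; it only mirrors the layout of the command shown above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: assembles the argument list in the order
//   drrun.exe <DynamoRIO options> -c drace-client.dll <DRace options> -- <app> <app options>
std::vector<std::string> build_drrun_args(
    const std::vector<std::string>& detector_args,
    const std::string& application,
    const std::vector<std::string>& app_args) {
  // DynamoRIO part; -no_follow_children restricts analysis to the first process.
  std::vector<std::string> argv = {"drrun.exe", "-no_follow_children",
                                   "-c", "drace-client.dll"};
  // DRace part: everything between the client DLL and "--".
  argv.insert(argv.end(), detector_args.begin(), detector_args.end());
  // Application part: everything after "--" is passed to the application.
  argv.push_back("--");
  argv.push_back(application);
  argv.insert(argv.end(), app_args.begin(), app_args.end());
  return argv;
}
```

For example, `build_drrun_args({"-d", "fasttrack", "--xml-file", "report.xml"}, "application.exe", {"--fast"})` places `--` at index 8, directly before the application name. The application flag `--fast` is a placeholder.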

Command Line Options

SYNOPSIS
        drace-client.dll [-c <config>] [-d <detector> [<detector-options>]...] [-s <sample-rate>]
                         [-i <instr-rate>] [--lossy [--lossy-flush]] [--excl-traces] [--excl-stack]
                         [--excl-master] [--stacksz <stacksz>] [--no-annotations] [--delay-syms]
                         [--suplevel <level>] [--sup-races <sup-races>] [--xml-file <filename>]
                         [--out-file <filename>] [--logfile <filename>] [--extctrl] [--brkonrace]
                         [--stats] [--version] [-h] [--heap-only]
OPTIONS
        DRace Options
            -c, --config <config>
                    config file (default: drace.ini)
            -d, --detector <detector> <detector-options>
                    race detector (default: tsan)
            sampling options
                -s, --sample-rate <sample-rate>
                    sample each nth instruction (default: no sampling)
                -i, --instr-rate <instr-rate>
                    instrument each nth instruction (default: no sampling, 0: no instrumentation)
            analysis scope
                --lossy
                    dynamically exclude fragments using lossy counting
                --lossy-flush
                    de-instrument flushed segments (only with --lossy)
                --excl-traces
                    exclude dynamorio traces
                --excl-stack
                    exclude stack accesses
                --excl-master
                    exclude first thread
            --stacksz <stacksz>
                    size of callstack used for race-detection (must be in [1,31], default: 31)
            --no-annotations
                    disable code annotation support
            --delay-syms
                    perform symbol lookup after application shutdown
            --suplevel <level>
                    suppress similar races (0=detector-default, 1=unique top-of-callstack entry,
                    default: 1)
            --sup-races <sup-races>
                    race suppression file (default: race_suppressions.txt)
            data race reporting
                --xml-file, -x <filename>
                    log races in valkyries xml format in this file
                --out-file, -o <filename>
                    log races in human readable format in this file
            --logfile, -l <filename>
                    write all logs to this file (can be null, stdout, stderr, or filename)
            --extctrl
                    use second process for symbol lookup and state-controlling (required for Dotnet)
            --brkonrace
                    abort execution after first race is found (for testing purpose only)
            --stats
                    display per-thread statistics on thread-exit
            --version
                    display version information
            -h, --usage
                    display help
        Detector Options
            --heap-only
                    only analyze heap memory (not supported currently)
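The `--lossy` option refers to lossy counting, a streaming algorithm that approximates how often each item occurs while periodically discarding rarely seen items. The sketch below shows the generic algorithm applied to fragment addresses. The class `LossyCounter` and the parameter `epsilon` are illustrative assumptions; how DRace applies the counts and which thresholds it uses is not shown here.

```cpp
#include <cmath>
#include <cstdint>
#include <unordered_map>

// Generic lossy counting over 64-bit keys (e.g. fragment addresses).
// Assumes 0 < epsilon <= 1; the stream is split into buckets of ceil(1/epsilon) items.
class LossyCounter {
 public:
  explicit LossyCounter(double epsilon)
      : width_(static_cast<std::uint64_t>(std::ceil(1.0 / epsilon))) {}

  void observe(std::uint64_t key) {
    auto it = table_.find(key);
    if (it != table_.end()) {
      ++it->second.count;
    } else {
      // A new key may have been missed in up to (bucket_ - 1) earlier buckets.
      table_.emplace(key, Entry{1, bucket_ - 1});
    }
    if (++seen_ % width_ == 0) {  // bucket boundary reached
      prune();
      ++bucket_;
    }
  }

  // Lower bound on the true frequency; 0 if the key was pruned or never seen.
  std::uint64_t estimate(std::uint64_t key) const {
    auto it = table_.find(key);
    return it == table_.end() ? 0 : it->second.count;
  }

 private:
  struct Entry {
    std::uint64_t count;
    std::uint64_t max_error;
  };

  // Drop every entry whose count plus error bound does not exceed the bucket id.
  void prune() {
    for (auto it = table_.begin(); it != table_.end();) {
      if (it->second.count + it->second.max_error <= bucket_) {
        it = table_.erase(it);
      } else {
        ++it;
      }
    }
  }

  std::uint64_t width_;
  std::uint64_t bucket_ = 1;
  std::uint64_t seen_ = 0;
  std::unordered_map<std::uint64_t, Entry> table_;
};
```

With `epsilon = 0.25` the bucket width is 4, so a key seen three times in the first bucket survives the first pruning step while a key seen once is dropped.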

Available Detectors

DRace is shipped with the following detector backends:

  • tsan (internal ThreadSanitizer)
  • fasttrack (note: experimental)
  • dummy (no detection at all)
  • printer (print all calls to the detector)

tsan

The detector is run along with the application. No further threads are started.

fasttrack

An implementation of the FastTrack2 (FT2) algorithm. It is less optimized than tsan and still experimental.
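FastTrack-style detectors keep a vector clock per thread and, per memory location, a compact "epoch" (clock value plus thread id) of the last write. A write races with the previous write if that epoch is not ordered before the current thread's vector clock. The following is a deliberately reduced sketch of that write-write check with lock-based synchronization only. The class `TinyFastTrack` and all of its members are hypothetical and do not reflect DRace's code; read tracking and fork/join handling are omitted.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using Clock = std::uint32_t;
using VectorClock = std::vector<Clock>;

// Epoch "clock@tid": the logical time of one access by one thread.
struct Epoch {
  std::size_t tid;
  Clock clock;
};

class TinyFastTrack {
 public:
  explicit TinyFastTrack(std::size_t threads)
      : clocks_(threads, VectorClock(threads, 0)) {
    for (std::size_t t = 0; t < threads; ++t) clocks_[t][t] = 1;
  }

  // Records a write and returns true if it races with the previous write.
  bool write(std::size_t tid, std::uintptr_t addr) {
    const Epoch now{tid, clocks_[tid][tid]};
    auto it = last_write_.find(addr);
    if (it == last_write_.end()) {
      last_write_.emplace(addr, now);
      return false;
    }
    const Epoch prev = it->second;
    it->second = now;
    // Ordered only if the writer's clock is already known to this thread.
    return prev.clock > clocks_[tid][prev.tid];
  }

  // Lock release: publish the thread's clock, then advance its own entry.
  void release(std::size_t tid, std::uintptr_t lock) {
    locks_[lock] = clocks_[tid];
    ++clocks_[tid][tid];
  }

  // Lock acquire: join the clock published by the last release of this lock.
  void acquire(std::size_t tid, std::uintptr_t lock) {
    auto it = locks_.find(lock);
    if (it == locks_.end()) return;
    for (std::size_t i = 0; i < clocks_[tid].size(); ++i) {
      clocks_[tid][i] = std::max(clocks_[tid][i], it->second[i]);
    }
  }

 private:
  std::vector<VectorClock> clocks_;
  std::unordered_map<std::uintptr_t, Epoch> last_write_;
  std::unordered_map<std::uintptr_t, VectorClock> locks_;
};
```

Two unsynchronized writes from different threads to the same address are reported as a race, whereas a release/acquire pair on the same lock between them establishes the ordering and suppresses the report.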

dummy

This detector does not detect any races. It is there to evaluate the overhead of the other detectors vs the instrumentation overhead.

Externally Controlling DRace

DRace can be externally controlled from a controller (msr.exe) running in a second process. To set the detector state during runtime, the following keys (committed using enter) are available:

e        enable detector on all threads
d        disable detector on all threads
s <rate> set sampling rate to 1/<rate> (similar to `-s` in DRace)
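A sampling rate of 1/n means that only every n-th candidate is analyzed. A minimal sketch of such a periodic sampler with a rate that can be changed at runtime is given below. `PeriodicSampler` is a hypothetical name, and the sketch is an assumption about the general technique, not DRace's implementation.

```cpp
// Selects every n-th call; the period can be changed while running.
class PeriodicSampler {
 public:
  explicit PeriodicSampler(unsigned period)
      : period_(period), countdown_(period) {}

  // Returns true for every period-th call; a period of 0 or 1 selects all calls.
  bool should_sample() {
    if (period_ <= 1) return true;
    if (--countdown_ == 0) {
      countdown_ = period_;
      return true;
    }
    return false;
  }

  // Corresponds to changing the rate to 1/period at runtime.
  void set_period(unsigned period) {
    period_ = period;
    countdown_ = period;
  }

 private:
  unsigned period_;
  unsigned countdown_;
};
```

With a period of 3, nine consecutive calls to `should_sample()` select three of them.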

Symbol Resolving

DRace requires symbol information for wrapping functions and for resolving stack traces. For the main functionality on applications written only in C and C++, export information is sufficient. However, for additional and more precise race detection (e.g. C++11, Qt), debug information is necessary.

The application searches for this information in the path of the module and in _NT_SYMBOL_PATH. However, only local symbols are searched (non SRV parts).

If symbols for system libraries are necessary (e.g. for Dotnet), they have to be downloaded from a symbol server. To do so, set the variable as follows:

set _NT_SYMBOL_PATH="c:\\symbolcache\\;SRV*c:\\symbolcache\\*https://msdl.microsoft.com/download/symbols"

Note: Downloading symbols is only supported if the Debugging Tools for Windows are installed. If not, DRace uses the default dbghelp.dll which is not bundled with symsrv.dll and hence is not able to download symbols.
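To illustrate the distinction between local and symbol-server parts of `_NT_SYMBOL_PATH`, the sketch below splits a path at `;` and keeps only the entries that are not symbol-server entries. The function name `local_symbol_dirs` is hypothetical, and treating the prefixes `srv*` and `symsrv*` (case-insensitive) as server entries is an assumption made for this example; it is not taken from DRace's code.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Case-insensitive prefix test.
static bool starts_with_icase(const std::string& s, const std::string& prefix) {
  if (s.size() < prefix.size()) return false;
  return std::equal(prefix.begin(), prefix.end(), s.begin(),
                    [](char a, char b) {
                      return std::tolower(static_cast<unsigned char>(a)) ==
                             std::tolower(static_cast<unsigned char>(b));
                    });
}

// Returns the local directories of a ';'-separated symbol path,
// skipping empty entries and symbol-server entries.
std::vector<std::string> local_symbol_dirs(const std::string& symbol_path) {
  std::vector<std::string> dirs;
  std::size_t start = 0;
  while (start <= symbol_path.size()) {
    std::size_t end = symbol_path.find(';', start);
    if (end == std::string::npos) end = symbol_path.size();
    const std::string entry = symbol_path.substr(start, end - start);
    const bool is_server = starts_with_icase(entry, "srv*") ||
                           starts_with_icase(entry, "symsrv*");
    if (!entry.empty() && !is_server) dirs.push_back(entry);
    start = end + 1;
  }
  return dirs;
}
```

For a path consisting of a cache directory followed by an `SRV*...` entry, only the cache directory is returned.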

Dotnet

For .Net managed code, a second process (MSR) is needed for symbol resolution. The MSR is started as follows:

ManagedResolver\msr.exe [-v for verbose]

After it is started, DRace connects to MSR using shared memory. The MSR then tries to locate the correct DAC DLL to resolve managed program counters and symbols.

The output (logs) of the MSR is intended for debugging only. The resolved symbols are passed back to DRace and merged with the non-managed ones.

Note: To properly detect dotnet synchronization, pdb symbol information is required. The pdb files have to match the used dotnet libraries exactly. Hence, it is (almost always) mandatory to let the MSR download the symbols. To do so, point the _NT_SYMBOL_PATH variable to an MS symbol server, as shown one section above.

Custom Annotations

Custom synchronization logic is supported by annotating the corresponding code sections. For this purpose, we provide a header with macros in drace-client/include/annotations/drace_annotation.h. To enable these macros, define DRACE_ANNOTATION prior to including the header.

An example of how to use the annotations is provided in test/mini-apps/annotations/.

Testing

The unit tests for the detector backends and other components can be executed with ctest:

# Unit tests
ctest -j4 -T test --output-on-failure

Integration tests for the complete DR-Client can be executed using the following command:

# Integration Tests
# Windows
./bin/drace-system-tests.exe --gtest_output="xml:test-system-results.xml"
# Linux
./bin/drace-system-tests --gtest_output="xml:test-system-results.xml"

Note: Before pushing a commit, please run the integration tests. Later on, bugs are very tricky to find.

Build

DRace is built using CMake. The only mandatory external dependency is DynamoRIO. For best compatibility with Windows 10, use the latest available weekly build. The path to your DynamoRIO installation has to be set using -DDynamoRIO_DIR.

If you want to use the drace-gui, you must also specify the paths to Boost and Qt5.

All other dependencies are either internal or included using git submodules. To clone all submodules of this repository, issue the following commands inside the drace directory:

git submodule init
git submodule update --recursive

A sample Visual Studio CMakeSettings.json configuration is given here:

{
  "name": "x64-Release",
  "generator": "Ninja",
  "configurationType": "RelWithDebInfo",
  "inheritEnvironments": [ "msvc_x64_x64" ],
  "buildRoot": "${env.USERPROFILE}\\CMakeBuilds\\${workspaceHash}\\build\\${name}",
  "installRoot": "${env.USERPROFILE}\\CMakeBuilds\\${workspaceHash}\\install\\${name}",
  "cmakeCommandArgs": "-DBUILD_TESTING=1 -DDynamoRIO_DIR=<PATH-TO-DYNAMORIO>/cmake -DBOOST_ROOT=<PATH-TO-BOOST> -DCMAKE_PREFIX_PATH=<PATH-TO-QT>\\msvc2017_64\\lib\\cmake\\Qt5",
  "buildCommandArgs": "-v",
  "ctestCommandArgs": ""
}

Documentation

Doxygen documentation can be generated by building the doc target.

Dependencies

For detailed information on all dependencies, see DEPENDENCIES.md.

Development

  • CMake > 3.8
  • C++11 / C99 Compiler
  • DynamoRIO > 8.0.x

External Libraries

DRace

Mandatory:

Optional:

DRaceGUI

Mandatory:

Managed Symbol Resolver (MSR)

Mandatory:

Tools

DRaceGUI

The DRaceGUI is a graphical interface for using DRace without typing a very long command into PowerShell. This is especially useful for users who use DRace for the first time.

For more information have a look in here

ReportConverter

The ReportConverter generates an HTML report from the XML report produced by DRace. Use either the ReportConverter.py script or ReportConverter.exe (which is very slow).

For more information have a look in here

Standalone

Set the DRACE_ENABLE_RUNTIME CMake flag to false to build only the standalone components of the DRace project. These components can be used as a standalone data-race detector backend for more general problems.

Standalone Components:

  • drace::detector::Fasttrack (Standalone Version)
  • Binary Decoder

Limitations

All Detectors

  • The size of variables is not considered when detecting races. Only races between accesses to the same (base) address are detected; potential overlaps of variables are ignored.
  • Finished threads are removed from the analysis, so races involving already finished threads are not detected.
  • When using PowerShell, debug outputs from the detector backend are lost. Use e.g. Git Bash instead.
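The first limitation can be made concrete with two accesses that overlap in memory but start at different addresses. The sketch below contrasts a comparison by base address with a comparison by byte range. The type `Access` and both functions are hypothetical and only illustrate the difference.

```cpp
#include <cstddef>
#include <cstdint>

// One memory access: start address and size in bytes.
struct Access {
  std::uintptr_t addr;
  std::size_t size;
};

// Comparison by base address only: sizes are ignored.
bool same_base_address(const Access& a, const Access& b) {
  return a.addr == b.addr;
}

// Comparison by byte range: true if the two accesses share at least one byte.
bool ranges_overlap(const Access& a, const Access& b) {
  return a.addr < b.addr + b.size && b.addr < a.addr + a.size;
}
```

A 4-byte write at `0x1000` and a 1-byte write at `0x1002` touch a common byte, yet their base addresses differ, so a detector that compares base addresses only does not relate them.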

TSAN

  • TSAN can only be started once, as the cleanup is not fully working
  • no_follow_children: Due to the TSAN limitation, drace can only analyze a single process. This process is the initially started one.

Fasttrack (Standalone)

  • On 32-bit architectures, only 16 bits are used to store a thread. This could theoretically cause problems as Windows TIDs are 32 bits (DWORD). If two threads had the same lower 16 bits, they would be considered as the same frame.
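The effect of storing only 16 bits of a 32-bit thread id can be shown with two ids that differ only in their upper half. The helper names below are hypothetical, and the masking is an assumption about how "the last 16 bits" are obtained.

```cpp
#include <cstdint>

// Keeps only the low 16 bits of a 32-bit thread id.
constexpr std::uint16_t stored_tid(std::uint32_t windows_tid) {
  return static_cast<std::uint16_t>(windows_tid & 0xFFFFu);
}

// True if two different thread ids become indistinguishable after truncation.
constexpr bool tids_collide(std::uint32_t a, std::uint32_t b) {
  return a != b && stored_tid(a) == stored_tid(b);
}
```

For instance, the ids `0x00011234` and `0x00021234` both map to `0x1234`.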

Licensing

DRace is primarily licensed under the terms of the MIT license.

Each of its source code files contains a license declaration in its header. Whenever a file is provided under an additional or different license than MIT, this is stated in the file header. Any file that may lack such a header has to be considered licensed under MIT (default license).

If two licenses are specified in a file header, you are free to pick the one that best suits your particular use case. You can also continue to use the file under the dual license. When choosing only one, remove the reference to the other from the file header.

External Resources

Most external resources are located in the vendor directory. For licensing information regarding these components, we refer to the information bundled with the individual resource.

License Header Format

We use the REUSE format for license and copyright information.

/*
 * DRace, a dynamic data race detector
 *
 * Copyright <YEAR> <COPYRIGHT HOLDER>
 *
 * SPDX-License-Identifier: MIT
 */

Citing DRace

A publicly available fulltext of the master's thesis can be found here: High Performance Dynamic Threading Analysis for Hybrid Applications

To cite DRace, please reference:

F. Mößbauer. "High Performance Dynamic Threading Analysis for Hybrid Applications", Master Thesis, Faculty of Mathematics, Computer Science and Statistics, Ludwig-Maximilians-Universität München (2019).

BibTeX

@misc{moes19,
           title = {High Performance Dynamic Threading Analysis for Hybrid Applications},
         keyword = {Concurrency Bugs; Race Condition; Program Analysis; Binary Instrumentation; Sampling; Managed Applications},
        abstract = {Verifying the correctness of multithreaded programs is a challenging task due to errors that occur sporadically. Testing, the most important verification method for decades, has proven to be ineffective in this context. On the other hand, data race detectors are very successful in finding concurrency bugs that occur due to missing synchronization. However, those tools introduce a huge runtime overhead and therefore are not applicable to the analysis of real-time applications. Additionally, hybrid binaries consisting of Dotnet and native components are beyond the scope of many data race detectors.
In this thesis, we present a novel approach for a dynamic low-overhead data race detector. We contribute a set of fine-grained tuning techniques based on sampling and scoping. These are evaluated on real-world applications, demonstrating that the runtime overhead is reduced while still maintaining a good detection accuracy. Further, we present a proof of concept for hybrid applications and show that data races in managed Dotnet code are detectable by analyzing the
application on the binary layer. The approaches presented in this thesis are implemented in the open-source tool DRace.},
            year = {2019},
          author = {Felix M\"o\ssbauer},
             url = {http://nbn-resolving.de/urn/resolver.pl?urn=nbn:de:bvb:19-epub-60621-8}
}

drace's Issues

Fasttrack is not finding indirect access race

Dataracebench test cases:
DRB005-indirectaccess1-orig-yes.exe, DRB006-indirectaccess2-orig-yes.exe, DRB007-indirectaccess3-orig-yes.exe, DRB008-indirectaccess4-orig-yes.exe

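#include <stdio.h>
#include <stdlib.h>

// N and indexSet are defined in the benchmark source and are omitted here.
int main (int argc, char* argv[])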
{
  // max index value is 2013. +12 to obtain a valid xa2[idx] after xa1+12.
  // +1 to ensure a reference like base[2015] is within the bound.
  double * base = (double*) malloc(sizeof(double)* (2013+12+1));
  if (base == 0)
  {
    printf ("Error in malloc(). Aborting ...\n");
    return 1;  
  }

  double * xa1 = base;
  double * xa2 = xa1 + 12;
  int i;

  // initialize segments touched by indexSet
  for (i =521; i<= 2025; ++i)
  {
    base[i]=0.5*i;
  }
// default static even scheduling may not trigger data race, using static,1 instead.
#pragma omp parallel for schedule(static,1)
  for (i =0; i< N; ++i) 
  {
    int idx = indexSet[i];
    xa1[idx]+= 1.0 + i;
    xa2[idx]+= 3.0 + i;
  }

  printf("x1[999]=%f xa2[1285]=%f\n", xa1[999], xa2[1285]);
  free (base);
  return  0;
}
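In the listing above, `xa2` points 12 elements past `xa1`, so `xa2[a]` and `xa1[a + 12]` refer to the same array element. Two loop iterations therefore conflict whenever the index set contains two values that differ by exactly 12. The helper below is a hypothetical check for that condition and assumes a positive distance; it is not part of the benchmark or of DRace.

```cpp
#include <unordered_set>
#include <vector>

// Returns true if the index set contains two values a and a + distance,
// i.e. xa2[a] and xa1[a + distance] alias when xa2 == xa1 + distance.
// Assumes distance > 0.
bool has_aliasing_pair(const std::vector<int>& index_set, int distance) {
  const std::unordered_set<int> values(index_set.begin(), index_set.end());
  for (int idx : index_set) {
    if (values.count(idx + distance) != 0) return true;
  }
  return false;
}
```

With the illustrative index values `{521, 523, 525, 533}` and a distance of 12, the pair (521, 533) aliases, which is the kind of indirect conflict a detector is expected to report.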

DRace crashes on skylake+CLR if HeapAlloc is wrapped

This crash only occurs on a server with a skylake processor:
OS: Windows 10 Enterprise, 10.0.17134
CPU: Intel Xeon Silver 4116
RAM: 80 GB

On a similar system with the Kaby Lake Refresh architecture, the bug does not occur.
OS: Windows 10 Enterprise, 10.0.17134
CPU: Intel Core i5-8350U
RAM: 16 GB

# RetAddr           : Args to Child                                                           : Call Site
00 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x0

The crash is reproducible using this application and happens here:
https://github.com/siemens/drace/blob/master/test/mini-apps/cs-sync/main.cs#L115

The crash is not directly related to DRace, but might be a DR issue. Running the wrap sample client also crashes:

C:\opt\DynamoRIO-Windows-7.91.18219-0\bin64\drrun.exe -c C:\opt\DynamoRIO-Windows-7.91.18219-0\samples\bin64\wrap.dll -- gp-cs-sync-clr.exe monitor

Workaround: comment out the line containing HeapAlloc in drace.ini

Probably related issues:

Implement support for OpenMP barrier (and alike)

Many OpenMP functions internally map to the windows sync api, which is fine. However, some are implemented in VCOMPxxx.dll directly. These have to be intercepted and mapped to happens-before. This especially applies for the barriers at the end of each parallel region.

Debug information for a detector

To debug a detector using drace-test.exe or drrun.exe with the drace-client.dll, the sources of the detector must be listed in the corresponding CMake files.

Is it okay to just list them statically in the CMake file (as shown in the pictures)?

[image: CMakeLists of drace-test]

[image: CMake of drace-client]

sporadic crashes in managed modules

We observe sporadic crashes in large managed applications. When running in debug mode, the crashes happen more often and mostly after loading System.Linq.Expressions.dll.

<Application C:\Program Files\dotnet\dotnet.exe (30488).  Internal Error: DynamoRIO debug check failure: ..\..\core\translate.c:948 false
(Error occurred @245994 frags)
version 7.91.18137, custom build
-no_dynamic_options -client_lib 'C:\Users\felix\CMakeBuilds\a40b8e65-abeb-fe34-9f21-0ec53bca8898\build\x64-Debug-TSAN\drace-client\drace-client.dll;0;"-c" "C:\Users\felix\source\repos\drace\drace.ini" "-d" "dummy" "-i" "0"' -code_api -probe_api -stack_size 56K -max_elide_jmp 0 -max_elide_call 0 -no_inline_ignored_s
0x00000001e597ae20 0x000001f42a426b60

Implement wild-card matching for module names

On Linux, module names look like libc.so.2.6 or libfoo.1.0.0.so. To reliably exclude these modules using the config file (or to add custom logic like function interception), we need wildcard matching support.

For example, libc* could be used to match libc.so.x.
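A minimal sketch of such a matcher is shown below. It supports `*` (any sequence of characters, including none) and `?` (exactly one character) and works without recursion. The function name `wildcard_match` and the choice of supported metacharacters are assumptions for illustration, not a description of an existing DRace feature.

```cpp
#include <cstddef>
#include <string>

// Glob-style matching of a whole string against a pattern with '*' and '?'.
bool wildcard_match(const std::string& pattern, const std::string& text) {
  std::size_t p = 0, t = 0;
  std::size_t star = std::string::npos;  // position of the last '*' seen
  std::size_t mark = 0;                  // text position that '*' has consumed up to
  while (t < text.size()) {
    if (p < pattern.size() && pattern[p] == '*') {
      star = p++;  // remember the star and first try to match it with nothing
      mark = t;
    } else if (p < pattern.size() &&
               (pattern[p] == '?' || pattern[p] == text[t])) {
      ++p;
      ++t;
    } else if (star != std::string::npos) {
      p = star + 1;  // backtrack: let the last '*' absorb one more character
      t = ++mark;
    } else {
      return false;
    }
  }
  // Only trailing stars may remain in the pattern.
  while (p < pattern.size() && pattern[p] == '*') ++p;
  return p == pattern.size();
}
```

With this matcher, `libc*` matches `libc.so.6` but not `libfoo.1.0.0.so`, and `libfoo*.so` matches `libfoo.1.0.0.so`.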

Map sync function names to corresp. detection logic on linux

We have to port the function interception and wrapping logic to Linux. This includes the following steps:

  • add sync function names to the corresponding section in drace.ini (or drace_linux.ini)
  • implement symbol lookup for non-exported symbols on linux

TSan thread ids do not match DR ones

The thread ids reported by tsan on a data race should match the thread ids of DynamoRIO.

C:\Users\Z0040SWB\source\repos\drace\test\src\DetectorTest.cpp(190): error: Expected equality of these values:
a1.thread_id + a2.thread_id
Which is: 2657508032
90 + 91
Which is: 181

Unit test added in 3b68707

Binary release marked as unsafe in Chrome

Chrome warns the user that the drace binary release is an "unsafe" download. We should find a way to avoid this message if possible. Otherwise, we should document it in the readme.

Furthermore, we should make our build process fully transparent and repeatable, which implies generating a manifest with the VS version, compiler, OS version, and checksums used.

Investigate crash on linux i386 in DR-release mode

DynamoRIO 7.91.271 (with client) crashes in non-debug mode during startup on linux i386. This applies to all dynamorio clients, hence it must be a DR bug. In debug mode, DRace runs successfully.

DynamoRIO internal crash at PC 0x00000035.  Please report this at http://dynamorio.org/issues/.  Program aborted.
Received SIGSEGV at unknown pc 0x00000035 in thread 78868
Base: 0xf7d5f000
Registers:eax=0x00000000 ebx=0xfff18e14 ecx=0x00000035 edx=0xf78a5a40
        esi=0xf7f5c09c edi=0xfff18e08 esp=0xfff18dec ebp=0xfff18e0e
        eflags=0x00010286
version 7.91.18271, custom build

Update documentation

Some sections of the documentation (Readme, Contributing, ...) are a bit outdated.

  • Readme: how to run tests
  • Standalone: Supported environments

... list will be continued

Investigate client crash when using xml printer on managed application

<Application C:\Program Files\dotnet\dotnet.exe (23768).  Race-Detection Tool 'DRace' internal crash at PC 0x00000262246e2576.  Please report this at https://github.com/siemens/drace/issues.  Program aborted.
0xc0000005 0x00000000 0x00000262246e2576 0x00000262246e2576 0x0000000000000000 0xffffffffffffffff
Base: 0x0000000071000000
Registers: eax=0x01e47a794d520680 ebx=0x00007ff5f3f2e480 ecx=0x00000000000001e7 edx=0x00000262241658c0
        esi=0xffffffffffffffff edi=0x00000000000000c8 esp=0x00007ff5f3f2e3b0 ebp=0x0000000000000001
        r8 =0x00000000000000c8 r9 =0xffffffffffffffff r10=0x00007ff613ea26e4 r11=0x00007ff5f3f6ed5c
        r12=0x00007ff5f3f2eb30 r13=0x00007ff5f3f2ea80 r14=0x0000000000000001 r15=0x0000000000000004
        eflags=0x00007ff500
version 7.1.17990, custom build
0:007> ~* kb
.  7  Id: 5cd8.888 Suspend: -1 Teb: 000000cb`b4992000 Unfrozen
 # RetAddr           : Args to Child                                                           : Call Site
00 00007ffe`ad5b9252 : 00000000`00000048 00000000`00000001 00000000`00000000 00000000`00000000 : 0x00007ff5`f3ec2e02
01 00007ffe`515f77d8 : 00000262`241aa040 00000000`00000000 00000000`00000000 00000000`00000220 : KERNELBASE!WaitForSingleObjectEx+0xa2

ln 0x00000262246e2576
(00000262`246e2518)   ucrtbase_262246e0000!common_vsnprintf_s<char>+0x5e   |  (00000262`246e2610)   ucrtbase_262246e0000!o__errno

Optimize inline instrumentation

The inline instrumentation has a serious performance impact on managed applications.
A possible mitigation is to move more code to the code cache.

  • investigate which parts can be moved to cache
  • find benchmark application
  • implement proof of concept

Replace std::system with boost::process::child in tests

Currently, we use std::system to trigger an execution of DR + DRace + MSR in the integration tests. This works as long as the test does not crash. Otherwise, the MSR does not terminate.

With std::system we cannot send a signal to a process to force termination, as we do not get the PID. With boost::process this is possible in a cross-platform way.

Evaluate Fasttrack2 algorithm for race-detection

As stated in #11, using TSAN imposes multiple limitations and technical issues on the detection backend. This is mainly due to the direct-address mapping strategy, implemented in TSAN.

In the thesis, I proposed to also evaluate other race-detection backends like Fasttrack2. This was not possible due to the limited time, as well as due to the focus on the instrumentation part.
However, DRace is already prepared for this scenario as the detector is connected using a generic interface.

For Fasttrack2, there currently exists only an OSS implementation for Java, which would have to be ported to C++.
Fortunately, this code is well documented and the logic behind it is described in this paper, so a port should not be too time-consuming.

crash in gp-concurrent-inc when running integration tests with debugger

When debugging the drace-test.exe with fasttrack or tsan, gp-concurrent-inc throws an exception in exe_common.inl just after drintegration.h issued the drrun command.

To reproduce the exception, run a drace-test.exe compiled with TSAN or FASTTRACK in the Visual Studio debugger and wait until the first integration test has started.

Analyze performance impact of shadow stack

Tests on large managed applications indicate that the shadow stack becomes a performance bottleneck.

Tasks:

  • profile DRace with Windows Performance Toolkit
  • check if fast-path can be implemented in shadow stack
  • use faster hash-maps
    • early tests show that this is not trivial due to DR limitations
    • give DR tables a try

Do not use stringstream as it sporadically crashes

Debugging has shown that using std::stringstream leads to sporadic crashes of DRace. We observed these crashes mainly on managed applications.

We use the stringstream as a convenient way to format things, but this should be easily replaceable by C++ strings or, better, C strings.
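As an illustration of formatting without `std::stringstream`, the sketch below writes into a fixed-size stack buffer with `std::snprintf`. The function `format_access` and its output format are hypothetical. Whether `std::snprintf` avoids the crash described in this issue is not established here; inside a DynamoRIO client, DynamoRIO's own `dr_snprintf` would be the analogous call to evaluate.

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats one access as "T<tid> <read|write> @0x<address>" using a stack buffer.
std::string format_access(unsigned thread_id, std::uint64_t address,
                          bool is_write) {
  char buf[64];
  const int n = std::snprintf(buf, sizeof(buf), "T%u %s @0x%" PRIx64,
                              thread_id, is_write ? "write" : "read", address);
  if (n < 0) return std::string();  // encoding error
  // snprintf reports the untruncated length; clamp it to the buffer size.
  const std::size_t len = static_cast<std::size_t>(n) < sizeof(buf)
                              ? static_cast<std::size_t>(n)
                              : sizeof(buf) - 1;
  return std::string(buf, len);
}
```

Calling `format_access(7, 0x1f40, true)` yields `T7 write @0x1f40`.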

Debug

The crash happens at the following address:

drace_client!__acrt_release_locale_ref+0x11 [minkernel\crts\ucrt\src\appcrt\locale\locale_refcounting.cpp @ 76]:
00007ff7`8c0103a1 f044014910      lock add dword ptr [rcx+10h],r9d
00007ff7`8c0103a6 488b81e0000000  mov     rax,qword ptr [rcx+0E0h]
00007ff7`8c0103ad 4885c0          test    rax,rax
00007ff7`8c0103b0 7404            je      drace_client!__acrt_release_locale_ref+0x26 (00007ff7`8c0103b6)
00007ff7`8c0103b2 f0440108        lock add dword ptr [rax],r9d
00007ff7`8c0103b6 488b81f0000000  mov     rax,qword ptr [rcx+0F0h]
00007ff7`8c0103bd 4885c0          test    rax,rax
00007ff7`8c0103c0 7404            je      drace_client!__acrt_release_locale_ref+0x36 (00007ff7`8c0103c6)

Implement interceptors for mem* and str* functions

Currently, memory that is manipulated using these functions is either out-of-scope, or leads to many false-positives. To counter this, it's better to not instrument the implementation of these functions, but to intercept them and feed the processed data directly into the detector.

Be aware, just wrapping the exported symbols in ucrtbase, kernel, etc. is not sufficient, as some parts are directly inlined into the application, or aliased, or dispatched to specific implementations.

See also the function interception in DrMemory: https://github.com/DynamoRIO/drmemory/blob/d261a4dc254016355f64ebf5eff9187dccb34eb2/drmemory/replace.c#L955

Add option to start MSR in GUI

An option (e.g. checkbox, or button) should be added to the GUI to run a powershell with the MSR.
Here, we should consider that the MSR is long-running, i.e. it is not terminated after each run. If implemented using a checkbox, we could set the --once flag to terminate the MSR after drace terminates.

Write tutorial on how to use drace

Write an end-to-end tutorial for new DRace users. This should cover the following topics:

  • Get DRace + DynamoRIO
  • Non-trivial sample application with a data-race
  • How to use the GUI (with screenshots)
  • How to interpret the generated report
  • How to fix the data-race

The tutorial can be part of either the documentation, or the github repository.

DynamoRIO bug preventing execution on AVX-512 capable CPUs

Error observed on a machine with 80GB MEM

C:\opt\DynamoRIO-Windows-7.91.18137-0\bin64\drrun.exe -debug -- C:\Temp\x64-Release-TSAN\bin\test\mini-apps\concurrent-inc\gp-concurrent-inc.exe
<Starting application C:\Temp\x64-Release-TSAN\bin\test\mini-apps\concurrent-inc\gp-concurrent-inc.exe (3536)>
<Early threads found>
<Initial options = -no_dynamic_options -code_api -probe_api -stack_size 56K -max_elide_jmp 0 -max_elide_call 0 -no_inline_ignored_syscalls -native_exec_default_list '' -no_native_exec_managed_code -no_indcall2direct -pad_jmps_mark_no_trace >
<Application C:\Temp\x64-Release-TSAN\bin\test\mini-apps\concurrent-inc\gp-concurrent-inc.exe (3536) DynamoRIO usage error : encode error: rip-relative reference out of 32-bit reach>
<Usage error: encode error: rip-relative reference out of 32-bit reach (..\..\core\arch\x86\encode.c, line 3112)
version 7.91.18137, custom build
-no_dynamic_options -code_api -probe_api -stack_size 56K -max_elide_jmp 0 -max_elide_call 0 -no_inline_ignored_syscalls -native_exec_default_list '' -no_native_exec_managed_code -no_indcall2direct -pad_jmps_mark_no_trace >

pipeline is failing for fasttrack detector

To load the fasttrack DLL into drace-tests.exe, some DynamoRIO DLLs must be in the same folder as drace-tests.exe. Otherwise, all tests fail.

These DLLs are located in .../DynamoRIO/lib64:
[image: list of the required DLLs]

more reliable symbol loading in MSR

To properly wrap .Net synchronization functions, debug information beyond exports is required. The debugging information is queried using the MSR which internally uses dbghelp.dll.

If this DLL is not copied to the MSR folder, only local and cached symbols are available, because requests to an MS symbol server require symsrv.dll to reside in the same directory. This is not the case with the default version shipped with Windows.

Due to unclear redistribution policies, we cannot just copy these files and ship them with DRace.

Possible solution:

  • document external dependency to Debugging Tools For Windows
  • add install directory to dll search path
  • issue a warning if target application is managed but sync functions cannot be found

make central TLS type standard-layout again

Due to the aligned-stack (which inherits from aligned buffer), per_thread_t is no longer standard-layout. Hence, it might be unsafe to use offsetof() on its members. For most compilers this should not be an issue, but we should fix it anyway.

A possible solution would be to carve out the shadow stack and put it into its own TLS slot.

Diagnostics (both clang and gcc-8)

drace/drace-client/src/instr/instr-mem-full.cpp:169:44: warning: offsetof within non-standard-layout type ‘drace::per_thread_t’ is conditionally-supported [-Winvalid-offsetof]
  opnd2 = OPND_CREATE_MEMPTR(reg3, offsetof(per_thread_t, buf_end));
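The layout rule behind this warning can be checked at compile time with `std::is_standard_layout`. In the sketch below, a class with data members in both a base class and the derived class is not standard-layout, and neither is a struct that contains it, while a struct holding only plain pointers is. All type names are hypothetical stand-ins and are not DRace's actual types.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-ins for the types mentioned in this issue.
struct AlignedBuffer {
  void* data;
};

// Data members in both the base and the derived class: not standard-layout.
struct ShadowStack : AlignedBuffer {
  int entries;
};

// Embedding a non-standard-layout member makes the whole struct non-standard-layout.
struct PerThreadWithStack {
  void* buf_end;
  ShadowStack stack;
};

// Without the embedded stack (e.g. kept in a separate TLS slot), the struct
// is standard-layout again and offsetof is fully supported.
struct PerThreadPlain {
  void* buf_ptr;
  void* buf_end;
};

static_assert(!std::is_standard_layout<ShadowStack>::value,
              "base and derived both have data members");
static_assert(!std::is_standard_layout<PerThreadWithStack>::value,
              "contains a non-standard-layout member");
static_assert(std::is_standard_layout<PerThreadPlain>::value,
              "plain pointers only");

constexpr std::size_t kBufEndOffset = offsetof(PerThreadPlain, buf_end);
```

A `static_assert` of this kind next to the TLS type would turn a future layout regression into a compile error instead of a warning.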

Parameterize testing code to run with multiple detector backends

The testing code (unit + integration) should be parameterized to test all implemented detector backends.

Note: For the unit tests, this is not trivial, as the current TSAN detector must be initialized and finalized only once per process. It is also not possible to perform a clean unload of the library.

Xref #15

Implement memory allocation interceptors for full 64 bit support

Currently, only the lower 32 bits of each memory address are considered for race detection. For the PoC this was sufficient, but for real applications the full addresses have to be analyzed.

The reason behind this decision was to avoid changes to the memory mapping table of TSAN. To properly handle all accesses, we have to make sure that all memory we want to track is inside either the EXECUTABLE region or a heap.

To achieve this, we have to shift all heaps into a shadowable region or, alternatively, change the regions in TSAN.

premise for detector

Premises for a detector back-end (what a back-end may assume to be true; to be added to the documentation):

  • the first appearance of a thread in the detector is a fork
  • a read or write never carries a tid that was not forked before
  • reads can happen before writes
  • (open question) the first lock acquire always happens before the first release
  • a happens_after may arrive before the corresponding happens_before (open question: does this mean that no backward synchronization of a happens_after is needed?)
  • no double forks of same thread as child
  • ...

Reviews and extension to be made
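Some of these premises can be checked mechanically on the event stream that reaches a back-end. The sketch below covers only the three unambiguous ones: a thread first appears as the child of a fork, accesses only come from forked threads, and no thread is forked twice. The class `PremiseChecker` and its method names are hypothetical and not part of the DRace detector interface.

```cpp
#include <cstdint>
#include <unordered_set>

// Validates a stream of fork and access events against the premises above.
class PremiseChecker {
 public:
  // Returns false if `child` has already been forked (double fork).
  bool on_fork(std::uint32_t /*parent*/, std::uint32_t child) {
    return forked_.insert(child).second;
  }

  // Returns false if a read or write carries a tid that was never forked.
  bool on_access(std::uint32_t tid) const {
    return forked_.count(tid) != 0;
  }

 private:
  std::unordered_set<std::uint32_t> forked_;
};
```

A unit test for a back-end could route every call through such a checker and fail as soon as one of the methods returns false.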

Port MSR to Linux

To port the MSR to Linux, the shared-memory system has to be ported as well. After that, remove the #ifdef WINDOWS around the calls to the MSR.

next drace-gui features

  • Load/Save of a configuration
  • open html report after creation
  • possibility to build gui without boost (Note: boost is now mandatory to be able to build the gui app)

Custom Allocators for FastTrack

  • Implement a custom pool allocator for global allocations in fasttrack.
  • Implement a thread local allocator for thread local allocations (thread data, shadow stacks)

Improve usability of DRace

During workshops, I discovered that it is quite hard for first-time users of DRace to get the tool up and running. This includes the following problem areas:

  • Paths: DynamoRIO, DRace, config file, application
  • Parameters: difference between parameters for DynamoRIO, DRace and the application
  • Workflow: To get an HTML report, the report generator python script has to be executed manually

Further pitfalls:

  • Naming: DRaceGUI.exe is not a GUI itself, but converts the XML report to an HTML version
  • Dependencies: Our prebuilt version should not depend on Python. Some early tests with PyInstaller look promising. We should auto-build this in CI as well and include it in the bundle.

Possible solution

  • Config Path: Most people do not need to change the config. Hence, load it from the drace-client binary directory if it is not found relative to the current directory.
  • Naming: Just rename the report generator to e.g. ReportConverter, also provide hints on the CLI if the tool is not correctly used (e.g. no or wrong parameters, or '-h')
  • Dependencies: Build exe in CI and include in bundle (either along with python script, or just the exe).
  • Workflow & Paths: Implement Qt5 based GUI where all paths and options can be specified. Then, the full command can be displayed and we could also directly start the application from the GUI.
