
syzygy's Introduction

Syzygy Transformation Toolchain
===============================

The Syzygy project consists of a suite of tools for the instrumentation of
COFF object files and PE binaries. The various instrumentation modes allow for
computing code coverage results, profiling source code, applying profile-guided
basic block optimizations, block (function) level profile-guided reordering, and
finding memory errors.

For detailed instructions on each of the tools refer to syzygy/build/README.txt
(also included in binary releases), or refer to the output of the tool itself
when run with '--help'.


REDISTRIBUTION
--------------

Any of the binaries included in this distribution may be freely redistributed
as long as LICENSE.txt is included in the distribution.


LICENSING
---------

The Syzygy project is licensed under the Apache License Version 2.0. You can
find a copy of this in LICENSE.txt.

syzygy's People

Contributors

0vercl0k, aarongable, apchhee, banescusebi, bergeret, chhamilton, cseri, erikwright, fdoray, georgesak, joisig, loskutov, madecoste, manzagop, michalopler, mukel, nikolajanevski, plmonette, plmonette-zz, robertshield, rogerm, rubensf, samuelhuang, sebmarchand, sigurasg, slightlyoff

syzygy's Issues

Error handling code should not be optimized

This will allow us full fidelity when debugging things that have gone wrong during the error handling code path of the RTL. In some cases using base::Alias might be sufficient to ensure visibility of important data.

Unittests repeat when failed

We currently use the unittest fixture in base/test/run_all_unittests.cc. By default this fixture will repeat failed tests a finite number of times, and consider them as passed if they succeed at any point. This is mainly an attempt to deal with flakiness in Chrome and has the potential to hide real issues in our unittests.

It would be worth hoisting a new unittest runner in syzygy/testing that disables this feature (see syzygy/testing/run_all_unittests_with_large_timeout for an example), and migrate all tests to use it.

Refactor stream reading and writing.

There are many places in the code where we coerce objects into streams of bytes and vice versa. We've rewritten this code several times, and it would be great to introduce a pair of utility classes for doing this nicely, once and for all, refactoring everything else to use them.
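
As a rough illustration of what such a pair of utility classes could look like, here is a minimal sketch. The names ByteWriter and ByteReader are hypothetical and do not exist in the Syzygy tree; the sketch only handles trivially copyable types and ignores endianness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper: appends trivially copyable objects to a byte buffer.
class ByteWriter {
 public:
  template <typename T>
  void Write(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be written as raw bytes");
    const uint8_t* begin = reinterpret_cast<const uint8_t*>(&value);
    bytes_.insert(bytes_.end(), begin, begin + sizeof(T));
  }

  const std::vector<uint8_t>& bytes() const { return bytes_; }

 private:
  std::vector<uint8_t> bytes_;
};

// Hypothetical helper: reads objects back out of a byte range, refusing to
// read past the end of the data.
class ByteReader {
 public:
  ByteReader(const uint8_t* data, size_t size)
      : data_(data), size_(size), pos_(0) {
    assert(data != nullptr || size == 0);
  }

  // Returns false (and leaves |out| untouched) if fewer than sizeof(T)
  // bytes remain.
  template <typename T>
  bool Read(T* out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be read as raw bytes");
    if (size_ - pos_ < sizeof(T))
      return false;
    std::memcpy(out, data_ + pos_, sizeof(T));
    pos_ += sizeof(T);
    return true;
  }

  size_t remaining() const { return size_ - pos_; }

 private:
  const uint8_t* data_;
  size_t size_;
  size_t pos_;
};
```

Keeping the bounds check inside Read means call sites cannot forget it, which is the main argument for centralizing this code.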

Invert how relative stack IDs are calculated and stored.

Relative stack IDs are expensive to compute, but allow a stack ID to be computed that is persistent across runs of an instrumented binary (abstracts away base addresses).

Currently, we compute the relative stack ID and explicitly store it in the registry. During the next run we need to explicitly calculate relative stack IDs for each stack in order to see if this matches.

A better way would be to store sufficient information in the registry to directly calculate a corresponding absolute stack ID using current module load addresses. Since modules can load/unload at runtime, and some modules used in some stacks may not always be in memory, this will require a persistent module load observer mechanism that updates an in-memory set of stack IDs that can be matched against.
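
The inversion described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the ModuleRange and StoredFrame types, the function names, and the FNV-1a hash are stand-ins, not the actual Syzygy data structures or hashing scheme.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of a loaded module occupying [base, base + size).
struct ModuleRange {
  uint64_t base;
  uint64_t size;
  uint32_t module_key;  // Stable across runs, e.g. a hash of the module name.
};

// Hypothetical registry entry: a frame expressed relative to its module.
struct StoredFrame {
  uint32_t module_key;
  uint64_t offset;
};

// FNV-1a over the 8 bytes of |value|; a stand-in for the real hash.
inline uint32_t Mix(uint32_t hash, uint64_t value) {
  for (int i = 0; i < 8; ++i) {
    hash ^= static_cast<uint32_t>((value >> (8 * i)) & 0xff);
    hash *= 16777619u;
  }
  return hash;
}

// Cheap ID computed directly from absolute return addresses.
uint32_t AbsoluteStackId(const std::vector<uint64_t>& frames) {
  uint32_t hash = 2166136261u;
  for (uint64_t frame : frames)
    hash = Mix(hash, frame);
  return hash;
}

// Done once, when a stack is persisted. Frames that fall outside every known
// module are dropped (an assumption of this sketch).
std::vector<StoredFrame> ToStoredFrames(
    const std::vector<uint64_t>& frames,
    const std::vector<ModuleRange>& modules) {
  std::vector<StoredFrame> stored;
  for (uint64_t frame : frames) {
    for (const ModuleRange& m : modules) {
      if (frame >= m.base && frame - m.base < m.size) {
        stored.push_back(StoredFrame{m.module_key, frame - m.base});
        break;
      }
    }
  }
  return stored;
}

// Done when the set of loaded modules changes: rebuilds the absolute ID for
// the current load addresses. Returns false if a needed module is not loaded.
bool AbsoluteIdFromStored(const std::vector<StoredFrame>& stored,
                          const std::vector<ModuleRange>& modules,
                          uint32_t* id) {
  assert(id != nullptr);
  uint32_t hash = 2166136261u;
  for (const StoredFrame& s : stored) {
    const ModuleRange* found = nullptr;
    for (const ModuleRange& m : modules) {
      if (m.module_key == s.module_key) {
        found = &m;
        break;
      }
    }
    if (found == nullptr)
      return false;
    hash = Mix(hash, found->base + s.offset);
  }
  *id = hash;
  return true;
}
```

With this shape, the expensive module lookup happens only when modules load or unload, and every stack captured at runtime can be matched using the cheap absolute ID alone.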

Symbolization issue in agent_logger

agent_logger.exe is failing to symbolize the stack traces.

Symbolization is done in AgentLogger::AppendTrace. This function does the following:

  1. It initializes the symbol handler via SymInitialize. This works (or at least returns true).
  2. It appends the path of the running process's PDB to the symbol search path. This also works (we successfully augment the symbol search path with the path containing the PDBs that we need).
  3. It calls SymFromAddr for each line of the stack trace. This currently fails; GetLastError returns code 126 ("The specified module could not be found").

There are several possible culprits:

  • The version of dbghelp that we're using isn't the right one.
  • msdiaXXX.dll isn't registered (but SymInitialize would probably fail if that were the case?)
  • We're doing something silly in the AppendTrace function and we've been lucky, until today.

I suspect that it has started to fail with the switch to VS2015, or to the Win10 SDK.

Free ticks is inaccurate

It looks like the free_ticks value in reported errors is inaccurate, as it's computed from the current wall-clock time when the block is visited. This leads to misleading information and to skew between multiple reported blocks.
This needs to grab the current ticks ASAP at exception time and report everything relative to that time.

asan_block_info->milliseconds_since_free =
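
A minimal sketch of the suggested fix: capture the tick count once at exception time and compute every block's value relative to that snapshot. The function name is hypothetical, and the sketch assumes 32-bit millisecond tick counters that may wrap around.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper. |error_ticks| is the single tick count captured as
// early as possible when the exception is handled; |free_ticks| is the value
// recorded in the block when it was freed. Both are assumed to be 32-bit
// millisecond counters, so unsigned subtraction handles wraparound.
inline uint32_t MillisecondsSinceFree(uint32_t error_ticks,
                                      uint32_t free_ticks) {
  return error_ticks - free_ticks;
}
```

Because every reported block uses the same error_ticks value, two blocks freed at the same time report the same age no matter when each one is visited during error processing.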

Use Ninja by default.

GYP still defaults to using "GYP_GENERATORS=msvs" unless otherwise specified. It should default to using "GYP_GENERATORS=ninja".

(This is likely best enforced in our custom wrapper script syzygy/build/gyp_main.py)

Write a tool to manipulate MSF files.

It'd be useful to have a small tool to manipulate the streams of any MSF file (PDB, PGD...). This tool should offer the following options:

  • List all the streams in an MSF file.
  • Extract/explode all the streams from an MSF file into a given directory.
  • Remove a stream from an MSF file.
  • Add a stream to an MSF file.
  • Replace one of the streams contained in an MSF file with another one (from another file).

Consistent shutdown crash on recent canaries.

For some reason it seems the CloseHandle hook has taken to crashing on shutdown. This happens inside the loader's SEH, and generates a WER-initiated crash dialog ("Google Chrome has stopped working"). We get no crashes for this on crash.
See http://g/syzygy-team/h1LRgf25uFI for a discussion of precisely what's happening. Summary is that the shadow has been deallocated, and the memory access checking stubs have been un-patched.
Through patching, Chrome.dll is managing to get execution post-uninitialization.

It's probably both safe and sufficient to switch to NOP mode, by reaching back into client's IAT at DLL_PROCESS_DETACH time, and reverting them over to the NOP stubs.

Document a VS toolchain switch testing process.

The recent VS2015 toolchain switch led to us shipping broken binaries to canary users for 3.5 days. While early testing against the VS2015 toolchain occurred, one build configuration was missed which caused breaking code generation changes. Having a documented process in place should minimize the likelihood of this occurring in the future. It will also preserve the institutional knowledge which is at risk of rotting due to the relative rarity of this task.

Source indexing is broken

Less than 10% of the latest syzyasan_rtl.dll build has been source indexed. This is probably caused by the switch to using gitdeps.py.

ReactOS binaries can't be instrumented

From @GoogleCodeExporter on March 19, 2015 16:27

What steps will reproduce the problem?

1. Compile any ReactOS module.
2. Try to ASAN instrument it by following the instructions in the wiki.

What is the expected output? What do you see instead?

The instrumenter outputs errors about COFF, the SEH table, etc.

Trying with /SAFESEH fixes the SEH table issue, but the COFF errors remain, 
leading to no output file(s). Please note that we don't use /SAFESEH, we just 
tried it to test further.

What version of the product are you using? On what operating system?

We used the binaries provided in your repo.

Please provide any additional information below.

We're trying the ASAN instrumentation on ReactOS binaries as we aim for binary 
compatibility with Windows.

Original issue reported on code.google.com by [email protected] on 11 Sep 2014 at 11:02

Copied from original issue: sebmarchand/syzygy#1

Report previous block details on heap underflow

In the last SyzyASAN canary, there are 50 heap underflows reported. From an initial look at this, it's possible that what's really happening is overflow from the preceding block, but we don't seem to report on that block.
It would likely be helpful to report the immediately preceding block at least?

Official win-asan build failing

ninja -t msvc -e environment.x86 -- "C:\b\depot_tools\win_toolchain\vs_files\95ddda401ec5678f15eeed01d2bee08fcbc5ee97\VC\bin\amd64_x86/cl.exe" /nologo /showIncludes /FC @obj/chrome/chrome_initial/kasko_client.obj.rsp /c ../../chrome/app/kasko_client.cc /Foobj/chrome/chrome_initial/kasko_client.obj /Fd"obj/chrome/chrome_initial_cc.pdb"
c:\b\build\slave\win-asan\build\src\chrome\app\kasko_client.h(13): fatal error C1083: Cannot open include file: 'syzygy/kasko/api/minidump_type.h': No such file or directory

Use the new Kasko memory_range functionality.

SyzyASAN crashes currently explicitly store block and shadow memory contents in the protobuf. Kasko has recently added functionality that lets memory ranges be added directly to the minidump (and hence be visible in the crash report). This mechanism is preferable.

CHECK fail because of an unaligned CachePage pointer

(from http://crbug.com/524277 , minidump available at https://drive.google.com/a/google.com/file/d/0B0_BQhEiKmSYNkItbDBfU2xIT3M/view?usp=sharing)

It looks like the following CHECK is failing in ~StackCaptureCache because the |page| pointer isn't page aligned (its value is 0x0ebc2020)


CHECK_EQ(TRUE, ::VirtualFree(page, 0, MEM_RELEASE));

Here's the destructor's code:


StackCaptureCache::~StackCaptureCache() {
  // Clean up the linked list of cache pages.
  while (current_page_ != nullptr) {
    CachePage* page = current_page_;
    current_page_ = page->next_page_;
    page->next_page_ = nullptr;

    memory_notifier_->NotifyReturnedToOS(page, sizeof(*page));
    page->~CachePage();
    CHECK_EQ(TRUE, ::VirtualFree(page, 0, MEM_RELEASE));
  }
}

And the value of |this|:

   =0f330000 kCachePageSize   : 0x905a4d
   =0f330000 kKnownStacksSharding : 0x905a4d
   =1f3987d8 compression_reporting_period_ : 0
   +0x000 logger_          : 0x00623538 agent::asan::AsanLogger
   +0x004 memory_notifier_ : 0x0062bb28 agent::asan::MemoryNotifierInterface
   +0x008 known_stacks_locks_ : [16] base::Lock
   +0x188 max_num_frames_  : 0x3e
   +0x18c known_stacks_    : [16] std::unordered_set >
   +0x38c current_page_lock_ : base::Lock
   +0x3a4 current_page_    : 0x0eab9020 agent::asan::StackCaptureCache::CachePage
   +0x3a8 stats_lock_      : base::Lock
   +0x3c0 statistics_      : agent::asan::StackCaptureCache::Statistics
   +0x400 reclaimed_locks_ : [63] base::Lock
   +0x9e8 reclaimed_       : [63] (null) 

Here the value of |current_page_| (which was |page->next_page_|) is also not page aligned (0x0eab9020); the offset within the page is the same (0x20).
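
The alignment arithmetic can be checked with a few lines of code. This is only a sketch: it assumes a 4 KiB page size, and the helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumed page size (4 KiB).
constexpr uint64_t kPageSize = 0x1000;

// Offset of |address| within its page.
constexpr uint64_t OffsetInPage(uint64_t address) {
  return address & (kPageSize - 1);
}

// True if |address| sits exactly on a page boundary.
constexpr bool IsPageAligned(uint64_t address) {
  return OffsetInPage(address) == 0;
}

// Both pointers from the report have the same nonzero offset in their page.
static_assert(OffsetInPage(0x0ebc2020) == 0x20, "page pointer offset");
static_assert(OffsetInPage(0x0eab9020) == 0x20, "current_page_ offset");
static_assert(!IsPageAligned(0x0ebc2020), "not page aligned");
```

VirtualFree with MEM_RELEASE expects the base address originally returned by VirtualAlloc, so a pointer that is 0x20 bytes into a page cannot be such a base address.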

Here's the full crash stack:

00. (Inline) -------- syzyasan_rtl!base::debug::BreakDebugger+0xd [e:\b\build\slave\syzygy_official\build\src\base\debug\debugger_win.cc @ 21]
01. 0045f4d0 0f335dc1 syzyasan_rtl!agent::asan::StackCaptureCache::~StackCaptureCache+0x65 [e:\b\build\slave\syzygy_official\build\src\syzygy\agent\asan\stack_capture_cache.cc @ 145]
02 0045f4e8 0f331f74 syzyasan_rtl!agent::asan::AsanRuntime::TearDown+0x81 [e:\b\build\slave\syzygy_official\build\src\syzygy\agent\asan\runtime.cc @ 556]
03 (Inline) -------- syzyasan_rtl!agent::asan::TearDownAsanRuntime+0x29 [e:\b\build\slave\syzygy_official\build\src\syzygy\agent\asan\runtime_util.cc @ 219]
04 0045f514 0f3679be syzyasan_rtl!DllMain+0x104 [e:\b\build\slave\syzygy_official\build\src\syzygy\agent\asan\syzyasan_rtl.cc @ 86]
05 0045f554 0f367945 syzyasan_rtl!__DllMainCRTStartup+0x72 [f:\dd\vctools\crt\crtw32\startup\dllcrt0.c @ 377]
06 0045f568 779a97de syzyasan_rtl!_DllMainCRTStartup+0x1c [f:\dd\vctools\crt\crtw32\startup\dllcrt0.c @ 340]
07 0045f588 779a9758 ntdll!LdrxCallInitRoutine+0x16
08 0045f5d8 779ccd68 ntdll!LdrpCallInitRoutine+0x43
09 0045f678 779cce11 ntdll!LdrShutdownProcess+0x101
0a 0045f744 756c9862 ntdll!RtlExitUserProcess+0x81
0b 0045f758 0094adfd kernel32!ExitProcessImplementation+0x12
0c 0045f764 0094b08b chrome!__crtExitProcess+0x15 [f:\dd\vctools\crt\crtw32\startup\crt0dat.c @ 774]
0d 0045f7ac 0094b0b0 chrome!doexit+0x119 [f:\dd\vctools\crt\crtw32\startup\crt0dat.c @ 678]
0e 0045f7c0 0094dbf9 chrome!exit+0xf [f:\dd\vctools\crt\crtw32\startup\crt0dat.c @ 417]
0f 0045f800 756b7c04 chrome!__tmainCRTStartup+0x10c [f:\dd\vctools\crt\crtw32\startup\crt0.c @ 264]
10 0045f814 779bad1f kernel32!BaseThreadInitThunk+0x24
11 0045f85c 779bacea ntdll!__RtlUserThreadStart+0x2f
12 0045f86c 00000000 ntdll!_RtlUserThreadStart+0x1b

ClusterFuzz tripping DCHECKs in SyzyASAN RTL

Investigating http://crbug.com/627455, I built syzyasan_rtl.dll in debug and dropped it in. It's not easy to debug what's going on, but by running chrome with --no-sandbox and dropping a MessageBox into

LONG WINAPI
AsanRuntime::UnhandledExceptionFilter(struct _EXCEPTION_POINTERS* exception) {
  ::MessageBoxA(NULL, "", "", MB_OK);  // Added for debugging.
  return ExceptionFilterImpl(true, exception);
}

I can unwind the stack to see what's tripping.

Long story short, I see multiple threads tripping over this DCHECK:

BlockHeapInterface* BlockHeapManager::GetHeapFromId(HeapId heap_id) {
  DCHECK_NE(reinterpret_cast<HeapId>(nullptr), heap_id);
  HeapQuarantinePair* hq = reinterpret_cast<HeapQuarantinePair*>(heap_id);
  DCHECK_NE(static_cast<BlockHeapInterface*>(nullptr), hq->first);
  return hq->first;
}

One level up I have the block_info the heap_id comes from:
Local var @ 0x65ada88 Type agent::asan::BlockInfo*

0x065adf14
+0x000 block_size : 0x288bfe70
+0x004 header : 0x7fff0020 agent::asan::BlockHeader
+0x000 magic : 0y1100101010000000 (0xca80)
+0x000 checksum : 0y0010110101110 (0x5ae)
+0x000 is_nested : 0y0
+0x000 has_header_padding : 0y0
+0x000 has_excess_trailer_padding : 0y0
+0x004 state : 0y00
+0x004 body_size : 0y101000100010111111111001001100 (0x288bfe4c)
+0x008 alloc_stack : 0x09b1a24c agent::common::StackCapture
+0x000 VFN_table : 0x00fe655c
=00fe34f4 agent::common::StackCapture::kMaxNumFrames : 0x3e
=00ffdcf4 agent::common::StackCapture::kMaxRefCount : 0xffff
=0106bc88 agent::common::StackCapture::bottom_frames_to_skip_ : 0
+0x004 absolute_stack_id_ : 0x206af057
+0x008 relative_stack_id_ : 0xba916fac
+0x00c num_frames_ : 0x23 '#'
+0x00d max_num_frames_ : 0x23 '#'
+0x00e ref_count_ : 1
+0x010 frames_ : [62] 0x00c1ae49 Void
+0x00c free_stack : (null)
+0x008 header_padding : 0x7fff0030 agent::asan::BlockHeaderPadding
+0x00c header_padding_size : 0
+0x010 body : 0x7fff0030 agent::asan::BlockBody
+0x014 body_size : 0x288bfe4c
+0x018 trailer_padding : 0xa88afe7c agent::asan::BlockTrailerPadding
+0x01c trailer_padding_size : 0
+0x020 trailer : 0xa88afe7c agent::asan::BlockTrailer
+0x000 alloc_tid : 0
+0x004 free_tid : 0
+0x008 alloc_ticks : 0
+0x00c free_ticks : 0
+0x010 heap_id : 0
+0x024 block_pages : 0x7fff1000 ""
+0x028 block_pages_size : 0x288be000
+0x02c left_redzone_pages : (null)
+0x030 left_redzone_pages_size : 0
+0x034 right_redzone_pages : (null)
+0x038 right_redzone_pages_size : 0
+0x03c is_nested : 0

Looks like the trailer is either not initialized, or has been overwritten.

-- example stack --
0:007> kv
*** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr Args to Child
065ad1b4 00c9bbd3 065ad8dc 065adf6c 00000001 syzyasan_rtl!base::debug::BreakDebugger+0x16 (FPO: [Non-Fpo]) (CONV: cdecl) [c:\src\syzygy\src\base\debug\debugger_win.cc @ 21]
065ad73c 00c20a3f 065ada80 cccccccc cccccccc syzyasan_rtl!logging::LogMessage::~LogMessage+0x2c3 (FPO: [Non-Fpo]) (CONV: thiscall) [c:\src\syzygy\src\base\logging.cc @ 742]
065ad8dc 00c1fa21 00000000 065adc1c 065adf6c syzyasan_rtl!agent::asan::heap_managers::BlockHeapManager::GetHeapFromId+0xaf (FPO: [Non-Fpo]) (CONV: cdecl) [c:\src\syzygy\src\syzygy\agent\asan\heap_managers\block_heap_manager.cc @ 556]
065ada80 00c1f209 065adf14 065adf5c 00000000 syzyasan_rtl!agent::asan::heap_managers::BlockHeapManager::FreePristineBlock+0x181 (FPO: [Non-Fpo]) (CONV: thiscall) [c:\src\syzygy\src\syzygy\agent\asan\heap_managers\block_heap_manager.cc @ 802]
065adc1c 00c1ead7 065adf14 065ae03c 065adf6c syzyasan_rtl!agent::asan::heap_managers::BlockHeapManager::FreeCorruptBlock+0x189 (FPO: [Non-Fpo]) (CONV: thiscall) [c:\src\syzygy\src\syzygy\agent\asan\heap_managers\block_heap_manager.cc @ 797]
065adf5c 00c37a33 007b8938 7fff0030 065ae118 syzyasan_rtl!agent::asan::heap_managers::BlockHeapManager::Free+0x267 (FPO: [Non-Fpo]) (CONV: thiscall) [c:\src\syzygy\src\syzygy\agent\asan\heap_managers\block_heap_manager.cc @ 274]
065ae03c 00c41458 007b8938 00000000 7fff0030 syzyasan_rtl!agent::asan::WindowsHeapAdapter::HeapFree+0xd3 (FPO: [Non-Fpo]) (CONV: cdecl) [c:\src\syzygy\src\syzygy\agent\asan\windows_heap_adapter.cc @ 105]
065ae118 11bf9c40 007b8938 00000000 7fff0030 syzyasan_rtl!asan_HeapFree+0x108 (FPO: [Non-Fpo]) (CONV: stdcall) [c:\src\syzygy\src\syzygy\agent\asan\rtl_impl.cc @ 124]
065ae12c 0fcea1f4 7fff0030 065ae14c 0fcec6b7 chrome_child!_free_base+0x1c (FPO: [Non-Fpo]) (CONV: cdecl) [d:\th\minkernel\crts\ucrt\src\appcrt\heap\free_base.cpp @ 107]
065ae138 0fcec6b7 7fff0030 00000000 077748cc chrome_child!sk_free_releaseproc+0xb (FPO: [Non-Fpo]) (CONV: cdecl) [c:\b\build\slave\win_syzyasan_lkgr\build\src\third_party\skia\src\core\skdata.cpp @ 95]
065ae14c 1127540f 00000001 0fcc4291 0778fac0 chrome_child!SkMallocPixelRef::`scalar deleting destructor'+0x36 (FPO: [Non-Fpo]) (CONV: thiscall)
065ae154 0fcc4291 0778fac0 077748b0 0fcd0e54 chrome_child!ui::AXNode::Destroy+0xa (FPO: [0,0,0]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\ui\accessibility\ax_node.cc @ 35]
065ae160 0fcd0e54 0778dd18 065ae180 1127540f chrome_child!SkBitmap::~SkBitmap+0x2d (FPO: [0,0,0]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\third_party\skia\src\core\skbitmap.cpp @ 46]
065ae16c 1127540f 00000001 0fcd0e05 0778dd18 chrome_child!SkNoPixelsBitmapDevice::`scalar deleting destructor'+0xe (FPO: [Non-Fpo]) (CONV: thiscall)
065ae174 0fcd0e05 0778dd18 065ae1a4 0fcd5dfa chrome_child!ui::AXNode::Destroy+0xa (FPO: [0,0,0]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\ui\accessibility\ax_node.cc @ 35]
065ae180 0fcd5dfa 00000001 0000000a 0778fac0 chrome_child!DeviceCM::`scalar deleting destructor'+0x29 (FPO: [Non-Fpo]) (CONV: thiscall)
065ae190 0fce07fe 076336e8 0fe626df 00000000 chrome_child!SkCanvas::internalRestore+0x92 (FPO: [0,0,0]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\third_party\skia\src\core\skcanvas.cpp @ 1348]
065ae198 0fe626df 00000000 065ae220 0fe634be chrome_child!SkCanvas::restore+0x2f (FPO: [0,0,4]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\third_party\skia\src\core\skcanvas.cpp @ 1040]
065ae1a4 0fe634be 065ae1e4 0778fac0 075dd2a0 chrome_child!SkRecord::Record::visit<SkRecords::Draw &>+0x2d (FPO: [Non-Fpo]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\third_party\skia\src\core\skrecord.h @ 170]
065ae220 0fe5d8a3 0769b540 0778fac0 00000000 chrome_child!SkRecordDraw+0xeb (FPO: [Non-Fpo]) (CONV: cdecl) [c:\b\build\slave\win_syzyasan_lkgr\build\src\third_party\skia\src\core\skrecorddraw.cpp @ 36]
065ae274 0fcdc605 0778fac0 00000000 075dd2a0 chrome_child!SkBigPicture::playback+0xb5 (FPO: [Non-Fpo]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\third_party\skia\src\core\skbigpicture.cpp @ 44]
065ae2ac 0fcd3471 075dd2a0 00000000 00000000 chrome_child!SkCanvas::onDrawPicture+0xcb (FPO: [Non-Fpo]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\third_party\skia\src\core\skcanvas.cpp @ 2972]
065ae308 11e23c0f 075dd2a0 00000000 00000000 chrome_child!SkCanvas::drawPicture+0x107 (FPO: [Non-Fpo]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\third_party\skia\src\core\skcanvas.cpp @ 2944]
065ae328 11f09f13 0778fac0 00000000 065ae35c chrome_child!cc::DisplayItemList::Raster+0xe0 (FPO: [Non-Fpo]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\cc\playback\display_item_list.cc @ 144]
065ae398 11f098a3 0778fac0 00000000 0746a054 chrome_child!cc::RasterSource::RasterCommon+0x10f (FPO: [Non-Fpo]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\cc\playback\raster_source.cc @ 204]
065af500 11f24e4b 0778fac0 0746a054 065af61c chrome_child!cc::RasterSource::PlaybackToCanvas+0x17c (FPO: [Non-Fpo]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\cc\playback\raster_source.cc @ 98]
065af5e4 11ee0ba0 05900000 00000002 0772f850 chrome_child!cc::RasterBufferProvider::PlaybackToMemory+0x159 (FPO: [Non-Fpo]) (CONV: cdecl) [c:\b\build\slave\win_syzyasan_lkgr\build\src\cc\raster\raster_buffer_provider.cc @ 84]
065af634 11ee0950 0772f850 0760339c 07694070 chrome_child!cc::OneCopyRasterBufferProvider::PlaybackToStagingBuffer+0xf2 (FPO: [Non-Fpo]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\cc\raster\one_copy_raster_buffer_provider.cc @ 228]
065af678 11ee08c6 0772f850 07780ca0 07780d60 chrome_child!cc::OneCopyRasterBufferProvider::PlaybackAndCopyOnWorkerThread+0x54 (FPO: [Non-Fpo]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\cc\raster\one_copy_raster_buffer_provider.cc @ 176]
065af6dc 11eb41d0 07694070 0746a054 0746a064 chrome_child!cc::OneCopyRasterBufferProvider::RasterBufferImpl::Playback+0xd7 (FPO: [Non-Fpo]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\cc\raster\one_copy_raster_buffer_provider.cc @ 62]
065af738 1145c3f4 03fc3918 05b7555c 05b7555a chrome_child!cc::`anonymous namespace'::RasterTaskImpl::RunOnWorkerThread+0xd6 (FPO: [Non-Fpo]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\cc\tiles\tile_manager.cc @ 94]
065af788 1145c299 00000001 05b91844 05b91828 chrome_child!content::CategorizedWorkerPool::RunTaskInCategoryWithLockAcquired+0xc4 (FPO: [Non-Fpo]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\content\renderer\categorized_worker_pool.cc @ 363]
065af7a8 1145c2d2 05b91878 03fc3980 0fb37a1e chrome_child!content::CategorizedWorkerPool::Run+0x16e (FPO: [Non-Fpo]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\content\renderer\categorized_worker_pool.cc @ 232]
065af7b4 0fb37a1e 05b79f38 76691430 30323436 chrome_child!content::`anonymous namespace'::CategorizedWorkerPoolThread::Run+0xf (FPO: [0,0,0]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\content\renderer\categorized_worker_pool.cc @ 35]
065af7e0 0fb2a30b 00000000 00000000 05b79f38 chrome_child!base::SimpleThread::ThreadMain+0x72 (FPO: [Non-Fpo]) (CONV: thiscall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\base\threading\simple_thread.cc @ 76]
065af7fc 7669338a 000002c8 065af848 77349902 chrome_child!base::`anonymous namespace'::ThreadFunc+0x82 (FPO: [Non-Fpo]) (CONV: stdcall) [c:\b\build\slave\win_syzyasan_lkgr\build\src\base\threading\platform_thread_win.cc @ 86]
065af808 77349902 05b79f38 eb0104d5 00000000 kernel32!BaseThreadInitThunk+0xe (FPO: [Non-Fpo])
065af848 773498d5 0fb2a261 05b79f38 00000000 ntdll!__RtlUserThreadStart+0x70 (FPO: [Non-Fpo])
065af860 00000000 0fb2a261 05b79f38 00000000 ntdll!_RtlUserThreadStart+0x1b (FPO: [Non-Fpo])

The minidump symbolizer should only report the useful information

The information generated by the minidump symbolizer is confusing to a user who doesn't use SyzyAsan frequently.

Here's an example of what gets reported for a heap buffer overflow:

Bad access information:
feature_set: ASAN_FEATURE_ENABLE_LARGE_BLOCK_HEAP # The user doesn't really care about this
corrupt_ranges_reported: 0 # We should maybe report that we haven't detected any corruption on the heap.
asan_parameters: common::AsanParameters # This is a reporting issue and shouldn't be reported at all !
user_size: 0y000000000000000000000000100000 (0x20) # We should convert it to decimal, and we should probably rename this to block_size in the report ?
milliseconds_since_free: 0 # We shouldn't report this if the block hasn't been freed !
heap_type: kWinHeap # Again, this is useful information for us, but not for the user.
corrupt_range_count: 0 # Same comment as for |corrupt_ranges_reported|
access_size: 4
free_tid: 0 # This is useless
free_stack_size: 0 '' # Ditto
corrupt_ranges: (null) # Ditto
heap_is_corrupt: 0 # This could probably replace the corrupt_range stuffs.
alloc_stack: [62] 0x0f32f384 Void # This is extra information used while debugging (but at this point the user has already opened the minidump in Windbg, so he has access to this info)
shadow_memory: [512] " 0x0581c6c0: 00 00 e0 fa 00 00 00 00. 0x0581c700: fb fb f4 00 00 00 00 00. 0x0581c740: 00 00 00 00 00 00 e0 fa. 0x0581c780: 00 00 00 00 fb fb f4 00.=>0x0581c7c0: e0 fa 00 00 00 00[fb]fb. 0x0581c800: f4 00 00 00 00 00 00 00. 0x0581c840: 00 00 00 00 e0 fa 00 00. 0x0581c880: 00 00 fb fb f4 00 e0 fa. 0x0581c8c0: 00 00 00 00 fb fb f4 00." # Ditto
alloc_stack_size: 0x14 '' # Ditto
header: 0x0581c7c0 Void # Ditto
shadow_info: [128] "0581C7F3 is 3 bytes beyond 32-byte block [0581C7D0,0581C7F0)." # Ditto (I think)
error_type: HEAP_BUFFER_OVERFLOW
free_stack: 62 # Useless
analysis: agent::asan::BlockAnalysisResult # This should be removed , we should just check if the block is corrupt or not.
alloc_tid: 0x1b94 # Not useful unless it's a use after free, and even then we should probably just check if alloc_tid != free_tid
state: 0y00 # Could be useful if it had been symbolized
location: 0x0581c7f3 Void # The 'Void' part is useless
context: _CONTEXT # Should be removed
crash_stack_id: 0xb8def5e5 # Ditto
access_mode: ASAN_READ_ACCESS
block_info: agent::asan::AsanBlockInfo # This should be removed
corrupt_block_count: 0 # Again ?

A better report might look like this (reordered):
error_type: HEAP_BUFFER_OVERFLOW
location: 0x0581c7f3
access_mode: ASAN_READ_ACCESS
access_size: 4
block_size: 32
heap_is_corrupt: 0
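
To make the proposal concrete, here is a small sketch of a filter that turns the raw fields into the shorter report. It is not the symbolizer's actual code: the function names are hypothetical, and it assumes the raw fields arrive as a map from field name to the string shown in the dump.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

// Extracts the hex value from a WinDbg-style string such as
// "0y0100 (0x4)". Returns false if no "(0x...)" part is present.
bool ParseParenthesizedHex(const std::string& text, uint64_t* value) {
  assert(value != nullptr);
  const size_t open = text.find("(0x");
  if (open == std::string::npos)
    return false;
  const size_t close = text.find(')', open);
  if (close == std::string::npos)
    return false;
  const std::string digits = text.substr(open + 3, close - open - 3);
  if (digits.empty())
    return false;
  char* end = nullptr;
  *value = std::strtoull(digits.c_str(), &end, 16);
  return *end == '\0';
}

// Hypothetical filter: keeps only the user-facing fields, in a fixed order,
// renaming user_size to block_size (in decimal) and dropping " Void".
std::vector<std::string> BuildUserReport(
    const std::map<std::string, std::string>& raw) {
  static const char* const kKeys[] = {"error_type",  "location",
                                      "access_mode", "access_size",
                                      "block_size",  "heap_is_corrupt"};
  const std::string kVoidSuffix = " Void";
  std::vector<std::string> report;
  for (const char* key_cstr : kKeys) {
    const std::string key(key_cstr);
    std::string value;
    if (key == "block_size") {
      auto it = raw.find("user_size");
      uint64_t size = 0;
      if (it == raw.end() || !ParseParenthesizedHex(it->second, &size))
        continue;
      value = std::to_string(size);
    } else {
      auto it = raw.find(key);
      if (it == raw.end())
        continue;
      value = it->second;
      if (key == "location" && value.size() >= kVoidSuffix.size() &&
          value.compare(value.size() - kVoidSuffix.size(),
                        kVoidSuffix.size(), kVoidSuffix) == 0) {
        value.erase(value.size() - kVoidSuffix.size());
      }
    }
    report.push_back(key + ": " + value);
  }
  return report;
}
```

Using an allow-list rather than a deny-list means any field added to the raw dump later stays hidden from users until someone decides it is useful.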

Create event response plan.

Documentation of how to handle a bad build. How and what to modify in branches, how to trigger new builds, how to push them to users, how to stop ongoing releases from continuing to push, etc. Clear documentation of the systems involved, and the key points of contact need to be centrally documented in order to improve response time.

Unaligned property on pointer and array Types

It's not clear what to do about the unaligned property of pointers and arrays. This is probably important for correctly coercing memory to types and/or for heuristic type recovery.

PeAndPdbAreMatched returns false on x64 binaries

While running tools\win\sizeviewer\sizeviewer.py (from Chromium) on chrome.exe I found that syzygy reported that my PDB did not match. Checking with my pdbinfo tool and dumpbin showed that in fact they did match, and in fact they had just been built.

I'm not sure what caused these two files to be reported as mismatched.

The .exe and .pdb are attached to this issue. My pdbinfo tool is available (source and binary) at https://github.com/randomascii/tools.

chrome.exe.zip

SyzyASAN canaries report most crashes through Kasko

Since SyzyASAN is the last registered UEF, it ends up handling most crashes, but not all. Some crashes are still delegated directly to the BP handler, for example those inside a WrappedWindowProc. It turns out SyzyASAN always delegates to Chrome->Kasko, which is kind of weird and doesn't improve reliability. There are also cases where this falls over for reasons unknown, and SyzyASAN/Kasko end up crashing and reporting through BreakPad.

Get rid of the StaticShadow.

We are in the process of transitioning to using a dynamically allocated shadow. This will allow us to completely eliminate this memory overhead when SyzyASAN is disabled. There remain a few places where the static shadow is still used.

From "git grep StaticShadow":

crt_interceptors
memory_interceptors
page_protection_helpers
rtl_utils
runtime

SyzyASAN - ignore NULL pointer accesses?

SyzyASAN seems to be reporting NULL pointer, and near-NULL pointer accesses as ASAN issues. The first 64K of user memory is unmapped by default (convention?), so this is somewhat redundant. Maybe it's better to greenzone the first 64K in shadow and just let these crash.
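
A rough sketch of the greenzoning idea follows. The shadow ratio of 8 bytes per shadow byte and the marker value 0 for "addressable" are assumptions of this sketch, and the shadow is modeled as a plain vector rather than the real Shadow class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Size of the low address range that is unmapped by default (per the issue).
constexpr uint64_t kNullRegionSize = 64 * 1024;
// Assumption: one shadow byte describes 8 bytes of application memory.
constexpr uint64_t kShadowRatio = 8;
// Assumption: a shadow value of 0 means "fully addressable".
constexpr uint8_t kAddressableMarker = 0;

// True if |address| lies in the first 64K of the address space.
constexpr bool IsNearNull(uint64_t address) {
  return address < kNullRegionSize;
}

// Marks the shadow bytes covering the first 64K as addressable, so that the
// instrumentation lets the access through and the hardware fault occurs.
void GreenzoneNullRegion(std::vector<uint8_t>* shadow) {
  assert(shadow != nullptr);
  const size_t count = static_cast<size_t>(kNullRegionSize / kShadowRatio);
  assert(shadow->size() >= count);
  for (size_t i = 0; i < count; ++i)
    (*shadow)[i] = kAddressableMarker;
}
```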

SyzyASAN raises C++ exception on OOM

SyzyASAN raises a C++ exception on OOM. This is not good as there are situations where this won't crash Chrome properly, but will instead implicitly corrupt SyzyASAN's state.
Suggest setting a new_handler that crashes and terminates the process in the agent main.
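
A minimal sketch of such a handler, using only the standard library. The function names are hypothetical, and where exactly it would be installed in the agent is left open.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical handler: terminate immediately instead of letting operator new
// throw std::bad_alloc into code that may not be prepared for it.
[[noreturn]] void CrashOnOutOfMemory() {
  std::abort();
}

// Intended to be called once during agent initialization.
void InstallOomHandler() {
  std::set_new_handler(&CrashOnOutOfMemory);
}
```

Once installed, a failing operator new calls the handler rather than throwing, so the process dies at the point of failure with a meaningful stack.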

Make decomposition work for PGO binaries.

This will be needed once we enable PGO, and will assist with investigations into making PGO deltas smaller.

Breakpad supports PGO, and there is some code that hints at where the additional data lives and looks like. Initially we can use the DIA API directly, but it would also be good to add direct support to pdb_dumper so we can parse the appropriate data ourselves.

https://code.google.com/p/chromium/codesearch#chromium/src/breakpad/src/common/windows/pdb_source_line_writer.cc&sq=package:chromium&l=443

(If I had to guess it looks like there will simply be additional 'Block' symbols as children of the 'Function' symbol, and those blocks will be discontiguous.)

Hoist MSF functionality out of pdb_lib.

The PDB format is actually built on top of the more generic multi-stream format (MSF). There's a desire to reuse this elsewhere, so it would be good to hoist the MSF functionality out of pdb_lib to a new msf_lib target.

Security bug in documentation, and old documentation still in use

The documentation at https://github.com/google/syzygy/wiki/SyzyASanBug lists Chrome's and Microsoft's symbol servers, both times using http. This should be changed to https, to avoid MITM attacks that could exploit parser bugs or source-indexing vulnerabilities.

This code is also archived here and there are still references to this (soon to be obsolete) documentation:

https://code.google.com/archive/p/syzygy/wikis/SyzyASanBug.wiki
