pdfresurrect's Introduction

pdfresurrect
------------
PDFResurrect is a tool aimed at analyzing PDF documents.  The PDF format allows
for previous document changes to be retained in a more recent version of the
document, thereby creating a running history of changes for the document.  This
tool attempts to modify the PDF so that a reading utility will be presented with
the previous versions of the PDF.  The modified "versions" will be generated
as new files leaving the original PDF unmodified.


Notes
-----
The scrubbing feature (-s) should not be trusted for any serious security
uses.  After using this experimental feature, please verify that it in fact
zeroed all of the objects of concern (those objects that were to be
zeroed).  Currently this feature will likely not produce a working PDF.

This tool relies on the application reading the pdfresurrect extracted versions
to treat the last xref table as the most recent in the document.  This should
typically be the case.

The verbose output, which tries to deduce the PDF object type (e.g. stream,
page), is not always accurate, and the object counts might not be 100%
accurate.  However, this should not prevent the extraction of the versions.
This output is merely to provide a hint for the user as to what might be
different between the documents.

Object counts might appear off in linearized PDF documents.  They are not
actually wrong: each version of the PDF consists of the objects that compose
the linear portion of the PDF plus all of the objects that compose the version
in question.  Suppose there is a linearized PDF with 59 objects in its linear
portion, and suppose the PDF has a second version that consists of 21
objects.  The total number of objects in "version 2" would be 59 + 21, or 80
objects.


Building
--------
From the top-level directory of pdfresurrect run:
    ./configure
    make

To install the resulting binary to (or uninstall it from) a specific path,
pass the '--prefix=' flag to configure:
    ./configure --prefix=/my/desired/path/

Debugging mode can be enabled when configuring by using the following option:
    ./configure --enable-debug

The resulting binary can be placed anywhere; it can also be installed to, or
uninstalled from, the configured path automatically.  If no path was
specified at configure time, the default is /usr/local/bin.
To install/uninstall:
    make install
         or
    make uninstall


Thanks
------
The rest of the 757/757Labs crew.
GNU (www.gnu.org).
All of the contributors: See AUTHORS file.


Contact / Project URL
---------------------
[email protected]
https://github.com/enferex/pdfresurrect

pdfresurrect's People

Contributors

a1346054, enferex, jwilk, rwhitworth, ryandesign, xambroz

pdfresurrect's Issues

Maybe a bug

I fuzzed version 0.21b for research purposes and found that the program crashes because load_kids is invoked recursively many times. After analysing the sample PDF and the backtrace, I found that the xref table had been modified, which causes unbounded recursion until the stack pointer (esp) runs past the end of the stack, as shown below.

[#0] 0x7ffff7a8ebbc → _int_malloc(av=0x7ffff7dd1b20 <main_arena>, bytes=0x201)
[#1] 0x7ffff7a902b0 → _int_realloc(av=0x7ffff7dd1b20 <main_arena>, oldp=0x655890, oldsize=0x110, nb=0x210)
[#2] 0x7ffff7a91839 → __GI___libc_realloc(oldmem=0x6558a0, bytes=0x200)
[#3] 0x404c17 → get_object(fp=0x609010, obj_id=0x4, xref=0x6092c0, size=0x0, is_stream=0x7fffff7ff254)
[#4] 0x404e76 → load_kids(fp=0x609010, pages_id=0x4, xref=0x6092c0)
[#5] 0x404fc5 → load_kids(fp=0x609010, pages_id=0x6, xref=0x6092c0)
[#6] 0x404fc5 → load_kids(fp=0x609010, pages_id=0x6, xref=0x6092c0)
[#7] 0x404fc5 → load_kids(fp=0x609010, pages_id=0x6, xref=0x6092c0)
[#8] 0x404fc5 → load_kids(fp=0x609010, pages_id=0x6, xref=0x6092c0)
[#9] 0x404fc5 → load_kids(fp=0x609010, pages_id=0x6, xref=0x6092c0)

Object 3 has two kids, so load_kids next locates obj 4 and obj 6 according to the xref table.

3 0 obj
<<
/Type /Pages
/Count 2
/Kids [ 4 0 R 6 0 R ] 
>>
endobj

However, the xref table has been modified: atol parses the offset of obj 6 as 152 instead of 1522 as in the normal sample. So when load_kids is invoked for kid 6, it actually jumps back into obj 3, with some of the data before /Kids corrupted.

xref
0 11
0000000000 65535 f
0000000019 00000 n
0000000093 00000 n
0000000147 00000 n // obj 3
0000000222 00000 n
�0000000390 00000 n
000000152² 00000 n // obj 6
0000001690 00000 n
0000002423 00000 n
0800002456 00000 n�
0000002574 00000 n

So it would be better to validate the input before invoking atol, making sure every character is legal, or to use some algorithm to detect the recursion.
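The two suggestions above could be sketched roughly as follows. This is a minimal illustration, not the project's actual fix: parse_xref_offset and visit_once are hypothetical helper names, and the sketch assumes the offset field is exactly ten ASCII digits, as in the plain-text entries shown in the sample.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: parse the 10-digit byte-offset field of a plain-text
 * xref entry.  Returns 0 on success, or -1 if any of the first ten
 * characters is not an ASCII digit (this also catches a premature '\0'). */
int parse_xref_offset(const char *field, long long *out)
{
    assert(field != NULL && out != NULL);

    long long value = 0;
    for (size_t i = 0; i < 10; ++i) {
        unsigned char ch = (unsigned char)field[i];
        if (!isdigit(ch))
            return -1;
        value = value * 10 + (ch - '0');
    }
    *out = value;
    return 0;
}

/* Hypothetical recursion guard: returns 1 the first time an object id is
 * seen, and 0 if it was already visited or is out of range. */
int visit_once(unsigned char *visited, size_t n_objects, size_t obj_id)
{
    assert(visited != NULL);

    if (obj_id >= n_objects || visited[obj_id])
        return 0;
    visited[obj_id] = 1;
    return 1;
}
```

A caller walking /Kids would stop descending as soon as visit_once returns 0, so a corrupted offset that points back at an ancestor cannot recurse forever.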

SEGV: READ memory access

Summary: The target application (pdfresurrect) crashes when given a crafted PDF file.

ASAN

==23203==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7ffff6e481b0 bp 0x000000000000 sp 0x7fffffffd410 T0)
==23203==The signal is caused by a READ memory access.
==23203==Hint: address points to the zero page.
    #0 0x7ffff6e481af  (/lib/x86_64-linux-gnu/libc.so.6+0x451af)
    #1 0x44f58c in __interceptor_strtol (/home/input0/pdfresurrect/pdfresurrect+0x44f58c)
    #2 0x510233 in atoi /usr/include/stdlib.h:363:16
    #3 0x510233 in load_xref_from_plaintext /home/input0/pdfresurrect/pdf.c:697
    #4 0x510233 in load_xref_entries /home/input0/pdfresurrect/pdf.c:638
    #5 0x510233 in pdf_load_xrefs /home/input0/pdfresurrect/pdf.c:294
    #6 0x50c295 in init_pdf /home/input0/pdfresurrect/main.c:206:9
    #7 0x50c295 in main /home/input0/pdfresurrect/main.c:280
    #8 0x7ffff6e24b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
    #9 0x41c8a9 in _start (/home/input0/pdfresurrect/pdfresurrect+0x41c8a9)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/lib/x86_64-linux-gnu/libc.so.6+0x451af) 

GDB BT:

Program received signal SIGSEGV, Segmentation fault.
__GI_____strtol_l_internal (nptr=0x0, endptr=0x7fffffffd418, base=10, group=<optimized out>, loc=0x7ffff71ef560 <_nl_global_locale>)
    at ../stdlib/strtol_l.c:292
292	../stdlib/strtol_l.c: No such file or directory.
(gdb) bt
#0  __GI_____strtol_l_internal (nptr=0x0, endptr=0x7fffffffd418, base=10, group=<optimized out>, loc=0x7ffff71ef560 <_nl_global_locale>)
    at ../stdlib/strtol_l.c:292
#1  0x000000000044f58d in strtol ()
#2  0x0000000000510234 in atoi (__nptr=0xffffffffffffff18 <error: Cannot access memory at address 0xffffffffffffff18>)
    at /usr/include/stdlib.h:363
#3  load_xref_from_plaintext (fp=0x616000000080, xref=<optimized out>) at pdf.c:697
#4  load_xref_entries (xref=<optimized out>, fp=<optimized out>) at pdf.c:638
#5  pdf_load_xrefs (fp=<optimized out>, pdf=<optimized out>) at pdf.c:294
#6  0x000000000050c296 in init_pdf (fp=<optimized out>, name=<optimized out>) at main.c:206
#7  main (argc=<optimized out>, argv=<optimized out>) at main.c:280
(gdb) i r
rax            0xffffffffffffff18	-232
rbx            0x0	0
rcx            0x0	0
rdx            0xa	10
rsi            0x7fffffffd418	140737488344088
rdi            0x0	0
rbp            0x0	0x0
rsp            0x7fffffffd3b0	0x7fffffffd3b0
r8             0x7ffff71ef560	140737339389280
r9             0x0	0
r10            0x0	0
r11            0xa	10
r12            0x0	0
r13            0x7fffffffd418	140737488344088
r14            0x0	0
r15            0x0	0
rip            0x7ffff6e481b0	0x7ffff6e481b0 <__GI_____strtol_l_internal+80>
eflags         0x10283	[ CF SF IF RF ]
cs             0x33	51
ss             0x2b	43
ds             0x0	0
es             0x0	0
fs             0x0	0
gs             0x0	0
(gdb) 

To reproduce: ./pdfresurrect poc
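The traces above show strtol being reached through atoi with nptr=0x0. The report does not say which expression produced the null pointer, so the following is only a sketch of one defensive pattern: a hypothetical wrapper, parse_int_checked, that rejects NULL, non-numeric input and out-of-range values instead of calling atoi directly.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical wrapper around strtol.  Returns 0 and stores the value in
 * *out on success; returns -1 for a NULL string, a string that does not
 * start with a number, or a value that does not fit in an int. */
int parse_int_checked(const char *s, int *out)
{
    assert(out != NULL);          /* programmer error, not input error */

    if (s == NULL)
        return -1;                /* e.g. a failed search returned NULL */

    errno = 0;
    char *end = NULL;
    long v = strtol(s, &end, 10);

    if (end == s)
        return -1;                /* no digits were consumed */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;                /* overflow or underflow */

    *out = (int)v;
    return 0;
}
```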

Return codes for fread are not being evaluated

Hello,
Building with the hardened options reveals that the return values of fread (and fgets) are not being checked.
If a read error occurs, it goes unhandled.

gcc -o pdf.o -c pdf.c -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -mcet -fcf-protection
pdf.c: In function 'get_header':
pdf.c:1243:5: warning: ignoring return value of 'fread', declared with attribute warn_unused_result [-Wunused-result]
fread(header, 1023, 1, fp);
^~~~~~~~~~~~~~~~~~~~~~~~~~
pdf.c: In function 'pdf_load_xrefs':
pdf.c:218:9: warning: ignoring return value of 'fread', declared with attribute warn_unused_result [-Wunused-result]
fread(buf, 1, pos_count, fp);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
pdf.c: In function 'is_valid_xref':
pdf.c:567:5: warning: ignoring return value of 'fgets', declared with attribute warn_unused_result [-Wunused-result]
fgets(buf, 16, fp);
^~~~~~~~~~~~~~~~~~
pdf.c: In function 'get_object_from_here':
pdf.c:1004:5: warning: ignoring return value of 'fread', declared with attribute warn_unused_result [-Wunused-result]
fread(buf, 255, 1, fp);
^~~~~~~~~~~~~~~~~~~~~~
pdf.c: In function 'load_xref_from_plaintext':
pdf.c:616:5: warning: ignoring return value of 'fread', declared with attribute warn_unused_result [-Wunused-result]
fread(buf, 1, 21, fp);
^~~~~~~~~~~~~~~~~~~~~
pdf.c: In function 'pdf_load_pages_kids':
pdf.c:293:13: warning: ignoring return value of 'fread', declared with attribute warn_unused_result [-Wunused-result]
fread(buf, 1, sz, fp);
^~~~~~~~~~~~~~~~~~~~~
Best regards
Michal Ambroz
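One way to address warnings of this kind is to route reads through a small helper that checks the count returned by fread. The sketch below is an assumption about how that might look; read_exact is a hypothetical name and is not taken from pdf.c.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read exactly len bytes from fp into buf.
 * Returns 0 on success and -1 on a short read (EOF or I/O error). */
int read_exact(FILE *fp, void *buf, size_t len)
{
    assert(fp != NULL && buf != NULL);

    size_t got = fread(buf, 1, len, fp);
    if (got != len)
        return -1;   /* caller can use feof()/ferror() to tell which */
    return 0;
}
```

Callers would then test the result, for example `if (read_exact(fp, buf, 21) != 0) { /* handle error */ }`, instead of discarding it.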

some bug

I found some bugs through fuzzing. The details are below.

==52118==WARNING: AddressSanitizer failed to allocate 0xffffffffffffffe8 bytes
==52118==AddressSanitizer's allocator is terminating the process instead of returning 0
==52118==If you don't like this behavior set allocator_may_return_null=1
==52118==AddressSanitizer CHECK failed: /build/llvm-toolchain-6.0-QjOn7h/llvm-toolchain-6.0-6.0/projects/compiler-rt/lib/sanitizer_common/sanitizer_allocator.cc:225 "((0)) != (0)" (0x0, 0x0)
    #0 0x4e3165 in __asan::AsanCheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) (/home/rtfingc/test/pdfresurrect2/pdfresurrect+0x4e3165)
    #1 0x500a15 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) (/home/rtfingc/test/pdfresurrect2/pdfresurrect+0x500a15)
    #2 0x4e9556 in __sanitizer::ReportAllocatorCannotReturnNull() (/home/rtfingc/test/pdfresurrect2/pdfresurrect+0x4e9556)
    #3 0x4e9596 in __sanitizer::ReturnNullOrDieOnFailure::OnBadRequest() (/home/rtfingc/test/pdfresurrect2/pdfresurrect+0x4e9596)
    #4 0x425179 in __asan::asan_calloc(unsigned long, unsigned long, __sanitizer::BufferedStackTrace*) (/home/rtfingc/test/pdfresurrect2/pdfresurrect+0x425179)
    #5 0x4da212 in calloc (/home/rtfingc/test/pdfresurrect2/pdfresurrect+0x4da212)
    #6 0x51c061 in load_xref_from_plaintext /home/rtfingc/test/pdfresurrect2/pdf.c:646:21
    #7 0x5162c1 in load_xref_entries /home/rtfingc/test/pdfresurrect2/pdf.c:623:7
    #8 0x5155c1 in pdf_load_xrefs /home/rtfingc/test/pdfresurrect2/pdf.c:279:9
    #9 0x5130a6 in init_pdf /home/rtfingc/test/pdfresurrect2/main.c:206:9
    #10 0x512a24 in main /home/rtfingc/test/pdfresurrect2/main.c:263:17
    #11 0x7fdcd3b5cb96 in __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310
    #12 0x41a159 in _start (/home/rtfingc/test/pdfresurrect2/pdfresurrect+0x41a159)

In function load_xref_from_plaintext:

    SAFE_E(fread(buf, 1, 21, fp), 21, "Failed to load entry Size string.\n");
    xref->n_entries = atoi(buf + strlen("ize "));
    xref->entries = calloc(1, xref->n_entries * sizeof(struct _xref_entry));

The contents of buf are attacker-controlled, so n_entries can be set to an arbitrary value without any check.
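The failed request of 0xffffffffffffffe8 bytes is -24 when read as a signed 64-bit value, which would be consistent with a negative n_entries being multiplied by the entry size. A minimal sketch of a checked allocation follows; alloc_entries and the MAX_XREF_ENTRIES cap are assumptions made for illustration, not values from the project.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed upper bound on entries in one xref section (illustrative). */
#define MAX_XREF_ENTRIES 1000000L

/* Hypothetical helper: allocate a zeroed array of n_entries elements.
 * Returns NULL for non-positive or implausibly large counts, or if the
 * multiplication would overflow size_t. */
void *alloc_entries(long n_entries, size_t entry_size)
{
    assert(entry_size > 0);

    if (n_entries <= 0 || n_entries > MAX_XREF_ENTRIES)
        return NULL;
    if ((size_t)n_entries > SIZE_MAX / entry_size)
        return NULL;

    return calloc((size_t)n_entries, entry_size);
}
```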


==52201==WARNING: AddressSanitizer failed to allocate 0xfffffffffffffda4 bytes
==52201==AddressSanitizer's allocator is terminating the process instead of returning 0
==52201==If you don't like this behavior set allocator_may_return_null=1
==52201==AddressSanitizer CHECK failed: /build/llvm-toolchain-6.0-QjOn7h/llvm-toolchain-6.0-6.0/projects/compiler-rt/lib/sanitizer_common/sanitizer_allocator.cc:225 "((0)) != (0)" (0x0, 0x0)
    #0 0x4e3165 in __asan::AsanCheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) (/home/rtfingc/test/pdfresurrect2/pdfresurrect+0x4e3165)
    #1 0x500a15 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) (/home/rtfingc/test/pdfresurrect2/pdfresurrect+0x500a15)
    #2 0x4e9556 in __sanitizer::ReportAllocatorCannotReturnNull() (/home/rtfingc/test/pdfresurrect2/pdfresurrect+0x4e9556)
    #3 0x4e9596 in __sanitizer::ReturnNullOrDieOnFailure::OnBadRequest() (/home/rtfingc/test/pdfresurrect2/pdfresurrect+0x4e9596)
    #4 0x425116 in __asan::asan_malloc(unsigned long, __sanitizer::BufferedStackTrace*) (/home/rtfingc/test/pdfresurrect2/pdfresurrect+0x425116)
    #5 0x4d9feb in malloc (/home/rtfingc/test/pdfresurrect2/pdfresurrect+0x4d9feb)
    #6 0x517ce0 in pdf_load_pages_kids /home/rtfingc/test/pdfresurrect2/pdf.c:317:19
    #7 0x5130d9 in init_pdf /home/rtfingc/test/pdfresurrect2/main.c:210:5
    #8 0x512a24 in main /home/rtfingc/test/pdfresurrect2/main.c:263:17
    #9 0x7fb2fa0d7b96 in __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310
    #10 0x41a159 in _start (/home/rtfingc/test/pdfresurrect2/pdfresurrect+0x41a159)

In pdf_load_pages_kids there is another problem, similar to the first one:

            sz = pdf->xrefs[i].end - ftell(fp);
            buf = malloc(sz + 1);
            SAFE_E(fread(buf, 1, sz, fp), sz, "Failed to load /Root.\n");
            buf[sz] = '\0';
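Here the size is computed as a difference of two file positions, so a crafted xref end smaller than the current position yields a negative sz. The following sketch validates the span before it is used for malloc; span_size and the MAX_SPAN cap are hypothetical and only illustrate the check.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed upper bound on a single buffered span (illustrative). */
#define MAX_SPAN (64L * 1024 * 1024)

/* Hypothetical helper: compute end - pos as a size_t.
 * Returns 0 on success, or -1 if pos is negative (ftell failed),
 * end lies before pos, or the span exceeds MAX_SPAN. */
int span_size(long end, long pos, size_t *out)
{
    assert(out != NULL);

    if (pos < 0 || end < pos)
        return -1;
    if (end - pos > MAX_SPAN)
        return -1;

    *out = (size_t)(end - pos);
    return 0;
}
```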

memleak2: --D-- Version 2 -- Object 70 (Unknown)
memleak2: --A-- Version 2 -- Object 71 (Unknown)
memleak2: --D-- Version 3 -- Object 70 (Unknown)
memleak2: --?-- Version 3 -- Object 71 (Unknown)
---------- memleak2 ----------
Versions: 3

=================================================================
==52214==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 1280 byte(s) in 1 object(s) allocated from:
    #0 0x4da490 in realloc (/home/rtfingc/test/pdfresurrect2/pdfresurrect+0x4da490)
    #1 0x518443 in get_object /home/rtfingc/test/pdfresurrect2/pdf.c:1115:18
    #2 0x51b955 in get_object_from_here /home/rtfingc/test/pdfresurrect2/pdf.c:1058:12
    #3 0x5160d2 in is_valid_xref /home/rtfingc/test/pdfresurrect2/pdf.c:603:13
    #4 0x5153b8 in pdf_load_xrefs /home/rtfingc/test/pdfresurrect2/pdf.c:268:14
    #5 0x5130a6 in init_pdf /home/rtfingc/test/pdfresurrect2/main.c:206:9
    #6 0x512a24 in main /home/rtfingc/test/pdfresurrect2/main.c:263:17
    #7 0x7fac10544b96 in __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310

SUMMARY: AddressSanitizer: 1280 byte(s) leaked in 1 allocation(s).

memleak


=================================================================
==52243==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffd169be7e0 at pc 0x00000051747e bp 0x7ffd169be790 sp 0x7ffd169be788
WRITE of size 1 at 0x7ffd169be7e0 thread T0
    #0 0x51747d in load_creator /home/rtfingc/test/pdfresurrect2/pdf.c:877:33
    #1 0x51567a in pdf_load_xrefs /home/rtfingc/test/pdfresurrect2/pdf.c:291:5
    #2 0x5130a6 in init_pdf /home/rtfingc/test/pdfresurrect2/main.c:206:9
    #3 0x512a24 in main /home/rtfingc/test/pdfresurrect2/main.c:263:17
    #4 0x7f8cbeb2db96 in __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310
    #5 0x41a159 in _start (/home/rtfingc/test/pdfresurrect2/pdfresurrect+0x41a159)
                 
Address 0x7ffd169be7e0 is located in stack of thread T0 at offset 64 in frame
    #0 0x516a8f in load_creator /home/rtfingc/test/pdfresurrect2/pdf.c:832

  This frame has 2 object(s):
    [32, 64) 'obj_id_buf' (line 834) <== Memory access at offset 64 overflows this variable
    [96, 104) 'sz' (line 836)
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /home/rtfingc/test/pdfresurrect2/pdf.c:877:33 in load_creator
Shadow bytes around the buggy address:
  0x100022d2fca0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100022d2fcb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100022d2fcc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100022d2fcd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100022d2fce0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x100022d2fcf0: 00 00 00 00 f1 f1 f1 f1 00 00 00 00[f2]f2 f2 f2

stackoverflow

Infinite Loop in pdf.c

Based on commit 3a95b4f.

To recreate:

./pdfresurrect poc

Analysis:

Using GDB I was able to isolate the function that causes the infinite loop. It originates in load_xref_from_plaintext (line 589) in pdf.c.

inf-loop-img

As seen in the image above, each iteration advances the file stream and stores the data in the buf variable, which is then processed. With the PoC, once the buffer is loaded with "\000\377F%%EOF\000ref\r\n2714\r\n\377\000\000\000\000\000\000\000\000\000\000", the program skips the data at the if (strlen(buf) > 17) check. However, nothing terminates the loop once the file stream reaches its end-of-file indicator, so the same buffer value is reused on every iteration, which results in the infinite loop seen in the picture.
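A loop of this shape terminates if every iteration either consumes input or stops on EOF. The sketch below shows that pattern in isolation; read_entries is a hypothetical function, and the 18-character minimum line length is an assumption based on the "nnnnnnnnnn ggggg n" entry layout, not the condition used in pdf.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical reader: accept up to max_entries xref-like lines from fp.
 * Lines shorter than 18 characters are skipped.  The loop stops as soon
 * as fgets reports EOF or an error, so a stale buffer is never reused.
 * Returns the number of lines accepted. */
size_t read_entries(FILE *fp, size_t max_entries)
{
    assert(fp != NULL);

    char buf[64];
    size_t accepted = 0;

    while (accepted < max_entries) {
        if (fgets(buf, sizeof buf, fp) == NULL)
            break;              /* EOF or read error: stop looping */
        if (strlen(buf) < 18)
            continue;           /* too short to be an entry; input was
                                   still consumed, so progress is made */
        ++accepted;
    }
    return accepted;
}
```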

poc.zip

Infinite loop in function get_xref_linear_skipped in pdf.c

Hi,
I found an infinite loop in the function get_xref_linear_skipped in pdf.c.

env:
version: v0.22b commit af10865
OS: ubuntu 20.04

If 'trailer' is found, the function looks backwards for 'xref'. But if there is no 'x' character earlier in the file, get_xref_linear_skipped goes into an infinite loop.

─── source:pdf.c+729 ────
    724        return;
    725
    726      /* If we found 'trailer' look backwards for 'xref' */
    727      ch = 0;
    728      while (SAFE_F(fp, ((ch = fgetc(fp)) != 'x')))
               // fp=0x0000ffffffffeea8  →  [...]  →  0x00000000fbad2488
 →  729        fseek(fp, -2, SEEK_CUR);
    730
    731      if (ch == 'x')
    732      {
    733          xref->start = ftell(fp) - 1;
    734          fseek(fp, -1, SEEK_CUR);
─────────────────────────────────────────────
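A backward scan can be made to terminate by tracking the absolute position and stopping at offset 0, rather than relying on relative seeks that fail silently at the start of the file. The following is a sketch under that assumption; find_byte_backwards is a hypothetical name and does not reproduce the project's SAFE_F macro.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: starting just before the current position of fp,
 * scan backwards for the byte `target`.  Returns its offset from the
 * start of the file, or -1 if the start of the file is reached (or a
 * seek/read fails) without finding it. */
long find_byte_backwards(FILE *fp, int target)
{
    assert(fp != NULL);

    long pos = ftell(fp);
    while (pos > 0) {
        --pos;                                  /* strictly decreasing */
        if (fseek(fp, pos, SEEK_SET) != 0)
            return -1;
        int ch = fgetc(fp);
        if (ch == EOF)
            return -1;
        if (ch == target)
            return pos;
    }
    return -1;                                  /* hit start of file */
}
```

Because pos decreases on every iteration, the loop runs at most once per byte before the starting position.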

PoC (zipped):
pdfresurrect_hang_1.zip

To reproduce:

./pdfresurrect [poc]

reporter: chiba of Topsec alphaLab

The lack of a complete magic check leads to heap-buffer-overflow in pdf_get_version()

commit 3dfc102

os version: ubuntu 16.04

//pdf.c:205:34
void pdf_get_version(FILE *fp, pdf_t *pdf)
{
    char *header, *c;

    header = get_header(fp);

    /* Locate version string start and make sure we dont go past header */
    if ((c = strstr(header, "%PDF-")) &&
        (c + strlen("%PDF-M.m") + 2))
    {
        pdf->pdf_major_version = atoi(c + strlen("%PDF-"));
    --> pdf->pdf_minor_version = atoi(c + strlen("%PDF-M."));
    }

    free(header);
}
root@ubuntu:/home/fuzz/pdfresurrect# ./pdfresurrect poc
=================================================================
==12207==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x619000000981 at pc 0x0000004b27bc bp 0x7ffec7acc460 sp 0x7ffec7acbc00
READ of size 1 at 0x619000000981 thread T0
    #0 0x4b27bb in __interceptor_atoi (/home/fuzz/pdfresurrect/pdfresurrect+0x4b27bb)
    #1 0x4f9bf2 in pdf_get_version /home/fuzz/pdfresurrect/pdf.c:205:34
    #2 0x4f8aac in init_pdf /home/fuzz/pdfresurrect/main.c:205:5
    #3 0x4f84e3 in main /home/fuzz/pdfresurrect/main.c:279:17
    #4 0x7f7e5efd783f in __libc_start_main /build/glibc-e6zv40/glibc-2.23/csu/../csu/libc-start.c:291
    #5 0x41ae18 in _start (/home/fuzz/pdfresurrect/pdfresurrect+0x41ae18)

0x619000000981 is located 1 bytes to the right of 1024-byte region [0x619000000580,0x619000000980)
allocated by thread T0 here:
    #0 0x4c6fca in calloc (/home/fuzz/pdfresurrect/pdfresurrect+0x4c6fca)
    #1 0x4f8254 in safe_calloc /home/fuzz/pdfresurrect/main.c:223:16
    #2 0x4f9ad0 in get_header /home/fuzz/pdfresurrect/pdf.c:1230:20
    #3 0x4f9ba1 in pdf_get_version /home/fuzz/pdfresurrect/pdf.c:198:14
    #4 0x4f8aac in init_pdf /home/fuzz/pdfresurrect/main.c:205:5
    #5 0x4f84e3 in main /home/fuzz/pdfresurrect/main.c:279:17
    #6 0x7f7e5efd783f in __libc_start_main /build/glibc-e6zv40/glibc-2.23/csu/../csu/libc-start.c:291

SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/fuzz/pdfresurrect/pdfresurrect+0x4b27bb) in __interceptor_atoi
Shadow bytes around the buggy address:
  0x0c327fff80e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c327fff80f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c327fff8100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c327fff8110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c327fff8120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c327fff8130:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fff8140: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fff8150: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fff8160: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fff8170: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fff8180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==12207==ABORTING
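In the quoted function, the second half of the condition, (c + strlen("%PDF-M.m") + 2), is a non-null pointer and therefore always true, so it does not bound the read. A sketch of an explicit length check follows. parse_pdf_version is a hypothetical function that takes the header length as a parameter and assumes single-digit major and minor numbers; it is not the upstream fix.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical parser: find "%PDF-M.m" inside header[0..len) and extract
 * single-digit major/minor versions.  Every access is checked against
 * len, so nothing past the buffer is read.  Returns 0 on success, -1 if
 * no complete version string is present. */
int parse_pdf_version(const char *header, size_t len, int *major, int *minor)
{
    assert(header != NULL && major != NULL && minor != NULL);

    static const char magic[] = "%PDF-";
    const size_t magic_len = sizeof magic - 1;
    const size_t need = magic_len + 3;          /* "%PDF-" + "M.m" */

    for (size_t i = 0; i + need <= len; ++i) {
        const char *c = header + i;
        if (memcmp(c, magic, magic_len) != 0)
            continue;
        if (isdigit((unsigned char)c[5]) && c[6] == '.' &&
            isdigit((unsigned char)c[7])) {
            *major = c[5] - '0';
            *minor = c[7] - '0';
            return 0;
        }
    }
    return -1;
}
```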

build arg:

export CC="clang-8"
export CXX="clang++-8"
export CFLAGS="-O1 -fno-omit-frame-pointer -gline-tables-only -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION -fsanitize=address -fsanitize-address-use-after-scope -fsanitize-coverage=trace-pc-guard"
export CXXFLAGS="-O1 -fno-omit-frame-pointer -gline-tables-only -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION -fsanitize=address -fsanitize-address-use-after-scope -fsanitize-coverage=trace-pc-guard -stdlib=libc++"
./configure
LDFLAGS="$CXXFLAGS" make -j$(nproc)

poc1.gz

Credit: 1vanChen of NSFOCUS Security Team

Bug: Buffer Overflow into Out-of-Bounds Write

Description

In v0.12 and newer, the function get_type() in pdf.c has the following logic:

pdfresurrect/pdf.c

Lines 1299 to 1304 in e4de322

/* Return the value by storing it in static mem */
memcpy(buf, c, (((c - obj) < sizeof(buf)) ? c - obj : sizeof(buf)));
c = buf;
while (!(isspace(*c) || *c=='/' || *c=='>'))
++c;
*c = '\0';

If buf does not contain one of the expected terminating characters (whitespace, /, >), c can point to an address outside buf, causing a \x00 byte to be written out-of-bounds.

Example

Instead of creating a PoC, I found a benign PDF that happens to trigger this bug: http://ftpcontent.worldnow.com/wbbh/documents/Remoteattacksurfaces.pdf
(sha256: 371d87d27666d1f97678cbf4eec03704f4c1e85029009ee2439690303f7dde28)

The problem occurs while parsing the following data:

obj\r\n<</Type/FontDescriptor/FontName/ABCDEE+Calibri/Flags 32/ItalicAngle 0/Ascent 750/Descent -250/CapHeight 750/AvgWidth 521/MaxWidth 1743/FontWeight 400/XHeight 250/StemV 52/FontBBox[ -503 -250 1240 750] /FontFile2 5812 0 R>>\r\nendobj

Due to the reuse of buf between invocations of the function, buf will eventually contain:

"FontDescriptor\000FontName\000DeviceRG"

This benign example causes a read to segfault, but a more carefully crafted input could cause an out-of-bounds write.

Valgrind

<removed for brevity>
Remoteattacksurfaces.pdf: --A-- Version 1 -- Object 2029 (FontDescriptor)
Remoteattacksurfaces.pdf: --A-- Version 1 -- Object 2030 (FontDescriptor)
Remoteattacksurfaces.pdf: --A-- Version 1 -- Object 2031 (FontDescriptor)
Remoteattacksurfaces.pdf: --A-- Version 1 -- Object 2032 (FontDescriptor)
Remoteattacksurfaces.pdf: --A-- Version 1 -- Object 2033 (FontDescriptor)
Remoteattacksurfaces.pdf: --A-- Version 1 -- Object 2034 (FontDescriptor)
Remoteattacksurfaces.pdf: --A-- Version 1 -- Object 2035 (FontDescriptor)
Remoteattacksurfaces.pdf: --A-- Version 1 -- Object 2036 (FontDescriptor)
Remoteattacksurfaces.pdf: --A-- Version 1 -- Object 2037 (FontDescriptor)
Remoteattacksurfaces.pdf: --A-- Version 1 -- Object 2038 (FontDescriptor)
Remoteattacksurfaces.pdf: --A-- Version 1 -- Object 2039 (FontDescriptor)
==18759== Invalid read of size 1
==18759==    at 0x10D12D: get_type (pdf.c:1296)
==18759==    by 0x10B012: pdf_summarize (pdf.c:503)
==18759==    by 0x109F95: main (main.c:337)
==18759==  Address 0x112000 is not stack'd, malloc'd or (recently) free'd
==18759==
==18759==
==18759== Process terminating with default action of signal 11 (SIGSEGV)
==18759==  Access not within mapped region at address 0x112000
==18759==    at 0x10D12D: get_type (pdf.c:1296)
==18759==    by 0x10B012: pdf_summarize (pdf.c:503)
==18759==    by 0x109F95: main (main.c:337)
==18759==  If you believe this happened as a result of a stack
==18759==  overflow in your program's main thread (unlikely but
==18759==  possible), you can try to increase the size of the
==18759==  main thread stack using the --main-stacksize= flag.
==18759==  The main thread stack size used in this run was 8388608.
==18759==
==18759== HEAP SUMMARY:
==18759==     in use at exit: 284,202 bytes in 9 blocks
==18759==   total heap usage: 35,756 allocs, 35,747 frees, 2,340,979,569 bytes allocated
==18759==
==18759== LEAK SUMMARY:
==18759==    definitely lost: 0 bytes in 0 blocks
==18759==    indirectly lost: 0 bytes in 0 blocks
==18759==      possibly lost: 0 bytes in 0 blocks
==18759==    still reachable: 284,202 bytes in 9 blocks
==18759==         suppressed: 0 bytes in 0 blocks
==18759== Rerun with --leak-check=full to see details of leaked memory
==18759==
==18759== For counts of detected and suppressed errors, rerun with: -v
==18759== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
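The out-of-bounds access in get_type() described in this report comes from scanning for a terminator without a length limit. One possible shape for a bounded version is sketched below; copy_type_name is a hypothetical helper with caller-supplied buffer sizes, not the code in pdf.c.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copy a PDF name token from src (src_len bytes)
 * into dst (dst_size bytes).  Copying stops at whitespace, '/', '>',
 * a NUL byte, the end of src, or when dst is full.  dst is always
 * NUL-terminated.  Returns the number of characters copied. */
size_t copy_type_name(char *dst, size_t dst_size,
                      const char *src, size_t src_len)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t n = 0;
    while (n < src_len && n + 1 < dst_size) {
        unsigned char ch = (unsigned char)src[n];
        if (ch == '\0' || isspace(ch) || ch == '/' || ch == '>')
            break;
        dst[n] = (char)ch;
        ++n;
    }
    dst[n] = '\0';   /* n < dst_size always holds here */
    return n;
}
```

Since the terminator is written at an index bounded by dst_size, a token with no delimiter is truncated instead of walking past the buffer.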
