marylinh / seccompsandbox

Automatically exported from code.google.com/p/seccompsandbox

License: BSD 3-Clause "New" or "Revised" License

C++ 60.83% Assembly 13.65% C 24.50% Makefile 0.67% Python 0.35%

seccompsandbox's Issues

seccomp-sandbox dirties 4k*kMaxThreads on startup (currently 400k)

On startup, trusted_process.cc writes to one 4k page for each of the
kMaxThreads thread slots.  Since kMaxThreads is currently 100, this
dirties 400k of memory.

It should be possible to change this so that we only touch these pages
as new threads are created.
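
A minimal sketch of the lazy approach, assuming a hypothetical
LazyThreadPages pool (this is not the layout trusted_process.cc actually
uses): all pages are reserved up front with an anonymous mapping, but a
page is only written to, and so only dirtied, when its thread slot is
first claimed.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical pool: reserves one page per possible thread, but writes to a
// page only when that thread slot is first used.
class LazyThreadPages {
 public:
  explicit LazyThreadPages(std::size_t maxThreads)
      : pageSize_(static_cast<std::size_t>(sysconf(_SC_PAGESIZE))),
        touched_(maxThreads, false) {
    // Anonymous private mappings are not backed by physical memory until
    // they are written to, so this reservation dirties nothing.
    void* p = mmap(nullptr, pageSize_ * maxThreads, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    base_ = (p == MAP_FAILED) ? nullptr : static_cast<char*>(p);
  }
  ~LazyThreadPages() {
    if (base_) munmap(base_, pageSize_ * touched_.size());
  }
  LazyThreadPages(const LazyThreadPages&) = delete;
  LazyThreadPages& operator=(const LazyThreadPages&) = delete;

  // Initialise the page for `slot` on first use and return it.
  char* claim(std::size_t slot) {
    if (!base_ || slot >= touched_.size()) return nullptr;
    char* page = base_ + slot * pageSize_;
    if (!touched_[slot]) {
      std::memset(page, 0, pageSize_);  // first write dirties the page
      touched_[slot] = true;
    }
    return page;
  }

  // Number of bytes this pool has written to so far.
  std::size_t dirtiedBytes() const {
    std::size_t n = 0;
    for (bool t : touched_) n += t ? pageSize_ : 0;
    return n;
  }
  std::size_t pageSize() const { return pageSize_; }

 private:
  std::size_t pageSize_;
  std::vector<bool> touched_;
  char* base_ = nullptr;
};
```

With LazyThreadPages pool(100), dirtiedBytes() stays at 0 until the
first claim(), and each newly claimed slot adds one page.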

Original issue reported on code.google.com by [email protected] on 1 Oct 2010 at 12:05

Build error with GCC 4.5

When building with GCC 4.5, this error pops up:

sandbox.cc: In static member function ‘static void
playground::Sandbox::startSandbox()’:
sandbox.cc:447:24: error: ‘secureMem’ was not declared in this scope

In GCC 4.4 and earlier, and in the original C++ standard,

struct A { struct B {}; };

A::A::A::B::B::B::B::B myVariable;

is a perfectly valid declaration of a variable of type A::B.

In GCC 4.5, and in the current C++ standard, A::A is A's constructor, not
the type A. Normally, this won't be a problem, but in this case macro use
allowed it to creep in. Attached very trivial patch (based on the chromium
sources, but applicable without changes) modifies this so that it compiles.
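
To illustrate the rule with the same A and B types (a sketch only, not
the contents of gcc45.patch): naming the nested type once, as A::B, is
accepted by both older and newer compilers, and an elaborated type
specifier such as "struct A::A" still names the type, because that
lookup ignores function names.

```cpp
#include <type_traits>

struct A { struct B { int value; }; };

// Portable spelling: name the nested type directly.
A::B makeB() { return A::B{42}; }

// "struct A::A" is an elaborated type specifier, so the lookup ignores
// function names and finds the injected class name (the type A) rather
// than the constructor.
struct A::A a2;
static_assert(std::is_same<decltype(a2), A>::value,
              "struct A::A names the type A");

// A::A a;  // rejected by newer compilers: here A::A names the constructor
```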

This patch should be completely harmless for other compilers, but I can't
actually test for any ill effects, because patched or unpatched, with GCC
4.4.2 or GCC 4.5 (20091210 snapshot), I get a segmentation fault after an
error is reported on /proc/self/maps. I haven't ruled out the possibility
of a local problem, so don't consider that last part a bug report yet
unless you are already getting that yourself. :)

Original issue reported on code.google.com by [email protected] on 15 Dec 2009 at 9:17

Attachments:

gcc45.patch

kMaxThreads=100 is rather low for Native Client

The seccomp sandbox allocates per-thread data structures on startup,
so the maximum number of threads is fixed at startup.  This is set via
kMaxThreads, which is currently set to 100.

This is a bit low for Native Client.  Currently we support 8180
threads on Linux, 2556 on Mac OS X, and at least 900 on Windows (see
tests/egyptian_cotton/nacl.scons).  The maximum number of threads is
visible to untrusted code.  While we don't guarantee any number, 100
is a bit low compared with what we support currently.

See also issue 7.

Original issue reported on code.google.com by [email protected] on 1 Oct 2010 at 12:06

Allow libraries to be patched before fork(), before enabling sandbox

Currently, in Chromium, enabling the seccomp sandbox is done entirely
after forking from the zygote process, and this includes patching
libraries.  However, it would be good if patching libraries could be
done before fork().  This would have two advantages:

 1) Performance:  Patching libraries only once would save time and memory.

 2) Security, when using the SUID sandbox:  Currently the zygote
    process needs to keep a directory FD for /proc, because the
    seccomp sandbox needs /proc/self/maps in order to do library
    patching.

    /proc conveys a lot of authority, so this makes the SUID sandbox
    less secure than it would otherwise be, even if this FD is only
    held by the zygote process and not its children.

    If the zygote process had Breakpad enabled (although it's not
    supposed to), a SUID-sandboxed process could take control of the
    zygote (and hence its /proc FD) by sending it a signal, waiting
    for the zygote to make itself dumpable using prctl(), and then
    taking control of the zygote using ptrace().

In order to allow patching before fork(), we would need to add a
global flag to the syscall interceptor to pass through syscalls
unaltered until the sandbox has been enabled fully.
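
A rough sketch of such a flag, with hypothetical names (routeSyscall,
enableSandbox and g_sandboxEnabled are not from the seccomp sandbox
source): the patched entry point consults an atomic flag and passes the
syscall through untouched until the flag is set.

```cpp
#include <atomic>

// Hypothetical routing decision for an intercepted syscall.
enum class Route { PassThrough, Sandboxed };

// Global flag: false while libraries are being patched (before fork()),
// set to true once the sandbox has been fully enabled.
static std::atomic<bool> g_sandboxEnabled{false};

// Called from the (already patched) syscall entry point.
Route routeSyscall() {
  return g_sandboxEnabled.load(std::memory_order_acquire)
             ? Route::Sandboxed
             : Route::PassThrough;
}

// Called once, after fork(), when the sandbox is ready.
void enableSandbox() {
  g_sandboxEnabled.store(true, std::memory_order_release);
}
```

A real implementation would presumably also need to keep this flag in
memory that untrusted code cannot write to once the sandbox is on.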

Original issue reported on code.google.com by [email protected] on 18 Oct 2010 at 1:09

Extend the sandbox to work for legacy programs

Currently the seccomp sandbox works as a library.  After starting up, a process 
can enable the sandbox.  This means the sandbox is limited to trusted programs 
that wish to run parts of themselves untrusted.

It would be good if the seccomp sandbox could be applied to existing programs.  
To run an existing executable, we would have to enable sandboxing before the 
executable's code is run.  Furthermore, we don't want to have to modify glibc's 
dynamic linker (ld.so), or trust it.  So we would need to enable sandboxing 
before the dynamic linker gets control too.

We would need to support whatever syscalls ld.so does on startup.  One case of 
this is ld.so's TLS initialisation.  On i386, this uses set_thread_area().  On 
x86-64, it uses arch_prctl()+ARCH_SET_FS.

There is a design sketch for this at http://plash.beasts.org/wiki/SeccompSandbox

Original issue reported on code.google.com by [email protected] on 11 Nov 2010 at 4:05

Split code into "trusted" and "untrusted" directories

In the Native Client source tree, the code is split into "trusted" and
"untrusted" directories, with an additional "shared" directory for
code that is used in both contexts.

It would be good to do something similar for the seccomp sandbox.  It
would make the code easier to review.

Ideally, each of the files that handles specific syscalls (mmap.cc,
open.cc, exit.cc, etc.) would be split into two files, to separate the
sandbox_*() and process_*() functions.

When I was first getting familiar with the codebase, I found that
having sandbox_*() and process_*() in the same file made the codebase
harder to navigate by grepping, because it is not immediately obvious
whether a symbol is referred to from trusted or untrusted code.

Original issue reported on code.google.com by [email protected] on 21 Oct 2010 at 10:02

The return value of NOINTR_SYS is ignored

clang complains "error: expression result unused [-Wunused-value]" in a couple 
of places while building the seccomp sandbox.

I've listed the places below. Instead of silencing the compiler, you probably 
want to log an error. I don't know how logging works in the seccomp sandbox.

Index: mutex.h
===================================================================
--- mutex.h (revision 153)
+++ mutex.h (working copy)
@@ -124,7 +124,7 @@
         #else
         #error Unsupported target platform
         #endif
-        NOINTR_SYS(sys.futex(mutex, FUTEX_WAKE, 1, 0));
+        (void)NOINTR_SYS(sys.futex(mutex, FUTEX_WAKE, 1, 0));
         return rc;
       }

Index: sandbox.cc
===================================================================
--- sandbox.cc  (revision 153)
+++ sandbox.cc  (working copy)
@@ -244,8 +244,8 @@
         status_ = STATUS_AVAILABLE;
       }
       int rc;
-      NOINTR_SYS(sys.waitpid(pid, &rc, 0));
-      NOINTR_SYS(sys.close(fds[0]));
+      (void)NOINTR_SYS(sys.waitpid(pid, &rc, 0));
+      (void)NOINTR_SYS(sys.close(fds[0]));
       return status_ != STATUS_UNSUPPORTED;
   }
 }
@@ -349,7 +349,7 @@
   // Take a snapshot of the current memory mappings. These mappings will be
   // off-limits to all future mmap(), munmap(), mremap(), and mprotect() calls.
   snapshotMemoryMappings(processFdPub_, proc_self_maps_);
-  NOINTR_SYS(sys.close(proc_self_maps_));
+  (void)NOINTR_SYS(sys.close(proc_self_maps_));
   proc_self_maps_ = -1;

   // Creating the trusted thread enables sandboxing
Index: trusted_process.cc
===================================================================
--- trusted_process.cc  (revision 153)
+++ trusted_process.cc  (working copy)
@@ -118,8 +118,8 @@
       nextThread = currentThread->mem->newSecureMem;
       goto newThreadCreated;
     } else if (header.sysnum == __NR_exit) {
-      NOINTR_SYS(sys.close(iter->second.fdPub));
-      NOINTR_SYS(sys.close(iter->second.fd));
+      (void)NOINTR_SYS(sys.close(iter->second.fdPub));
+      (void)NOINTR_SYS(sys.close(iter->second.fd));
       SecureMem::Args* secureMem = currentThread->mem;
       threads.erase(iter);
       secureMemPool_.push_back(secureMem);
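
As an alternative to the (void) casts, here is a sketch of what logging
the error might look like.  NOINTR_SYS's real definition is not shown in
this report, so retryOnEintr() below is a stand-in that assumes the
wrapped call reports failure as -1 with errno set; retryAndLog() and the
use of stderr are likewise assumptions, since I don't know what logging
facility the sandbox has.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>

// Stand-in for NOINTR_SYS: call fn again while it fails with EINTR.
template <typename Fn>
long retryOnEintr(Fn fn) {
  long rc;
  do {
    rc = fn();
  } while (rc == -1 && errno == EINTR);
  return rc;
}

// Like retryOnEintr(), but report any other failure instead of dropping it.
template <typename Fn>
long retryAndLog(const char* what, Fn fn) {
  long rc = retryOnEintr(fn);
  if (rc == -1) {
    std::fprintf(stderr, "%s failed: %s\n", what, std::strerror(errno));
  }
  return rc;
}
```

A call site would then read something like
retryAndLog("close", [&] { return (long)close(fd); }) rather than
discarding the result.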

Original issue reported on code.google.com by [email protected] on 26 Jan 2011 at 4:05

ssize_t not found in library.h

What steps will reproduce the problem?
make -f makefile

What do you see instead?
library.h:159:46: error: 'ssize_t' has not been declared

Solved by including sys/types.h, which, per IEEE Std 1003.1-2001, shall 
define ssize_t.

Original issue reported on code.google.com by [email protected] on 14 May 2012 at 5:47

Attachments:

ssize_t-not-found.patch

Build error with GCC 4.6 on x64

When building with GCC 4.6 on x64, the following build error appears for a 
number of files:

In file included from seccompsandbox/syscall_table.h:18:0,
                 from seccompsandbox/sandbox_impl.h:51,
                 from seccompsandbox/debug.h:14,
                 from seccompsandbox/ioctl.cc:5:
seccompsandbox/securemem.h: In static member function ‘static void 
playground::SecureMem::sendSystemCall(const 
playground::SecureMem::SyscallRequestInfo&, playground::SecureMem::LockType, 
T1, T2, T3) [with T1 = int, T2 = int, T3 = void*]’:
seccompsandbox/ioctl.cc:39:61:   instantiated from here
seccompsandbox/securemem.h:180:5: error: cast to pointer from integer of 
different size [-Werror=int-to-pointer-cast]
seccompsandbox/securemem.h:180:5: error: cast to pointer from integer of 
different size [-Werror=int-to-pointer-cast]

This warning was added to GCC 4.6 as part of 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28584
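
One common way to avoid this diagnostic is to widen the integer to a
pointer-sized type before casting.  The sketch below uses a hypothetical
helper, toSyscallArg(); it is not the code at securemem.h:180, which is
not shown here.

```cpp
#include <cstdint>
#include <type_traits>

// Convert an integral syscall argument to void* without a size-mismatch
// warning: widen to a pointer-sized integer first, then cast.
template <typename T>
typename std::enable_if<std::is_integral<T>::value, void*>::type
toSyscallArg(T value) {
  return reinterpret_cast<void*>(static_cast<std::intptr_t>(value));
}

// Pointers are passed through unchanged.
template <typename T>
void* toSyscallArg(T* value) {
  return const_cast<void*>(static_cast<const volatile void*>(value));
}
```

With T1 = int, T2 = int, T3 = void* as in the instantiation above, the
first two arguments would take the integral overload and the third the
pointer overload.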

Original issue reported on code.google.com by [email protected] on 5 Jun 2011 at 8:20

Disable PaX mprotect

seccompsandbox fails on a kernel with PaX because PaX prevents mprotect() 
from making executable sections writable. The executables need to be 
explicitly marked so that PaX does not enforce these memory protections on 
them.

Original issue reported on code.google.com by [email protected] on 14 May 2012 at 6:10

Attachments:

mprotect-off.patch

test_debugging fails on x86-64 because %gs is 0

test_debugging has started to fail for me on x86-64.

The cause seems to be Debug::enter()'s test for %gs.  It checks whether %gs is 
zero, and if so, Debug::enter() doesn't increment the recursion counter and it 
returns true.

However, we would expect %gs to be zero on x86-64.  See the test program below.

The result is that, in debugging mode, we get infinite recursion:  
defaultSystemCallHandler() calls Debug::syscall(), which calls gettimeofday(), 
which triggers a call to defaultSystemCallHandler().  Without the recursion 
check, this calls gettimeofday() again.
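
For illustration, a recursion guard that does not depend on the value of
%gs could look like the sketch below.  DebugScope is a hypothetical
class, not Debug::enter() itself, and it assumes thread-local storage is
already usable on the calling thread.

```cpp
// Hypothetical per-thread recursion guard. Assumes thread-local storage is
// already usable on the calling thread.
class DebugScope {
 public:
  DebugScope() : outermost_(depth_++ == 0) {}
  ~DebugScope() { --depth_; }
  DebugScope(const DebugScope&) = delete;
  DebugScope& operator=(const DebugScope&) = delete;

  // True only for the outermost scope on this thread; nested scopes should
  // skip any work (such as gettimeofday()) that could re-enter the handler.
  bool outermost() const { return outermost_; }

 private:
  static thread_local int depth_;
  bool outermost_;
};

thread_local int DebugScope::depth_ = 0;
```

A debugging hook would create a DebugScope on entry and do its logging
only when outermost() is true.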

On my Ubuntu Lucid VM, this didn't just run out of stack, it triggered the OOM 
killer, and my window borders disappeared because the kernel killed Metacity 
(!).

What I don't understand is why the test was passing before.  I'm not sure what 
has changed.  Maybe syscall_entrypoint.cc's special case for gettimeofday() was 
making this work.  But if that is the case, I don't know why this has started 
failing.


I am not sure if %fs/%gs should ever show up as having non-zero values on 
x86-64.  The test program below gives the following output:

%gs = 0
%gs:0 = 1234
%fs = 0
%fs:0 = 139925201401600


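#include <stdio.h>
#include <unistd.h>
#include <asm/unistd.h>
#include <asm/prctl.h>

int main() {
  long tls = 1234;
  long val;
  /* Point the %gs base address at "tls" (x86-64 only). */
  syscall(__NR_arch_prctl, ARCH_SET_GS, &tls);

  /* The %gs selector itself, then the word at the %gs base. */
  asm volatile("mov %%gs, %0" : "=r" (val));
  printf("%%gs = %li\n", val);
  asm volatile("mov %%gs:0, %0" : "=r" (val));
  printf("%%gs:0 = %li\n", val);

  /* The same two reads for %fs, which glibc uses for TLS on x86-64. */
  asm volatile("mov %%fs, %0" : "=r" (val));
  printf("%%fs = %li\n", val);
  asm volatile("mov %%fs:0, %0" : "=r" (val));
  printf("%%fs:0 = %li\n", val);

  return 0;
}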

Original issue reported on code.google.com by [email protected] on 26 Sep 2010 at 12:24

Allow sandbox to be initialised without needing access to /proc

This is a less specific version of issue 9.

We would like to be able to initialise the seccomp sandbox without
needing access to /proc/self/maps, so that we don't have to open a
hole in the SUID sandbox to get access to /proc.

Original issue reported on code.google.com by [email protected] on 18 Oct 2010 at 1:27

Vulnerability in process_sigaction()

Following on from http://codereview.chromium.org/3380018/show and
http://codereview.chromium.org/3414016/show, for the sake of
completeness, I am filing a bug on this.

There is a vulnerability in process_sigaction() in sigaction.cc, which
does the following:

  SecureMem::sendSystemCall(threadFdPub, false, -1, mem, sigaction_req.sysnum,
                            sigaction_req.signum, sigaction_req.action,
                            sigaction_req.old_action,
                            sigaction_req.sigsetsize);

It receives the syscall number sigaction_req.sysnum in a message, but
it passes it on to the trusted thread for execution without checking it.

This means an attacker can execute any syscall with 4 arguments.  The
only constraint is that the first argument cannot be 11.
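
A sketch of the missing check, assuming a hypothetical helper
isSigactionSysnum() that the trusted process would call before
forwarding the request.  Which syscall numbers the real handler should
accept is an assumption here; see the code reviews linked above for the
actual discussion.

```cpp
#include <sys/syscall.h>

// Hypothetical check run by the trusted process before forwarding a
// sigaction request: only signal-handler syscalls are accepted.
bool isSigactionSysnum(long sysnum) {
  switch (sysnum) {
    case SYS_rt_sigaction:
#ifdef SYS_sigaction
    case SYS_sigaction:  // legacy variant, present on e.g. i386
#endif
      return true;
    default:
      return false;
  }
}
```

process_sigaction() would then refuse the request when
isSigactionSysnum(sigaction_req.sysnum) is false, instead of passing the
number on to the trusted thread.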

Original issue reported on code.google.com by [email protected] on 27 Sep 2010 at 1:41

Missing unistd.h include in test_patching.cc

What steps will reproduce the problem?
make test

What do you see instead?
tests/test_patching.cc: In function 'void patch_range(char*, char*)':
tests/test_patching.cc:19:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:25:66: error: 'getpagesize' was not declared in this 
scope
tests/test_patching.cc:26:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:29:3: error: 'close' was not declared in this scope
tests/test_patching.cc:29:3: error: invalid type in declaration before '=' token
tests/test_patching.cc:29:3: error: '_exit' was not declared in this scope
tests/test_patching.cc: In function 'void test_patching_syscall()':
tests/test_patching.cc:33:20: error: 'getpid' was not declared in this scope
tests/test_patching.cc:34:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:39:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:40:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:41:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:42:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:52:3: error: '_exit' was not declared in this scope
tests/test_patching.cc: In function 'void check_patching_vsyscall(char*, 
char*)':
tests/test_patching.cc:76:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:77:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:78:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:79:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:80:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:81:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:82:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:83:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:84:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:85:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:86:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:87:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:88:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:89:3: error: '_exit' was not declared in this scope
tests/test_patching.cc: In function 'void 
test_patching_vsyscall_gettimeofday()':
tests/test_patching.cc:95:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:96:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:97:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:102:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:103:3: error: '_exit' was not declared in this scope
tests/test_patching.cc: In function 'void test_patching_vsyscall_time()':
tests/test_patching.cc:109:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:111:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:117:3: error: '_exit' was not declared in this scope
tests/test_patching.cc: In function 'void test_patching_vsyscall_getcpu()':
tests/test_patching.cc:121:3: error: '_exit' was not declared in this scope
tests/test_patching.cc:129:3: error: '_exit' was not declared in this scope
make: *** [tests/test_patching.o64] Error 1


Solved by including unistd.h, which is the proper header for declaring _exit(), 
close(), getpid() and getpagesize().
All but getpagesize() are defined by POSIX to have their prototypes in unistd.h.

getpagesize() appears in SVr4, 4.4BSD and SUSv2, and is also declared in 
unistd.h.

Original issue reported on code.google.com by [email protected] on 14 May 2012 at 5:55

Attachments:

unistd.patch

Test failures on 32-bit systems due to differences in NX page protection

I mentioned this problem on
http://codereview.chromium.org/2074003/show, but I am filing a bug so
that it doesn't get lost.

Currently some of seccomp-sandbox's tests fail on 32-bit systems.

On my netbook, running 32-bit Ubuntu Karmic, two tests failed with SIGSEGV:
test_sa_flags
test_segv_resethand
(NX page protection works on this system.)

On another machine, running 32-bit GHardy, just test_sa_flags failed,
again with SIGSEGV.  (NX page protection doesn't work on this system.)

The tests are fine on the two 64-bit machines I tested on.

From sandbox.cc:

     // Non-executable version of the restorer function. We use this to
     // trigger a SEGV upon returning from the user's signal handler, giving
     // us an ability to clean up prior to returning from the SEGV handler.

I don't think this will work on systems where no-execute page
protection doesn't work, i.e. older kernels and older hardware.  This
restorer function will run and so the signal handler's counter won't
be decremented.  You can verify this by linking the tests with
-Wl,-z,execstack (this option is badly named because it doesn't only
affect the stack).

This explains the test_sa_flags failure.

The test_segv_resethand failure seems a bit odder.  The signal
handler's "ret" instruction jumps to non-executable code which causes
a SIGSEGV.  But when I examined this with "strace -i" and gdb, the
"ret" is shown as the source of the fault, rather than the address
that "ret" jumps to (which is what the code expects).  It looks like
the reported %eip varies between CPUs or kernel versions.

Original issue reported on code.google.com by [email protected] on 27 Sep 2010 at 1:28

Restrictions on sendmsg() could be bypassed through race using MAP_SHARED

When the nascent thread is starting, it forks a subprocess which does
a sendmsg() call to the trusted process.

The reason for doing the sendmsg() in a forked subprocess is
presumably to stop the untrusted threads from tampering with
sendmsg()'s "struct msghdr" arguments, which are passed in memory and
not in registers.

However, the trusted thread uses the new thread's stack for the
"struct msghdr" (%ebp in trusted_thread_i386.S).  This stack is mapped
by untrusted code, and it could have been mapped with MAP_SHARED, in
which case fork() will not create a private copy.

This means untrusted code could bypass the sandbox's restrictions on
sendmsg() by racing to overwrite this memory.  For example, it could
fill in a non-NULL msg_name value.

I haven't tried testing this though.

The fix would be to use any page that is guaranteed to be mapped with
MAP_PRIVATE.

Does this sound right, Markus?

Original issue reported on code.google.com by [email protected] on 23 Sep 2010 at 9:50

The sandbox does not intercept glibc's calls to the x86-64 vsyscall page

In Ubuntu Lucid, libpthread contains calls to the x86-64 vsyscall page:

$ objdump -d /lib/libpthread.so.0
...
    ae50:       48 c7 c0 00 00 60 ff    mov    $0xffffffffff600000,%rax
    ae57:       ff d0                   callq  *%rax
...
(This is a call to vgettimeofday.)

When I disassemble these functions in a sandboxed process using gdb, I can see 
that the SYSCALL instructions have been patched, but the indirect calls to the 
vsyscall page have not.  There is code in library.cc for patching indirect 
calls, but it is only enabled for patching the vdso.
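
To check a library for this particular sequence, a simple byte search is
enough.  The sketch below (findVsyscallCalls is a hypothetical helper,
not code from library.cc) looks only for the exact nine bytes shown in
the objdump output above, so it would miss calls that use another
register or another vsyscall entry, and a raw byte scan can in principle
match in the middle of an unrelated instruction.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// mov $0xffffffffff600000,%rax ; callq *%rax
static const std::uint8_t kVsyscallCall[] = {
    0x48, 0xc7, 0xc0, 0x00, 0x00, 0x60, 0xff, 0xff, 0xd0};

// Return the offsets in code at which the sequence above starts.
std::vector<std::size_t> findVsyscallCalls(const std::uint8_t* code,
                                           std::size_t len) {
  std::vector<std::size_t> hits;
  const std::uint8_t* end = code + len;
  const std::uint8_t* pos = code;
  while (true) {
    pos = std::search(pos, end, std::begin(kVsyscallCall),
                      std::end(kVsyscallCall));
    if (pos == end) break;
    hits.push_back(static_cast<std::size_t>(pos - code));
    ++pos;  // continue searching after this match
  }
  return hits;
}
```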

In practice, the vsyscall calls seem to be conditional on 
__have_futex_clock_realtime being false.  libpthread won't call vgettimeofday 
on a kernel that supports FUTEX_CLOCK_REALTIME.

This issue might be behind the problem with Linux 3.1 (issue chromium:104084), 
but I need to investigate more.

This is a difficult problem to solve in general, because it's probably not 
practical to enable library.cc's indirect-call patching code for libpthread.so 
or libc.so.  The kernel does not allow us to patch the vsyscall page (which is 
in the kernel range of address space), unlike the vdso.  However, the vsyscall 
page is deprecated, so we probably don't need to handle the general case.

Original issue reported on code.google.com by [email protected] on 15 Nov 2011 at 5:10

Change syscallTable to be filled out at run time

Currently syscallTable is filled out statically in syscall_table.c.
This has to be done in C to make it read-only because of a limitation
in g++.

An alternative would be to fill out the table at run time.

From http://codereview.chromium.org/3414016/show:
  "syscall_table.c is only saving us 4k of memory vs. populating at
  runtime, and only for non-PIC code.  Building this into a PIE or a
  library would lose the saving.

  Populating the table at runtime would make it easier to define
  policies or have alternate syscall handlers.  e.g. NaCl requires
  modify_ldt(), but it would be good to disable this for other
  processes just in case.  Plash would like to intercept open() to
  operate purely via message passing."

Another advantage would be that the table can be filled out in C++.
The asm("playground$foo") tricks we use to mix C and C++ wouldn't be
needed any more.
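
A minimal sketch of a table populated at run time, to show how policies
could differ per process.  SyscallPolicy, SyscallHandler and kTableSize
are hypothetical names; the real syscallTable entries and handler
signatures are different.

```cpp
#include <array>
#include <cstddef>

// Hypothetical handler signature and table size, for illustration only.
using SyscallHandler = long (*)(long arg);
constexpr std::size_t kTableSize = 512;

class SyscallPolicy {
 public:
  SyscallPolicy() { table_.fill(nullptr); }

  // Install a handler for sysnum; returns false if it is out of range.
  bool allow(std::size_t sysnum, SyscallHandler handler) {
    if (sysnum >= kTableSize) return false;
    table_[sysnum] = handler;
    return true;
  }

  // Dispatch to the handler, or deny (-1) when no handler is installed.
  long dispatch(std::size_t sysnum, long arg) const {
    if (sysnum >= kTableSize || table_[sysnum] == nullptr) return -1;
    return table_[sysnum](arg);
  }

 private:
  std::array<SyscallHandler, kTableSize> table_;
};
```

A NaCl process could then allow() modify_ldt() while other processes
leave that slot empty.  To keep the read-only property that
syscall_table.c provides today, a real table would presumably be made
read-only (for example with mprotect()) once it has been filled in.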

Original issue reported on code.google.com by [email protected] on 30 Oct 2010 at 1:38

Concurrent sendmsg()/recvmsg() calls are not allowed

I discovered that seccomp-sandbox does not currently allow concurrent
sendmsg() and recvmsg() calls.  If one thread is blocked in a
recvmsg() call, a second thread that calls sendmsg() will block.

This is because seccomp-sandbox uses a global mutex (syscall_mutex_)
for all syscalls that require data to be written to a secure memory
area by the trusted process.  The trusted process will handle only one
syscall at a time, and it waits for syscall_mutex_ to be unlocked
before handling another syscall.

I discovered this while trying to hook up Native Client to use
seccomp-sandbox.  Some of the tests deadlocked: there was a background
thread blocked on recvmsg(), while foreground threads would then block
on calls like mmap().

To fix this, I propose two changes:

1) Use one mutex per thread, rather than a global mutex.

2) Change the trusted process so that it does not wait for the
   thread's mutex to be unlocked before processing another syscall
   (which might come from another thread).

The wait in (2) happens in sendSystemCallInternal() in securemem.cc.
This wait should only be necessary if an allowed syscall has a side
effect that must complete for a subsequent allowed syscall to be safe.
I don't think this is the case for any currently allowed syscalls: the
trusted process does not attempt to model state changes of the
sandboxed process; ordering of syscalls, once checked, is not
significant.  (A possible exception is in the IPC syscalls in ipc.cc.)

The only wait needed should be in lockSystemCall(), to prevent a
secure memory area from being reused while it is still in use.
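
As a rough illustration of change (1), the sketch below gives each
thread slot its own mutex, so holding one thread's lock does not block
another thread.  It uses std::mutex for brevity, which is a swapped-in
stand-in for the sandbox's own futex-based mutex in mutex.h, and
ThreadSlots is a hypothetical name.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical per-thread state: each sandboxed thread gets its own mutex
// guarding its own secure memory area.
struct ThreadSlot {
  std::mutex syscallMutex;
};

class ThreadSlots {
 public:
  explicit ThreadSlots(std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
      slots_.push_back(std::unique_ptr<ThreadSlot>(new ThreadSlot));
    }
  }

  // Mutex for the given thread slot; throws std::out_of_range if invalid.
  std::mutex& mutexFor(std::size_t thread) {
    return slots_.at(thread)->syscallMutex;
  }

 private:
  std::vector<std::unique_ptr<ThreadSlot>> slots_;
};
```

With a single global mutex, a thread blocked in recvmsg() would hold up
every other thread's mmap(); here, a thread holding mutexFor(0) leaves
mutexFor(1) free.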

I have got an implementation of these changes which I'll send out
soon.

Original issue reported on code.google.com by [email protected] on 7 Sep 2010 at 3:57

clang error: cmp literal, memaddress is ambiguous

clang's integrated assembler emits the following error when building the 
seccomp sandbox:

/tmp/cc-DNyGz3.s:155:9: error: ambiguous instructions require an explicit 
suffix (could be 'cmpb', 'cmpw', 'cmpl', or 'cmpq')
        cmp $0, 0(%rax)
        ^
/tmp/cc-DNyGz3.s:157:9: error: ambiguous instructions require an explicit 
suffix (could be 'cmpb', 'cmpw', 'cmpl', or 'cmpq')
        cmp $1, 0(%rax)

This patch fixes the problem:


Index: fault_handler_i386.S
===================================================================
--- fault_handler_i386.S    (revision 153)
+++ fault_handler_i386.S    (working copy)
@@ -178,9 +178,9 @@
         // callers might be confused by this and will need fixing for running
         // inside of the seccomp sandbox.
      20:lea  playground$sa_segv, %eax
-        cmp  $0, 0(%eax)         // SIG_DFL
+        cmpw $0, 0(%eax)         // SIG_DFL
         jz   21f
-        cmp  $1, 0(%eax)         // SIG_IGN
+        cmpw  $1, 0(%eax)         // SIG_IGN
         jnz  22f                 // can't really ignore synchronous signals

Original issue reported on code.google.com by [email protected] on 26 Jan 2011 at 4:03
