Git Product home page Git Product logo

Comments (35)

kyren avatar kyren commented on July 19, 2024 1

This should definitely be fixed now, in fact it was fixed with 0.10.

from rlua.

jonas-schievink avatar jonas-schievink commented on July 19, 2024

There's no need to use drop. In fact, that's impossible because the userdata can't be accessed by-value inside drop. I can just use an external object holding the only reference to the userdata to resurrect it:

extern crate rlua;

use rlua::*;

struct Userdata {
    id: u8,
}

impl Drop for Userdata {
    fn drop(&mut self) {
        println!("dropping {}", self.id);
    }
}

impl LuaUserDataType for Userdata {
    fn add_methods(methods: &mut LuaUserDataMethods<Self>) {
        methods.add_method("access", |lua, this, args| {
            println!("accessing userdata {}", this.id);
            lua.pack(())
        });
    }
}

fn main() {
    let lua = Lua::new();
    {
        let globals = lua.globals();
        globals.set("userdata", Userdata { id: 123 }).unwrap();
    }

    lua.eval::<()>(r#"
        local tbl = setmetatable({
            userdata = userdata
        }, { __gc = function(self)
            -- resurrect userdata
            hatch = self.userdata
        end })

        print("userdata = ", userdata)
        print("hatch = ", hatch)
        print "collecting..."
        tbl = nil
        userdata = nil  -- make table and userdata collectable
        collectgarbage("collect")
        print("userdata = ", userdata)
        print("hatch = ", hatch)
        hatch:access()
    "#, None).unwrap();
}

This demonstrates not a double-drop, but a use-after-free in hatch:access():

userdata = 	userdata: 0x7f68e0c306a8
hatch = 	nil
collecting...
dropping 123
userdata = 	nil
hatch = 	userdata: 0x7f68e0c306a8
accessing userdata 32

from rlua.

kyren avatar kyren commented on July 19, 2024

So, I apparently had a bad understanding of how __gc metamethods worked from lua circa 5.1, because I was unaware that somewhere between 5.1 and 5.3, tables got the ability to have __gc methods that could be assigned by scripts.

Apparently, ANY userdata may be resurrected, and may have its __gc methods called twice, correct? I now need to.. I guess store all userdata in a form where it can be nulled, and panic on access if it is nulled, and allow it to be nulled more than once safely. Something like, I suppose storing Option, and simply setting it to None on gc instead of dropping it, then I can check to make sure it is not None before returning a pointer.

I can do that, but what it means is it may be.. very hard or impossible to prevent a lua script from forcing a panic. I suppose practically that is not a HUGE deal, since preventing a lua script from causing a DOS attack, even with execution limits, is probably already extremely extremely difficult.

from rlua.

kyren avatar kyren commented on July 19, 2024

I believe this may be fixed by 36134e6. I don't necessarily like the fact that some userdata now may have multiple levels of Option, and may be able to be collapsed somewhat.

from rlua.

kyren avatar kyren commented on July 19, 2024

Actually no, there's something I didn't think of. Since Lua 5.3 allows __gc methods on lua tables, that means that any function that may trigger a gc metamethod may longjmp, which makes my entire longjmp story unsound. I was relying on only methods marked as 'e' in the reference manual causing longjmp, but this means that any method marked 'm' may ALSO cause longjmp, which is more or less everything. It's noted in the reference manual that 'm' methods may trigger a __gc metamethod, but I was relying on only userdata having __gc methods, and not being defined in scripts.

This may mean that it is technically impossible to write sound lua bindings.

Okay, so probably just extremely hard not impossible. I think the sanest way to do it would be to, as your first step, write an entire C API that guarantees never to longjmp past the caller.

from rlua.

kyren avatar kyren commented on July 19, 2024

If this is by design, it would be helpful to document that. In general, rluas internals are barely documented which makes it hard to understand and debug stuff like #18.

I completely agree that it's not well documented. It's not well documented because it has until this point been in dramatic flux. When we finally (FINALLY) stop finding fundamental flaws in my view of the universe, I may rewrite it to be consistent and actually document it.

This is not the first time that I have discovered that a lua bindings system I work on is fundamentally unsound, and it probably won't be the last. Does ANYBODY get this right, if it's even possible to GET right? I'm legitimately considering writing my own 5.3 compatible interpreter at this point, it would be faster than wrapping everything in the universe to make sure it doesn't longjmp.

from rlua.

jonas-schievink avatar jonas-schievink commented on July 19, 2024

I did briefly consider a more low-level wrapper around Lua's C API, which would handle this and abstract away some Lua version differences, too. But I don't yet know how such a library would have to look to be useful, reasonably fast but still safe.

I'm legitimately considering writing my own 5.3 compatible interpreter at this point

I was interested in this for other reasons (and found a way to implement a pretty slow GC in safe Rust), but it is unlikely that such an implementation wouldn't have some bugs, too. Maybe no safety issues, but bugs in the language itself, which isn't really pleasant to debug, either.

This library is at least the third attempt at safe Lua bindings for Rust, at some point it has to work, even if just because of the infinite monkey theorem ;)

from rlua.

kyren avatar kyren commented on July 19, 2024

Sometimes I think if I do end up writing something that works, it is probably only due to the infinite monkey theorem.

from rlua.

jonas-schievink avatar jonas-schievink commented on July 19, 2024

A reasonably easy way to "fix" the remaining issue would be to replace setmetatable with a wrapper that puts __gc behind a pcall. Should the pcall fail, we can either just abort the program (not good in the presence of user-written code) or call a Rust handler that can log the error, set a flag, etc.

from rlua.

kyren avatar kyren commented on July 19, 2024

That's not a bad idea!

from rlua.

kyren avatar kyren commented on July 19, 2024

The documentation for setmetatable says:

Sets the metatable for the given table. (To change the metatable of other types from Lua code, you must use the debug library (§6.10).) If metatable is nil, removes the metatable of the given table. If the original metatable has a __metatable field, raises an error.

But, if you go look at the documentation for the debug library, it says, for debug.setmetatable:

Sets the metatable for the given value to the given table (which can be nil). Returns value.

So, I don't know if debug.setmetatable can be used to set the metatable on something other than a table, but if it can, that is also a soundness hole.

from rlua.

kyren avatar kyren commented on July 19, 2024

I could wrap both setmetatable and debug.setmetatable, I could only wrap setmetatable, or I could wrap setmetatable and not load the debug library. I feel like other things in the debug library could be used to cause unsoundness as well.. somehow. I mean, you can do things like set hooks with things that cause 'error' or replace arbitrary upvalues. I know it's drastic but I'm leaning towards removing the debug library somehow.

from rlua.

jonas-schievink avatar jonas-schievink commented on July 19, 2024

I think you can ignore the debug library, it can be used to crash any Lua program and manipulates VM state. Just disable it when running untrusted code.

from rlua.

kyren avatar kyren commented on July 19, 2024

Okay, I took kind of a heavyhanded approach. Currently, with the patch for handling resurrected userdata, Lua CAN cause aborts by simply resurrecting a garbage collected userdata and then attempting to call a method on it. Because the cat's already out of the bag there, 9c34d4b simply wraps user provided __gc methods with an abort on error. Also, for now, I simply disabled the debug library, which I realized is quite heavy handed, but I want to address that when I address configurable startup environments. I will document all this in the README when 0.9 is released.

from rlua.

kyren avatar kyren commented on July 19, 2024

Also, I believe (though am not certain) that there is one actual soundness issue left (I'm not counting the theoretical "all longjmps are unsafe" issue as a soundness issue).

That issue is that, IF the system allocator returns null, I believe lua longjmps with an OOM error, and that causes the same issues as __gc obviously. I think.. the only time most system allocators return null is when provided an un-allocatable size of memory in the first place, and most modern environments of course just overcommit memory, but I'm sure not all platforms where this may build and run will be like that. I believe the solution is to provide a custom allocator to lua that aborts on OOM, but it would currently require nightly to use the rust allocator? I'm not sure on that actually.

from rlua.

jonas-schievink avatar jonas-schievink commented on July 19, 2024

That's easy to fix: Use lua_newstate instead of luaL_newstate and pass our own allocation function that aborts when allocation fails.

from rlua.

kyren avatar kyren commented on July 19, 2024

That's easy to fix: Use lua_newstate instead of luaL_newstate and pass our own allocation function that aborts when allocation fails.

Yeah, I knew that part, but what do you use to allocate? Are the rust raw allocators available in stable, or.. I guess you can use Box to allocate a char array or something?

from rlua.

jonas-schievink avatar jonas-schievink commented on July 19, 2024

Perhaps libc::malloc and friends? That'd mean a new dependency, but libc isn't that bad to have.

from rlua.

kyren avatar kyren commented on July 19, 2024

Right, so don't use the rust allocator, directly use malloc, but abort instead of returning nullptr. That would work, but I'd much rather use the rust allocator once it's available. I don't really mind a dependency on libc though, everything depends on libc, so I'll do that in the meantime.

from rlua.

kyren avatar kyren commented on July 19, 2024

Okay, I believe 2553623 fixes the OOM unsoundness, and that is the last tangential issue that this raised. I think I'm going to release 0.9 now.

from rlua.

jonas-schievink avatar jonas-schievink commented on July 19, 2024

Looks great! I'm really looking forward to the release, a lot of new stuff, lots of fixes, much more documentation. Maybe announce this release on the users forum or the subreddit?

from rlua.

kyren avatar kyren commented on July 19, 2024

Yeah, me too. I reallllly could not have done this without your help though, thank you so much!

from rlua.

jonas-schievink avatar jonas-schievink commented on July 19, 2024

No problem! I think it's about time Rust gets a safe and usable Lua wrapper.

from rlua.

psychon avatar psychon commented on July 19, 2024

How about just unsetting the metatable in __gc? See e.g. awesomeWM/awesome@1408e8b
Implementing this seems trivial, but I am not quite sure about which other code is affected (e.g. made redundant) by this:

diff --git a/src/util.rs b/src/util.rs
index 41c76fc..dc68574 100644
--- a/src/util.rs
+++ b/src/util.rs
@@ -335,6 +335,8 @@ pub unsafe fn get_userdata<T>(state: *mut ffi::lua_State, index: c_int) -> *mut
 pub unsafe extern "C" fn userdata_destructor<T>(state: *mut ffi::lua_State) -> c_int {
     match catch_unwind(|| {
         *(ffi::lua_touserdata(state, 1) as *mut Option<T>) = None;
+        ffi::lua_newtable(state);
+        ffi::lua_setmetatable(state, -2);
         0
     }) {
         Ok(r) => r,

from rlua.

jonas-schievink avatar jonas-schievink commented on July 19, 2024

@psychon Looks like that could simplify quite a bit of code - it would remove the need for the Option wrapping every userdata, and it remove one of the bullet points in #38

from rlua.

psychon avatar psychon commented on July 19, 2024

For bonus points: Instead of unsetting the metatable, have a special metatable that marks objects as "this was already finalized" so that you end up with better error messages. (Yup, I was always too lazy to implement this)

from rlua.

kyren avatar kyren commented on July 19, 2024

I haven’t looked into it in detail yet, but I’m gonna reopen this in light of the comments on #38.

from rlua.

kyren avatar kyren commented on July 19, 2024

It might be sound again when 0.10 is released..

..I hope.

from rlua.

jonas-schievink avatar jonas-schievink commented on July 19, 2024

@kyren Feel free to close, I've kept this open so #19 (comment) doesn't get lost, which seems like it could simplify rluas internals a bit, but I can open another issue for that.

from rlua.

kyren avatar kyren commented on July 19, 2024
diff --git a/src/util.rs b/src/util.rs
index 41c76fc..dc68574 100644
--- a/src/util.rs
+++ b/src/util.rs
@@ -335,6 +335,8 @@ pub unsafe fn get_userdata<T>(state: *mut ffi::lua_State, index: c_int) -> *mut
 pub unsafe extern "C" fn userdata_destructor<T>(state: *mut ffi::lua_State) -> c_int {
     match catch_unwind(|| {
         *(ffi::lua_touserdata(state, 1) as *mut Option<T>) = None;
+        ffi::lua_newtable(state);
+        ffi::lua_setmetatable(state, -2);
         0
     }) {
         Ok(r) => r,

So, to summarize the suggested change, I think IF AND ONLY IF we wrap the calls to newtable and setmetatable in a gc_guard, then it would be sound to eliminate the Option from userdata? If we don't wrap in a gc_guard, then the code can longjmp away and we can have invalidated userdata with a __gc method, correct?

If so, I'm happy to make that change. Though, eventually something will have to be done about relying on gc_guard, because eventually we want to support memory limits on scripts, which will let allocation as well as garbage collection error, but I think I have a plan of attack there.

from rlua.

kyren avatar kyren commented on July 19, 2024

I actually now believe that this change solves a bug. First, a question..

If you have a Lua table with a finalizer, and a Lua userdata with a finalizer, and the Lua table's finalizer accesses the Lua userdata, is the Lua table finalizer guaranteed to run before the Lua userdata finalizer if both objects are garbage collected at the same time? I'm guessing that it's not.

If that's the case, there are several panics in util.rs related to WrappedError that I believe can all be triggered from Lua. Since removing the metatable from the userdata also removes the identifying type information, this cleanly solves that problem.

from rlua.

jonas-schievink avatar jonas-schievink commented on July 19, 2024

IF AND ONLY IF we wrap the calls to newtable and setmetatable in a gc_guard, then it would be sound to eliminate the Option from userdata?

That sounds correct, I think.

If you have a Lua table with a finalizer, and a Lua userdata with a finalizer, and the Lua table's finalizer accesses the Lua userdata, is the Lua table finalizer guaranteed to run before the Lua userdata finalizer if both objects are garbage collected at the same time?

Nope, don't think so. I think it currently depends on the order in which the objects are created, but Lua doesn't make any guarantees.

from rlua.

kyren avatar kyren commented on July 19, 2024

Okay, hopefully 77eb73a correctly implements that change.

from rlua.

kyren avatar kyren commented on July 19, 2024

NOPE, it's all wrong, I used mem::replace instead of ptr::write, give me a moment.

Edit: okay, 6382baa hopefully is correct.

from rlua.

kyren avatar kyren commented on July 19, 2024

I'm going to go ahead and close this, since both the original soundness issue was fixed and also the proposed simplification was implemented.

from rlua.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.