Git Product home page Git Product logo

Comments (12)

OneXDeveloper avatar OneXDeveloper commented on August 12, 2024

Decided to write up a minimum example for you.

use hudhook::hooks::dx12::ImguiDX12Hooks;
use hudhook::hooks::{ImguiRenderLoop, ImguiRenderLoopFlags};
use hudhook::hudhook;
use hudhook::renderers::imgui_dx12::imgui::{Condition, Ui, Window};
use hudhook::reexports::*;

#[derive(Default)]
struct Example {}

impl ImguiRenderLoop for Example {
    fn render(&mut self, ui: &mut Ui, _flags: &ImguiRenderLoopFlags) {
        Window::new("Example").size([400., 400.], Condition::FirstUseEver).build(ui, || {
            if ui.button("Unhook") {
                hudhook::lifecycle::eject();
                return;
            }
        });
    }
}

hudhook!(Example::default().into_hook::<ImguiDX12Hooks>());

This was able to reproduce it every time I clicked the button for me. I then went and added a file logger so we could see what was going on.

[00:00:00.000] (a38) INFO   logger initialized
[00:00:00.000] (a38) DEBUG  Done init
[00:00:00.000] (a38) TRACE  Rendering started
[00:00:00.000] (a38) ERROR  Null command queue
[00:00:00.000] (a38) TRACE  Invoking IDXGISwapChain3::Present trampoline
[00:00:00.000] (a38) TRACE  ID3D12CommandQueue::ExecuteCommandLists(ID3D12CommandQueue(IUnknown(0x18c90250f60)), 1, 0x57443faef0) invoked
[00:00:00.001] (a38) TRACE  CommandQueue description: D3D12_COMMAND_QUEUE_DESC { Type: D3D12_COMMAND_LIST_TYPE(0), Priority: 0, Flags: D3D12_COMMAND_QUEUE_FLAGS(0), NodeMask: 1 }
[00:00:00.001] (a38) TRACE  cmd_queue ptr was set
[00:00:00.001] (a38) TRACE  Trampoline returned HRESULT(0x00000000)
[00:00:00.013] (a38) TRACE  Entering WndProc 5055c 84 0 2fb11bf
[00:00:00.015] (a38) TRACE  ID3D12CommandQueue::ExecuteCommandLists(ID3D12CommandQueue(IUnknown(0x18c90250a60)), 15, 0x18d323b0c00) invoked
[00:00:00.015] (a38) TRACE  ID3D12CommandQueue::ExecuteCommandLists(ID3D12CommandQueue(IUnknown(0x18c90250f60)), 3, 0x18d3b984f60) invoked
[00:00:00.015] (a38) TRACE  ID3D12CommandQueue::ExecuteCommandLists(ID3D12CommandQueue(IUnknown(0x18c90251960)), 1, 0x18d8ad1bf60) invoked
[00:00:00.015] (a38) TRACE  ID3D12CommandQueue::ExecuteCommandLists(ID3D12CommandQueue(IUnknown(0x18c90250f60)), 5, 0x18d3b984f60) invoked
[00:00:00.015] (a38) TRACE  ID3D12CommandQueue::ExecuteCommandLists(ID3D12CommandQueue(IUnknown(0x18c90250f60)), 10, 0x18d3b984f60) invoked
[00:00:00.016] (a38) TRACE  IDXGISwapChain3::Present(IDXGISwapChain3(IUnknown(0x18c906c9780)), 0, 512) invoked
[00:00:00.016] (a38) TRACE  Rendering started
[00:00:00.037] (a38) TRACE  ID3D12CommandQueue::ExecuteCommandLists(ID3D12CommandQueue(IUnknown(0x18d3464a8e0)), 1, 0x57443fc848) invoked
[00:00:00.041] (a38) TRACE  Display size 3862x2182
[00:00:00.042] (a38) TRACE  ID3D12CommandQueue::ExecuteCommandLists(ID3D12CommandQueue(IUnknown(0x18c90250f60)), 1, 0x57443fca40) invoked
[00:00:00.042] (a38) TRACE  Rendering done
[00:00:00.042] (a38) TRACE  Invoking IDXGISwapChain3::Present trampoline
[00:00:00.042] (a38) TRACE  ID3D12CommandQueue::ExecuteCommandLists(ID3D12CommandQueue(IUnknown(0x18c90250f60)), 1, 0x57443faef0) invoked
[00:00:00.042] (a38) TRACE  Trampoline returned HRESULT(0x00000000)
...
# Near the end
[00:00:07.294] (a38) TRACE  Entering WndProc 5055c 202 0 3620777
[00:00:07.294] (a38) TRACE  Entering WndProc 5055c 215 0 0
[00:00:07.295] (a38) TRACE  Entering WndProc 5055c 84 0 3750779
[00:00:07.297] (a38) TRACE  ID3D12CommandQueue::ExecuteCommandLists(ID3D12CommandQueue(IUnknown(0x18c90250a60)), 15, 0x18d323b0c00) invoked
[00:00:07.297] (a38) TRACE  ID3D12CommandQueue::ExecuteCommandLists(ID3D12CommandQueue(IUnknown(0x18c90250f60)), 3, 0x18d3b984f60) invoked
[00:00:07.297] (a38) TRACE  ID3D12CommandQueue::ExecuteCommandLists(ID3D12CommandQueue(IUnknown(0x18c90251960)), 1, 0x18d8ad1bf60) invoked
[00:00:07.297] (a38) TRACE  ID3D12CommandQueue::ExecuteCommandLists(ID3D12CommandQueue(IUnknown(0x18c90250f60)), 5, 0x18d3b984f60) invoked
[00:00:07.297] (a38) TRACE  ID3D12CommandQueue::ExecuteCommandLists(ID3D12CommandQueue(IUnknown(0x18c90250f60)), 10, 0x18d3b984f60) invoked
[00:00:07.297] (a38) TRACE  IDXGISwapChain3::Present(IDXGISwapChain3(IUnknown(0x18c906c9780)), 0, 512) invoked
[00:00:07.297] (a38) TRACE  Rendering started
[00:00:07.298] (a38) TRACE  Display size 3862x2182
[00:00:07.298] (a38) TRACE  ID3D12CommandQueue::ExecuteCommandLists(ID3D12CommandQueue(IUnknown(0x18c90250f60)), 1, 0x57443fca40) invoked
[00:00:07.298] (a38) TRACE  Rendering done
[00:00:07.298] (a38) TRACE  Invoking IDXGISwapChain3::Present trampoline
[00:00:07.298] (49bc) TRACE  Disabling hooks...
[00:00:07.298] (a38) TRACE  ID3D12CommandQueue::ExecuteCommandLists(ID3D12CommandQueue(IUnknown(0x18c90250f60)), 1, 0x57443faef0) invoked
[00:00:07.298] (49bc) TRACE  Cleaning up renderer...
[00:00:07.298] (a38) TRACE  Trampoline returned HRESULT(0x00000000)

That's all that was in there. Not sure if that's where it hung up or if the logger just stopped logging at that point.

from hudhook.

OneXDeveloper avatar OneXDeveloper commented on August 12, 2024

I was able to either resolve it, or make it extremely rare by making these changes to the unhook function.

  1. Put in a tiny thread sleep after locking the renderer. This allows it time to finish the trampoline.
  2. Dropping the renderer explicitly.
    unsafe fn unhook(&mut self) {
        trace!("Disabling hooks...");
        for hook in [&self.hook_dscp, &self.hook_cqecl, &self.hook_rbuf] {
            if let Err(e) = hook.disable() {
                error!("Couldn't disable hook: {e}");
            }
        }

        trace!("Cleaning up renderer...");
        if let Some(renderer) = IMGUI_RENDERER.take() {
            let mut locked_renderer = renderer.lock();
            locked_renderer.cleanup(None);
            trace!("Waiting for frame to finish before finishing cleanup");
            sleep(Duration::from_millis(25));
            // Drop lock to render
            drop(locked_renderer);
            // Drop renderer itself
            drop(renderer);
        }
        debug_assert!(IMGUI_RENDERER.get().is_none());

        drop(IMGUI_RENDER_LOOP.take());
        debug_assert!(IMGUI_RENDER_LOOP.get().is_none());

        COMMAND_QUEUE_GUARD.take();
        debug_assert!(COMMAND_QUEUE_GUARD.get().is_none());

        DXGI_DEBUG_ENABLED.store(false, Ordering::SeqCst);
        trace!("Unhook completed");
    }

I'm not sure this is the right approach, but it works on my machine ;)

from hudhook.

veeenu avatar veeenu commented on August 12, 2024

Hey! Thank you for your feedback :)

I'll take care of this but next time feel free to submit a PR, it is very welcome!
I think the solution could be a bit more complicated, and definitely abusing sleeps is going to make things worse and harder to debug if anything.
Taking and dropping the global statics is a great idea, as it frees memory for the next time around and runs a lot of the necessary cleanup. I'm surprised we don't already do this in each hook's Drop impl.
What game were you trying to run this against? Was it running full screen by chance? If that's the case, it could very well be that it just stopped gathering windowing events because of the dead wndproc and as a result you can't get to any other application.

from hudhook.

OneXDeveloper avatar OneXDeveloper commented on August 12, 2024

@veeenu

It was a hard lock of the graphics. I also saw errors in the event log in windows saying the nvidia driver was crashing. I was not running it full screen.

I suspect some of the data being freed by the renderer was still being used by the other thread in the trampoline call. We don’t seem to be holding the lock on it the whole time the trampoline is being called. I tried to extend the hold on that lock passed where the trampoline is called but that seemed to have caused other issues. It would not render at all that way and seemed like it deadlocked.

from hudhook.

veeenu avatar veeenu commented on August 12, 2024

It was a hard lock of the graphics. I also saw errors in the event log in windows saying the nvidia driver was crashing. I was not running it full screen.

That sounds like a driver bug. Userland applications shouldn't be able to crash kernel code.

It would not render at all that way and seemed like it deadlocked.

That's the reason why the mutex locks are structured that way. The current solution is the proper way to ensure correct synchronization. The proper solution to the detach misbehaviors should be to use a condvar to wait for the trampoline to finish.

from hudhook.

veeenu avatar veeenu commented on August 12, 2024

@OneXDeveloper would you mind giving this branch a spin?

The logic is that I flip one atomic boolean per hooked function, so that while the function is running the boolean is true. Then, at the time of cleanup, I remove the hooks and wait on those three booleans to become false, signaling that the three hooked functions have exited and I can proceed to cleanup.

Managed to repeatedly unhook and rehook without ever crashing. If it works allright for you, I'll merge that in.

from hudhook.

OneXDeveloper avatar OneXDeveloper commented on August 12, 2024

This looks great! Way more elegant & correct than my sleep solution! I’ll give this a try later today. Thanks a bunch!

from hudhook.

OneXDeveloper avatar OneXDeveloper commented on August 12, 2024

Hey @veeenu, I gave this a try today and sadly it's still having the crash. I am certain that there is something we need to be waiting for that isn't being blocked by any of those locks but I am unfamiliar with DX12 so I do not know what that would be.

What can I do to help?

from hudhook.

veeenu avatar veeenu commented on August 12, 2024

Uhh, not sure. I'd say the usual like making sure you are on the latest drivers. What game are you testing this on? If you take that branch and add a sleep after waiting on the three fences, what happens?

from hudhook.

OneXDeveloper avatar OneXDeveloper commented on August 12, 2024

@veeenu Apologies, I meant help debug the issue. I am on the latest drivers. D2R is the game. Testing in the menu. I did test adding the sleep but in the same place I posted before which still resolves it.

from hudhook.

OneXDeveloper avatar OneXDeveloper commented on August 12, 2024

@veeenu I tested this and it was no longer locking up.

CQECL_RUNNING.wait();
PRESENT_RUNNING.wait();
RBUF_RUNNING.wait();
std::thread::sleep(Duration::from_millis(25));

trace!("Cleaning up renderer...");
if let Some(renderer) = IMGUI_RENDERER.take() {
    let mut renderer = renderer.lock();
    renderer.cleanup(None);
}

Which I think rules out any issues with the WNDPROC.

I also tested lowering the sleep down to 1ms but that still locked up. I'm guessing when dropping the renderer we are dropping other resources that are still needed for the GPU to process our current frame of imgui. My uneducated guess would be the FrameResources or something else inside the RenderEngine. I suspect we probably need some sort of fence in there for when the GPU is done rendering our frame but that is likely beyond my ability to troubleshoot.

from hudhook.

veeenu avatar veeenu commented on August 12, 2024

I think you're right, most likely this could be solved by using DX12 synchronization primitives like Fences, but I'm not expert enough on those yet. So I want to say adding a delay of at least 1 frame at 30fps (34ms) should be a safe solution for now.

from hudhook.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.