Hi Peter,
This is James from HPE, we recently found a stack override problem on the libusb-win32 driver.
https://sourceforge.net/projects/libusb-win32/
Please review below for details and let us know your opinion and suggestions.
It would be great if you can help us to fix this issue.
I have sent you an email around 8/5.
The problem is the irp->userIosb ptr member which points to local io_status which is allocated locally from the stack.
The kernel updates the 2 qword io_status which is originally on the stack of call_usbd_ex but sometimes the kernel stomps the stack of another routine if call_usbd_ex has already returned when the update happens.
Here’s a lot more detail from libusb_driver.c:
NTSTATUS call_usbd_ex(libusb_device_t *dev, void *urb, ULONG control_code, <-- see note1
int timeout, int max_timeout)
{
KEVENT event;
NTSTATUS status;
IRP *irp;
IO_STACK_LOCATION *next_irp_stack;
LARGE_INTEGER _timeout;
IO_STATUS_BLOCK io_status; <-- see note 2
if (max_timeout > 0 && timeout > max_timeout)
{
timeout = max_timeout;
}
if (timeout <= 0)
timeout = LIBUSB_MAX_CONTROL_TRANSFER_TIMEOUT;
KeInitializeEvent(&event, NotificationEvent, FALSE);
irp = IoBuildDeviceIoControlRequest(control_code, dev->target_device, \
NULL, 0, NULL, 0, TRUE, > see note 3
NULL, &io_status); /
if (!irp)
{
return STATUS_NO_MEMORY;
}
next_irp_stack = IoGetNextIrpStackLocation(irp);
next_irp_stack->Parameters.Others.Argument1 = urb;
next_irp_stack->Parameters.Others.Argument2 = NULL;
IoSetCompletionRoutine(irp, on_usbd_complete, &event, TRUE, TRUE, TRUE);
status = IoCallDriver(dev->target_device, irp); \_ returns STATUS_PENDING so need to wait for irp completion
if(status == STATUS_PENDING) /
{
_timeout.QuadPart = -(timeout * 10000);
if(KeWaitForSingleObject(&event, Executive, KernelMode, \_ wait times-out after specified 500ms -- see note 4
FALSE, &_timeout) == STATUS_TIMEOUT) /
{
USBERR0("request timed out\n");
IoCancelIrp(irp); <-- attempt to cancel the irp because of the time-out
}
}
/* wait until completion routine is called */
KeWaitForSingleObject(&event, Executive, KernelMode, FALSE, NULL); <-- wait (with infinite time-out) for completion
status = irp->IoStatus.Status; <-- see note 5
/* complete the request */
IoCompleteRequest(irp, IO_NO_INCREMENT); <-- see note 6
USBDBG("status = %08Xh\n",status);
return status; <-- see note 7
}
Note 1:
The call_usbd_ex() routine executes all the time but precisely every 60sec there is a unique usb request that always times-out after 500ms and fails:
2: kd> dt _URB ffff868d`0be18570 UrbControlVendorClassRequest.Value
USBPORT!_URB
+0x000 UrbControlVendorClassRequest :
+0x082 Value : 0x3fd <-- 0x3fd = 1021 is a unique usb vendor request code that always times-out
Eaton should be able to tell us what this 0x3fd (1021) request is, why it gets sent every 60sec, and what might cause it to always time-out.
We should check a “working” system to see if this request is being sent and if it always time-out and fails.
If this request didn’t get sent or if it didn’t time-out and fail then we would not have a problem.
Note 2:
The 2 qword IO_STATUS_BLOCK io_status is allocated as a local on the stack of the current routine:
2: kd> dt IO_STATUS_BLOCK Status Information
libusb0!IO_STATUS_BLOCK
+0x000 Status : Int4B <-- low dword of first qword will be updated with ntstatus 0xc0000120 (The I/O request was canceled)
+0x008 Information : Uint8B <-- second qword will be updated with the byte count of the transfer which will be 0x00000000’00000000
This is what gets updated or stomped on the stack and the results are:
- If the writes happen before this routine returns then this is just a normal status block update
- If the writes happen after this routine returns then this stomps the stack possibly of some other routine and timing / stack usage determines if it crashes
Note 3:
The irp is created here. Notice that the &io_status ptr is passed as an arg and it becomes the irp member UserIosb ptr:
2: kd> dt nt!_IRP IoStatus UserIosb
+0x030 IoStatus : _IO_STATUS_BLOCK
+0x048 UserIosb : Ptr64 _IO_STATUS_BLOCK <-- ptr to local io_status allocated on this routine’s stack
The irp that’s created has the following flags:
2: kd> dt nt!_IRP 0xffffe486`c1d56990 Flags
+0x010 Flags : 0x40060000 <-- 0x40000000 => FILE_FLAG_OVERLAPPED => it can be asynchronous (or synchronous)
When the kernel is later called to complete the request, it copies IoStatus into the ptr contents of *UserIosb (ie. io_status) which is the final status info for the irp:
- This is what can stomp the stack of another routine depending upon when it happens
- ie. it might be a synchronous or asynchronous completion
- if it is async then the stack will get stomped later but there will only be a crash if another routine is actively using those stack addrs at that time
Either of the following will work-around the problem:
- If io_status is defined as a global (instead of as a local on the stack) then the stack will not get stomped
- If io_status is replaced with a NULL then the kernel will not update user status (*UserIosb) and the stack will not get stomped
o Note: I don’t think it is documented that this ptr can be a NULL ptr but I verified that this works (using kd to hack the assembly code)
Note 4:
This times-out after a specified 500ms wait/time-out (ie. the request/irp event does not get signaled within 500ms).
If I hack the assembly with kd to make it an infinite wait/time-out then it works:
- This means that this usb request (ie. the specific one that happens every 60sec) is taking far too long
- Eaton should be able to tell us what this vendor _URB request (value 0x3fd / 1021) is and why it takes so long
- We should also check a “working” system to see if the same request is being sent every 60sec and if it also times-out
- ie. if this request is not being sent and/or not timing-out (or we can stop it) then the system will not fail
Note 5:
The “status = irp->IoStatus.Status” should really be “status = irp->UserIosb->Status” (where the final status is copied into) but this doesn’t cause a problem.
It does mean that irp->UserIosb is not being used here.
Note 6:
This call (IoCompleteRequest) tells the kernel to finish and do clean-up of the irp. This includes the kernel executing nt!IopCompleteRequest to copy IoStatus into *UserIosb:
- this can happen synchronously by direct calls (w/o an APC int) as shown in one of the callstacks earlier in this email
- this can also happen asynchronously sometime later via an APC int as shown in another of the callstacks earlier in this email
Note 7:
The call_usbd_ex routine returns here and the locals on its stack should not be accessed from this point onwards:
- if the irp was handled synchronously then io_status on the current stack was previously written and will not be accessed again => no problem
- if the irp is being handled asynchronously then io_status on the stack might be written sometime after the return (on some other routine’s stack):
o if the APC int and the write of io_status happens before the penter() check then the stack is stomped w/o problems because the addr is not yet in use
o if the APC int and the write of io_status happens after the penter() check then the stack is stomped w/o problems if the addr is not currently used
o if the APC int and the write of io_status happens near the penter() check then the args being passed up the stack during that time window get stomped
this last case is where we see 1 or 2 stomped args on the stack causing a penter trap or crash
ie. stack addr of old qword io_status.Information stomped w/ 0 and old dword io_status.Status stomped w/ ntstatus 0xc0000120
Best Regards,
James