Diagnosed yet another crash today that is likely due to the usage of IoBuildDeviceIoControlRequest. Long ago I was burned by this API and vowed to never use it again, but somehow I neglected to share this with everyone else. Sorry about that!
The trouble with this API is that it’s an attractive nuisance. You need to send an IOCTL synchronously and this API looks like it totally does the trick. So, you end up with something like this and take a nap in the afternoon with all the time you saved from having to build the IRP yourself:
VOID
SendAnIoctl(
    PDEVICE_OBJECT TargetDevice
    )
{
    NTSTATUS status;
    KEVENT event;
    IO_STATUS_BLOCK iosb;
    PIRP irp;
    ULONG nothingOutput;

    // The event must be initialized before the I/O Manager can signal it.
    KeInitializeEvent(&event, NotificationEvent, FALSE);

    irp = IoBuildDeviceIoControlRequest(IOCTL_OSR_NOTHING,
                                        TargetDevice,
                                        NULL,
                                        0,
                                        &nothingOutput,
                                        sizeof(ULONG),
                                        FALSE,
                                        &event,
                                        &iosb);
    if (irp == NULL) {
        goto Exit;
    }

    status = IoCallDriver(TargetDevice, irp);
    if (status == STATUS_PENDING) {
        // Wait for the I/O Manager to signal the event at completion.
        KeWaitForSingleObject(&event, Executive, KernelMode, FALSE, NULL);
        status = iosb.Status;
    }

Exit:
    return;
}
Then you add some code to actually call the function and huck the IOCTL at the target device:
SendAnIoctl(devObj);
The code sort of looks broken because you allocate the IRP but never free it. However, that’s the magic (and trouble) with this API. It is slightly unusual in that it allocates and initializes a threaded IRP. This means the resulting IRP is queued to the current thread and the I/O Manager is responsible for freeing it automatically when the IRP is complete. You can verify that the IRP is queued by checking out the !thread output after the successful call to IoBuildDeviceIoControlRequest:
!thread
THREAD fffffa8019a15b50 ...
IRP List:
    fffff98010b82ee0 <== Hey look, it's my IRP!
This IRP will be freed by the I/O Manager by the use of a Special Kernel APC (SKAPC) for I/O Completion. The I/O Manager will queue the SKAPC to the requesting thread when the IRP is complete, which will result in the requesting thread calling the function IopCompleteRequest. This callback will do all of the final processing to complete the IRP, including dequeueing the IRP from the thread, freeing data buffers, setting the event, etc.
The SKAPC can only run when KeAreAllApcsDisabled returns FALSE, which means the thread is running at IRQL PASSIVE_LEVEL and is not in a guarded region (KeEnterGuardedRegion). If you are at IRQL >= APC_LEVEL or are in a guarded region, the APC will sit in the APC queue and wait to be delivered.
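If you want to test this condition in code, something like the following should do it. This is a minimal sketch that assumes the WDK headers: SendAnIoctlChecked is a hypothetical wrapper name, and the status it returns on failure is an arbitrary choice for illustration.

```c
#include <ntddk.h>

/* The routine shown earlier in this article. */
VOID SendAnIoctl(PDEVICE_OBJECT TargetDevice);

/*
 * Hypothetical wrapper (not part of the original code): refuse to send the
 * threaded IRP when the I/O completion APC could not be delivered.
 */
NTSTATUS
SendAnIoctlChecked(
    PDEVICE_OBJECT TargetDevice
    )
{
    /* TRUE at IRQL >= APC_LEVEL or inside a guarded region. */
    if (KeAreAllApcsDisabled()) {
        NT_ASSERT(FALSE);                    /* make it loud on checked builds */
        return STATUS_INVALID_DEVICE_STATE;  /* arbitrary choice of status */
    }

    SendAnIoctl(TargetDevice);
    return STATUS_SUCCESS;
}
```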
Now, here comes the fun part and the trouble with IoBuildDeviceIoControlRequest: imagine that you call SendAnIoctl at IRQL APC_LEVEL. For example, you might wrap the call to SendAnIoctl with an acquire/release of a FAST_MUTEX:
ExAcquireFastMutex(&mutex);
SendAnIoctl(devObj);
ExReleaseFastMutex(&mutex);
IoBuildDeviceIoControlRequest then allocates a threaded IRP, which you submit to the lower driver using IoCallDriver. The lower driver completes the IRP with STATUS_SUCCESS, so you never wait on the stack-allocated KEVENT. However, because the IRP was threaded, the I/O Manager queued an SKAPC to do the final completion processing on the IRP. Using the !apc command we can see this APC stuck on the queue and waiting for execution:
!apc
*** Enumerating APCs in all processes
Process fffffa8018dfe040 System
    Thread fffffa8019a15b50 ApcStateIndex 0 ApcListHead fffffa8019a15ba0 [KERNEL]
        KAPC @ fffff98010b82f58
          Type           12
          KernelRoutine  fffff80002a9a010 nt!IopCompleteRequest+0
          RundownRoutine fffff80002e69ff0 nt!IopAbortRequest+0
This APC will not run until we’ve returned to PASSIVE_LEVEL. In our case, we won’t return to PASSIVE_LEVEL until after SendAnIoctl has returned and we’ve had a chance to drop the FAST_MUTEX. Unfortunately, this means that the I/O Manager will do final completion processing on an IRP that contains references to local variables in an unwound stack frame (see event and iosb in the SendAnIoctl code).
This causes all sorts of havoc that is difficult to diagnose because the offending code is already unwound. Here’s the crash from the example I looked at today:
 # Call Site
00 nt!KeBugCheckEx
01 nt!KiBugCheckDispatch
02 nt!KiPageFault
03 nt!KiTryUnwaitThread
04 nt!IopCompleteRequest
05 nt!KiDeliverApc
06 nt!KiCheckForKernelApcDelivery
07 nt!MmAccessFault
08 nt!MmCheckCachedPageStates
09 nt!CcMapAndRead
0a nt!CcMapData
0b fastfat!FatReadVolumeFile
0c fastfat!FatMountVolume
0d fastfat!FatCommonFileSystemControl
0e fastfat!FatFsdFileSystemControl
0f fltmgr!FltpFsControlMountVolume
10 fltmgr!FltpFsControl
11 nt!IopMountVolume
Note how IopCompleteRequest tries to wake up a thread (KiTryUnwaitThread) and then quickly dies. The issue is that it’s trying to set a KEVENT that has been unwound from the stack. Oops!
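To make the ordering concrete, here is a small user-mode model of the sequence described above. It is not kernel code and it calls no kernel APIs: ModelThread, set_irql, send_ioctl, and run_scenario are hypothetical names invented for illustration, and the "APC" is just a flag that gets drained when the modeled IRQL drops back to PASSIVE_LEVEL.

```c
#include <assert.h>
#include <string.h>

/* User-mode model only. None of these names are kernel APIs. */
enum { MODEL_PASSIVE_LEVEL = 0, MODEL_APC_LEVEL = 1 };

typedef struct ModelThread {
    int  irql;         /* modeled IRQL of the requesting thread */
    int  apc_pending;  /* 1 while the completion "APC" sits in the queue */
    char log[64];      /* ordered trace of what happened */
} ModelThread;

/* Append one step to the trace, never overflowing the buffer. */
static void trace(ModelThread *t, const char *step)
{
    strncat(t->log, step, sizeof(t->log) - strlen(t->log) - 1);
}

/* The queued completion only runs once the thread is at PASSIVE_LEVEL. */
static void deliver_apcs(ModelThread *t)
{
    if (t->irql == MODEL_PASSIVE_LEVEL && t->apc_pending) {
        t->apc_pending = 0;
        trace(t, "apc;");
    }
}

static void set_irql(ModelThread *t, int irql)
{
    t->irql = irql;
    deliver_apcs(t);   /* lowering the IRQL drains the queue */
}

/* Models SendAnIoctl when the lower driver completes the IRP inline. */
static void send_ioctl(ModelThread *t)
{
    trace(t, "send;");
    t->apc_pending = 1;  /* completion queues the "APC" */
    deliver_apcs(t);     /* runs right away only at PASSIVE_LEVEL */
    trace(t, "return;"); /* the sender's stack frame is gone after this */
}

/* Runs one scenario and returns the trace, e.g. "send;apc;return;". */
const char *run_scenario(ModelThread *t, int start_irql)
{
    memset(t, 0, sizeof(*t));
    set_irql(t, start_irql);          /* e.g. acquiring the FAST_MUTEX */
    send_ioctl(t);
    set_irql(t, MODEL_PASSIVE_LEVEL); /* e.g. releasing the FAST_MUTEX */
    return t->log;
}
```

Starting at the modeled PASSIVE_LEVEL the trace is "send;apc;return;", so completion finishes while the sender's locals are still alive. Starting at the modeled APC_LEVEL the trace becomes "send;return;apc;", which is exactly the case where completion touches locals in a frame that no longer exists.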
There are two solutions to this problem:
- Make sure you never use this API at IRQL >= APC_LEVEL or in a guarded region. You can validate this assumption by calling KeAreAllApcsDisabled and checking that it returns FALSE.
- Allocate the IRP yourself using IoAllocateIrp and send it along with a completion routine. This creates a non-threaded IRP, which doesn’t have the APC problems. If you still want the synchronous behavior, you’ll need to pass an event to the completion routine and wait on it in your mainline code.
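The second option might look roughly like the sketch below. Treat it as an illustration under stated assumptions rather than a drop-in replacement: SendAnIoctlNonThreaded and SendAnIoctlCompletion are hypothetical names, IOCTL_OSR_NOTHING is assumed to be a METHOD_BUFFERED control code (the fallback definition is a placeholder), and the target is assumed to be a device driver that doesn't need Tail.Overlay.Thread set in the IRP.

```c
#include <ntddk.h>

#ifndef IOCTL_OSR_NOTHING
/* Placeholder definition, assumed for illustration only. */
#define IOCTL_OSR_NOTHING \
    CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_BUFFERED, FILE_ANY_ACCESS)
#endif

/* Completion routine: wake the sender and keep ownership of the IRP. */
static
NTSTATUS
SendAnIoctlCompletion(
    PDEVICE_OBJECT DeviceObject,
    PIRP Irp,
    PVOID Context
    )
{
    UNREFERENCED_PARAMETER(DeviceObject);
    UNREFERENCED_PARAMETER(Irp);

    KeSetEvent((PKEVENT)Context, IO_NO_INCREMENT, FALSE);

    /* Stop completion processing here; the sender frees the IRP itself. */
    return STATUS_MORE_PROCESSING_REQUIRED;
}

NTSTATUS
SendAnIoctlNonThreaded(
    PDEVICE_OBJECT TargetDevice
    )
{
    KEVENT event;
    PIRP irp;
    PIO_STACK_LOCATION stack;
    ULONG nothingOutput = 0;
    NTSTATUS status;

    /* Non-threaded IRP: it is never queued to the current thread. */
    irp = IoAllocateIrp(TargetDevice->StackSize, FALSE);
    if (irp == NULL) {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    KeInitializeEvent(&event, NotificationEvent, FALSE);

    /* Describe the request in the stack location the target will see. */
    stack = IoGetNextIrpStackLocation(irp);
    stack->MajorFunction = IRP_MJ_DEVICE_CONTROL;
    stack->Parameters.DeviceIoControl.IoControlCode = IOCTL_OSR_NOTHING;
    stack->Parameters.DeviceIoControl.InputBufferLength = 0;
    stack->Parameters.DeviceIoControl.OutputBufferLength = sizeof(nothingOutput);

    /* Assumes METHOD_BUFFERED: the target uses SystemBuffer for its data. */
    irp->AssociatedIrp.SystemBuffer = &nothingOutput;

    IoSetCompletionRoutine(irp, SendAnIoctlCompletion, &event, TRUE, TRUE, TRUE);

    (VOID)IoCallDriver(TargetDevice, irp);

    /*
     * The completion routine always sets the event, so always wait. This
     * keeps event and nothingOutput alive until the IRP is really done.
     */
    KeWaitForSingleObject(&event, Executive, KernelMode, FALSE, NULL);

    status = irp->IoStatus.Status;
    IoFreeIrp(irp);
    return status;
}
```

Because the completion routine returns STATUS_MORE_PROCESSING_REQUIRED, the I/O Manager never queues the completion APC for this IRP. Nothing touches the stack variables after the wait is satisfied, so this version can be called with the FAST_MUTEX held.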