Last reviewed and updated: 10 August 2020
Analyzing user mode state from the kernel debugger appears to have become something of a black art. Some people swear it can’t be done, others swear that it can’t be done reliably, and a small few claim that they do it all the time without any problems. I’m here to say that, yes, it can be done, and to help grow that small few into the vast majority.
When it comes down to it, there are only three things that you need to understand in order to properly work with user mode state from a kernel debug connection. So, let’s explore each of these.
The Virtual Address Space in Windows
Windows maintains two different virtual address spaces, the user virtual address space and the kernel virtual address space. In a standard x86 installation, this division results in the low 2GB of virtual memory being given to the current user process and the high 2GB being the kernel virtual address space.
The lower portion of the address space changes depending on the process that owns the thread currently executing on the processor. The higher portion of the address space, however, is the same across all process contexts. Thus, the lower portion of the address space is process context specific whereas the higher portion is process context independent.
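As a rough illustration of that split, here is a minimal Python sketch that classifies a 32-bit address as process specific or process independent. It assumes the standard 2GB/2GB x86 layout described above and follows this article's simplified model; the function name and boundary constant are made up for the example, and the two sample addresses are taken from debugger output shown later in this article.

```python
# Assumption: standard 32-bit x86 layout with a 2 GB / 2 GB split.
USER_KERNEL_BOUNDARY = 0x80000000  # first address of the kernel half


def is_process_specific(address: int) -> bool:
    """Return True if the address lies in the per-process (user) half."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    return address < USER_KERNEL_BOUNDARY


if __name__ == "__main__":
    # 7c90d090 is in ntdll (user), 8052b5dc is in nt (kernel).
    for addr in (0x7C90D090, 0x8052B5DC):
        if is_process_specific(addr):
            kind = "process specific (user)"
        else:
            kind = "process independent (kernel)"
        print(f"{addr:08x}: {kind}")
```

Any address the sketch reports as process specific only means something once you know which process context is being used to interpret it.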
WinDBG and Process Context
Understanding the virtual address space in Windows is a critical point to any analyst who wants to inspect user mode state. What one has to realize is that the debugger can only use one process context at a time to translate virtual addresses. This means that if you want to inspect user state you must make sure that you have instructed WinDBG to use the correct process context for that state. Failure to do so will lead to access errors or, even worse, incorrect or misleading information being returned.
Also worth noting at this point is that the .thread command does not change process context by default, so simply switching to a different thread context is not sufficient to change your process context.
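To make the "one process context at a time" point concrete, the following toy Python model resolves user addresses through whichever process context is currently selected. It is only a sketch of the idea, not of how WinDBG is implemented: the class name, the page contents, and the svchost mapping are all hypothetical.

```python
PAGE_SIZE = 0x1000
KERNEL_BASE = 0x80000000  # assumes the standard x86 2 GB / 2 GB split


class ToyTranslator:
    """Resolves addresses using exactly one process context at a time."""

    def __init__(self, kernel_pages, user_pages_by_process, process):
        self.kernel_pages = kernel_pages
        self.user_pages_by_process = user_pages_by_process
        self.process = process  # the currently selected process context

    def read(self, address):
        page = address & ~(PAGE_SIZE - 1)
        if address >= KERNEL_BASE:
            table = self.kernel_pages  # same for every process context
        else:
            table = self.user_pages_by_process[self.process]
        if page not in table:
            raise LookupError(
                f"{address:08x} not mapped in context {self.process}"
            )
        return table[page]


# Hypothetical contents, invented purely for illustration.
kernel = {0x8052B000: "nt code"}
user = {
    "Idle": {},
    "notepad.exe": {0x01000000: "notepad image"},
    "svchost.exe": {0x01000000: "svchost image"},
}
```

With the Idle context selected, reading 0x01000000 fails outright (the access error case). With the svchost.exe context selected while you believe you are looking at Notepad, the same read succeeds but returns the wrong data (the misleading case). Kernel addresses resolve identically in every context.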
WinDBG and the User Mode Loaded Module List
The user mode loaded module list is the final piece needed to work with user mode state in WinDBG. Unlike in kernel mode, where there is a single loaded module list that WinDBG keeps track of, WinDBG does not keep track of the user module list for each process. Instead, WinDBG keeps a single list that represents the user module list at the time of the last .reload. What this means for you is that any time you begin working with a new user mode state, you want to make sure you refresh the user module list so that it matches the process you are analyzing.
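The caching behavior can be sketched with a small Python model. This is an illustrative toy, assuming nothing about WinDBG internals: the class and method names are invented, and the per-process module lists reuse only the module names that appear in the lmu output elsewhere in this article.

```python
# Hypothetical per-process module lists (names borrowed from the article).
TARGET_MODULES = {
    "Idle": [],
    "notepad.exe": ["notepad", "UxTheme", "ShimEng", "ntdll"],
    "VMwareUser.exe": ["VMwareUser", "sigc_2_0", "uxtheme", "NETAPI32"],
}


class ToyModuleList:
    """Keeps ONE cached user module list, refreshed only on reload."""

    def __init__(self, process="Idle"):
        self.process = process
        # Snapshot taken at the time of the initial reload.
        self._cached = list(TARGET_MODULES[process])

    def switch_process(self, process):
        # Changing process context does not touch the cached list.
        self.process = process

    def reload_user(self):
        # Models the effect of .reload /user in the current context.
        self._cached = list(TARGET_MODULES[self.process])

    def lmu(self):
        return list(self._cached)
```

Switching the model to notepad.exe and calling lmu() still returns the empty Idle list until reload_user() is called, which is the stale-list situation to watch out for.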
Inspecting User State in Practice
Now that we have the foundation, let’s put the pieces together and see some practical examples. I’ll start off by breaking into an idle system from a live kernel debug session and inspecting the current process context:
0: kd> !process -1 0
PROCESS 8055c0c0  SessionId: none  Cid: 0000    Peb: 00000000  ParentCid: 0000
    DirBase: 00319000  ObjectTable: e1002e40  HandleCount: 253.
    Image: Idle
It’s the Idle process, which isn’t much of a shock. The Idle process is interesting in that it’s one of the two system processes, which are processes with no user mode state. After a .reload we can inspect the user module list with lmu:
0: kd> lmu
start    end        module name
Note that there are no modules on the user mode loaded module list, which makes sense considering the fact that this is a system process. However, in the !process 0 0 output I see an instance of Notepad and I really want to set a breakpoint in that process:
PROCESS 863c22f0  SessionId: 0  Cid: 00bc    Peb: 7ffdb000  ParentCid: 05fc
    DirBase: 06c602c0  ObjectTable: e16718e8  HandleCount: 29.
    Image: notepad.exe
In order to do that, I need to use WinDBG’s .process command to switch to the Notepad process context. In a live debug session we also want to specify the /i switch so that the process context is switched invasively. This requires that we resume the target machine, after which the target will break into the debugger in the correct process context:
0: kd> .process /i 863c22f0
You need to continue execution (press 'g' <enter>) for the context
to be switched. When the debugger breaks in again, you will be in
the new process context.
0: kd> g
Break instruction exception - code 80000003 (first chance)
nt!RtlpBreakWithStatusInstruction:
8052b5dc int     3
From here, we should be able to inspect the current process and see that we’re in the Notepad process:
1: kd> !process -1 0
PROCESS 863c22f0  SessionId: 0  Cid: 00bc    Peb: 7ffdb000  ParentCid: 05fc
    DirBase: 06c602c0  ObjectTable: e16718e8  HandleCount: 29.
    Image: notepad.exe
However, we still do not have any user modules on our loaded module list!
1: kd> lmu
start    end        module name
Remember, WinDBG caches the user module list from the last .reload, so we’re still using the original (empty) loaded module list from the Idle process. To get WinDBG to refresh the user loaded module list, we need to perform a .reload again. We can save a bit of time here by instructing WinDBG to reload only the user module list with .reload /user:
1: kd> .reload /user
Loading User Symbols
.......................
Now we can actually see some results when inspecting the user module list:
1: kd> lmu
start    end        module name
01000000 01014000   notepad    (deferred)
5ad70000 5ada8000   UxTheme    (deferred)
5cb70000 5cb96000   ShimEng    (deferred)
…
7c900000 7c9af000   ntdll      (pdb symbols)
...
At this point in the analysis, we are free to inspect user mode state or set breakpoints in user mode routines. However, be aware that setting a breakpoint in a DLL mapped into multiple processes will result in the breakpoint being set in all of those processes. Writes from the kernel mode debugger are not subject to copy-on-write, thus setting a breakpoint with bp will put an int 3 instruction in the shared physical page. You can see the results of this here:
0: kd> !process -1 0
PROCESS 863c22f0  SessionId: 0  Cid: 00bc    Peb: 7ffdb000  ParentCid: 05fc
    DirBase: 06c602c0  ObjectTable: e16718e8  HandleCount: 29.
    Image: notepad.exe
0: kd> bp ntdll!ntcreatefile
0: kd> g
Breakpoint 0 hit
ntdll!ZwCreateFile:
001b:7c90d090 mov     eax,25h
0: kd> !process -1 0
PROCESS 8612abe0  SessionId: 0  Cid: 0430    Peb: 7ffdc000  ParentCid: 02a4
    DirBase: 06c60160  ObjectTable: e15d5858  HandleCount: 1115.
    Image: svchost.exe
A process specific breakpoint can be your savior here though:
0: kd> bp /p @$proc ntdll!ntcreatefile
Though the breakpoint will still be set in all processes sharing the page, the process specific breakpoint will cause WinDBG to only break if the breakpoint is hit by the specified process. Here we use the $proc pseudo register, which always maps to the current process.
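The filtering decision described here can be sketched in a few lines of Python. This models only the "should the debugger stop?" question under the behavior described above, not how WinDBG implements it; the Breakpoint type and function name are hypothetical, while the addresses are taken from the output earlier in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Breakpoint:
    address: int
    # EPROCESS address for a "bp /p" breakpoint, None for a plain "bp".
    process: Optional[int] = None


def debugger_stops(bp: Breakpoint, hit_address: int, current_process: int) -> bool:
    """Decide whether an int 3 hit at hit_address should break in."""
    if hit_address != bp.address:
        return False
    # The int 3 lives in a shared page, so every process can hit it.
    # A process specific breakpoint only stops for the matching process.
    return bp.process is None or bp.process == current_process


NOTEPAD = 0x863C22F0    # EPROCESS of notepad.exe in the article's output
SVCHOST = 0x8612ABE0    # EPROCESS of svchost.exe in the article's output
ZW_CREATE_FILE = 0x7C90D090

plain = Breakpoint(ZW_CREATE_FILE)
scoped = Breakpoint(ZW_CREATE_FILE, process=NOTEPAD)
```

With the plain breakpoint, a hit in svchost.exe stops the debugger, which is exactly the surprise shown above. With the scoped breakpoint, the same hit is ignored and only Notepad's call breaks in.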
Note what happens now if we become interested in a different process, say VMwareUser.exe:
PROCESS 86182878  SessionId: 0  Cid: 06e0    Peb: 7ffde000  ParentCid: 05fc
    DirBase: 06c60220  ObjectTable: e16091e0  HandleCount: 87.
    Image: VMwareUser.exe
We switch process context just as we did above and then check the user module list:
0: kd> .process /i 86182878
You need to continue execution (press 'g' <enter>) for the context
to be switched. When the debugger breaks in again, you will be in
the new process context.
0: kd> g
Break instruction exception - code 80000003 (first chance)
nt!RtlpBreakWithStatusInstruction:
8052b5dc int     3
0: kd> !process -1 0
PROCESS 86182878  SessionId: 0  Cid: 06e0    Peb: 7ffde000  ParentCid: 05fc
    DirBase: 06c60220  ObjectTable: e16091e0  HandleCount: 87.
    Image: VMwareUser.exe
0: kd> lmu
start    end        module name
01000000 01014000   notepad    (deferred)
5ad70000 5ada8000   UxTheme    (deferred)
5cb70000 5cb96000   ShimEng    (deferred)
...
Note how it looks like Notepad is mapped into the VMwareUser.exe process. Clearly this is bogus; WinDBG is simply using the cached user module list from the last .reload performed. Because our analysis has brought us to a new user process, we again need to perform a .reload /user to have our module list updated:
0: kd> .reload /user
Loading User Symbols
..................................
0: kd> lmu
start    end        module name
00400000 00537000   VMwareUser (deferred)
10000000 10010000   sigc_2_0   (deferred)
5ad70000 5ada8000   uxtheme    (deferred)
5b860000 5b8b5000   NETAPI32   (deferred)
...
What About Crash Dumps?
If you try to perform a .process /i command from within a crash dump, you’ll be greeted with an error:
0: kd> .process /i 898c9020
This operation only works on live kernel debug sessions because the invasive switch requires that code actually execute on the target machine. Luckily, there is a way to force WinDBG to internally switch to a different process context without changing the state of the target. For that, we’ll use .process with the /r and /p switches. In addition to getting us into the correct process context, this will force a reload of the user symbol list:
0: kd> .process /r /p 86182878
Implicit process is now 86182878
.cache forcedecodeuser done
Loading User Symbols
..................................
Additionally, .thread also takes /r and /p switches to automatically switch the debugger to the correct process context for a particular thread. This is extremely helpful if you’re moving around a full memory dump and would like to automatically have your process context set for each thread you inspect:
0: kd> .thread /r /p 863e5a60
Implicit thread is now 863e5a60
Implicit process is now 86182878
.cache forcedecodeuser done
Loading User Symbols
..................................
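As a quick recap of the two workflows covered so far, here is a small Python helper that returns the command sequence this article uses in each situation. The helper itself is hypothetical and exists only to summarize the steps; the commands it emits are the ones shown in the transcripts above.

```python
from typing import List


def context_switch_commands(eprocess: str, live_session: bool) -> List[str]:
    """Return the debugger commands this article uses to reach a process."""
    if live_session:
        # Invasive switch: the target must run, then the stale user
        # module list must be refreshed by hand.
        return [f".process /i {eprocess}", "g", ".reload /user"]
    # Crash dump: switch the debugger's view only; /r reloads user symbols.
    return [f".process /r /p {eprocess}"]


if __name__ == "__main__":
    for live in (True, False):
        label = "live session" if live else "crash dump"
        print(label)
        for command in context_switch_commands("86182878", live):
            print("  " + command)
```

The split mirrors the choices made in this article: the invasive form needs the target to execute code, so it is reserved for live sessions, while the /r /p form works purely inside the debugger.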
Seeing User State with !process and !thread
Last but not least, both !process and !thread take a flag value of 0x10, which causes the extension command to perform the equivalent of a .process /r /p for the appropriate process before displaying the call stacks of the threads. Thus, instead of this:
0: kd> !thread 86153418 f
…
nt!KiSwapContext+0x2f (FPO: [Uses EBP] [0,0,4])
nt!KiSwapThread+0x8a (FPO: [0,0,0]) (CONV: fastcall)
nt!KeWaitForSingleObject+0x1c2 (FPO: [Non-Fpo]) (CONV: stdcall)
win32k!xxxSleepThread+0x192 (FPO: [Non-Fpo])
win32k!xxxRealInternalGetMessage+0x418 (FPO: [Non-Fpo])
win32k!NtUserGetMessage+0x27 (FPO: [Non-Fpo])
nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ ee59ed64)
WARNING: Frame IP not in any known module. Following frames may be wrong.
0x7c90e4f4
where the stack walk falls apart as soon as it enters user mode, you will see this:
0: kd> !thread 86153418 1f
…
nt!KiSwapContext+0x2f (FPO: [Uses EBP] [0,0,4])
nt!KiSwapThread+0x8a (FPO: [0,0,0]) (CONV: fastcall)
nt!KeWaitForSingleObject+0x1c2 (FPO: [Non-Fpo]) (CONV: stdcall)
win32k!xxxSleepThread+0x192 (FPO: [Non-Fpo])
win32k!xxxRealInternalGetMessage+0x418 (FPO: [Non-Fpo])
win32k!NtUserGetMessage+0x27 (FPO: [Non-Fpo])
nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ ee59ed64)
ntdll!KiFastSystemCallRet (FPO: [0,0,0])
USER32!NtUserGetMessage+0xc
notepad!WinMain+0xe5 (FPO: [Non-Fpo])
notepad!WinMainCRTStartup+0x174 (FPO: [Non-Fpo])
kernel32!BaseProcessStart+0x23 (FPO: [Non-Fpo])
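The only difference between the two commands above is a single bit in the flags argument. A short Python sketch of the arithmetic, where the constant name is made up for the example and the 0x10 value is the flag described in this section:

```python
USER_STATE_FLAG = 0x10  # flag value described in this article


def with_user_state(flags: int) -> int:
    """Add the 0x10 bit to an existing !process / !thread flags value."""
    return flags | USER_STATE_FLAG


if __name__ == "__main__":
    # 0xf is the flags value used in the first !thread command above.
    print(f"{with_user_state(0xF):x}")  # prints 1f
```

Because the bit is OR'd in, it combines with whatever flags value you normally pass, so 0xf becomes 0x1f as in the second command.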
Black Art No More!
While there’s always more to explore, hopefully this article serves to pique your interest and allow you to incorporate more user mode analysis into your kernel debugging sessions!
Analyst’s Perspective is a column by OSR consulting associate, Scott Noone. When he’s not root-causing complex kernel issues, he’s leading the development and instruction of OSR’s Kernel Debugging seminar. Comments or suggestions for this or future Analyst’s Perspective columns can be addressed to ap@osr.com.