Last reviewed and updated: 10 August 2020
A fundamental complexity in Windows kernel mode development is that the execution environment comes from a different era in software development. Basically, the idea is this: if you’re writing kernel mode code, then you must know what you’re doing. If you know what you’re doing, then we don’t need to validate your function parameters and therefore we can shave off some CPU cycles. We also don’t need to bother validating the execution environment at all because, you know, everyone knows what they’re doing.
If you don’t know what you’re doing, then you’re stupid, according to this philosophy. If you don’t pass valid arguments, you get what you deserve. If you don’t understand the rules of the execution environment, then kernel mode software development must be too hard for you. Go away. Do something else.
You can argue whether this is a good philosophy or a bad philosophy. But, regardless of whether it’s good or bad, the problem is that this approach doesn’t really scale. Without proper validation, you can easily crash the system by calling a function with an invalid argument. Even worse, the system might not crash but instead subtly corrupt an internal structure or return an invalid result. Also, you need great documentation for people to, you know, actually learn what the rules are for the environment.
Documentation issues aside, Windows 2000 introduced Windows Driver Verifier to address the issue of insufficient runtime validation of arguments and execution environment. This allows us to put the OS into a special mode where nominated drivers aren’t entirely trusted and we can gain the benefits of a limited amount of OS level validation. With each iteration of Windows, Verifier has become more and more powerful and maintains its title as the single greatest gift that the Universe has bestowed upon driver developers. Passing Verifier is the minimum requirement for professional software development in the Windows operating system. If you’re not running your driver under Verifier, you have failed. Seriously.
I was recently talking about Verifier with an IT administrator for a large organization. He mentioned that he had a lot of systems crashing and went around enabling Verifier for various third party drivers on the systems, hoping to find the culprit. The systems started crashing immediately, and the crashes pointed directly to a bug in a third party driver. After he brought the crash up with the vendor, their response was, “shut Verifier off, we don’t test with that.” This is so wrong that I’m close to publicly shaming the company. I’m just not sure what they’re thinking. My suggestion to the IT admin was to beat the company harder and, if they won’t listen, find a replacement product.
“Fools!” you say, “I use Verifier all the time. I am SAFE!” However, you might be missing something critical in your testing: enabling Verifier for your driver alone is hardly ever sufficient. Do you have a KMDF driver? A FltMgr filter? An NDIS driver? A StorPort miniport? For any of these, you really need to enable Verifier on both your driver and the wrapper library.
The problem is that Verifier is validating calls into the operating system. For the above drivers, your driver isn’t calling into the operating system, at least not most of the time. The library is. For example, if your KMDF driver calls WdfDeviceCreate, it’s the Framework that calls ExAllocatePool to allocate your WDFDEVICE and your Device Context. The buffers allocated in this case won’t be subject to validation by Driver Verifier unless you have explicitly enabled Driver Verifier for the Framework. If you only enable Driver Verifier for your driver, the only pool validation that you get will be for calls that your driver makes directly to ExAllocatePool (Figure 1).
So, the rule is, when you enable Driver Verifier for your driver, always also enable Driver Verifier for the wrapper/library that your driver uses (Table 1).
Another option that people frequently miss: you can also enable Verifier on the NTOS Kernel image! This means that allocations made by the OS itself (e.g. File Objects) will also be subject to Verifier’s checking. This results in a unique form of parameter validation that you might not catch otherwise.
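To make the rule concrete, here is a minimal sketch of a helper that builds a verifier.exe command line covering your driver, its wrapper/library and, optionally, the kernel image. The helper name, the driver names, and the model keys are hypothetical, and the image names in the mapping are my assumption about the usual binaries, so check them against Table 1 and your target system before relying on them.

```python
# Hypothetical helper: builds a verifier.exe command line that covers a
# driver, the wrapper/library it sits on, and optionally the kernel image.
# It only produces the string; it does not run anything.

# Assumed image names for the wrapper libraries named in the article.
WRAPPER_IMAGES = {
    "kmdf": "wdf01000.sys",
    "fltmgr": "fltmgr.sys",
    "ndis": "ndis.sys",
    "storport": "storport.sys",
}

# Assumed on-disk name of the NTOS kernel image.
KERNEL_IMAGE = "ntoskrnl.exe"


def verifier_command(driver, model, include_kernel=False):
    """Return a 'verifier /standard /driver ...' command line as a string."""
    try:
        wrapper = WRAPPER_IMAGES[model.lower()]
    except KeyError:
        raise ValueError(f"unknown driver model: {model!r}")
    # Always verify the driver together with its wrapper/library.
    targets = [driver, wrapper]
    if include_kernel:
        targets.append(KERNEL_IMAGE)
    return "verifier /standard /driver " + " ".join(targets)


if __name__ == "__main__":
    print(verifier_command("mydriver.sys", "kmdf"))
    print(verifier_command("myfilter.sys", "fltmgr", include_kernel=True))
```

You would run the resulting command from an elevated command prompt and reboot for the settings to take effect. Afterwards, verifier /querysettings shows what is currently enabled and verifier /reset clears everything.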
Ultimately, the lesson is that more Verifier is a good thing, so make sure you enable it early, often, and for any driver that your driver touches. Of course, the downside to Verifier is that the system behaves differently when Verifier is enabled, so you’re not actually testing the real customer environment. So, unless you’re going to make all your customers turn on Verifier as part of install, make sure you also test without Verifier enabled as part of your QA. See the sidebar, Still Want More Validation… below for another helpful tip.
[infopane color=”6″ icon=”0182.png”]Still Want More Validation? Be sure to Try WDFVerifier and the Checked Build
If you’re writing a WDF driver, you definitely want to also enable WDF Verifier. See the topic Using KMDF Verifier in the WDK Documentation (Google for it when the provided link breaks, as it will).
Whether you’re writing a WDF driver or not, there’s still a lot of additional checking of which you can take advantage. For example, at some point you should always test with the checked build of Windows, the checked build of any wrapper/library components used by your driver, and the checked build of any driver(s) with which your driver interacts. The checked OS image and HAL are distributed as part of the WDK. You can find the documentation under Downloading a Checked Build of Windows in MSDN. In addition to these, download a complete checked build of Windows (assuming you can find it; they’re getting harder and harder to find with each OS release). You can either choose to install the complete checked build, or you can extract (with some work) just the checked images for the wrapper/library that your driver uses, plus the checked executables for any drivers with which your driver interacts. For example, if you’re writing a file system filter driver, you’ll want to use the checked build of Filter Manager (fltMgr.sys) plus the checked builds of the file systems that you filter. Our experience is that this can be very helpful in identifying potential problems.
[/infopane]