Designing a driver model for an operating system is always a balancing act. Provide too much flexibility and you end up with unnecessary complexity in both the drivers and the operating system. However, if you don’t provide enough flexibility you stifle the industry’s ability to innovate on your platform, which either hurts adoption of your platform or pushes developers into finding unsupported ways to circumvent the model.
Windows is an interesting example of this struggle as we actually have both extremes: a flexible, native driver model that ultimately became impossible to support (I’m looking at you, WDM) and a series of locked down “mini” models that failed to evolve with the industry (SCSIPort, anyone?).
Not surprisingly, over time both of these approaches have proven problematic. The increasing complexity of the native driver model led to system stability issues. Attempts to push the driver development community to add features to their drivers that would help overall system performance (e.g. Fast Resume) fell on deaf ears. The drivers were complicated enough already; who wants to add more features? The mini-models had the opposite effect: developers were dying to add product-differentiating features to their code, but the mini-models weren’t designed to support these features, so they were out of luck.
The complexity of the native driver model has, thankfully, been tamed quite nicely by the Windows Driver Framework. With WDF we get a well-defined interface that makes it easy and safe to get a simple driver working, but always with the flexibility to escape the Framework and increase our complexity.
WDF does not, however, attempt to address the issues of the mini-models. WDF is a generic driver model for writing generic drivers, but the mini-models exist to make it easier to write drivers for specific device types. Microsoft has repeatedly chosen not to simply abandon these models, so we’re still stuck with locked-in mini-models for certain device types.
That is not to say that these models have remained static. In fact, it is interesting to watch how each mini-model has evolved over the years to address the needs of the market. The evolution of the storage driver models has been almost painful to watch (and definitely painful to experience), though with Windows 8.1 we appear to finally be on track to having a single storage driver model that meets the evolving needs of the industry.
Unfortunately, there is a downside to all of this progress: a poor story for backwards compatibility. As it turns out, it is now quite tricky to write a single, high-performance storage driver that optimally supports Windows 7 and later. While there are some technical reasons for this, the primary frustrations come from the StorPort build environment and a lack of clarity in the documentation. In the interest of clearing up the story a bit, we’ll highlight the main issues we’ve dealt with in creating our high-performance NVMe controller driver for Windows 7 and later.
Good Luck with Binary Compatibility
Evolution is messy business, as is evident in the header file StorPort.h. As the driver model has evolved, several different methods have been used to add APIs, change fields of structures, etc. Sometimes an attempt was made to provide single binary compatibility on all platforms, and sometimes binary compatibility was clearly not a priority.
My favorite example of the chaos is the HW_INITIALIZATION_DATA structure, which each driver is required to initialize and pass to StorPortInitialize. Initially, this structure was inherited verbatim from the SCSIPort model. When virtual storage adapter support was added in Windows Server 2003 SP1, this structure needed modification to support some new, virtual adapter specific callbacks. Instead of simply changing the structure, the existing structure was copy/pasted and renamed VIRTUAL_HW_INITIALIZATION_DATA. Additional fields were then added to this structure to support virtual adapter operations.
The virtual adapter driver still calls the same StorPortInitialize routine as always to initialize the driver, so how does StorPort know which one you want? By the HwInitializationDataSize field, of course! The driver sets this field prior to calling StorPortInitialize and StorPort uses the size to determine which structure the driver is passing.
In Windows 8, StorPort needed some additional fields added to both of the initialization structures. Instead of simply adding the fields to both structures, the developers chose to abandon the multiple structures and go back to a single HW_INITIALIZATION_DATA structure. This structure now includes the virtual extensions as well as the Windows 8 specific fields. StorPort detects that a driver is using the new structure by checking the HwInitializationDataSize for the new structure’s size.
This model makes it very difficult to provide a single binary that takes advantage of the new features on Windows 8 while maintaining backwards compatibility on Windows 7. A single driver can’t have both structure sizes without making its own private copy of one of the structures, which opens another can of worms. And even if you do decide to solve this one, you’re not even out of DriverEntry yet!
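To make the size-based detection concrete, here is a minimal sketch of how a port driver could tell structure generations apart using nothing but a leading size field. Everything prefixed DEMO_ or Demo is hypothetical: the three layouts are tiny stand-ins, not the real StorPort structures, and the classification routine is an assumption about the general technique, not StorPort’s actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in layouts for illustration only; these are NOT the real
 * StorPort definitions. Each generation appends fields to the last. */
typedef struct {
    unsigned long HwInitializationDataSize; /* filled in by the driver */
    void *HwStartIo;
} DEMO_LEGACY_INIT_DATA;

typedef struct {
    unsigned long HwInitializationDataSize;
    void *HwStartIo;
    void *VirtualCallback;                  /* hypothetical virtual adapter field */
} DEMO_VIRTUAL_INIT_DATA;

typedef struct {
    unsigned long HwInitializationDataSize;
    void *HwStartIo;
    void *VirtualCallback;
    unsigned long NewFeatureFlags;          /* hypothetical newest-generation field */
} DEMO_UNIFIED_INIT_DATA;

typedef enum {
    DEMO_LAYOUT_UNKNOWN,
    DEMO_LAYOUT_LEGACY,
    DEMO_LAYOUT_VIRTUAL,
    DEMO_LAYOUT_UNIFIED
} DEMO_LAYOUT;

/* Every layout begins with the size field, so it can be read before the
 * port driver knows which structure it was actually handed. */
static DEMO_LAYOUT DemoClassifyInitData(const void *initData)
{
    const unsigned long *sizeField = (const unsigned long *)initData;
    unsigned long size;

    assert(initData != NULL);
    size = *sizeField;

    if (size == sizeof(DEMO_UNIFIED_INIT_DATA)) {
        return DEMO_LAYOUT_UNIFIED;
    }
    if (size == sizeof(DEMO_VIRTUAL_INIT_DATA)) {
        return DEMO_LAYOUT_VIRTUAL;
    }
    if (size == sizeof(DEMO_LEGACY_INIT_DATA)) {
        return DEMO_LAYOUT_LEGACY;
    }
    return DEMO_LAYOUT_UNKNOWN; /* a size no known generation uses */
}
```

A driver compiled against one set of headers only ever sees one of these sizes, which is why a single binary that wants to present either layout has to carry its own copy of the other one.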
Inconsistent Version Information in Headers
StorPort is constantly being updated to add features and support. Huge changes can even come in hotfixes and service packs.
For the most part, the documentation does a fine job of indicating which releases of the operating system support which changes. The header files are a different story though. They often don’t provide compile time checks, meaning you can call functions or set options that don’t exist or, worse, set options that do exist but don’t work properly.
STOR_PERF_CONCURRENT_CHANNELS is a good example of the trouble this can cause. This feature was added to the headers in Windows 7 and allows the driver to avoid some significant locking in the StorPort wrapper, thus increasing driver throughput. However, the current documentation states that you must not set it prior to Windows 8 and posts online indicate that there are stability issues under stress with this option set on earlier platforms. Setting this option in your driver prior to Windows 8 is a time bomb, but the build environment will happily let you do it even when compiling for Windows 7 as your target.
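Since the headers won’t stop you, one defensive option is to decide at run time which performance options to request. The following is a minimal sketch under stated assumptions: the DEMO_ names and flag values are placeholders rather than the real StorPort constants, and how the driver learns the running OS version is left out entirely. The only hard facts it relies on are that Windows 7 is NT 6.1 and Windows 8 is NT 6.2.

```c
#include <assert.h>

/* Placeholder flag values for illustration; NOT the real StorPort constants. */
#define DEMO_PERF_BASELINE            0x1u
#define DEMO_PERF_CONCURRENT_CHANNELS 0x2u

/* Hypothetical container for the version of the OS the driver is running on. */
typedef struct {
    unsigned Major;
    unsigned Minor;
} DEMO_OS_VERSION;

/* Windows 8 reports itself as NT 6.2; Windows 7 is NT 6.1. */
static int DemoIsWindows8OrLater(DEMO_OS_VERSION v)
{
    return v.Major > 6 || (v.Major == 6 && v.Minor >= 2);
}

/* Request the concurrent channels option only where it is documented
 * as supported, regardless of which OS the binary was compiled for. */
static unsigned DemoSelectPerfFlags(DEMO_OS_VERSION running)
{
    unsigned flags = DEMO_PERF_BASELINE;

    /* A zero major version means the caller never filled in the structure. */
    assert(running.Major != 0);

    if (DemoIsWindows8OrLater(running)) {
        flags |= DEMO_PERF_CONCURRENT_CHANNELS;
    }
    return flags;
}
```

The point of the design is that the check keys off the OS the driver is running on, not the NTDDI_VERSION it was built with, so a single binary stays safe on Windows 7.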
Is that an SRB or an SRB?
StorPort is an evolution of SCSIPort, which was the driver model for writing storage adapter drivers back when everything was SCSI. To this day, StorPort still treats all storage devices as if they are SCSI devices, sending the drivers SCSI_REQUEST_BLOCK structures containing Command Data Blocks (CDBs) and targeting the controller’s devices via their Bus, Target, and LUN (BTL) address.
It is no longer the 1980s and our storage devices are no longer SCSI. Given that our 1980s based driver model continues to talk to us in terms of SCSI, our storage controller drivers spend much effort translating SCSI semantics and operations into something that is specific to our bus.
Windows 8 does not do anything to change this; we’re still stuck handling SCSI requests in our storage driver. However, the groundwork for a new future has been laid by the introduction of the STORAGE_REQUEST_BLOCK. As the name implies, this structure represents a storage request to the controller. Of course, the only type of storage request you can currently send is SCSI, but the structure is extensible enough to add a different technology in the future. The structure may also be extended to support device addressing schemes other than BTL.
Drivers must opt in to this new request block structure and, to maintain backwards compatibility, all StorPort entry points continue to pass a SCSI_REQUEST_BLOCK pointer as the parameter. The driver must determine which type of structure it actually received based on the structures’ shared Function member. STORAGE_REQUEST_BLOCKs have a function of SRB_FUNCTION_STORAGE_REQUEST_BLOCK; everything else is a SCSI_REQUEST_BLOCK (or one of its variants).
Due to the extensibility and overall fanciness of the new structure, retrieving fields of the structure is often not as simple as dereferencing a field by name. Several macros are provided for parsing these structures and retrieving individual fields. For example, SrbGetScsiData retrieves SCSI related information from a STORAGE_REQUEST_BLOCK that describes a SCSI operation (CDB, Sense Buffer, etc.). As an aside, the new structure also makes it difficult to extract this information while debugging. See Sidebar – Dumping SRBs in WinDbg: How Hard Can It Be? (at the end of this article).
As you can imagine, the code to handle both the old and new structures in a single driver quickly becomes spaghetti. For every single field that the driver wants to retrieve, the driver must first determine which type of structure it is and then either directly dereference the field or call a macro to do it.
Thankfully, a header file is provided that does most of the work for you: srbhelper.h. This header contains many inline functions to retrieve fields from either type of structure – just call with the SRB pointer and the inline function does the rest. For example, check out the implementation of SrbGetCdbLength:
FORCEINLINE UCHAR
SrbGetCdbLength(
    _In_ PVOID Srb
    )
{
    PSTORAGE_REQUEST_BLOCK srb = (PSTORAGE_REQUEST_BLOCK)Srb;
    UCHAR CdbLength = 0;

    if (srb->Function == SRB_FUNCTION_STORAGE_REQUEST_BLOCK) {
        SrbGetScsiData(srb,
                       &CdbLength, // CdbLength8
                       NULL,       // CdbLength32
                       NULL,       // ScsiStatus
                       NULL,       // SenseInfoBuffer
                       NULL);      // SenseInfoBufferLength
    } else {
        CdbLength = ((PSCSI_REQUEST_BLOCK)srb)->CdbLength;
    }
    return CdbLength;
}
Pretty neat, right? Now you just need to discipline yourself to actually call this function every time you need the length of the CDB and you’re done! Well, not quite…These inline functions only exist when building for Windows 8 and later, so you can’t call them for your Windows 7 builds! The pesky backwards compatibility story kills it every time.
In the interest of keeping our code clean, we chose to layer our own wrappers over the wrappers. Given that the STORAGE_REQUEST_BLOCK structure doesn’t exist on Windows 7, we can assume that when built for Windows 7 we’re dealing with SCSI_REQUEST_BLOCKs and can let the srbhelper.h functions figure it out on Windows 8 and later. For example, see our implementation of OsrSrbGetCdbLength:
FORCEINLINE UCHAR
OsrSrbGetCdbLength(
    _In_ PVOID Srb
    )
{
#if (NTDDI_VERSION >= NTDDI_WIN8)
    return SrbGetCdbLength(Srb);
#else
    return ((PSCSI_REQUEST_BLOCK)Srb)->CdbLength;
#endif
}
osrsrbhelper.h turned into a very long module that was not very fun to write. However, our mainline paths are now clean and free from distinguishing between the two different request formats. To enforce the usage of these functions, the drivers immediately cast all SRB pointers to PVOID and treat them as opaque throughout the driver.
On the Right Track?
We are clearly at a pivotal moment for the storage driver model in Windows. Many changes are being made, including ones that are not simply reactionary but are attempting to plan for the future. This is creating some pain now for those of us still worried about the past, but hopefully we’re trading that for some flexibility in the future.
[infopane color=”6″ icon=”0182.png”]Dumping SRBs in WinDbg: How Hard Can It Be?
Given that StorPort abstracts the operating system from the driver writer, when something goes wrong it’s often quite difficult to get a clear picture of exactly what’s happened. This is precisely why other abstractions provide a set of custom debugger extensions for use by driver writers. For example, WDF provides the excellent WDFKD.dll with commands for displaying structures and logs. NDIS does the same with NDISKD.dll, which is the absolute gold standard for debugger extensions. Seriously, even if you’re never going to write an NDIS based driver you should try playing with this extension. It will make you wish you had a network driver problem to debug.
StorPort, on the other hand…Not so much. There are in fact no debugger commands for interpreting StorPort state or structures. Not even a !srb command to interpret a SCSI_REQUEST_BLOCK or a STORAGE_REQUEST_BLOCK. This is particularly painful with respect to the new STORAGE_REQUEST_BLOCK: good luck trying to manually walk that structure in the debugger and extract something useful.
What’s frustrating is that this need isn’t anything new or surprising. Even SCSIPort had two extension DLLs, SCSIKD and MinipKD, that, while not as useful when compared to modern standards, at least gave you some support. There even was a !srb. Unfortunately that command appears to have rotted, so all you get when running it now is this:
0: kd> !minipkd.srb ffffe001`73af95f8
Could not read SRB @ ffffe00173af95f8
Hopefully in time we’ll see some new or updated commands in the standard tools. In the meantime, we’re stuck manually interpreting these structures or writing our own debugger commands.
[/infopane]