When we teach classes, whether it’s KMDF, Software Drivers, or Debugging, we always provide our students with a long list of “great ideas” that they should use. Some of these ideas make their lives easier as they go about their jobs of developing and maintaining Windows drivers. Others are best practices.
Scott and I recently were discussing some of these recommendations… and how we should probably start listening to what we say, and actually do what we say more often than we sometimes do. Here’s part of that discussion:
If you’d rather read than watch, here’s more or less a transcript of our discussion.
Scott: I always find myself telling students very strongly to do certain things.
Peter: I mean, that’s what the whole class is all about.
Scott: You have to keep them in line.
Peter: Exactly.
Scott: But here’s the thing: When I find myself telling them to do these things, I realize that these are things I either don’t do or will never do.
Peter: Or that you really intend to do, or you’d like your students to at least believe that you do.
Scott: Exactly. But, somehow, you simply don’t have the time, the desire, or… really the discipline.
Peter: It comes down to engineering discipline, doesn’t it? The follow-through.
I have the exact same experience. I know when I am walking through code examples, almost all code examples will consistently use WDF_NO_OBJECT_ATTRIBUTES, for example, or WDF_NO_HANDLE … we should probably mention these are just #define statements that equate to NULL. It’s a way of providing in-line documentation, so you can write:
WDF_NO_OBJECT_ATTRIBUTES,
when invoking a function, instead of supplying NULL and then having to write a comment:
NULL, // No object attributes provided
Scott: Yup. Usually early on in our WDF class we show people that and strongly urge them to use this convention, and to do so consistently. We tell them how thoughtful it was of the developers to provide these various self-documenting #defines, and how using them makes things clear for future maintainers. Then, towards the end of the class, the examples in our slides stop using them.
Peter: What’s even worse, I think, is when you’re working on your own code and see NULL for a parameter… and nothing else. No comment, no nothing. So not only have you not used the provided #define, like you tell people to do in class, but you didn’t even have the brains to write a comment describing what the parameter was. So you fail on both counts.
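For readers without the WDK at hand, here is a minimal user-mode sketch of the convention Peter describes. The two WDF_NO_* macros are reproduced as plain NULL, as stated above. Everything prefixed Demo or DEMO_ is a hypothetical stand-in invented for illustration and is not part of WDF.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins mirroring the WDF convention discussed above: both are just NULL. */
#ifndef WDF_NO_OBJECT_ATTRIBUTES
#define WDF_NO_OBJECT_ATTRIBUTES (NULL)
#endif
#ifndef WDF_NO_HANDLE
#define WDF_NO_HANDLE (NULL)
#endif

/* Hypothetical types for this sketch; not real WDF types. */
typedef struct DEMO_ATTRIBUTES { int Flags; } DEMO_ATTRIBUTES;
typedef struct DEMO_OBJECT { int Flags; int HasParent; } DEMO_OBJECT;

/* Hypothetical create routine with two optional arguments. */
int DemoObjectCreate(const DEMO_ATTRIBUTES *Attributes,
                     const DEMO_OBJECT *Parent,
                     DEMO_OBJECT *Object)
{
    assert(Object != NULL);
    Object->Flags = (Attributes != NULL) ? Attributes->Flags : 0;
    Object->HasParent = (Parent != NULL);
    return 0;
}

/* Self-documenting call: each argument says what is being omitted. */
int DemoCreateDocumented(DEMO_OBJECT *Object)
{
    return DemoObjectCreate(WDF_NO_OBJECT_ATTRIBUTES,
                            WDF_NO_HANDLE,
                            Object);
}

/* Identical behavior, but the reader must look up what each NULL means. */
int DemoCreateUndocumented(DEMO_OBJECT *Object)
{
    return DemoObjectCreate(NULL, NULL, Object);
}
```

Both wrappers compile to the same call. The only difference is what the next maintainer can read off the page without opening the function's documentation.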
What are some other things we tell people to do in class, but maybe don’t follow through with?
Scott: I’m always telling people about what a great new feature they’ve added to the operating system in the no execute nonpaged pool. I tell them how you wouldn’t think that there could be serious security concerns with your pool allocations, but it turns out that there are a lot of people out there with a lot of time on their hands who come up with all kinds of nifty, clever ways to gain control in the operating system. So, I tell them, no execute nonpaged pool is something easy and simple that you can do to prevent many common exploits from happening, and that they should always use this feature.
Peter: Increase the security/reliability of your drivers, and there’s nothing to it.
Scott: Right.
Peter: So, this is something you do all the time when you’re writing drivers here at OSR, right?
Scott: Well, all the time that I think of it. Which is, basically…
Peter: Never?
Scott: Well, close to never at least. I mean, I want to do it.
Peter: Of course.
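One way to take "all the time that I think of it" out of the equation is to choose the pool type in exactly one helper. The sketch below is a user-mode stand-in, not driver code: DemoAllocatePoolWithTag is a hypothetical substitute for the kernel's ExAllocatePoolWithTag backed by malloc, and the DEMO_POOL_TYPE values are assumed to mirror the POOL_TYPE values in wdm.h. In a real driver the helper body would pass NonPagedPoolNx to the kernel allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the kernel's POOL_TYPE; values assumed to mirror wdm.h. */
typedef enum DEMO_POOL_TYPE {
    DemoNonPagedPool   = 0,    /* executable nonpaged pool (legacy choice) */
    DemoPagedPool      = 1,
    DemoNonPagedPoolNx = 512   /* no-execute nonpaged pool */
} DEMO_POOL_TYPE;

/* Hypothetical four-byte tag; reads "Demo" in a little-endian memory dump. */
#define DEMO_POOL_TAG 0x6F6D6544UL

/* Records the most recent request so the sketch can be checked. */
static DEMO_POOL_TYPE g_LastPoolType = DemoPagedPool;

/* User-mode stand-in for ExAllocatePoolWithTag; backed by malloc. */
static void *DemoAllocatePoolWithTag(DEMO_POOL_TYPE PoolType,
                                     size_t NumberOfBytes,
                                     unsigned long Tag)
{
    (void)Tag;                      /* a real allocator stores the tag */
    g_LastPoolType = PoolType;
    return malloc(NumberOfBytes);
}

/* The one place in the code base that picks a nonpaged pool type. */
void *DemoAllocateNonPaged(size_t NumberOfBytes)
{
    assert(NumberOfBytes > 0);
    return DemoAllocatePoolWithTag(DemoNonPagedPoolNx,
                                   NumberOfBytes,
                                   DEMO_POOL_TAG);
}

/* Matching release routine for buffers from DemoAllocateNonPaged. */
void DemoFree(void *Buffer)
{
    free(Buffer);
}
```

With every nonpaged allocation routed through a single helper, remembering the no-execute pool becomes a one-time decision rather than a per-call habit.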
What I talk about in my class a lot, and I always tell people, “Now this is something I always do”, has to do with error codes. I tell them to always try to return a unique error code from each place in your driver where you return an error that’s non-trivial. That way, when you get an error back, you’ll know exactly where that error came from in your code.
Scott: Sure. You should open up NTStatus.h, and browse the status codes available to you …
Peter: And find something that’s descriptive …
Scott: There’s a rich set of error codes that you can return.
Peter: Of course I also tell people that they should look up to see what Win32 error those native errors translate to and comment in their code, “Translates to Win32 error so and so and so.” Oh, and they should make sure that the native NT status codes that they’re returning are actually returned as unique Win32 error values.
Scott: Not just as “error invalid parameter”.
Peter: Right, exactly. It’s interesting because you tell your students, “Oh well make sure that you return unique codes.” And they’ll say, “Okay, yeah I do that! I return STATUS_INVALID_PARAMETER_1, STATUS_INVALID_PARAMETER_2, STATUS_INVALID_PARAMETER_3.” NT has them up to STATUS_INVALID_PARAMETER_12 I think. The problem? They all map to the same Win32 error.
Scott: So, to find out which error is firing, please reproduce the problem using the native API and get me back the real status code, not that stupid Win32 thing.
Peter: That should work incredibly well. Using unique error codes is actually something I typically do… but I’m not really as consistent as I’d like to be, certainly. Now that we’re confessing, what are some more of our sins?
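The two rules Peter describes (a distinct status at each failure point, and a distinct Win32 value for each status) can be sketched in portable C. This is an illustration under stated assumptions, not driver code: the NTSTATUS typedef and status values are copied by hand from ntstatus.h, DEMO_REQUEST and the Demo functions are hypothetical, and DemoStatusToWin32 is a hand-written stand-in for the system's translation whose entries should be verified on Windows (for example with RtlNtStatusToDosError) before being relied on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t NTSTATUS;  /* stand-in for the WDK typedef */

/* Values copied by hand from ntstatus.h. */
#define STATUS_SUCCESS              ((NTSTATUS)0x00000000)
#define STATUS_INVALID_PARAMETER    ((NTSTATUS)0xC000000D)
#define STATUS_BUFFER_TOO_SMALL     ((NTSTATUS)0xC0000023)
#define STATUS_INVALID_PARAMETER_1  ((NTSTATUS)0xC00000EF)
#define STATUS_INVALID_PARAMETER_2  ((NTSTATUS)0xC00000F0)
#define STATUS_INVALID_PARAMETER_3  ((NTSTATUS)0xC00000F1)
#define STATUS_INVALID_DEVICE_STATE ((NTSTATUS)0xC0000184)

#define DEMO_WIN32_UNKNOWN 0xFFFFFFFFUL  /* hypothetical "not in table" */
#define DEMO_MIN_LENGTH    16u           /* hypothetical minimum size */

/* Hypothetical request, just enough to give three failure points. */
typedef struct DEMO_REQUEST {
    size_t Length;
    int    DeviceStarted;
} DEMO_REQUEST;

/* Each failure point returns its own status, with the Win32 value noted. */
NTSTATUS DemoValidateRequest(const DEMO_REQUEST *Request)
{
    if (Request == NULL) {
        return STATUS_INVALID_PARAMETER;     /* Win32 87, ERROR_INVALID_PARAMETER */
    }
    if (!Request->DeviceStarted) {
        return STATUS_INVALID_DEVICE_STATE;  /* Win32 22, ERROR_BAD_COMMAND */
    }
    if (Request->Length < DEMO_MIN_LENGTH) {
        return STATUS_BUFFER_TOO_SMALL;      /* Win32 122, ERROR_INSUFFICIENT_BUFFER */
    }
    return STATUS_SUCCESS;
}

/* Hand-written stand-in for the NT-to-Win32 translation (assumed entries). */
unsigned long DemoStatusToWin32(NTSTATUS Status)
{
    switch (Status) {
    case STATUS_SUCCESS:              return 0;
    case STATUS_INVALID_PARAMETER:    /* the whole family collapses to 87 */
    case STATUS_INVALID_PARAMETER_1:
    case STATUS_INVALID_PARAMETER_2:
    case STATUS_INVALID_PARAMETER_3:  return 87;
    case STATUS_INVALID_DEVICE_STATE: return 22;
    case STATUS_BUFFER_TOO_SMALL:     return 122;
    default:                          return DEMO_WIN32_UNKNOWN;
    }
}

/* Returns 1 if every status in the list surfaces as a different Win32 value. */
int DemoWin32ValuesAreUnique(const NTSTATUS *Statuses, size_t Count)
{
    assert(Statuses != NULL);
    for (size_t i = 0; i < Count; i++) {
        for (size_t j = i + 1; j < Count; j++) {
            if (DemoStatusToWin32(Statuses[i]) == DemoStatusToWin32(Statuses[j])) {
                return 0;
            }
        }
    }
    return 1;
}
```

Passing the three statuses returned by DemoValidateRequest to DemoWin32ValuesAreUnique reports them as distinct. Passing STATUS_INVALID_PARAMETER_1 through STATUS_INVALID_PARAMETER_3 does not, which is the trap the students in Peter's story walk into.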
Scott: You know how you get crashes from the field from different versions of your driver? WinDbg supports this awesome thing called the symbol server. You can use the little SymStore utility, you index your PDBs, takes you an afternoon, it’s a job for an intern. People love that. People are like, “Wow that’s really great.” It turns out you can also have the compiler index your source code, so you can say which version of your source code from your source code database matches the version of the driver running on your target.
Peter: In fact, a buddy of ours at Microsoft is the one who worked on this feature, isn’t he?
Scott: Yup. He asked me to test it out for him.
Peter: Right, I remember that.
Scott: It was awesome. You get a crash, open it up, the right source code gets synced from the source code server onto local disk, and it makes support a breeze. I tell everyone how great this is. I push everyone to it. I show them the documentation.
Peter: We don’t have it set up here, right?
Scott: No. We don’t use it. Not for the gazillion projects with the gazillion versions that we have out in the world, no.
Peter: It would be really helpful. Sigh…
We’re creating a problem with this talk, you know that, right? We’re telling people all our secrets. They’ll never believe us ever again.
Scott: Well, look: This is the whole list.
Peter: That’s right.
Scott: This is all of them.
Peter: We have engineering discipline. We actually do everything else we say you should do in class.
Scott: Exactly.
Peter: In any case, you should definitely do as we say, not as we do. At least in most cases.
I think we’re done. This is Peter.
Scott: This is Scott.
Peter: See you later.