Being primarily kernel mode programmers at OSR, we have gotten into the habit of using Reader/Writer locks. These locks, known as Executive Resources (EResources), are great because they can be used at IRQLs less than DISPATCH_LEVEL and allow shared readers and exclusive writers. In addition they also allow a thread recursive access, as long as the acquisition type is compatible with the lock’s current acquisition level.
That being said, it has always baffled us, at OSR, why something similar to EResources was never available in user mode. We recently had a need to move a piece of software that previously ran in kernel mode to user mode and that software happened to use EResources as its primary locking model. Our choices were either to reengineer the locking model used by the code or to implement EResource semantics in user mode. Our choice was to implement EResource semantics in user mode and this article documents our implementation.
Overview
We’re going to assume that readers of this article are familiar with EResources, but realize that some people are not, so for the unfamiliar here’s an overview…..
As we mentioned previously an EResource is a reader/writer lock. This lock can be acquired in 1 of 2 modes, either shared or exclusive As the modes imply this lock can be held simultaneously by multiple readers (EResource acquired in shared mode) or by one writer (EResource acquired in exclusive mode).
When a thread attempts to acquire an EResource, it can specify whether or not it wants to wait if the specified EResource is not immediately available. If it elects to wait, the thread will be put into a wait state until such time as the lock becomes available, otherwise the acquisition of the EResource will fail. An EResource can be acquired by a thread recursively as long as the acquisition type (i.e. shared or exclusive) is compatible with the current acquisition type of the lock. What this means is that if a thread holds a lock exclusive and tries to require the lock shared, the acquisition will be granted. If, however, the thread holds the lock shared and tries to reacquire the lock exclusive, a deadlock will occur (as it does in kernel mode) since the desired acquisition type is incompatible with the currently held acquisition type. Threads other than the current thread can attempt to acquire the EResource also and will only be allowed access if the mode of the request is compatible with the current state of the lock. So, for example, if Thread A owns the EResource shared, Thread B will also be allowed access to the EResource if it attempts to acquire it shared. If it tries to acquire the EResource exclusively, access will be denied (i.e. it will either be blocked if the user specified to wait for the EResource, or a status will be returned indicating that the EResource was not acquired).
Other features of EResources are:
- If an EResource is held exclusively, it can be converted to shared access
- Users can check to see if an EResource is currently held exclusively or shared
- Users can check to retrieve the number of Shared Waiters or Exclusive Waiters for an EResource.
- EResources must be initialized before being used and must be deleted when they are finished being used.
Now that the basics have been explained, let’s see the functions available to us.
Functions
In order to use OSRERESOURCEs (our version of ERESOURCEs) there are a set of functions that the user has available to them. The functions are listed in Figure 1.
To use an OSRERESOURCE the caller would call OSRInitializeResource in some initialization routine, and then call one of the OSRAcquireResourceXXXLite functions where XXX is either shared or exclusive. After doing some work it would then call OsrReleaseResource to release the held resource. Finally, when done with the resource, the user would call OSRDeleteResourceLite to delete the resource.
So as you can see OSRERESOURCEs are pretty easy to use. Since this is a user mode library free for anyone to use, I guess it would be a good thing to discuss what is going on underneath the covers of the OSRERESOURCE implementation, so that is what we’ll discuss next.
Underneath the Covers of OSRERESOURCEs
In this section we will discuss how we implemented OSRERESOURCEs. Our implementation creates a Dynamic Link Library called “UMERESOURCE.DLL” This DLL has a set of “C” interfaces that we listed in Figure 1 that are defined in the include file “UMERESOURCE.H”. These interfaces work on an OSRERESOURCE structure and are backed by a “C++” OSREResource class which is defined in “ERESOURCE.H”. The OSREResource class and associated OSRERESOURCE_STATE structure are defined in Figure 2 .
// This structure keeps track of the state of the resource. We need // to know how the resource was acquired and how many times it was // recursively acquired. // typedef struct _OSRERESOURCE_STATE { union { struct { ULONG State; // Acquisition State ULONG AcquireCount; // Count of times acquired }; ULONGLONG Reserved; }; } OSRERESOURCE_STATE, *POSRERESOURCE_STATE; class OSREResource : public CObject { public: OSREResource(); virtual ~OSREResource(); enum { ACQUIRED_NONE, ACQUIRED_SHARED, ACQUIRED_EXCLUSIVE } ACQUIRED_STATE; BOOLEAN AcquireResourceExclusive(BOOLEAN Wait); BOOLEAN AcquireResourceShared(BOOLEAN Wait); BOOLEAN AcquireSharedStarveExclusive(BOOLEAN Wait); BOOLEAN IsResourceAcquiredExclusive(); BOOLEAN IsResourceAcquiredShared(); VOID ReleaseResource(); VOID ConvertExclusiveToShared(); VOID ReleaseResourceForThread(ERESOURCE_THREAD ResourceThreadId); ULONG GetSharedWaiterCount(); ULONG GetExclusiveWaiterCount(); NTSTATUS InitializeResource(); NTSTATUS DeleteResource(); private: mutable CRITICAL_SECTION m_CriticalSection; typedef std::map<DWORD,OSRERESOURCE_STATE> CMapThreadToLockState; CMapThreadToLockState m_LockStateMap; volatile DWORD m_SharedCount; volatile DWORD m_StarveExclusiveCount; volatile DWORD m_ExclusiveCount; volatile DWORD m_SharedWaitersCount; volatile DWORD m_ExclusiveWaitersCount; HANDLE m_WaitEventHandle; BOOLEAN m_Initialized; };
This class contains:
- m_CriticalSection which is used by the code to synchronize access to the OSREResource data structures, ensuring that only one thread is accessing them at one time.
- m_LockStateMap table which maps a Thread identifier to its resource state (i.e. the mode in which it acquired the lock). This resource state is an OSRERESOURCE_STATE structure, defined in Figure 2, which is composed of 2 fields, State and AcquireCount. State is used to keep track of how the resource was acquired, ACQUIRED_NONE, ACQUIRED_SHARED, or ACQUIRED_ EXCLUSIVE, and AcquireCount is used to keep track of the number of times the thread has acquired the resource (remember that resources can be acquired recursively).
- m_WaitEventHandle – the handle to the notification event that all threads wait on when trying to acquire the resource.
- m_SharedCount – number of shared holders of the resource. Since threads can hold the resource recursively this does not necessarily reflect the number of threads holding the resource simultaneously.
- m_StarveExclusiveCount – number of threads waiting to acquire the resource in shared mode, but want to starve exclusive waiters.
- m_ExclusiveCount – number of exclusive holders of the resource, which is usually one, but since a thread can acquire resource lock recursively, this count would reflect that.
- m_SharedWaitersCount – number of threads waiting to acquire the resource in shared mode.
- m_ExclusiveWaitersCount – number of threads waiting to acquire the resource in exclusive mode.
- m_Initialized – Boolean indicating whether or not the resource was initialized by a call to OSRInitializeResource.
The code which implements this class uses the m_CriticalSection to synchronize access to the rest of the OSREResource data, while the m_LockStateMap is used to keep track of each holder of the resource, the mode of acquisition (State), and the number of times that the acquisition was made since a holder could acquire the resource recursively (AcquireCount). The data members’ m_SharedCount and m_ExclusiveCount are also used to keep track of the acquisition counts.
For any thread which tries to acquire the OSREResource and elects to wait for the resource, the m_WaitEventHandle is the handle to the event that the thread will wait on. This event is called a “NotificationEvent”. What this means is that unlike a “SynchronizationEvent” where only the first waiting thread is woken up, when the event is set, all waiting threads are woken up. This means that each awakened thread has the possibility of acquiring the resource. Consider the scenario where a thread holds the resource exclusively and 10 threads are waiting to acquire the resource shared. In our implementation all the shared waiters will be awakened when the exclusive holder releases the resources and will be able to acquire the resource.
Since the source code is readily available for download, we don’t want to spend a lot of time going through it. What we do want to do however is go over AcquireResourceShared and ReleaseResource to at least give you a feel for how the code works.
AcquireResourceShared
Figure 3 contains the AcquireResourceShared function. As the name implies, this function attempts to acquire the OSREResource shared and if it cannot, waits if the user indicated wait for the resource to become available. Remember that our code acquires m_CriticalSection to ensure proper synchronization when access the class data structures.
BOOLEAN OSREResource::AcquireResourceShared(BOOLEAN Wait) { DWORD cThread = GetCurrentThreadId(); ULONG loopCount = 0; EnterCriticalSection(&m_CriticalSection); m_SharedWaitersCount++; while(TRUE) { // See if we already own the lock. // CMapThreadToLockState::iterator ite = m_LockStateMap.find(cThread); if(ite != m_LockStateMap.end()) { // // We already own it, let's see what access we already have. // ite->second.AcquireCount++; if(ite->second.State & ACQUIRED_SHARED) { // We already have it shared, so just bump the count. // m_SharedCount++; } else if(ite->second.State & ACQUIRED_EXCLUSIVE) { // We have it exclusive, so just give it to the caller exclusive // m_ExclusiveCount++; } m_SharedWaitersCount--; LeaveCriticalSection(&m_CriticalSection); return TRUE; } else if(m_ExclusiveCount || m_ExclusiveWaitersCount) { // Someone else owns it exclusive or there are exclusive waiters, so we have to wait. // if(!Wait) { m_SharedWaitersCount--; LeaveCriticalSection(&m_CriticalSection); break; } LeaveCriticalSection(&m_CriticalSection); WaitForSingleObject(m_WaitEventHandle,INFINITE); // Add a wait loop to prevent this thread from hogging the CPU. The event gets set // when the holder releases the lock and we have to ensure that everyone gets a chance // to run before continuing. // loopCount++; if(loopCount == 5) { Sleep(200); loopCount = 0; } EnterCriticalSection(&m_CriticalSection); continue; } else { // Nobody has it or wants it, so we will take it. We reset the event, so that it is no // longer signaled. Anyone who was waiting for the event should already be woken up. // OSRERESOURCE_STATE state; state.State = ACQUIRED_SHARED; state.AcquireCount = 1; m_SharedCount++; m_SharedWaitersCount--; m_LockStateMap.insert(std::make_pair(cThread,state)); ResetEvent(m_WaitEventHandle); LeaveCriticalSection(&m_CriticalSection); return TRUE; } } return FALSE; }
This routine first looks to see if the calling thread already holds the resource by looking for the threads’ Id in the m_LockStateMap. If the thread is found, then the code updates the ownership counts and returns TRUE to the caller indicating that the resource has been acquired.
If the thread Id was not found in the m_LockStateMap, the code then checks to see if the resource is held exclusively or if there are threads waiting for the resource exclusively. If there are, then this indicates that the caller cannot acquire the resource. What is done next depends on the input Wait parameter. If this parameter is TRUE, then the caller will wait on the m_WaitEventHandle. If this parameter is FALSE, the code will return to the caller indicating that the resource was not acquired.
Finally, if the resource is not held exclusively, or if there are no threads waiting for the resource exclusively, then the code inserts an entry into the m_LockStateMap. This entry indicates how the resource was acquired, updates the appropriate counts in the class, and returns to the caller indicating that the resource was acquired.
There is one thing that we must highlight here and it is that the loop code follows the WaitForSingleObject. Keep in mind that there could be many threads, with various scheduling priorities, wanting to acquire the OSREResource in different ways, all waiting for it to become free. If we automatically gave acquisition of the OSREResource to the first thread through the code, we could potentially starve other waiters, since we know that highest priority will almost always get the CPU first. We didn’t want to keep track of the waiters in a list and heuristically pick the next owner, so we decided to make it somewhat random (BTW, if you don’t like it, change it and let us know what your solution is and why it is better!). Therefore we have all threads coming out of the WaitForSingleObject sleeping for about 200 milliseconds, to let all threads that were blocked have time to run through the acquire code again and reassess their access to the lock.
That’s all there is to acquiring the OSREResource shared, so let’s see how you release it.
ReleaseResource
Figure 4 contains the code for ReleaseResource. As the name implies, this code releases an owned OSREResource. Remember that our code acquires m_Critical Section to ensure proper synchronization when accessing the class data structures.
VOID OSREResource::ReleaseResource() { DWORD cThread = GetCurrentThreadId(); EnterCriticalSection(&m_CriticalSection); // // See if we own the lock. We'd better..... // CMapThreadToLockState::iterator ite = m_LockStateMap.find(cThread); if(ite != m_LockStateMap.end()) { // // We own it. Decrement the ownership count based upon // our access. // if(ite->second.State & ACQUIRED_SHARED) { m_SharedCount--; } else { m_ExclusiveCount--; } // // Look at our ownership count in the lowpart of the // large integer. If it is 1, then this is our // last reference to the lock, so we will delete // ourselves from the ownership map. // if(ite->second.AcquireCount == 1) { // // erase us, we no longer own the lock. // m_LockStateMap.erase(ite); // // Wake up anyone waiting on access to the lock. // SetEvent(m_WaitEventHandle); } else { // // We still have outstanding references.... // ite->second.AcquireCount--; } LeaveCriticalSection(&m_CriticalSection); } else { RaiseException(0xC0000002,EXCEPTION_NONCONTINUABLE,0,NULL); } }
The first step for the code is to attempt to find the calling thread in the m_LockStateMap, which keeps track of the threads that are currently holding the OSREResource. If the calling thread is not found in the m_LockStateMap structure, the code will raise a 0xC0000002 exception to indicate that the OSREResource was not held by the caller. If the caller was found in the m_LockStateMap, the code will decrement either the m_Shared Count or m_ExclusiveCount field, depending upon how the caller had acquired the OSREResource.
The code then looks at whether or not this is the calling thread’s last reference to the OSREResource, which would indicate that the OSREResource is now free to be acquired by other threads. It checks for the last reference by looking at the AcquiredCount field of the returned OSRERESOURCE _STATE structure. If the AcquireCount is not 1, then the code knows that the OSREResource is still not available to other acquirers. If however the AcquireCount field is 1, then the code knows this is the last reference to the OSREResource, at which point it can delete the calling threads entry from the m_LockStateMap, and can signal the OSREResource’s event (m_WaitEventHandle). Signaling the event will cause all the other threads waiting for this OSREResource to be awakened so that they can attempt to acquire the OSREResource.
Problems
As we mentioned earlier, the code will raise different exceptions if it determines that something bad is going on. These were added to aid in debugging problems that you may run into and are shown in Figure 5. As for deadlocks, the only way to debug those is to examine each thread in your application and see what it is waiting for in hopes that you can determine the deadlock.
Summary
So there you have it, an OSR implementation of Executive Resources in user mode. As you can see the code is pretty simple, and provides the advantages of ERESOURCEs to user mode, i.e. a true reader/writer lock.