Mitigating DLL Injection Through Load Order Hijacking

--[ 1. Introduction

The Windows process model is remarkably unrestricted. From an engineering
perspective, this openness affords considerable flexibility in your system
design. Why _wouldn't_ you want to throw together a quick and dirty IPC
mechanism using window messages, or inject new code directly into a
running process?

However, it is this same flexibility that is often abused by rogue
software developers or attackers to subvert applications into performing
actions which were not originally intended by the developer. These
techniques are commonly used for things like cheating in video games,
bypassing software licensing restrictions, and for process and behavioral
monitoring in personal security products (Antivirus software).

But this flexibility comes at a significant cost to software vendors and
end users. In the case of window messaging, the technique was so heavily
abused by hackers in so-called "Shatter Attacks" that with the release of
Windows Vista, Microsoft produced one of their most effective attack
mitigations to date: User Interface Privilege Isolation (UIPI), which
blocks most window messages sent to processes running at a higher
integrity level. This change to the Windows security model wiped out an
entire class of Privilege Escalation attacks overnight.

But one facet of the Windows process model that Microsoft has shown itself
less willing to eradicate is the Win32 WriteProcessMemory and
CreateRemoteThread APIs, the latter of which allows one process to
instantiate a new thread of execution within the context of another. Over
the years, there have been modifications to the behavior of these APIs,
notably the inability to interact with processes running in a different
terminal session. However, the vast majority of Windows system software
runs in the primary session at Medium IL. And herein lies the problem.

--[ 2. DLL Injection 101

Writing to memory and creating a thread is sufficient in and of itself to
execute code within the context of another process. Indeed, simply
allocating a chunk of Read/Write/eXecute (RWX) memory, writing code to it,
and starting execution at that address would suffice (modulo Arbitrary
Code Guard mitigations). In practice, however, this is an exceptionally
time-consuming way to build and maintain a working feature. It is much
more common to let the Windows loader do most of the heavy lifting for
you. It is for this reason that pretty much every invocation of
CreateRemoteThread simply points to LoadLibrary with a pointer to the path
of a DLL to load.

This whole flow is facilitated by a significant weakness in the Windows
implementation of Address Space Layout Randomisation (ASLR): the base
addresses of system DLLs are randomised once per boot rather than per
process, so a DLL such as kernel32 is mapped at the same address in every
process. Code injection therefore generally follows this pattern:

    [1] processHandle = OpenProcess(PROCESS_ALL_ACCESS, ..., PID)
    [2] dllPath = VirtualAllocEx(processHandle, ..., PAGE_READWRITE)
    [3] WriteProcessMemory(processHandle, dllPath, PATH_TO_DLL, ...)
    [4] hKernel32 = GetModuleHandleW(L"kernel32")
    [5] loadLibrary = GetProcAddress(hKernel32, "LoadLibraryW")
    [6] CreateRemoteThread(processHandle, ..., loadLibrary, dllPath, ...)

First, we call OpenProcess[1] to obtain a handle to the process into which
we want to inject. We will use this handle to interact with the process
throughout the rest of this flow. We then need to allocate a buffer within
the process's address space which will contain the path to the DLL as a
string. We achieve this by making a call to VirtualAllocEx[2] (the "Ex"
version accepts a target process handle as its first argument), then
calling WriteProcessMemory[3] with the newly allocated buffer in the
second parameter and the DLL path in the third.

Next we need to ascertain the address of the LoadLibrary routine, which is
exported from kernel32.dll (as LoadLibraryA and LoadLibraryW).
Fortunately, due to the aforementioned limitation of Windows ASLR, we can
expect kernel32 to be mapped at the exact same base address in our own
address space as in the target. As such, we simply obtain a handle to our
own copy with GetModuleHandleW[4] and use GetProcAddress[5] to obtain the
address.

Finally, we can call CreateRemoteThread[6] with the address of LoadLibrary
in the "lpStartAddress" parameter and the address of the DLL path string
in "lpParameter".

Once the load completes, the Windows loader calls the library's
initialisation routine, DllMain, and developers can place the required
logic there. The syntax of DllMain is documented on MSDN as:

        BOOL WINAPI DllMain(
    [1]     _In_ HINSTANCE hinstDLL,
    [2]     _In_ DWORD     fdwReason,
            _In_ LPVOID    lpvReserved
        );

DllMain is called with the handle to the current DLL instance (its base
address) in parameter hinstDLL[1], and the reason that DllMain is being
invoked in fdwReason[2]. The latter is important because DllMain is
invoked by the loader when the DLL is first loaded (after dynamic linking
is completed), when the DLL is unloaded, and also at the creation and
destruction of threads (to allow the DLL to initialise some thread-local
data). Typically, DLLs used in injection make use of the first of these:
DLL_PROCESS_ATTACH.

For brevity, this article will mostly cover usermode DLL injection, but it
is important to note that DLLs can also be injected from the Kernel.
Indeed, kernelmode injection is more common in products such as AVs.

--[ 3. LdrRegisterDllNotification

As a medium integrity process running on a Windows device in the default
session, there is very little that you can do to prevent such injection
attacks (and even less that you can do to prevent attacks from the
Kernel). However, a scantly-documented API exists hidden in the depths of
ntdll called LdrRegisterDllNotification.
This routine (designed for use in usermode drivers incorporating KDM)
allows a usermode application to register a callback which is invoked
every time the loader maps a new DLL into memory. Once registered, the
loader will call the specified callback on the newly created thread once
the new DLL has been mapped into memory, but before DllMain is called. The
callback then has the opportunity to perform some action. At the most
basic level, the callback routine can:

    a) Check that this is not a DLL that is required under normal
       operation (it will get called at load time of _all_ DLLs,
       including Windows platform libraries), then
    b) Simply call TerminateThread on the current thread to prevent it
       from continuing execution into the library's DllMain.

However, this introduces a race condition. The application needs to call
LdrRegisterDllNotification early enough in its startup that the injector
cannot begin the process of creating a remote thread before the callback
is registered. Even worse, injections performed from the Kernel are often
triggered as a result of a callback registered through
PsSetCreateProcessNotifyRoutine(Ex), which is called in the context of the
newly created process's main thread _before_ the main routine is called,
making it impossible to win the race by registering the callback in main.

--[ 4. DLL Load Order Hijacking

So now we are in a position where we need to register for a callback
_before_ we get a chance to execute any code in main. Fortunately, due to
a quirk in the Windows loader, we actually get that chance if we perform
the call to LdrRegisterDllNotification in a DLL loaded immediately (not
delay-loaded) by the exe.

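As a rough illustration of step a) from section 3, the check could be a
case-insensitive comparison of the DLL's base name against a list of
libraries the application expects to load. This is only a sketch under
stated assumptions: the allowlist contents and the names is_expected_dll
and name_equals_ci are hypothetical and do not come from the sample in
section 5, and a name-only check is weaker than validating the full path.
The name is passed as a buffer plus a length in characters because the
BaseDllName field of the documented LDR_DLL_LOADED_NOTIFICATION_DATA
structure is a counted UNICODE_STRING rather than a NUL-terminated string.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical allowlist: base names of DLLs this application expects to
 * load. A real list would be derived from the application's own imports. */
static const wchar_t *const k_expected_dlls[] = {
    L"kernel32.dll",
    L"user32.dll",
    L"advapi32.dll",
};

/* Case-insensitive comparison of a counted string against a
 * NUL-terminated one. Returns 1 on an exact match, 0 otherwise. */
static int name_equals_ci(const wchar_t *name, size_t len,
                          const wchar_t *expected)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (expected[i] == L'\0' ||
            towlower((wint_t)name[i]) != towlower((wint_t)expected[i])) {
            return 0;
        }
    }
    /* Both strings must end at the same position. */
    return expected[len] == L'\0';
}

/* Returns 1 if the base name is on the allowlist, 0 otherwise. */
int is_expected_dll(const wchar_t *name, size_t len)
{
    size_t i;
    size_t count = sizeof(k_expected_dlls) / sizeof(k_expected_dlls[0]);

    for (i = 0; i < count; i++) {
        if (name_equals_ci(name, len, k_expected_dlls[i])) {
            return 1;
        }
    }
    return 0;
}
```

Inside the notification callback, this would presumably be called with
NotificationData->Loaded.BaseDllName->Buffer and
BaseDllName->Length / sizeof(WCHAR); a return value of 0 is the case in
which step b) would terminate the thread.
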
The DllMain routine of each DLL is called immediately after dynamic
linking is completed, and before the call to the executable's entry point
(technically a shim added by the compiler and runtime which initialises
the standard library, gathers arguments and calls main, but for the sake
of simplicity we will refer to this whole bootstrapping as main).
Crucially, even Kernel callbacks registered through
PsSetCreateProcessNotifyRoutine do not get notified until after dynamic
linking is completed. It is for this reason that we can mitigate many
instances of basic DLL injection through the use of a tiny DLL which
registers a LdrRegisterDllNotification callback in DllMain, where the
callback simply calls TerminateThread.

This technique does not account for the (not uncommon) model of injecting
a DLL with multiple exports. Here the first thread is a simple call to
LoadLibrary as seen in the flow in section 2, and a second call to
CreateRemoteThread is used to call into one of the exported functions. For
this reason, our technique can be further improved by parsing the Export
Address Table in the newly injected DLL, finding the offsets of all
exported functions, and patching them all with a single-byte Return (RET)
instruction. This way, when the DllMain and other routines are called,
they simply return harmlessly without executing their payload.

--[ 5. Sample code

The following is a sample DLL which performs the technique described
above. (Truncated for brevity)

    VOID CALLBACK OnLoad(
        _In_     ULONG                      NotificationReason,
        _In_     PLDR_DLL_NOTIFICATION_DATA NotificationData,
        _In_opt_ PVOID                      Context
    )
    {
        /* Only act on load notifications, not unloads. */
        if (NotificationReason != LDR_DLL_NOTIFICATION_REASON_LOADED) {
            return;
        }

        ...

        /* Locate the export directory of the newly mapped image. */
        ULONG exports_dir_va = nt_header->OptionalHeader
            .DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress;
        PIMAGE_EXPORT_DIRECTORY exports_dir =
            (PIMAGE_EXPORT_DIRECTORY)((PBYTE)base_address + exports_dir_va);

        /* The name ordinal table holds 16-bit entries, so it must be
         * indexed as an array of WORDs, not DWORDs. */
        PWORD address_of_name_ordinals = (PWORD)((PBYTE)base_address +
            exports_dir->AddressOfNameOrdinals);
        PDWORD address_of_functions = (PDWORD)((PBYTE)base_address +
            exports_dir->AddressOfFunctions);

        for (DWORD i = 0; i < exports_dir->NumberOfNames; i++) {
            WORD name_ordinal = address_of_name_ordinals[i];
            PBYTE function =
                (PBYTE)base_address + address_of_functions[name_ordinal];
            DWORD oldProtect;

            /* Make the first byte of the export writable, overwrite it
             * with RET (0xc3), then restore the original protection. */
            BOOL success = VirtualProtect(
                function, 1, PAGE_EXECUTE_READWRITE, &oldProtect
            );
            if (success) {
                *function = 0xc3;
                VirtualProtect(function, 1, oldProtect, &oldProtect);
            }
        }
    }

    BOOL APIENTRY DllMain(
        _In_     HMODULE hModule,
        _In_     DWORD   ul_reason_for_call,
        _In_opt_ LPVOID  lpReserved
    )
    {
        HMODULE ntdll;
        LdrRegisterDllNotificationPtr ldrRegisterDllNotification;

        if (ul_reason_for_call == DLL_PROCESS_ATTACH) {
            /* ntdll is already mapped into every process, so look up the
             * existing module instead of calling LoadLibrary from
             * within DllMain. */
            ntdll = GetModuleHandleW(L"ntdll.dll");
            if (ntdll == NULL) {
                return TRUE;
            }

            /* GetProcAddress returns FARPROC; cast to the callback
             * registration prototype before use. */
            ldrRegisterDllNotification = (LdrRegisterDllNotificationPtr)
                GetProcAddress(ntdll, "LdrRegisterDllNotification");
            if (ldrRegisterDllNotification == NULL) {
                return TRUE;
            }

            ldrRegisterDllNotification(0, &OnLoad, NULL, &g_Cookie);
        }

        return TRUE;
    }

--[ 6. Limitations

As previously mentioned, the Windows process model is exceptionally
forgiving, and while the mitigation outlined above will function for the
vast majority of DLL injections in use today, it is not infallible. Other
processes are entirely free to reach in and modify process memory, and can
still create threads at will. There is nothing to stop the injector simply
resetting the patched return instruction back to its original value, or
even simply allocating some executable memory and jumping to it.

The truth of the matter is that it is effectively impossible to prevent
DLL injection, especially from Kernel mode, and any mitigation that we put
in place is merely a speed bump to a determined attacker. However, it is
certainly possible to make the task more difficult by loading your process
with mitigations such as this in order to deter applications with less
conviction.

--[ 7. Resources

https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createremotethread
https://learn.microsoft.com/en-us/windows/win32/dlls/dllmain
https://ferreirasc.github.io/PE-Export-Address-Table/