# Leveraging PE parsing techniques to write x86 shellcode

## 1. Introduction

Shellcode is often delivered alongside an exploit to subvert a running program, or by an injector performing process injection. To work reliably across different Windows versions, it must locate the Win32 API functions it needs dynamically. For that task, it typically relies on [**LoadLibraryA**](https://docs.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-loadlibrarya) and [**GetProcAddress**](https://docs.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-getprocaddress), both exported by "**kernel32.dll**".

In this post, we will explore the world of Win32 shellcode development using what we learned in the [**previous**](https://mohamed-fakroud.gitbook.io/t3nb3w/peb) post about the **PEB** structure. Specifically, we will see how shellcode takes advantage of the **PE parsing** technique.

Everything will be done directly in the debugger via **IDA Pro**, which also lets us grab the opcodes easily and test our shellcode step by step. Finally, to truly understand the structure of **kernel32.dll**, we will use **CFF Explorer** to view the contents of this precious DLL. Now, let's fasten our seat belts and start!

## 2. Finding Kernel32 Base Address <a href="#finding-kernel32-base-address" id="finding-kernel32-base-address"></a>

In our previous post, "Digging into Windows PEB", we concluded that whenever an executable is loaded into memory, Windows also loads the core libraries ***kernel32.dll*** and ***ntdll.dll*** alongside it and records their base addresses in the loader data reachable from the PEB. The figure below describes the data structures that are followed to find **the base address** of kernel32.dll:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaNgpKjF4N3NOj8H3la%2F-MaNieMjeK_0t8DNQdv9%2Fbaseaddress.png?alt=media\&token=e7e83440-b3c9-4a70-8628-70584957ae6b)

So, we will retrieve the base address of ***kernel32.dll*** from the PEB, as shown in the following assembly code:

```cpp
BITS 32

global _start

section .text

_start:
	xor eax, eax              ; Avoid null bytes
	mov eax, [fs:eax + 0x30]  ; EAX = PEB
	mov eax, [eax + 0xc]      ; EAX = PEB->Ldr
	mov esi, [eax + 0x14]     ; ESI = PEB->Ldr.InMemoryOrderModuleList
	lodsd                     ; EAX = second module (ntdll.dll)
	xchg eax, esi             ; move to the next element
	lodsd                     ; EAX = third module (kernel32.dll)
	mov eax, [eax + 0x10]     ; EAX = base address
```

{% hint style="info" %}
To understand this assembly code, take a look at my previous post.
{% endhint %}
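
To make the pointer chasing easier to follow, here is a minimal C sketch that replays the same steps on a toy 32-bit address space. Everything in it is an assumption made for illustration: the buffer `mem`, the helper names, the fake addresses and the fake module bases. Only the offsets (0x0C, 0x14, 0x08, 0x18, 0x10) come from the 32-bit structures used by the assembly.

```c
#include <stdint.h>
#include <string.h>

/* Toy 32-bit address space: "addresses" are offsets into this buffer.
   All addresses and module bases below are made up for illustration. */
static uint8_t mem[0x400];

uint32_t rd32(uint32_t addr) {
    uint32_t v;
    memcpy(&v, &mem[addr], sizeof v);
    return v;
}

void wr32(uint32_t addr, uint32_t v) {
    memcpy(&mem[addr], &v, sizeof v);
}

/* Lay out a fake PEB, PEB_LDR_DATA and three LDR_DATA_TABLE_ENTRY records
   (exe, ntdll.dll, kernel32.dll) using the 32-bit offsets from the listing. */
void build_fake_process(void) {
    const uint32_t peb = 0x100, ldr = 0x140;
    const uint32_t entry[3] = { 0x200, 0x240, 0x280 };
    const uint32_t base[3]  = { 0x00400000, 0x77000000, 0x75720000 };

    wr32(peb + 0x0C, ldr);                 /* PEB->Ldr */
    wr32(ldr + 0x14, entry[0] + 0x08);     /* InMemoryOrderModuleList.Flink */
    for (int i = 0; i < 3; i++) {
        uint32_t next = (i < 2) ? entry[i + 1] + 0x08 : ldr + 0x14;
        wr32(entry[i] + 0x08, next);       /* InMemoryOrderLinks.Flink */
        wr32(entry[i] + 0x18, base[i]);    /* DllBase */
    }
}

/* Same steps as the assembly, starting from the PEB address. */
uint32_t find_kernel32_base(uint32_t peb) {
    uint32_t eax = rd32(peb + 0x0C);       /* EAX = PEB->Ldr */
    uint32_t esi = rd32(eax + 0x14);       /* ESI = first list entry (the exe) */
    eax = rd32(esi);                       /* lodsd: EAX = second entry (ntdll) */
    esi = eax;                             /* xchg eax, esi */
    eax = rd32(esi);                       /* lodsd: EAX = third entry (kernel32) */
    return rd32(eax + 0x10);               /* 0x18 - 0x08 = 0x10 -> DllBase */
}
```

The list pointers land on the `InMemoryOrderLinks` field (offset 0x08) of each entry rather than on the start of the entry, which is why `DllBase` (offset 0x18) is read at `+ 0x10`.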

With this assembly code, we can find the kernel32.dll base address and store it in the **eax** register. Next, we assemble it with **nasm**. Since the program is written in x86 assembly, we select the `elf32` output format with the `-f` flag, then disassemble the result with **objdump** to get the opcodes:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MayvsBgK9jZckJhwO3N%2F-Mayw0A6cQsqjbxMW7zP%2Fedit.png?alt=media\&token=3b116ca8-1e7e-4bdc-8d73-476575d84995)

Now, let's test our shellcode in the context of a C program. The opcodes are placed in a small test program (named `runner.c` in this example), as shown below:

```c
#include <windows.h>

const char main[] = "\x31\xc0\x64\x8b\x40\x30\x8b\x40\x0c\x8b\x70\x14\xad\x96\xad\x8b\x40\x10";
```

This program should be compiled and run under IDA Pro for debugging purposes:

![Click to visualize EAX value holding base address of kernel32.dll](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaNvTg3o49cGC90kFAy%2F-MaO8HYIHvCKCG5lHJMg%2FAnimation.gif?alt=media\&token=5e71a63f-d0c7-40bb-90bf-61c53a7433a5)

Now, the **eax** register holds the address **0x75720000**, which indicates that we got the **base address** of **kernel32.dll** successfully. We can confirm this result because the address points at **e\_magic**, the first member of the **MS-DOS** header of kernel32.dll:

![Pointing into e\_magic](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaOKBvzh6aIKITjZrFD%2F-MaOKR1rPrf2CV0iSmtv%2FMZ.png?alt=media\&token=606a3944-689a-4381-8eab-7f22e88c8d85)

The first field, **e\_magic**, is also called the magic number. This field is used to identify an MS-DOS-compatible file type. All MS-DOS-compatible executable files set this value to **0x5A4D**, which represents the ASCII characters **MZ**. At this point, we have retrieved the memory address where kernel32.dll is loaded!

## 3. Finding the export table of Kernel32.dll <a href="#finding-kernel32-base-address" id="finding-kernel32-base-address"></a>

Before diving into this part, I would like to highlight some necessary definitions:

* Relative Virtual Address (RVA): In an image file, this is the address of an item after it is loaded into memory, with the base address of the image file subtracted from it. The RVA of an item almost always differs from its position within the file on disk (file pointer). `RVA = VA - BaseAddress`
* Virtual Address (VA): Same as RVA, except that the base address of the image file is not subtracted. The address is called a VA because Windows creates a distinct VA space for each process, independent of physical memory. For almost all purposes, a VA should be considered just an address. A VA is not as predictable as an RVA because the loader might not load the image at its preferred location. `VA = RVA + BaseAddress`

We found the base address of kernel32.dll in memory. Now we need to parse this PE file and find the export directory:

```c
	mov ebx, [eax + 0x3c]  ; RVA of PE signature
	add ebx, eax           ; VA of PE signature
	mov ebx, [ebx + 0x78]  ; RVA of the export directory
	add ebx, eax           ; VA of the export directory
	mov esi, [ebx + 0x20]  ; RVA of the exported function names table
	add esi, eax           ; VA of the exported function names table
```

{% hint style="info" %}
**e\_lfanew** is a 4-byte offset into the file where the PE file header is located. It is necessary to use this offset to locate the PE header in the file.
{% endhint %}

(Lines 1-2) We know that we can find the "**e\_lfanew**" field at offset **0x3C**:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaOlHhAXZrmde06-pMQ%2F-MaOmU0Ck72QjMvlv0J7%2Fel_fanew.png?alt=media\&token=d938007c-e98d-429d-992d-886bcef713f5)

After the instruction **`mov ebx, [eax + 0x3c]`**, **ebx** should hold the value **0xF8**, as depicted in the following figure:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaOotwi7lgwlq1Y3ObJ%2F-MaOovKNra2Wn0ai7CF8%2FF8.png?alt=media\&token=ccaa2787-e671-46bb-84dc-4d9aa2c3028f)

Now, we can find the address of the PE signature by adding the kernel32 base address and the PE signature RVA: 0x75720000 + 0xF8 = **0x757200F8**, and we do find the PE signature there:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaOr-Wby15iJsVa6xsh%2F-MaOrOXk6D_Yyir_rg7B%2Fsignature.png?alt=media\&token=27826748-4ca3-4246-bd8e-9c3df0daafe7)

As you know, the **PE header** is a structure that contains the following information:

```c
typedef struct _IMAGE_NT_HEADERS {
  DWORD                   Signature;
  IMAGE_FILE_HEADER       FileHeader;
  IMAGE_OPTIONAL_HEADER32 OptionalHeader;
} IMAGE_NT_HEADERS32, *PIMAGE_NT_HEADERS32;
```

The **Signature** member identifies the file as a PE image. Its value is **0x00004550** (in memory the bytes appear as **50 45 00 00** because x86 is **little-endian**), which represents the ASCII characters "**PE**" followed by two null bytes, as you can see above in our debugging session.
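
The two checks we just did by hand in the debugger (the **MZ** magic, then the **PE** signature found through **e\_lfanew**) can be sketched in C as follows. This is only an illustration on an in-memory buffer: the function names and the sample image builder are hypothetical, and the value 0xF8 is simply the **e\_lfanew** we observed above.

```c
#include <stdint.h>
#include <string.h>

#define DOS_MAGIC    0x5A4Du      /* "MZ" read as a little-endian WORD */
#define PE_SIGNATURE 0x00004550u  /* "PE\0\0" read as a little-endian DWORD */

/* Read little-endian values byte by byte so the result does not depend
   on the endianness of the machine running this sketch. */
uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the offset of the PE signature, or 0 if the image is not valid. */
uint32_t find_pe_signature(const uint8_t *image, size_t size) {
    if (size < 0x40 || read_le16(image) != DOS_MAGIC)
        return 0;                              /* no MZ header */
    uint32_t e_lfanew = read_le32(image + 0x3C);
    if (e_lfanew > size - 4)
        return 0;                              /* offset points outside the buffer */
    if (read_le32(image + e_lfanew) != PE_SIGNATURE)
        return 0;                              /* no PE\0\0 at e_lfanew */
    return e_lfanew;
}

/* Hypothetical sample: "MZ", e_lfanew = 0xF8, "PE\0\0"; needs size >= 0xFC. */
void make_sample_image(uint8_t *image, size_t size) {
    memset(image, 0, size);
    image[0] = 'M';
    image[1] = 'Z';
    image[0x3C] = 0xF8;
    memcpy(image + 0xF8, "PE\0\0", 4);
}
```

The shellcode skips these sanity checks to stay small; it trusts that the address taken from the PEB really is a loaded PE image.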

(Lines 3-4)  The ***IMAGE\_OPTIONAL\_HEADER*** is a structure containing more useful information for us:

```c
typedef struct _IMAGE_OPTIONAL_HEADER {
  WORD                 Magic;
  BYTE                 MajorLinkerVersion;
  BYTE                 MinorLinkerVersion;
  DWORD                SizeOfCode;
  DWORD                SizeOfInitializedData;
....
  IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER32, *PIMAGE_OPTIONAL_HEADER32;
```

Its most important member for us is **DataDirectory**, which describes tables such as the imported and **exported functions**.

At offset **0x78** from the start of the PE header (the PE signature), we can find the RVA of the **Export Directory**:

![EBX hold RVA of export Directory](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaP2oEWTt9AGtaoou2p%2F-MaP7lrXx8lAG9i_OQh-%2Fexport%20directory.png?alt=media\&token=9c756373-6c7a-421d-b0c3-bc2eb3c47ba5)

Most of you will ask how we got this **offset**. It is very simple:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaP-DiBUv-qUPDfSVSt%2F-MaP2A3jfpucObSLwq0E%2F78.png?alt=media\&token=37f86f0d-29b2-4003-8721-ec7b391d9985)

or, equivalently: **sizeof(PE\_Signature) + sizeof(IMAGE\_FILE\_HEADER) + offsetof(IMAGE\_OPTIONAL\_HEADER, DataDirectory) = 4 + 20 + 96 = 120 bytes (0x78)**
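
If you prefer to let the compiler do the arithmetic, the sketch below redeclares the 32-bit headers with fixed-width types and computes the same sum. The type names (`FileHeader`, `OptionalHeader32`, `DataDirEntry`) are my own stand-ins so that the snippet builds without `windows.h`; the field order follows the 32-bit PE layout.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t VirtualAddress;
    uint32_t Size;
} DataDirEntry;

typedef struct {
    uint16_t Machine;
    uint16_t NumberOfSections;
    uint32_t TimeDateStamp;
    uint32_t PointerToSymbolTable;
    uint32_t NumberOfSymbols;
    uint16_t SizeOfOptionalHeader;
    uint16_t Characteristics;
} FileHeader;                                  /* 20 bytes */

typedef struct {
    uint16_t Magic;
    uint8_t  MajorLinkerVersion;
    uint8_t  MinorLinkerVersion;
    uint32_t SizeOfCode;
    uint32_t SizeOfInitializedData;
    uint32_t SizeOfUninitializedData;
    uint32_t AddressOfEntryPoint;
    uint32_t BaseOfCode;
    uint32_t BaseOfData;
    uint32_t ImageBase;
    uint32_t SectionAlignment;
    uint32_t FileAlignment;
    uint16_t MajorOperatingSystemVersion;
    uint16_t MinorOperatingSystemVersion;
    uint16_t MajorImageVersion;
    uint16_t MinorImageVersion;
    uint16_t MajorSubsystemVersion;
    uint16_t MinorSubsystemVersion;
    uint32_t Win32VersionValue;
    uint32_t SizeOfImage;
    uint32_t SizeOfHeaders;
    uint32_t CheckSum;
    uint16_t Subsystem;
    uint16_t DllCharacteristics;
    uint32_t SizeOfStackReserve;
    uint32_t SizeOfStackCommit;
    uint32_t SizeOfHeapReserve;
    uint32_t SizeOfHeapCommit;
    uint32_t LoaderFlags;
    uint32_t NumberOfRvaAndSizes;
    DataDirEntry DataDirectory[16];            /* entry 0 = export directory */
} OptionalHeader32;

/* Offset, from the PE signature, of DataDirectory[0].VirtualAddress. */
size_t export_directory_entry_offset(void) {
    return sizeof(uint32_t)                    /* Signature            */
         + sizeof(FileHeader)                  /* IMAGE_FILE_HEADER    */
         + offsetof(OptionalHeader32, DataDirectory);
}
```

Note that this only holds for 32-bit images; the 64-bit optional header has a different layout, so the offset would not be 0x78 there.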

Again, we add the base address held in **eax** to this RVA, and **ebx** now points to **the export directory** of kernel32.dll.

The export directory is the following structure:

```c
typedef struct _IMAGE_EXPORT_DIRECTORY {
     DWORD   Characteristics;
     DWORD   TimeDateStamp;
     WORD    MajorVersion;
     WORD    MinorVersion;
     DWORD   Name;
     DWORD   Base;
     DWORD   NumberOfFunctions;
     DWORD   NumberOfNames;
     DWORD   AddressOfFunctions;     
     DWORD   AddressOfNames;        
     DWORD   AddressOfNameOrdinals;  
 };
```

The relevant fields in the **\_IMAGE\_EXPORT\_DIRECTORY**:

* **AddressOfFunctions** is an array of RVAs that point to the actual exported functions. It is indexed by an **export ordinal**. The shellcode needs to map the export name to the ordinal to use this array.

This mapping is done via **AddressOfNames** and **AddressOfNameOrdinals** arrays. These two arrays exist in parallel. They have the same number of entries, and equivalent indices into these arrays are directly related.

* **AddressOfNames** is an array of **32-bit** RVAs that point to the strings of symbol names.
* **AddressOfNameOrdinals** is an array of **16-bit ordinals**. For a given index id into these arrays, the symbol at AddressOfNames\[id] has the export ordinal value at AddressOfNameOrdinals\[id].
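
The relationship between the three arrays can be sketched in C with a tiny, made-up export table. The names and RVAs below are hypothetical, and for simplicity the names array holds C string pointers, whereas in a real image it holds RVAs of strings.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical export tables for an imaginary DLL; only the relationships
   between the arrays matter, the values themselves are made up. */
static const char *const names[]         = { "Alpha", "Beta", "Gamma" };
static const uint16_t    name_ordinals[] = { 2, 0, 1 };
static const uint32_t    functions[]     = { 0x1000, 0x2000, 0x3000 };

/* Returns the RVA exported under `name`, or 0 if the name is not found. */
uint32_t lookup_export_rva(const char *name) {
    size_t count = sizeof names / sizeof names[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(names[i], name) == 0) {
            uint16_t ordinal = name_ordinals[i];  /* same index, parallel array */
            return functions[ordinal];            /* ordinal indexes the address array */
        }
    }
    return 0;
}
```

The name index and the function index are deliberately different here, to show why the ordinal table is needed as the step in between.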

(Lines 5-6) The IMAGE\_EXPORT\_DIRECTORY structure holds, at offset **0x20**, the RVA of the exported function names table, which is **0x000945B4**:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaPcG1yyul8XiOU8g5b%2F-MaPiZNqcuVKDe7FZTSC%2FRVA_Name.png?alt=media\&token=812c344a-bc3d-480a-9856-e902eb590ce4)

Again :p most of you will ask how we got this **offset**. It is very simple:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaPl0hPIyYOxdpi4m6w%2F-MaPly4OhoEK-F37KjlE%2F20.png?alt=media\&token=e29a993a-7255-4101-90af-e9c491421a4d)
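
The same offsets can be double-checked with `offsetof`. The sketch below redeclares the export directory with fixed-width types (the name `ExportDirectory` is a stand-in so that it builds without `windows.h`) and exposes the offsets of the three array fields.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t Characteristics;
    uint32_t TimeDateStamp;
    uint16_t MajorVersion;
    uint16_t MinorVersion;
    uint32_t Name;
    uint32_t Base;
    uint32_t NumberOfFunctions;
    uint32_t NumberOfNames;
    uint32_t AddressOfFunctions;
    uint32_t AddressOfNames;
    uint32_t AddressOfNameOrdinals;
} ExportDirectory;

/* Offsets of the three RVA fields inside the export directory. */
size_t offset_of_functions(void)     { return offsetof(ExportDirectory, AddressOfFunctions); }
size_t offset_of_names(void)         { return offsetof(ExportDirectory, AddressOfNames); }
size_t offset_of_name_ordinals(void) { return offsetof(ExportDirectory, AddressOfNameOrdinals); }
```

These evaluate to 0x1C, 0x20 and 0x24 respectively, which are exactly the displacements used against **ebx** in the shellcode.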

Let's compute the address of the exported function names table by adding the Name Pointer Table RVA **0x000945B4** to the kernel32 base address **0x75720000**. The result is **0x757B45B4**, where we find the RVA of the first exported function's name, **0x00096BCA**:
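
Every `add reg, eax` in the listing is this same RVA-to-VA conversion. As a small sanity check, here it is in C with the values observed in this debugging session (the helper names are mine):

```c
#include <stdint.h>

/* VA = RVA + BaseAddress */
uint32_t rva_to_va(uint32_t base, uint32_t rva) {
    return base + rva;
}

/* RVA = VA - BaseAddress */
uint32_t va_to_rva(uint32_t base, uint32_t va) {
    return va - base;
}
```

With the base address 0x75720000, `rva_to_va` maps 0xF8 to 0x757200F8 and 0x000945B4 to 0x757B45B4, matching what the debugger shows. On another machine or another boot the base address will most likely differ, which is precisely why the shellcode resolves it at runtime.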

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaPm-niTneMsWrU24d9%2F-MaPo8JvS85qe93lIpAq%2FRVAfirst.png?alt=media\&token=1ca56442-814d-4f4b-b49e-841911af92ef)

## 4. Find LoadLibraryA using Hashed export Names

It's not always a good idea to embed ASCII or Unicode strings, since they make our shellcode bigger and also easier to spot. It is better to use a hash value to look up our targeted Win32 API functions.

For that reason, we reuse [the C program from StackOverflow](https://stackoverflow.com/questions/1128150/win32-api-to-enumerate-dll-export-functions) that enumerates all the Win32 API functions exported by kernel32.dll (our assembly version does the same thing), and in every callback we generate a hash for the corresponding exported function name via the following code snippet:

```c
#include <Windows.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>


void EnumExportedFunctions (char *, void (*callback)(char*));
int Rva2Offset (unsigned int);

typedef struct {
    unsigned char Name[8];
    unsigned int VirtualSize;
    unsigned int VirtualAddress;
    unsigned int SizeOfRawData;
    unsigned int PointerToRawData;
    unsigned int PointerToRelocations;
    unsigned int PointerToLineNumbers;
    unsigned short NumberOfRelocations;
    unsigned short NumberOfLineNumbers;
    unsigned int Characteristics;
} sectionHeader;

sectionHeader *sections;
unsigned int NumberOfSections = 0;

int Rva2Offset (unsigned int rva) {
    int i = 0;

    for (i = 0; i < NumberOfSections; i++) {
        unsigned int x = sections[i].VirtualAddress + sections[i].SizeOfRawData;

        if (x >= rva) {
            return sections[i].PointerToRawData + (rva + sections[i].SizeOfRawData) - x;
        }
    }

    return -1;
}

void EnumExportedFunctions (char *szFilename, void (*callback)(char*)) {
    FILE *hFile = fopen (szFilename, "rb");

    if (hFile != NULL) {
        if (fgetc (hFile) == 'M' && fgetc (hFile) == 'Z') {
            unsigned int e_lfanew = 0;
            unsigned int NumberOfRvaAndSizes = 0;
            unsigned int ExportVirtualAddress = 0;
            unsigned int ExportSize = 0;
            int i = 0;

            fseek (hFile, 0x3C, SEEK_SET);
            fread (&e_lfanew, 4, 1, hFile);
            fseek (hFile, e_lfanew + 6, SEEK_SET);
            fread (&NumberOfSections, 2, 1, hFile);
            fseek (hFile, 108, SEEK_CUR);
            fread (&NumberOfRvaAndSizes, 4, 1, hFile);

            if (NumberOfRvaAndSizes == 16) {
                fread (&ExportVirtualAddress, 4, 1, hFile);
                fread (&ExportSize, 4, 1, hFile);

                if (ExportVirtualAddress > 0 && ExportSize > 0) {
                    fseek (hFile, 120, SEEK_CUR);

                    if (NumberOfSections > 0) {
                        sections = (sectionHeader *) malloc (NumberOfSections * sizeof (sectionHeader));

                        for (i = 0; i < NumberOfSections; i++) {
                            fread (sections[i].Name, 8, 1, hFile);
                            fread (&sections[i].VirtualSize, 4, 1, hFile);
                            fread (&sections[i].VirtualAddress, 4, 1, hFile);
                            fread (&sections[i].SizeOfRawData, 4, 1, hFile);
                            fread (&sections[i].PointerToRawData, 4, 1, hFile);
                            fread (&sections[i].PointerToRelocations, 4, 1, hFile);
                            fread (&sections[i].PointerToLineNumbers, 4, 1, hFile);
                            fread (&sections[i].NumberOfRelocations, 2, 1, hFile);
                            fread (&sections[i].NumberOfLineNumbers, 2, 1, hFile);
                            fread (&sections[i].Characteristics, 4, 1, hFile);
                        }

                        unsigned int NumberOfNames = 0;
                        unsigned int AddressOfNames = 0;

                        int offset = Rva2Offset (ExportVirtualAddress);
                        fseek (hFile, offset + 24, SEEK_SET);
                        fread (&NumberOfNames, 4, 1, hFile);

                        fseek (hFile, 4, SEEK_CUR);
                        fread (&AddressOfNames, 4, 1, hFile);

                        unsigned int namesOffset = Rva2Offset (AddressOfNames), pos = 0;
                        fseek (hFile, namesOffset, SEEK_SET);

                        for (i = 0; i < NumberOfNames; i++) {
                            unsigned int y = 0;
                            fread (&y, 4, 1, hFile);
                            pos = ftell (hFile);
                            fseek (hFile, Rva2Offset (y), SEEK_SET);

                            char c = fgetc (hFile);
                            int szNameLen = 0;

                            while (c != '\0') {
                                c = fgetc (hFile);
                                szNameLen++;
                            }

                            fseek (hFile, (-szNameLen)-1, SEEK_CUR);
                            char* szName = calloc (szNameLen + 1, 1);
                            fread (szName, szNameLen, 1, hFile);

                            callback (szName);

                            fseek (hFile, pos, SEEK_SET);
                        }
                    }
                }
            }
        }

        fclose (hFile);
    }
}

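void calculate_hash(char* szName) {
    DWORD hash = 0;
    size_t len = strlen(szName);

    /* hash = (hash << 1) + character, for every character of the name */
    for (size_t i = 0; i < len; i++) {
        hash <<= 1;
        hash += (unsigned char)szName[i];
    }
    printf("%s:0x%08lx\n", szName, (unsigned long)hash);
}

int main (int argc, char **argv) {
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <path-to-dll>\n", argv[0]);
        return 1;
    }
    printf("Loading %s\n", argv[1]);
    EnumExportedFunctions(argv[1], calculate_hash);
    return 0;
}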
```

As you can see above, the **calculate\_hash** function is basically a `for` loop that **shifts** the value in the **hash** variable **left by 1**, then adds **szName\[i]**, the current character of the exported function name, to it.
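
Here is the same hash as a standalone, portable C function, in case you want to precompute a value without running the whole enumerator. The function name `export_name_hash` is mine; the algorithm is the shift-and-add loop described above.

```c
#include <stdint.h>

/* hash = (hash << 1) + character, applied to every character of the name.
   The cast to uint8_t mirrors the assembly, which loads each byte into dl. */
uint32_t export_name_hash(const char *name) {
    uint32_t hash = 0;
    for (; *name != '\0'; name++) {
        hash <<= 1;
        hash += (uint8_t)*name;
    }
    return hash;
}
```

Feeding it "LoadLibraryA" gives 0x00059ba3 and "GetProcAddress" gives 0x0015bdfd, the two constants pushed by the shellcode in this post. Keep in mind that such a simple hash is not guaranteed to be collision-free, so it is worth checking the generated list for duplicates before relying on a value.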

Generate all hashes of exported functions that actually exist in kernel32.dll:

```c
.\dll_hash_calculator.exe C:\Windows\SysWOW64\kernel32.dll > hashes.txt
```

As a result:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaQ9knlkN7O3XJV37j-%2F-MaQ9py5Iws01gaJyKkL%2Fhash.png?alt=media\&token=99a81bfd-3b5b-435c-bd09-0c469a5f768b)

Let's first inspect the first exported function in memory using the following assembly code:

```c
	push 0x00059ba3     ; hash of LoadLibraryA
	xor ecx, ecx        ; prepare counter
	mov edx, eax        ; save eax into edx
	call _find_addr

_find_addr:
	inc ecx             ; increment name index counter
	lodsd               ; load name RVA into eax and advance esi by 4 to the next RVA
	add eax, edx        ; add kernel32 base address to get VA of function name
```

(Line 1) We push the precomputed hash value of **LoadLibraryA** on the stack, since we will use it later to find our targeted function.

(Line 2) We set the **ecx** register to 0. It will serve as a counter that lets us **map** the export name of the targeted Win32 API function (**LoadLibraryA**) to its ordinal, and from there retrieve its address in the **AddressOfFunctions** array.

(Line 3) We save the **eax** register, which holds the **base address** of kernel32.dll, into **edx**, because the upcoming **lodsd** will overwrite **eax**.

{% hint style="info" %}
The **lods** family (**lodsb**, **lodsw**, **lodsd**) loads a byte, word, or doubleword from the source operand (the memory pointed to by **esi**) into the AL, AX, or EAX register, respectively.
{% endhint %}

(Lines 6-9) We create a procedure called "***\_find\_addr***", represented by a label in our asm code. It uses **lodsd**, which reads from the address in **esi**, the pointer to the RVA of the first function name. The **lodsd** instruction places in **eax** the RVA of the function name ("**AcquireSRWLockExclusive**"), and we add **edx** (the kernel32 base address) to it to get a valid pointer. Note that **lodsd** also **increments esi by 4!** This helps us because we do not have to increment it manually; we just call lodsd again to get the next function name pointer:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaegCK4MaWjEmkitQDs%2F-MaehwZwZddGeYzztYEH%2Ffirstfunction.png?alt=media\&token=7755ad0f-0752-4e0a-9247-37b16ab4b224)

Remember that we are also incrementing the **ecx** register, which is the **counter** of the names visited so far and will give us **the function ordinal number**.

Next, we need to calculate the hash of every exported function name, as we did in the **C** version:

```c
_find_addr:
	inc ecx
	lodsd
	add eax, edx
	call _calculate_hash

_calculate_hash:
	push ecx
	push edx

	xor ecx, ecx
	mov edi, ecx
	mov edx, edi

	_loop:
		shl edi, 1                 ; hash <<= 1
		mov dl, BYTE [eax + ecx]   ; dl = current character
		add edi, edx               ; hash += character
		inc ecx                    ; move to the next character
		cmp BYTE [eax + ecx], 0    ; end of the string?
		jne _loop

	pop edx
	pop ecx
	ret
```

(Lines 7-13) We save the values of **ecx** and **edx** on the stack since we will need them later, then we clear **ecx**, **edi**, and **edx** in turn. We do not clear **eax**, since it now points to the **first byte** of the first exported function name:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MarwwNxVvV7Y1cq2xpK%2F-Mas0fOPXzxBRXO_VPdO%2Fedit1.png?alt=media\&token=ed94ec4f-62da-4b69-9ea9-3cc5fff8e617)

(Lines 15-21) The **\_loop** block is the asm equivalent of the C hash function shown previously. The instruction `shl edi, 1` shifts the value stored in **edi** left by one bit. We then load the current **byte** of the exported function name into **dl** (the low **8 bits** of **edx**) via `mov dl, BYTE [eax + ecx]`, where **ecx** tracks our position in the string, and add **edx** to **edi**. To detect the end of the name, we use `cmp BYTE [eax + ecx], 0`, since every exported function name is a **null-terminated string**, and we keep looping until **ZF** is set to **1**:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MarwwNxVvV7Y1cq2xpK%2F-Mas28A7Ief8-nFvWpVd%2Fedit2.png?alt=media\&token=ac173984-b782-46bc-8c19-e0711dc9774d)

Now, **edi** holds the hash of the first exported function name, **0x2A992F1D**. We can confirm that by grepping the hashes.txt file generated previously:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-Mak-UvQ9aNY7Shqg5cS%2F-Mak0SU7DfAiTuVGYKde%2Fhooo.png?alt=media\&token=4ff0c214-60e0-47eb-a3db-74db7dbdf885)

(Lines 23-25) Finally, we restore the register values that we pushed on the stack and return to **\_find\_addr** via the **ret** instruction.

Now, we need to check whether the value stored in **edi** matches the hash of **LoadLibraryA**, which we already pushed on the stack:

```c
_find_addr:
	inc ecx
	lodsd
	add eax,edx
	call _calculate_hash
	cmp edi, [esp + 4]
	jnz _find_addr
	ret
```

(Lines 1-5) Already explained previously.

(Lines 6-8) Keep in mind that whenever **\_calculate\_hash** is called, it returns the hash of an exported function name in the **edi** register. To check whether it is the hash of **LoadLibraryA**, we use the instruction `cmp edi, [esp + 4]` (**\[esp]** holds the return address of `call _find_addr`, so the pushed hash sits at **\[esp + 4]**) and keep looping until **ZF** is set to **1**. For debugging purposes, we set a breakpoint (**BP**) at the **ret** instruction:
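
For readers who think better in C, the search loop can be sketched like this. It is only an approximation of **\_find\_addr**: the sample name list is a hypothetical three-entry table, not the real kernel32 export list, and unlike the assembly the C version stops at the end of the table and returns 0 when nothing matches.

```c
#include <stddef.h>
#include <stdint.h>

/* Same shift-and-add hash as _calculate_hash. */
uint32_t hash_name(const char *name) {
    uint32_t hash = 0;
    for (; *name != '\0'; name++) {
        hash <<= 1;
        hash += (uint8_t)*name;
    }
    return hash;
}

/* Hypothetical, shortened name table used only for this illustration. */
const char *const sample_names[] = {
    "AcquireSRWLockExclusive", "GetProcAddress", "LoadLibraryA"
};

/* Mirrors _find_addr: the counter is incremented before each comparison,
   so it is 1-based when the loop stops. Returns 0 if no name matches. */
uint32_t find_by_hash(const char *const *names, size_t count, uint32_t target) {
    uint32_t ecx = 0;
    for (size_t i = 0; i < count; i++) {
        ecx++;                                 /* inc ecx */
        if (hash_name(names[i]) == target)     /* cmp edi, [esp + 4] */
            return ecx;
    }
    return 0;
}
```

With this toy table, searching for 0x00059ba3 stops with the counter at 3 and searching for 0x0015bdfd stops at 2. The counter is 1-based because, like the assembly, it is incremented before the comparison.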

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-Mas2SkCM4Flhjuq9sM5%2F-Mas6D28SIfH3ZwatB6f%2Fedit3.png?alt=media\&token=6a278ad0-f619-4832-a50c-52cac36f6b81)

At that point we have basically reached our goal, which is finding the export whose name hashes to the value of **LoadLibraryA**:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-ManeY4SHSn7XYQeUzI1%2F-Manw7begvm00w8wfWNw%2Fdone.png?alt=media\&token=6a8f4d59-a981-4880-a427-f02be623ad38)

* **eax** points to the beginning of the **LoadLibraryA** name string.
* **edi** holds the hash value of LoadLibraryA, **0x00059ba3**.
* The most precious value for retrieving **the address** of LoadLibraryA is:
  * **ecx** = **0x000003C6**, which is **the function ordinal number.**

## 5. Find the address of LoadLibraryA function

At this point, we have only found **the ordinal number** of the **LoadLibraryA** function, but we can use it to find the actual address of this function:

```c
call _get_addr
push edi

_get_addr:
	mov esi, [ebx + 0x24]         ; RVA of the name ordinals table
	add esi, edx                  ; VA of the name ordinals table
	mov cx, WORD [esi + ecx * 2]  ; get LoadLibraryA biased_ordinal
	dec ecx                       ; get LoadLibraryA ordinal
	mov esi, [ebx + 0x1c]         ; RVA of AddressOfFunctions = the Export Address Table
	add esi, edx                  ; VA of the Export Address Table
	mov edi, [esi + ecx * 4]      ; RVA of LoadLibraryA
	add edi, edx                  ; VA of LoadLibraryA
	ret
```

(Lines 5-6) At this point, **ebx** holds a pointer to the **IMAGE\_EXPORT\_DIRECTORY** structure. At offset **0x24** of the structure, we find the "**AddressOfNameOrdinals**" RVA. In line 6, we add **edx**, the base address of **kernel32.dll**, to this RVA, so we get a valid pointer to the name ordinals table. Some of you may ask about the logic behind **0x24**:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-Mao2Wl-kFacK8ZK_gOd%2F-Mao6AKAJyf3lv8fryxk%2Fordinal.png?alt=media\&token=992dbcc9-2d98-44ca-93c0-ca2ab82265d0)

(Lines 7-8) The **esi** register contains the pointer to **the name ordinals array**.

{% hint style="info" %}
**The name ordinals array** (export ordinal table) is an array of **16-bit** unbiased indexes into the export address table. Ordinals are biased by the Ordinal Base field of the export directory table. In other words, the ordinal base must be subtracted from the ordinals to obtain true indexes into the export address table.
{% endhint %}

This array contains **two-byte numbers**. Using the counter in **ecx** as an index into it, we load the **biased\_ordinal** of the LoadLibraryA function into **ecx**, and from it we derive the function ordinal (index). This will help us to get the function address.

{% hint style="info" %}
Some of you may be confused by the instruction `mov cx, [esi + ecx * 2]`. We want element number **ecx** = **0x3C6** of **the name ordinals array**, whose elements have type **T**: the address is \[arraystart + (**ecx** \* sizeof(**T**))] --> \[**esi** + **ecx** \* **2**], because the ordinal array stores each ordinal in **2 bytes** (sizeof(**T**) = 2). Finally, we store this value in **cx**, the 2-byte version of **ecx**.
{% endhint %}

We have to subtract **OrdinalBase** from **biased\_ordinal** to get **the ordinal number** of our function:

```
ordinal = biased_ordinal - OrdinalBase; //represented by the instruction dec ecx
```

In our case, **OrdinalBase** is equal to **1**, which is why a single `dec ecx` is enough:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaoQ3yz5eYbGzDQVxBX%2F-MaoSA7KnJudYvev_GyS%2Fbase.png?alt=media\&token=83518334-4899-438b-bddf-ff3403c63d3a)

We now have **the ordinal number** of **LoadLibraryA** stored in the **ecx** register, as depicted below:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaoSpeCzM4DWTxn4qrx%2F-MaoTw4vH_1osO0cE2km%2Fordinall.png?alt=media\&token=faff2a8f-d937-4146-8799-12d288b6dab7)

(Lines 9-10) At offset **0x1c**, we find the RVA of the "**Export Address Table**" array. We just add the **base address** of **kernel32.dll** and we are at the beginning of the array. Some of you may ask again about the logic behind **0x1c**:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaoUoaSsfe9YlliqAMS%2F-MaoWNTnzzXU6RTDSTBd%2Faddress.png?alt=media\&token=fae0e0ec-c593-4861-9a9f-b3b2a1503a13)

(Lines 11-12) Now that we have the correct index for the "**Export Address Table**" array in **ecx**, we can find the **LoadLibraryA** function pointer (**RVA** of LoadLibraryA) at the **AddressOfFunctions**\[**ecx**] location:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaoX-jO5LjmhzBorhfU%2F-MaoZ0RHpSIwzldyF0xF%2FRVALO.png?alt=media\&token=32bb10c4-85ce-4b67-b326-5e1f500138a1)

We use "**ecx** \* **4**" because each entry is **4 bytes** wide and **esi** points to the beginning of the array.

In the end, we add the base address, so **edi** holds the pointer to the **LoadLibraryA** function:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MaoZeZYyJwZoGSVAnGr%2F-Mao_J58JWSaB9B8tNtv%2F2021-05-28_210451.png?alt=media\&token=9ce88c23-d8ae-446d-9b22-565b40b44d48)

Finally, we have dynamically resolved the address of **LoadLibraryA** at runtime, which is **0x75A60BD0**, and the **ret** instruction returns from the **\_get\_addr** procedure.

(Line 2) We push the address of **LoadLibraryA** on the stack through `push edi` because we will use it later in our **main** procedure.

## 6. Find the address of GetProcAddress function

The steps are roughly the same as those we explained in depth in the previous section to find the **LoadLibraryA** address; however, I will clarify some instructions:

```cpp
_start:
	xor eax, eax              ; Avoid null bytes
	mov eax, [fs:eax + 0x30]  ; EAX = PEB
	mov eax, [eax + 0xc]      ; EAX = PEB->Ldr
	mov esi, [eax + 0x14]     ; ESI = PEB->Ldr.InMemoryOrderModuleList
	lodsd                     ; EAX = second module (ntdll.dll)
	xchg eax, esi             ; move to the next element
	lodsd                     ; EAX = third module (kernel32.dll)
	mov eax, [eax + 0x10]     ; EAX = base address

	mov ebx, [eax + 0x3c]     ; RVA of PE signature
	add ebx, eax              ; VA of PE signature
	mov ebx, [ebx + 0x78]     ; RVA of the export directory
	add ebx, eax              ; VA of the export directory
	mov esi, [ebx + 0x20]     ; RVA of the exported function names table
	add esi, eax              ; VA of the exported function names table
	mov edx, eax              ; save eax into edx
	push esi                  ; save the VA of the exported function names table

	push 0x00059ba3           ; hash of LoadLibraryA
	xor ecx, ecx              ; counter used to find the ordinal number of LoadLibraryA
	call _find_addr
	call _get_addr
	push edi                  ; save the address of LoadLibraryA


	mov esi, [esp + 8]        ; restore VA of the exported function names table


	push 0x0015bdfd           ; hash of GetProcAddress
	xor ecx, ecx              ; prepare counter
	call _find_addr
	call _get_addr
	push edi                  ; save the address of GetProcAddress
	
_get_addr: ...
_find_addr: ...
_calculate_hash: ...
```

First, note that the precomputed hash of GetProcAddress is **0x0015bdfd**:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MasJJYkXB1U4H73PfV-%2F-MasJUQI1tBVKuFdt3ge%2Fproc.png?alt=media\&token=a028f714-e43c-4c63-bacd-3558fefde462)

(Line 18) Why do we save the value of **esi** on the stack? As you know, the **esi** register holds the **address** of the **exported function names** table. Searching for **LoadLibraryA** overwrites this value, because each **lodsd** instruction **increments esi by 4.** Hence, we push it on the stack and restore it after finding the address of **LoadLibraryA** via the instruction `mov esi, [esp + 8]`. Then we can smoothly start the process of finding the **address** of **GetProcAddress**.

(Line 34) As before, we push the address of **GetProcAddress** on the stack through `push edi`, since we will use it later in our **main** procedure.

## 7. Load user32.dll library & Get **MessageBox** function address

We previously found the address of the **LoadLibraryA** function. We will now use it to load the "**user32.dll**" library, which contains the **MessageBox** function that we will use as a PoC for the technique discussed in this post:

```cpp
HMODULE LoadLibraryA(
  LPCSTR lpLibFileName
);
```

* `lpLibFileName` is the name of the module, which in our case is "**user32.dll**".

```cpp
_do_main:
	mov edi, [esp + 8]
	push "ll"
	push "32.d"
	push "user"
	push esp
	call edi
	
	push "oxA"
	push "ageB"
	push "Mess"
	push esp
	push eax
	mov edi, [esp + 32]
	call edi
```

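(Lines 1-7) We define the procedure **\_do\_main**, which represents our main function. As you noticed previously, we pushed the **address** of "LoadLibraryA" on the stack, so we retrieve it relative to the stack pointer **esp** with `mov edi, [esp + 8]`. Now we want to call "**LoadLibraryA**("**user32.dll**")", so we need to place the **user32.dll** string on the stack. It is pushed in 4-byte chunks in reverse order (`"ll"`, `"32.d"`, `"user"`) so that it reads "user32.dll" in memory; the zero padding of the short `"ll"` chunk terminates the string.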

At **esp**, we now have the "**user32.dll**" string. We push this pointer as the parameter and call **LoadLibraryA**, which returns in **eax** the **base address** at which user32.dll is loaded into memory. We will need it later:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MasTXHuI6wDYTYQVZ24%2F-MasWND9P5RoV_noTJE7%2FAnimation.gif?alt=media\&token=8ad15eed-b920-4020-8cee-7d0ebe1265eb)
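The way the three pushes rebuild the string can be modelled as follows. This is a minimal sketch: `pushed_bytes` is a hypothetical helper, and it assumes each chunk is pushed as one 32-bit value, with short chunks zero-padded in the high bytes.

```python
def pushed_bytes(chunks):
    """Return the bytes found at esp after pushing each chunk as a dword.

    `chunks` is given in push order, so the last chunk lands at the lowest address.
    """
    memory = b""
    for chunk in chunks:
        dword = chunk.encode("ascii").ljust(4, b"\x00")  # zero-pad short chunks
        memory = dword + memory  # later pushes sit at lower addresses
    return memory


at_esp = pushed_bytes(["ll", "32.d", "user"])
print(at_esp)                    # b'user32.dll\x00\x00'
print(at_esp.split(b"\x00")[0])  # b'user32.dll'
```

The padding of the first chunk doubles as the string terminator, which is why no separate null push is needed here.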

Now that the **user32.dll** library is loaded into memory, we want to call **GetProcAddress** to get the address of the **MessageBox** function.

```cpp
FARPROC GetProcAddress(
  HMODULE hModule,
  LPCSTR  lpProcName
);
```

* `hModule` A handle to the DLL module that contains the function or variable. The **LoadLibraryA** function returns this handle.
* `lpProcName` The function or variable name, or the function's ordinal value. If this parameter is an ordinal value, it must be in the low-order word; the high-order word must be zero.
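The rule for `lpProcName` can be sketched as a small classifier. The helper `describe_proc_name` is hypothetical and models only the convention quoted above: a value whose high-order word is zero is treated as an ordinal, anything else as a pointer to a name.

```python
def describe_proc_name(lp_proc_name: int) -> str:
    """Classify a 32-bit lpProcName value as an ordinal or a string pointer."""
    if (lp_proc_name >> 16) == 0:  # high-order word is zero
        return f"ordinal {lp_proc_name & 0xFFFF}"
    return f"pointer to name at {lp_proc_name:#010x}"


print(describe_proc_name(0x0000002A))  # ordinal 42
print(describe_proc_name(0x0019FF40))  # pointer to name at 0x0019ff40
```

In this post the shellcode passes a stack pointer through `push esp`, so it takes the name path.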

(Line 9-15) We want to call "**GetProcAddress**(**user32.dll**, "**MessageBoxA**")", so again we need to place the function name on the stack. After the three pushes, **esp** points to the "**MessageBoxA**" string. We push this pointer, then push **eax**, which contains **the user32.dll base address**. The **GetProcAddress** address saved earlier now sits 8 dwords (32 bytes) above **esp**, so we load it with `mov edi, [esp + 32]` and call **edi**:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-MascJzdPLlG56HqS0KL%2F-Mase8ruXe6V5p_ZTeJQ%2Fmess.png?alt=media\&token=e3f7c42a-1acb-4682-922e-5d024b689dd6)

After the call, **GetProcAddress** returns in **eax** the **address** of **MessageBoxA**, as depicted above. We will need it in the next step.

## 8. Call **MessageBox** function

Now we have all the ingredients to call the **MessageBox** function; we just need to prepare the right parameters for it:

```cpp
int MessageBox(
  HWND    hWnd,
  LPCTSTR lpText,
  LPCTSTR lpCaption,
  UINT    uType
);
```

* `hWnd` A handle to the owner window of the message box to be created. If this parameter is **NULL**, the message box has no owner window.
* `lpText` The message to be displayed. If the string consists of more than one line, you can separate the lines using a carriage return and/or linefeed character between each line.
* `lpCaption` The dialog box title. If this parameter is **NULL**, the default title is **Error**.
* `uType` The contents and behavior of the dialog box. This parameter can be a combination of flags from several groups of flags.

As an example, we want to call the ANSI variant, since the name we resolved is "MessageBoxA" and our strings are plain ASCII:

```cpp
MessageBoxA(
        NULL,
        "T3nb3w",
        "T3nb3w",
        MB_OK
    );
```

**Remember that with the stdcall calling convention used by the Win32 API on x86, arguments are pushed in reverse order (right to left):**

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-Masx21vrY5UlgQpA2t5%2F-Masypzofjr0ytiQglCd%2Fni.png?alt=media\&token=b84ec4a2-7813-4d22-85a9-ba2bee932168)

Thus, we can do that via the following asm code:

```cpp
	push "3w"
	push "T3nb"
	mov esi, esp ; esi points to "T3nb3w"

	xor ebx, ebx
	push ebx     ; uType = MB_OK (0x00000000)
	push esi     ; lpCaption = "T3nb3w"
	push esi     ; lpText = "T3nb3w"
	push ebx     ; hWnd = NULL
	call eax
```

(Line 1-3) We need to place the **"T3nb3w"** string on the stack. After the two pushes, **esp** points to the "**T3nb3w**" string, so we save this pointer in **esi**.

(Line 5-10) We clear the **ebx** register and push **ebx**, **esi**, **esi**, and **ebx**, in that order. Finally, we call **eax**, which already holds the address of **MessageBoxA**.
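The push order can be checked with a short model. It is only an illustration: `stdcall_layout` and the `name=value` labels are hypothetical, and the model tracks nothing but the order of the four arguments.

```python
def stdcall_layout(params):
    """Return (push order, stack as seen from [esp] upward) for one call."""
    push_order = list(reversed(params))  # the last parameter is pushed first
    stack_view = []
    for value in push_order:
        stack_view.insert(0, value)      # each push becomes the new [esp]
    return push_order, stack_view


declared = ["hWnd=NULL", "lpText=esi", "lpCaption=esi", "uType=MB_OK"]
push_order, stack_view = stdcall_layout(declared)
print(push_order)  # ['uType=MB_OK', 'lpCaption=esi', 'lpText=esi', 'hWnd=NULL']
print(stack_view)  # ['hWnd=NULL', 'lpText=esi', 'lpCaption=esi', 'uType=MB_OK']
```

Pushing right to left means the callee finds the parameters on the stack in their declared order, starting with `hWnd`.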

The figure below shows our assembly in a debugger. The message box pops up once the `call eax` instruction is executed:

![](https://615064086-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MXlxki-LGPmhYCBAzg5%2F-Mat3h6VVSlCfh5ZVh_-%2F-Mat5xsfM0UwnFKQNr_o%2Fdoneeee.png?alt=media\&token=684cc167-1a93-48b3-ac5f-74731880673b)

## 9. Final Shellcode

Now we just need to put all the parts together. The final shellcode is the following:

```cpp
BITS 32

global _start

section .text

_start:
	xor eax, eax              ; Avoid Null Bytes
	mov eax, [fs:eax + 0x30]  ; EAX = PEB
	mov eax, [eax + 0xc]	    ; EAX = PEB->Ldr
	mov esi, [eax + 0x14]	    ; ESI = PEB->Ldr.InMemoryOrderModuleList
	lodsd					            ; EAX = Second module (ntdll.dll)
	xchg eax, esi			        ; move to next element
	lodsd					            ; EAX = Third(kernel32)
	mov eax, [eax + 0x10]	    ; EAX = Base address

	mov ebx, [eax + 0x3c]     ; RVA of PE signature
	add ebx, eax			        ; VA of PE signature
	mov ebx, [ebx + 0x78]	    ; RVA of the exported directory
	add ebx, eax			        ; VA of the exported directory
	mov esi, [ebx + 0x20]     ; RVA of the exported function names table
	add esi, eax			        ; VA of the exported function names table
	mov edx, eax              ; save eax into edx
	push esi				          ; save the VA of the exported function names table

	push 0x00059ba3           ; hash of LoadLibraryA
	xor ecx, ecx              ; prepare counter
	call _find_addr
	call _get_addr
	push edi

	
	mov esi, [esp + 8]        ; restore VA of the exported function names table

	
	push 0x0015bdfd           ; hash of GetProcAddress
	xor ecx, ecx              ; prepare counter
	call _find_addr
	call _get_addr
	push edi
	jmp _do_main

_get_addr:
	mov esi, [ebx + 0x24]         ; RVA of the function ordinal table
	add esi, edx                  ; VA of the function ordinal table
	mov cx, WORD [esi + ecx * 2]  ; get the biased ordinal of the function
	dec ecx                       ; get the ordinal of the function
	mov esi, [ebx + 0x1c]         ; RVA of AddressOfFunctions (the export address table)
	add esi, edx                  ; VA of the export address table
	mov edi, [esi + ecx * 4]      ; RVA of the function (LoadLibraryA or GetProcAddress)
	add edi, edx                  ; VA of the function
	ret
	  

_find_addr:
	inc ecx												;increment name index counter
	lodsd													;load name rva into eax and increment esi by 4 to next rva
	add eax,edx										;add kernel32.dll base address to get va of function name
	call _calculate_hash					;get the hash
	cmp edi, [esp + 4]						;compare our hash
	jnz _find_addr								;loop if not matching
	ret														;return, ecx now holds the name array index of our function
	
_calculate_hash:
	push ecx
	push edx

	xor ecx, ecx
	mov edi, ecx
	mov edx, edi

	_loop:
		shl edi,1
		mov dl, BYTE [eax + ecx]
		add edi, edx
		inc ecx
		cmp BYTE[eax + ecx], 0
		jne _loop

	pop edx
	pop ecx
	ret

_do_main:
	mov edi, [esp + 8]
	push "ll"
	push "32.d"
	push "user"
	push esp
	call edi

	push "oxA"
	push "ageB"
	push "Mess"
	push esp
	push eax
	mov edi, [esp + 32]
	call edi
	
	push "3w"
	push "T3nb"
	mov esi, esp

	xor ebx, ebx
	push ebx
	push esi
	push esi
	push ebx
	call eax
```

Note that this shellcode is meant for learning the **PE parsing Export Table** technique by stepping through it in a debugger; it is not 100% operational.
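One thing worth checking, hinted at by the `Avoid Null Bytes` comment, is whether the assembled bytes contain `0x00`. The sketch below hand-encodes the `push imm32` form (opcode `0x68` followed by the little-endian immediate). The helper names are hypothetical and only this single instruction form is modelled.

```python
import struct


def encode_push_imm32(value: int) -> bytes:
    """Encode `push imm32`: opcode 0x68 followed by the little-endian immediate."""
    return b"\x68" + struct.pack("<I", value)


def has_null_byte(code: bytes) -> bool:
    return b"\x00" in code


for imm in (0x00059BA3, 0x0015BDFD):
    code = encode_push_imm32(imm)
    print(code.hex(), has_null_byte(code))
# 68a39b0500 True
# 68fdbd1500 True
```

Both hash pushes contain a null byte. That only matters if the shellcode is delivered through something that stops copying at a null, which is an assumption about the delivery method and not something this post relies on.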

{% hint style="danger" %}
Our shellcode only works in processes that already have **kernel32.dll** loaded. If you create a **suspended** process, the only loaded modules are the **exe** and **ntdll.dll**, so the shellcode would not work if you injected it into a brand-new suspended process. In that case, we could alter the shellcode to use **ntdll!LdrLoadDll** (an **undocumented function**) instead of **kernel32!LoadLibraryA**.
{% endhint %}

## 10. Conclusion

Frankly, it took me some time to understand this wonderful **PE Parsing Export Table** technique used by sophisticated malware, so I tried to explain it from scratch. Locating functions through **the shellcode's own PE parsing** instead of relying on **GetProcAddress** alone has the additional benefit of making reverse-engineering of the shellcode more difficult. Also, using a **hash** of each WIN32 API function name hides the names from **casual inspection**.

I hope you have learned, step by step, how we can leverage **PE Parsing Export Table** to write Windows shellcode and then resolve all the libraries and functions the shellcode needs so that it can interact with the system.

**Final Note**: I am not an expert shellcode developer, just a learner. If you think I said anything incorrect anywhere, feel free to reach out and correct me; I would highly appreciate that. Finally, thank you very much for taking the time to read this post.

## References

[Michael Sikorski, Andrew Honig, *Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software*, No Starch Press (2012)](https://www.amazon.fr/Practical-Malware-Analysis-Hands-Dissecting/dp/1593272901)

{% embed url="<https://blog.kowalczyk.info/articles/pefileformat.html>" %}

{% embed url="<https://docs.microsoft.com/en-us/windows/win32/debug/pe-format>" %}

{% embed url="<https://docs.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-image_nt_headers32>" %}

{% embed url="<https://stackoverflow.com/questions/1128150/win32-api-to-enumerate-dll-export-functions>" %}

{% embed url="<https://c9x.me/x86/html/file_module_x86_id_160.html>" %}

{% embed url="<https://securitycafe.ro/2016/02/15/introduction-to-windows-shellcode-development-part-3/>" %}

{% embed url="<https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-messagebox>" %}
