Friday, 19 June 2009

Retrieving Kernel32's Base Address

A common way for shellcode to resolve the addresses of the library functions it needs is to get the base address of the kernel32.dll image in memory and retrieve the addresses of GetProcAddress and LoadLibraryA by parsing the kernel32 image's Export Address Table (EAT). These two functions can then be used to resolve the remaining functions needed by the shellcode. To retrieve the kernel32.dll base address, most shellcode uses the Process Environment Block (PEB) structure to retrieve a list of modules currently loaded in the process's address space. The InInitializationOrder module list pointed to by the PEB's Ldr structure holds a linked list of modules. Typically, the second entry in this list has been that of kernel32.dll. The code used to retrieve the kernel32 base address based on this method is shown below:

xor ebx, ebx // clear ebx
mov ebx, fs:[ 0x30 ] // get a pointer to the PEB
mov ebx, [ ebx + 0x0C ] // get PEB->Ldr
mov ebx, [ ebx + 0x1C ] // get PEB->Ldr.InInitializationOrderModuleList.Flink (1st entry)
mov ebx, [ ebx ] // get the next entry (2nd entry)
mov ebx, [ ebx + 0x08 ] // get the 2nd entry's base address (kernel32.dll)

This method has worked for all versions of Windows from Windows 2000 up to and including Windows Vista. The introduction of Windows 7 (RC1) has broken this method of retrieving the kernel32 base address due to the new MinWin kernel structure employed by Windows 7. A new module, kernelbase.dll, is loaded before kernel32.dll and as such appears as the second entry in the InInitializationOrder module list.

To retrieve the kernel32.dll base address in a generic manner on all versions of Windows from Windows 2000 up to and including Windows 7 (RC1), a slightly modified approach can be used. Instead of parsing the PEB's InInitializationOrder module list, the InMemoryOrder module list can be parsed. The third entry in this list will always be that of kernel32.dll (the first being that of the main module and the second being that of ntdll.dll). The code used to retrieve the kernel32 base address based on this method is shown below:

xor ebx, ebx // clear ebx
mov ebx, fs:[ 0x30 ] // get a pointer to the PEB
mov ebx, [ ebx + 0x0C ] // get PEB->Ldr
mov ebx, [ ebx + 0x14 ] // get PEB->Ldr.InMemoryOrderModuleList.Flink (1st entry)
mov ebx, [ ebx ] // get the next entry (2nd entry)
mov ebx, [ ebx ] // get the next entry (3rd entry)
mov ebx, [ ebx + 0x10 ] // get the 3rd entry's base address (kernel32.dll)
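
The two listings read the base address at different offsets (0x08 via the InInitializationOrder list, 0x10 via the InMemoryOrder list) because each list links into the module entry at a different place. The Python sketch below works through that arithmetic. It assumes the commonly documented 32-bit layout of the _LDR_DATA_TABLE_ENTRY structure; the ctypes class names and the offset_via helper are illustrative and not part of any Windows API:

```python
import ctypes

# Assumed 32-bit layouts; class names are illustrative only.
class ListEntry32(ctypes.Structure):
    _fields_ = [("Flink", ctypes.c_uint32), ("Blink", ctypes.c_uint32)]

class UnicodeString32(ctypes.Structure):
    _fields_ = [
        ("Length", ctypes.c_uint16),
        ("MaximumLength", ctypes.c_uint16),
        ("Buffer", ctypes.c_uint32),
    ]

class LdrDataTableEntry32(ctypes.Structure):
    _fields_ = [
        ("InLoadOrderLinks", ListEntry32),            # +0x00
        ("InMemoryOrderLinks", ListEntry32),          # +0x08
        ("InInitializationOrderLinks", ListEntry32),  # +0x10
        ("DllBase", ctypes.c_uint32),                 # +0x18
        ("EntryPoint", ctypes.c_uint32),
        ("SizeOfImage", ctypes.c_uint32),
        ("FullDllName", UnicodeString32),
        ("BaseDllName", UnicodeString32),
    ]

def offset_via(links_field, target_offset):
    """Offset of a member relative to the given links member (hypothetical helper)."""
    return target_offset - getattr(LdrDataTableEntry32, links_field).offset

dll_base = LdrDataTableEntry32.DllBase.offset
name_buffer = (LdrDataTableEntry32.BaseDllName.offset
               + UnicodeString32.Buffer.offset)

print(hex(offset_via("InInitializationOrderLinks", dll_base)))  # 0x8
print(hex(offset_via("InMemoryOrderLinks", dll_base)))          # 0x10
print(hex(offset_via("InMemoryOrderLinks", name_buffer)))       # 0x28
```

The last value is the offset at which a walk of the InMemoryOrder list would find the pointer to a module's base name.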

Update: There appear to be some cases on Windows 2000 where the above method will not yield the correct result. A more robust, albeit lengthier, method can be seen below. We search the InMemoryOrder module list for the kernel32 module, using a hash of the module name for comparison. We also normalise the module name to uppercase, as some systems store module names in uppercase and some in lowercase.

cld // clear the direction flag for the loop
xor edx, edx // zero edx

mov edx, [fs:edx+0x30] // get a pointer to the PEB
mov edx, [edx+0x0C] // get PEB->Ldr
mov edx, [edx+0x14] // get the first module from the InMemoryOrder module list
next_mod:
mov esi, [edx+0x28] // get pointer to module's name (unicode string)
push byte 24 // push down the length we want to check
pop ecx // set ecx to this length for the loop
xor edi, edi // clear edi which will store the hash of the module name
loop_modname:
xor eax, eax // clear eax
lodsb // read in the next byte of the name
cmp al, 'a' // some versions of Windows use lower case module names
jl not_lowercase
sub al, 0x20 // if so normalise to uppercase
not_lowercase:
ror edi, 13 // rotate right our hash value
add edi, eax // add the next byte of the name to the hash
loop loop_modname // loop until we have read enough
cmp edi, 0x6A4ABC5B // compare the hash with that of KERNEL32.DLL
mov ebx, [edx+0x10] // get this module's base address (mov leaves the flags intact)
mov edx, [edx] // get the next module
jne next_mod // if it doesn't match, process the next module

// when we get here EBX is the kernel32 base (or change to suit).
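
The hash comparison can be checked outside of assembly. The Python sketch below mirrors the loop above: it rotates the running hash right by 13 bits, adds the next byte of the UTF-16LE name, and normalises lowercase bytes first. The function names are illustrative, and it assumes the name occupies at least 24 bytes; the assembly always reads exactly 24 bytes, whereas this sketch stops at the end of a shorter name:

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF

def module_hash(name, length=24):
    """Hash the first `length` bytes of the UTF-16LE module name."""
    data = name.encode("utf-16-le")[:length]
    h = 0
    for byte in data:
        # Mirrors `cmp al, 'a'` / `jl`: the compare is signed, so only
        # bytes in the range 0x61..0x7F have 0x20 subtracted.
        if 0x61 <= byte <= 0x7F:
            byte -= 0x20
        h = (ror32(h, 13) + byte) & 0xFFFFFFFF
    return h

print("0x%08X" % module_hash("kernel32.dll"))  # 0x6A4ABC5B
```

Both "kernel32.dll" and "KERNEL32.DLL" produce the same value, which is why the search works regardless of how the system stores the name.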

To verify these methods on your own system you can use the following tool:

This code has been verified on the following systems:

  • Windows 2000 SP4
  • Windows XP SP2
  • Windows XP SP3
  • Windows 2003 SP2
  • Windows Vista SP1
  • Windows 2008 SP1
  • Windows 7 RC1

The following WinDbg session shows how we can manually verify the above methods on a Windows 7 RC1 system:

0:004> version
Windows 7 Version 7100 UP Free x86 compatible
Product: WinNt, suite: SingleUserTS
kernel32.dll version: 6.1.7100.0 (winmain_win7rc.090421-1700)

// list the loaded modules...
0:004> lm
start end module name
00d20000 00de0000 calc (pdb symbols)
70930000 70a77000 msxml6 (pdb symbols)
725c0000 725fc000 oleacc (pdb symbols)
73e10000 73e42000 WINMM (pdb symbols)
73e50000 73f49000 WindowsCodecs (pdb symbols)
74170000 74183000 dwmapi (pdb symbols)
742c0000 74450000 gdiplus (pdb symbols)
74450000 74490000 UxTheme (pdb symbols)
745d0000 7476c000 COMCTL32 (pdb symbols)
74b50000 74b59000 VERSION (pdb symbols)
755a0000 755ac000 CRYPTBASE (pdb symbols)
756d0000 75718000 KERNELBASE (pdb symbols)
75950000 7596f000 IMM32 (pdb symbols)
75970000 759ff000 OLEAUT32 (pdb symbols)
75a00000 75ac9000 USER32 (pdb symbols)
75ae0000 75bac000 MSCTF (pdb symbols)
75d60000 75e02000 RPCRT4 (pdb symbols)
75e60000 75f0c000 msvcrt (pdb symbols)
75f50000 75ff0000 ADVAPI32 (pdb symbols)
75ff0000 7608d000 USP10 (pdb symbols)
76090000 76113000 CLBCatQ (pdb symbols)
76120000 7627b000 ole32 (pdb symbols)
76280000 762d7000 SHLWAPI (pdb symbols)
763e0000 77026000 SHELL32 (pdb symbols)
77030000 77049000 sechost (pdb symbols)
77050000 77124000 kernel32 (pdb symbols)
77160000 771ae000 GDI32 (pdb symbols)
77500000 7763c000 ntdll (pdb symbols)
77720000 7772a000 LPK (pdb symbols)

// dump the PEB...

0:004> !peb
PEB at 7ffdc000
InheritedAddressSpace: No
ReadImageFileExecOptions: No
BeingDebugged: Yes
ImageBaseAddress: 00d20000
Ldr 775d7880
Ldr.Initialized: Yes
Ldr.InInitializationOrderModuleList: 00221a28 . 002b13a0
Ldr.InLoadOrderModuleList: 00221988 . 002b1390
Ldr.InMemoryOrderModuleList: 00221990 . 002b1398

// show the Ldr.InInitializationOrderModuleList
// dump the first entry...
0:004> dd 00221a28
00221a28 00221e68 775d789c 77500000 00000000 // 77500000 = ntdll.dll
00221a38 0013c000 003c003a 002218e8 00140012
00221a48 7756835c 00004004 0000ffff 775da680
00221a58 775da680 49eea66e 00000000 00000000
// dump the second entry...
0:004> dd 00221e68
00221e68 00221d50 00221a28 756d0000 756d8005 // 756d0000 = KERNELBASE.dll
00221e78 00048000 00460044 00221df8 001e001c
00221e88 00221e20 00084004 0000ffff 0022a9b4
00221e98 775da690 49eea60f 00000000 00000000
// we can see the second entry is for kernelbase.dll and not kernel32.dll

// show the Ldr.InMemoryOrderModuleList
// dump the first entry...
0:004> dd 00221990
00221990 00221a20 775d7894 00000000 00000000
002219a0 00d20000 00d30140 000c0000 003a0038 // 00d20000 = calc.exe
002219b0 002217fa 00120010 00221822 00004000
002219c0 0000ffff 00222b84 775da6a8 49ee917f
// dump the second entry...
0:004> dd 00221a20
00221a20 00221d48 00221990 00221e68 775d789c
00221a30 77500000 00000000 0013c000 003c003a // 77500000 = ntdll.dll
00221a40 002218e8 00140012 7756835c 00004004
00221a50 0000ffff 775da680 775da680 49eea66e
// dump the third entry...
0:004> dd 00221d48
00221d48 00221e60 00221a20 002227e8 00221e68
00221d58 77050000 770a102d 000d4000 00420040 // 77050000 = kernel32.dll
00221d68 00221ce0 001a0018 00221d08 00084004
00221d78 0000ffff 002248a4 775da640 49eea60e
// we can see the third entry is for kernel32.dll
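
The same walk can be replayed offline against the dwords shown in the session above. The Python sketch below treats the dumped values as a tiny memory map and follows the Flink pointers as the assembly listings do. The MEMORY table and the base_of_nth helper are illustrative names; only the addresses and values come from the dump:

```python
# Dwords copied from the `dd` output above (address -> value).
MEMORY = {
    # InInitializationOrder list
    0x00221A28: 0x00221E68,  # 1st entry Flink
    0x00221A30: 0x77500000,  # ntdll.dll base (shared by both lists)
    0x00221E68: 0x00221D50,  # 2nd entry Flink
    0x00221E70: 0x756D0000,  # KERNELBASE.dll base
    # InMemoryOrder list
    0x00221990: 0x00221A20,  # 1st entry Flink
    0x002219A0: 0x00D20000,  # calc.exe base
    0x00221A20: 0x00221D48,  # 2nd entry Flink
    0x00221D48: 0x00221E60,  # 3rd entry Flink
    0x00221D58: 0x77050000,  # kernel32.dll base
}

IN_INIT_FIRST = 0x00221A28  # Ldr.InInitializationOrderModuleList Flink
IN_MEM_FIRST = 0x00221990   # Ldr.InMemoryOrderModuleList Flink

def base_of_nth(first_entry, n, base_offset):
    """Follow Flink n-1 times from the first entry, then read the base address."""
    entry = first_entry
    for _ in range(n - 1):
        entry = MEMORY[entry]
    return MEMORY[entry + base_offset]

# 2nd InInitializationOrder entry, base address at +0x08
print(hex(base_of_nth(IN_INIT_FIRST, 2, 0x08)))  # 0x756d0000 (KERNELBASE)
# 3rd InMemoryOrder entry, base address at +0x10
print(hex(base_of_nth(IN_MEM_FIRST, 3, 0x10)))   # 0x77050000 (kernel32)
```

Note that the ntdll.dll base is read from the same address (0x00221A30) through either list, since 0x00221A28 + 0x08 and 0x00221A20 + 0x10 are the same location.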


X-STAR said...

Good work!
Keep it on!

beistjin said...

nice. :)

dg_u said...

Very interesting article! Thank you!

Thierry Zoller said...

Congrats, very useful research there.

Anonymous said...

Good stuffs, thx for sharing.

SkyLined said...

Thanks for the info! I've created a different solution to this problem myself:

Nelson Brito said...

Great info!!!

masterducky said...

Awesome job!

Thank you very much!

Anonymous said...

Thanks very much.
Can someone tell me how you find the hash value of kernel32.dll (not the asm code)?

Stephen Fewer said...

@Anonymous: You can use the following Ruby snippet to calculate the hash value...

hash = 0
mod = "K\x00E\x00R\x00N\x00E\x00L\x003\x002\x00.\x00D\x00L\x00L\x00"

mod.each_byte do | byte |
  hash = ( hash >> 13 | hash << ( 32 - 13 ) ) & 0xFFFFFFFF
  hash += byte.to_i
end

print "%s -> 0x%08X" % [ mod.gsub( "\x00", '' ), hash & 0xFFFFFFFF ]

Anonymous said...

I dont understand where you retrieve offset of base address, in windbg I see:


+0x000 InLoadOrderLinks : _LIST_ENTRY
+0x008 InMemoryOrderLinks : _LIST_ENTRY
+0x010 InInitializationOrderLinks : _LIST_ENTRY

but in your code you use 0x10 for InMemoryOrderLinks and 0x08 for InInitializationOrderLinks... why? I don't understand that step.

Stephen Fewer said...


Yup, each of the 3 lists (PEB->Ldr.InLoadOrderModuleList, PEB->Ldr.InMemoryOrderModuleList and PEB->Ldr.InInitializationOrderModuleList) points into a _LDR_DATA_TABLE_ENTRY structure[1]. But they point into the same structure at different places:

The PEB->Ldr.InLoadOrderModuleList entries point to the InLoadOrderLinks member at the beginning of the _LDR_DATA_TABLE_ENTRY structure, the PEB->Ldr.InMemoryOrderModuleList entries point to the InMemoryOrderLinks member of the _LDR_DATA_TABLE_ENTRY structure, and the PEB->Ldr.InInitializationOrderModuleList entries point to the InInitializationOrderLinks member of the _LDR_DATA_TABLE_ENTRY structure, so using each different PEB->Ldr list puts the offset to the base address off by a certain amount.

This is why the offset for the base address is 0x10 when we reference the _LDR_DATA_TABLE_ENTRY structure via the PEB->Ldr.InMemoryOrderModuleList and 0x8 when we go via PEB->Ldr.InInitializationOrderModuleList.


ThePirateCat said...

Good job!

Anonymous said...

Python code to generate hash:

def shift(x, shift):
    return ((x >> shift) | (x << (32 - shift))) & 0xFFFFFFFF

x = "KERNEL32.DLL"
print(x)
out = 0
for i in range(0, 12):
    out = shift(out, 13) + ord(x[i])
    out = shift(out, 13)  # For unicode (the zero high byte of each character)
print("%x" % out)

Blackcube said...

Thanks a lot for your information. I'm having a difficult situation with shellcode manual writing. This is very helpful.