Wednesday, February 24, 2010

GSCookie in CLR

When .NET managed code calls an unmanaged native function, the CLR places a GSCookie on the thread stack right before the call into unmanaged code. The purpose is to detect buffer overruns: if native code overwrites the GSCookie, we can be fairly sure there is a stack overrun in the native code. Here is an example call stack from a GSCookie corruption.

0018e35c 714b8aac kernel32!UnhandledExceptionFilter(struct _EXCEPTION_POINTERS * ExceptionInfo = 0x00000000)+0x1b5

0018e690 712fa8a6 mscorwks!__report_gsfailure(void)+0xdf

0018e694 71353321 mscorwks!DoJITFailFast(void)+0x5

0018e69c 710bdb47 mscorwks!CrawlFrame::CheckGSCookies(void)+0x1c

0018e6ac 710bd6d7 mscorwks!CrawlFrame::SetCurGSCookie(unsigned long * pGSCookie = 0x0018f164)+0x36

0018e888 710bdc67 mscorwks!Thread::StackWalkFramesEx(struct _REGDISPLAY * pRD = <Memory access error>, <function> * pCallback = <Memory access error>, void * pData = <Memory access error>, unsigned int flags = <Memory access error>, class Frame * pStartFrame = <Memory access error>)+0xd3

0018ebc4 710ffbd8 mscorwks!Thread::StackWalkFrames(<function> * pCallback = <Memory access error>, void * pData = <Memory access error>, unsigned int flags = <Memory access error>, class Frame * pStartFrame = <Memory access error>)+0xb8

0018ebe0 710fface mscorwks!LookForHandler(struct _EXCEPTION_POINTERS * pExceptionPointers = 0x0018ecd8, class Thread * pThread = 0x0030cf58, struct ThrowCallbackType * tct = 0x0018ec7c)+0x26

0018ecf8 71100148 mscorwks!CPFH_RealFirstPassHandler(struct _EXCEPTION_RECORD * pExceptionRecord = 0x0018ee48, struct _EXCEPTION_REGISTRATION_RECORD * pEstablisherFrame = 0x0018f214, struct _CONTEXT * pContext = 0x0018ee98, void * pDispatcherContext = 0x0018ee1c, int bAsynchronousThreadStop = 0n0, int fPGCDisabledOnEntry = 0n0)+0x49f

0018ed38 711002cd mscorwks!CPFH_FirstPassHandler(struct _EXCEPTION_RECORD * pExceptionRecord = 0x0018ee48, struct _EXCEPTION_REGISTRATION_RECORD * pEstablisherFrame = 0x0018f214, struct _CONTEXT * pContext = 0x00000000, struct _DISPATCHER_CONTEXT * pDispatcherContext = 0x0018ee1c)+0x113

0018ed5c 772545f5 mscorwks!COMPlusFrameHandler(struct _EXCEPTION_RECORD * pExceptionRecord = 0x0018ee48, struct _EXCEPTION_REGISTRATION_RECORD * pEstablisherFrame = 0x0018f214, struct _CONTEXT * pContext = 0x0018ee98, struct _DISPATCHER_CONTEXT * pDispatcherContext = 0x0018ee1c)+0x15a

0018ed80 772545c7 ntdll!ExecuteHandler2(void)+0x26

0018ee30 7722e49f ntdll!ExecuteHandler(void)+0x24

0018ee30 00b9b627 ntdll!KiUserExceptionDispatcher(void)+0xf

0:000> dd 0018f164 L1

0018f164 ee9b0eb0

We see mscorwks!__report_gsfailure() in the second frame, and the dd command above shows the GSCookie value [ee9b0eb0]. If you search the thread stack, you will most likely find the cookie value there. The fix for a GS failure varies, but native code is the most likely culprit, so start digging on the native side. Sometimes a mismatched calling convention is the problem even when the native code itself is fine. Generally speaking, the same approach used for a native stack overrun applies here.
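To make the calling convention point concrete, here is a minimal sketch, assuming Windows. It declares one cdecl import and one stdcall import. The class name CallingConventionSketch and the wrapper Length are made up for illustration; strlen in msvcrt.dll and GetFocus in user32.dll are real exports. C runtime functions use cdecl, where the caller cleans up the stack, while Win32 APIs use stdcall, where the callee does. Declaring a cdecl function with the default (Winapi) convention on x86 leaves the stack unbalanced after the call, which can look like corruption even though the native code is correct.

```csharp
using System;
using System.Runtime.InteropServices;

static class CallingConventionSketch
{
    // strlen is a C runtime function, so it is cdecl: state that explicitly.
    // Leaving out CallingConvention here would fall back to Winapi (stdcall on x86).
    [DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Ansi)]
    static extern UIntPtr strlen(string text);

    // GetFocus is a Win32 API, so it is stdcall, which is what Winapi means on x86.
    [DllImport("user32.dll", CallingConvention = CallingConvention.Winapi)]
    static extern IntPtr GetFocus();

    // Hypothetical wrapper so the import can be called from other code.
    public static int Length(string text)
    {
        return (int)strlen(text).ToUInt32();
    }

    static void Main()
    {
        Console.WriteLine(Length("GSCookie"));
        Console.WriteLine(GetFocus());
    }
}
```

When a GS failure shows up around a P/Invoke call, comparing each DllImport declaration against the native header is a cheap first check before hunting for an actual overrun.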

Now let’s look at the GSCookie mechanism. To look into the details, I wrote a tiny C# application that calls the Win32 GetFocus() function (a P/Invoke scenario).

using System;
using System.Runtime.InteropServices;
using System.Windows.Forms;

namespace TestGetFocus
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_MouseClick(object sender, MouseEventArgs e)
        {
            Control ctrl = null;
            IntPtr handle = GetFocus(); // Call native Win32 function via P/Invoke
            if (handle != IntPtr.Zero)
            {
                ctrl = Control.FromHandle(handle);
                if (ctrl is Button)
                {
                    // Show the click coordinates on the focused button
                    ctrl.Text = string.Format("{0},{1}", e.X, e.Y);
                }
            }
        }

        [DllImport("user32.dll")]
        static extern IntPtr GetFocus();
    }
}

After starting my test application (TestGetFocus.exe), I attached WinDbg to the process. First I wanted to set a breakpoint on the MouseClick method, so I looked up button1_MouseClick with !name2ee.
0:008> !name2ee TestGetFocus!TestGetFocus.Form1.button1_MouseClick
Module: 00232c5c (TestGetFocus.exe)
Token: 0x0600000a
MethodDesc: 00235a20
Name: TestGetFocus.Form1.button1_MouseClick(System.Object, System.Windows.Forms.MouseEventArgs)
Not JITTED yet. Use !bpmd -md 00235a20 to break on run.

The method is not JITTED yet, which means it has not been compiled and has no native code address in memory. So I used SOS's !bpmd to set a pending breakpoint.
0:008> !bpmd -md 00235a20
MethodDesc = 00235a20
Adding pending breakpoints...

0:008> g
CLR notification: method 'TestGetFocus.Form1.button1_MouseClick(System.Object, System.Windows.Forms.MouseEventArgs)' code generated
(1de0.c9c): CLR notification exception - code e0444143 (first chance)
JITTED TestGetFocus!TestGetFocus.Form1.button1_MouseClick(System.Object, System.Windows.Forms.MouseEventArgs)
Setting breakpoint: bp 00490338 [TestGetFocus.Form1.button1_MouseClick(System.Object, System.Windows.Forms.MouseEventArgs)]
Breakpoint 1 hit
eax=00235a20 ebx=01ddf574 ecx=01da9b2c edx=01dc8280 esi=01dc91b8 edi=01dc8280
eip=00490338 esp=0018ed28 ebp=0018ed40 iopl=0 nv up ei ng nz na po cy
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000283
TestGetFocus!TestGetFocus.Form1.button1_MouseClick(System.Object, System.Windows.Forms.MouseEventArgs):
00490338 55 push ebp

When I clicked the button, the MouseClick method was compiled, so it now has the native code address 00490338 shown above.
0:000> !u .
Normal JIT generated code
TestGetFocus.Form1.button1_MouseClick(System.Object, System.Windows.Forms.MouseEventArgs)
Begin 00490338, size 129
>>> 00490338 55 push ebp
00490339 8bec mov ebp,esp
0049033b 83ec48 sub esp,48h
0049033e 894dfc mov dword ptr [ebp-4],ecx
00490341 8955f8 mov dword ptr [ebp-8],edx
00490344 833d142e230000 cmp dword ptr ds:[232E14h],0
0049034b 7405 je TestGetFocus!TestGetFocus.Form1.button1_MouseClick(System.Object, System.Windows.Forms.MouseEventArgs)+0x1a (00490352)
0049034d e8d7af4b71 call mscorwks!JIT_DbgIsJustMyCode (7194b329)
00490352 33d2 xor edx,edx
00490354 8955dc mov dword ptr [ebp-24h],edx
00490357 c745f000000000 mov dword ptr [ebp-10h],0
0049035e 33d2 xor edx,edx
00490360 8955f4 mov dword ptr [ebp-0Ch],edx
00490363 90 nop
00490364 33d2 xor edx,edx
00490366 8955dc mov dword ptr [ebp-24h],edx
00490369 e822bddaff call TestGetFocus.Form1.GetFocus() (0023c090) <=== GetFocus
0049036e 8945ec mov dword ptr [ebp-14h],eax
...
The function we are interested in is GetFocus(), so let's look into it.
0:000> u 0023c090
TestGetFocus.Form1.GetFocus():
0023c090 b82c5a2300 mov eax,235A2Ch
0023c095 89ed mov ebp,ebp
0023c097 e988dfffff jmp CLRStub[StubLinkStub]@5703680023a024 (0023a024)
0023c09c 0000 add byte ptr [eax],al

It jumps to the CLR stub at 0023a024. In the stub code we can find the GSCookie (2CB72BCF): right before calling unmanaged code, the CLR pushes the GSCookie value onto the stack as a guard. The GSCookie value changes each time the application is started. Note that the CLR uses GSCookies in more than one place, so cookie values come and go on the stack as the program runs.
0:000> u 0023a024
CLRStub[StubLinkStub]@23a024:
0023a024 682c5a2300 push 235A2Ch
0023a029 52 push edx
0023a02a 68903b6f71 push offset mscorwks!InterceptorFrame::`vftable' (716f3b90)
0023a02f 55 push ebp
0023a030 53 push ebx
0023a031 56 push esi
0023a032 57 push edi
0023a033 8d742410 lea esi,[esp+10h]
0023a037 ff760c push dword ptr [esi+0Ch]
0023a03a 55 push ebp
0023a03b 89e5 mov ebp,esp
0023a03d 51 push ecx
0023a03e 52 push edx
0023a03f 648b1d380e0000 mov ebx,dword ptr fs:[0E38h]
0023a046 8b7b0c mov edi,dword ptr [ebx+0Ch]
0023a049 897e04 mov dword ptr [esi+4],edi
0023a04c 89730c mov dword ptr [ebx+0Ch],esi
0023a04f 6800000000 push 0
0023a054 68cf2bb72c push 2CB72BCFh
0023a059 e814d15071 call mscorwks!DoSpecialUnmanagedCodeDemand (71747172)
0:000> u 0022a5e4
CLRStub[StubLinkStub]@22a5bb:
0022a5bb 8b4608 mov eax,dword ptr [esi+8]
0022a5be 8b4014 mov eax,dword ptr [eax+14h]
0022a5c1 ff10 call dword ptr [eax] <== ds:0023:00235a64={USER32!GetFocus (768d5a1b)}
0022a5c3 c6430801 mov byte ptr [ebx+8],1
0022a5c7 833d9443c37100 cmp dword ptr [mscorwks!g_TrapReturningThreads (71c34394)],0
0022a5ce 751b jne CLRStub[StubLinkStub]@5703680022a5eb (0022a5eb)

When the call at 0022a5c1 executes, the stack holds the following values. The return address (0022a5c3) has been pushed, and we can see the GSCookie at 0018eca4.
0:000> dps esp
0018eca0 0022a5c3 CLRStub[StubLinkStub]@22a5c3
0018eca4 2cb72bcf <== GSCookie
0018eca8 00000000
0018ecac 00000000
0018ecb0 01da9b2c
0018ecb4 0018ed24
0018ecb8 0049036e TestGetFocus!TestGetFocus.Form1.button1_MouseClick(System.Object, System.Windows.Forms.MouseEventArgs)+0x36
0018ecbc 01dc8280
0018ecc0 01dc91b8
0018ecc4 01ddf574
0018ecc8 0018ed24
0018eccc 716f3ac8 mscorwks!NDirectMethodFrameStandalone::`vftable'
0018ecd0 0018edf4
0018ecd4 00235a2c
0018ecd8 0049036e TestGetFocus!TestGetFocus.Form1.button1_MouseClick(System.Object, System.Windows.Forms.MouseEventArgs)+0x36
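
The idea behind the guard can be modeled in a few lines of C#. This is only a rough sketch of the concept, not how mscorwks implements it: the class GsCookieSketch and the methods FindCookie and IsCookieIntact are made-up names, the stack words are copied from the dps output above, and the overwrite value 0x41414141 is an arbitrary stand-in for data written by a native overrun.

```csharp
using System;

static class GsCookieSketch
{
    // First stack words from the dps output above, lowest address first.
    const uint StackBase = 0x0018eca0;
    static readonly uint[] StackWords =
    {
        0x0022a5c3, // return address into the CLR stub
        0x2cb72bcf, // GSCookie
        0x00000000,
        0x00000000,
        0x01da9b2c,
        0x0018ed24,
        0x0049036e, // return address into button1_MouseClick
    };

    // Returns the address of the first slot holding the cookie, or null if absent.
    public static uint? FindCookie(uint[] words, uint baseAddress, uint cookie)
    {
        for (int i = 0; i < words.Length; i++)
        {
            if (words[i] == cookie)
                return baseAddress + (uint)(i * sizeof(uint));
        }
        return null;
    }

    // Models the guard check: the slot must still hold the expected value.
    public static bool IsCookieIntact(uint[] words, int slot, uint expected)
    {
        return words[slot] == expected;
    }

    static void Main()
    {
        const uint cookie = 0x2cb72bcf;

        uint? address = FindCookie(StackWords, StackBase, cookie);
        Console.WriteLine(address.HasValue
            ? string.Format("cookie at {0:x8}", address.Value)
            : "cookie not found");

        // Simulate a native overrun that tramples the cookie slot.
        uint[] corrupted = (uint[])StackWords.Clone();
        corrupted[1] = 0x41414141;

        Console.WriteLine(IsCookieIntact(StackWords, 1, cookie)); // untouched stack
        Console.WriteLine(IsCookieIntact(corrupted, 1, cookie));  // after the overrun
    }
}
```

With the values from the dump, the search reports the cookie at 0018eca4, and the check passes on the untouched copy and fails on the corrupted one. A failed check is the situation that ends in __report_gsfailure in the first call stack.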

I think that covers most of the GSCookie story. To complete the native Win32 call story, let’s look a little further. GetFocus lives in User32.dll and its code is quite simple:

0:000> u USER32!GetFocus
USER32!GetFocus
768d5a1b 6a00 push 0
768d5a1d e8e5ffffff call USER32!NtUserGetThreadState (768d5a07)
768d5a22 c3 ret

Next we need to look at USER32!NtUserGetThreadState (768d5a07):
0:000> uf 768d5a07
USER32!NtUserGetThreadState
3743 768d5a07 b8c7110000 mov eax,11C7h
3745 768d5a0c ba0003fe7f mov edx,offset SharedUserData!SystemCallStub (7ffe0300)
3746 768d5a11 ff12 call dword ptr [edx] <== KiFastSystemCall
3747 768d5a13 c20400 ret 4
As seen below, SystemCallStub resolves to ntdll!KiFastSystemCall, which in turn executes sysenter.

The system call convention used here passes the system service number in EAX (hence the 11C7h set above) and a pointer to the parameters on the user stack in EDX (hence KiFastSystemCall copies ESP into EDX below).
0:000> bp 768d5a11
0:000> g
Breakpoint 0 hit
eax=000011c7 ebx=0030a8e8 ecx=01da9b2c edx=7ffe0300 esi=0018eccc edi=0018edf4
eip=768d5a11 esp=0018ec98 ebp=0018ecb4 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
USER32!NtUserGetThreadState+0xa:
768d5a11 ff12 call dword ptr [edx] ds:0023:7ffe0300={ntdll!KiFastSystemCall (779a64f0)}
0:000> uf 779a64f0
ntdll!KiFastSystemCall
635 779a64f0 8bd4 mov edx,esp
638 779a64f2 0f34 sysenter
648 779a64f4 c3 ret

At this point we have entered the OS kernel, where the system service function is processed. Since we’re using a user mode debugger, this is as far as we can follow.