CoreCLR Source Code Exploration (Part 8): How the JIT Works (In Depth)

Date: 2023-11-28 10:42:50

In the previous article we gained a basic understanding of the JIT in CoreCLR.

In this article we will analyze the JIT's implementation in more detail.

The JIT's implementation lives mainly under https://github.com/dotnet/coreclr/tree/master/src/jit.

The best way to analyze the JIT process of a particular function in detail is to look at the JitDump.

Viewing a JitDump requires building a Debug version of CoreCLR yourself (for Windows see here, for Linux see here).

After building, define the environment variable COMPlus_JitDump=Main (Main can be replaced with the name of any other function), then run your program with that Debug build of CoreCLR.
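For example, on Linux this could look like the following; the corerun path and program name here are hypothetical placeholders, so adjust them to your own Debug build:

```shell
# Hypothetical paths and program name; adjust to your own Debug build.
export COMPlus_JitDump=Main
echo "COMPlus_JitDump=$COMPlus_JitDump"
# ./bin/Product/Linux.x64.Debug/corerun program.dll  # would print the JitDump for Main
```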

An example JitDump can be found here; it contains the output for both Debug mode and Release mode.

Next, let's walk through each phase of the JIT step by step alongside the code.

The following analysis is based on CoreCLR 1.1.0 on x86/x64; newer versions may differ.

(Why 1.1.0? Because I spent half a year reading the JIT code, and 2.0 had not yet been released when I started.)

Triggering the JIT

As mentioned in the previous article, JIT compilation is triggered the first time a function is called, starting from a stub:


This is what the JIT stub actually looks like. Here is the state of the Fixup Precode before the function's first call:

Fixup Precode:

(lldb) di --frame --bytes
-> 0x7fff7c21f5a8: e8 2b 6c fe ff callq 0x7fff7c2061d8
0x7fff7c21f5ad: 5e popq %rsi
0x7fff7c21f5ae: 19 05 e8 23 6c fe sbbl %eax, -0x193dc18(%rip)
0x7fff7c21f5b4: ff 5e a8 lcalll *-0x58(%rsi)
0x7fff7c21f5b7: 04 e8 addb $-0x18, %al
0x7fff7c21f5b9: 1b 6c fe ff sbbl -0x1(%rsi,%rdi,8), %ebp
0x7fff7c21f5bd: 5e popq %rsi
0x7fff7c21f5be: 00 03 addb %al, (%rbx)
0x7fff7c21f5c0: e8 13 6c fe ff callq 0x7fff7c2061d8
0x7fff7c21f5c5: 5e popq %rsi
0x7fff7c21f5c6: b0 02 movb $0x2, %al
(lldb) di --frame --bytes
-> 0x7fff7c2061d8: e9 13 3f 9d 79 jmp 0x7ffff5bda0f0 ; PrecodeFixupThunk
0x7fff7c2061dd: cc int3
0x7fff7c2061de: cc int3
0x7fff7c2061df: cc int3
0x7fff7c2061e0: 49 ba 00 da d0 7b ff 7f 00 00 movabsq $0x7fff7bd0da00, %r10
0x7fff7c2061ea: 40 e9 e0 ff ff ff jmp 0x7fff7c2061d0

Of these two fragments only the first instruction is relevant. Note the bytes 5e 19 05 following the callq: they are not machine instructions but information about the function, as we will see below.

Next, execution jumps to the Fixup Precode Chunk; from here on the code is shared by all functions:

Fixup Precode Chunk:

(lldb) di --frame --bytes
-> 0x7ffff5bda0f0 <PrecodeFixupThunk>: 58 popq %rax ; rax = 0x7fff7c21f5ad
0x7ffff5bda0f1 <PrecodeFixupThunk+1>: 4c 0f b6 50 02 movzbq 0x2(%rax), %r10 ; r10 = 0x05 (precode chunk index)
0x7ffff5bda0f6 <PrecodeFixupThunk+6>: 4c 0f b6 58 01 movzbq 0x1(%rax), %r11 ; r11 = 0x19 (methoddesc chunk index)
0x7ffff5bda0fb <PrecodeFixupThunk+11>: 4a 8b 44 d0 03 movq 0x3(%rax,%r10,8), %rax ; rax = 0x7fff7bdd5040 (methoddesc chunk)
0x7ffff5bda100 <PrecodeFixupThunk+16>: 4e 8d 14 d8 leaq (%rax,%r11,8), %r10 ; r10 = 0x7fff7bdd5108 (methoddesc)
0x7ffff5bda104 <PrecodeFixupThunk+20>: e9 37 ff ff ff jmp 0x7ffff5bda040 ; ThePreStub

The source of this code is in vm\amd64\unixasmhelpers.S:

LEAF_ENTRY PrecodeFixupThunk, _TEXT

        pop     rax         // Pop the return address. It points right after the call instruction in the precode.

        // Inline computation done by FixupPrecode::GetMethodDesc()
        movzx   r10, byte ptr [rax + 2]              // m_PrecodeChunkIndex
        movzx   r11, byte ptr [rax + 1]              // m_MethodDescChunkIndex
        mov     rax, qword ptr [rax + r10*8 + 3]
        lea     METHODDESC_REGISTER, [rax + r11*8]

        // Tail call to prestub
        jmp     C_FUNC(ThePreStub)

LEAF_END PrecodeFixupThunk, _TEXT

After popq %rax, rax points at the address right after the callq; from the index values stored there, the MethodDesc of the function to be compiled can be derived. Execution then jumps to ThePreStub:

ThePreStub:

(lldb) di --frame --bytes
-> 0x7ffff5bda040 <ThePreStub>: 55 pushq %rbp
0x7ffff5bda041 <ThePreStub+1>: 48 89 e5 movq %rsp, %rbp
0x7ffff5bda044 <ThePreStub+4>: 53 pushq %rbx
0x7ffff5bda045 <ThePreStub+5>: 41 57 pushq %r15
0x7ffff5bda047 <ThePreStub+7>: 41 56 pushq %r14
0x7ffff5bda049 <ThePreStub+9>: 41 55 pushq %r13
0x7ffff5bda04b <ThePreStub+11>: 41 54 pushq %r12
0x7ffff5bda04d <ThePreStub+13>: 41 51 pushq %r9
0x7ffff5bda04f <ThePreStub+15>: 41 50 pushq %r8
0x7ffff5bda051 <ThePreStub+17>: 51 pushq %rcx
0x7ffff5bda052 <ThePreStub+18>: 52 pushq %rdx
0x7ffff5bda053 <ThePreStub+19>: 56 pushq %rsi
0x7ffff5bda054 <ThePreStub+20>: 57 pushq %rdi
0x7ffff5bda055 <ThePreStub+21>: 48 8d a4 24 78 ff ff ff leaq -0x88(%rsp), %rsp ; allocate transition block
0x7ffff5bda05d <ThePreStub+29>: 66 0f 7f 04 24 movdqa %xmm0, (%rsp) ; fill transition block
0x7ffff5bda062 <ThePreStub+34>: 66 0f 7f 4c 24 10 movdqa %xmm1, 0x10(%rsp) ; fill transition block
0x7ffff5bda068 <ThePreStub+40>: 66 0f 7f 54 24 20 movdqa %xmm2, 0x20(%rsp) ; fill transition block
0x7ffff5bda06e <ThePreStub+46>: 66 0f 7f 5c 24 30 movdqa %xmm3, 0x30(%rsp) ; fill transition block
0x7ffff5bda074 <ThePreStub+52>: 66 0f 7f 64 24 40 movdqa %xmm4, 0x40(%rsp) ; fill transition block
0x7ffff5bda07a <ThePreStub+58>: 66 0f 7f 6c 24 50 movdqa %xmm5, 0x50(%rsp) ; fill transition block
0x7ffff5bda080 <ThePreStub+64>: 66 0f 7f 74 24 60 movdqa %xmm6, 0x60(%rsp) ; fill transition block
0x7ffff5bda086 <ThePreStub+70>: 66 0f 7f 7c 24 70 movdqa %xmm7, 0x70(%rsp) ; fill transition block
0x7ffff5bda08c <ThePreStub+76>: 48 8d bc 24 88 00 00 00 leaq 0x88(%rsp), %rdi ; arg 1 = transition block*
0x7ffff5bda094 <ThePreStub+84>: 4c 89 d6 movq %r10, %rsi ; arg 2 = methoddesc
0x7ffff5bda097 <ThePreStub+87>: e8 44 7e 11 00 callq 0x7ffff5cf1ee0 ; PreStubWorker at prestub.cpp:958
0x7ffff5bda09c <ThePreStub+92>: 66 0f 6f 04 24 movdqa (%rsp), %xmm0
0x7ffff5bda0a1 <ThePreStub+97>: 66 0f 6f 4c 24 10 movdqa 0x10(%rsp), %xmm1
0x7ffff5bda0a7 <ThePreStub+103>: 66 0f 6f 54 24 20 movdqa 0x20(%rsp), %xmm2
0x7ffff5bda0ad <ThePreStub+109>: 66 0f 6f 5c 24 30 movdqa 0x30(%rsp), %xmm3
0x7ffff5bda0b3 <ThePreStub+115>: 66 0f 6f 64 24 40 movdqa 0x40(%rsp), %xmm4
0x7ffff5bda0b9 <ThePreStub+121>: 66 0f 6f 6c 24 50 movdqa 0x50(%rsp), %xmm5
0x7ffff5bda0bf <ThePreStub+127>: 66 0f 6f 74 24 60 movdqa 0x60(%rsp), %xmm6
0x7ffff5bda0c5 <ThePreStub+133>: 66 0f 6f 7c 24 70 movdqa 0x70(%rsp), %xmm7
0x7ffff5bda0cb <ThePreStub+139>: 48 8d a4 24 88 00 00 00 leaq 0x88(%rsp), %rsp
0x7ffff5bda0d3 <ThePreStub+147>: 5f popq %rdi
0x7ffff5bda0d4 <ThePreStub+148>: 5e popq %rsi
0x7ffff5bda0d5 <ThePreStub+149>: 5a popq %rdx
0x7ffff5bda0d6 <ThePreStub+150>: 59 popq %rcx
0x7ffff5bda0d7 <ThePreStub+151>: 41 58 popq %r8
0x7ffff5bda0d9 <ThePreStub+153>: 41 59 popq %r9
0x7ffff5bda0db <ThePreStub+155>: 41 5c popq %r12
0x7ffff5bda0dd <ThePreStub+157>: 41 5d popq %r13
0x7ffff5bda0df <ThePreStub+159>: 41 5e popq %r14
0x7ffff5bda0e1 <ThePreStub+161>: 41 5f popq %r15
0x7ffff5bda0e3 <ThePreStub+163>: 5b popq %rbx
0x7ffff5bda0e4 <ThePreStub+164>: 5d popq %rbp
0x7ffff5bda0e5 <ThePreStub+165>: 48 ff e0 jmpq *%rax
%rax should be patched fixup precode = 0x7fff7c21f5a8
(%rsp) should be the return address before calling "Fixup Precode"

It looks rather long, but what it does is simple. Its source is in vm\amd64\theprestubamd64.S:

NESTED_ENTRY ThePreStub, _TEXT, NoHandler

        PROLOG_WITH_TRANSITION_BLOCK 0, 0, 0, 0, 0

        //
        // call PreStubWorker
        //
        lea             rdi, [rsp + __PWTB_TransitionBlock]     // pTransitionBlock*
        mov             rsi, METHODDESC_REGISTER
        call            C_FUNC(PreStubWorker)

        EPILOG_WITH_TRANSITION_BLOCK_TAILCALL
        TAILJMP_RAX

NESTED_END ThePreStub, _TEXT

It backs up the registers onto the stack, calls the PreStubWorker function, restores the registers from the stack afterwards, and then jumps to PreStubWorker's return value, which is the address of the patched Fixup Precode (0x7fff7c21f5a8).

PreStubWorker is a C++ function; it invokes the JIT's compile function and then patches the Fixup Precode.

When patching, the 5e byte mentioned earlier is read; 5e indicates that the precode type is PRECODE_FIXUP, and the patching function is FixupPrecode::SetTargetInterlocked.

After patching, the Fixup Precode looks like this:

Fixup Precode:

(lldb) di --bytes -s 0x7fff7c21f5a8
0x7fff7c21f5a8: e9 a3 87 3a 00 jmp 0x7fff7c5c7d50
0x7fff7c21f5ad: 5f popq %rdi
0x7fff7c21f5ae: 19 05 e8 23 6c fe sbbl %eax, -0x193dc18(%rip)
0x7fff7c21f5b4: ff 5e a8 lcalll *-0x58(%rsi)
0x7fff7c21f5b7: 04 e8 addb $-0x18, %al
0x7fff7c21f5b9: 1b 6c fe ff sbbl -0x1(%rsi,%rdi,8), %ebp
0x7fff7c21f5bd: 5e popq %rsi
0x7fff7c21f5be: 00 03 addb %al, (%rbx)
0x7fff7c21f5c0: e8 13 6c fe ff callq 0x7fff7c2061d8
0x7fff7c21f5c5: 5e popq %rsi
0x7fff7c21f5c6: b0 02 movb $0x2, %al

The next time the function is called, it jmps directly to the compiled code.

The JIT stub mechanism lets the runtime compile only the functions that actually run, which greatly reduces program startup time; the overhead of subsequent calls (a single jmp) is also very small.

Note that calling a virtual method follows a slightly different flow: the address of a virtual method is stored in the method table, and it is the method table entry, not the precode, that gets patched. On subsequent calls the method table entry already points at the compiled code. You can analyze this yourself if interested.

Next, let's look at what PreStubWorker does internally.

The JIT entry point

The source of PreStubWorker is as follows:

extern "C" PCODE STDCALL PreStubWorker(TransitionBlock * pTransitionBlock, MethodDesc * pMD)
{
    PCODE pbRetVal = NULL;

    BEGIN_PRESERVE_LAST_ERROR;

    STATIC_CONTRACT_THROWS;
    STATIC_CONTRACT_GC_TRIGGERS;
    STATIC_CONTRACT_MODE_COOPERATIVE;
    STATIC_CONTRACT_ENTRY_POINT;

    MAKE_CURRENT_THREAD_AVAILABLE();

#ifdef _DEBUG
    Thread::ObjectRefFlush(CURRENT_THREAD);
#endif

    FrameWithCookie<PrestubMethodFrame> frame(pTransitionBlock, pMD);
    PrestubMethodFrame * pPFrame = &frame;

    pPFrame->Push(CURRENT_THREAD);

    INSTALL_MANAGED_EXCEPTION_DISPATCHER;
    INSTALL_UNWIND_AND_CONTINUE_HANDLER;

    ETWOnStartup (PrestubWorker_V1, PrestubWorkerEnd_V1);

    _ASSERTE(!NingenEnabled() && "You cannot invoke managed code inside the ngen compilation process.");

    // Running the PreStubWorker on a method causes us to access its MethodTable
    g_IBCLogger.LogMethodDescAccess(pMD);

    // Make sure the method table is restored, and method instantiation if present
    pMD->CheckRestore();

    CONSISTENCY_CHECK(GetAppDomain()->CheckCanExecuteManagedCode(pMD));

    // Note this is redundant with the above check but we do it anyway for safety
    //
    // This has been disabled so we have a better chance of catching these. Note that this check is
    // NOT sufficient for domain neutral and ngen cases.
    //
    // pMD->EnsureActive();

    MethodTable *pDispatchingMT = NULL;

    if (pMD->IsVtableMethod())
    {
        OBJECTREF curobj = pPFrame->GetThis();

        if (curobj != NULL) // Check for virtual function called non-virtually on a NULL object
        {
            pDispatchingMT = curobj->GetTrueMethodTable();

#ifdef FEATURE_ICASTABLE
            if (pDispatchingMT->IsICastable())
            {
                MethodTable *pMDMT = pMD->GetMethodTable();
                TypeHandle objectType(pDispatchingMT);
                TypeHandle methodType(pMDMT);

                GCStress<cfg_any>::MaybeTrigger();
                INDEBUG(curobj = NULL); // curobj is unprotected and CanCastTo() can trigger GC
                if (!objectType.CanCastTo(methodType))
                {
                    // Apparently ICastable magic was involved when we chose this method to be called
                    // that's why we better stick to the MethodTable it belongs to, otherwise
                    // DoPrestub() will fail not being able to find implementation for pMD in pDispatchingMT.

                    pDispatchingMT = pMDMT;
                }
            }
#endif // FEATURE_ICASTABLE

            // For value types, the only virtual methods are interface implementations.
            // Thus pDispatching == pMT because there
            // is no inheritance in value types. Note the BoxedEntryPointStubs are shared
            // between all sharable generic instantiations, so the == test is on
            // canonical method tables.
#ifdef _DEBUG
            MethodTable *pMDMT = pMD->GetMethodTable(); // put this here to see what the MT is in debug mode
            _ASSERTE(!pMD->GetMethodTable()->IsValueType() ||
                     (pMD->IsUnboxingStub() && (pDispatchingMT->GetCanonicalMethodTable() == pMDMT->GetCanonicalMethodTable())));
#endif // _DEBUG
        }
    }

    GCX_PREEMP_THREAD_EXISTS(CURRENT_THREAD);
    pbRetVal = pMD->DoPrestub(pDispatchingMT);

    UNINSTALL_UNWIND_AND_CONTINUE_HANDLER;
    UNINSTALL_MANAGED_EXCEPTION_DISPATCHER;

    {
        HardwareExceptionHolder

        // Give debugger opportunity to stop here
        ThePreStubPatch();
    }

    pPFrame->Pop(CURRENT_THREAD);

    POSTCONDITION(pbRetVal != NULL);

    END_PRESERVE_LAST_ERROR;

    return pbRetVal;
}

This function takes two parameters.

The first is a TransitionBlock, which is simply a pointer into the stack where the backed-up registers are stored.

The second is a MethodDesc, the information about the function to be compiled; in lldb you can inspect it with dumpmd pMD.

It then calls MethodDesc::DoPrestub; if the function is a virtual method, the MethodTable of the this object's type is passed in.

The source of MethodDesc::DoPrestub is as follows:

PCODE MethodDesc::DoPrestub(MethodTable *pDispatchingMT)
{
    CONTRACT(PCODE)
    {
        STANDARD_VM_CHECK;
        POSTCONDITION(RETVAL != NULL);
    }
    CONTRACT_END;

    Stub *pStub = NULL;
    PCODE pCode = NULL;

    Thread *pThread = GetThread();

    MethodTable *pMT = GetMethodTable();

    // Running a prestub on a method causes us to access its MethodTable
    g_IBCLogger.LogMethodDescAccess(this);

    // A secondary layer of defense against executing code in inspection-only assembly.
    // This should already have been taken care of by not allowing inspection assemblies
    // to be activated. However, this is a very inexpensive piece of insurance in the name
    // of security.
    if (IsIntrospectionOnly())
    {
        _ASSERTE(!"A ReflectionOnly assembly reached the prestub. This should not have happened.");
        COMPlusThrow(kInvalidOperationException, IDS_EE_CODEEXECUTION_IN_INTROSPECTIVE_ASSEMBLY);
    }

    if (ContainsGenericVariables())
    {
        COMPlusThrow(kInvalidOperationException, IDS_EE_CODEEXECUTION_CONTAINSGENERICVAR);
    }

    /**************************   DEBUG CHECKS   *************************/
    /*-----------------------------------------------------------------
    // Halt if needed, GC stress, check the sharing count etc.
    */

#ifdef _DEBUG
    static unsigned ctr = 0;
    ctr++;

    if (g_pConfig->ShouldPrestubHalt(this))
    {
        _ASSERTE(!"PreStubHalt");
    }

    LOG((LF_CLASSLOADER, LL_INFO10000, "In PreStubWorker for %s::%s\n",
         m_pszDebugClassName, m_pszDebugMethodName));

    // This is a nice place to test out having some fatal EE errors. We do this only in a checked build, and only
    // under the InjectFatalError key.
    if (g_pConfig->InjectFatalError() == 1)
    {
        EEPOLICY_HANDLE_FATAL_ERROR(COR_E_EXECUTIONENGINE);
    }
    else if (g_pConfig->InjectFatalError() == 2)
    {
        EEPOLICY_HANDLE_FATAL_ERROR(COR_E_STACKOVERFLOW);
    }
    else if (g_pConfig->InjectFatalError() == 3)
    {
        TestSEHGuardPageRestore();
    }

    // Useful to test GC with the prestub on the call stack
    if (g_pConfig->ShouldPrestubGC(this))
    {
        GCX_COOP();
        GCHeap::GetGCHeap()->GarbageCollect(-1);
    }
#endif // _DEBUG

    STRESS_LOG1(LF_CLASSLOADER, LL_INFO10000, "Prestubworker: method %pM\n", this);

    GCStress<cfg_any, EeconfigFastGcSPolicy, CoopGcModePolicy>::MaybeTrigger();

    // Are we in the prestub because of a rejit request? If so, let the ReJitManager
    // take it from here.
    pCode = ReJitManager::DoReJitIfNecessary(this);
    if (pCode != NULL)
    {
        // A ReJIT was performed, so nothing left for DoPrestub() to do. Return now.
        //
        // The stable entrypoint will either be a pointer to the original JITted code
        // (with a jmp at the top to jump to the newly-rejitted code) OR a pointer to any
        // stub code that must be executed first (e.g., a remoting stub), which in turn
        // will call the original JITted code (which then jmps to the newly-rejitted
        // code).
        RETURN GetStableEntryPoint();
    }

#ifdef FEATURE_PREJIT
    // If this method is the root of a CER call graph and we've recorded this fact in the ngen image then we're in the prestub in
    // order to trip any runtime level preparation needed for this graph (P/Invoke stub generation/library binding, generic
    // dictionary prepopulation etc.).
    GetModule()->RestoreCer(this);
#endif // FEATURE_PREJIT

#ifdef FEATURE_COMINTEROP
    /**************************   INTEROP   *************************/
    /*-----------------------------------------------------------------
    // Some method descriptors are COMPLUS-to-COM call descriptors
    // they are not your every day method descriptors, for example
    // they don't have an IL or code.
    */
    if (IsComPlusCall() || IsGenericComPlusCall())
    {
        pCode = GetStubForInteropMethod(this);

        GetPrecode()->SetTargetInterlocked(pCode);

        RETURN GetStableEntryPoint();
    }
#endif // FEATURE_COMINTEROP

    // workaround: This is to handle a punted work item dealing with a skipped module constructor
    //       due to appdomain unload. Basically shared code was JITted in domain A, and then
    //       this caused a link to another shared module with a module CCTOR, which was skipped
    //       or aborted in another appdomain we were trying to propagate the activation to.
    //
    //       Note that this is not a fix, but that it just minimizes the window in which the
    //       issue can occur.
    if (pThread->IsAbortRequested())
    {
        pThread->HandleThreadAbort();
    }

    /**************************   CLASS CONSTRUCTOR   ********************/
    // Make sure .cctor has been run
    if (IsClassConstructorTriggeredViaPrestub())
    {
        pMT->CheckRunClassInitThrowing();
    }

    /**************************   BACKPATCHING   *************************/
    // See if the addr of code has changed from the pre-stub
#ifdef FEATURE_INTERPRETER
    if (!IsReallyPointingToPrestub())
#else
    if (!IsPointingToPrestub())
#endif
    {
        LOG((LF_CLASSLOADER, LL_INFO10000,
             "    In PreStubWorker, method already jitted, backpatching call point\n"));

        RETURN DoBackpatch(pMT, pDispatchingMT, TRUE);
    }

    // record if remoting needs to intercept this call
    BOOL fRemotingIntercepted = IsRemotingInterceptedViaPrestub();

    BOOL fReportCompilationFinished = FALSE;

    /**************************   CODE CREATION  *************************/
    if (IsUnboxingStub())
    {
        pStub = MakeUnboxingStubWorker(this);
    }
#ifdef FEATURE_REMOTING
    else if (pMT->IsInterface() && !IsStatic() && !IsFCall())
    {
        pCode = CRemotingServices::GetDispatchInterfaceHelper(this);
        GetOrCreatePrecode();
    }
#endif // FEATURE_REMOTING
#if defined(FEATURE_SHARE_GENERIC_CODE)
    else if (IsInstantiatingStub())
    {
        pStub = MakeInstantiatingStubWorker(this);
    }
#endif // defined(FEATURE_SHARE_GENERIC_CODE)
    else if (IsIL() || IsNoMetadata())
    {
        // remember if we need to backpatch the MethodTable slot
        BOOL fBackpatch = !fRemotingIntercepted
                          && !IsEnCMethod();

#ifdef FEATURE_PREJIT
        //
        // See if we have any prejitted code to use.
        //
        pCode = GetPreImplementedCode();

#ifdef PROFILING_SUPPORTED
        if (pCode != NULL)
        {
            BOOL fShouldSearchCache = TRUE;

            {
                BEGIN_PIN_PROFILER(CORProfilerTrackCacheSearches());
                g_profControlBlock.pProfInterface->
                    JITCachedFunctionSearchStarted((FunctionID) this,
                                                   &fShouldSearchCache);
                END_PIN_PROFILER();
            }

            if (!fShouldSearchCache)
            {
#ifdef FEATURE_INTERPRETER
                SetNativeCodeInterlocked(NULL, pCode, FALSE);
#else
                SetNativeCodeInterlocked(NULL, pCode);
#endif
                _ASSERTE(!IsPreImplemented());
                pCode = NULL;
            }
        }
#endif // PROFILING_SUPPORTED

        if (pCode != NULL)
        {
            LOG((LF_ZAP, LL_INFO10000,
                 "ZAP: Using code" FMT_ADDR "for %s.%s sig=\"%s\" (token %x).\n",
                 DBG_ADDR(pCode),
                 m_pszDebugClassName,
                 m_pszDebugMethodName,
                 m_pszDebugMethodSignature,
                 GetMemberDef()));

            TADDR pFixupList = GetFixupList();
            if (pFixupList != NULL)
            {
                Module *pZapModule = GetZapModule();
                _ASSERTE(pZapModule != NULL);
                if (!pZapModule->FixupDelayList(pFixupList))
                {
                    _ASSERTE(!"FixupDelayList failed");
                    ThrowHR(COR_E_BADIMAGEFORMAT);
                }
            }

#ifdef HAVE_GCCOVER
            if (GCStress<cfg_instr_ngen>::IsEnabled())
                SetupGcCoverage(this, (BYTE*) pCode);
#endif // HAVE_GCCOVER

#ifdef PROFILING_SUPPORTED
            /*
             * This notifies the profiler that a search to find a
             * cached jitted function has been made.
             */
            {
                BEGIN_PIN_PROFILER(CORProfilerTrackCacheSearches());
                g_profControlBlock.pProfInterface->
                    JITCachedFunctionSearchFinished((FunctionID) this, COR_PRF_CACHED_FUNCTION_FOUND);
                END_PIN_PROFILER();
            }
#endif // PROFILING_SUPPORTED
        }

        //
        // If not, try to jit it
        //

#endif // FEATURE_PREJIT

#ifdef FEATURE_READYTORUN
        if (pCode == NULL)
        {
            Module * pModule = GetModule();
            if (pModule->IsReadyToRun())
            {
                pCode = pModule->GetReadyToRunInfo()->GetEntryPoint(this);
                if (pCode != NULL)
                    fReportCompilationFinished = TRUE;
            }
        }
#endif // FEATURE_READYTORUN

        if (pCode == NULL)
        {
            NewHolder<COR_ILMETHOD_DECODER> pHeader(NULL);
            // Get the information on the method
            if (!IsNoMetadata())
            {
                COR_ILMETHOD* ilHeader = GetILHeader(TRUE);
                if(ilHeader == NULL)
                {
#ifdef FEATURE_COMINTEROP
                    // Abstract methods can be called through WinRT derivation if the deriving type
                    // is not implemented in managed code, and calls through the CCW to the abstract
                    // method. Throw a sensible exception in that case.
                    if (pMT->IsExportedToWinRT() && IsAbstract())
                    {
                        COMPlusThrowHR(E_NOTIMPL);
                    }
#endif // FEATURE_COMINTEROP

                    COMPlusThrowHR(COR_E_BADIMAGEFORMAT, BFA_BAD_IL);
                }

                COR_ILMETHOD_DECODER::DecoderStatus status = COR_ILMETHOD_DECODER::FORMAT_ERROR;
                {
                    // Decoder ctor can AV on a malformed method header
                    AVInRuntimeImplOkayHolder AVOkay;
                    pHeader = new COR_ILMETHOD_DECODER(ilHeader, GetMDImport(), &status);
                    if(pHeader == NULL)
                        status = COR_ILMETHOD_DECODER::FORMAT_ERROR;
                }

                if (status == COR_ILMETHOD_DECODER::VERIFICATION_ERROR &&
                    Security::CanSkipVerification(GetModule()->GetDomainAssembly()))
                {
                    status = COR_ILMETHOD_DECODER::SUCCESS;
                }

                if (status != COR_ILMETHOD_DECODER::SUCCESS)
                {
                    if (status == COR_ILMETHOD_DECODER::VERIFICATION_ERROR)
                    {
                        // Throw a verification HR
                        COMPlusThrowHR(COR_E_VERIFICATION);
                    }
                    else
                    {
                        COMPlusThrowHR(COR_E_BADIMAGEFORMAT, BFA_BAD_IL);
                    }
                }

#ifdef _VER_EE_VERIFICATION_ENABLED
                static ConfigDWORD peVerify;

                if (peVerify.val(CLRConfig::EXTERNAL_PEVerify))
                    Verify(pHeader, TRUE, FALSE);   // Throws a VerifierException if verification fails
#endif // _VER_EE_VERIFICATION_ENABLED
            } // end if (!IsNoMetadata())

            // JIT it
            LOG((LF_CLASSLOADER, LL_INFO1000000,
                 "    In PreStubWorker, calling MakeJitWorker\n"));

            // Create the precode eagerly if it is going to be needed later.
            if (!fBackpatch)
            {
                GetOrCreatePrecode();
            }

            // Mark the code as hot in case the method ends up in the native image
            g_IBCLogger.LogMethodCodeAccess(this);

            pCode = MakeJitWorker(pHeader, 0, 0);

#ifdef FEATURE_INTERPRETER
            if ((pCode != NULL) && !HasStableEntryPoint())
            {
                // We don't yet have a stable entry point, so don't do backpatching yet.
                // But we do have to handle some extra cases that occur in backpatching.
                // (Perhaps I *should* get to the backpatching code, but in a mode where we know
                // we're not dealing with the stable entry point...)
                if (HasNativeCodeSlot())
                {
                    // We called "SetNativeCodeInterlocked" in MakeJitWorker, which updated the native
                    // code slot, but I think we also want to update the regular slot...
                    PCODE tmpEntry = GetTemporaryEntryPoint();
                    PCODE pFound = FastInterlockCompareExchangePointer(GetAddrOfSlot(), pCode, tmpEntry);
                    // Doesn't matter if we failed -- if we did, it's because somebody else made progress.
                    if (pFound != tmpEntry) pCode = pFound;
                }

                // Now we handle the case of a FuncPtrPrecode.
                FuncPtrStubs * pFuncPtrStubs = GetLoaderAllocator()->GetFuncPtrStubsNoCreate();
                if (pFuncPtrStubs != NULL)
                {
                    Precode* pFuncPtrPrecode = pFuncPtrStubs->Lookup(this);
                    if (pFuncPtrPrecode != NULL)
                    {
                        // If there is a funcptr precode to patch, attempt to patch it. If we lose, that's OK,
                        // somebody else made progress.
                        pFuncPtrPrecode->SetTargetInterlocked(pCode);
                    }
                }
            }
#endif // FEATURE_INTERPRETER
        } // end if (pCode == NULL)
    } // end else if (IsIL() || IsNoMetadata())
    else if (IsNDirect())
    {
        if (!GetModule()->GetSecurityDescriptor()->CanCallUnmanagedCode())
            Security::ThrowSecurityException(g_SecurityPermissionClassName, SPFLAGSUNMANAGEDCODE);

        pCode = GetStubForInteropMethod(this);
        GetOrCreatePrecode();
    }
    else if (IsFCall())
    {
        // Get the fcall implementation
        BOOL fSharedOrDynamicFCallImpl;
        pCode = ECall::GetFCallImpl(this, &fSharedOrDynamicFCallImpl);

        if (fSharedOrDynamicFCallImpl)
        {
            // Fake ctors share one implementation that has to be wrapped by prestub
            GetOrCreatePrecode();
        }
    }
    else if (IsArray())
    {
        pStub = GenerateArrayOpStub((ArrayMethodDesc*)this);
    }
    else if (IsEEImpl())
    {
        _ASSERTE(GetMethodTable()->IsDelegate());
        pCode = COMDelegate::GetInvokeMethodStub((EEImplMethodDesc*)this);
        GetOrCreatePrecode();
    }
    else
    {
        // This is a method type we don't handle yet
        _ASSERTE(!"Unknown Method Type");
    }

    /**************************   POSTJIT *************************/
#ifndef FEATURE_INTERPRETER
    _ASSERTE(pCode == NULL || GetNativeCode() == NULL || pCode == GetNativeCode());
#else // FEATURE_INTERPRETER
    // Interpreter adds a new possibility == someone else beat us to installing an interpreter stub.
    _ASSERTE(pCode == NULL || GetNativeCode() == NULL || pCode == GetNativeCode()
             || Interpreter::InterpretationStubToMethodInfo(pCode) == this);
#endif // FEATURE_INTERPRETER

    // At this point we must have either a pointer to managed code or to a stub. All of the above code
    // should have thrown an exception if it couldn't make a stub.
    _ASSERTE((pStub != NULL) ^ (pCode != NULL));

    /**************************   SECURITY   *************************/

    // Lets check to see if we need declarative security on this stub, If we have
    // security checks on this method or class then we need to add an intermediate
    // stub that performs declarative checks prior to calling the real stub.
    // record if security needs to intercept this call (also depends on whether we plan to use stubs for declarative security)

#if !defined( HAS_REMOTING_PRECODE) && defined (FEATURE_REMOTING)
    /**************************   REMOTING   *************************/

    // check for MarshalByRef scenarios ... we need to intercept
    // Non-virtual calls on MarshalByRef types
    if (fRemotingIntercepted)
    {
        // let us setup a remoting stub to intercept all the calls
        Stub *pRemotingStub = CRemotingServices::GetStubForNonVirtualMethod(this,
            (pStub != NULL) ? (LPVOID)pStub->GetEntryPoint() : (LPVOID)pCode, pStub);

        if (pRemotingStub != NULL)
        {
            pStub = pRemotingStub;
            pCode = NULL;
        }
    }
#endif // HAS_REMOTING_PRECODE

    _ASSERTE((pStub != NULL) ^ (pCode != NULL));

#if defined(_TARGET_X86_) || defined(_TARGET_AMD64_)
    //
    // We are seeing memory reordering race around fixups (see DDB 193514 and related bugs). We get into
    // situation where the patched precode is visible by other threads, but the resolved fixups
    // are not. IT SHOULD NEVER HAPPEN according to our current understanding of x86/x64 memory model.
    // (see email thread attached to the bug for details).
    //
    // We suspect that there may be bug in the hardware or that hardware may have shortcuts that may be
    // causing grief. We will try to avoid the race by executing an extra memory barrier.
    //
    MemoryBarrier();
#endif

    if (pCode != NULL)
    {
        if (HasPrecode())
            GetPrecode()->SetTargetInterlocked(pCode);
        else
        if (!HasStableEntryPoint())
        {
            // Is the result an interpreter stub?
#ifdef FEATURE_INTERPRETER
            if (Interpreter::InterpretationStubToMethodInfo(pCode) == this)
            {
                SetEntryPointInterlocked(pCode);
            }
            else
#endif // FEATURE_INTERPRETER
            {
                SetStableEntryPointInterlocked(pCode);
            }
        }
    }
    else
    {
        if (!GetOrCreatePrecode()->SetTargetInterlocked(pStub->GetEntryPoint()))
        {
            pStub->DecRef();
        }
        else
        if (pStub->HasExternalEntryPoint())
        {
            // If the Stub wraps code that is outside of the Stub allocation, then we
            // need to free the Stub allocation now.
            pStub->DecRef();
        }
    }

#ifdef FEATURE_INTERPRETER
    _ASSERTE(!IsReallyPointingToPrestub());
#else // FEATURE_INTERPRETER
    _ASSERTE(!IsPointingToPrestub());
    _ASSERTE(HasStableEntryPoint());
#endif // FEATURE_INTERPRETER

    if (fReportCompilationFinished)
        DACNotifyCompilationFinished(this);

    RETURN DoBackpatch(pMT, pDispatchingMT, FALSE);
}

This function is rather long, but we only need to pay attention to two places:

pCode = MakeJitWorker(pHeader, 0, 0);

MakeJitWorker calls the JIT's compile function; pCode is the address of the compiled machine code.

if (HasPrecode())
    GetPrecode()->SetTargetInterlocked(pCode);

SetTargetInterlocked patches the precode, so subsequent calls of the function jump directly to the compiled code.

The source of MakeJitWorker is as follows:

PCODE MethodDesc::MakeJitWorker(COR_ILMETHOD_DECODER* ILHeader, DWORD flags, DWORD flags2)
{
    STANDARD_VM_CONTRACT;

    BOOL fIsILStub = IsILStub();        // @TODO: understand the need for this special case

    LOG((LF_JIT, LL_INFO1000000,
         "MakeJitWorker(" FMT_ADDR ", %s) for %s:%s\n",
         DBG_ADDR(this),
         fIsILStub               ? " TRUE" : "FALSE",
         GetMethodTable()->GetDebugClassName(),
         m_pszDebugMethodName));

    PCODE pCode = NULL;
    ULONG sizeOfCode = 0;
#ifdef FEATURE_INTERPRETER
    PCODE pPreviousInterpStub = NULL;
    BOOL fInterpreted = FALSE;
    BOOL fStable = TRUE;  // True iff the new code address (to be stored in pCode), is a stable entry point.
#endif

#ifdef FEATURE_MULTICOREJIT
    MulticoreJitManager & mcJitManager = GetAppDomain()->GetMulticoreJitManager();

    bool fBackgroundThread = (flags & CORJIT_FLG_MCJIT_BACKGROUND) != 0;
#endif

    {
        // Enter the global lock which protects the list of all functions being JITd
        ListLockHolder pJitLock (GetDomain()->GetJitLock());

        // It is possible that another thread stepped in before we entered the global lock for the first time.
        pCode = GetNativeCode();
        if (pCode != NULL)
        {
#ifdef FEATURE_INTERPRETER
            if (Interpreter::InterpretationStubToMethodInfo(pCode) == this)
            {
                pPreviousInterpStub = pCode;
            }
            else
#endif // FEATURE_INTERPRETER
            goto Done;
        }

        const char *description = "jit lock";
        INDEBUG(description = m_pszDebugMethodName;)
        ListLockEntryHolder pEntry(ListLockEntry::Find(pJitLock, this, description));

        // We have an entry now, we can release the global lock
        pJitLock.Release();

        // Take the entry lock
        {
            ListLockEntryLockHolder pEntryLock(pEntry, FALSE);

            if (pEntryLock.DeadlockAwareAcquire())
            {
                if (pEntry->m_hrResultCode == S_FALSE)
                {
                    // Nobody has jitted the method yet
                }
                else
                {
                    // We came in to jit but someone beat us so return the
                    // jitted method!

                    // We can just fall through because we will notice below that
                    // the method has code.

                    // @todo: Note that we may have a failed HRESULT here -
                    // we might want to return an early error rather than
                    // repeatedly failing the jit.
                }
            }
            else
            {
                // Taking this lock would cause a deadlock (presumably because we
                // are involved in a class constructor circular dependency.) For
                // instance, another thread may be waiting to run the class constructor
                // that we are jitting, but is currently jitting this function.
                //
                // To remedy this, we want to go ahead and do the jitting anyway.
                // The other threads contending for the lock will then notice that
                // the jit finished while they were running class constructors, and abort their
                // current jit effort.
                //
                // We don't have to do anything special right here since we
                // can check HasNativeCode() to detect this case later.
                //
                // Note that at this point we don't have the lock, but that's OK because the
                // thread which does have the lock is blocked waiting for us.
            }

            // It is possible that another thread stepped in before we entered the lock.
            pCode = GetNativeCode();
#ifdef FEATURE_INTERPRETER
            if (pCode != NULL && (pCode != pPreviousInterpStub))
#else
            if (pCode != NULL)
#endif // FEATURE_INTERPRETER
            {
                goto Done;
            }

            SString namespaceOrClassName, methodName, methodSignature;

            PCODE pOtherCode = NULL; // Need to move here due to 'goto GotNewCode'

#ifdef FEATURE_MULTICOREJIT

            bool fCompiledInBackground = false;

            // If not called from multi-core JIT thread,
            if (! fBackgroundThread)
            {
                // Quick check before calling expensive out of line function on this method's domain has code JITted by background thread
                if (mcJitManager.GetMulticoreJitCodeStorage().GetRemainingMethodCount() > 0)
                {
                    if (MulticoreJitManager::IsMethodSupported(this))
                    {
                        pCode = mcJitManager.RequestMethodCode(this); // Query multi-core JIT manager for compiled code

                        // Multicore JIT manager starts background thread to pre-compile methods, but it does not back-patch it/notify profiler/notify DAC,
                        // Jump to GotNewCode to do so
                        if (pCode != NULL)
                        {
                            fCompiledInBackground = true;

#ifdef DEBUGGING_SUPPORTED
                            // Notify the debugger of the jitted function
                            if (g_pDebugInterface != NULL)
                            {
                                g_pDebugInterface->JITComplete(this, pCode);
                            }
#endif

                            goto GotNewCode;
                        }
                    }
                }
            }
#endif

            if (fIsILStub)
            {
                // we race with other threads to JIT the code for an IL stub and the
                // IL header is released once one of the threads completes. As a result
                // we must be inside the lock to reliably get the IL header for the
                // stub.

                ILStubResolver* pResolver = AsDynamicMethodDesc()->GetILStubResolver();
                ILHeader = pResolver->GetILHeader();
            }

#ifdef MDA_SUPPORTED
            MdaJitCompilationStart* pProbe = MDA_GET_ASSISTANT(JitCompilationStart);
            if (pProbe)
                pProbe->NowCompiling(this);
#endif // MDA_SUPPORTED

#ifdef PROFILING_SUPPORTED
            // If profiling, need to give a chance for a tool to examine and modify
            // the IL before it gets to the JIT. This allows one to add probe calls for
            // things like code coverage, performance, or whatever.
            {
                BEGIN_PIN_PROFILER(CORProfilerTrackJITInfo());

                // Multicore JIT should be disabled when CORProfilerTrackJITInfo is on
                // But there could be corner case in which profiler is attached when multicore background thread is calling MakeJitWorker
                // Disable this block when calling from multicore JIT background thread
                if (!IsNoMetadata()
#ifdef FEATURE_MULTICOREJIT
                    && (! fBackgroundThread)
#endif
                    )
                {
                    g_profControlBlock.pProfInterface->JITCompilationStarted((FunctionID) this, TRUE);
                    // The profiler may have changed the code on the callback. Need to
                    // pick up the new code. Note that you have to be fully trusted in
                    // this mode and the code will not be verified.
                    COR_ILMETHOD *pilHeader = GetILHeader(TRUE);
                    new (ILHeader) COR_ILMETHOD_DECODER(pilHeader, GetMDImport(), NULL);
                }
                END_PIN_PROFILER();
            }
#endif // PROFILING_SUPPORTED

#ifdef FEATURE_INTERPRETER
            // We move the ETW event for start of JITting inward, after we make the decision
            // to JIT rather than interpret.
#else // FEATURE_INTERPRETER
            // Fire an ETW event to mark the beginning of JIT'ing
            ETW::MethodLog::MethodJitting(this, &namespaceOrClassName, &methodName, &methodSignature);
#endif // FEATURE_INTERPRETER

#ifdef FEATURE_STACK_SAMPLING
#ifdef FEATURE_MULTICOREJIT
            if (!fBackgroundThread)
#endif // FEATURE_MULTICOREJIT
            {
                StackSampler::RecordJittingInfo(this, flags, flags2);
            }
#endif // FEATURE_STACK_SAMPLING

            EX_TRY
{
pCode = UnsafeJitFunction(this, ILHeader, flags, flags2, &sizeOfCode);
}
EX_CATCH
{
// If the current thread threw an exception, but a competing thread
// somehow succeeded at JITting the same function (e.g., out of memory
// encountered on current thread but not competing thread), then go ahead
// and swallow this current thread's exception, since we somehow managed
// to successfully JIT the code on the other thread.
//
// Note that if a deadlock cycle is broken, that does not result in an
// exception--the thread would just pass through the lock and JIT the
// function in competition with the other thread (with the winner of the
// race decided later on when we do SetNativeCodeInterlocked). This
// try/catch is purely to deal with the (unusual) case where a competing
// thread succeeded where we aborted.
pOtherCode = GetNativeCode();
if (pOtherCode == NULL)
{
pEntry->m_hrResultCode = E_FAIL;
EX_RETHROW;
}
}
EX_END_CATCH(RethrowTerminalExceptions)

if (pOtherCode != NULL)
{
// Somebody finished jitting recursively while we were jitting the method.
// Just use their method & leak the one we finished. (Normally we hope
// not to finish our JIT in this case, as we will abort early if we notice
// a reentrant jit has occurred. But we may not catch every place so we
// do a definitive final check here.
pCode = pOtherCode;
goto Done;
}

_ASSERTE(pCode != NULL);

#ifdef HAVE_GCCOVER
if (GCStress<cfg_instr_jit>::IsEnabled())
{
SetupGcCoverage(this, (BYTE*) pCode);
}
#endif // HAVE_GCCOVER

#ifdef FEATURE_INTERPRETER
// Determine whether the new code address is "stable", i.e. is not an interpreter stub.
fInterpreted = (Interpreter::InterpretationStubToMethodInfo(pCode) == this);
fStable = !fInterpreted;
#endif // FEATURE_INTERPRETER

#ifdef FEATURE_MULTICOREJIT

// If called from multi-core JIT background thread, store code under lock, delay patching until code is queried from application threads
if (fBackgroundThread)
{
// Fire an ETW event to mark the end of JIT'ing
ETW::MethodLog::MethodJitted(this, &namespaceOrClassName, &methodName, &methodSignature, pCode, 0 /* ReJITID */);

#ifdef FEATURE_PERFMAP
// Save the JIT'd method information so that perf can resolve JIT'd call frames.
PerfMap::LogJITCompiledMethod(this, pCode, sizeOfCode);
#endif

mcJitManager.GetMulticoreJitCodeStorage().StoreMethodCode(this, pCode);

goto Done;
}

GotNewCode:
#endif
// If this function had already been requested for rejit (before its original
// code was jitted), then give the rejit manager a chance to jump-stamp the
// code we just compiled so the first thread entering the function will jump
// to the prestub and trigger the rejit. Note that the PublishMethodHolder takes
// a lock to avoid a particular kind of rejit race. See
// code:ReJitManager::PublishMethodHolder::PublishMethodHolder#PublishCode for
// details on the rejit race.
//
// Aside from rejit, performing a SetNativeCodeInterlocked at this point
// generally ensures that there is only one winning version of the native
// code. This also avoid races with profiler overriding ngened code (see
// matching SetNativeCodeInterlocked done after
// JITCachedFunctionSearchStarted)
#ifdef FEATURE_INTERPRETER
PCODE pExpected = pPreviousInterpStub;
if (pExpected == NULL) pExpected = GetTemporaryEntryPoint();
#endif
{
ReJitPublishMethodHolder publishWorker(this, pCode);
if (!SetNativeCodeInterlocked(pCode
#ifdef FEATURE_INTERPRETER
, pExpected, fStable
#endif
))
{
// Another thread beat us to publishing its copy of the JITted code.
pCode = GetNativeCode();
goto Done;
}
}

#ifdef FEATURE_INTERPRETER
// State for dynamic methods cannot be freed if the method was ever interpreted,
// since there is no way to ensure that it is not in use at the moment.
if (IsDynamicMethod() && !fInterpreted && (pPreviousInterpStub == NULL))
{
AsDynamicMethodDesc()->GetResolver()->FreeCompileTimeState();
}
#endif // FEATURE_INTERPRETER

// We succeeded in jitting the code, and our jitted code is the one that's going to run now.
pEntry->m_hrResultCode = S_OK;

#ifdef PROFILING_SUPPORTED
// Notify the profiler that JIT completed.
// Must do this after the address has been set.
// @ToDo: Why must we set the address before notifying the profiler ??
// Note that if IsInterceptedForDeclSecurity is set no one should access the jitted code address anyway.
{
BEGIN_PIN_PROFILER(CORProfilerTrackJITInfo());
if (!IsNoMetadata())
{
g_profControlBlock.pProfInterface->
JITCompilationFinished((FunctionID) this,
pEntry->m_hrResultCode,
TRUE);
}
END_PIN_PROFILER();
}
#endif // PROFILING_SUPPORTED

#ifdef FEATURE_MULTICOREJIT
if (! fCompiledInBackground)
#endif
#ifdef FEATURE_INTERPRETER
// If we didn't JIT, but rather, created an interpreter stub (i.e., fStable is false), don't tell ETW that we did.
if (fStable)
#endif // FEATURE_INTERPRETER
{
// Fire an ETW event to mark the end of JIT'ing
ETW::MethodLog::MethodJitted(this, &namespaceOrClassName, &methodName, &methodSignature, pCode, 0 /* ReJITID */);

#ifdef FEATURE_PERFMAP
// Save the JIT'd method information so that perf can resolve JIT'd call frames.
PerfMap::LogJITCompiledMethod(this, pCode, sizeOfCode);
#endif
}

#ifdef FEATURE_MULTICOREJIT

// If not called from multi-core JIT thread, not got code from storage, quick check before calling out of line function
if (! fBackgroundThread && ! fCompiledInBackground && mcJitManager.IsRecorderActive())
{
if (MulticoreJitManager::IsMethodSupported(this))
{
mcJitManager.RecordMethodJit(this); // Tell multi-core JIT manager to record method on successful JITting
}
}
#endif

if (!fIsILStub)
{
// The notification will only occur if someone has registered for this method.
DACNotifyCompilationFinished(this);
}
}
}
}

Done:

// We must have a code by now.
_ASSERTE(pCode != NULL);

LOG((LF_CORDB, LL_EVERYTHING, "MethodDesc::MakeJitWorker finished. Stub is" FMT_ADDR "\n",
DBG_ADDR(pCode)));

return pCode;
}

This is the thread-safe JIT entry point:

if multiple threads compile the same function, one of them performs the compilation and the others wait for it to finish.

Each AppDomain holds a set of locks, and a function currently being compiled owns a ListLockEntry object.

The code first locks the set, gets or creates the ListLockEntry for the function, and then releases the lock on the set;

at that point every thread working on the same function obtains the same ListLockEntry, and each thread then locks that entry.
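The locking scheme described here can be sketched as follows. This is a minimal illustration, not CoreCLR code: CodeCache, Entry, and FakeJit are hypothetical names, with Entry playing the role of ListLockEntry.

```cpp
#include <map>
#include <memory>
#include <mutex>

// Illustrative sketch of the per-function locking scheme described above.
struct Entry {
    std::mutex m;          // per-function lock (the role ListLockEntry plays)
    void* code = nullptr;  // compiled native code, if any
};

class CodeCache {
    std::mutex setLock;                            // lock over the entry set
    std::map<int, std::shared_ptr<Entry>> entries; // one entry per method id
public:
    void* GetOrCompile(int methodId, void* (*jit)(int)) {
        std::shared_ptr<Entry> e;
        {
            // Lock the set only briefly, to find or create the entry.
            std::lock_guard<std::mutex> g(setLock);
            auto& slot = entries[methodId];
            if (!slot) slot = std::make_shared<Entry>();
            e = slot;
        }
        // Threads compiling the same method serialize on the same
        // per-method lock; different methods compile in parallel.
        std::lock_guard<std::mutex> g(e->m);
        if (e->code == nullptr)
            e->code = jit(methodId); // the non-thread-safe JIT call
        return e->code;
    }
};

// Demo JIT that counts how many times it actually runs.
static int g_jitCalls = 0;
static void* FakeJit(int) { ++g_jitCalls; return (void*)0x1234; }
```

The point of the two-level locking is that the set lock is held only long enough to look up the entry, so compiling one method never blocks compilation of another.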

With the entry locked, the non-thread-safe JIT function is called:

pCode = UnsafeJitFunction(this, ILHeader, flags, flags2, &sizeOfCode)

Several more layers of calls happen before the main JIT function is reached; I will only describe them briefly:

UnsafeJitFunction

This function creates an instance of CEEJitInfo (the class the JIT layer uses to report back to the EE layer) and reads the compile flags (e.g. whether to compile in debug mode) from the method information.

It then calls CallCompileMethodWithSEHWrapper; if a relative address overflows, it disables relative addressing (fAllowRel32) and retries the compilation.
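That retry logic can be sketched as follows; Rel32Overflow, CompileWithRetry, and FakeCompile are hypothetical names used purely for illustration:

```cpp
#include <stdexcept>

// Hypothetical sketch of the retry described above: compile with rel32
// (32-bit relative) call/jump encodings allowed first; if a target turns
// out to be unreachable, retry once with relative addressing disabled.
struct Rel32Overflow : std::runtime_error {
    Rel32Overflow() : std::runtime_error("rel32 target out of range") {}
};

int CompileWithRetry(int (*compile)(bool allowRel32)) {
    try {
        return compile(/*fAllowRel32=*/true);
    } catch (const Rel32Overflow&) {
        // Second attempt: force absolute addressing for far targets.
        return compile(/*fAllowRel32=*/false);
    }
}

// Demo compiler that pretends the first attempt hits a far target.
static int g_attempts = 0;
static int FakeCompile(bool allowRel32) {
    ++g_attempts;
    if (allowRel32) throw Rel32Overflow();
    return 42;
}
```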

CallCompileMethodWithSEHWrapper

This function calls invokeCompileMethod inside a try block.

invokeCompileMethod

This function switches the current thread to preemptive mode (so the GC does not need to suspend it), then calls invokeCompileMethodHelper.

invokeCompileMethodHelper

This function normally calls jitMgr->m_jit->compileMethod.

CILJit::compileMethod

This function normally calls jitNativeCode.

jitNativeCode

Creates and initializes a Compiler instance and calls pParam->pComp->compCompile (the 7-argument version).

Inlining also starts from this function; when inlining, the Compiler instance is reused after its first creation.

Compiler is responsible for the entire JIT process of a single function.

Compiler::compCompile (7-argument version)

This function performs some initialization on the Compiler instance, then calls Compiler::compCompileHelper.

compCompileHelper

This function first creates the local variable table lvaTable and the linked list of BasicBlocks,

adding an internal block (BB01) when necessary, then parses the IL and adds more blocks; the details are described below.

It then calls compCompile (the 3-argument version).

compCompile (3-argument version)

This is the main JIT function; it drives each phase of the JIT, as described below.

Creating the local variable table

compCompileHelper calls lvaInitTypeRef,

and lvaInitTypeRef creates the local variable table. Its source code is as follows:

void Compiler::lvaInitTypeRef()
{
/* x86 args look something like this:
[this ptr] [hidden return buffer] [declared arguments]* [generic context] [var arg cookie]

x64 is closer to the native ABI:
[this ptr] [hidden return buffer] [generic context] [var arg cookie] [declared arguments]*
(Note: prior to .NET Framework 4.5.1 for Windows 8.1 (but not .NET Framework 4.5.1 "downlevel"),
the "hidden return buffer" came before the "this ptr". Now, the "this ptr" comes first. This
is different from the C++ order, where the "hidden return buffer" always comes first.)

ARM and ARM64 are the same as the current x64 convention:
[this ptr] [hidden return buffer] [generic context] [var arg cookie] [declared arguments]*

Key difference:
The var arg cookie and generic context are swapped with respect to the user arguments
*/

/* Set compArgsCount and compLocalsCount */

info.compArgsCount = info.compMethodInfo->args.numArgs;

// Is there a 'this' pointer

if (!info.compIsStatic)
{
info.compArgsCount++;
}
else
{
info.compThisArg = BAD_VAR_NUM;
}

info.compILargsCount = info.compArgsCount;

#ifdef FEATURE_SIMD
if (featureSIMD && (info.compRetNativeType == TYP_STRUCT))
{
var_types structType = impNormStructType(info.compMethodInfo->args.retTypeClass);
info.compRetType = structType;
}
#endif // FEATURE_SIMD

// Are we returning a struct using a return buffer argument?
//
const bool hasRetBuffArg = impMethodInfo_hasRetBuffArg(info.compMethodInfo);

// Possibly change the compRetNativeType from TYP_STRUCT to a "primitive" type
// when we are returning a struct by value and it fits in one register
//
if (!hasRetBuffArg && varTypeIsStruct(info.compRetNativeType))
{
CORINFO_CLASS_HANDLE retClsHnd = info.compMethodInfo->args.retTypeClass;

Compiler::structPassingKind howToReturnStruct;
var_types returnType = getReturnTypeForStruct(retClsHnd, &howToReturnStruct);

if (howToReturnStruct == SPK_PrimitiveType)
{
assert(returnType != TYP_UNKNOWN);
assert(returnType != TYP_STRUCT);

info.compRetNativeType = returnType;

// ToDo: Refactor this common code sequence into its own method as it is used 4+ times
if ((returnType == TYP_LONG) && (compLongUsed == false))
{
compLongUsed = true;
}
else if (((returnType == TYP_FLOAT) || (returnType == TYP_DOUBLE)) && (compFloatingPointUsed == false))
{
compFloatingPointUsed = true;
}
}
}

// Do we have a RetBuffArg?

if (hasRetBuffArg)
{
info.compArgsCount++;
}
else
{
info.compRetBuffArg = BAD_VAR_NUM;
}

/* There is a 'hidden' cookie pushed last when the
calling convention is varargs */

if (info.compIsVarArgs)
{
info.compArgsCount++;
}

// Is there an extra parameter used to pass instantiation info to
// shared generic methods and shared generic struct instance methods?
if (info.compMethodInfo->args.callConv & CORINFO_CALLCONV_PARAMTYPE)
{
info.compArgsCount++;
}
else
{
info.compTypeCtxtArg = BAD_VAR_NUM;
}

lvaCount = info.compLocalsCount = info.compArgsCount + info.compMethodInfo->locals.numArgs;

info.compILlocalsCount = info.compILargsCount + info.compMethodInfo->locals.numArgs;

/* Now allocate the variable descriptor table */

if (compIsForInlining())
{
lvaTable = impInlineInfo->InlinerCompiler->lvaTable;
lvaCount = impInlineInfo->InlinerCompiler->lvaCount;
lvaTableCnt = impInlineInfo->InlinerCompiler->lvaTableCnt;

// No more stuff needs to be done.
return;
}

lvaTableCnt = lvaCount * 2;

if (lvaTableCnt < 16)
{
lvaTableCnt = 16;
}

lvaTable = (LclVarDsc*)compGetMemArray(lvaTableCnt, sizeof(*lvaTable), CMK_LvaTable);
size_t tableSize = lvaTableCnt * sizeof(*lvaTable);
memset(lvaTable, 0, tableSize);
for (unsigned i = 0; i < lvaTableCnt; i++)
{
new (&lvaTable[i], jitstd::placement_t()) LclVarDsc(this); // call the constructor.
}

//-------------------------------------------------------------------------
// Count the arguments and initialize the respective lvaTable[] entries
//
// First the implicit arguments
//-------------------------------------------------------------------------

InitVarDscInfo varDscInfo;
varDscInfo.Init(lvaTable, hasRetBuffArg);

lvaInitArgs(&varDscInfo);

//-------------------------------------------------------------------------
// Finally the local variables
//-------------------------------------------------------------------------

unsigned varNum = varDscInfo.varNum;
LclVarDsc* varDsc = varDscInfo.varDsc;
CORINFO_ARG_LIST_HANDLE localsSig = info.compMethodInfo->locals.args;

for (unsigned i = 0; i < info.compMethodInfo->locals.numArgs;
i++, varNum++, varDsc++, localsSig = info.compCompHnd->getArgNext(localsSig))
{
CORINFO_CLASS_HANDLE typeHnd;
CorInfoTypeWithMod corInfoType =
info.compCompHnd->getArgType(&info.compMethodInfo->locals, localsSig, &typeHnd);
lvaInitVarDsc(varDsc, varNum, strip(corInfoType), typeHnd, localsSig, &info.compMethodInfo->locals);

varDsc->lvPinned = ((corInfoType & CORINFO_TYPE_MOD_PINNED) != 0);
varDsc->lvOnFrame = true; // The final home for this local variable might be our local stack frame
}

if ( // If there already exist unsafe buffers, don't mark more structs as unsafe
// as that will cause them to be placed along with the real unsafe buffers,
// unnecessarily exposing them to overruns. This can affect GS tests which
// intentionally do buffer-overruns.
!getNeedsGSSecurityCookie() &&
// GS checks require the stack to be re-ordered, which can't be done with EnC
!opts.compDbgEnC && compStressCompile(STRESS_UNSAFE_BUFFER_CHECKS, 25))
{
setNeedsGSSecurityCookie();
compGSReorderStackLayout = true;

for (unsigned i = 0; i < lvaCount; i++)
{
if ((lvaTable[i].lvType == TYP_STRUCT) && compStressCompile(STRESS_GENERIC_VARN, 60))
{
lvaTable[i].lvIsUnsafeBuffer = true;
}
}
}

if (getNeedsGSSecurityCookie())
{
// Ensure that there will be at least one stack variable since
// we require that the GSCookie does not have a 0 stack offset.
unsigned dummy = lvaGrabTempWithImplicitUse(false DEBUGARG("GSCookie dummy"));
lvaTable[dummy].lvType = TYP_INT;
}

#ifdef DEBUG
if (verbose)
{
lvaTableDump(INITIAL_FRAME_LAYOUT);
}
#endif
}

The initial number of local variables is info.compArgsCount + info.compMethodInfo->locals.numArgs, i.e. the number of IL parameters plus the number of IL locals.

Because more temporary variables may be added later, the table uses a length + capacity storage scheme:

the pointer to the table is lvaTable, its current length is lvaCount, and its capacity is lvaTableCnt.

The beginning of the table holds the IL parameters, followed by the IL locals.

For example, with 3 parameters and 2 locals the table is [arg0, arg1, arg2, local0, local1, empty, empty, empty, ...].

Also, if the current function is being compiled for inlining, the local variable table of the inlining caller (the callsite) is used instead.
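The layout can be sketched as follows. This is an illustration only: LclVar and LocalTable are hypothetical names standing in for LclVarDsc and the lvaTable / lvaCount / lvaTableCnt fields.

```cpp
#include <vector>

// Illustrative sketch (not CoreCLR code) of the lvaTable layout described
// above: parameters first, then IL locals, with spare capacity for temps.
struct LclVar { int ilIndex; bool isArg; };

struct LocalTable {
    std::vector<LclVar> table; // plays the role of lvaTable
    unsigned count = 0;        // lvaCount: slots currently in use

    void Init(unsigned argCount, unsigned localCount) {
        count = argCount + localCount;
        unsigned capacity = count * 2; // lvaTableCnt = lvaCount * 2
        if (capacity < 16)
            capacity = 16;             // but at least 16 entries
        table.resize(capacity);
        for (unsigned i = 0; i < argCount; i++)
            table[i] = { (int)i, true };             // args come first
        for (unsigned i = 0; i < localCount; i++)
            table[argCount + i] = { (int)i, false }; // then IL locals
    }
};
```

With 3 parameters and 2 locals this yields the [arg0, arg1, arg2, local0, local1, empty...] shape described above, with 11 spare slots for compiler temps.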

Creating BasicBlocks from IL

Before entering the main JIT function, compCompileHelper parses the IL and creates BasicBlocks from the instructions.

As mentioned in the previous article,

a BasicBlock is a logical block with no internal jumps: in principle, jump instructions appear only at the end of a block, and jump targets can only be the start of a block.

The BasicBlock creation logic lives in fgFindBasicBlocks; let's look at its source code:

/*****************************************************************************
*
* Main entry point to discover the basic blocks for the current function.
*/

void Compiler::fgFindBasicBlocks()
{
#ifdef DEBUG
if (verbose)
{
printf("*************** In fgFindBasicBlocks() for %s\n", info.compFullName);
}
#endif

/* Allocate the 'jump target' vector
*
* We need one extra byte as we mark
* jumpTarget[info.compILCodeSize] with JT_ADDR
* when we need to add a dummy block
* to record the end of a try or handler region.
*/
BYTE* jumpTarget = new (this, CMK_Unknown) BYTE[info.compILCodeSize + 1];
memset(jumpTarget, JT_NONE, info.compILCodeSize + 1);
noway_assert(JT_NONE == 0);

/* Walk the instrs to find all jump targets */

fgFindJumpTargets(info.compCode, info.compILCodeSize, jumpTarget);
if (compDonotInline())
{
return;
}

unsigned XTnum;

/* Are there any exception handlers? */

if (info.compXcptnsCount > 0)
{
noway_assert(!compIsForInlining());

/* Check and mark all the exception handlers */

for (XTnum = 0; XTnum < info.compXcptnsCount; XTnum++)
{
DWORD tmpOffset;
CORINFO_EH_CLAUSE clause;
info.compCompHnd->getEHinfo(info.compMethodHnd, XTnum, &clause);
noway_assert(clause.HandlerLength != (unsigned)-1);

if (clause.TryLength <= 0)
{
BADCODE("try block length <=0");
}

/* Mark the 'try' block extent and the handler itself */

if (clause.TryOffset > info.compILCodeSize)
{
BADCODE("try offset is > codesize");
}
if (jumpTarget[clause.TryOffset] == JT_NONE)
{
jumpTarget[clause.TryOffset] = JT_ADDR;
}

tmpOffset = clause.TryOffset + clause.TryLength;
if (tmpOffset > info.compILCodeSize)
{
BADCODE("try end is > codesize");
}
if (jumpTarget[tmpOffset] == JT_NONE)
{
jumpTarget[tmpOffset] = JT_ADDR;
}

if (clause.HandlerOffset > info.compILCodeSize)
{
BADCODE("handler offset > codesize");
}
if (jumpTarget[clause.HandlerOffset] == JT_NONE)
{
jumpTarget[clause.HandlerOffset] = JT_ADDR;
}

tmpOffset = clause.HandlerOffset + clause.HandlerLength;
if (tmpOffset > info.compILCodeSize)
{
BADCODE("handler end > codesize");
}
if (jumpTarget[tmpOffset] == JT_NONE)
{
jumpTarget[tmpOffset] = JT_ADDR;
}

if (clause.Flags & CORINFO_EH_CLAUSE_FILTER)
{
if (clause.FilterOffset > info.compILCodeSize)
{
BADCODE("filter offset > codesize");
}
if (jumpTarget[clause.FilterOffset] == JT_NONE)
{
jumpTarget[clause.FilterOffset] = JT_ADDR;
}
}
}
}

#ifdef DEBUG
if (verbose)
{
bool anyJumpTargets = false;
printf("Jump targets:\n");
for (unsigned i = 0; i < info.compILCodeSize + 1; i++)
{
if (jumpTarget[i] == JT_NONE)
{
continue;
}

anyJumpTargets = true;
printf(" IL_%04x", i);

if (jumpTarget[i] & JT_ADDR)
{
printf(" addr");
}
if (jumpTarget[i] & JT_MULTI)
{
printf(" multi");
}
printf("\n");
}
if (!anyJumpTargets)
{
printf(" none\n");
}
}
#endif // DEBUG

/* Now create the basic blocks */

fgMakeBasicBlocks(info.compCode, info.compILCodeSize, jumpTarget);

if (compIsForInlining())
{
if (compInlineResult->IsFailure())
{
return;
}

bool hasReturnBlocks = false;
bool hasMoreThanOneReturnBlock = false;

for (BasicBlock* block = fgFirstBB; block != nullptr; block = block->bbNext)
{
if (block->bbJumpKind == BBJ_RETURN)
{
if (hasReturnBlocks)
{
hasMoreThanOneReturnBlock = true;
break;
}

hasReturnBlocks = true;
}
}

if (!hasReturnBlocks && !compInlineResult->UsesLegacyPolicy())
{
//
// Mark the call node as "no return". The inliner might ignore CALLEE_DOES_NOT_RETURN and
// fail inline for a different reasons. In that case we still want to make the "no return"
// information available to the caller as it can impact caller's code quality.
//
impInlineInfo->iciCall->gtCallMoreFlags |= GTF_CALL_M_DOES_NOT_RETURN;
}

compInlineResult->NoteBool(InlineObservation::CALLEE_DOES_NOT_RETURN, !hasReturnBlocks);

if (compInlineResult->IsFailure())
{
return;
} noway_assert(info.compXcptnsCount == 0);
compHndBBtab = impInlineInfo->InlinerCompiler->compHndBBtab;
compHndBBtabAllocCount =
impInlineInfo->InlinerCompiler->compHndBBtabAllocCount; // we probably only use the table, not add to it.
compHndBBtabCount = impInlineInfo->InlinerCompiler->compHndBBtabCount;
info.compXcptnsCount = impInlineInfo->InlinerCompiler->info.compXcptnsCount;

if (info.compRetNativeType != TYP_VOID && hasMoreThanOneReturnBlock)
{
// The lifetime of this var might expand multiple BBs. So it is a long lifetime compiler temp.
lvaInlineeReturnSpillTemp = lvaGrabTemp(false DEBUGARG("Inline candidate multiple BBJ_RETURN spill temp"));
lvaTable[lvaInlineeReturnSpillTemp].lvType = info.compRetNativeType;
}
return;
}

/* Mark all blocks within 'try' blocks as such */

if (info.compXcptnsCount == 0)
{
return;
}

if (info.compXcptnsCount > MAX_XCPTN_INDEX)
{
IMPL_LIMITATION("too many exception clauses");
}

/* Allocate the exception handler table */

fgAllocEHTable();

/* Assume we don't need to sort the EH table (such that nested try/catch
* appear before their try or handler parent). The EH verifier will notice
* when we do need to sort it.
*/

fgNeedToSortEHTable = false;

verInitEHTree(info.compXcptnsCount);
EHNodeDsc* initRoot = ehnNext; // remember the original root since
// it may get modified during insertion

// Annotate BBs with exception handling information required for generating correct eh code
// as well as checking for correct IL

EHblkDsc* HBtab;

for (XTnum = 0, HBtab = compHndBBtab; XTnum < compHndBBtabCount; XTnum++, HBtab++)
{
CORINFO_EH_CLAUSE clause;
info.compCompHnd->getEHinfo(info.compMethodHnd, XTnum, &clause);
noway_assert(clause.HandlerLength != (unsigned)-1); // @DEPRECATED

#ifdef DEBUG
if (verbose)
{
dispIncomingEHClause(XTnum, clause);
}
#endif // DEBUG

IL_OFFSET tryBegOff = clause.TryOffset;
IL_OFFSET tryEndOff = tryBegOff + clause.TryLength;
IL_OFFSET filterBegOff = 0;
IL_OFFSET hndBegOff = clause.HandlerOffset;
IL_OFFSET hndEndOff = hndBegOff + clause.HandlerLength;

if (clause.Flags & CORINFO_EH_CLAUSE_FILTER)
{
filterBegOff = clause.FilterOffset;
}

if (tryEndOff > info.compILCodeSize)
{
BADCODE3("end of try block beyond end of method for try", " at offset %04X", tryBegOff);
}
if (hndEndOff > info.compILCodeSize)
{
BADCODE3("end of hnd block beyond end of method for try", " at offset %04X", tryBegOff);
}

HBtab->ebdTryBegOffset = tryBegOff;
HBtab->ebdTryEndOffset = tryEndOff;
HBtab->ebdFilterBegOffset = filterBegOff;
HBtab->ebdHndBegOffset = hndBegOff;
HBtab->ebdHndEndOffset = hndEndOff;

/* Convert the various addresses to basic blocks */

BasicBlock* tryBegBB = fgLookupBB(tryBegOff);
BasicBlock* tryEndBB =
BasicBlock* tryEndBB =
fgLookupBB(tryEndOff); // note: this can be NULL if the try region is at the end of the function
BasicBlock* hndBegBB = fgLookupBB(hndBegOff);
BasicBlock* hndEndBB = nullptr;
BasicBlock* filtBB = nullptr;
BasicBlock* block;

//
// Assert that the try/hnd beginning blocks are set up correctly
//
if (tryBegBB == nullptr)
{
BADCODE("Try Clause is invalid");
}

if (hndBegBB == nullptr)
{
BADCODE("Handler Clause is invalid");
}

tryBegBB->bbFlags |= BBF_HAS_LABEL;
hndBegBB->bbFlags |= BBF_HAS_LABEL | BBF_JMP_TARGET;

#if HANDLER_ENTRY_MUST_BE_IN_HOT_SECTION
// This will change the block weight from 0 to 1
// and clear the rarely run flag
hndBegBB->makeBlockHot();
#else
hndBegBB->bbSetRunRarely(); // handler entry points are rarely executed
#endif

if (hndEndOff < info.compILCodeSize)
{
hndEndBB = fgLookupBB(hndEndOff);
}

if (clause.Flags & CORINFO_EH_CLAUSE_FILTER)
{
filtBB = HBtab->ebdFilter = fgLookupBB(clause.FilterOffset);

filtBB->bbCatchTyp = BBCT_FILTER;
filtBB->bbFlags |= BBF_HAS_LABEL | BBF_JMP_TARGET;

hndBegBB->bbCatchTyp = BBCT_FILTER_HANDLER;

#if HANDLER_ENTRY_MUST_BE_IN_HOT_SECTION
// This will change the block weight from 0 to 1
// and clear the rarely run flag
filtBB->makeBlockHot();
#else
filtBB->bbSetRunRarely(); // filter entry points are rarely executed
#endif

// Mark all BBs that belong to the filter with the XTnum of the corresponding handler
for (block = filtBB; /**/; block = block->bbNext)
{
if (block == nullptr)
{
BADCODE3("Missing endfilter for filter", " at offset %04X", filtBB->bbCodeOffs);
return;
}

// Still inside the filter
block->setHndIndex(XTnum);

if (block->bbJumpKind == BBJ_EHFILTERRET)
{
// Mark catch handler as successor.
block->bbJumpDest = hndBegBB;
assert(block->bbJumpDest->bbCatchTyp == BBCT_FILTER_HANDLER);
break;
}
}

if (!block->bbNext || block->bbNext != hndBegBB)
{
BADCODE3("Filter does not immediately precede handler for filter", " at offset %04X",
filtBB->bbCodeOffs);
}
}
else
{
HBtab->ebdTyp = clause.ClassToken;

/* Set bbCatchTyp as appropriate */

if (clause.Flags & CORINFO_EH_CLAUSE_FINALLY)
{
hndBegBB->bbCatchTyp = BBCT_FINALLY;
}
else
{
if (clause.Flags & CORINFO_EH_CLAUSE_FAULT)
{
hndBegBB->bbCatchTyp = BBCT_FAULT;
}
else
{
hndBegBB->bbCatchTyp = clause.ClassToken;

// These values should be non-zero value that will
// not collide with real tokens for bbCatchTyp
if (clause.ClassToken == 0)
{
BADCODE("Exception catch type is Null");
}

noway_assert(clause.ClassToken != BBCT_FAULT);
noway_assert(clause.ClassToken != BBCT_FINALLY);
noway_assert(clause.ClassToken != BBCT_FILTER);
noway_assert(clause.ClassToken != BBCT_FILTER_HANDLER);
}
}
}

/* Mark the initial block and last blocks in the 'try' region */

tryBegBB->bbFlags |= BBF_TRY_BEG | BBF_HAS_LABEL;

/* Prevent future optimizations of removing the first block */
/* of a TRY block and the first block of an exception handler */

tryBegBB->bbFlags |= BBF_DONT_REMOVE;
hndBegBB->bbFlags |= BBF_DONT_REMOVE;
hndBegBB->bbRefs++; // The first block of a handler gets an extra, "artificial" reference count.

if (clause.Flags & CORINFO_EH_CLAUSE_FILTER)
{
filtBB->bbFlags |= BBF_DONT_REMOVE;
filtBB->bbRefs++; // The first block of a filter gets an extra, "artificial" reference count.
}

tryBegBB->bbFlags |= BBF_DONT_REMOVE;
hndBegBB->bbFlags |= BBF_DONT_REMOVE;

//
// Store the info to the table of EH block handlers
//

HBtab->ebdHandlerType = ToEHHandlerType(clause.Flags);

HBtab->ebdTryBeg = tryBegBB;
HBtab->ebdTryLast = (tryEndBB == nullptr) ? fgLastBB : tryEndBB->bbPrev;

HBtab->ebdHndBeg = hndBegBB;
HBtab->ebdHndLast = (hndEndBB == nullptr) ? fgLastBB : hndEndBB->bbPrev;

//
// Assert that all of our try/hnd blocks are setup correctly.
//
if (HBtab->ebdTryLast == nullptr)
{
BADCODE("Try Clause is invalid");
}

if (HBtab->ebdHndLast == nullptr)
{
BADCODE("Handler Clause is invalid");
}

//
// Verify that it's legal
//

verInsertEhNode(&clause, HBtab);

} // end foreach handler table entry

fgSortEHTable();

// Next, set things related to nesting that depend on the sorting being complete.

for (XTnum = 0, HBtab = compHndBBtab; XTnum < compHndBBtabCount; XTnum++, HBtab++)
{
/* Mark all blocks in the finally/fault or catch clause */

BasicBlock* tryBegBB = HBtab->ebdTryBeg;
BasicBlock* hndBegBB = HBtab->ebdHndBeg;

IL_OFFSET tryBegOff = HBtab->ebdTryBegOffset;
IL_OFFSET tryEndOff = HBtab->ebdTryEndOffset;

IL_OFFSET hndBegOff = HBtab->ebdHndBegOffset;
IL_OFFSET hndEndOff = HBtab->ebdHndEndOffset;

BasicBlock* block;

for (block = hndBegBB; block && (block->bbCodeOffs < hndEndOff); block = block->bbNext)
{
if (!block->hasHndIndex())
{
block->setHndIndex(XTnum);
}

// All blocks in a catch handler or filter are rarely run, except the entry
if ((block != hndBegBB) && (hndBegBB->bbCatchTyp != BBCT_FINALLY))
{
block->bbSetRunRarely();
}
}

/* Mark all blocks within the covered range of the try */

for (block = tryBegBB; block && (block->bbCodeOffs < tryEndOff); block = block->bbNext)
{
/* Mark this BB as belonging to a 'try' block */

if (!block->hasTryIndex())
{
block->setTryIndex(XTnum);
}

#ifdef DEBUG
/* Note: the BB can't span the 'try' block */

if (!(block->bbFlags & BBF_INTERNAL))
{
noway_assert(tryBegOff <= block->bbCodeOffs);
noway_assert(tryEndOff >= block->bbCodeOffsEnd || tryEndOff == tryBegOff);
}
#endif
}

/* Init ebdHandlerNestingLevel of current clause, and bump up value for all
* enclosed clauses (which have to be before it in the table).
* Innermost try-finally blocks must precede outermost
* try-finally blocks.
*/

#if !FEATURE_EH_FUNCLETS
HBtab->ebdHandlerNestingLevel = 0;
#endif // !FEATURE_EH_FUNCLETS

HBtab->ebdEnclosingTryIndex = EHblkDsc::NO_ENCLOSING_INDEX;
HBtab->ebdEnclosingHndIndex = EHblkDsc::NO_ENCLOSING_INDEX;

noway_assert(XTnum < compHndBBtabCount);
noway_assert(XTnum == ehGetIndex(HBtab));

for (EHblkDsc* xtab = compHndBBtab; xtab < HBtab; xtab++)
{
#if !FEATURE_EH_FUNCLETS
if (jitIsBetween(xtab->ebdHndBegOffs(), hndBegOff, hndEndOff))
{
xtab->ebdHandlerNestingLevel++;
}
#endif // !FEATURE_EH_FUNCLETS

/* If we haven't recorded an enclosing try index for xtab then see
* if this EH region should be recorded. We check if the
* first offset in the xtab lies within our region. If so,
* the last offset also must lie within the region, due to
* nesting rules. verInsertEhNode(), below, will check for proper nesting.
*/
if (xtab->ebdEnclosingTryIndex == EHblkDsc::NO_ENCLOSING_INDEX)
{
bool begBetween = jitIsBetween(xtab->ebdTryBegOffs(), tryBegOff, tryEndOff);
if (begBetween)
{
// Record the enclosing scope link
xtab->ebdEnclosingTryIndex = (unsigned short)XTnum;
}
}

/* Do the same for the enclosing handler index.
*/
if (xtab->ebdEnclosingHndIndex == EHblkDsc::NO_ENCLOSING_INDEX)
{
bool begBetween = jitIsBetween(xtab->ebdTryBegOffs(), hndBegOff, hndEndOff);
if (begBetween)
{
// Record the enclosing scope link
xtab->ebdEnclosingHndIndex = (unsigned short)XTnum;
}
}
}
} // end foreach handler table entry

#if !FEATURE_EH_FUNCLETS

EHblkDsc* HBtabEnd;
for (HBtab = compHndBBtab, HBtabEnd = compHndBBtab + compHndBBtabCount; HBtab < HBtabEnd; HBtab++)
{
if (ehMaxHndNestingCount <= HBtab->ebdHandlerNestingLevel)
ehMaxHndNestingCount = HBtab->ebdHandlerNestingLevel + 1;
}

#endif // !FEATURE_EH_FUNCLETS

#ifndef DEBUG
if (tiVerificationNeeded)
#endif
{
// always run these checks for a debug build
verCheckNestingLevel(initRoot);
}

#ifndef DEBUG
// fgNormalizeEH assumes that this test has been passed. And Ssa assumes that fgNormalizeEHTable
// has been run. So do this unless we're in minOpts mode (and always in debug).
if (tiVerificationNeeded || !opts.MinOpts())
#endif
{
fgCheckBasicBlockControlFlow();
}

#ifdef DEBUG
if (verbose)
{
JITDUMP("*************** After fgFindBasicBlocks() has created the EH table\n");
fgDispHandlerTab();
}

// We can't verify the handler table until all the IL legality checks have been done (above), since bad IL
// (such as illegal nesting of regions) will trigger asserts here.
fgVerifyHandlerTab();
#endif

fgNormalizeEH();
}

fgFindBasicBlocks first allocates a byte array one byte longer than the IL (one byte per IL offset, plus an extra slot for marking the end of a try or handler region),

then calls fgFindJumpTargets to find the jump targets. Take this IL as an example:

IL_0000  00                nop
IL_0001 16 ldc.i4.0
IL_0002 0a stloc.0
IL_0003 2b 0d br.s 13 (IL_0012)
IL_0005 00 nop
IL_0006 06 ldloc.0
IL_0007 28 0c 00 00 0a call 0xA00000C
IL_000c 00 nop
IL_000d 00 nop
IL_000e 06 ldloc.0
IL_000f 17 ldc.i4.1
IL_0010 58 add
IL_0011 0a stloc.0
IL_0012 06 ldloc.0
IL_0013 19 ldc.i4.3
IL_0014 fe 04 clt
IL_0016 0b stloc.1
IL_0017 07 ldloc.1
IL_0018 2d eb brtrue.s -21 (IL_0005)
IL_001a 2a ret

Two jump targets are found in this IL:

Jump targets:
IL_0005
IL_0012

fgFindBasicBlocks then finds more jump targets from the function's exception handling information; for example, the start of a try and the start of a catch are both treated as jump targets.

Note that after parsing the IL, fgFindJumpTargets also decides whether the method is worth inlining; inlining is covered below.

Next, fgMakeBasicBlocks is called to create the BasicBlocks; it starts a new block whenever it encounters a jump instruction or a jump target.

After fgMakeBasicBlocks returns, the compiler holds a linked list of BasicBlocks (starting at fgFirstBB), each covering a range of the IL.

After the BasicBlocks are created, an exception information table compHndBBtab (also called the EH table), of length compHndBBtabCount, is built from the exception handling information.

Each record in the table holds the block where the try starts, the block where the handler (catch, finally, fault) starts, and the index of the enclosing try (if the try is nested).

As shown in the figure below:

(figure: EH table entries pointing at their try-begin and handler-begin BasicBlocks)

The main JIT function

Once the BasicBlocks have been laid out, compCompileHelper calls the 3-argument version of Compiler::compCompile, which is the main JIT function.

The source code of Compiler::compCompile is as follows:

//*********************************************************************************************
// #Phases
//
// This is the most interesting 'toplevel' function in the JIT. It goes through the operations of
// importing, morphing, optimizations and code generation. This is called from the EE through the
// code:CILJit::compileMethod function.
//
// For an overview of the structure of the JIT, see:
// https://github.com/dotnet/coreclr/blob/master/Documentation/botr/ryujit-overview.md
//
void Compiler::compCompile(void** methodCodePtr, ULONG* methodCodeSize, CORJIT_FLAGS* compileFlags)
{
if (compIsForInlining())
{
// Notify root instance that an inline attempt is about to import IL
impInlineRoot()->m_inlineStrategy->NoteImport();
}

hashBv::Init(this);

VarSetOps::AssignAllowUninitRhs(this, compCurLife, VarSetOps::UninitVal());

/* The temp holding the secret stub argument is used by fgImport() when importing the intrinsic. */

if (info.compPublishStubParam)
{
assert(lvaStubArgumentVar == BAD_VAR_NUM);
lvaStubArgumentVar = lvaGrabTempWithImplicitUse(false DEBUGARG("stub argument"));
lvaTable[lvaStubArgumentVar].lvType = TYP_I_IMPL;
}

EndPhase(PHASE_PRE_IMPORT);

compFunctionTraceStart();

/* Convert the instrs in each basic block to a tree based intermediate representation */

fgImport();

assert(!fgComputePredsDone);
if (fgCheapPredsValid)
{
// Remove cheap predecessors before inlining; allowing the cheap predecessor lists to be inserted
// with inlined blocks causes problems.
fgRemovePreds();
}

if (compIsForInlining())
{
/* Quit inlining if fgImport() failed for any reason. */

if (compDonotInline())
{
return;
}

/* Filter out unimported BBs */

fgRemoveEmptyBlocks();

return;
}

assert(!compDonotInline());

EndPhase(PHASE_IMPORTATION);

// Maybe the caller was not interested in generating code
if (compIsForImportOnly())
{
compFunctionTraceEnd(nullptr, 0, false);
return;
}

#if !FEATURE_EH
// If we aren't yet supporting EH in a compiler bring-up, remove as many EH handlers as possible, so
// we can pass tests that contain try/catch EH, but don't actually throw any exceptions.
fgRemoveEH();
#endif // !FEATURE_EH

if (compileFlags->corJitFlags & CORJIT_FLG_BBINSTR)
{
fgInstrumentMethod();
}

// We could allow ESP frames. Just need to reserve space for
// pushing EBP if the method becomes an EBP-frame after an edit.
// Note that requiring a EBP Frame disallows double alignment. Thus if we change this
// we either have to disallow double alignment for E&C some other way or handle it in EETwain.

if (opts.compDbgEnC)
{
codeGen->setFramePointerRequired(true);

// Since we need slots for security near ebp, it's not possible
// to do this after an Edit without shifting all the locals.
// So we just always reserve space for these slots in case an Edit adds them
opts.compNeedSecurityCheck = true; // We don't care about localloc right now. If we do support it,
// EECodeManager::FixContextForEnC() needs to handle it smartly
// in case the localloc was actually executed.
//
// compLocallocUsed = true;
}

EndPhase(PHASE_POST_IMPORT);

/* Initialize the BlockSet epoch */

NewBasicBlockEpoch();

/* Massage the trees so that we can generate code out of them */

fgMorph();
EndPhase(PHASE_MORPH);

/* GS security checks for unsafe buffers */
if (getNeedsGSSecurityCookie())
{
#ifdef DEBUG
if (verbose)
{
printf("\n*************** -GS checks for unsafe buffers \n");
}
#endif gsGSChecksInitCookie(); if (compGSReorderStackLayout)
{
gsCopyShadowParams();
} #ifdef DEBUG
if (verbose)
{
fgDispBasicBlocks(true);
printf("\n");
}
#endif
}
EndPhase(PHASE_GS_COOKIE);

/* Compute bbNum, bbRefs and bbPreds */

JITDUMP("\nRenumbering the basic blocks for fgComputePred\n");
fgRenumberBlocks();

noway_assert(!fgComputePredsDone); // This is the first time full (not cheap) preds will be computed.
fgComputePreds();
EndPhase(PHASE_COMPUTE_PREDS);

/* If we need to emit GC Poll calls, mark the blocks that need them now.
 * This is conservative and can be optimized later. */
fgMarkGCPollBlocks();
EndPhase(PHASE_MARK_GC_POLL_BLOCKS);

/* From this point on the flowgraph information such as bbNum,
 * bbRefs or bbPreds has to be kept updated */

// Compute the edge weights (if we have profile data)
fgComputeEdgeWeights();
EndPhase(PHASE_COMPUTE_EDGE_WEIGHTS);

#if FEATURE_EH_FUNCLETS

/* Create funclets from the EH handlers. */

fgCreateFunclets();
EndPhase(PHASE_CREATE_FUNCLETS);

#endif // FEATURE_EH_FUNCLETS

if (!opts.MinOpts() && !opts.compDbgCode)
{
optOptimizeLayout();
EndPhase(PHASE_OPTIMIZE_LAYOUT); // Compute reachability sets and dominators.
fgComputeReachability();
} // Transform each GT_ALLOCOBJ node into either an allocation helper call or
// local variable allocation on the stack.
ObjectAllocator objectAllocator(this);
objectAllocator.Run(); if (!opts.MinOpts() && !opts.compDbgCode)
{
/* Perform loop inversion (i.e. transform "while" loops into
"repeat" loops) and discover and classify natural loops
(e.g. mark iterative loops as such). Also marks loop blocks
and sets bbWeight to the loop nesting levels
*/ optOptimizeLoops();
EndPhase(PHASE_OPTIMIZE_LOOPS); // Clone loops with optimization opportunities, and
// choose the one based on dynamic condition evaluation.
optCloneLoops();
EndPhase(PHASE_CLONE_LOOPS); /* Unroll loops */
optUnrollLoops();
EndPhase(PHASE_UNROLL_LOOPS);
} #ifdef DEBUG
fgDebugCheckLinks();
#endif /* Create the variable table (and compute variable ref counts) */ lvaMarkLocalVars();
EndPhase(PHASE_MARK_LOCAL_VARS); // IMPORTANT, after this point, every place where trees are modified or cloned
// the local variable reference counts must be updated
// You can test the value of the following variable to see if
// the local variable ref counts must be updated
//
assert(lvaLocalVarRefCounted == true); if (!opts.MinOpts() && !opts.compDbgCode)
{
/* Optimize boolean conditions */ optOptimizeBools();
EndPhase(PHASE_OPTIMIZE_BOOLS); // optOptimizeBools() might have changed the number of blocks; the dominators/reachability might be bad.
} /* Figure out the order in which operators are to be evaluated */
fgFindOperOrder();
EndPhase(PHASE_FIND_OPER_ORDER); // Weave the tree lists. Anyone who modifies the tree shapes after
// this point is responsible for calling fgSetStmtSeq() to keep the
// nodes properly linked.
// This can create GC poll calls, and create new BasicBlocks (without updating dominators/reachability).
fgSetBlockOrder();
EndPhase(PHASE_SET_BLOCK_ORDER); // IMPORTANT, after this point, every place where tree topology changes must redo evaluation
// order (gtSetStmtInfo) and relink nodes (fgSetStmtSeq) if required.
CLANG_FORMAT_COMMENT_ANCHOR; #ifdef DEBUG
// Now we have determined the order of evaluation and the gtCosts for every node.
// If verbose, dump the full set of trees here before the optimization phases mutate them
//
if (verbose)
{
fgDispBasicBlocks(true); // 'true' will call fgDumpTrees() after dumping the BasicBlocks
printf("\n");
}
#endif // At this point we know if we are fully interruptible or not
if (!opts.MinOpts() && !opts.compDbgCode)
{
bool doSsa = true;
bool doEarlyProp = true;
bool doValueNum = true;
bool doLoopHoisting = true;
bool doCopyProp = true;
bool doAssertionProp = true;
bool doRangeAnalysis = true; #ifdef DEBUG
doSsa = (JitConfig.JitDoSsa() != 0);
doEarlyProp = doSsa && (JitConfig.JitDoEarlyProp() != 0);
doValueNum = doSsa && (JitConfig.JitDoValueNumber() != 0);
doLoopHoisting = doValueNum && (JitConfig.JitDoLoopHoisting() != 0);
doCopyProp = doValueNum && (JitConfig.JitDoCopyProp() != 0);
doAssertionProp = doValueNum && (JitConfig.JitDoAssertionProp() != 0);
doRangeAnalysis = doAssertionProp && (JitConfig.JitDoRangeAnalysis() != 0);
#endif if (doSsa)
{
fgSsaBuild();
EndPhase(PHASE_BUILD_SSA);
} if (doEarlyProp)
{
/* Propagate array length and rewrite getType() method call */
optEarlyProp();
EndPhase(PHASE_EARLY_PROP);
} if (doValueNum)
{
fgValueNumber();
EndPhase(PHASE_VALUE_NUMBER);
} if (doLoopHoisting)
{
/* Hoist invariant code out of loops */
optHoistLoopCode();
EndPhase(PHASE_HOIST_LOOP_CODE);
} if (doCopyProp)
{
/* Perform VN based copy propagation */
optVnCopyProp();
EndPhase(PHASE_VN_COPY_PROP);
} #if FEATURE_ANYCSE
/* Remove common sub-expressions */
optOptimizeCSEs();
#endif // FEATURE_ANYCSE #if ASSERTION_PROP
if (doAssertionProp)
{
/* Assertion propagation */
optAssertionPropMain();
EndPhase(PHASE_ASSERTION_PROP_MAIN);
} if (doRangeAnalysis)
{
/* Optimize array index range checks */
RangeCheck rc(this);
rc.OptimizeRangeChecks();
EndPhase(PHASE_OPTIMIZE_INDEX_CHECKS);
}
#endif // ASSERTION_PROP /* update the flowgraph if we modified it during the optimization phase*/
if (fgModified)
{
fgUpdateFlowGraph();
EndPhase(PHASE_UPDATE_FLOW_GRAPH); // Recompute the edge weight if we have modified the flow graph
fgComputeEdgeWeights();
EndPhase(PHASE_COMPUTE_EDGE_WEIGHTS2);
}
} #ifdef _TARGET_AMD64_
// Check if we need to add the Quirk for the PPP backward compat issue
compQuirkForPPPflag = compQuirkForPPP();
#endif fgDetermineFirstColdBlock();
EndPhase(PHASE_DETERMINE_FIRST_COLD_BLOCK); #ifdef DEBUG
fgDebugCheckLinks(compStressCompile(STRESS_REMORPH_TREES, 50)); // Stash the current estimate of the function's size if necessary.
if (verbose)
{
compSizeEstimate = 0;
compCycleEstimate = 0;
for (BasicBlock* block = fgFirstBB; block != nullptr; block = block->bbNext)
{
for (GenTreeStmt* stmt = block->firstStmt(); stmt != nullptr; stmt = stmt->getNextStmt())
{
compSizeEstimate += stmt->GetCostSz();
compCycleEstimate += stmt->GetCostEx();
}
}
}
#endif #ifndef LEGACY_BACKEND
// rationalize trees
Rationalizer rat(this); // PHASE_RATIONALIZE
rat.Run();
#endif // !LEGACY_BACKEND // Here we do "simple lowering". When the RyuJIT backend works for all
// platforms, this will be part of the more general lowering phase. For now, though, we do a separate
// pass of "final lowering." We must do this before (final) liveness analysis, because this creates
// range check throw blocks, in which the liveness must be correct.
fgSimpleLowering();
EndPhase(PHASE_SIMPLE_LOWERING); #ifdef LEGACY_BACKEND
/* Local variable liveness */
fgLocalVarLiveness();
EndPhase(PHASE_LCLVARLIVENESS);
#endif // !LEGACY_BACKEND #ifdef DEBUG
fgDebugCheckBBlist();
fgDebugCheckLinks();
#endif /* Enable this to gather statistical data such as
* call and register argument info, flowgraph and loop info, etc. */ compJitStats(); #ifdef _TARGET_ARM_
if (compLocallocUsed)
{
// We reserve REG_SAVED_LOCALLOC_SP to store SP on entry for stack unwinding
codeGen->regSet.rsMaskResvd |= RBM_SAVED_LOCALLOC_SP;
}
#endif // _TARGET_ARM_
#ifdef _TARGET_ARMARCH_
if (compRsvdRegCheck(PRE_REGALLOC_FRAME_LAYOUT))
{
// We reserve R10/IP1 in this case to hold the offsets in load/store instructions
codeGen->regSet.rsMaskResvd |= RBM_OPT_RSVD;
assert(REG_OPT_RSVD != REG_FP);
} #ifdef DEBUG
//
// Display the pre-regalloc frame offsets that we have tentatively decided upon
//
if (verbose)
lvaTableDump();
#endif
#endif // _TARGET_ARMARCH_ /* Assign registers to variables, etc. */
CLANG_FORMAT_COMMENT_ANCHOR; #ifndef LEGACY_BACKEND
///////////////////////////////////////////////////////////////////////////////
// Dominator and reachability sets are no longer valid. They haven't been
// maintained up to here, and shouldn't be used (unless recomputed).
///////////////////////////////////////////////////////////////////////////////
fgDomsComputed = false; /* Create LSRA before Lowering, this way Lowering can initialize the TreeNode Map */
m_pLinearScan = getLinearScanAllocator(this); /* Lower */
Lowering lower(this, m_pLinearScan); // PHASE_LOWERING
lower.Run(); assert(lvaSortAgain == false); // We should have re-run fgLocalVarLiveness() in lower.Run()
lvaTrackedFixed = true; // We can not add any new tracked variables after this point. /* Now that lowering is completed we can proceed to perform register allocation */
m_pLinearScan->doLinearScan();
EndPhase(PHASE_LINEAR_SCAN); // Copied from rpPredictRegUse()
genFullPtrRegMap = (codeGen->genInterruptible || !codeGen->isFramePointerUsed());
#else // LEGACY_BACKEND lvaTrackedFixed = true; // We cannot add any new tracked variables after this point.
// For the classic JIT32 at this point lvaSortAgain can be set and raAssignVars() will call lvaSortOnly() // Now do "classic" register allocation.
raAssignVars();
EndPhase(PHASE_RA_ASSIGN_VARS);
#endif // LEGACY_BACKEND #ifdef DEBUG
fgDebugCheckLinks();
#endif /* Generate code */ codeGen->genGenerateCode(methodCodePtr, methodCodeSize); #ifdef FEATURE_JIT_METHOD_PERF
if (pCompJitTimer)
pCompJitTimer->Terminate(this, CompTimeSummaryInfo::s_compTimeSummary);
#endif RecordStateAtEndOfCompilation(); #ifdef FEATURE_TRACELOGGING
compJitTelemetry.NotifyEndOfCompilation();
#endif #if defined(DEBUG)
++Compiler::jitTotalMethodCompiled;
#endif // defined(DEBUG) compFunctionTraceEnd(*methodCodePtr, *methodCodeSize, false); #if FUNC_INFO_LOGGING
if (compJitFuncInfoFile != nullptr)
{
assert(!compIsForInlining());
#ifdef DEBUG // We only have access to info.compFullName in DEBUG builds.
fprintf(compJitFuncInfoFile, "%s\n", info.compFullName);
#elif FEATURE_SIMD
fprintf(compJitFuncInfoFile, " %s\n", eeGetMethodFullName(info.compMethodHnd));
#endif
fprintf(compJitFuncInfoFile, ""); // in our logic this causes a flush
}
#endif // FUNC_INFO_LOGGING
}

The main JIT function invokes each phase in turn; a call such as EndPhase(PHASE_PRE_IMPORT) marks the end of that phase.
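
This run-a-pass-then-EndPhase pattern can be sketched as a tiny phase pipeline. This is only an illustration of the control flow, not CoreCLR's actual implementation; the MiniCompiler type and the trimmed-down phase list here are hypothetical:

```cpp
#include <cassert>
#include <vector>

// A minimal sketch of a phase-driven compile: each pass runs, then
// EndPhase records that the phase completed. (In CoreCLR this hook is
// also where per-phase timing and JitDump output can happen.)
enum Phase { PHASE_PRE_IMPORT, PHASE_IMPORTATION, PHASE_MORPH };

struct MiniCompiler
{
    std::vector<Phase> completed;

    void EndPhase(Phase p) { completed.push_back(p); }

    void compCompile()
    {
        // ... pre-import setup would run here ...
        EndPhase(PHASE_PRE_IMPORT);
        // ... fgImport() would run here ...
        EndPhase(PHASE_IMPORTATION);
        // ... fgMorph() would run here ...
        EndPhase(PHASE_MORPH);
    }
};
```

Because every pass ends with the same call, the completed-phase list doubles as a trace of how far compilation got.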

There are more phases here than in the list Microsoft publishes:

(figure: the full list of JIT phases)

Next, let's analyze these phases one by one.

PHASE_PRE_IMPORT

This phase performs the preparatory work before importing HIR (GenTree) from IL. It contains the following code:

if (compIsForInlining())
{
    // Notify root instance that an inline attempt is about to import IL
    impInlineRoot()->m_inlineStrategy->NoteImport();
}

hashBv::Init(this);

VarSetOps::AssignAllowUninitRhs(this, compCurLife, VarSetOps::UninitVal());

/* The temp holding the secret stub argument is used by fgImport() when importing the intrinsic. */

if (info.compPublishStubParam)
{
    assert(lvaStubArgumentVar == BAD_VAR_NUM);
    lvaStubArgumentVar = lvaGrabTempWithImplicitUse(false DEBUGARG("stub argument"));
    lvaTable[lvaStubArgumentVar].lvType = TYP_I_IMPL;
}

EndPhase(PHASE_PRE_IMPORT);

It performs some initialization before the import:

hashBv::Init creates an allocator for bitvectors,

VarSetOps::AssignAllowUninitRhs marks compCurLife as uninitialized (this variable holds the set of currently live local variables),

and when the compPublishStubParam option is enabled, an extra local variable is added (it holds the value of rax at function entry).

PHASE_IMPORTATION

This phase imports HIR (GenTree) from IL. It contains the following code:

compFunctionTraceStart();

/* Convert the instrs in each basic block to a tree based intermediate representation */

fgImport();

assert(!fgComputePredsDone);
if (fgCheapPredsValid)
{
    // Remove cheap predecessors before inlining; allowing the cheap predecessor lists to be inserted
    // with inlined blocks causes problems.
    fgRemovePreds();
}

if (compIsForInlining())
{
    /* Quit inlining if fgImport() failed for any reason. */
    if (compDonotInline())
    {
        return;
    }

    /* Filter out unimported BBs */
    fgRemoveEmptyBlocks();

    return;
}

assert(!compDonotInline());

EndPhase(PHASE_IMPORTATION);

compFunctionTraceStart prints some debugging information.

fgImport parses the IL and creates GenTree nodes. Since the BasicBlocks were created earlier, the GenTrees built from the IL are appended to their corresponding BasicBlocks.

BasicBlock + GenTree is what we usually call the IR. The IR comes in two forms: the tree form is called HIR (used by the JIT frontend) and the linear form is called LIR (used by the JIT backend). What is built here is HIR.
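
To illustrate the difference between the two shapes, here is a deliberately simplified sketch. The Node type, Oper enum and Linearize helper are made up for this example; in the real JIT both forms share the GenTree node type, and LIR additionally threads every node through gtPrev/gtNext in execution order:

```cpp
#include <cassert>

// HIR: operands hang off the operator as a tree (op1/op2).
// LIR: the same nodes also form a flat list in execution order (gtNext).
enum Oper { LCL_VAR, ADD, ASG };

struct Node
{
    Oper  oper;
    Node* op1;
    Node* op2;
    Node* gtNext; // linear link used by the LIR view

    Node(Oper o, Node* a = nullptr, Node* b = nullptr)
        : oper(o), op1(a), op2(b), gtNext(nullptr) {}
};

// Flatten a tree into execution order (operands before their operator),
// which is essentially what linearization produces from HIR.
static void Linearize(Node* n, Node*& head, Node*& tail)
{
    if (n == nullptr)
        return;
    Linearize(n->op1, head, tail);
    Linearize(n->op2, head, tail);
    if (tail)
        tail->gtNext = n;
    else
        head = n;
    tail = n;
}
```

For `x = a + b` the HIR is the tree `ASG(x, ADD(a, b))`; linearizing it yields the LIR order x, a, b, ADD, ASG.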

The source code of fgImport is as follows:

void Compiler::fgImport()
{
    fgHasPostfix = false;

    impImport(fgFirstBB);

    if (!(opts.eeFlags & CORJIT_FLG_SKIP_VERIFICATION))
    {
        CorInfoMethodRuntimeFlags verFlag;
        verFlag = tiIsVerifiableCode ? CORINFO_FLG_VERIFIABLE : CORINFO_FLG_UNVERIFIABLE;
        info.compCompHnd->setMethodAttribs(info.compMethodHnd, verFlag);
    }
}

It calls impImport on the first BasicBlock.

The source code of impImport is as follows:

/*****************************************************************************
 *
 *  Convert the instrs ("import") into our internal format (trees). The
 *  basic flowgraph has already been constructed and is passed in.
 */

void Compiler::impImport(BasicBlock* method)
{
#ifdef DEBUG
if (verbose)
{
printf("*************** In impImport() for %s\n", info.compFullName);
}
#endif

/* Allocate the stack contents */

if (info.compMaxStack <= sizeof(impSmallStack) / sizeof(impSmallStack[0]))
{
/* Use local variable, don't waste time allocating on the heap */

impStkSize = sizeof(impSmallStack) / sizeof(impSmallStack[0]);
verCurrentState.esStack = impSmallStack;
}
else
{
impStkSize = info.compMaxStack;
verCurrentState.esStack = new (this, CMK_ImpStack) StackEntry[impStkSize];
} // initialize the entry state at start of method
verInitCurrentState(); // Initialize stuff related to figuring "spill cliques" (see spec comment for impGetSpillTmpBase).
Compiler* inlineRoot = impInlineRoot();
if (this == inlineRoot) // These are only used on the root of the inlining tree.
{
// We have initialized these previously, but to size 0. Make them larger.
impPendingBlockMembers.Init(getAllocator(), fgBBNumMax * 2);
impSpillCliquePredMembers.Init(getAllocator(), fgBBNumMax * 2);
impSpillCliqueSuccMembers.Init(getAllocator(), fgBBNumMax * 2);
}
inlineRoot->impPendingBlockMembers.Reset(fgBBNumMax * 2);
inlineRoot->impSpillCliquePredMembers.Reset(fgBBNumMax * 2);
inlineRoot->impSpillCliqueSuccMembers.Reset(fgBBNumMax * 2);
impBlockListNodeFreeList = nullptr; #ifdef DEBUG
impLastILoffsStmt = nullptr;
impNestedStackSpill = false;
#endif
impBoxTemp = BAD_VAR_NUM; impPendingList = impPendingFree = nullptr; /* Add the entry-point to the worker-list */ // Skip leading internal blocks. There can be one as a leading scratch BB, and more
// from EH normalization.
// NOTE: It might be possible to always just put fgFirstBB on the pending list, and let everything else just fall
// out.
for (; method->bbFlags & BBF_INTERNAL; method = method->bbNext)
{
// Treat these as imported.
assert(method->bbJumpKind == BBJ_NONE); // We assume all the leading ones are fallthrough.
JITDUMP("Marking leading BBF_INTERNAL block BB%02u as BBF_IMPORTED\n", method->bbNum);
method->bbFlags |= BBF_IMPORTED;
} impImportBlockPending(method); /* Import blocks in the worker-list until there are no more */ while (impPendingList)
{
/* Remove the entry at the front of the list */ PendingDsc* dsc = impPendingList;
impPendingList = impPendingList->pdNext;
impSetPendingBlockMember(dsc->pdBB, 0); /* Restore the stack state */ verCurrentState.thisInitialized = dsc->pdThisPtrInit;
verCurrentState.esStackDepth = dsc->pdSavedStack.ssDepth;
if (verCurrentState.esStackDepth)
{
impRestoreStackState(&dsc->pdSavedStack);
} /* Add the entry to the free list for reuse */ dsc->pdNext = impPendingFree;
impPendingFree = dsc; /* Now import the block */ if (dsc->pdBB->bbFlags & BBF_FAILED_VERIFICATION)
{ #ifdef _TARGET_64BIT_
// On AMD64, during verification we have to match JIT64 behavior since the VM is very tighly
// coupled with the JIT64 IL Verification logic. Look inside verHandleVerificationFailure
// method for further explanation on why we raise this exception instead of making the jitted
// code throw the verification exception during execution.
if (tiVerificationNeeded && (opts.eeFlags & CORJIT_FLG_IMPORT_ONLY) != 0)
{
BADCODE("Basic block marked as not verifiable");
}
else
#endif // _TARGET_64BIT_
{
verConvertBBToThrowVerificationException(dsc->pdBB DEBUGARG(true));
impEndTreeList(dsc->pdBB);
}
}
else
{
impImportBlock(dsc->pdBB); if (compDonotInline())
{
return;
}
if (compIsForImportOnly() && !tiVerificationNeeded)
{
return;
}
}
} #ifdef DEBUG
if (verbose && info.compXcptnsCount)
{
printf("\nAfter impImport() added block for try,catch,finally");
fgDispBasicBlocks();
printf("\n");
} // Used in impImportBlockPending() for STRESS_CHK_REIMPORT
for (BasicBlock* block = fgFirstBB; block; block = block->bbNext)
{
block->bbFlags &= ~BBF_VISITED;
}
#endif assert(!compIsForInlining() || !tiVerificationNeeded);
}

First it initializes the execution stack verCurrentState.esStack: when maxstack is at most 16 (the capacity of impSmallStack) it uses the inline impSmallStack, otherwise it allocates the stack with new.
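
This is the usual small-buffer optimization: use a fixed inline array when the required size fits, and fall back to the heap otherwise. A minimal sketch, with a hypothetical ImportStack type standing in for the real fields:

```cpp
#include <cassert>
#include <cstddef>

// Sketch of impImport's allocation strategy: a fixed inline buffer for
// typical small methods, a heap allocation only when maxstack exceeds
// the inline capacity (16 in CoreCLR's impSmallStack).
const std::size_t SMALL_STACK_SIZE = 16;

struct ImportStack
{
    int         smallStack[SMALL_STACK_SIZE];
    int*        entries;
    std::size_t capacity;
    bool        onHeap;

    explicit ImportStack(std::size_t maxStack)
    {
        if (maxStack <= SMALL_STACK_SIZE)
        {
            entries  = smallStack; // no heap allocation needed
            capacity = SMALL_STACK_SIZE;
            onHeap   = false;
        }
        else
        {
            entries  = new int[maxStack];
            capacity = maxStack;
            onHeap   = true;
        }
    }

    ~ImportStack()
    {
        if (onHeap)
            delete[] entries;
    }
};
```

Since most methods declare a small maxstack, the common case pays no allocation cost at all.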

Then it initializes the members needed to track "spill cliques" (groups of spill temps, the temporary variables that hold values spilled from the execution stack).

After that it marks the internally added (BBF_INTERNAL) BasicBlocks as imported (BBF_IMPORTED), because these blocks have no corresponding IL range.

Next it adds the first non-internal BasicBlock to the worklist impPendingList, and keeps processing that list until it is empty.
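
Stripped of spilling and verification, this pending-list processing is a standard worklist algorithm: pop a block, import it, and enqueue any successors that have not been imported yet. A simplified sketch with hypothetical Block/ImportAll names:

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Simplified worklist over basic blocks, mirroring impPendingList:
// each block is imported exactly once, and importing a block enqueues
// its not-yet-imported successors.
struct Block
{
    bool imported = false;
    std::vector<Block*> successors;
};

// Returns how many blocks were imported.
static int ImportAll(Block* entry)
{
    std::deque<Block*> pending;
    pending.push_back(entry);
    entry->imported = true; // mark on enqueue so a block is queued once
    int count = 0;
    while (!pending.empty())
    {
        Block* b = pending.front();
        pending.pop_front();
        ++count; // the block would be "imported" here
        for (Block* succ : b->successors)
        {
            if (!succ->imported)
            {
                succ->imported = true;
                pending.push_back(succ);
            }
        }
    }
    return count;
}
```

The real impImportBlockPending is more subtle (a block can be re-queued when its incoming stack state changes), but the basic traversal is this one.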

Each BasicBlock taken from the list is processed by calling impImportBlock(dsc->pdBB).

The source code of impImportBlock is as follows:

//***************************************************************
// Import the instructions for the given basic block. Perform
// verification, throwing an exception on failure. Push any successor blocks that are enabled for the first
// time, or whose verification pre-state is changed.

#ifdef _PREFAST_
#pragma warning(push)
#pragma warning(disable : 21000) // Suppress PREFast warning about overly large function
#endif
void Compiler::impImportBlock(BasicBlock* block)
{
// BBF_INTERNAL blocks only exist during importation due to EH canonicalization. We need to
// handle them specially. In particular, there is no IL to import for them, but we do need
// to mark them as imported and put their successors on the pending import list.
if (block->bbFlags & BBF_INTERNAL)
{
JITDUMP("Marking BBF_INTERNAL block BB%02u as BBF_IMPORTED\n", block->bbNum);
block->bbFlags |= BBF_IMPORTED; for (unsigned i = 0; i < block->NumSucc(); i++)
{
impImportBlockPending(block->GetSucc(i));
} return;
} bool markImport; assert(block); /* Make the block globaly available */ compCurBB = block; #ifdef DEBUG
/* Initialize the debug variables */
impCurOpcName = "unknown";
impCurOpcOffs = block->bbCodeOffs;
#endif /* Set the current stack state to the merged result */
verResetCurrentState(block, &verCurrentState); /* Now walk the code and import the IL into GenTrees */ struct FilterVerificationExceptionsParam
{
Compiler* pThis;
BasicBlock* block;
};
FilterVerificationExceptionsParam param; param.pThis = this;
param.block = block; PAL_TRY(FilterVerificationExceptionsParam*, pParam, &param)
{
/* @VERIFICATION : For now, the only state propagation from try
to it's handler is "thisInit" state (stack is empty at start of try).
In general, for state that we track in verification, we need to
model the possibility that an exception might happen at any IL
instruction, so we really need to merge all states that obtain
between IL instructions in a try block into the start states of
all handlers. However we do not allow the 'this' pointer to be uninitialized when
entering most kinds try regions (only try/fault are allowed to have
an uninitialized this pointer on entry to the try) Fortunately, the stack is thrown away when an exception
leads to a handler, so we don't have to worry about that.
We DO, however, have to worry about the "thisInit" state.
But only for the try/fault case. The only allowed transition is from TIS_Uninit to TIS_Init. So for a try/fault region for the fault handler block
we will merge the start state of the try begin
and the post-state of each block that is part of this try region
*/ // merge the start state of the try begin
//
if (pParam->block->bbFlags & BBF_TRY_BEG)
{
pParam->pThis->impVerifyEHBlock(pParam->block, true);
} pParam->pThis->impImportBlockCode(pParam->block); // As discussed above:
// merge the post-state of each block that is part of this try region
//
if (pParam->block->hasTryIndex())
{
pParam->pThis->impVerifyEHBlock(pParam->block, false);
}
}
PAL_EXCEPT_FILTER(FilterVerificationExceptions)
{
verHandleVerificationFailure(block DEBUGARG(false));
}
PAL_ENDTRY if (compDonotInline())
{
return;
} assert(!compDonotInline()); markImport = false; SPILLSTACK: unsigned baseTmp = NO_BASE_TMP; // input temps assigned to successor blocks
bool reimportSpillClique = false;
BasicBlock* tgtBlock = nullptr; /* If the stack is non-empty, we might have to spill its contents */ if (verCurrentState.esStackDepth != 0)
{
impBoxTemp = BAD_VAR_NUM; // if a box temp is used in a block that leaves something
// on the stack, its lifetime is hard to determine, simply
// don't reuse such temps. GenTreePtr addStmt = nullptr; /* Do the successors of 'block' have any other predecessors ?
We do not want to do some of the optimizations related to multiRef
if we can reimport blocks */ unsigned multRef = impCanReimport ? unsigned(~0) : 0; switch (block->bbJumpKind)
{
case BBJ_COND: /* Temporarily remove the 'jtrue' from the end of the tree list */ assert(impTreeLast);
assert(impTreeLast->gtOper == GT_STMT);
assert(impTreeLast->gtStmt.gtStmtExpr->gtOper == GT_JTRUE); addStmt = impTreeLast;
impTreeLast = impTreeLast->gtPrev; /* Note if the next block has more than one ancestor */ multRef |= block->bbNext->bbRefs; /* Does the next block have temps assigned? */ baseTmp = block->bbNext->bbStkTempsIn;
tgtBlock = block->bbNext; if (baseTmp != NO_BASE_TMP)
{
break;
} /* Try the target of the jump then */ multRef |= block->bbJumpDest->bbRefs;
baseTmp = block->bbJumpDest->bbStkTempsIn;
tgtBlock = block->bbJumpDest;
break; case BBJ_ALWAYS:
multRef |= block->bbJumpDest->bbRefs;
baseTmp = block->bbJumpDest->bbStkTempsIn;
tgtBlock = block->bbJumpDest;
break; case BBJ_NONE:
multRef |= block->bbNext->bbRefs;
baseTmp = block->bbNext->bbStkTempsIn;
tgtBlock = block->bbNext;
break; case BBJ_SWITCH: BasicBlock** jmpTab;
unsigned jmpCnt; /* Temporarily remove the GT_SWITCH from the end of the tree list */ assert(impTreeLast);
assert(impTreeLast->gtOper == GT_STMT);
assert(impTreeLast->gtStmt.gtStmtExpr->gtOper == GT_SWITCH); addStmt = impTreeLast;
impTreeLast = impTreeLast->gtPrev; jmpCnt = block->bbJumpSwt->bbsCount;
jmpTab = block->bbJumpSwt->bbsDstTab; do
{
tgtBlock = (*jmpTab); multRef |= tgtBlock->bbRefs; // Thanks to spill cliques, we should have assigned all or none
assert((baseTmp == NO_BASE_TMP) || (baseTmp == tgtBlock->bbStkTempsIn));
baseTmp = tgtBlock->bbStkTempsIn;
if (multRef > 1)
{
break;
}
} while (++jmpTab, --jmpCnt); break; case BBJ_CALLFINALLY:
case BBJ_EHCATCHRET:
case BBJ_RETURN:
case BBJ_EHFINALLYRET:
case BBJ_EHFILTERRET:
case BBJ_THROW:
NO_WAY("can't have 'unreached' end of BB with non-empty stack");
break; default:
noway_assert(!"Unexpected bbJumpKind");
break;
} assert(multRef >= 1); /* Do we have a base temp number? */ bool newTemps = (baseTmp == NO_BASE_TMP); if (newTemps)
{
/* Grab enough temps for the whole stack */
baseTmp = impGetSpillTmpBase(block);
} /* Spill all stack entries into temps */
unsigned level, tempNum; JITDUMP("\nSpilling stack entries into temps\n");
for (level = 0, tempNum = baseTmp; level < verCurrentState.esStackDepth; level++, tempNum++)
{
GenTreePtr tree = verCurrentState.esStack[level].val; /* VC generates code where it pushes a byref from one branch, and an int (ldc.i4 0) from
the other. This should merge to a byref in unverifiable code.
However, if the branch which leaves the TYP_I_IMPL on the stack is imported first, the
successor would be imported assuming there was a TYP_I_IMPL on
the stack. Thus the value would not get GC-tracked. Hence,
change the temp to TYP_BYREF and reimport the successors.
Note: We should only allow this in unverifiable code.
*/
if (tree->gtType == TYP_BYREF && lvaTable[tempNum].lvType == TYP_I_IMPL && !verNeedsVerification())
{
lvaTable[tempNum].lvType = TYP_BYREF;
impReimportMarkSuccessors(block);
markImport = true;
} #ifdef _TARGET_64BIT_
if (genActualType(tree->gtType) == TYP_I_IMPL && lvaTable[tempNum].lvType == TYP_INT)
{
if (tiVerificationNeeded && tgtBlock->bbEntryState != nullptr &&
(tgtBlock->bbFlags & BBF_FAILED_VERIFICATION) == 0)
{
// Merge the current state into the entry state of block;
// the call to verMergeEntryStates must have changed
// the entry state of the block by merging the int local var
// and the native-int stack entry.
bool changed = false;
if (verMergeEntryStates(tgtBlock, &changed))
{
impRetypeEntryStateTemps(tgtBlock);
impReimportBlockPending(tgtBlock);
assert(changed);
}
else
{
tgtBlock->bbFlags |= BBF_FAILED_VERIFICATION;
break;
}
} // Some other block in the spill clique set this to "int", but now we have "native int".
// Change the type and go back to re-import any blocks that used the wrong type.
lvaTable[tempNum].lvType = TYP_I_IMPL;
reimportSpillClique = true;
}
else if (genActualType(tree->gtType) == TYP_INT && lvaTable[tempNum].lvType == TYP_I_IMPL)
{
// Spill clique has decided this should be "native int", but this block only pushes an "int".
// Insert a sign-extension to "native int" so we match the clique.
verCurrentState.esStack[level].val = gtNewCastNode(TYP_I_IMPL, tree, TYP_I_IMPL);
} // Consider the case where one branch left a 'byref' on the stack and the other leaves
// an 'int'. On 32-bit, this is allowed (in non-verifiable code) since they are the same
// size. JIT64 managed to make this work on 64-bit. For compatibility, we support JIT64
// behavior instead of asserting and then generating bad code (where we save/restore the
// low 32 bits of a byref pointer to an 'int' sized local). If the 'int' side has been
// imported already, we need to change the type of the local and reimport the spill clique.
// If the 'byref' side has imported, we insert a cast from int to 'native int' to match
// the 'byref' size.
if (!tiVerificationNeeded)
{
if (genActualType(tree->gtType) == TYP_BYREF && lvaTable[tempNum].lvType == TYP_INT)
{
// Some other block in the spill clique set this to "int", but now we have "byref".
// Change the type and go back to re-import any blocks that used the wrong type.
lvaTable[tempNum].lvType = TYP_BYREF;
reimportSpillClique = true;
}
else if (genActualType(tree->gtType) == TYP_INT && lvaTable[tempNum].lvType == TYP_BYREF)
{
// Spill clique has decided this should be "byref", but this block only pushes an "int".
// Insert a sign-extension to "native int" so we match the clique size.
verCurrentState.esStack[level].val = gtNewCastNode(TYP_I_IMPL, tree, TYP_I_IMPL);
}
}
#endif // _TARGET_64BIT_ #if FEATURE_X87_DOUBLES
// X87 stack doesn't differentiate between float/double
// so promoting is no big deal.
// For everybody else keep it as float until we have a collision and then promote
// Just like for x64's TYP_INT<->TYP_I_IMPL if (multRef > 1 && tree->gtType == TYP_FLOAT)
{
verCurrentState.esStack[level].val = gtNewCastNode(TYP_DOUBLE, tree, TYP_DOUBLE);
} #else // !FEATURE_X87_DOUBLES if (tree->gtType == TYP_DOUBLE && lvaTable[tempNum].lvType == TYP_FLOAT)
{
// Some other block in the spill clique set this to "float", but now we have "double".
// Change the type and go back to re-import any blocks that used the wrong type.
lvaTable[tempNum].lvType = TYP_DOUBLE;
reimportSpillClique = true;
}
else if (tree->gtType == TYP_FLOAT && lvaTable[tempNum].lvType == TYP_DOUBLE)
{
// Spill clique has decided this should be "double", but this block only pushes a "float".
// Insert a cast to "double" so we match the clique.
verCurrentState.esStack[level].val = gtNewCastNode(TYP_DOUBLE, tree, TYP_DOUBLE);
} #endif // FEATURE_X87_DOUBLES /* If addStmt has a reference to tempNum (can only happen if we
are spilling to the temps already used by a previous block),
we need to spill addStmt */ if (addStmt && !newTemps && gtHasRef(addStmt->gtStmt.gtStmtExpr, tempNum, false))
{
GenTreePtr addTree = addStmt->gtStmt.gtStmtExpr; if (addTree->gtOper == GT_JTRUE)
{
GenTreePtr relOp = addTree->gtOp.gtOp1;
assert(relOp->OperIsCompare()); var_types type = genActualType(relOp->gtOp.gtOp1->TypeGet()); if (gtHasRef(relOp->gtOp.gtOp1, tempNum, false))
{
unsigned temp = lvaGrabTemp(true DEBUGARG("spill addStmt JTRUE ref Op1"));
impAssignTempGen(temp, relOp->gtOp.gtOp1, level);
type = genActualType(lvaTable[temp].TypeGet());
relOp->gtOp.gtOp1 = gtNewLclvNode(temp, type);
} if (gtHasRef(relOp->gtOp.gtOp2, tempNum, false))
{
unsigned temp = lvaGrabTemp(true DEBUGARG("spill addStmt JTRUE ref Op2"));
impAssignTempGen(temp, relOp->gtOp.gtOp2, level);
type = genActualType(lvaTable[temp].TypeGet());
relOp->gtOp.gtOp2 = gtNewLclvNode(temp, type);
}
}
else
{
assert(addTree->gtOper == GT_SWITCH && genActualType(addTree->gtOp.gtOp1->gtType) == TYP_I_IMPL); unsigned temp = lvaGrabTemp(true DEBUGARG("spill addStmt SWITCH"));
impAssignTempGen(temp, addTree->gtOp.gtOp1, level);
addTree->gtOp.gtOp1 = gtNewLclvNode(temp, TYP_I_IMPL);
}
} /* Spill the stack entry, and replace with the temp */ if (!impSpillStackEntry(level, tempNum
#ifdef DEBUG
,
true, "Spill Stack Entry"
#endif
))
{
if (markImport)
{
BADCODE("bad stack state");
} // Oops. Something went wrong when spilling. Bad code.
verHandleVerificationFailure(block DEBUGARG(true)); goto SPILLSTACK;
}
} /* Put back the 'jtrue'/'switch' if we removed it earlier */ if (addStmt)
{
impAppendStmt(addStmt, (unsigned)CHECK_SPILL_NONE);
}
} // Some of the append/spill logic works on compCurBB assert(compCurBB == block); /* Save the tree list in the block */
impEndTreeList(block); // impEndTreeList sets BBF_IMPORTED on the block
// We do *NOT* want to set it later than this because
// impReimportSpillClique might clear it if this block is both a
// predecessor and successor in the current spill clique
assert(block->bbFlags & BBF_IMPORTED); // If we had a int/native int, or float/double collision, we need to re-import
if (reimportSpillClique)
{
// This will re-import all the successors of block (as well as each of their predecessors)
impReimportSpillClique(block); // For blocks that haven't been imported yet, we still need to mark them as pending import.
for (unsigned i = 0; i < block->NumSucc(); i++)
{
BasicBlock* succ = block->GetSucc(i);
if ((succ->bbFlags & BBF_IMPORTED) == 0)
{
impImportBlockPending(succ);
}
}
}
else // the normal case
{
// otherwise just import the successors of block /* Does this block jump to any other blocks? */
for (unsigned i = 0; i < block->NumSucc(); i++)
{
impImportBlockPending(block->GetSucc(i));
}
}
}
#ifdef _PREFAST_
#pragma warning(pop)
#endif

This function first calls impImportBlockCode, which does the main work of generating GenTrees from the IL.

After importing a block, if the execution stack is not empty (the instructions after a jump may need the values pushed before the jump), the values on the execution stack must be spilled to temporary variables.

The starting index of the temporaries spilled at the end of a block is stored in bbStkTempsOut; the starting index of the temporaries to be read at the start of a block is stored in bbStkTempsIn.

Because values on the execution stack rarely cross BasicBlock boundaries (in IL compiled from C#), I won't analyze this logic in detail.
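
The spilling idea itself can still be sketched briefly: each stack slot left at the end of a block is stored to a numbered temp starting at some base index, and the successor restores its incoming stack from the same temps (bbStkTempsOut on the predecessor and bbStkTempsIn on the successor both record that base index). The helper names here are hypothetical:

```cpp
#include <cassert>
#include <vector>

// Sketch of spilling the execution stack at a block boundary: the
// predecessor stores each stack entry to temps starting at baseTmp,
// and the successor reloads its incoming stack from the same range.
static void SpillStack(std::vector<int>& stack, std::vector<int>& temps, std::size_t baseTmp)
{
    for (std::size_t level = 0; level < stack.size(); ++level)
        temps[baseTmp + level] = stack[level]; // store stack slot to temp
    stack.clear();
}

static std::vector<int> RestoreStack(const std::vector<int>& temps,
                                     std::size_t baseTmp, std::size_t depth)
{
    // reload the incoming stack from the agreed temp range
    return std::vector<int>(temps.begin() + baseTmp,
                            temps.begin() + baseTmp + depth);
}
```

Both sides must agree on baseTmp and on the type of each temp, which is exactly why impImportBlock has the spill-clique reimport logic for int/native-int and float/double collisions.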

Next, let's look at impImportBlockCode.

The source code of impImportBlockCode is as follows:

This function is more than 5000 lines long, so I only excerpt part of it here.

#ifdef _PREFAST_
#pragma warning(push)
#pragma warning(disable : 21000) // Suppress PREFast warning about overly large function
#endif
/*****************************************************************************
* Import the instr for the given basic block
*/
void Compiler::impImportBlockCode(BasicBlock* block)
{
#define _impResolveToken(kind) impResolveToken(codeAddr, &resolvedToken, kind)

#ifdef DEBUG
if (verbose)
{
printf("\nImporting BB%02u (PC=%03u) of '%s'", block->bbNum, block->bbCodeOffs, info.compFullName);
}
#endif unsigned nxtStmtIndex = impInitBlockLineInfo();
IL_OFFSET nxtStmtOffs; GenTreePtr arrayNodeFrom, arrayNodeTo, arrayNodeToIndex;
bool expandInline;
CorInfoHelpFunc helper;
CorInfoIsAccessAllowedResult accessAllowedResult;
CORINFO_HELPER_DESC calloutHelper;
const BYTE* lastLoadToken = nullptr; // reject cyclic constraints
if (tiVerificationNeeded)
{
Verify(!info.hasCircularClassConstraints, "Method parent has circular class type parameter constraints.");
Verify(!info.hasCircularMethodConstraints, "Method has circular method type parameter constraints.");
} /* Get the tree list started */ impBeginTreeList(); /* Walk the opcodes that comprise the basic block */ const BYTE* codeAddr = info.compCode + block->bbCodeOffs;
const BYTE* codeEndp = info.compCode + block->bbCodeOffsEnd; IL_OFFSET opcodeOffs = block->bbCodeOffs;
IL_OFFSET lastSpillOffs = opcodeOffs; signed jmpDist; /* remember the start of the delegate creation sequence (used for verification) */
const BYTE* delegateCreateStart = nullptr; int prefixFlags = 0;
bool explicitTailCall, constraintCall, readonlyCall; bool insertLdloc = false; // set by CEE_DUP and cleared by following store
typeInfo tiRetVal; unsigned numArgs = info.compArgsCount; /* Now process all the opcodes in the block */ var_types callTyp = TYP_COUNT;
OPCODE prevOpcode = CEE_ILLEGAL; if (block->bbCatchTyp)
{
if (info.compStmtOffsetsImplicit & ICorDebugInfo::CALL_SITE_BOUNDARIES)
{
impCurStmtOffsSet(block->bbCodeOffs);
} // We will spill the GT_CATCH_ARG and the input of the BB_QMARK block
// to a temp. This is a trade off for code simplicity
impSpillSpecialSideEff();
} while (codeAddr < codeEndp)
{
bool usingReadyToRunHelper = false;
CORINFO_RESOLVED_TOKEN resolvedToken;
CORINFO_RESOLVED_TOKEN constrainedResolvedToken;
CORINFO_CALL_INFO callInfo;
CORINFO_FIELD_INFO fieldInfo; tiRetVal = typeInfo(); // Default type info //--------------------------------------------------------------------- /* We need to restrict the max tree depth as many of the Compiler
functions are recursive. We do this by spilling the stack */ if (verCurrentState.esStackDepth)
{
/* Has it been a while since we last saw a non-empty stack (which
guarantees that the tree depth isnt accumulating. */ if ((opcodeOffs - lastSpillOffs) > 200)
{
impSpillStackEnsure();
lastSpillOffs = opcodeOffs;
}
}
else
{
lastSpillOffs = opcodeOffs;
impBoxTempInUse = false; // nothing on the stack, box temp OK to use again
} /* Compute the current instr offset */ opcodeOffs = (IL_OFFSET)(codeAddr - info.compCode); #if defined(DEBUGGING_SUPPORT) || defined(DEBUG) #ifndef DEBUG
if (opts.compDbgInfo)
#endif
{
if (!compIsForInlining())
{
nxtStmtOffs =
(nxtStmtIndex < info.compStmtOffsetsCount) ? info.compStmtOffsets[nxtStmtIndex] : BAD_IL_OFFSET; /* Have we reached the next stmt boundary ? */ if (nxtStmtOffs != BAD_IL_OFFSET && opcodeOffs >= nxtStmtOffs)
{
assert(nxtStmtOffs == info.compStmtOffsets[nxtStmtIndex]); if (verCurrentState.esStackDepth != 0 && opts.compDbgCode)
{
/* We need to provide accurate IP-mapping at this point.
So spill anything on the stack so that it will form
gtStmts with the correct stmt offset noted */ impSpillStackEnsure(true);
} // Has impCurStmtOffs been reported in any tree? if (impCurStmtOffs != BAD_IL_OFFSET && opts.compDbgCode)
{
GenTreePtr placeHolder = new (this, GT_NO_OP) GenTree(GT_NO_OP, TYP_VOID);
impAppendTree(placeHolder, (unsigned)CHECK_SPILL_NONE, impCurStmtOffs); assert(impCurStmtOffs == BAD_IL_OFFSET);
} if (impCurStmtOffs == BAD_IL_OFFSET)
{
/* Make sure that nxtStmtIndex is in sync with opcodeOffs.
If opcodeOffs has gone past nxtStmtIndex, catch up */ while ((nxtStmtIndex + 1) < info.compStmtOffsetsCount &&
info.compStmtOffsets[nxtStmtIndex + 1] <= opcodeOffs)
{
nxtStmtIndex++;
} /* Go to the new stmt */ impCurStmtOffsSet(info.compStmtOffsets[nxtStmtIndex]); /* Update the stmt boundary index */ nxtStmtIndex++;
assert(nxtStmtIndex <= info.compStmtOffsetsCount); /* Are there any more line# entries after this one? */ if (nxtStmtIndex < info.compStmtOffsetsCount)
{
/* Remember where the next line# starts */ nxtStmtOffs = info.compStmtOffsets[nxtStmtIndex];
}
else
{
/* No more line# entries */ nxtStmtOffs = BAD_IL_OFFSET;
}
}
}
else if ((info.compStmtOffsetsImplicit & ICorDebugInfo::STACK_EMPTY_BOUNDARIES) &&
(verCurrentState.esStackDepth == 0))
{
/* At stack-empty locations, we have already added the tree to
the stmt list with the last offset. We just need to update
impCurStmtOffs
*/ impCurStmtOffsSet(opcodeOffs);
}
else if ((info.compStmtOffsetsImplicit & ICorDebugInfo::CALL_SITE_BOUNDARIES) &&
impOpcodeIsCallSiteBoundary(prevOpcode))
{
/* Make sure we have a type cached */
assert(callTyp != TYP_COUNT); if (callTyp == TYP_VOID)
{
impCurStmtOffsSet(opcodeOffs);
}
else if (opts.compDbgCode)
{
impSpillStackEnsure(true);
impCurStmtOffsSet(opcodeOffs);
}
}
else if ((info.compStmtOffsetsImplicit & ICorDebugInfo::NOP_BOUNDARIES) && (prevOpcode == CEE_NOP))
{
if (opts.compDbgCode)
{
impSpillStackEnsure(true);
} impCurStmtOffsSet(opcodeOffs);
} assert(impCurStmtOffs == BAD_IL_OFFSET || nxtStmtOffs == BAD_IL_OFFSET ||
jitGetILoffs(impCurStmtOffs) <= nxtStmtOffs);
}
} #endif // defined(DEBUGGING_SUPPORT) || defined(DEBUG) CORINFO_CLASS_HANDLE clsHnd = DUMMY_INIT(NULL);
CORINFO_CLASS_HANDLE ldelemClsHnd = DUMMY_INIT(NULL);
CORINFO_CLASS_HANDLE stelemClsHnd = DUMMY_INIT(NULL); var_types lclTyp, ovflType = TYP_UNKNOWN;
GenTreePtr op1 = DUMMY_INIT(NULL);
GenTreePtr op2 = DUMMY_INIT(NULL);
GenTreeArgList* args = nullptr; // What good do these "DUMMY_INIT"s do?
GenTreePtr newObjThisPtr = DUMMY_INIT(NULL);
bool uns = DUMMY_INIT(false); /* Get the next opcode and the size of its parameters */ OPCODE opcode = (OPCODE)getU1LittleEndian(codeAddr);
codeAddr += sizeof(__int8); #ifdef DEBUG
impCurOpcOffs = (IL_OFFSET)(codeAddr - info.compCode - 1);
JITDUMP("\n [%2u] %3u (0x%03x) ", verCurrentState.esStackDepth, impCurOpcOffs, impCurOpcOffs);
#endif DECODE_OPCODE: // Return if any previous code has caused inline to fail.
if (compDonotInline())
{
return;
} /* Get the size of additional parameters */ signed int sz = opcodeSizes[opcode]; #ifdef DEBUG
clsHnd = NO_CLASS_HANDLE;
lclTyp = TYP_COUNT;
callTyp = TYP_COUNT; impCurOpcOffs = (IL_OFFSET)(codeAddr - info.compCode - 1);
impCurOpcName = opcodeNames[opcode]; if (verbose && (opcode != CEE_PREFIX1))
{
printf("%s", impCurOpcName);
} /* Use assertImp() to display the opcode */ op1 = op2 = nullptr;
#endif /* See what kind of an opcode we have, then */ unsigned mflags = 0;
unsigned clsFlags = 0; switch (opcode)
{
unsigned lclNum;
var_types type; GenTreePtr op3;
genTreeOps oper;
unsigned size; int val; CORINFO_SIG_INFO sig;
unsigned flags;
IL_OFFSET jmpAddr;
bool ovfl, unordered, callNode;
bool ldstruct;
CORINFO_CLASS_HANDLE tokenType; union {
int intVal;
float fltVal;
__int64 lngVal;
double dblVal;
} cval; case CEE_PREFIX1:
opcode = (OPCODE)(getU1LittleEndian(codeAddr) + 256);
codeAddr += sizeof(__int8);
opcodeOffs = (IL_OFFSET)(codeAddr - info.compCode);
goto DECODE_OPCODE; SPILL_APPEND: /* Append 'op1' to the list of statements */
impAppendTree(op1, (unsigned)CHECK_SPILL_ALL, impCurStmtOffs);
goto DONE_APPEND; APPEND: /* Append 'op1' to the list of statements */ impAppendTree(op1, (unsigned)CHECK_SPILL_NONE, impCurStmtOffs);
goto DONE_APPEND; DONE_APPEND: #ifdef DEBUG
// Remember at which BC offset the tree was finished
impNoteLastILoffs();
#endif
break; case CEE_LDNULL:
impPushNullObjRefOnStack();
break; case CEE_LDC_I4_M1:
case CEE_LDC_I4_0:
case CEE_LDC_I4_1:
case CEE_LDC_I4_2:
case CEE_LDC_I4_3:
case CEE_LDC_I4_4:
case CEE_LDC_I4_5:
case CEE_LDC_I4_6:
case CEE_LDC_I4_7:
case CEE_LDC_I4_8:
cval.intVal = (opcode - CEE_LDC_I4_0);
assert(-1 <= cval.intVal && cval.intVal <= 8);
goto PUSH_I4CON; case CEE_LDC_I4_S:
cval.intVal = getI1LittleEndian(codeAddr);
goto PUSH_I4CON;
case CEE_LDC_I4:
cval.intVal = getI4LittleEndian(codeAddr);
goto PUSH_I4CON;
PUSH_I4CON:
JITDUMP(" %d", cval.intVal);
impPushOnStack(gtNewIconNode(cval.intVal), typeInfo(TI_INT));
break; case CEE_LDC_I8:
cval.lngVal = getI8LittleEndian(codeAddr);
JITDUMP(" 0x%016llx", cval.lngVal);
impPushOnStack(gtNewLconNode(cval.lngVal), typeInfo(TI_LONG));
break; case CEE_LDC_R8:
cval.dblVal = getR8LittleEndian(codeAddr);
JITDUMP(" %#.17g", cval.dblVal);
impPushOnStack(gtNewDconNode(cval.dblVal), typeInfo(TI_DOUBLE));
break; case CEE_LDC_R4:
cval.dblVal = getR4LittleEndian(codeAddr);
JITDUMP(" %#.17g", cval.dblVal);
{
GenTreePtr cnsOp = gtNewDconNode(cval.dblVal);
#if !FEATURE_X87_DOUBLES
// X87 stack doesn't differentiate between float/double
// so R4 is treated as R8, but everybody else does
cnsOp->gtType = TYP_FLOAT;
#endif // FEATURE_X87_DOUBLES
impPushOnStack(cnsOp, typeInfo(TI_DOUBLE));
}
break; case CEE_LDSTR: if (compIsForInlining())
{
if (impInlineInfo->inlineCandidateInfo->dwRestrictions & INLINE_NO_CALLEE_LDSTR)
{
compInlineResult->NoteFatal(InlineObservation::CALLSITE_HAS_LDSTR_RESTRICTION);
return;
}
} val = getU4LittleEndian(codeAddr);
JITDUMP(" %08X", val);
if (tiVerificationNeeded)
{
Verify(info.compCompHnd->isValidStringRef(info.compScopeHnd, val), "bad string");
tiRetVal = typeInfo(TI_REF, impGetStringClass());
}
impPushOnStack(gtNewSconNode(val, info.compScopeHnd), tiRetVal); break; case CEE_LDARG:
lclNum = getU2LittleEndian(codeAddr);
JITDUMP(" %u", lclNum);
impLoadArg(lclNum, opcodeOffs + sz + 1);
break; case CEE_LDARG_S:
lclNum = getU1LittleEndian(codeAddr);
JITDUMP(" %u", lclNum);
impLoadArg(lclNum, opcodeOffs + sz + 1);
break; case CEE_LDARG_0:
case CEE_LDARG_1:
case CEE_LDARG_2:
case CEE_LDARG_3:
lclNum = (opcode - CEE_LDARG_0);
assert(lclNum >= 0 && lclNum < 4);
impLoadArg(lclNum, opcodeOffs + sz + 1);
break; case CEE_LDLOC:
lclNum = getU2LittleEndian(codeAddr);
JITDUMP(" %u", lclNum);
impLoadLoc(lclNum, opcodeOffs + sz + 1);
break; case CEE_LDLOC_S:
lclNum = getU1LittleEndian(codeAddr);
JITDUMP(" %u", lclNum);
impLoadLoc(lclNum, opcodeOffs + sz + 1);
break; case CEE_LDLOC_0:
case CEE_LDLOC_1:
case CEE_LDLOC_2:
case CEE_LDLOC_3:
lclNum = (opcode - CEE_LDLOC_0);
assert(lclNum >= 0 && lclNum < 4);
impLoadLoc(lclNum, opcodeOffs + sz + 1);
break; case CEE_STARG:
lclNum = getU2LittleEndian(codeAddr);
goto STARG; case CEE_STARG_S:
lclNum = getU1LittleEndian(codeAddr);
STARG:
JITDUMP(" %u", lclNum); if (tiVerificationNeeded)
{
Verify(lclNum < info.compILargsCount, "bad arg num");
} if (compIsForInlining())
{
op1 = impInlineFetchArg(lclNum, impInlineInfo->inlArgInfo, impInlineInfo->lclVarInfo);
noway_assert(op1->gtOper == GT_LCL_VAR);
lclNum = op1->AsLclVar()->gtLclNum; goto VAR_ST_VALID;
} lclNum = compMapILargNum(lclNum); // account for possible hidden param
assertImp(lclNum < numArgs); if (lclNum == info.compThisArg)
{
lclNum = lvaArg0Var;
}
lvaTable[lclNum].lvArgWrite = 1; if (tiVerificationNeeded)
{
typeInfo& tiLclVar = lvaTable[lclNum].lvVerTypeInfo;
Verify(tiCompatibleWith(impStackTop().seTypeInfo, NormaliseForStack(tiLclVar), true),
"type mismatch"); if (verTrackObjCtorInitState && (verCurrentState.thisInitialized != TIS_Init))
{
Verify(!tiLclVar.IsThisPtr(), "storing to uninit this ptr");
}
} goto VAR_ST; case CEE_STLOC:
lclNum = getU2LittleEndian(codeAddr);
JITDUMP(" %u", lclNum);
goto LOC_ST; case CEE_STLOC_S:
lclNum = getU1LittleEndian(codeAddr);
JITDUMP(" %u", lclNum);
goto LOC_ST; case CEE_STLOC_0:
case CEE_STLOC_1:
case CEE_STLOC_2:
case CEE_STLOC_3:
lclNum = (opcode - CEE_STLOC_0);
assert(lclNum >= 0 && lclNum < 4); LOC_ST:
if (tiVerificationNeeded)
{
Verify(lclNum < info.compMethodInfo->locals.numArgs, "bad local num");
Verify(tiCompatibleWith(impStackTop().seTypeInfo,
NormaliseForStack(lvaTable[lclNum + numArgs].lvVerTypeInfo), true),
"type mismatch");
} if (compIsForInlining())
{
lclTyp = impInlineInfo->lclVarInfo[lclNum + impInlineInfo->argCnt].lclTypeInfo; /* Have we allocated a temp for this local? */ lclNum = impInlineFetchLocal(lclNum DEBUGARG("Inline stloc first use temp")); goto _PopValue;
} lclNum += numArgs; VAR_ST: if (lclNum >= info.compLocalsCount && lclNum != lvaArg0Var)
{
assert(!tiVerificationNeeded); // We should have thrown the VerificationException before.
BADCODE("Bad IL");
} VAR_ST_VALID: /* if it is a struct assignment, make certain we don't overflow the buffer */
assert(lclTyp != TYP_STRUCT || lvaLclSize(lclNum) >= info.compCompHnd->getClassSize(clsHnd)); if (lvaTable[lclNum].lvNormalizeOnLoad())
{
lclTyp = lvaGetRealType(lclNum);
}
else
{
lclTyp = lvaGetActualType(lclNum);
} _PopValue:
/* Pop the value being assigned */ {
StackEntry se = impPopStack(clsHnd);
op1 = se.val;
tiRetVal = se.seTypeInfo;
} #ifdef FEATURE_SIMD
if (varTypeIsSIMD(lclTyp) && (lclTyp != op1->TypeGet()))
{
assert(op1->TypeGet() == TYP_STRUCT);
op1->gtType = lclTyp;
}
#endif // FEATURE_SIMD op1 = impImplicitIorI4Cast(op1, lclTyp); #ifdef _TARGET_64BIT_
// Downcast the TYP_I_IMPL into a 32-bit Int for x86 JIT compatiblity
if (varTypeIsI(op1->TypeGet()) && (genActualType(lclTyp) == TYP_INT))
{
assert(!tiVerificationNeeded); // We should have thrown the VerificationException before.
op1 = gtNewCastNode(TYP_INT, op1, TYP_INT);
}
#endif // _TARGET_64BIT_ // We had better assign it a value of the correct type
assertImp(
genActualType(lclTyp) == genActualType(op1->gtType) ||
genActualType(lclTyp) == TYP_I_IMPL && op1->IsVarAddr() ||
(genActualType(lclTyp) == TYP_I_IMPL && (op1->gtType == TYP_BYREF || op1->gtType == TYP_REF)) ||
(genActualType(op1->gtType) == TYP_I_IMPL && lclTyp == TYP_BYREF) ||
(varTypeIsFloating(lclTyp) && varTypeIsFloating(op1->TypeGet())) ||
((genActualType(lclTyp) == TYP_BYREF) && genActualType(op1->TypeGet()) == TYP_REF)); /* If op1 is "&var" then its type is the transient "*" and it can
be used either as TYP_BYREF or TYP_I_IMPL */ if (op1->IsVarAddr())
{
assertImp(genActualType(lclTyp) == TYP_I_IMPL || lclTyp == TYP_BYREF); /* When "&var" is created, we assume it is a byref. If it is
being assigned to a TYP_I_IMPL var, change the type to
prevent unnecessary GC info */ if (genActualType(lclTyp) == TYP_I_IMPL)
{
op1->gtType = TYP_I_IMPL;
}
} /* Filter out simple assignments to itself */ if (op1->gtOper == GT_LCL_VAR && lclNum == op1->gtLclVarCommon.gtLclNum)
{
if (insertLdloc)
{
// This is a sequence of (ldloc, dup, stloc). Can simplify
// to (ldloc, stloc). Goto LDVAR to reconstruct the ldloc node.
CLANG_FORMAT_COMMENT_ANCHOR; #ifdef DEBUG
if (tiVerificationNeeded)
{
assert(
typeInfo::AreEquivalent(tiRetVal, NormaliseForStack(lvaTable[lclNum].lvVerTypeInfo)));
}
#endif op1 = nullptr;
insertLdloc = false; impLoadVar(lclNum, opcodeOffs + sz + 1);
break;
}
else if (opts.compDbgCode)
{
op1 = gtNewNothingNode();
goto SPILL_APPEND;
}
else
{
break;
}
} /* Create the assignment node */ op2 = gtNewLclvNode(lclNum, lclTyp, opcodeOffs + sz + 1); /* If the local is aliased, we need to spill calls and
indirections from the stack. */ if ((lvaTable[lclNum].lvAddrExposed || lvaTable[lclNum].lvHasLdAddrOp) &&
verCurrentState.esStackDepth > 0)
{
impSpillSideEffects(false, (unsigned)CHECK_SPILL_ALL DEBUGARG("Local could be aliased"));
} /* Spill any refs to the local from the stack */ impSpillLclRefs(lclNum); #if !FEATURE_X87_DOUBLES
// We can generate an assignment to a TYP_FLOAT from a TYP_DOUBLE
// We insert a cast to the dest 'op2' type
//
if ((op1->TypeGet() != op2->TypeGet()) && varTypeIsFloating(op1->gtType) &&
varTypeIsFloating(op2->gtType))
{
op1 = gtNewCastNode(op2->TypeGet(), op1, op2->TypeGet());
}
#endif // !FEATURE_X87_DOUBLES if (varTypeIsStruct(lclTyp))
{
op1 = impAssignStruct(op2, op1, clsHnd, (unsigned)CHECK_SPILL_ALL);
}
else
{
// The code generator generates GC tracking information
// based on the RHS of the assignment. Later the LHS (which is
// is a BYREF) gets used and the emitter checks that that variable
// is being tracked. It is not (since the RHS was an int and did
// not need tracking). To keep this assert happy, we change the RHS
if (lclTyp == TYP_BYREF && !varTypeIsGC(op1->gtType))
{
op1->gtType = TYP_BYREF;
}
op1 = gtNewAssignNode(op2, op1);
} /* If insertLdloc is true, then we need to insert a ldloc following the
stloc. This is done when converting a (dup, stloc) sequence into
a (stloc, ldloc) sequence. */ if (insertLdloc)
{
// From SPILL_APPEND
impAppendTree(op1, (unsigned)CHECK_SPILL_ALL, impCurStmtOffs); #ifdef DEBUG
// From DONE_APPEND
impNoteLastILoffs();
#endif
op1 = nullptr;
insertLdloc = false; impLoadVar(lclNum, opcodeOffs + sz + 1, tiRetVal);
break;
                }
                goto SPILL_APPEND;

            // ...(a large number of cases omitted here)...

            case CEE_NOP:
                if (opts.compDbgCode)
                {
                    op1 = new (this, GT_NO_OP) GenTree(GT_NO_OP, TYP_VOID);
                    goto SPILL_APPEND;
                }
                break;

            /******************************** NYI *******************************/

            case 0xCC:
                OutputDebugStringA("CLR: Invalid x86 breakpoint in IL stream\n");

            case CEE_ILLEGAL:
            case CEE_MACRO_END:
            default:
                BADCODE3("unknown opcode", ": %02X", (int)opcode);
        }

        codeAddr += sz;
        prevOpcode = opcode;
        prefixFlags = 0;
        assert(!insertLdloc || opcode == CEE_DUP);
    }

    assert(!insertLdloc);
    return;
#undef _impResolveToken
}
#ifdef _PREFAST_
#pragma warning(pop)
#endif

First, codeAddr and codeEndp are the start and end addresses of the IL belonging to this block, and opcode is the byte at the current address.

Take ldloc.0 as an example: its binary encoding is 06, and 06 is the opcode CEE_LDLOC_0.

Take ldc.i4.s 100 as another example: its binary encoding is 1f 64, where 1f is the opcode CEE_LDC_I4_S and 64 is the operand, i.e. 100 in hexadecimal.

This function uses a loop to decode the IL instructions in the block's IL range. There are a lot of IL instructions, so I can only pick a few typical ones to explain.
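The shape of that decode loop can be sketched as follows. This is a simplification with a hypothetical three-entry opcode table (`operandSize` and `decode` are invented names); the real JIT reads the operand size from its full `opcodeSizes` table, and the sizes below follow ECMA-335.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical subset of the opcode table: byte value -> inline operand size.
// 0x06 = ldloc.0 (no operand), 0x1f = ldc.i4.s (1-byte operand),
// 0x20 = ldc.i4 (4-byte operand), per ECMA-335 Partition III.
int operandSize(uint8_t op) {
    switch (op) {
        case 0x06: return 0; // ldloc.0
        case 0x1f: return 1; // ldc.i4.s
        case 0x20: return 4; // ldc.i4
        default:   return -1;
    }
}

// Walk an IL byte range the way the import loop does: read one opcode
// byte, then skip sz operand bytes. Returns the decoded opcode bytes.
std::vector<uint8_t> decode(const uint8_t* codeAddr, const uint8_t* codeEndp) {
    std::vector<uint8_t> opcodes;
    while (codeAddr < codeEndp) {
        uint8_t opcode = *codeAddr++;        // getU1LittleEndian analog
        int sz = operandSize(opcode);
        assert(sz >= 0 && "unknown opcode in this toy table");
        opcodes.push_back(opcode);
        codeAddr += sz;                      // skip the inline operand
    }
    return opcodes;
}
```

For example, the bytes `1f 64 06` decode to the two opcodes `ldc.i4.s` (consuming the operand 0x64) and `ldloc.0`.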

The IL instruction ldc.i4.s pushes a constant int (whose value fits in 1 byte) onto the evaluation stack. The code that handles it is:

case CEE_LDC_I4_S:
cval.intVal = getI1LittleEndian(codeAddr);
goto PUSH_I4CON;
case CEE_LDC_I4:
cval.intVal = getI4LittleEndian(codeAddr);
goto PUSH_I4CON;
PUSH_I4CON:
JITDUMP(" %d", cval.intVal);
impPushOnStack(gtNewIconNode(cval.intVal), typeInfo(TI_INT));
break;

We can see that it reads the 1 byte following the instruction (the variant without the s suffix reads 4 bytes), then calls impPushOnStack(gtNewIconNode(cval.intVal), typeInfo(TI_INT)).

The gtNewIconNode function (Icon is short for int constant) creates a GenTree node of type GT_CNS_INT, representing an int constant.

After creating the node, it is pushed onto the evaluation stack. The source code of impPushOnStack is:

/*****************************************************************************
 *
 *  Pushes the given tree on the stack.
 */

void Compiler::impPushOnStack(GenTreePtr tree, typeInfo ti)
{
    /* Check for overflow. If inlining, we may be using a bigger stack */

    if ((verCurrentState.esStackDepth >= info.compMaxStack) &&
        (verCurrentState.esStackDepth >= impStkSize || ((compCurBB->bbFlags & BBF_IMPORTED) == 0)))
    {
        BADCODE("stack overflow");
    }

#ifdef DEBUG
    // If we are pushing a struct, make certain we know the precise type!
    if (tree->TypeGet() == TYP_STRUCT)
    {
        assert(ti.IsType(TI_STRUCT));
        CORINFO_CLASS_HANDLE clsHnd = ti.GetClassHandle();
        assert(clsHnd != NO_CLASS_HANDLE);
    }

    if (tiVerificationNeeded && !ti.IsDead())
    {
        assert(typeInfo::AreEquivalent(NormaliseForStack(ti), ti)); // types are normalized

        // The ti type is consistent with the tree type.
        //
        // On 64-bit systems, nodes whose "proper" type is "native int" get labeled TYP_LONG.
        // In the verification type system, we always transform "native int" to "TI_INT".
        // Ideally, we would keep track of which nodes labeled "TYP_LONG" are really "native int", but
        // attempts to do that have proved too difficult. Instead, we'll assume that in checks like this,
        // when there's a mismatch, it's because of this reason -- the typeInfo::AreEquivalentModuloNativeInt
        // method used in the last disjunct allows exactly this mismatch.
        assert(ti.IsDead() || ti.IsByRef() && (tree->TypeGet() == TYP_I_IMPL || tree->TypeGet() == TYP_BYREF) ||
               ti.IsUnboxedGenericTypeVar() && tree->TypeGet() == TYP_REF ||
               ti.IsObjRef() && tree->TypeGet() == TYP_REF || ti.IsMethod() && tree->TypeGet() == TYP_I_IMPL ||
               ti.IsType(TI_STRUCT) && tree->TypeGet() != TYP_REF ||
               typeInfo::AreEquivalentModuloNativeInt(NormaliseForStack(ti),
                                                      NormaliseForStack(typeInfo(tree->TypeGet()))));

        // If it is a struct type, make certain we normalized the primitive types
        assert(!ti.IsType(TI_STRUCT) ||
               info.compCompHnd->getTypeForPrimitiveValueClass(ti.GetClassHandle()) == CORINFO_TYPE_UNDEF);
    }

#if VERBOSE_VERIFY
    if (VERBOSE && tiVerificationNeeded)
    {
        printf("\n");
        printf(TI_DUMP_PADDING);
        printf("About to push to stack: ");
        ti.Dump();
    }
#endif // VERBOSE_VERIFY

#endif // DEBUG

    verCurrentState.esStack[verCurrentState.esStackDepth].seTypeInfo = ti;
    verCurrentState.esStack[verCurrentState.esStackDepth++].val      = tree;

    if ((tree->gtType == TYP_LONG) && (compLongUsed == false))
    {
        compLongUsed = true;
    }
    else if (((tree->gtType == TYP_FLOAT) || (tree->gtType == TYP_DOUBLE)) && (compFloatingPointUsed == false))
    {
        compFloatingPointUsed = true;
    }
}

impPushOnStack appends the GenTree node to the evaluation stack verCurrentState.esStack, storing both the type information and the GT_CNS_INT node we just created.

Suppose the instruction after ldc.i4.s 100 is stloc.0, which assigns 100 to local variable 0; that stloc.0 instruction then needs to consume the value pushed before it.

Let's see how CEE_STLOC_0 is handled:

case CEE_STLOC_0:
case CEE_STLOC_1:
case CEE_STLOC_2:
case CEE_STLOC_3:
lclNum = (opcode - CEE_STLOC_0);
assert(lclNum >= 0 && lclNum < 4); LOC_ST:
if (tiVerificationNeeded)
{
Verify(lclNum < info.compMethodInfo->locals.numArgs, "bad local num");
Verify(tiCompatibleWith(impStackTop().seTypeInfo,
NormaliseForStack(lvaTable[lclNum + numArgs].lvVerTypeInfo), true),
"type mismatch");
} if (compIsForInlining())
{
lclTyp = impInlineInfo->lclVarInfo[lclNum + impInlineInfo->argCnt].lclTypeInfo; /* Have we allocated a temp for this local? */ lclNum = impInlineFetchLocal(lclNum DEBUGARG("Inline stloc first use temp")); goto _PopValue;
} lclNum += numArgs; VAR_ST: if (lclNum >= info.compLocalsCount && lclNum != lvaArg0Var)
{
assert(!tiVerificationNeeded); // We should have thrown the VerificationException before.
BADCODE("Bad IL");
} VAR_ST_VALID: /* if it is a struct assignment, make certain we don't overflow the buffer */
assert(lclTyp != TYP_STRUCT || lvaLclSize(lclNum) >= info.compCompHnd->getClassSize(clsHnd)); if (lvaTable[lclNum].lvNormalizeOnLoad())
{
lclTyp = lvaGetRealType(lclNum);
}
else
{
lclTyp = lvaGetActualType(lclNum);
} _PopValue:
/* Pop the value being assigned */ {
StackEntry se = impPopStack(clsHnd);
op1 = se.val;
tiRetVal = se.seTypeInfo;
} #ifdef FEATURE_SIMD
if (varTypeIsSIMD(lclTyp) && (lclTyp != op1->TypeGet()))
{
assert(op1->TypeGet() == TYP_STRUCT);
op1->gtType = lclTyp;
}
#endif // FEATURE_SIMD op1 = impImplicitIorI4Cast(op1, lclTyp); #ifdef _TARGET_64BIT_
// Downcast the TYP_I_IMPL into a 32-bit Int for x86 JIT compatiblity
if (varTypeIsI(op1->TypeGet()) && (genActualType(lclTyp) == TYP_INT))
{
assert(!tiVerificationNeeded); // We should have thrown the VerificationException before.
op1 = gtNewCastNode(TYP_INT, op1, TYP_INT);
}
#endif // _TARGET_64BIT_ // We had better assign it a value of the correct type
assertImp(
genActualType(lclTyp) == genActualType(op1->gtType) ||
genActualType(lclTyp) == TYP_I_IMPL && op1->IsVarAddr() ||
(genActualType(lclTyp) == TYP_I_IMPL && (op1->gtType == TYP_BYREF || op1->gtType == TYP_REF)) ||
(genActualType(op1->gtType) == TYP_I_IMPL && lclTyp == TYP_BYREF) ||
(varTypeIsFloating(lclTyp) && varTypeIsFloating(op1->TypeGet())) ||
((genActualType(lclTyp) == TYP_BYREF) && genActualType(op1->TypeGet()) == TYP_REF)); /* If op1 is "&var" then its type is the transient "*" and it can
be used either as TYP_BYREF or TYP_I_IMPL */ if (op1->IsVarAddr())
{
assertImp(genActualType(lclTyp) == TYP_I_IMPL || lclTyp == TYP_BYREF); /* When "&var" is created, we assume it is a byref. If it is
being assigned to a TYP_I_IMPL var, change the type to
prevent unnecessary GC info */ if (genActualType(lclTyp) == TYP_I_IMPL)
{
op1->gtType = TYP_I_IMPL;
}
} /* Filter out simple assignments to itself */ if (op1->gtOper == GT_LCL_VAR && lclNum == op1->gtLclVarCommon.gtLclNum)
{
if (insertLdloc)
{
// This is a sequence of (ldloc, dup, stloc). Can simplify
// to (ldloc, stloc). Goto LDVAR to reconstruct the ldloc node.
CLANG_FORMAT_COMMENT_ANCHOR; #ifdef DEBUG
if (tiVerificationNeeded)
{
assert(
typeInfo::AreEquivalent(tiRetVal, NormaliseForStack(lvaTable[lclNum].lvVerTypeInfo)));
}
#endif op1 = nullptr;
insertLdloc = false; impLoadVar(lclNum, opcodeOffs + sz + 1);
break;
}
else if (opts.compDbgCode)
{
op1 = gtNewNothingNode();
goto SPILL_APPEND;
}
else
{
break;
}
} /* Create the assignment node */ op2 = gtNewLclvNode(lclNum, lclTyp, opcodeOffs + sz + 1); /* If the local is aliased, we need to spill calls and
indirections from the stack. */ if ((lvaTable[lclNum].lvAddrExposed || lvaTable[lclNum].lvHasLdAddrOp) &&
verCurrentState.esStackDepth > 0)
{
impSpillSideEffects(false, (unsigned)CHECK_SPILL_ALL DEBUGARG("Local could be aliased"));
} /* Spill any refs to the local from the stack */ impSpillLclRefs(lclNum); #if !FEATURE_X87_DOUBLES
// We can generate an assignment to a TYP_FLOAT from a TYP_DOUBLE
// We insert a cast to the dest 'op2' type
//
if ((op1->TypeGet() != op2->TypeGet()) && varTypeIsFloating(op1->gtType) &&
varTypeIsFloating(op2->gtType))
{
op1 = gtNewCastNode(op2->TypeGet(), op1, op2->TypeGet());
}
#endif // !FEATURE_X87_DOUBLES if (varTypeIsStruct(lclTyp))
{
op1 = impAssignStruct(op2, op1, clsHnd, (unsigned)CHECK_SPILL_ALL);
}
else
{
// The code generator generates GC tracking information
// based on the RHS of the assignment. Later the LHS (which is
// is a BYREF) gets used and the emitter checks that that variable
// is being tracked. It is not (since the RHS was an int and did
// not need tracking). To keep this assert happy, we change the RHS
if (lclTyp == TYP_BYREF && !varTypeIsGC(op1->gtType))
{
op1->gtType = TYP_BYREF;
}
op1 = gtNewAssignNode(op2, op1);
} /* If insertLdloc is true, then we need to insert a ldloc following the
stloc. This is done when converting a (dup, stloc) sequence into
a (stloc, ldloc) sequence. */ if (insertLdloc)
{
// From SPILL_APPEND
impAppendTree(op1, (unsigned)CHECK_SPILL_ALL, impCurStmtOffs); #ifdef DEBUG
// From DONE_APPEND
impNoteLastILoffs();
#endif
op1 = nullptr;
insertLdloc = false; impLoadVar(lclNum, opcodeOffs + sz + 1, tiRetVal);
break;
} goto SPILL_APPEND; SPILL_APPEND: /* Append 'op1' to the list of statements */
impAppendTree(op1, (unsigned)CHECK_SPILL_ALL, impCurStmtOffs);
goto DONE_APPEND; DONE_APPEND: #ifdef DEBUG
// Remember at which BC offset the tree was finished
impNoteLastILoffs();
#endif
break;

The code handling CEE_STLOC_0 is a bit long, please bear with it.

First, the instructions for locals 0~3 share the same handling: stloc.0 is 0a, stloc.1 is 0b, stloc.2 is 0c and stloc.3 is 0d.

After obtaining the number of the local variable being stored, we still need to know its index in the local variable table lvaTable; since the arguments are stored at the beginning of that table, the index is computed by lclNum += numArgs.

Then an assignment (GT_ASG) node is created. The assignment node has two operands: the first is lclVar 0 and the second is const 100 (the types match, so no cast is needed), like this:

   /--*  const     int    100
\--* = int
\--* lclVar int V01
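The ldc.i4.s plus stloc.0 sequence above can be modeled with a toy importer. This is only a sketch: the string node kinds stand in for the real GT_* enums, and `newNode`, `importLdcI4` and `importStloc` are invented analogs of gtNewIconNode, gtNewLclvNode and gtNewAssignNode.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// A much-simplified GenTree: just enough to mimic how the importer turns
// "ldc.i4.s 100; stloc.0" into an assignment tree.
struct GenTree {
    std::string oper;        // "CNS_INT", "LCL_VAR" or "ASG"
    int value = 0;           // constant value or local number
    GenTree* op1 = nullptr;  // destination for ASG
    GenTree* op2 = nullptr;  // source for ASG
};

std::vector<std::unique_ptr<GenTree>> nodes;
std::vector<GenTree*> evalStack; // the importer's evaluation stack

GenTree* newNode(std::string oper, int value = 0) {
    nodes.push_back(std::make_unique<GenTree>());
    nodes.back()->oper = std::move(oper);
    nodes.back()->value = value;
    return nodes.back().get();
}

// ldc.i4.s 100 -> push a constant node (gtNewIconNode analog)
void importLdcI4(int v) { evalStack.push_back(newNode("CNS_INT", v)); }

// stloc.0 -> pop the value, build the lclVar destination, wrap both in ASG
GenTree* importStloc(int lclNum) {
    GenTree* src = evalStack.back();
    evalStack.pop_back();
    GenTree* asg = newNode("ASG");
    asg->op1 = newNode("LCL_VAR", lclNum); // gtNewLclvNode analog
    asg->op2 = src;                        // gtNewAssignNode analog
    return asg;
}
```

Running the two import steps leaves the stack empty and produces exactly the shape of the tree shown above: an assignment whose destination is the local and whose source is the constant.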

We have now created a GenTree tree. This tree forms a single statement, and we can append that statement to the BasicBlock.

Appending is done by impAppendTree(op1, (unsigned)CHECK_SPILL_ALL, impCurStmtOffs):

/*****************************************************************************
 *
 *  Append the given expression tree to the current block's tree list.
 *  Return the newly created statement.
 */

GenTreePtr Compiler::impAppendTree(GenTreePtr tree, unsigned chkLevel, IL_OFFSETX offset)
{
    assert(tree);

    /* Allocate an 'expression statement' node */

    GenTreePtr expr = gtNewStmt(tree, offset);

    /* Append the statement to the current block's stmt list */

    impAppendStmt(expr, chkLevel);

    return expr;
}

/*****************************************************************************
 *
 *  Append the given GT_STMT node to the current block's tree list.
 *  [0..chkLevel) is the portion of the stack which we will check for
 *    interference with stmt and spill if needed.
 */

inline void Compiler::impAppendStmt(GenTreePtr stmt, unsigned chkLevel)
{
    assert(stmt->gtOper == GT_STMT);
    noway_assert(impTreeLast != nullptr);

    /* If the statement being appended has any side-effects, check the stack
       to see if anything needs to be spilled to preserve correct ordering. */

    GenTreePtr expr  = stmt->gtStmt.gtStmtExpr;
    unsigned   flags = expr->gtFlags & GTF_GLOB_EFFECT;

    // Assignment to (unaliased) locals don't count as a side-effect as
    // we handle them specially using impSpillLclRefs(). Temp locals should
    // be fine too.
    // TODO-1stClassStructs: The check below should apply equally to struct assignments,
    // but previously the block ops were always being marked GTF_GLOB_REF, even if
    // the operands could not be global refs.
    if ((expr->gtOper == GT_ASG) && (expr->gtOp.gtOp1->gtOper == GT_LCL_VAR) &&
        !(expr->gtOp.gtOp1->gtFlags & GTF_GLOB_REF) && !gtHasLocalsWithAddrOp(expr->gtOp.gtOp2) &&
        !varTypeIsStruct(expr->gtOp.gtOp1))
    {
        unsigned op2Flags = expr->gtOp.gtOp2->gtFlags & GTF_GLOB_EFFECT;
        assert(flags == (op2Flags | GTF_ASG));
        flags = op2Flags;
    }

    if (chkLevel == (unsigned)CHECK_SPILL_ALL)
    {
        chkLevel = verCurrentState.esStackDepth;
    }

    if (chkLevel && chkLevel != (unsigned)CHECK_SPILL_NONE)
    {
        assert(chkLevel <= verCurrentState.esStackDepth);

        if (flags)
        {
            // If there is a call, we have to spill global refs
            bool spillGlobEffects = (flags & GTF_CALL) ? true : false;

            if (expr->gtOper == GT_ASG)
            {
                GenTree* lhs = expr->gtGetOp1();
                // If we are assigning to a global ref, we have to spill global refs on stack.
                // TODO-1stClassStructs: Previously, spillGlobEffects was set to true for
                // GT_INITBLK and GT_COPYBLK, but this is overly conservative, and should be
                // revisited. (Note that it was NOT set to true for GT_COPYOBJ.)
                if (!expr->OperIsBlkOp())
                {
                    // If we are assigning to a global ref, we have to spill global refs on stack
                    if ((lhs->gtFlags & GTF_GLOB_REF) != 0)
                    {
                        spillGlobEffects = true;
                    }
                }
                else if ((lhs->OperIsBlk() && !lhs->AsBlk()->HasGCPtr()) ||
                         ((lhs->OperGet() == GT_LCL_VAR) &&
                          (lvaTable[lhs->AsLclVarCommon()->gtLclNum].lvStructGcCount == 0)))
                {
                    spillGlobEffects = true;
                }
            }

            impSpillSideEffects(spillGlobEffects, chkLevel DEBUGARG("impAppendStmt"));
        }
        else
        {
            impSpillSpecialSideEff();
        }
    }

    impAppendStmtCheck(stmt, chkLevel);

    /* Point 'prev' at the previous node, so that we can walk backwards */

    stmt->gtPrev = impTreeLast;

    /* Append the expression statement to the list */

    impTreeLast->gtNext = stmt;
    impTreeLast         = stmt;

#ifdef FEATURE_SIMD
    impMarkContiguousSIMDFieldAssignments(stmt);
#endif

#ifdef DEBUGGING_SUPPORT
    /* Once we set impCurStmtOffs in an appended tree, we are ready to
       report the following offsets. So reset impCurStmtOffs */

    if (impTreeLast->gtStmt.gtStmtILoffsx == impCurStmtOffs)
    {
        impCurStmtOffsSet(BAD_IL_OFFSET);
    }
#endif

#ifdef DEBUG
    if (impLastILoffsStmt == nullptr)
    {
        impLastILoffsStmt = stmt;
    }

    if (verbose)
    {
        printf("\n\n");
        gtDispTree(stmt);
    }
#endif
}

This code appends a GT_STMT node to the current impTreeLast linked list; later, in impEndTreeList, this list is assigned to block->bbTreeList.
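The list-linking part of impAppendStmt is just a doubly-linked list append via gtPrev/gtNext, with impTreeLast tracking the tail. A minimal sketch (Stmt and StmtList are invented stand-ins; the real code asserts impTreeLast is non-null because impBeginTreeList seeds the list first):

```cpp
#include <cassert>

// Stmt stands in for a GT_STMT node; only the link fields are modeled.
struct Stmt {
    Stmt* gtPrev = nullptr;
    Stmt* gtNext = nullptr;
    int   id     = 0;
};

struct StmtList {
    Stmt* first       = nullptr;
    Stmt* impTreeLast = nullptr; // tail pointer, as on the Compiler object

    void append(Stmt* stmt) {
        stmt->gtPrev = impTreeLast;     // point 'prev' at the previous node
        if (impTreeLast != nullptr) {
            impTreeLast->gtNext = stmt; // append to the list
        } else {
            first = stmt;               // simplification: real code seeds the list
        }
        impTreeLast = stmt;
    }
};
```

The gtPrev links are what let later phases walk the statement list backwards.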

The content of the GT_STMT node is:

*  stmtExpr  void
| /--* const int 100
\--* = int
\--* lclVar int V01

As you can see, the original assignment node GT_ASG is now placed under the GT_STMT node.

Microsoft provides a diagram of the Compiler, BasicBlock and GenTree structures (HIR version):

(image omitted)

This covered the parsing of the two simplest instructions, ldc.i4.s and stloc.0; if you are interested you can analyze more instruction types yourself.

We can now see that in the JIT the evaluation stack is used to link individual instructions together so that they form GenTree trees; the generated machine code will not literally contain an evaluation stack.

After the current block has been processed, its successor blocks are added to the queue impPendingList:

for (unsigned i = 0; i < block->NumSucc(); i++)
{
impImportBlockPending(block->GetSucc(i));
}

After all blocks have been processed, each BasicBlock holds a linked list of statements (GT_STMT), and under each statement there is a GenTree tree.
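The import driver as a whole is essentially a worklist algorithm over the flowgraph. A minimal sketch (BB and importAll are invented names; the real code uses impPendingList and the BBF_IMPORTED flag):

```cpp
#include <cassert>
#include <queue>
#include <vector>

// fgImport's driver in miniature: import a block once, then queue its
// not-yet-imported successors, until the pending list drains.
struct BB {
    bool imported = false; // BBF_IMPORTED analog
    std::vector<BB*> succs;
};

int importAll(BB* entry) {
    int importedCount = 0;
    std::queue<BB*> pending; // impPendingList analog
    pending.push(entry);
    while (!pending.empty()) {
        BB* b = pending.front();
        pending.pop();
        if (b->imported) {
            continue;        // a block may be queued more than once
        }
        b->imported = true;  // impImportBlockCode would run here
        importedCount++;
        for (BB* s : b->succs) {
            if (!s->imported) {
                pending.push(s); // impImportBlockPending analog
            }
        }
    }
    return importedCount;
}
```

On a diamond-shaped flowgraph (entry, two branches, a join block) every block is imported exactly once even though the join block is queued twice.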

An example of fgImport's output:

(image omitted)

PHASE_POST_IMPORT

This phase is responsible for some work following the import of HIR (GenTree) from IL, and consists of the following code:

    // Maybe the caller was not interested in generating code
    if (compIsForImportOnly())
    {
        compFunctionTraceEnd(nullptr, 0, false);
        return;
    }

#if !FEATURE_EH
    // If we aren't yet supporting EH in a compiler bring-up, remove as many EH handlers as possible, so
    // we can pass tests that contain try/catch EH, but don't actually throw any exceptions.
    fgRemoveEH();
#endif // !FEATURE_EH

    if (compileFlags->corJitFlags & CORJIT_FLG_BBINSTR)
    {
        fgInstrumentMethod();
    }

    // We could allow ESP frames. Just need to reserve space for
    // pushing EBP if the method becomes an EBP-frame after an edit.
    // Note that requiring a EBP Frame disallows double alignment. Thus if we change this
    // we either have to disallow double alignment for E&C some other way or handle it in EETwain.

    if (opts.compDbgEnC)
    {
        codeGen->setFramePointerRequired(true);

        // Since we need a slots for security near ebp, its not possible
        // to do this after an Edit without shifting all the locals.
        // So we just always reserve space for these slots in case an Edit adds them
        opts.compNeedSecurityCheck = true;

        // We don't care about localloc right now. If we do support it,
        // EECodeManager::FixContextForEnC() needs to handle it smartly
        // in case the localloc was actually executed.
        //
        // compLocallocUsed = true;
    }

    EndPhase(PHASE_POST_IMPORT);

This phase takes care of various odds and ends after import.

If the caller only needs to verify that the method's IL is valid, compilation is invoked with CORJIT_FLG_IMPORT_ONLY, and there is no need to continue past the import phase.

fgInstrumentMethod inserts the statements needed by the profiler; I won't analyze it in detail here.

When opts.compDbgEnC is enabled, the IL assembly was compiled with the Debug configuration; here the JIT marks that a frame pointer and a security check are required.

(x64 allows a function not to use the rbp register to save the stack address from before entering the function; this frees up one extra register and produces more efficient code, but makes debugging harder.)

PHASE_MORPH

The import phase only performs a straightforward translation from IL to HIR, and the resulting HIR still needs further processing.

This phase is responsible for that processing of the HIR, and consists of the following code:

/* Initialize the BlockSet epoch */

NewBasicBlockEpoch();

/* Massage the trees so that we can generate code out of them */

fgMorph();
EndPhase(PHASE_MORPH);

NewBasicBlockEpoch updates the epoch of the current BasicBlock set (fgCurBBEpoch); this value identifies the current version of the set of BasicBlocks.
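The epoch idea can be sketched like this: rather than walking and clearing every cached block set when the block list changes, the compiler bumps fgCurBBEpoch, and any set stamped with an older epoch is treated as stale. (CompilerSketch is an invented stand-in; the real BlockSet also holds a bit vector over block numbers, omitted here.)

```cpp
#include <cassert>

// A set of blocks, stamped with the epoch it was created under.
struct BlockSet {
    unsigned epoch = 0;
};

struct CompilerSketch {
    unsigned fgCurBBEpoch = 1;

    // Called when the BasicBlock list changes shape.
    void NewBasicBlockEpoch() { fgCurBBEpoch++; }

    BlockSet makeSet() const { return BlockSet{fgCurBBEpoch}; }

    // A set is only usable if it was built under the current epoch.
    bool isCurrent(const BlockSet& s) const { return s.epoch == fgCurBBEpoch; }
};
```

This makes invalidating all cached sets an O(1) increment instead of a walk over every cached set.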

fgMorph contains the main processing of this phase; its source is as follows:

/*****************************************************************************
 *
 *  Transform all basic blocks for codegen.
 */

void Compiler::fgMorph()
{
    noway_assert(!compIsForInlining()); // Inlinee's compiler should never reach here.

    fgOutgoingArgTemps = nullptr;

#ifdef DEBUG
    if (verbose)
    {
        printf("*************** In fgMorph()\n");
    }
    if (verboseTrees)
    {
        fgDispBasicBlocks(true);
    }
#endif // DEBUG

    // Insert call to class constructor as the first basic block if
    // we were asked to do so.
    if (info.compCompHnd->initClass(nullptr /* field */, info.compMethodHnd /* method */,
                                    impTokenLookupContextHandle /* context */) &
        CORINFO_INITCLASS_USE_HELPER)
    {
        fgEnsureFirstBBisScratch();
        fgInsertStmtAtBeg(fgFirstBB, fgInitThisClass());
    }

#ifdef DEBUG
    if (opts.compGcChecks)
    {
        for (unsigned i = 0; i < info.compArgsCount; i++)
        {
            if (lvaTable[i].TypeGet() == TYP_REF)
            {
                // confirm that the argument is a GC pointer (for debugging (GC stress))
                GenTreePtr      op   = gtNewLclvNode(i, TYP_REF);
                GenTreeArgList* args = gtNewArgList(op);
                op                   = gtNewHelperCallNode(CORINFO_HELP_CHECK_OBJ, TYP_VOID, 0, args);

                fgEnsureFirstBBisScratch();
                fgInsertStmtAtEnd(fgFirstBB, op);
            }
        }
    }

    if (opts.compStackCheckOnRet)
    {
        lvaReturnEspCheck                  = lvaGrabTempWithImplicitUse(false DEBUGARG("ReturnEspCheck"));
        lvaTable[lvaReturnEspCheck].lvType = TYP_INT;
    }

    if (opts.compStackCheckOnCall)
    {
        lvaCallEspCheck                  = lvaGrabTempWithImplicitUse(false DEBUGARG("CallEspCheck"));
        lvaTable[lvaCallEspCheck].lvType = TYP_INT;
    }
#endif // DEBUG

    /* Filter out unimported BBs */

    fgRemoveEmptyBlocks();

    /* Add any internal blocks/trees we may need */

    fgAddInternal();

#if OPT_BOOL_OPS
    fgMultipleNots = false;
#endif

#ifdef DEBUG
    /* Inliner could add basic blocks. Check that the flowgraph data is up-to-date */
    fgDebugCheckBBlist(false, false);
#endif // DEBUG

    /* Inline */
    fgInline();
#if 0
    JITDUMP("trees after inlining\n");
    DBEXEC(VERBOSE, fgDispBasicBlocks(true));
#endif

    RecordStateAtEndOfInlining(); // Record "start" values for post-inlining cycles and elapsed time.

#ifdef DEBUG
    /* Inliner could add basic blocks. Check that the flowgraph data is up-to-date */
    fgDebugCheckBBlist(false, false);
#endif // DEBUG

    /* For x64 and ARM64 we need to mark irregular parameters early so that they don't get promoted */
    fgMarkImplicitByRefArgs();

    /* Promote struct locals if necessary */
    fgPromoteStructs();

    /* Now it is the time to figure out what locals have address-taken. */
    fgMarkAddressExposedLocals();

#ifdef DEBUG
    /* Now that locals have address-taken marked, we can safely apply stress. */
    lvaStressLclFld();
    fgStress64RsltMul();
#endif // DEBUG

    /* Morph the trees in all the blocks of the method */

    fgMorphBlocks();

#if 0
    JITDUMP("trees after fgMorphBlocks\n");
    DBEXEC(VERBOSE, fgDispBasicBlocks(true));
#endif

    /* Decide the kind of code we want to generate */

    fgSetOptions();

    fgExpandQmarkNodes();

#ifdef DEBUG
    compCurBB = nullptr;
#endif // DEBUG
}

The steps in this function are as follows:

fgInsertStmtAtBeg(fgFirstBB, fgInitThisClass());

If the type requires dynamic initialization (it is generic and has a static constructor), insert code that calls JIT_ClassInitDynamicClass into the first block.

fgRemoveEmptyBlocks

Enumerates all blocks that were not imported (that is, the code in those blocks is unreachable) and removes them;

if any were removed, updates the block numbering and the epoch.

fgAddInternal:

Adds internal BasicBlocks and GenTrees.

First, if the method is not static and the address of the this variable needs to be passed out (ref) or the variable is modified, an internal local variable (lvaArg0Var) is needed to store the value of this.

If the method requires a security check (compNeedSecurityCheck), a temporary variable (lvaSecurityObject) is added.

If the current platform is not x86 (32-bit), generate code for synchronized methods: JIT_MonEnterWorker is called on entry and JIT_MonExitWorker on exit.

Determines whether only a single return block should be generated (for example, for methods containing pinvoke, methods that call unmanaged code, or synchronized methods).

If only a single return block is needed, a BasicBlock used for merging and a local variable used to store the return value are added; at this point the other return blocks are not yet redirected to the new block.

If the method calls unmanaged functions, a temporary variable (lvaInlinedPInvokeFrameVar) is added.

If JustMyCode is enabled, if (*pFlag != 0) { JIT_DbgIsJustMyCode() } is added to the first block; note that the node here is a QMARK (?