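Note: this article is based on Android 8.1.
ART Object Allocation Explained: The Preparation Stage of Memory Allocation
This chapter analyzes how the ART virtual machine in Android 8.1 allocates memory when an object is created. This section covers the setup that allocation depends on and the jump logic that leads into the allocator.
We start with the Thread class.
The Thread class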
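Thread::Init() performs all per-thread initialization. For example, InitCpu() initializes CPU-related state, and the member function InitTlsEntryPoints() initializes the jump tables used to call into runtime library functions. Earlier ART releases divided these tables into four: interpreter_entrypoints_ for the interpreter, jni_entrypoints_ for JNI calls, portable_entrypoints_ for native code generated by the Portable backend, and quick_entrypoints_ for native code generated by the Quick backend. In the Android 8.1 code shown in this section, InitTlsEntryPoints() fills only two of them, tlsPtr_.jni_entrypoints and tlsPtr_.quick_entrypoints. Each entry is reached through its offset from the Thread object.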
The Thread::Init() method:
bool Thread::Init(ThreadList* thread_list, JavaVMExt* java_vm, JNIEnvExt* jni_env_ext) {
  // This function does all the initialization that must be run by the native thread it applies to.
  // (When we create a new thread from managed code, we allocate the Thread* in Thread::Create so
  // we can handshake with the corresponding native thread when it's ready.) Check this native
  // thread hasn't been through here already...
  CHECK(Thread::Current() == nullptr);
  // Set pthread_self_ ahead of pthread_setspecific, that makes Thread::Current function, this
  // avoids pthread_self_ ever being invalid when discovered from Thread::Current().
  tlsPtr_.pthread_self = pthread_self();
  CHECK(is_started_);
  SetUpAlternateSignalStack();
  if (!InitStackHwm()) {
    return false;
  }
  InitCpu();
  InitTlsEntryPoints();
  RemoveSuspendTrigger();
  InitCardTable();
  InitTid();
  interpreter::InitInterpreterTls(this);
  ……
  thread_list->Register(this);
  return true;
}
The Thread::InitTlsEntryPoints() method:
void Thread::InitTlsEntryPoints() {
  // Insert a placeholder so we can easily tell if we call an unimplemented entry point.
  uintptr_t* begin = reinterpret_cast<uintptr_t*>(&tlsPtr_.jni_entrypoints);
  uintptr_t* end = reinterpret_cast<uintptr_t*>(
      reinterpret_cast<uint8_t*>(&tlsPtr_.quick_entrypoints) + sizeof(tlsPtr_.quick_entrypoints));
  for (uintptr_t* it = begin; it != end; ++it) {
    *it = reinterpret_cast<uintptr_t>(UnimplementedEntryPoint);
  }
  InitEntryPoints(&tlsPtr_.jni_entrypoints, &tlsPtr_.quick_entrypoints);
}
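The two-step pattern in InitTlsEntryPoints() (fill every slot with a placeholder, then install the real targets) can be illustrated with a minimal sketch. All names here (DemoThread, Slot, InitDemoEntryPoints, CallEntry) are hypothetical and not part of ART, and the sketch uses a std::array instead of ART's raw pointer walk over two adjacent structs:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical slot indices; ART addresses its slots by byte offset from Thread.
enum Slot : std::size_t { kAllocObject = 0, kAllocArray, kAllocString, kSlotCount };

using EntryFn = int (*)();

// Hypothetical stand-in for the per-thread entry point table.
struct DemoThread {
  std::array<EntryFn, kSlotCount> entrypoints;
};

// Placeholder target: makes a call through an uninitialized slot easy to spot.
int UnimplementedEntry() { return -1; }
int DemoAllocObject() { return 1; }
int DemoAllocArray() { return 2; }

void InitDemoEntryPoints(DemoThread* thread) {
  // Step 1: every slot points at the placeholder.
  thread->entrypoints.fill(&UnimplementedEntry);
  // Step 2: slots with a real implementation are overwritten.
  thread->entrypoints[kAllocObject] = &DemoAllocObject;
  thread->entrypoints[kAllocArray] = &DemoAllocArray;
}

// Calls go through the table, indexed by the slot's position.
int CallEntry(const DemoThread& thread, Slot slot) {
  return thread.entrypoints[slot]();
}
```

In this sketch a call through kAllocString, which was never installed, lands in the placeholder rather than jumping to garbage.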
The entrypoints directory
Thread::InitTlsEntryPoints() calls InitEntryPoints() and passes in the addresses of the two tables. The implementation of InitEntryPoints() depends on the device's CPU architecture. Here is the ARM64 version (/art/runtime/arch/arm64/entrypoints_init_arm64.cc):
void InitEntryPoints(JniEntryPoints* jpoints, QuickEntryPoints* qpoints) {
  DefaultInitEntryPoints(jpoints, qpoints);
  ……
}
It calls DefaultInitEntryPoints() (/art/runtime/entrypoints/quick/quick_default_init_entrypoints.h):
static void DefaultInitEntryPoints(JniEntryPoints* jpoints, QuickEntryPoints* qpoints) {
  // JNI
  jpoints->pDlsymLookup = art_jni_dlsym_lookup_stub;
  // Alloc
  ResetQuickAllocEntryPoints(qpoints, /* is_marking */ true);
  ……
}
We only look at the Alloc part, which goes on to call ResetQuickAllocEntryPoints().
Location: /art/runtime/entrypoints/quick/quick_alloc_entrypoints.cc
static gc::AllocatorType entry_points_allocator = gc::kAllocatorTypeDlMalloc;
void SetQuickAllocEntryPointsAllocator(gc::AllocatorType allocator) {
  entry_points_allocator = allocator;
}
void ResetQuickAllocEntryPoints(QuickEntryPoints* qpoints, bool is_marking) {
#if !defined(__APPLE__) || !defined(__LP64__)
  switch (entry_points_allocator) {
    case gc::kAllocatorTypeDlMalloc: {
      SetQuickAllocEntryPoints_dlmalloc(qpoints, entry_points_instrumented);
      return;
    }
    case gc::kAllocatorTypeRosAlloc: {
      SetQuickAllocEntryPoints_rosalloc(qpoints, entry_points_instrumented);
      return;
    }
    case gc::kAllocatorTypeBumpPointer: {
      CHECK(kMovingCollector);
      SetQuickAllocEntryPoints_bump_pointer(qpoints, entry_points_instrumented);
      return;
    }
    case gc::kAllocatorTypeTLAB: {
      CHECK(kMovingCollector);
      SetQuickAllocEntryPoints_tlab(qpoints, entry_points_instrumented);
      return;
    }
    case gc::kAllocatorTypeRegion: {
      CHECK(kMovingCollector);
      SetQuickAllocEntryPoints_region(qpoints, entry_points_instrumented);
      return;
    }
    case gc::kAllocatorTypeRegionTLAB: {
      CHECK(kMovingCollector);
      if (is_marking) {
        SetQuickAllocEntryPoints_region_tlab(qpoints, entry_points_instrumented);
      } else {
        // Not marking means we need no read barriers and can just use the normal TLAB case.
        SetQuickAllocEntryPoints_tlab(qpoints, entry_points_instrumented);
      }
      return;
    }
    default:
      break;
  }
#else
  UNUSED(qpoints);
  UNUSED(is_marking);
#endif
  UNIMPLEMENTED(FATAL);
  UNREACHABLE();
}
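entry_points_allocator holds the allocator type. Its initial value, kAllocatorTypeDlMalloc, selects the DlMalloc entry points. The value can be changed by calling SetQuickAllocEntryPointsAllocator(). In most cases the value is kAllocatorTypeRosAlloc, although this depends on the collector in use: the CHECK(kMovingCollector) lines in ResetQuickAllocEntryPoints() show that the bump-pointer, TLAB, and region allocators are only valid with a moving collector.
SetQuickAllocEntryPointsAllocator() is called from ChangeAllocator() when the heap switches allocators, and ChangeAllocator() is in turn called from ChangeCollector(), which changes the garbage collection strategy.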
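The relationship between the process-wide allocator setting and a thread's table can be sketched as follows. This is a simplified illustration, not ART's code: DemoAllocator, DemoQuickEntryPoints, SetDemoAllocator, and ResetDemoEntryPoints are hypothetical names, and the table has a single slot.

```cpp
#include <cassert>
#include <string>

// Hypothetical allocator kinds; ART's real enum is gc::AllocatorType.
enum class DemoAllocator { kDlMalloc, kRosAlloc, kTLAB, kRegionTLAB };

// Hypothetical one-slot table standing in for QuickEntryPoints.
struct DemoQuickEntryPoints {
  const char* (*alloc_object)();
};

const char* AllocWithDlMalloc() { return "dlmalloc"; }
const char* AllocWithRosAlloc() { return "rosalloc"; }
const char* AllocWithTlab() { return "tlab"; }
const char* AllocWithRegionTlab() { return "region_tlab"; }

// Process-wide selection, changed only through the setter below.
static DemoAllocator g_entry_points_allocator = DemoAllocator::kDlMalloc;

void SetDemoAllocator(DemoAllocator allocator) {
  g_entry_points_allocator = allocator;
}

// Rewrites one table so that it matches the current selection.
void ResetDemoEntryPoints(DemoQuickEntryPoints* qpoints, bool is_marking) {
  switch (g_entry_points_allocator) {
    case DemoAllocator::kDlMalloc:
      qpoints->alloc_object = &AllocWithDlMalloc;
      break;
    case DemoAllocator::kRosAlloc:
      qpoints->alloc_object = &AllocWithRosAlloc;
      break;
    case DemoAllocator::kTLAB:
      qpoints->alloc_object = &AllocWithTlab;
      break;
    case DemoAllocator::kRegionTLAB:
      // Mirrors the source: when not marking, the plain TLAB entry is enough.
      qpoints->alloc_object = is_marking ? &AllocWithRegionTlab : &AllocWithTlab;
      break;
  }
}
```

Note that in this sketch changing the setting alone does not touch a table that was already filled; the table picks up the new allocator only when the reset function runs on it again.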
ResetQuickAllocEntryPoints() calls functions named SetQuickAllocEntryPoints_ followed by an allocator-specific suffix. Where are these defined? Let's keep reading.
/art/runtime/entrypoints/quick/quick_alloc_entrypoints.cc:
#define GENERATE_ENTRYPOINTS(suffix) \
  extern "C" void* art_quick_alloc_array_resolved##suffix(mirror::Class* klass, int32_t); \
  extern "C" void* art_quick_alloc_array_resolved8##suffix(mirror::Class* klass, int32_t); \
  extern "C" void* art_quick_alloc_array_resolved16##suffix(mirror::Class* klass, int32_t); \
  extern "C" void* art_quick_alloc_array_resolved32##suffix(mirror::Class* klass, int32_t); \
  extern "C" void* art_quick_alloc_array_resolved64##suffix(mirror::Class* klass, int32_t); \
  extern "C" void* art_quick_alloc_object_resolved##suffix(mirror::Class* klass); \
  extern "C" void* art_quick_alloc_object_initialized##suffix(mirror::Class* klass); \
  extern "C" void* art_quick_alloc_object_with_checks##suffix(mirror::Class* klass); \
  extern "C" void* art_quick_alloc_string_from_bytes##suffix(void*, int32_t, int32_t, int32_t); \
  extern "C" void* art_quick_alloc_string_from_chars##suffix(int32_t, int32_t, void*); \
  extern "C" void* art_quick_alloc_string_from_string##suffix(void*); \
  extern "C" void* art_quick_alloc_array_resolved##suffix##_instrumented(mirror::Class* klass, int32_t); \
  extern "C" void* art_quick_alloc_array_resolved8##suffix##_instrumented(mirror::Class* klass, int32_t); \
  extern "C" void* art_quick_alloc_array_resolved16##suffix##_instrumented(mirror::Class* klass, int32_t); \
  extern "C" void* art_quick_alloc_array_resolved32##suffix##_instrumented(mirror::Class* klass, int32_t); \
  extern "C" void* art_quick_alloc_array_resolved64##suffix##_instrumented(mirror::Class* klass, int32_t); \
  extern "C" void* art_quick_alloc_object_resolved##suffix##_instrumented(mirror::Class* klass); \
  extern "C" void* art_quick_alloc_object_initialized##suffix##_instrumented(mirror::Class* klass); \
  extern "C" void* art_quick_alloc_object_with_checks##suffix##_instrumented(mirror::Class* klass); \
  extern "C" void* art_quick_alloc_string_from_bytes##suffix##_instrumented(void*, int32_t, int32_t, int32_t); \
  extern "C" void* art_quick_alloc_string_from_chars##suffix##_instrumented(int32_t, int32_t, void*); \
  extern "C" void* art_quick_alloc_string_from_string##suffix##_instrumented(void*); \
  void SetQuickAllocEntryPoints##suffix(QuickEntryPoints* qpoints, bool instrumented) { \
    if (instrumented) { \
      qpoints->pAllocArrayResolved = art_quick_alloc_array_resolved##suffix##_instrumented; \
      qpoints->pAllocArrayResolved8 = art_quick_alloc_array_resolved8##suffix##_instrumented; \
      qpoints->pAllocArrayResolved16 = art_quick_alloc_array_resolved16##suffix##_instrumented; \
      qpoints->pAllocArrayResolved32 = art_quick_alloc_array_resolved32##suffix##_instrumented; \
      qpoints->pAllocArrayResolved64 = art_quick_alloc_array_resolved64##suffix##_instrumented; \
      qpoints->pAllocObjectResolved = art_quick_alloc_object_resolved##suffix##_instrumented; \
      qpoints->pAllocObjectInitialized = art_quick_alloc_object_initialized##suffix##_instrumented; \
      qpoints->pAllocObjectWithChecks = art_quick_alloc_object_with_checks##suffix##_instrumented; \
      qpoints->pAllocStringFromBytes = art_quick_alloc_string_from_bytes##suffix##_instrumented; \
      qpoints->pAllocStringFromChars = art_quick_alloc_string_from_chars##suffix##_instrumented; \
      qpoints->pAllocStringFromString = art_quick_alloc_string_from_string##suffix##_instrumented; \
    } else { \
      qpoints->pAllocArrayResolved = art_quick_alloc_array_resolved##suffix; \
      qpoints->pAllocArrayResolved8 = art_quick_alloc_array_resolved8##suffix; \
      qpoints->pAllocArrayResolved16 = art_quick_alloc_array_resolved16##suffix; \
      qpoints->pAllocArrayResolved32 = art_quick_alloc_array_resolved32##suffix; \
      qpoints->pAllocArrayResolved64 = art_quick_alloc_array_resolved64##suffix; \
      qpoints->pAllocObjectResolved = art_quick_alloc_object_resolved##suffix; \
      qpoints->pAllocObjectInitialized = art_quick_alloc_object_initialized##suffix; \
      qpoints->pAllocObjectWithChecks = art_quick_alloc_object_with_checks##suffix; \
      qpoints->pAllocStringFromBytes = art_quick_alloc_string_from_bytes##suffix; \
      qpoints->pAllocStringFromChars = art_quick_alloc_string_from_chars##suffix; \
      qpoints->pAllocStringFromString = art_quick_alloc_string_from_string##suffix; \
    } \
  }
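The GENERATE_ENTRYPOINTS macro relies on the preprocessor's token-pasting operator (##) to stamp out one setter per allocator suffix. A reduced sketch of the same technique, with a single slot and hypothetical names (DemoQuickEntryPoints, demo_alloc_object_*, SetDemoAllocEntryPoints_*), might look like this:

```cpp
#include <cassert>

// Hypothetical one-slot table standing in for QuickEntryPoints.
struct DemoQuickEntryPoints {
  int (*pAllocObject)();
};

// Hypothetical targets; in ART these are assembly stubs declared extern "C".
int demo_alloc_object_dlmalloc() { return 10; }
int demo_alloc_object_dlmalloc_instrumented() { return 11; }
int demo_alloc_object_rosalloc() { return 20; }
int demo_alloc_object_rosalloc_instrumented() { return 21; }

// ## glues the suffix onto the base names, so one macro body yields a
// differently named setter (and different targets) for each allocator.
#define DEMO_GENERATE_ENTRYPOINTS(suffix) \
  void SetDemoAllocEntryPoints##suffix(DemoQuickEntryPoints* qpoints, bool instrumented) { \
    qpoints->pAllocObject = instrumented ? demo_alloc_object##suffix##_instrumented \
                                         : demo_alloc_object##suffix; \
  }

DEMO_GENERATE_ENTRYPOINTS(_dlmalloc)  // defines SetDemoAllocEntryPoints_dlmalloc
DEMO_GENERATE_ENTRYPOINTS(_rosalloc)  // defines SetDemoAllocEntryPoints_rosalloc
```

Each expansion produces an ordinary function, which is why a switch such as the one in ResetQuickAllocEntryPoints() can call the setters by their full names.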
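Take the rosalloc object entry points as an example. In 8.1 there is no single pAllocObject slot; the GENERATE_ENTRYPOINTS macro fills pAllocObjectResolved, pAllocObjectInitialized, and pAllocObjectWithChecks. The assembly stub art_quick_alloc_object_resolved_rosalloc uses a bl instruction to branch into the C function artAllocObjectFromCodeResolvedRosAlloc. In the 8.1 signatures, the first parameter, klass, is the class of the object to allocate, and the second, self, is the current thread; on ARM64 the standard calling convention places them in registers x0 and x1. (The type_idx and method parameters passed in r0 and r1 belong to older 32-bit ARM versions of this entry point.)
The function artAllocObjectFromCodeResolvedRosAlloc, for instance, is generated by the following macros (/art/runtime/entrypoints/quick/quick_alloc_entrypoints.cc):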
#define GENERATE_ENTRYPOINTS_FOR_ALLOCATOR_INST(suffix, suffix2, instrumented_bool, allocator_type) \
extern "C" mirror::Object* artAllocObjectFromCodeWithChecks##suffix##suffix2( \
    mirror::Class* klass, Thread* self) \
    REQUIRES_SHARED(Locks::mutator_lock_) { \
  return artAllocObjectFromCode<false, true, instrumented_bool, allocator_type>(klass, self); \
} \
extern "C" mirror::Object* artAllocObjectFromCodeResolved##suffix##suffix2( \
    mirror::Class* klass, Thread* self) \
    REQUIRES_SHARED(Locks::mutator_lock_) { \
  return artAllocObjectFromCode<false, false, instrumented_bool, allocator_type>(klass, self); \
} \
extern "C" mirror::Object* artAllocObjectFromCodeInitialized##suffix##suffix2( \
    mirror::Class* klass, Thread* self) \
    REQUIRES_SHARED(Locks::mutator_lock_) { \
  return artAllocObjectFromCode<true, false, instrumented_bool, allocator_type>(klass, self); \
} \
extern "C" mirror::Array* artAllocArrayFromCodeResolved##suffix##suffix2( \
    mirror::Class* klass, int32_t component_count, Thread* self) \
    REQUIRES_SHARED(Locks::mutator_lock_) { \
  ScopedQuickEntrypointChecks sqec(self); \
  return AllocArrayFromCodeResolved<instrumented_bool>(klass, component_count, self, \
                                                       allocator_type); \
} \
extern "C" mirror::String* artAllocStringFromBytesFromCode##suffix##suffix2( \
    mirror::ByteArray* byte_array, int32_t high, int32_t offset, int32_t byte_count, \
    Thread* self) \
    REQUIRES_SHARED(Locks::mutator_lock_) { \
  ScopedQuickEntrypointChecks sqec(self); \
  StackHandleScope<1> hs(self); \
  Handle<mirror::ByteArray> handle_array(hs.NewHandle(byte_array)); \
  return mirror::String::AllocFromByteArray<instrumented_bool>(self, byte_count, handle_array, \
                                                               offset, high, allocator_type); \
} \
extern "C" mirror::String* artAllocStringFromCharsFromCode##suffix##suffix2( \
    int32_t offset, int32_t char_count, mirror::CharArray* char_array, Thread* self) \
    REQUIRES_SHARED(Locks::mutator_lock_) { \
  StackHandleScope<1> hs(self); \
  Handle<mirror::CharArray> handle_array(hs.NewHandle(char_array)); \
  return mirror::String::AllocFromCharArray<instrumented_bool>(self, char_count, handle_array, \
                                                               offset, allocator_type); \
} \
extern "C" mirror::String* artAllocStringFromStringFromCode##suffix##suffix2( /* NOLINT */ \
    mirror::String* string, Thread* self) \
    REQUIRES_SHARED(Locks::mutator_lock_) { \
  StackHandleScope<1> hs(self); \
  Handle<mirror::String> handle_string(hs.NewHandle(string)); \
  return mirror::String::AllocFromString<instrumented_bool>(self, handle_string->GetLength(), \
                                                            handle_string, 0, allocator_type); \
}
#define GENERATE_ENTRYPOINTS_FOR_ALLOCATOR(suffix, allocator_type) \
  GENERATE_ENTRYPOINTS_FOR_ALLOCATOR_INST(suffix, Instrumented, true, allocator_type) \
  GENERATE_ENTRYPOINTS_FOR_ALLOCATOR_INST(suffix, , false, allocator_type)
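These macros combine two ideas: the wrapper's name is assembled from suffixes, and each wrapper instantiates one template with a fixed set of compile-time flags. The sketch below reproduces that combination in miniature. DemoResult, DemoAllocObjectFromCode, and the demoAlloc... wrappers are hypothetical names; the template here only records its parameters so the effect is visible.

```cpp
#include <cassert>

// Hypothetical allocator kinds; ART's real enum is gc::AllocatorType.
enum class DemoAllocator { kDlMalloc, kRosAlloc };

// Records which compile-time configuration a wrapper was generated with.
struct DemoResult {
  bool initialized;
  bool finalize;
  bool instrumented;
  DemoAllocator allocator;
};

template <bool kInitialized, bool kFinalize, bool kInstrumented, DemoAllocator kAllocator>
DemoResult DemoAllocObjectFromCode() {
  return DemoResult{kInitialized, kFinalize, kInstrumented, kAllocator};
}

// One expansion defines three wrappers; the flag pairs follow the source:
// WithChecks = <false, true>, Resolved = <false, false>, Initialized = <true, false>.
#define DEMO_ENTRYPOINTS_INST(suffix, suffix2, instrumented_bool, allocator_type) \
  DemoResult demoAllocObjectWithChecks##suffix##suffix2() { \
    return DemoAllocObjectFromCode<false, true, instrumented_bool, allocator_type>(); \
  } \
  DemoResult demoAllocObjectResolved##suffix##suffix2() { \
    return DemoAllocObjectFromCode<false, false, instrumented_bool, allocator_type>(); \
  } \
  DemoResult demoAllocObjectInitialized##suffix##suffix2() { \
    return DemoAllocObjectFromCode<true, false, instrumented_bool, allocator_type>(); \
  }

// The empty second argument yields the non-instrumented names.
#define DEMO_ENTRYPOINTS(suffix, allocator_type) \
  DEMO_ENTRYPOINTS_INST(suffix, Instrumented, true, allocator_type) \
  DEMO_ENTRYPOINTS_INST(suffix, , false, allocator_type)

DEMO_ENTRYPOINTS(RosAlloc, DemoAllocator::kRosAlloc)
```

A single DEMO_ENTRYPOINTS line therefore yields six functions, for example demoAllocObjectResolvedRosAlloc and demoAllocObjectResolvedRosAllocInstrumented, each bound to its own template instantiation.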
All of the object variants end up in artAllocObjectFromCode() (/art/runtime/entrypoints/quick/quick_alloc_entrypoints.cc):
static constexpr bool kUseTlabFastPath = true;
template <bool kInitialized,
          bool kFinalize,
          bool kInstrumented,
          gc::AllocatorType allocator_type>
static ALWAYS_INLINE inline mirror::Object* artAllocObjectFromCode(
    mirror::Class* klass,
    Thread* self) REQUIRES_SHARED(Locks::mutator_lock_) {
  ScopedQuickEntrypointChecks sqec(self);
  DCHECK(klass != nullptr);
  if (kUseTlabFastPath && !kInstrumented && allocator_type == gc::kAllocatorTypeTLAB) {
    if (kInitialized || klass->IsInitialized()) {
      if (!kFinalize || !klass->IsFinalizable()) {
        size_t byte_count = klass->GetObjectSize();
        byte_count = RoundUp(byte_count, gc::space::BumpPointerSpace::kAlignment);
        mirror::Object* obj;
        if (LIKELY(byte_count < self->TlabSize())) {
          obj = self->AllocTlab(byte_count);
          DCHECK(obj != nullptr) << "AllocTlab can't fail";
          obj->SetClass(klass);
          if (kUseBakerReadBarrier) {
            obj->AssertReadBarrierState();
          }
          QuasiAtomic::ThreadFenceForConstructor();
          return obj;
        }
      }
    }
  }
  if (kInitialized) {
    return AllocObjectFromCodeInitialized<kInstrumented>(klass, self, allocator_type);
  } else if (!kFinalize) {
    return AllocObjectFromCodeResolved<kInstrumented>(klass, self, allocator_type);
  } else {
    return AllocObjectFromCode<kInstrumented>(klass, self, allocator_type);
  }
}
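artAllocObjectFromCode() does the following:
First, it checks whether the TLAB fast path applies. A TLAB (thread-local allocation buffer) is a buffer owned by a single thread, so allocating from it needs no synchronization with other threads, which makes it fast. If the fast path applies, the memory comes from the Thread object's AllocTlab() method.
Otherwise it branches on the template parameters kInitialized and kFinalize. If kInitialized is true (the class is known to be initialized), it calls AllocObjectFromCodeInitialized(). If not, and kFinalize is false, it calls AllocObjectFromCodeResolved(). In the remaining case it calls AllocObjectFromCode().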
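The reason the TLAB fast path is cheap is that it amounts to bumping a pointer inside a buffer that only one thread touches. A minimal sketch of such a bump-pointer buffer, assuming an 8-byte alignment and using hypothetical names (DemoTlab, RoundUpTo, kDemoAlignment) rather than ART's own classes:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed alignment for the sketch; ART uses BumpPointerSpace::kAlignment.
constexpr std::size_t kDemoAlignment = 8;

constexpr std::size_t RoundUpTo(std::size_t n, std::size_t align) {
  return (n + align - 1) / align * align;
}

// Hypothetical thread-local buffer: owned by one thread, so no locking.
class DemoTlab {
 public:
  DemoTlab(std::uint8_t* begin, std::size_t size) : pos_(begin), end_(begin + size) {}

  std::size_t Remaining() const { return static_cast<std::size_t>(end_ - pos_); }

  // Returns nullptr when the request does not fit, so the caller can fall
  // back to a slower path. The strict comparison mirrors the
  // byte_count < TlabSize() test in the source.
  void* Alloc(std::size_t byte_count) {
    byte_count = RoundUpTo(byte_count, kDemoAlignment);
    if (byte_count >= Remaining()) {
      return nullptr;
    }
    void* result = pos_;
    pos_ += byte_count;  // The whole allocation is this pointer bump.
    return result;
  }

 private:
  std::uint8_t* pos_;
  std::uint8_t* end_;
};
```

In the sketch a failed request leaves the buffer untouched, which corresponds to artAllocObjectFromCode() falling through to the kInitialized / kFinalize branches when the object does not fit.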
Let's look at AllocObjectFromCodeResolved() (/art/runtime/entrypoints/entrypoint_utils-inl.h):
// Given the context of a calling Method and a resolved class, create an instance.
template <bool kInstrumented>
ALWAYS_INLINE
inline mirror::Object* AllocObjectFromCodeResolved(mirror::Class* klass,
                                                   Thread* self,
                                                   gc::AllocatorType allocator_type) {
  DCHECK(klass != nullptr);
  bool slow_path = false;
  klass = CheckClassInitializedForObjectAlloc(klass, self, &slow_path);
  if (UNLIKELY(slow_path)) {
    if (klass == nullptr) {
      return nullptr;
    }
    gc::Heap* heap = Runtime::Current()->GetHeap();
    // Pass in false since the object cannot be finalizable.
    // CheckClassInitializedForObjectAlloc can cause thread suspension which means we may now be
    // instrumented.
    return klass->Alloc</*kInstrumented*/true, false>(self, heap->GetCurrentAllocator()).Ptr();
  }
  // Pass in false since the object cannot be finalizable.
  return klass->Alloc<kInstrumented, false>(self, allocator_type).Ptr();
}
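CheckClassInitializedForObjectAlloc() checks whether the class still needs to be initialized. Normally it does not and slow_path stays false; if it does, slow_path is set to true. The function returns the class to allocate from, or null on failure. If klass is not null, memory for an instance of the class is allocated by calling klass->Alloc(). On the slow path the call is always instrumented and uses the heap's current allocator, because, as the code comment notes, the check can suspend the thread.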
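The fast-path / slow-path split can be sketched as follows. This is an illustration under simplifying assumptions, not ART's implementation: DemoClass, DemoAllocation, CheckInitialized, DemoAllocResolved, and g_current_allocator are hypothetical, and class initialization is reduced to flipping a flag.

```cpp
#include <cassert>

// Hypothetical allocator kinds; ART's real enum is gc::AllocatorType.
enum class DemoAllocator { kRosAlloc, kRegionTLAB };

struct DemoClass {
  bool initialized;
  bool init_fails;  // Simulates an initializer that throws.
};

// Describes how the allocation was carried out.
struct DemoAllocation {
  bool ok;
  bool instrumented;
  DemoAllocator allocator;
};

// Stands in for heap->GetCurrentAllocator(); may change while a thread is suspended.
DemoAllocator g_current_allocator = DemoAllocator::kRosAlloc;

// Returns the class to allocate from, or nullptr if initialization failed.
// Sets *slow_path whenever initialization had to be attempted.
DemoClass* CheckInitialized(DemoClass* klass, bool* slow_path) {
  if (klass->initialized) {
    return klass;
  }
  *slow_path = true;
  if (klass->init_fails) {
    return nullptr;
  }
  klass->initialized = true;
  return klass;
}

template <bool kInstrumented>
DemoAllocation DemoAllocResolved(DemoClass* klass, DemoAllocator compiled_allocator) {
  bool slow_path = false;
  klass = CheckInitialized(klass, &slow_path);
  if (slow_path) {
    if (klass == nullptr) {
      return {false, false, compiled_allocator};
    }
    // After a possible suspension, assume instrumentation and re-read the allocator.
    return {true, true, g_current_allocator};
  }
  // Fast path: trust the flags the entry point was generated with.
  return {true, kInstrumented, compiled_allocator};
}
```

The design point the sketch tries to capture is that values baked into an entry point at compile time are only trusted while nothing could have changed them; once the thread may have been suspended, the code reads the live state instead.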
The Alloc() method (/art/runtime/mirror/class-inl.h):
template<bool kIsInstrumented, bool kCheckAddFinalizer>
inline ObjPtr<Object> Class::Alloc(Thread* self, gc::AllocatorType allocator_type) {
  CheckObjectAlloc();
  gc::Heap* heap = Runtime::Current()->GetHeap();
  const bool add_finalizer = kCheckAddFinalizer && IsFinalizable();
  if (!kCheckAddFinalizer) {
    DCHECK(!IsFinalizable());
  }
  // Note that the this pointer may be invalidated after the allocation.
  ObjPtr<Object> obj =
      heap->AllocObjectWithAllocator<kIsInstrumented, false>(self,
                                                             this,
                                                             this->object_size_,
                                                             allocator_type,
                                                             VoidFunctor());
  if (add_finalizer && LIKELY(obj != nullptr)) {
    heap->AddFinalizerReference(self, &obj);
    if (UNLIKELY(self->IsExceptionPending())) {
      // Failed to allocate finalizer reference, it means that the whole allocation failed.
      obj = nullptr;
    }
  }
  return obj.Ptr();
}
CheckObjectAlloc() verifies that an object of this class may be allocated.
Finalization is handled as well: if the class overrides finalize() and kCheckAddFinalizer is set, then after a successful allocation heap->AddFinalizerReference(self, &obj) is called. It goes through the add() method of FinalizerReference.java, which creates a FinalizerReference object and adds it to a linked list. When the object is reclaimed, its finalize() method is invoked.
heap->AllocObjectWithAllocator() performs the actual allocation of the object's memory.
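The order of operations in Alloc() (decide up front whether a finalizer is needed, allocate, then register the finalizer only if the allocation succeeded) can be sketched like this. DemoHeap, DemoClass, DemoObject, and DemoAlloc are hypothetical names, and the finalizer "linked list" is reduced to a vector:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct DemoClass {
  std::size_t object_size;
  bool finalizable;  // True when the class overrides finalize().
};

struct DemoObject {
  const DemoClass* klass;
};

// Hypothetical heap that owns objects and tracks those awaiting finalization.
class DemoHeap {
 public:
  DemoObject* AllocObject(const DemoClass* klass) {
    objects_.push_back(std::make_unique<DemoObject>(DemoObject{klass}));
    return objects_.back().get();
  }
  void AddFinalizerReference(DemoObject* obj) { finalizer_queue_.push_back(obj); }
  std::size_t PendingFinalizers() const { return finalizer_queue_.size(); }

 private:
  std::vector<std::unique_ptr<DemoObject>> objects_;
  std::vector<DemoObject*> finalizer_queue_;
};

// kCheckAddFinalizer = false lets callers that already know the class is not
// finalizable skip the check entirely at compile time.
template <bool kCheckAddFinalizer>
DemoObject* DemoAlloc(DemoHeap* heap, const DemoClass* klass) {
  const bool add_finalizer = kCheckAddFinalizer && klass->finalizable;
  DemoObject* obj = heap->AllocObject(klass);
  if (add_finalizer && obj != nullptr) {
    heap->AddFinalizerReference(obj);
  }
  return obj;
}
```

Only objects of finalizable classes allocated through DemoAlloc<true> reach the queue in this sketch; everything else pays nothing for finalization.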
At this point the allocation moves into the heap. The next section covers how the heap allocates the memory.
Summary
The Thread class initializes the jump tables for calls into runtime library functions. Each entry is reached through its offset from the Thread object.
Thread::InitTlsEntryPoints() calls InitEntryPoints() and passes in the addresses of the tables. The implementation depends on the CPU architecture; the ARM64 version lives in /art/runtime/arch/arm64/entrypoints_init_arm64.cc.
entry_points_allocator holds the allocator type. Its initial value, kAllocatorTypeDlMalloc, selects the DlMalloc entry points, and SetQuickAllocEntryPointsAllocator() changes it. In most cases the value is kAllocatorTypeRosAlloc, although this depends on the collector in use.
artAllocObjectFromCode() (/art/runtime/entrypoints/quick/quick_alloc_entrypoints.cc) chooses among allocation paths based on its template parameters and on whether the class still needs to be initialized.
Apart from the TLAB fast path, every route ends in heap->AllocObjectWithAllocator(), which performs the allocation.