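Author: lds (lds2012@gmail.com)
Date: 2017-04-11
Preface
AndFix is Alibaba's open-source hot-fix framework for Android. Its basic principle is to use JNI to replace methods, which makes it possible to hot-fix an Android app, that is, to temporarily fix a production bug without shipping a new release.
There are many hot-fix techniques. AndFix uses native method replacement. The advantages are that the fix takes effect immediately and has no performance overhead; the drawbacks are that it can only modify methods and that compatibility may be a problem.
Although the principle is fairly simple, understanding it in depth requires a good grasp of JNI, of both the Dalvik and ART virtual machines, and even of the source code of several ART versions. The overall difficulty is considerable, so this article does not go into the VM implementation details and only looks at the JNI-related parts.

Source code: https://github.com/alibaba/AndFix
Source version: 0.5.0

1. Registering the native methods
The native methods in AndFix.java: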
package com.alipay.euler.andfix;
// ...
public class AndFix {
private static native boolean setup(boolean isArt, int apilevel);
private static native void replaceMethod(Method dest, Method src);
private static native void setFieldFlag(Field field);
}
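
These native methods are registered dynamically rather than statically. Dynamic registration is reportedly the more efficient of the two, because the VM does not have to locate each implementation by its function name through JNI.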
static JNINativeMethod gMethods[] = {
/* name, signature, funcPtr */
{ "setup", "(ZI)Z",
(void*) setup },
{ "replaceMethod", "(Ljava/lang/reflect/Method;Ljava/lang/reflect/Method;)V",
(void*) replaceMethod },
{ "setFieldFlag", "(Ljava/lang/reflect/Field;)V",
(void*) setFieldFlag },
};
Each of these three native methods is routed to a different implementation depending on whether the current runtime is Dalvik or ART. For ART, the call is further routed to an implementation specific to the ART version.
Current runtime | Implementation source file |
---|---|
Dalvik | /jni/dalvik/dalvik_method_replace.cpp |
ART, Android 4.4 (api 19) | /jni/art/art_method_replace_4_4.cpp |
ART, Android 5.0 (> api 19) | /jni/art/art_method_replace_5_0.cpp |
ART, Android 5.1 (> api 21) | /jni/art/art_method_replace_5_1.cpp |
ART, Android 6.0 (> api 22) | /jni/art/art_method_replace_6_0.cpp |
ART, Android 7.0 (> api 23) | /jni/art/art_method_replace_7_0.cpp |
Two things can be seen from this table:
- First, ART debuted in Android 4.4.
- Second, ART was modified in essentially every later Android release, and this is where the poor compatibility of the AndFix approach shows most clearly: whenever the Android version changes, the implementation has to be rewritten for that version's VM.
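
The routing described by the table can be sketched roughly as follows. This is a minimal illustration, not AndFix's actual code: `select_replace_impl` is a hypothetical helper, and it returns a file name only to make the mapping visible, whereas a real dispatcher would call into the matching implementation.

```cpp
#include <string>

// Hypothetical helper: maps a runtime / API-level pair to the name of the
// implementation file listed in the table. Returning a file name is purely
// illustrative.
std::string select_replace_impl(bool is_art, int api_level) {
    if (!is_art) {
        return "dalvik_method_replace.cpp";
    }
    // Thresholds are checked from newest to oldest, mirroring the
    // "> api N" conditions in the table.
    if (api_level > 23) return "art_method_replace_7_0.cpp";
    if (api_level > 22) return "art_method_replace_6_0.cpp";
    if (api_level > 21) return "art_method_replace_5_1.cpp";
    if (api_level > 19) return "art_method_replace_5_0.cpp";
    return "art_method_replace_4_4.cpp";
}
```

Checking the highest threshold first means a newer, unknown API level falls into the most recent implementation instead of an old one, although nothing guarantees that implementation still matches the newer VM.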
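
Although there is a separate implementation for each VM and version, the code shows that the principle is much the same throughout; the implementations differ only because each VM exposes different internal APIs. The rest of this article therefore looks only at the traditional Dalvik implementation.
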
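2. Initialization (setup)
One point worth knowing here is how to check whether the current runtime is Dalvik or ART. The official documentation describes it as follows:
You can verify which runtime is in use by calling System.getProperty("java.vm.version"). If ART is in use, the property's value is "2.0.0" or higher.
The code implements this as: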
final String vmVersion = System.getProperty("java.vm.version");
boolean isArt = vmVersion != null && vmVersion.startsWith("2");
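This code is slightly problematic. The documentation says that the ART version is 2.0.0 or higher, but the code only checks whether the string starts with "2". If the ART version number ever reaches 3, this check will misreport the runtime, so it is not rigorous.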
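
A stricter check would parse the leading version component and compare it numerically. The sketch below shows only that parsing logic; it is written in C++ to stay consistent with the native code in this article, and the same idea carries over to the Java side. `is_art_vm_version` is a hypothetical name, and treating an empty or non-numeric string as "not ART" is an assumption.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: true when the leading numeric component of a
// "java.vm.version" string is 2 or greater. Empty or non-numeric input
// is treated as "not ART" (an assumption of this sketch).
bool is_art_vm_version(const std::string& vm_version) {
    if (vm_version.empty()) return false;
    char* end = nullptr;
    long major = std::strtol(vm_version.c_str(), &end, 10);
    if (end == vm_version.c_str()) return false;  // no digits were parsed
    return major >= 2;
}
```

With this version, "3.0.0" is recognised as ART while "1.6.0" is not, which matches the "2.0.0 or higher" wording of the documentation.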
jboolean setup(JNIEnv* env, jclass clazz, jboolean isart, jint apilevel);
The setup function mainly performs initialization. In the Dalvik implementation, its main job is to obtain a few function pointers from libdvm.so so that they can be called later. One is dvmDecodeIndirectRef; the other is dvmThreadSelf.

2.1 dvmDecodeIndirectRef()
First, look at how dvmDecodeIndirectRef is defined in the Dalvik VM:
/*
* Convert an indirect reference to an Object reference. The indirect
* reference may be local, global, or weak-global.
*
* If "jobj" is NULL, or is a weak global reference whose reference has
* been cleared, this returns NULL. If jobj is an invalid indirect
* reference, kInvalidIndirectRefObject is returned.
*
* Note "env" may be NULL when decoding global references.
*/
Object* dvmDecodeIndirectRef(Thread* self, jobject jobj) {}
This function converts a jobject into an Object as defined inside Dalvik. In Dalvik, Object is the type used to implement:
- Class object
- Array Object
- data object
- String object
This function can therefore be used to obtain a ClassObject. For example, here is the source of NewObject:
static jobject NewObject(JNIEnv* env, jclass jclazz, jmethodID methodID, ...) {
ScopedJniThreadState ts(env);
ClassObject* clazz = (ClassObject*) dvmDecodeIndirectRef(ts.self(), jclazz);
if (!canAllocClass(clazz) || (!dvmIsClassInitialized(clazz) && !dvmInitClass(clazz))) {
assert(dvmCheckException(ts.self()));
return NULL;
}
Object* newObj = dvmAllocObject(clazz, ALLOC_DONT_TRACK);
jobject result = addLocalReference(ts.self(), newObj);
if (newObj != NULL) {
JValue unused;
va_list args;
va_start(args, methodID);
dvmCallMethodV(ts.self(), (Method*) methodID, newObj, true, &unused, args);
va_end(args);
}
return result;
}
2.2 dvmThreadSelf()
/*
* Like pthread_self(), but on a Thread*.
*/
Thread* dvmThreadSelf()
{
return (Thread*) pthread_getspecific(gDvm.pthreadKeySelf);
}
This function returns the current thread.

3. Setting field access flags (setFieldFlag)
The purpose of this function is to make every field of the class being patched public.
The implementation is fairly simple:
void dalvik_setFieldFlag(JNIEnv* env, jobject field) {
Field* dalvikField = (Field*) env->FromReflectedField(field);
dalvikField->accessFlags = dalvikField->accessFlags & (~ACC_PRIVATE)
| ACC_PUBLIC;
LOGD("dalvik_setFieldFlag: %d ", dalvikField->accessFlags);
}
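
The bit manipulation in that function can be tried in isolation. The sketch below reuses the same expression on plain integers; `make_public` is a hypothetical name, and the constant values are the standard access-flag values from the dex format.

```cpp
#include <cstdint>

// Access-flag values as defined by the dex format.
constexpr std::uint32_t ACC_PUBLIC    = 0x0001;
constexpr std::uint32_t ACC_PRIVATE   = 0x0002;
constexpr std::uint32_t ACC_PROTECTED = 0x0004;

// Same expression as in dalvik_setFieldFlag: clear ACC_PRIVATE, then set
// ACC_PUBLIC. All other bits are left untouched.
constexpr std::uint32_t make_public(std::uint32_t flags) {
    return (flags & ~ACC_PRIVATE) | ACC_PUBLIC;
}
```

Note that the expression clears only ACC_PRIVATE, so a protected field ends up with both ACC_PROTECTED and ACC_PUBLIC set.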

4. Replacing the method (replaceMethod)
Step one is to mark the class that supplies the replacement as already initialized:
jobject clazz = env->CallObjectMethod(dest, jClassMethod);
ClassObject* clz = (ClassObject*) dvmDecodeIndirectRef_fnPtr(
dvmThreadSelf_fnPtr(), clazz);
clz->status = CLASS_INITIALIZED;
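Unlike the Xposed framework, this code does not appear to call dvmInitClass to actually initialize the class; it only sets status.
TODO: Why is the class not initialized, and why does status have to be set?

The method is then replaced directly: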
Method* meth = (Method*) env->FromReflectedMethod(src);
Method* target = (Method*) env->FromReflectedMethod(dest);
LOGD("dalvikMethod: %s", meth->name);
// meth->clazz = target->clazz;
meth->accessFlags |= ACC_PUBLIC;
meth->methodIndex = target->methodIndex;
meth->jniArgInfo = target->jniArgInfo;
meth->registersSize = target->registersSize;
meth->outsSize = target->outsSize;
meth->insSize = target->insSize;
meth->prototype = target->prototype;
meth->insns = target->insns;
meth->nativeFunc = target->nativeFunc;
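Every field except clazz, name, shorty, fastJni, noRef, shouldTrace, registerMap, and inProfile is overwritten with the value from the new method. accessFlags is not copied either; it only has ACC_PUBLIC OR'ed into it.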
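
To see the effect of this copy pattern without a VM, here is a minimal sketch on a mock struct. `MockMethod` and `replace_method` are hypothetical stand-ins: only a handful of the real fields are modelled, and the types are simplified.

```cpp
#include <cstdint>

// Simplified, hypothetical stand-in for Dalvik's Method struct.
struct MockMethod {
    const char* clazz;            // owning class (kept)
    std::uint32_t accessFlags;    // not copied; only the public bit is OR'ed in
    std::uint16_t methodIndex;    // copied
    std::uint16_t registersSize;  // copied
    const char* name;             // kept
    const std::uint16_t* insns;   // copied: the bytecode that gets executed
};

constexpr std::uint32_t kAccPublic = 0x0001;

// Mirrors the copy pattern shown above: meth keeps its identity
// (clazz, name) but takes over target's code and layout.
void replace_method(MockMethod* meth, const MockMethod* target) {
    meth->accessFlags |= kAccPublic;
    meth->methodIndex = target->methodIndex;
    meth->registersSize = target->registersSize;
    meth->insns = target->insns;
}
```

After the call, anything that still holds a pointer to the old method structure runs the new instructions, which is why the fix takes effect without reloading the class.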

For the meaning of each field, see the definition of the Method struct in the Dalvik source:
struct Method {
/* the class we are a part of */
ClassObject* clazz;
/* access flags; low 16 bits are defined by spec (could be u2?) */
u4 accessFlags;
/*
* For concrete virtual methods, this is the offset of the method
* in "vtable".
*
* For abstract methods in an interface class, this is the offset
* of the method in "iftable[n]->methodIndexArray".
*/
u2 methodIndex;
/*
* Method bounds; not needed for an abstract method.
*
* For a native method, we compute the size of the argument list, and
* set "insSize" and "registerSize" equal to it.
*/
u2 registersSize; /* ins + locals */
u2 outsSize;
u2 insSize;
/* method name, e.g. "<init>" or "eatLunch" */
const char* name;
/*
* Method prototype descriptor string (return and argument types).
*
* TODO: This currently must specify the DexFile as well as the proto_ids
* index, because generated Proxy classes don't have a DexFile. We can
* remove the DexFile* and reduce the size of this struct if we generate
* a DEX for proxies.
*/
DexProto prototype;
/* short-form method descriptor string */
const char* shorty;
/*
* The remaining items are not used for abstract or native methods.
* (JNI is currently hijacking "insns" as a function pointer, set
* after the first call. For internal-native this stays null.)
*/
/* the actual code */
const u2* insns; /* instructions, in memory-mapped .dex */
/* JNI: cached argument and return-type hints */
int jniArgInfo;
/*
* JNI: native method ptr; could be actual function or a JNI bridge. We
* don't currently discriminate between DalvikBridgeFunc and
* DalvikNativeFunc; the former takes an argument superset (i.e. two
* extra args) which will be ignored. If necessary we can use
* insns==NULL to detect JNI bridge vs. internal native.
*/
DalvikBridgeFunc nativeFunc;
/*
* JNI: true if this static non-synchronized native method (that has no
* reference arguments) needs a JNIEnv* and jclass/jobject. Libcore
* uses this.
*/
bool fastJni;
/*
* JNI: true if this method has no reference arguments. This lets the JNI
* bridge avoid scanning the shorty for direct pointers that need to be
* converted to local references.
*
* TODO: replace this with a list of indexes of the reference arguments.
*/
bool noRef;
/*
* JNI: true if we should log entry and exit. This is the only way
* developers can log the local references that are passed into their code.
* Used for debugging JNI problems in third-party code.
*/
bool shouldTrace;
/*
* Register map data, if available. This will point into the DEX file
* if the data was computed during pre-verification, or into the
* linear alloc area if not.
*/
const RegisterMap* registerMap;
/* set if method was called during method profiling */
bool inProfile;
};
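
Conclusion
Besides the Java code and the NDK code, there is another fairly important piece: the tool that automatically generates patches. Understanding it requires fairly deep knowledge of the dex file format. Alibaba has not open-sourced this tool directly, and it has not been updated for roughly two years.

In short, there are plenty of write-ups online about how AndFix works, mainly because the framework's principle is direct and blunt and therefore easy to grasp. Looking at the details, though, building such a framework yourself requires a broad and deep understanding of the Dalvik VM, ART, the dex file format, and JNI before a seemingly simple solution like this becomes possible. This shows that knowledge of Android internals is very helpful in many situations, especially when implementing more advanced features such as hot fixing, and that is well worth learning from.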