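In the previous section we looked at the source code to see how ProGuard parses and reads class bytecode, and how it manages all of the project's classes through two pools, LibraryClassPool and ProgramClassPool. In this section we look at how ProGuard indexes the dependencies between classes. Only once those dependencies have been found can the shrinking and trimming work that follows be done.
Back in the execute method of the ProGuard class: once readInput has finished reading the class information, initialize starts on some initialization work. The important part here is indexing class dependencies, because only then can ProGuard tell which classes are in use and which are unused and can be removed.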
package com.nls.demo
import com.nls.demo.ParentClass
import com.nls.demo.Interface1
import com.nls.demo.Test
class Demo : ParentClass() , Interface1 {
override fun test() {
Test().print()
}
}
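Given pseudocode like this, Demo clearly depends on the classes ParentClass and Test and on the interface Interface1. So how does ProGuard work out these dependencies? Let's walk through it.
Superclass and interface dependency lookup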
public void execute() throws IOException
{
// some code omitted...
readInput();
if (configuration.printSeeds != null ||
configuration.shrink ||
configuration.optimize ||
configuration.obfuscate ||
configuration.preverify)
{
initialize();
}
if (configuration.shrink)
{
shrink();
}
// some code omitted...
}
The initialize method is simple: it creates an Initializer object and calls its execute method, so we can go straight to the implementation of Initializer.
private void initialize() throws IOException
{
if (configuration.verbose)
{
System.out.println("Initializing...");
}
new Initializer(configuration).execute(programClassPool, libraryClassPool);
}
The execute method is fairly long, so we only pull out the core parts for analysis:
public void execute(ClassPool programClassPool,
ClassPool libraryClassPool) throws IOException
{
// some code omitted...
// Initialize the superclass hierarchies for program classes.
programClassPool.classesAccept(
new ClassSuperHierarchyInitializer(programClassPool,
libraryClassPool,
classReferenceWarningPrinter,
null));
// Initialize the superclass hierarchy of all library classes, without
// warnings.
libraryClassPool.classesAccept(
new ClassSuperHierarchyInitializer(programClassPool,
libraryClassPool,
null,
dependencyWarningPrinter));
programClassPool.classesAccept(
new ClassReferenceInitializer(programClassPool,
libraryClassPool,
classReferenceWarningPrinter,
programMemberReferenceWarningPrinter,
libraryMemberReferenceWarningPrinter,
null));
// some code omitted...
}
ClassPool's classesAccept method iterates over every Clazz it holds and calls its accept method:
public void classesAccept(ClassVisitor classVisitor)
{
Iterator iterator = classes.values().iterator();
while (iterator.hasNext())
{
Clazz clazz = (Clazz)iterator.next();
clazz.accept(classVisitor);
}
}
programClassPool holds ProgramClass objects, while libraryClassPool holds LibraryClass objects:
//ProgramClass
public void accept(ClassVisitor classVisitor)
{
classVisitor.visitProgramClass(this);
}
//LibraryClass
public void accept(ClassVisitor classVisitor)
{
classVisitor.visitLibraryClass(this);
}
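The dispatch chain above (pool, then accept, then the type-specific callback) can be sketched in isolation. This is a minimal illustration, not ProGuard's code: the types below are simplified, hypothetical stand-ins that only borrow ProGuard's names, and the trace helper is invented for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical stand-ins for ProGuard's Clazz, ClassVisitor and ClassPool.
class VisitorDispatchSketch {

    interface ClassVisitor {
        void visitProgramClass(ProgramClass programClass);
        void visitLibraryClass(LibraryClass libraryClass);
    }

    interface Clazz {
        String getName();
        void accept(ClassVisitor visitor);
    }

    static final class ProgramClass implements Clazz {
        private final String name;
        ProgramClass(String name) { this.name = name; }
        public String getName() { return name; }
        // The concrete type picks the callback, so the pool never needs instanceof.
        public void accept(ClassVisitor visitor) { visitor.visitProgramClass(this); }
    }

    static final class LibraryClass implements Clazz {
        private final String name;
        LibraryClass(String name) { this.name = name; }
        public String getName() { return name; }
        public void accept(ClassVisitor visitor) { visitor.visitLibraryClass(this); }
    }

    static final class ClassPool {
        private final Map<String, Clazz> classes = new LinkedHashMap<>();
        void addClass(Clazz clazz) { classes.put(clazz.getName(), clazz); }
        // Hands every class in the pool to the visitor.
        void classesAccept(ClassVisitor visitor) {
            for (Clazz clazz : classes.values()) {
                clazz.accept(visitor);
            }
        }
    }

    // Hypothetical helper: records which callback each class triggered.
    static List<String> trace(ClassPool pool) {
        final List<String> log = new ArrayList<>();
        pool.classesAccept(new ClassVisitor() {
            public void visitProgramClass(ProgramClass c) { log.add("program:" + c.getName()); }
            public void visitLibraryClass(LibraryClass c) { log.add("library:" + c.getName()); }
        });
        return log;
    }

    public static void main(String[] args) {
        ClassPool pool = new ClassPool();
        pool.addClass(new ProgramClass("com/nls/demo/Demo"));
        pool.addClass(new LibraryClass("java/lang/Object"));
        System.out.println(trace(pool));
    }
}
```

The point of the pattern is that new passes (hierarchy linking, reference linking, shrinking) can be added as new visitors without touching the class model.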
For a ProgramClass, the ClassVisitor's visitProgramClass method is called back, while for a LibraryClass it is visitLibraryClass. Here we only analyze ProgramClass. ClassSuperHierarchyInitializer implements the ClassVisitor interface, and its visitProgramClass method looks like this:
public void visitProgramClass(ProgramClass programClass)
{
// Link to the super class.
programClass.superClassConstantAccept(this);
// Link to the interfaces.
programClass.interfaceConstantsAccept(this);
}
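In the class file, the constant pool is followed by the access flags, and the access flags are followed by the class index, the superclass index, and the interface indexes. These indexes point to the corresponding entries in the constant pool, which here are ClassConstant objects.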
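To make that layout concrete, here is a small sketch that reads the fields that follow the constant pool: access_flags, this_class, super_class, interfaces_count and the interface indexes, each an unsigned 16-bit value as defined by the JVM class file format. The class name ClassHeaderSketch and the hand-written sample bytes are assumptions for illustration; this is not how ProGuard's reader is organized.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical holder for the u2 fields that follow the constant pool in a class file.
class ClassHeaderSketch {
    final int accessFlags;
    final int thisClassIndex;
    final int superClassIndex;   // 0 only for java/lang/Object
    final int[] interfaceIndexes;

    ClassHeaderSketch(int accessFlags, int thisClassIndex, int superClassIndex, int[] interfaceIndexes) {
        this.accessFlags = accessFlags;
        this.thisClassIndex = thisClassIndex;
        this.superClassIndex = superClassIndex;
        this.interfaceIndexes = interfaceIndexes;
    }

    // Assumes the input starts right after the constant pool.
    static ClassHeaderSketch parse(byte[] bytesAfterConstantPool) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytesAfterConstantPool))) {
            int accessFlags = in.readUnsignedShort();
            int thisClass = in.readUnsignedShort();
            int superClass = in.readUnsignedShort();
            int interfacesCount = in.readUnsignedShort();
            int[] interfaces = new int[interfacesCount];
            for (int i = 0; i < interfacesCount; i++) {
                interfaces[i] = in.readUnsignedShort();
            }
            return new ClassHeaderSketch(accessFlags, thisClass, superClass, interfaces);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample: flags 0x0021, this_class #2, super_class #4, one interface at #6.
        byte[] bytes = {0x00, 0x21, 0x00, 0x02, 0x00, 0x04, 0x00, 0x01, 0x00, 0x06};
        ClassHeaderSketch header = parse(bytes);
        System.out.println("super_class index = " + header.superClassIndex);
    }
}
```

Each index is only a number at this stage. Turning it into a link to another class is exactly the job of the initializers discussed in this section.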
public void superClassConstantAccept(ConstantVisitor constantVisitor)
{
if (u2superClass != 0)
{
constantPool[u2superClass].accept(this, constantVisitor);
}
}
public void interfaceConstantsAccept(ConstantVisitor constantVisitor)
{
for (int index = 0; index < u2interfacesCount; index++)
{
constantPool[u2interfaces[index]].accept(this, constantVisitor);
}
}
//ClassConstant
public void accept(Clazz clazz, ConstantVisitor constantVisitor)
{
constantVisitor.visitClassConstant(clazz, this);
}
u2superClass points to a ClassConstant object in the constant pool. Its accept method calls back the visitClassConstant method of the ConstantVisitor interface, which ClassSuperHierarchyInitializer also implements:
// Implementations for ConstantVisitor.
public void visitClassConstant(Clazz clazz, ClassConstant classConstant)
{
classConstant.referencedClass =
findClass(clazz.getName(), classConstant.getName(clazz));
}
In the class bytecode itself, a class constant in the constant pool has no referencedClass field. ProGuard adds it in order to index the dependency chain. The findClass method is implemented as follows:
private Clazz findClass(String referencingClassName, String name)
{
// First look for the class in the program class pool.
Clazz clazz = programClassPool.getClass(name);
// Otherwise look for the class in the library class pool.
if (clazz == null)
{
clazz = libraryClassPool.getClass(name);
// some code omitted here....
}
return clazz;
}
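First, ClassConstant's getName method returns the name of the class that the ClassConstant points to (getName itself ultimately reads a string constant from the constant pool). findClass then looks up the class object by that name in programClassPool or libraryClassPool, and the result is assigned to the ClassConstant's referencedClass field, which establishes the dependency between the two classes. After superClassConstantAccept(this) returns, interfaceConstantsAccept(this) is called to index the interface dependencies. The process is analogous, so we won't analyze it again. At this point the superclass and interface dependencies have been indexed.
Referenced class dependency lookup
One point to note: importing a class at the top of a file does not by itself bring in a dependency on that class. A dependency exists only when one class actually uses another; whether there is an import does not matter. Classes in the same package, for example, need no import at all.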
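The linking step described above can be sketched as follows. This is a simplified model under stated assumptions: ClassNode and the map-based pools are hypothetical stand-ins, and only the lookup order (program pool first, then library pool) and the referencedClass assignment mirror the code shown earlier.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: each class stores its super/interface names as class constants.
class SuperLinkSketch {

    static final class ClassConstant {
        final String name;
        ClassNode referencedClass; // filled in by link(), like ProGuard's extra field
        ClassConstant(String name) { this.name = name; }
    }

    static final class ClassNode {
        final String name;
        final ClassConstant superConstant; // null when there is no superclass
        final ClassConstant[] interfaceConstants;

        ClassNode(String name, String superName, String... interfaceNames) {
            this.name = name;
            this.superConstant = superName == null ? null : new ClassConstant(superName);
            this.interfaceConstants = new ClassConstant[interfaceNames.length];
            for (int i = 0; i < interfaceNames.length; i++) {
                this.interfaceConstants[i] = new ClassConstant(interfaceNames[i]);
            }
        }
    }

    // Program classes win over library classes with the same name.
    static ClassNode findClass(Map<String, ClassNode> programPool,
                               Map<String, ClassNode> libraryPool,
                               String name) {
        ClassNode found = programPool.get(name);
        return found != null ? found : libraryPool.get(name);
    }

    // Resolves the superclass constant first, then every interface constant.
    static void link(ClassNode node,
                     Map<String, ClassNode> programPool,
                     Map<String, ClassNode> libraryPool) {
        if (node.superConstant != null) {
            node.superConstant.referencedClass =
                findClass(programPool, libraryPool, node.superConstant.name);
        }
        for (ClassConstant interfaceConstant : node.interfaceConstants) {
            interfaceConstant.referencedClass =
                findClass(programPool, libraryPool, interfaceConstant.name);
        }
    }

    public static void main(String[] args) {
        Map<String, ClassNode> program = new HashMap<>();
        Map<String, ClassNode> library = new HashMap<>();
        library.put("java/lang/Object", new ClassNode("java/lang/Object", null));
        program.put("com/nls/demo/ParentClass", new ClassNode("com/nls/demo/ParentClass", "java/lang/Object"));
        program.put("com/nls/demo/Interface1", new ClassNode("com/nls/demo/Interface1", "java/lang/Object"));
        ClassNode demo = new ClassNode("com/nls/demo/Demo", "com/nls/demo/ParentClass", "com/nls/demo/Interface1");
        program.put(demo.name, demo);

        link(demo, program, library);
        System.out.println("Demo extends " + demo.superConstant.referencedClass.name);
    }
}
```

A name that is in neither pool simply leaves referencedClass as null in this sketch; the real code has warning printers for that case, which are part of the omitted code.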
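After the superclass and interface dependencies have been indexed, the next step is to index the classes that the code itself depends on. This is done by ClassReferenceInitializer, as follows: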
programClassPool.classesAccept(
new ClassReferenceInitializer(programClassPool,
libraryClassPool,
classReferenceWarningPrinter,
programMemberReferenceWarningPrinter,
libraryMemberReferenceWarningPrinter,
null));
ClassReferenceInitializer implements the ClassVisitor interface, so it can visit every class in the ClassPool. Its visitProgramClass (here we only analyze program classes, not library class dependencies) is implemented as follows:
public void visitProgramClass(ProgramClass programClass)
{
// Initialize the constant pool entries.
programClass.constantPoolEntriesAccept(this);
// Initialize all fields and methods.
programClass.fieldsAccept(this);
programClass.methodsAccept(this);
// Initialize the attributes.
programClass.attributesAccept(this);
}
- Constant pool analysis
The first step iterates over the constant pool and analyzes the dependencies of each entry. constantPoolEntriesAccept is implemented as follows:
public void constantPoolEntriesAccept(ConstantVisitor constantVisitor)
{
for (int index = 1; index < u2constantPoolCount; index++)
{
if (constantPool[index] != null)
{
constantPool[index].accept(this, constantVisitor);
}
}
}
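Two details of that loop are easy to miss: it starts at index 1, and it skips null slots. In the class file format, constant pool index 0 is unused, and a long or double constant occupies two consecutive indexes, so the second slot has no entry of its own. The sketch below illustrates this with a plain Object array; the sample pool contents are made up, and representing the unusable slot as null is an assumption suggested by the null check above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical walk over a constant pool modelled as a plain array.
class ConstantPoolWalkSketch {

    // Visits every usable entry, skipping index 0 and the empty slot after a long/double.
    static List<String> visitEntries(Object[] constantPool, int constantPoolCount) {
        List<String> visited = new ArrayList<>();
        for (int index = 1; index < constantPoolCount; index++) {
            if (constantPool[index] != null) {
                visited.add(index + ":" + constantPool[index]);
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Object[] pool = {
            null,             // #0 is never used
            "Utf8 Demo",      // #1
            Long.valueOf(42), // #2, a long also occupies #3
            null,             // #3 unusable second slot
            "Class #1"        // #4
        };
        System.out.println(visitEntries(pool, pool.length));
    }
}
```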
ClassReferenceInitializer also implements the ConstantVisitor interface, which is defined as follows:
public interface ConstantVisitor
{
public void visitIntegerConstant( Clazz clazz, IntegerConstant integerConstant);
public void visitLongConstant( Clazz clazz, LongConstant longConstant);
public void visitFloatConstant( Clazz clazz, FloatConstant floatConstant);
public void visitDoubleConstant( Clazz clazz, DoubleConstant doubleConstant);
public void visitStringConstant( Clazz clazz, StringConstant stringConstant);
public void visitUtf8Constant( Clazz clazz, Utf8Constant utf8Constant);
public void visitInvokeDynamicConstant( Clazz clazz, InvokeDynamicConstant invokeDynamicConstant);
public void visitMethodHandleConstant( Clazz clazz, MethodHandleConstant methodHandleConstant);
public void visitFieldrefConstant( Clazz clazz, FieldrefConstant fieldrefConstant);
public void visitInterfaceMethodrefConstant(Clazz clazz, InterfaceMethodrefConstant interfaceMethodrefConstant);
public void visitMethodrefConstant( Clazz clazz, MethodrefConstant methodrefConstant);
public void visitClassConstant( Clazz clazz, ClassConstant classConstant);
public void visitMethodTypeConstant( Clazz clazz, MethodTypeConstant methodTypeConstant);
public void visitNameAndTypeConstant( Clazz clazz, NameAndTypeConstant nameAndTypeConstant);
}
In essence, every kind of constant pool entry has a corresponding visit method that handles it. Here we focus only on FieldrefConstant, MethodrefConstant, and ClassConstant, which correspond to the CONSTANT_Fieldref_info, CONSTANT_Methodref_info, and CONSTANT_Class_info structures in the constant pool.
CONSTANT_Class_info
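In ProGuard, a class reference constant corresponds to the ClassConstant structure. The structure is simple: it has just one index, which points to the name of the referenced class.
Because the structure is simple, resolving it is simple too. The code is as follows: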
public void visitClassConstant(Clazz clazz, ClassConstant classConstant)
{
// Fill out the referenced class.
classConstant.referencedClass =
findClass(clazz, ClassUtil.internalClassNameFromClassType(classConstant.getName(clazz)));
// Fill out the Class class.
classConstant.javaLangClassClass =
findClass(clazz, ClassConstants.NAME_JAVA_LANG_CLASS);
}
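Roughly: get the class name that the class constant refers to, find that class in the ClassPool, and assign it to the referencedClass field, which establishes the dependency between the two classes. To find out which classes the current class depends on, it is enough to iterate over all class constants in the constant pool and read their referencedClass fields (this is how the AGP 3.x source code does it).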
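Collecting a class's dependencies from its resolved class constants could look like the sketch below. ClassInfo, ClassConstant and dependenciesOf are hypothetical names for illustration, and skipping the self-reference is a choice made for this example; this is not the AGP or ProGuard implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of a class whose class constants have already been resolved.
class DependencyCollectorSketch {

    static final class ClassInfo {
        final String name;
        final List<Object> constantPool = new ArrayList<>();
        ClassInfo(String name) { this.name = name; }
    }

    static final class ClassConstant {
        final String name;
        final ClassInfo referencedClass; // null when the class was not found in any pool
        ClassConstant(String name, ClassInfo referencedClass) {
            this.name = name;
            this.referencedClass = referencedClass;
        }
    }

    // Returns the names of all resolved classes referenced from the constant pool,
    // in first-seen order, leaving out the class's reference to itself.
    static Set<String> dependenciesOf(ClassInfo clazz) {
        Set<String> dependencies = new LinkedHashSet<>();
        for (Object entry : clazz.constantPool) {
            if (entry instanceof ClassConstant) {
                ClassInfo referenced = ((ClassConstant) entry).referencedClass;
                if (referenced != null && referenced != clazz) {
                    dependencies.add(referenced.name);
                }
            }
        }
        return dependencies;
    }

    public static void main(String[] args) {
        ClassInfo demo = new ClassInfo("com/nls/demo/Demo");
        ClassInfo parent = new ClassInfo("com/nls/demo/ParentClass");
        ClassInfo test = new ClassInfo("com/nls/demo/Test");
        demo.constantPool.add(new ClassConstant(demo.name, demo));
        demo.constantPool.add("some other kind of constant");
        demo.constantPool.add(new ClassConstant(parent.name, parent));
        demo.constantPool.add(new ClassConstant(test.name, test));
        System.out.println(dependenciesOf(demo));
    }
}
```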
CONSTANT_Methodref_info
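In ProGuard, field and method reference constants are all subclasses of RefConstant. Like a class constant, they have an index that points to the name of the class that provides the method or field. They also have a second index, which points to the name and type of the method or field.
ClassReferenceInitializer's visitAnyRefConstant method resolves these reference constants. It is implemented as follows: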
public void visitAnyRefConstant(Clazz clazz, RefConstant refConstant)
{
// Some code has been omitted; only the key parts are kept....
String className = refConstant.getClassName(clazz);
Clazz referencedClass = findClass(clazz, className);
if (referencedClass != null)
{
String name = refConstant.getName(clazz);
String type = refConstant.getType(clazz);
boolean isFieldRef = refConstant.getTag() == ClassConstants.CONSTANT_Fieldref;
// See if we can find the referenced class member somewhere in the
// hierarchy.
refConstant.referencedMember = memberFinder.findMember(clazz,
referencedClass,
name,
type,
isFieldRef);
refConstant.referencedClass = memberFinder.correspondingClass();
}
}
First, getClassName returns the name of the class that provides the method or field. findClass then finds the corresponding class object. Finally, memberFinder walks the methods or fields of that class to find the matching one. It is implemented as follows:
public Member findMember(Clazz referencingClass,
Clazz clazz,
String name,
String descriptor,
boolean isField)
{
// The interesting part: while the methods and fields of clazz are being visited, a match calls back visitAnyMember,
// which stores the result and throws an exception to tell the caller that the search is over.
try
{
this.clazz = null;
this.member = null;
clazz.hierarchyAccept(true, true, true, false, isField ?
(ClassVisitor)new NamedFieldVisitor(name, descriptor,
new MemberClassAccessFilter(referencingClass, this)) :
(ClassVisitor)new NamedMethodVisitor(name, descriptor,
new MemberClassAccessFilter(referencingClass, this)));
}
catch (MemberFoundException ex)
{
// We've found the member we were looking for.
}
return member;
}
public void visitAnyMember(Clazz clazz, Member member)
{
this.clazz = clazz;
this.member = member;
throw MEMBER_FOUND;
}
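The interesting bit of logic here is that, while the methods or fields of clazz are being visited, a match is signaled by throwing an exception, which ends the search. Field lookup and method lookup are nearly identical, so we only analyze method lookup.
Depending on its arguments, hierarchyAccept can visit the class itself, its superclasses, its interfaces, its subclasses, and so on. NamedMethodVisitor implements the ClassVisitor interface and is used to visit clazz. The code is as follows: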
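The "throw to stop a visitor traversal" idea can be shown on its own. In the sketch below, Finder, ItemVisitor and FoundException are hypothetical names; the exception is preallocated, as MEMBER_FOUND appears to be in the snippet above. Disabling the stack trace in the constructor is a choice made for this example, not something the source confirms about ProGuard.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo of using a preallocated exception to end a visitor traversal early.
class EarlyExitSketch {

    // No message, no cause, no suppression, no stack trace: cheap to throw repeatedly.
    static final class FoundException extends RuntimeException {
        FoundException() { super(null, null, false, false); }
    }

    private static final FoundException FOUND = new FoundException();

    interface ItemVisitor {
        void visit(String item);
    }

    // The traversal itself has no way to be told "stop"; it just visits everything.
    static void visitAll(List<String> items, ItemVisitor visitor) {
        for (String item : items) {
            visitor.visit(item);
        }
    }

    static final class Finder implements ItemVisitor {
        private final String wanted;
        String found;
        int visited;

        Finder(String wanted) { this.wanted = wanted; }

        public void visit(String item) {
            visited++;
            if (item.equals(wanted)) {
                found = item; // store the result first...
                throw FOUND;  // ...then unwind out of the traversal
            }
        }

        String find(List<String> items) {
            found = null;
            visited = 0;
            try {
                visitAll(items, this);
            } catch (FoundException expected) {
                // The search ended early because a match was stored.
            }
            return found;
        }
    }

    public static void main(String[] args) {
        Finder finder = new Finder("print()V");
        String result = finder.find(Arrays.asList("test()V", "print()V", "toString()Ljava/lang/String;"));
        System.out.println(result + " after visiting " + finder.visited + " members");
    }
}
```

The trade-off is that the traversal API stays simple (visit methods return nothing), at the cost of using an exception for ordinary control flow.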
public NamedMethodVisitor(String name,
String descriptor,
MemberVisitor memberVisitor)
{
this.name = name;
this.descriptor = descriptor;
this.memberVisitor = memberVisitor;
}
// Implementations for ClassVisitor.
public void visitProgramClass(ProgramClass programClass)
{
programClass.methodAccept(name, descriptor, memberVisitor);
}
NamedMethodVisitor is simple: it is constructed with the method name, the descriptor, and a member visitor, and in visitProgramClass it calls ProgramClass's methodAccept to find the matching method. The memberVisitor here is the MemberClassAccessFilter object constructed by the caller, which performs some simple access permission checks. The code is as follows:
public void visitProgramMethod(ProgramClass programClass, ProgramMethod programMethod)
{
if (accepted(programClass, programMethod.getAccessFlags()))
{
memberVisitor.visitProgramMethod(programClass, programMethod);
}
}
private boolean accepted(Clazz clazz, int memberAccessFlags)
{
int accessLevel = AccessUtil.accessLevel(memberAccessFlags);
return
(accessLevel >= AccessUtil.PUBLIC ) ||
(accessLevel >= AccessUtil.PRIVATE && referencingClass.equals(clazz) ) ||
(accessLevel >= AccessUtil.PACKAGE_VISIBLE && (ClassUtil.internalPackageName(referencingClass.getName()).equals(
ClassUtil.internalPackageName(clazz.getName())))) ||
(accessLevel >= AccessUtil.PROTECTED && (referencingClass.extends_(clazz) ||
referencingClass.extendsOrImplements(clazz)) );
}
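The accepted check above reads more easily with a few concrete cases. The sketch below mirrors its four clauses using hypothetical types: Type stands in for Clazz, the four level constants are assumed to be ordered from PRIVATE (lowest) to PUBLIC (highest), and extendsType only follows superclasses, which is a simplification.

```java
// Hypothetical re-creation of the access filter's decision table.
class AccessCheckSketch {

    // Assumed ordering: a higher value means more widely accessible.
    static final int PRIVATE = 0;
    static final int PACKAGE_VISIBLE = 1;
    static final int PROTECTED = 2;
    static final int PUBLIC = 3;

    static final class Type {
        final String name;    // internal name, e.g. "com/nls/demo/Demo"
        final Type superType; // null at the top of the hierarchy

        Type(String name, Type superType) {
            this.name = name;
            this.superType = superType;
        }

        String packageName() {
            int lastSlash = name.lastIndexOf('/');
            return lastSlash < 0 ? "" : name.substring(0, lastSlash);
        }

        boolean extendsType(Type other) {
            for (Type t = this; t != null; t = t.superType) {
                if (t == other) {
                    return true;
                }
            }
            return false;
        }
    }

    // True if a member of 'declaring' with the given access level is visible from 'referencing'.
    static boolean accepted(Type referencing, Type declaring, int accessLevel) {
        return accessLevel >= PUBLIC
            || (accessLevel >= PRIVATE && referencing == declaring)
            || (accessLevel >= PACKAGE_VISIBLE
                && referencing.packageName().equals(declaring.packageName()))
            || (accessLevel >= PROTECTED && referencing.extendsType(declaring));
    }

    public static void main(String[] args) {
        Type base = new Type("a/Base", null);
        Type sub = new Type("b/Sub", base);
        Type other = new Type("b/Other", null);
        System.out.println("protected from subclass: " + accepted(sub, base, PROTECTED));
        System.out.println("protected from stranger: " + accepted(other, base, PROTECTED));
    }
}
```

Because each clause uses >=, a more permissive level automatically satisfies the weaker conditions too: a protected member also passes the same-package test.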
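Finally, the method that was found is passed on to MemberFinder's visitProgramMethod method, which ends up in visitAnyMember. MemberFinder stores the result and then throws an exception to signal that the search is over. The code is as follows: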
public void visitAnyMember(Clazz clazz, Member member)
{
this.clazz = clazz;
this.member = member;
throw MEMBER_FOUND;
}
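Finally, the method or field that was found is assigned to referencedMember, and the corresponding class object is assigned to referencedClass. This way, both the methods and fields that the current class depends on and the classes that provide them are indexed.
- Class method and field analysis
Back in ClassReferenceInitializer's visitProgramClass: after the constant pool has been analyzed, the fields and methods of the class itself are analyzed next: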
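One consequence worth spelling out: the class named in a method reference is not necessarily the class that declares the method, since the lookup walks the hierarchy and stores the class in which the member was actually found. The sketch below illustrates this with hypothetical types (Type, MethodRef, resolve); it only follows superclasses and ignores interfaces and access checks.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of resolving a method reference through the superclass chain.
class MemberResolutionSketch {

    static final class Type {
        final String name;
        final Type superType;
        final Set<String> methods; // each entry is name + descriptor, e.g. "print()V"

        Type(String name, Type superType, String... methods) {
            this.name = name;
            this.superType = superType;
            this.methods = new HashSet<>(Arrays.asList(methods));
        }
    }

    static final class MethodRef {
        final String className;
        final String name;
        final String descriptor;
        Type referencedClass;    // the class that actually declares the method
        String referencedMember; // the resolved method, as name + descriptor

        MethodRef(String className, String name, String descriptor) {
            this.className = className;
            this.name = name;
            this.descriptor = descriptor;
        }
    }

    // Starts at the class named in the reference and walks up until the method is found.
    static void resolve(MethodRef ref, Map<String, Type> pool) {
        String key = ref.name + ref.descriptor;
        for (Type type = pool.get(ref.className); type != null; type = type.superType) {
            if (type.methods.contains(key)) {
                ref.referencedMember = key;
                ref.referencedClass = type;
                return;
            }
        }
    }

    public static void main(String[] args) {
        Type base = new Type("com/nls/demo/Base", null, "print()V");
        Type test = new Type("com/nls/demo/Test", base);
        Map<String, Type> pool = new HashMap<>();
        pool.put(base.name, base);
        pool.put(test.name, test);

        MethodRef ref = new MethodRef("com/nls/demo/Test", "print", "()V");
        resolve(ref, pool);
        System.out.println(ref.referencedMember + " is provided by " + ref.referencedClass.name);
    }
}
```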
public void visitProgramClass(ProgramClass programClass)
{
// Initialize the constant pool entries.
programClass.constantPoolEntriesAccept(this);
// Initialize all fields and methods.
programClass.fieldsAccept(this);
programClass.methodsAccept(this);
}
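Field and method handling are similar, so we only analyze methods here. The main reason for analyzing the class's own methods is that method parameters may bring in other classes, as in the following test code: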
class MyClass {
fun test(test: TestClass?) {
//test?.test1()
}
}
The test method brings in a dependency on TestClass.
ClassReferenceInitializer implements the MemberVisitor interface. Methods are handled in visitProgramMethod, which is implemented as follows:
public void visitProgramMethod(ProgramClass programClass, ProgramMethod programMethod)
{
programMethod.referencedClasses =
findReferencedClasses(programClass,
programMethod.getDescriptor(programClass));
}
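findReferencedClasses resolves the classes referenced by the method's descriptor, and the result is assigned to referencedClasses, which establishes the method's dependency on a set of classes. findReferencedClasses is implemented as follows: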
private Clazz[] findReferencedClasses(Clazz referencingClass,
String descriptor)
{
DescriptorClassEnumeration enumeration =
new DescriptorClassEnumeration(descriptor);
int classCount = enumeration.classCount();
if (classCount > 0)
{
Clazz[] referencedClasses = new Clazz[classCount];
boolean foundReferencedClasses = false;
for (int index = 0; index < classCount; index++)
{
String fluff = enumeration.nextFluff();
String name = enumeration.nextClassName();
Clazz referencedClass = findClass(referencingClass, name);
if (referencedClass != null)
{
referencedClasses[index] = referencedClass;
foundReferencedClasses = true;
}
}
if (foundReferencedClasses)
{
return referencedClasses;
}
}
return null;
}
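The core parsing happens in DescriptorClassEnumeration, which is essentially a string parsing operation. For the test method in the test code above, the descriptor is (Lcom/example/lib/TestClass;)V, and DescriptorClassEnumeration parses that string to extract the class com/example/lib/TestClass. Finally, the by now familiar findClass looks up the class object in the ClassPool by the given class name.
- Attribute analysis
After the methods and fields come the attributes. There are many kinds of attributes, so we won't go through them one by one here.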
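The string parsing involved can be sketched in a few lines. This is a simplified stand-in for what DescriptorClassEnumeration does, not its implementation: the classNames helper is hypothetical, and it only handles plain (non-generic) descriptors, where every class type has the form L, then an internal class name, then a semicolon.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified extraction of class names from a JVM descriptor.
class DescriptorSketch {

    // Returns the internal class names in order of appearance (parameters first, then return type).
    static List<String> classNames(String descriptor) {
        List<String> names = new ArrayList<>();
        int index = 0;
        while (index < descriptor.length()) {
            if (descriptor.charAt(index) == 'L') {
                // A class type runs from 'L' to the next ';'.
                int end = descriptor.indexOf(';', index);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated class type in " + descriptor);
                }
                names.add(descriptor.substring(index + 1, end));
                index = end + 1;
            } else {
                // Primitive codes, '[', '(' and ')' carry no class name.
                index++;
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(classNames("(Lcom/example/lib/TestClass;)V"));
        System.out.println(classNames("(I[Ljava/lang/String;J)Ljava/util/List;"));
    }
}
```

Each extracted name would then be passed to a lookup such as findClass, so one descriptor can contribute several dependencies at once.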
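Summary
This section looked at the source code to see how ProGuard indexes the dependencies between the classes in the code. With the dependencies in hand, ProGuard knows which classes or code are in use and which are not used by any class and can be deleted outright. With this key information, the next step, shrinking and optimization, can proceed.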