This article describes a method that effectively resolves WebView multi-process crashes.
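"Liangzhou Ci"
The Yellow River climbs far into the white clouds; a lone fortress stands among mountains ten thousand ren high.
Why should the Qiang flute lament the willows? The spring wind never crosses Yumen Pass.
- Wang Zhihuan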
Problem
On Android 9.0, if more than one process uses WebView, each child process must call the official API to set a suffix for its WebView data directory:
WebView.setDataDirectorySuffix(suffix);
Otherwise the following error is thrown:
Using WebView from more than one process at once with the same data directory is not supported. https://crbug.com/558377
com.android.webview.chromium.WebViewChromiumAwInit.startChromiumLocked(WebViewChromiumAwInit.java:63)
com.android.webview.chromium.WebViewChromiumAwInitForP.startChromiumLocked(WebViewChromiumAwInitForP.java:3)
com.android.webview.chromium.WebViewChromiumAwInit$3.run(WebViewChromiumAwInit.java:3)
android.os.Handler.handleCallback(Handler.java:873)
android.os.Handler.dispatchMessage(Handler.java:99)
android.os.Looper.loop(Looper.java:220)
android.app.ActivityThread.main(ActivityThread.java:7437)
java.lang.reflect.Method.invoke(Native Method)
com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:500)
com.android.internal.os.ZygoteInit.main(ZygoteInit.java:865)
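Applying the official method removed only part of the problem. The Bugly dashboard still received a large number of crash reports for this error, enough to push it into the top 3 crashes.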
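To make the effect of the suffix concrete, here is a minimal plain-Java sketch of how a process name maps to a data directory name. The class and method names are hypothetical, and the "app_webview_<suffix>" naming is an assumption taken from the path this article builds in its fix, not from official documentation.

```java
final class WebViewDataDirNames {
    /** Suffix for a process: empty for the main process, otherwise the process name. */
    static String suffixFor(String packageName, String processName) {
        if (packageName.equals(processName)) {
            return ""; // main process keeps the default directory
        }
        // Fall back to the package name if the process name could not be determined.
        return (processName == null || processName.isEmpty()) ? packageName : processName;
    }

    /** Directory name under the app data dir (naming assumed from the article's fix). */
    static String dataDirName(String suffix) {
        return suffix.isEmpty() ? "app_webview" : "app_webview_" + suffix;
    }

    public static void main(String[] args) {
        String pkg = "com.example.app"; // hypothetical package name
        System.out.println(dataDirName(suffixFor(pkg, pkg)));
        System.out.println(dataDirName(suffixFor(pkg, pkg + ":web")));
    }
}
```

Each process that gets a distinct suffix ends up with its own directory, and therefore its own webview_data.lock file.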
Problem analysis
Following the call chain through the source code shows that it ends in the lock method of the AwDataDirLock class.
public class WebViewChromiumAwInit {
    protected void startChromiumLocked() {
        ...
        AwBrowserProcess.start();
        ...
    }
}
public final class AwBrowserProcess {
    public static void start() {
        ...
        AwDataDirLock.lock(appContext);
    }
}
AwDataDirLock.java
abstract class AwDataDirLock {
    private static final String TAG = "AwDataDirLock";
    private static final String EXCLUSIVE_LOCK_FILE = "webview_data.lock";
    // This results in a maximum wait time of 1.5s
    private static final int LOCK_RETRIES = 16;
    private static final int LOCK_SLEEP_MS = 100;
    private static RandomAccessFile sLockFile;
    private static FileLock sExclusiveFileLock;
    static void lock(final Context appContext) {
        try (ScopedSysTraceEvent e1 = ScopedSysTraceEvent.scoped("AwDataDirLock.lock");
                StrictModeContext ignored = StrictModeContext.allowDiskWrites()) {
            if (sExclusiveFileLock != null) {
                // We have already called lock() and successfully acquired the lock in this process.
                // This shouldn't happen, but is likely to be the result of an app catching an
                // exception thrown during initialization and discarding it, causing us to later
                // attempt to initialize WebView again. There's no real advantage to failing the
                // locking code when this happens; we may as well count this as the lock being
                // acquired and let init continue (though the app may experience other problems
                // later).
                return;
            }
            // If we already called lock() but didn't succeed in getting the lock, it's possible the
            // app caught the exception and tried again later. As above, there's no real advantage
            // to failing here, so only open the lock file if we didn't already open it before.
            if (sLockFile == null) {
                String dataPath = PathUtils.getDataDirectory();
                File lockFile = new File(dataPath, EXCLUSIVE_LOCK_FILE);
                try {
                    // Note that the file is kept open intentionally.
                    sLockFile = new RandomAccessFile(lockFile, "rw");
                } catch (IOException e) {
                    // Failing to create the lock file is always fatal; even if multiple processes
                    // are using the same data directory we should always be able to access the file
                    // itself.
                    throw new RuntimeException("Failed to create lock file " + lockFile, e);
                }
            }
            // Android versions before 11 have edge cases where a new instance of an app process can
            // be started while an existing one is still in the process of being killed. This can
            // still happen on Android 11+ because the platform has a timeout for waiting, but it's
            // much less likely. Retry the lock a few times to give the old process time to fully go
            // away.
            for (int attempts = 1; attempts <= LOCK_RETRIES; ++attempts) {
                try {
                    sExclusiveFileLock = sLockFile.getChannel().tryLock();
                } catch (IOException e) {
                    // Older versions of Android incorrectly throw IOException when the flock()
                    // call fails with EAGAIN, instead of returning null. Just ignore it.
                }
                if (sExclusiveFileLock != null) {
                    // We got the lock; write out info for debugging.
                    writeCurrentProcessInfo(sLockFile);
                    return;
                }
                // If we're not out of retries, sleep and try again.
                if (attempts == LOCK_RETRIES) break;
                try {
                    Thread.sleep(LOCK_SLEEP_MS);
                } catch (InterruptedException e) {
                }
            }
            // We failed to get the lock even after retrying.
            // Many existing apps rely on this even though it's known to be unsafe.
            // Make it fatal when on P for apps that target P or higher
            String error = getLockFailureReason(sLockFile);
            boolean dieOnFailure = Build.VERSION.SDK_INT >= Build.VERSION_CODES.P
                    && appContext.getApplicationInfo().targetSdkVersion >= Build.VERSION_CODES.P;
            if (dieOnFailure) {
                throw new RuntimeException(error);
            } else {
                Log.w(TAG, error);
            }
        }
    }
    private static void writeCurrentProcessInfo(final RandomAccessFile file) {
        try {
            // Truncate the file first to get rid of old data.
            file.setLength(0);
            file.writeInt(Process.myPid());
            file.writeUTF(ContextUtils.getProcessName());
        } catch (IOException e) {
            // Don't crash just because something failed here, as it's only for debugging.
            Log.w(TAG, "Failed to write info to lock file", e);
        }
    }
    private static String getLockFailureReason(final RandomAccessFile file) {
        final StringBuilder error = new StringBuilder("Using WebView from more than one process at "
                + "once with the same data directory is not supported. https://crbug.com/558377 "
                + ": Current process ");
        error.append(ContextUtils.getProcessName());
        error.append(" (pid ").append(Process.myPid()).append("), lock owner ");
        try {
            int pid = file.readInt();
            String processName = file.readUTF();
            error.append(processName).append(" (pid ").append(pid).append(")");
            // Check the status of the pid holding the lock by sending it a null signal.
            // This doesn't actually send a signal, just runs the kernel access checks.
            try {
                Os.kill(pid, 0);
                // No exception means the process exists and has the same uid as us, so is
                // probably an instance of the same app. Leave the message alone.
            } catch (ErrnoException e) {
                if (e.errno == OsConstants.ESRCH) {
                    // pid did not exist - the lock should have been released by the kernel,
                    // so this process info is probably wrong.
                    error.append(" doesn't exist!");
                } else if (e.errno == OsConstants.EPERM) {
                    // pid existed but didn't have the same uid as us.
                    // Most likely the pid has just been recycled for a new process
                    error.append(" pid has been reused!");
                } else {
                    // EINVAL is the only other documented return value for kill(2) and should never
                    // happen for signal 0, so just complain generally.
                    error.append(" status unknown!");
                }
            }
        } catch (IOException e) {
            // We'll get IOException if we failed to read the pid and process name; e.g. if the
            // lockfile is from an old version of WebView or an IO error occurred somewhere.
            error.append(" unknown");
        }
        return error.toString();
    }
}
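The lock method tries up to 16 times, in a for loop, to lock the webview_data.lock file in the WebView data directory. The comments explain why: in edge cases a new process can be started while an old one is still being killed. Evidently this problem gives Google's engineers a headache too. If locking succeeds, the process ID and process name are written to the file. If it still fails after all retries, an exception is thrown on Android P and above for apps that target P or higher; otherwise only a warning is logged. So on Android 9.0 and above, an app is judged to be sharing a WebView data directory across processes when some process already holds the lock on webview_data.lock in that directory. If a child process then tries to lock the same file, the app crashes.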
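The retry loop described above can be sketched in plain Java, outside Android. This is only an illustration under stated assumptions, not the WebView implementation: the class name, method names, and temp-file setup are hypothetical, and a second file handle in the same JVM stands in for a second process, so the contention shows up as OverlappingFileLockException rather than a null return.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

final class LockRetrySketch {
    /** Tries to lock the file up to maxAttempts times; returns the lock, or null on failure. */
    static FileLock lockWithRetry(RandomAccessFile file, int maxAttempts, long sleepMs)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                FileLock lock = file.getChannel().tryLock();
                if (lock != null) {
                    return lock;
                }
            } catch (IOException | OverlappingFileLockException e) {
                // Treat both as "lock not available right now" and retry.
            }
            // Sleep only between attempts, so N attempts wait at most (N - 1) * sleepMs.
            if (attempt < maxAttempts) {
                Thread.sleep(sleepMs);
            }
        }
        return null;
    }

    /** Returns {firstGotLock, secondGotLock} for two handles on the same temp file. */
    static boolean[] demo() {
        try {
            File f = File.createTempFile("webview_data", ".lock");
            f.deleteOnExit();
            try (RandomAccessFile first = new RandomAccessFile(f, "rw");
                 RandomAccessFile second = new RandomAccessFile(f, "rw")) {
                FileLock held = lockWithRetry(first, 16, 100);
                // The first handle still holds the lock, so this one runs out of retries.
                FileLock contended = lockWithRetry(second, 3, 10);
                return new boolean[] {held != null, contended != null};
            }
        } catch (IOException | InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("first handle got lock: " + result[0]);
        System.out.println("second handle got lock: " + result[1]);
    }
}
```

With 16 attempts and a 100 ms sleep, the loop sleeps 15 times, which matches the 1.5 s maximum wait noted in the source comment.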
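Solution
Most phones now restart an app automatically after it crashes. My guess is that when the system is running slowly, this produces exactly the case described in the comments: a new process starts while the old one is still being killed. Failing to obtain the file lock causes the crash, and the file exists only as a lock for detecting a shared WebView data directory, with the process info rewritten each time the lock is acquired. So at app startup we can try to lock the file ourselves: if locking fails, delete the file and recreate it; if it succeeds, release the lock immediately. When the system then tries to take the lock, it should in theory succeed, which avoids the crash.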
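import android.annotation.TargetApi;
import android.content.Context;
import android.os.Build;
import android.text.TextUtils;
import android.webkit.WebView;
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;

// Call once at app startup, before any WebView is created in this process.
private static void handleWebviewDir(Context context) {
    if (Build.VERSION.SDK_INT < Build.VERSION_CODES.P) {
        return;
    }
    try {
        String suffix = "";
        // getProcessName(context) is the app's own helper (not shown here); on API 28+
        // Application.getProcessName() provides the same information.
        String processName = getProcessName(context);
        if (!TextUtils.equals(context.getPackageName(), processName)) { // not the default (main) process
            suffix = TextUtils.isEmpty(processName) ? context.getPackageName() : processName;
            WebView.setDataDirectorySuffix(suffix);
            suffix = "_" + suffix;
        }
        tryLockOrRecreateFile(context, suffix);
    } catch (Exception e) {
        e.printStackTrace();
    }
}

@TargetApi(Build.VERSION_CODES.P)
private static void tryLockOrRecreateFile(Context context, String suffix) {
    String path = context.getDataDir().getAbsolutePath()
            + "/app_webview" + suffix + "/webview_data.lock";
    File file = new File(path);
    if (!file.exists()) {
        return;
    }
    // try-with-resources closes the file on every path; the original leaked the handle.
    try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
        FileLock lock = raf.getChannel().tryLock();
        if (lock != null) {
            // Nobody holds the lock: release it right away so WebView can take it.
            lock.close();
            return;
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    // The lock is held elsewhere or locking failed: delete the file and recreate it.
    createFile(file, file.delete());
}

private static void createFile(File file, boolean deleted) {
    try {
        if (deleted && !file.exists()) {
            file.createNewFile();
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}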
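Before deleting the lock file, it can help to log which process the file claims as its owner. The sketch below reads the same layout that writeCurrentProcessInfo writes (an int pid followed by a UTF string). It is a plain-Java illustration only: the class name, method names, and the temp-file round trip are hypothetical and not part of the original fix.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

final class LockOwnerInfo {
    /** Mirrors the layout used by writeCurrentProcessInfo: an int pid, then a UTF string. */
    static void write(RandomAccessFile file, int pid, String processName) throws IOException {
        file.setLength(0);
        file.seek(0);
        file.writeInt(pid);
        file.writeUTF(processName);
    }

    /** Returns "name (pid N)", or "unknown" if the file is empty or unreadable. */
    static String read(RandomAccessFile file) {
        try {
            file.seek(0);
            int pid = file.readInt();
            String processName = file.readUTF();
            return processName + " (pid " + pid + ")";
        } catch (IOException e) {
            return "unknown";
        }
    }

    /** Hypothetical demo helper: writes the info to a temp file and reads it back. */
    static String roundTrip(int pid, String processName) {
        try {
            File f = File.createTempFile("webview_data", ".lock");
            f.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
                write(raf, pid, processName);
                return read(raf);
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(1234, "com.example.app:web"));
    }
}
```

Logging this string alongside the crash-avoidance path would show whether the recorded owner is a stale instance of the same app, which is the situation the article suspects.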
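After the app shipped with this fix, the number of crashes from this error dropped by more than 90%. Perhaps Google's engineers should consider a different technical approach for detecting whether an app shares a WebView data directory across processes.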