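In the previous section we walked through the source code of Spark's resource scheduling algorithm. There, the Master sends LaunchDriver and LaunchExecutor messages to the Workers so that the Driver and the Executors are started on them. This section looks in depth at how these two launches work.
1: The Master asks the Worker to start the Driver and the Executors. The messages involved are LaunchDriver and LaunchExecutor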
case LaunchDriver(driverId, driverDesc) => {
  logInfo(s"Asked to launch driver $driverId")
  val driver = new DriverRunner(
    conf,
    driverId,
    workDir,
    sparkHome,
    driverDesc.copy(command = Worker.maybeUpdateSSLSettings(driverDesc.command, conf)),
    self,
    akkaUrl)
  drivers(driverId) = driver
  driver.start()
  coresUsed += driverDesc.cores
  memoryUsed += driverDesc.mem
}
From the code above we can see that the Worker creates a DriverRunner object and then calls driver.start(). The start() method does its work by launching a new thread, as the following code shows:
/** Starts a thread to run and manage the driver. */
def start() = {
  // Create a thread and start it by calling start
  new Thread("DriverRunner for " + driverId) {
    override def run() {
      try {
        // Create the driver's working directory
        val driverDir = createWorkingDirectory()
        // Download the jar uploaded by the user (the application we wrote)
        val localJarFilename = downloadUserJar(driverDi
        def substituteVariables(argument: String): String = argument match {
          case "{{WORKER_URL}}" => workerUrl
          case "{{USER_JAR}}" => localJarFilename
          case other => other
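As you can see, this is an ordinary Java thread (java.lang.Thread). The Spark source code makes heavy use of Java classes, and we will come back to this in later sections. So in our own development, having learned Scala does not mean an Application has to be written entirely in Scala.
In the code above, createWorkingDirectory() is called first to create the working directory. Inside it, driverDir = new File(...) is again the File class from Java: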
private def createWorkingDirectory(): File = {
  val driverDir = new File(workDir, driverId)
  if (!driverDir.exists() && !driverDir.mkdirs()) {
    throw new IOException("Failed to create directory " + driverDir)
  }
  driverDir
}
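The setup steps described so far (a named thread, a per-driver working directory, and placeholder substitution in the command arguments) can be tried outside Spark with a small sketch. This is not the DriverRunner code. The object name DriverSetupSketch, the explicit workDir parameter, the demo arguments and the placeholder values are all assumptions made for illustration.

```scala
import java.io.{File, IOException}
import java.nio.file.Files

object DriverSetupSketch {
  // Same idea as createWorkingDirectory(): one directory per driver id under workDir.
  // workDir is passed in explicitly here (an assumption; in the excerpt it is a field).
  def createWorkingDirectory(workDir: File, driverId: String): File = {
    val driverDir = new File(workDir, driverId)
    if (!driverDir.exists() && !driverDir.mkdirs()) {
      throw new IOException("Failed to create directory " + driverDir)
    }
    driverDir
  }

  // Replace the two known placeholders and leave every other argument untouched.
  def substituteVariables(workerUrl: String, localJar: String)(argument: String): String =
    argument match {
      case "{{WORKER_URL}}" => workerUrl
      case "{{USER_JAR}}"   => localJar
      case other            => other
    }

  def main(args: Array[String]): Unit = {
    // Hypothetical work directory and values, used only for this demo.
    val workDir = Files.createTempDirectory("worker-sketch").toFile
    val runner = new Thread("DriverRunner for driver-demo") {
      override def run(): Unit = {
        val dir = createWorkingDirectory(workDir, "driver-demo")
        val jar = new File(dir, "app.jar").getPath
        val command = Seq("--worker", "{{WORKER_URL}}", "--jar", "{{USER_JAR}}")
          .map(arg => substituteVariables("worker-url-placeholder", jar)(arg))
        println(command.mkString(" "))
      }
    }
    runner.start()
    runner.join() // wait for the setup thread so the demo exits cleanly
  }
}
```

Running the setup on its own thread keeps the caller free to handle other messages while the directory is created and the jar is fetched, which appears to be the reason start() returns immediately in the excerpt.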
Next, the code below creates a ProcessBuilder and uses it to start the driver process:
val builder = CommandUtils.buildProcessBuilder(driverDesc.command, driverDesc.mem,
sparkHome.getAbsolutePath, substituteVariables)
launchDriver(builder, driverDir, driverDesc.supervise)
}
...
val processStart = clock.getTimeMillis()
val exitCode = process.get.waitFor()
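The launch-and-wait step can be sketched with the plain java.lang.ProcessBuilder API. This is a minimal sketch rather than Spark's launchDriver: the object LaunchSketch, the State types, and in particular the mapping from exit code to final state are assumptions for illustration and are not copied from DriverRunner. The demo command simply runs the current JVM's java binary with -version.

```scala
import java.io.File

object LaunchSketch {
  // Illustrative final states (hypothetical stand-ins for DriverState values).
  sealed trait State
  case object Finished extends State
  case object Failed extends State
  case object Killed extends State

  // Assumed mapping: a kill wins, exit code 0 means success, anything else is a failure.
  def finalState(exitCode: Int, killed: Boolean): State =
    if (killed) Killed
    else if (exitCode == 0) Finished
    else Failed

  // Start the command in the given directory, block until it exits, return its exit code.
  def runAndWait(command: Seq[String], dir: File): Int = {
    val builder = new ProcessBuilder(command: _*)
    builder.directory(dir)
    builder.inheritIO() // share stdout/stderr with the parent so the child never blocks on a full pipe
    val process = builder.start()
    process.waitFor()
  }

  def main(args: Array[String]): Unit = {
    val javaBin = new File(new File(sys.props("java.home"), "bin"), "java").getPath
    val exitCode = runAndWait(Seq(javaBin, "-version"), new File(sys.props("java.io.tmpdir")))
    println(s"exit code: $exitCode, state: ${finalState(exitCode, killed = false)}")
  }
}
```

Because waitFor() blocks, a call like this belongs on a dedicated thread, which matches the thread started in start() above.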
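Next, once the driver process has finished or has been killed, DriverRunner sends a DriverStateChanged message to the Worker, and the Worker in turn notifies the Master so that it can update the driver's state: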
finalState = Some(state)
worker ! DriverStateChanged(driverId, state, finalException)
Below is the source of the DriverStateChanged handler in the Worker:
case DriverStateChanged(driverId, state, exception) => {
  state match {
    case DriverState.ERROR =>
      logWarning(s"Driver $driverId failed with unrecoverable exception: ${exception.get}")
    case DriverState.FAILED =>
      logWarning(s"Driver $driverId exited with failure")
    case DriverState.FINISHED =>
      logInfo(s"Driver $driverId exited successfully")
    case DriverState.KILLED =>
      logInfo(s"Driver $driverId was killed by user")
    case _ =>
      logDebug(s"Driver $driverId changed state to $state")
  }
  // Notify the Master so that it updates the driver's state information
  master ! DriverStateChanged(driverId, state, exception)
  val driver = drivers.remove(driverId).get
  finishedDrivers(driverId) = driver
  memoryUsed -= driver.driverDesc.mem
At this point the analysis connects with what we covered in the previous sections: when the Master receives the state change from the Worker, it updates the Driver information it holds in memory. That is how a Driver runs on a Worker.
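The Worker-side bookkeeping in the handler above can be modelled without Akka. The sketch below is a simplified model and not the Worker class: WorkerModel, DriverInfo, the string-valued state and the notifyMaster callback are hypothetical names, and releasing cores on completion is an assumption, since the excerpt stops after the memory update.

```scala
import scala.collection.mutable

object WorkerBookkeepingSketch {
  final case class DriverInfo(id: String, cores: Int, memMb: Int)
  final case class DriverStateChanged(driverId: String, state: String)

  // notifyMaster stands in for `master ! DriverStateChanged(...)`.
  class WorkerModel(notifyMaster: DriverStateChanged => Unit) {
    val drivers = mutable.HashMap[String, DriverInfo]()
    val finishedDrivers = mutable.HashMap[String, DriverInfo]()
    var coresUsed = 0
    var memoryUsed = 0

    // Mirrors the LaunchDriver handler: register the driver and reserve its resources.
    def launch(driver: DriverInfo): Unit = {
      drivers(driver.id) = driver
      coresUsed += driver.cores
      memoryUsed += driver.memMb
    }

    // Mirrors the DriverStateChanged handler: forward to the Master, then release resources.
    def handle(msg: DriverStateChanged): Unit = {
      notifyMaster(msg)
      // remove(...) returns an Option, so an unknown driver id is ignored instead of throwing.
      drivers.remove(msg.driverId).foreach { driver =>
        finishedDrivers(driver.id) = driver
        memoryUsed -= driver.memMb
        coresUsed -= driver.cores // assumption: cores are released together with memory
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val worker = new WorkerModel(msg => println(s"to master: $msg"))
    worker.launch(DriverInfo("driver-demo", cores = 2, memMb = 1024))
    worker.handle(DriverStateChanged("driver-demo", "FINISHED"))
    println(s"coresUsed=${worker.coresUsed}, memoryUsed=${worker.memoryUsed}")
  }
}
```

The model uses remove(...).foreach where the excerpt uses .get, so a state change for an unknown driver id is a no-op here rather than an exception.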
2: How an Executor is started on the Worker: