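This article is based on Spark 2.1.0 and Hadoop 2.7.3.
Unless otherwise noted, "Spark Web UI" in this article refers specifically to the Driver Web UI (by default at http://DRIVER_HOST_IP:4040, where DRIVER_HOST_IP is the IP of the host running the Driver program).
Spark provides several configurable properties that let users control the security of the Web UI:
- ACL mechanism: allows specified users to view the Web UI and to kill running Jobs and Stages
- Filter mechanism: allows users to control access to the Web UI with custom filters (the focus of this article)
1. ACL mechanism:
Related properties: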
Property Name | Default | Meaning |
---|---|---|
spark.acls.enable | false | Whether Spark acls should be enabled. If enabled, this checks to see if the user has access permissions to view or modify the job. Note this requires the user to be known, so if the user comes across as null no checks are done. |
spark.ui.view.acls | Empty | Comma separated list of users that have view access to the Spark web ui. By default only the user that started the Spark job has view access. Putting a "*" in the list means any user can have view access to this Spark job. |
spark.modify.acls | Empty | Comma separated list of users that have modify access to the Spark job. By default only the user that started the Spark job has access to modify it (kill it for example). Putting a "*" in the list means any user can have access to modify it. |
spark.admin.acls | Empty | Comma separated list of users/administrators that have view and modify access to all Spark jobs. This can be used if you run on a shared cluster and have a set of administrators or devs who help debug when things do not work. Putting a "*" in the list means any user can have the privilege of admin. |
The mechanism is simple: once you understand what each property means, you can get started quickly. A few examples follow.
First, as the hadoop user, start a spark-shell application on YARN:
[hadoop@wl1 ~]$ spark-shell --master yarn
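Two things are worth noting in the YARN Web UI at this point (circled in the original screenshot):
- With YARN, the default logged-in web user is dr.who (the hadoop.http.staticuser.user property in Hadoop's core-site.xml can be used to specify a different user)
- The spark-shell application, however, was started by the hadoop user
Because the ACL mechanism is not enabled, clicking Tracking UI: ApplicationMaster opens the application's Driver Web UI (figure omitted).
Resubmit the Spark application with the following command: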
[hadoop@wl1 ~]$ spark-shell --master yarn --conf spark.acls.enable=true
Clicking Tracking UI: ApplicationMaster in the YARN Web UI now fails.
This shows that setting spark.acls.enable to true turns on the ACL mechanism: a user who is not the one that started the application is denied access to its Web UI (in this example the visitor is dr.who).
Resubmit the Spark application with the following command:
[hadoop@wl1 ~]$ spark-shell --master yarn --conf spark.acls.enable=true --conf spark.ui.view.acls=dr.who
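Clicking Tracking UI: ApplicationMaster in the YARN Web UI now opens the application's Driver Web UI normally.
However, clicking the kill link for a Job (circled in the original screenshot) does not terminate the Job (the same holds for a Stage), and an error message is printed in the terminal.
This shows that users added to the ACL through spark.ui.view.acls have view permission only, not modify permission.
Resubmit the Spark application with the following command: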
[hadoop@wl1 ~]$ spark-shell --master yarn --conf spark.acls.enable=true --conf spark.ui.view.acls=dr.who --conf spark.modify.acls=dr.who
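Clicking Tracking UI: ApplicationMaster in the YARN Web UI opens the application's Driver Web UI normally, and clicking the kill link now terminates the Job (the same holds for a Stage).
One point to note: spark.modify.acls needs to be used together with spark.ui.view.acls, since the kill links sit on UI pages that the user must first be able to view.
Resubmit the Spark application with the following command: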
[hadoop@wl1 ~]$ spark-shell --master yarn --conf spark.acls.enable=true --conf spark.admin.acls=dr.who
(Figure omitted.) Clicking Tracking UI: ApplicationMaster in the YARN Web UI opens the application's Driver Web UI normally, and clicking the kill link terminates the Job (the same holds for a Stage).
This shows that users added to the ACL through spark.admin.acls have admin permission: they can both view and modify the Spark application through the Spark Driver Web UI.
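The behaviour seen in the submissions above can be summarized in a small model. The following is a minimal sketch, assuming only the semantics stated in the property table; it is not Spark's own implementation, it ignores anything not covered in this article, and the class name UiAclModel and its methods are hypothetical names chosen for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Simplified, hypothetical model of the ACL checks described above; not Spark's code. */
class UiAclModel {
    private final boolean aclsEnabled;      // spark.acls.enable
    private final String owner;             // user who started the application
    private final Set<String> viewAcls;     // spark.ui.view.acls
    private final Set<String> modifyAcls;   // spark.modify.acls
    private final Set<String> adminAcls;    // spark.admin.acls

    UiAclModel(boolean aclsEnabled, String owner,
               String viewAcls, String modifyAcls, String adminAcls) {
        this.aclsEnabled = aclsEnabled;
        this.owner = owner;
        this.viewAcls = parse(viewAcls);
        this.modifyAcls = parse(modifyAcls);
        this.adminAcls = parse(adminAcls);
    }

    /** Splits a comma separated user list, dropping blanks. */
    private static Set<String> parse(String commaSeparated) {
        Set<String> users = new HashSet<>();
        if (commaSeparated != null) {
            for (String u : commaSeparated.split(",")) {
                if (!u.trim().isEmpty()) {
                    users.add(u.trim());
                }
            }
        }
        return users;
    }

    /** A "*" entry matches every user. */
    private static boolean matches(Set<String> acl, String user) {
        return acl.contains("*") || acl.contains(user);
    }

    private boolean allowed(String user, Set<String> acl) {
        // No checks are done when ACLs are disabled or the user is unknown (null).
        if (!aclsEnabled || user == null) {
            return true;
        }
        return user.equals(owner) || matches(acl, user) || matches(adminAcls, user);
    }

    boolean canView(String user) { return allowed(user, viewAcls); }

    boolean canModify(String user) { return allowed(user, modifyAcls); }

    public static void main(String[] args) {
        // The five submissions from this section, all visited as dr.who.
        UiAclModel[] runs = {
            new UiAclModel(false, "hadoop", "", "", ""),
            new UiAclModel(true, "hadoop", "", "", ""),
            new UiAclModel(true, "hadoop", "dr.who", "", ""),
            new UiAclModel(true, "hadoop", "dr.who", "dr.who", ""),
            new UiAclModel(true, "hadoop", "", "", "dr.who"),
        };
        for (UiAclModel run : runs) {
            System.out.println("view=" + run.canView("dr.who")
                    + " modify=" + run.canModify("dr.who"));
        }
    }
}
```

In this model a user listed only in the modify list cannot view the UI, which is one way to read the advice to set spark.modify.acls together with spark.ui.view.acls.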
2. Filter mechanism:
Related properties:
Property Name | Default | Meaning |
---|---|---|
spark.ui.filters | None | Comma separated list of filter class names to apply to the Spark web UI. The filter should be a standard javax servlet Filter. Parameters to each filter can also be specified by setting a java system property of: spark.<class name of filter>.params='param1=value1,param2=value2' |
Users can control the access rules of the Spark Driver Web UI with a custom Filter.
First, implement a class that conforms to the standard javax servlet Filter interface. The one below checks a username and password (HTTP Basic authentication), a common way of verifying access over HTTP. Its source code is as follows:
/**
* Created by wangliang on 2017/4/29.
*/
import org.apache.commons.codec.binary.Base64;
import org.apache.commons.lang.StringUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.util.StringTokenizer;
public class BasicAuthFilter implements Filter {
    /** Logger */
    private static final Logger LOG = LoggerFactory.getLogger(BasicAuthFilter.class);

    // Expected credentials and realm, taken from the filter's init parameters.
    private String username = "";
    private String password = "";
    private String realm = "Protected";

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
        username = filterConfig.getInitParameter("username");
        password = filterConfig.getInitParameter("password");
        String paramRealm = filterConfig.getInitParameter("realm");
        if (StringUtils.isNotBlank(paramRealm)) {
            realm = paramRealm;
        }
    }

    @Override
    public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse, FilterChain filterChain)
            throws IOException, ServletException {
        HttpServletRequest request = (HttpServletRequest) servletRequest;
        HttpServletResponse response = (HttpServletResponse) servletResponse;
        String authHeader = request.getHeader("Authorization");
        if (authHeader == null) {
            // No credentials supplied: make the browser prompt for them.
            unauthorized(response);
            return;
        }
        StringTokenizer st = new StringTokenizer(authHeader);
        // The header must look like "Basic <base64 of username:password>".
        if (st.countTokens() < 2 || !st.nextToken().equalsIgnoreCase("Basic")) {
            unauthorized(response, "Invalid authentication token");
            return;
        }
        try {
            String credentials = new String(Base64.decodeBase64(st.nextToken()), "UTF-8");
            int p = credentials.indexOf(":");
            if (p == -1) {
                unauthorized(response, "Invalid authentication token");
                return;
            }
            String _username = credentials.substring(0, p).trim();
            String _password = credentials.substring(p + 1).trim();
            // Log the user name only; the decoded credentials contain the password.
            LOG.debug("Authenticating user: " + _username);
            if (!_username.equals(username) || !_password.equals(password)) {
                unauthorized(response, "Bad credentials");
                // Stop here so a rejected request never reaches the rest of the chain.
                return;
            }
            filterChain.doFilter(servletRequest, servletResponse);
        } catch (UnsupportedEncodingException e) {
            throw new Error("Couldn't retrieve authentication", e);
        }
    }

    @Override
    public void destroy() {
    }

    // Sends 401 with a WWW-Authenticate challenge for the configured realm.
    private void unauthorized(HttpServletResponse response, String message) throws IOException {
        response.setHeader("WWW-Authenticate", "Basic realm=\"" + realm + "\"");
        response.sendError(401, message);
    }

    private void unauthorized(HttpServletResponse response) throws IOException {
        unauthorized(response, "Unauthorized");
    }
}
Package the code above as spark_filter.jar, place it on the Spark cluster, and submit the Spark application with the following command (spark.ui.filters is set here through a driver Java system property):
[hadoop@wl1 ~]$ spark-shell --master spark://wl1:7077 --driver-class-path /home/hadoop/testjar/spark_filter.jar --driver-java-options "-Dspark.ui.filters=BasicAuthFilter -Dspark.BasicAuthFilter.params='username=admin,password=admin,realm=20170429'"
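The spark.BasicAuthFilter.params value in the command above uses the param1=value1,param2=value2 format from the property table, and the filter reads the resulting values with getInitParameter. The following is a minimal sketch of how such a string maps to name/value pairs; it is an illustration of the format only, not Spark's own parsing code, and the class name FilterParams is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper illustrating the 'param1=value1,param2=value2' format. */
class FilterParams {
    /** Splits on commas, then on the first '=' of each pair; malformed pairs are skipped. */
    static Map<String, String> parse(String params) {
        Map<String, String> result = new LinkedHashMap<>();
        if (params == null) {
            return result;
        }
        for (String pair : params.split(",")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                String name = pair.substring(0, eq).trim();
                String value = pair.substring(eq + 1).trim();
                result.put(name, value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> params = parse("username=admin,password=admin,realm=20170429");
        // Each entry corresponds to one init parameter seen by the filter.
        for (Map.Entry<String, String> e : params.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

One consequence of a comma separated format is that a value containing a comma cannot be expressed this way, which is worth keeping in mind when choosing a password for such a filter.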
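This time the application is submitted in Standalone client mode, so the Driver Web UI is reached through the Spark Master Web UI (circled in the original screenshot).
An authentication dialog pops up; it is produced by the custom BasicAuthFilter above. After you enter the correct username and password (the ones given in -Dspark.BasicAuthFilter.params when the application was submitted), the Spark Driver Web UI becomes accessible.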
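For readers unfamiliar with what the browser sends after the dialog is filled in, here is a minimal sketch of the HTTP Basic Authorization header: how it is built on the client side and how it can be decoded again on the server side. It uses the JDK's java.util.Base64 rather than the commons-codec class used by the filter, and the class name BasicAuthHeader and its methods are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper showing the shape of an HTTP Basic Authorization header. */
class BasicAuthHeader {
    /** Builds the header value: "Basic " + base64("username:password"). */
    static String encode(String username, String password) {
        String credentials = username + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    /** Returns {username, password}, or null if the header is not a well-formed Basic header. */
    static String[] decode(String header) {
        if (header == null) {
            return null;
        }
        String[] parts = header.trim().split("\\s+");
        if (parts.length != 2 || !parts[0].equalsIgnoreCase("Basic")) {
            return null;
        }
        String credentials;
        try {
            credentials = new String(Base64.getDecoder().decode(parts[1]), StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            return null; // the token is not valid Base64
        }
        int p = credentials.indexOf(':');
        if (p == -1) {
            return null;
        }
        // Split on the first colon only, so the password may itself contain colons.
        return new String[] { credentials.substring(0, p), credentials.substring(p + 1) };
    }

    public static void main(String[] args) {
        String header = encode("admin", "admin");
        System.out.println(header); // prints "Basic YWRtaW46YWRtaW4="
        String[] decoded = decode(header);
        System.out.println(decoded[0] + " / " + decoded[1]);
    }
}
```

Because Base64 is an encoding and not encryption, anyone who can observe the request can recover the password, so this kind of filter is best combined with a trusted network or HTTPS.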
Points to note:
When an application is submitted on YARN, Spark loads Hadoop's org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter filter by default. When a user reaches the Spark application's Web UI through the YARN Web UI, the address used is a proxy address on port 8088 generated by that filter. As a result, if the BasicAuthFilter above is enabled at the same time, user authentication always fails, because the authentication exchange has to take place with Spark's web server on port 4040.
So, to control access to a Spark application's Web UI on YARN, use a Filter provided by Hadoop or HTTP Kerberos instead.
Related links:
- [Spark 2.1.0 configuration](http://spark.apache.org/docs/latest/configuration.html)
- [javax servlet Filter](http://docs.oracle.com/javaee/6/api/javax/servlet/Filter.html)
- [Spark 2.1.0 security](http://spark.apache.org/docs/latest/security.html)