Abstract
Many real-world regression problems demand a measure of the uncertainty associated with each prediction. Standard decision forests deliver efficient state-of-the-art predictive performance, but high-quality uncertainty estimates are lacking. Gaussian processes (GPs) deliver uncertainty estimates, but scaling GPs to large-scale datasets comes at the cost of approximating the uncertainty estimates. We extend Mondrian forests, first proposed by Lakshminarayanan et al. (2014) for classification problems, to the large-scale non-parametric regression setting. Using a novel hierarchical Gaussian prior that dovetails with the Mondrian forest framework, we obtain principled uncertainty estimates, while still retaining the computational advantages of decision forests. Through a combination of illustrative examples, real-world large-scale datasets, and Bayesian optimization benchmarks, we demonstrate that Mondrian forests outperform approximate GPs on large-scale regression tasks and deliver better-calibrated uncertainty assessments than decision-forest-based methods.
Overview of the approach
Gaussian process (GP) regression is very popular: it gives accurate non-parametric predictions and also retains predictive ability on data that has not been observed. However, GPs are computationally expensive. The goal of this paper is to combine the strengths of GPs (good uncertainty estimates, probabilistic setup) with those of decision forests (computational speed). Concretely, the paper uses Mondrian forests (MF) because, unlike other decision forests, every tree in an MF has a probabilistic model. It extends MF as follows: a hierarchical Gaussian prior is applied at each leaf node, and Gaussian belief propagation is used to compute the posterior parameters.
Mondrian Forest
Tree construction
Prediction
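In the original MF algorithm, prediction answers the question: given the feature vector of a test point, what is the predictive probability of its target value? Unlike an MF classification tree, which predicts a posterior probability, a Mondrian regression tree predicts a Gaussian posterior.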
The data at each node is assumed to follow a Gaussian distribution.
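Note that in most decision trees the prediction depends only on the leaf node and not on the internal nodes. A Mondrian tree is different: a test point may branch off at any node on the path from the root node to the leaf node. The prediction (posterior) of a single tree is therefore a mixture of Gaussians, with one component per node on that path. The weight of each component is the probability that the test point branches off just before reaching the corresponding node (that is, at that node's parent).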
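Written out, the mixture would take a form such as the one below. The notation is assumed here for illustration: $x$ is the test point, $y$ its target, $\mathcal{D}_{1:N}$ the training data, $\mathrm{path}(\mathrm{leaf}(x))$ the nodes from the root to the leaf containing $x$, and $w_j$, $m_j$, $v_j$ the weight, mean, and variance of the component at node $j$.

```latex
p_T(y \mid x, \mathcal{D}_{1:N})
  = \sum_{j \in \mathrm{path}(\mathrm{leaf}(x))} w_j \, \mathcal{N}(y \mid m_j, v_j),
\qquad \sum_{j \in \mathrm{path}(\mathrm{leaf}(x))} w_j = 1 .
```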
During prediction, the test point either:
- branches off into a new partition, in which case the prediction is the posterior of the parent node; or
- falls into a leaf node, in which case the prediction is the posterior of that leaf node.
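The farther a test point lies from the training set, the more likely it is to branch off, so MF retains its predictive ability even when the test set and the training set follow different distributions.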
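The branching behaviour described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes we are simply handed a branch-off probability for every node on the root-to-leaf path (the hypothetical input `p_branch`), and the function names are made up for this example. The weight of a node is the probability of reaching it without branching off earlier, times the probability of branching off there; the leaf takes whatever mass is left.

```python
import numpy as np

def path_weights(p_branch):
    """Mixture weights for the nodes on a root-to-leaf path.

    p_branch[j] is the (assumed given) probability that the test point
    branches off just before reaching node j. The last entry is the leaf,
    where the point must stop, so its probability is forced to 1.
    """
    p = np.asarray(p_branch, dtype=float).copy()
    p[-1] = 1.0  # the leaf absorbs all remaining probability mass
    # Probability of reaching node j without having branched off earlier.
    reach = np.concatenate(([1.0], np.cumprod(1.0 - p[:-1])))
    return reach * p

def mixture_moments(weights, means, variances):
    """Mean and variance of a one-dimensional Gaussian mixture."""
    w, m, v = (np.asarray(a, dtype=float) for a in (weights, means, variances))
    mean = np.sum(w * m)
    var = np.sum(w * (v + m ** 2)) - mean ** 2
    return mean, var

# Close to the training data: branching is unlikely, so the leaf dominates.
w_near = path_weights([0.05, 0.05, 0.0])
# Far from the training data: the point most likely branches off at the top.
w_far = path_weights([0.9, 0.5, 0.0])
```

With these toy inputs the leaf carries most of the weight for the nearby point, while for the distant point most of the weight moves to the component near the root, which is the mechanism behind the broader predictions away from the training data.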
The prediction of the forest is then obtained by combining the predictions of its individual trees.
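A common way to combine trees, assumed here, is a uniform average over the $M$ trees, $p(y \mid x, \mathcal{D}_{1:N}) = \frac{1}{M} \sum_{m=1}^{M} p_{T_m}(y \mid x, \mathcal{D}_{1:N})$, where the symbol $M$ and the notation are introduced only for this example. Under that assumption, the forest's predictive mean and variance follow from the per-tree means and variances, as the hypothetical helper below sketches.

```python
import numpy as np

def forest_moments(tree_means, tree_variances):
    """Mean and variance of an equal-weight mixture of per-tree predictions.

    tree_means[m] and tree_variances[m] are the predictive mean and variance
    of tree m at a single test point (illustrative inputs).
    """
    m = np.asarray(tree_means, dtype=float)
    v = np.asarray(tree_variances, dtype=float)
    mean = m.mean()
    # Law of total variance: average within-tree variance
    # plus the spread of the tree means around the forest mean.
    var = v.mean() + np.mean((m - mean) ** 2)
    return mean, var

mean, var = forest_moments([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The second term in the variance grows when the trees disagree, so disagreement between trees shows up as extra predictive uncertainty.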