WHAT MY DEEP MODEL DOESN'T KNOW...
I recently spent some time trying to understand why deep learning models trained with dropout work so well, relating them to new research from the last couple of years. I was quite surprised to see how close these models are to Gaussian processes. I was even more surprised to see that we can get uncertainty information from these deep learning models for free – without changing a thing.
[Post] DROPOUT AS A BAYESIAN APPROXIMATION
05/06/2015
Two new papers on dropout as a Bayesian approximation, with applications to model uncertainty in deep learning [1] and to Bayesian convolutional neural networks [2], have been added to the publications below.
SOFTWARE
OPEN SOURCE PROJECTS I'M CURRENTLY WORKING ON
VSSGP
An implementation of the Variational Sparse Spectrum Gaussian Process using Theano (a Python package for symbolic differentiation).
[Software] [Paper]
CLGP
An implementation of the Categorical Latent Gaussian Process using Theano (a Python package for symbolic differentiation).
[Software] [Paper]
GPARML
A light-weight and minimal Python implementation of parallel inference for the Bayesian Gaussian process latent variable model and GP regression.
[Software] [Paper]
UNIVERSITY OF CAMBRIDGE PRESENTATION TEMPLATE
This is a presentation template with the colour scheme of the University of Cambridge. The beamer template is based on cambridge-beamer with changes to the colour scheme and page layout.
[Software] [Example]
GIZA#
An optimised C++ extension of Giza++ (a word alignment software package) implementing the hierarchical Pitman-Yor process alignment models.
[Software] [Paper]
PUBLICATIONS
BAYESIAN CONVOLUTIONAL NEURAL NETWORKS WITH BERNOULLI APPROXIMATE VARIATIONAL INFERENCE
We present an efficient Bayesian convolutional neural network (convnet). The model offers better robustness to over-fitting on small data and achieves a considerable improvement in classification accuracy compared to previous approaches. We give state-of-the-art results on CIFAR-10 following our insights.
Yarin Gal, Zoubin Ghahramani. In submission, 2015 [arXiv] [BibTex]
DROPOUT AS A BAYESIAN APPROXIMATION: REPRESENTING MODEL UNCERTAINTY IN DEEP LEARNING
We show that dropout in multilayer perceptron models (MLPs) can be interpreted as a Bayesian approximation. We obtain tools for modelling uncertainty with dropout MLPs, extracting information that has so far been thrown away from existing models. This mitigates the problem of representing uncertainty in deep learning without sacrificing computational performance or test accuracy.
Yarin Gal, Zoubin Ghahramani. In submission, 2015 [arXiv] [Appendix] [BibTex]
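For readers curious what extracting this uncertainty looks like in practice, here is a minimal numpy sketch of the simplest reading of the idea: keep dropout switched on at test time and treat repeated stochastic forward passes as approximate posterior samples. The toy network and its weights are illustrative only, and the paper also adds a weight-decay-derived noise term to the sample variance.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, W1, W2, p=0.5):
    """One stochastic forward pass through a single-hidden-layer MLP,
    keeping the Bernoulli dropout mask active at test time."""
    h = np.maximum(0.0, x @ W1)                    # ReLU hidden layer
    mask = rng.binomial(1, 1.0 - p, size=h.shape)  # dropout mask, kept at test time
    return (h * mask / (1.0 - p)) @ W2

# Illustrative weights; in practice these come from a network trained with dropout.
W1 = rng.standard_normal((1, 50))
W2 = rng.standard_normal((50, 1)) / np.sqrt(50)

x_star = np.array([[0.3]])                         # a single test input
samples = np.stack([forward(x_star, W1, W2) for _ in range(200)])

print(samples.mean(axis=0))   # predictive mean
print(samples.var(axis=0))    # predictive uncertainty (to which the paper
                              # adds a weight-decay-derived noise term)
```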
DROPOUT AS A BAYESIAN APPROXIMATION: INSIGHTS AND APPLICATIONS
Deep learning techniques lack the ability to reason about uncertainty over the features. We show that a multilayer perceptron (MLP) with arbitrary depth and non-linearities, with dropout applied after every weight layer, is mathematically equivalent to an approximation to a well known Bayesian model. This paper is a short version of the appendix of "Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning".
Yarin Gal, Zoubin Ghahramani. Deep Learning Workshop, ICML, 2015 [PDF] [BibTex]
AN INFINITE PRODUCT OF SPARSE CHINESE RESTAURANT PROCESSES
We define a new process that gives a natural generalisation of the Indian buffet process (used for binary feature allocation) into categorical latent features. For this we take advantage of different limit parametrisations of the Dirichlet process and its generalisation, the Pitman–Yor process.
Yarin Gal, Tomoharu Iwata, Zoubin Ghahramani. 10th Conference on Bayesian Nonparametrics (BNP), 2015 [Talk] [BibTex]. We thank BNP for the travel award.
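As a point of reference for the process named in the title, the sketch below samples table assignments from a standard Chinese restaurant process. This is only the familiar sequential urn view of the Dirichlet process, not the new sparse product construction of the paper, and the concentration value is arbitrary.

```python
import numpy as np

def crp(n, alpha, rng=np.random.default_rng(0)):
    """Sample table assignments for n customers from a CRP(alpha)."""
    counts = []                          # customers seated per table
    assignments = []
    for _ in range(n):
        # P(existing table k) is proportional to counts[k]; P(new table) to alpha
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)             # open a new table
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments

print(crp(10, alpha=1.0))
```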
IMPROVING THE GAUSSIAN PROCESS SPARSE SPECTRUM APPROXIMATION BY REPRESENTING UNCERTAINTY IN FREQUENCY INPUTS
Standard sparse pseudo-input approximations to the Gaussian process (GP) cannot handle complex functions well. Sparse spectrum alternatives attempt to address this but are known to over-fit. We use variational inference for the sparse spectrum approximation to avoid both issues, and extend the approximate inference to the distributed and stochastic domains.
Yarin Gal, Richard Turner. ICML, 2015 [PDF] [Software] [BibTex]
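The sparse spectrum family approximates a stationary GP with trigonometric basis functions at sampled frequencies. The sketch below uses plain random Fourier features with fixed frequencies for an RBF kernel, whereas the paper instead treats the frequencies variationally; the data and all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data
X = np.linspace(0, 1, 40)[:, None]
y = np.sin(6 * X[:, 0]) + 0.1 * rng.standard_normal(40)

M, lengthscale, noise = 50, 0.2, 0.1
# Frequencies drawn from the RBF kernel's spectral density, plus random phases
omega = rng.standard_normal((1, M)) / lengthscale
phase = rng.uniform(0, 2 * np.pi, M)

def features(X):
    return np.sqrt(2.0 / M) * np.cos(X @ omega + phase)

# Bayesian linear regression on the trigonometric features:
Phi = features(X)
A = Phi.T @ Phi + noise**2 * np.eye(M)   # an M x M system: O(N M^2), not O(N^3)
w_mean = np.linalg.solve(A, Phi.T @ y)

X_star = np.linspace(0, 1, 5)[:, None]
print(features(X_star) @ w_mean)         # approximate GP predictive mean
```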
LATENT GAUSSIAN PROCESSES FOR DISTRIBUTION ESTIMATION OF MULTIVARIATE CATEGORICAL DATA
Multivariate categorical data occur in many applications of machine learning. One of the main difficulties with these vectors of categorical variables is sparsity: the number of possible observations grows exponentially with vector length, but dataset diversity might be poor in comparison. Recent models have achieved significant improvements in supervised tasks with such data by embedding observations in a continuous space to capture similarities between them. Building on these ideas, we propose a Bayesian model for the unsupervised task of distribution estimation of multivariate categorical data.
Yarin Gal, Yutian Chen, Zoubin Ghahramani. Workshop on Advances in Variational Inference, NIPS, 2014 [PDF] [Poster] [Presentation] [BibTex]; ICML, 2015 [PDF] [Software] [BibTex]. We thank Google DeepMind for the travel award.
DISTRIBUTED VARIATIONAL INFERENCE IN SPARSE GAUSSIAN PROCESS REGRESSION AND LATENT VARIABLE MODELS
We develop parallel inference for sparse Gaussian process regression and latent variable models. These models are used for principled function modelling and for non-linear dimensionality reduction, in time complexity linear in the dataset size. Parallel inference allows the models to work on much larger datasets than before.
Yarin Gal, Mark van der Wilk, Carl E. Rasmussen. Workshop on New Learning Models and Frameworks for Big Data, ICML, 2014 [arXiv] [Presentation] [Software] [BibTex]; NIPS, 2014 [PDF] [BibTex]. We thank NIPS for the travel award.
FEATURE PARTITIONS AND MULTI-VIEW CLUSTERINGS
We define a new combinatorial structure that unifies Kingman's random partitions and Broderick, Pitman, and Jordan's feature frequency models. This structure underlies non-parametric multi-view clustering models, where data points are simultaneously clustered into different possible clusterings. The de Finetti measure is a product of paintbox constructions. Studying the properties of feature partitions allows us to understand the relations between the models they underlie and share algorithmic insights between them.
Yarin Gal, Zoubin Ghahramani. International Society for Bayesian Analysis (ISBA), 2014 [Link] [Poster]. We thank ISBA for the travel award.
DIRICHLET FRAGMENTATION PROCESSES
We introduce a new class of models over trees based on the theory of fragmentation processes. The Dirichlet Fragmentation Process Mixture Model is an example model derived from this new class. This model has efficient and simple inference, and significantly outperforms existing approaches for hierarchical clustering and density modelling.
Hong Ge, Yarin Gal, Zoubin Ghahramani. In submission, 2014 [PDF] [BibTex]
PITFALLS IN THE USE OF PARALLEL INFERENCE FOR THE DIRICHLET PROCESS
We show that the recently suggested parallel inference for the Dirichlet process is conceptually invalid. The Dirichlet process is important for many fields, such as natural language processing, yet the suggested inference would not work in most real-world applications.
Yarin Gal, Zoubin Ghahramani. Workshop on Big Learning, NIPS, 2013 [PDF] [Presentation] [BibTex]; ICML, 2014 [PDF] [Talk] [Presentation] [Poster] [BibTex]
VARIATIONAL INFERENCE IN THE GAUSSIAN PROCESS LATENT VARIABLE MODEL AND SPARSE GP REGRESSION – A GENTLE TUTORIAL
We present an in-depth and self-contained tutorial for sparse Gaussian Process (GP) regression. We also explain GP latent variable models, a tool for non-linear dimensionality reduction. The sparse approximation reduces the time complexity of the models from cubic to linear but its development is scattered across the literature. The various results are collected here.
Yarin Gal, Mark van der Wilk. Tutorial, 2014 [arXiv] [BibTex]
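To make the cubic-to-linear claim concrete, here is a minimal numpy sketch of one classical sparse approximation covered in such tutorials, the subset-of-regressors predictive mean: the only matrix inverted is M x M, so the overall cost is O(NM^2) rather than O(N^3). The kernel, inducing-input locations, and noise level are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(A, B, lengthscale=0.2):
    """RBF kernel matrix between row-stacked inputs A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

# Toy data with N = 200 points and M = 10 inducing inputs, M << N
X = np.linspace(0, 1, 200)[:, None]
y = np.sin(6 * X[:, 0]) + 0.1 * rng.standard_normal(200)
Z = np.linspace(0, 1, 10)[:, None]
noise = 0.1

Kuu = rbf(Z, Z) + 1e-6 * np.eye(len(Z))   # jitter for numerical stability
Kuf = rbf(Z, X)

# Subset-of-regressors posterior over the inducing outputs:
# Sigma = (Kuu + s^-2 Kuf Kfu)^-1 is only M x M
Sigma = np.linalg.inv(Kuu + Kuf @ Kuf.T / noise**2)

X_star = np.linspace(0, 1, 5)[:, None]
mean = rbf(X_star, Z) @ Sigma @ Kuf @ y / noise**2
print(mean)                               # sparse approximate predictive mean
```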
SEMANTICS, MODELLING, AND THE PROBLEM OF REPRESENTATION OF MEANING – A BRIEF SURVEY OF RECENT LITERATURE
Over the past 50 years many have debated what representation should be used to capture the meaning of natural language utterances. Recent research has raised new requirements for such representations. Here I survey some of the interesting representations suggested to meet these needs.
Yarin Gal. Literature survey, 2013 [arXiv] [BibTex]
A SYSTEMATIC BAYESIAN TREATMENT OF THE IBM ALIGNMENT MODELS
We used a non-parametric process in models that align words between pairs of sentences. These alignment models are at the core of all statistical machine translation systems. We obtained a significant improvement in translation using the process.
Yarin Gal, Phil Blunsom. Association for Computational Linguistics (NAACL), 2013 [PDF] [Presentation] [BibTex]
RELAXING HMM ALIGNMENT MODEL ASSUMPTIONS FOR MACHINE TRANSLATION USING A BAYESIAN APPROACH
We used a non-parametric process to relax some of the restricting assumptions often used in machine translation. When a long history of translation words is not available, the process falls back on shorter histories in a principled way.
Yarin Gal. Master's Dissertation, 2012 [PDF] [BibTex]
OVERCOMING ALPHA-BETA LIMITATIONS USING EVOLVED ARTIFICIAL NEURAL NETWORKS
We trained a feed-forward neural network to play checkers. The network acts as both the value function for a min-max algorithm and a heuristic for pruning tree branches in a reinforcement learning setting. No supervised signal was used for training: a set of networks was assessed by playing against each other, and the winning networks' weights were changed slightly.
Yarin Gal, Mireille Avigal. Machine Learning and Applications (IEEE), 2010 [PDF] [BibTex]
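A minimal sketch of the evolutionary loop described above, with a toy fitness function standing in for win rates from actual checkers games: winners survive each generation and their weights are perturbed slightly to refill the population. The population size, mutation scale, and fitness function are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(w):
    """Stand-in for the win rate of a network with weights w."""
    return -np.sum((w - 0.5) ** 2)

# Population of weight vectors standing in for the networks' weights
pop = rng.standard_normal((20, 10))

for generation in range(100):
    scores = np.array([fitness(w) for w in pop])
    winners = pop[np.argsort(scores)[-10:]]   # networks that 'won' their games
    # Winners survive; slightly perturbed copies refill the population
    children = winners + 0.05 * rng.standard_normal(winners.shape)
    pop = np.vstack([winners, children])

best = max(pop, key=fitness)
print(fitness(best))
```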
TALKS
LATENT GAUSSIAN PROCESSES FOR DISTRIBUTION ESTIMATION OF MULTIVARIATE CATEGORICAL DATA
We discuss the issues with representing high-dimensional vectors of discrete variables, and existing models that attempt to estimate the distribution of such data. We then present our approach, which relies on a continuous latent representation for the discrete data.
Yarin Gal. Invited talk: Microsoft Research, Cambridge, 2015; Invited talk: NTT Labs, Kyoto, Japan, 2015 [Presentation] [Video]
REPRESENTATIONS OF MEANING
We discuss various formal representations of meaning, including Gentzen sequent calculus, vector spaces over the real numbers, and symmetric closed monoidal categories.
Yarin Gal. Invited talk: Trinity College Mathematical Society, University of Cambridge, 2015 [Presentation]
SYMBOLIC DIFFERENTIATION FOR RAPID MODEL PROTOTYPING IN MACHINE LEARNING AND DATA ANALYSIS – A HANDS-ON TUTORIAL
We talk about the theory of symbolic differentiation and demonstrate its use through the Theano Python package. We work through two example models, logistic regression and a deep network, and continue with the rapid prototyping of probabilistic models using SVI. The talk is based in part on the Theano online tutorial.
Yarin Gal. MLG Seminar, 2014 [Presentation]
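For a flavour of what the tutorial demonstrates, here is a minimal Theano logistic regression in which the gradient of the loss is derived symbolically with T.grad rather than by hand; the data and learning rate are illustrative.

```python
import numpy as np
import theano
import theano.tensor as T

# Symbolic inputs and shared (trainable) parameters
X = T.matrix('X')
y = T.vector('y')
w = theano.shared(np.zeros(5), name='w')
b = theano.shared(0.0, name='b')

# Logistic regression loss, built as a symbolic expression graph
p = T.nnet.sigmoid(T.dot(X, w) + b)
loss = T.nnet.binary_crossentropy(p, y).mean()
gw, gb = T.grad(loss, [w, b])            # symbolic differentiation

# Compile a training step that also applies gradient descent updates
train = theano.function(
    inputs=[X, y], outputs=loss,
    updates=[(w, w - 0.1 * gw), (b, b - 0.1 * gb)])

rng = np.random.default_rng(0)
data = rng.standard_normal((100, 5))
labels = (data.sum(axis=1) > 0).astype('float64')
for _ in range(100):
    train(data, labels)
```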
RAPID PROTOTYPING OF PROBABILISTIC MODELS USING STOCHASTIC VARIATIONAL INFERENCE
In data analysis we often have to develop new models, which can be a lengthy process: appropriate inference must be derived, and its implementation is often cumbersome and changes regularly. Rapid prototyping answers similar problems in manufacturing, where it is used for the quick fabrication of scale models of physical parts. We present Stochastic Variational Inference (SVI) as a tool for the rapid prototyping of probabilistic models.
Yarin Gal. Short talk, 2014 [Presentation]
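A minimal sketch of SVI on the simplest possible model, inferring a Gaussian mean from minibatches: noisy gradients of the evidence lower bound are formed with the reparameterisation trick, with the minibatch likelihood term rescaled by N/B. All closed-form pieces are written out by hand here, and the model, step size, and parameterisation are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=1000)    # x_i ~ N(mu, 1), with prior mu ~ N(0, 1)
N, B = len(data), 10                      # minibatch size B << N

m, log_s = 0.0, 0.0                       # variational posterior q(mu) = N(m, s^2)
for step in range(20000):
    batch = rng.choice(data, B)
    s = np.exp(log_s)
    eps = rng.standard_normal()
    mu = m + s * eps                      # reparameterisation trick
    # Noisy ELBO gradient: minibatch likelihood term rescaled by N/B,
    # minus the gradient of KL(q || prior), closed-form for two Gaussians
    g_lik = (N / B) * np.sum(batch - mu)
    g_m = g_lik - m
    g_log_s = g_lik * s * eps - (s**2 - 1.0)
    m += 1e-4 * g_m                       # stochastic gradient ascent on the ELBO
    log_s += 1e-4 * g_log_s

# Should approach the analytic posterior, roughly N(2, 1/1001)
print(m, np.exp(log_s))
```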
DISTRIBUTED INFERENCE IN BAYESIAN NONPARAMETRICS – THE DIRICHLET PROCESS AND THE GAUSSIAN PROCESS
I present distributed inference methodologies for two major processes in Bayesian nonparametrics. Pitfalls in the use of parallel inference for the Dirichlet process are discussed, and distributed variational inference in sparse Gaussian process regression and latent variable models is presented.
Yarin Gal. Invited talk: NTT Labs, Kyoto, Japan, 2014
EMERGENT COMMUNICATION FOR COLLABORATIVE REINFORCEMENT LEARNING
Slides from a seminar introducing collaborative reinforcement learning and how learning communication can improve collaboration. We use game theory to motivate the use of collaboration in a multi-agent setting. We then define multi-agent and decentralised multi-agent Markov decision processes. We discuss issues with these definitions and possible ways to overcome them. We then transition to emergent languages. We explain how the use of an emergent communication protocol could aid in collaborative reinforcement learning. Reviewing a range of emergent communication models developed from a linguistic motivation to a pragmatic view, we finish with an assessment of the problems left unanswered in the field.
Yarin Gal, Rowan McAllister. MLG Seminar, 2014 [Presentation]
THE BOREL–KOLMOGOROV PARADOX
Slides from a short talk explaining the Borel–Kolmogorov paradox, alluding to possible pitfalls in probabilistic modelling. The slides are partly based on Jaynes, E.T. (2003), "Probability Theory: The Logic of Science".
Yarin Gal. Short talk, 2014 [Presentation]
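The paradox can even be seen numerically. Below, points drawn uniformly on a sphere are conditioned on a thickened version of the same great circle described in two different coordinate slices, and the two limiting conditionals disagree; the sample size and slice width are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Uniform points on the sphere, in (longitude, latitude) coordinates
n = 2_000_000
lon = rng.uniform(-np.pi, np.pi, n)
lat = np.arcsin(rng.uniform(-1, 1, n))    # sin(latitude) uniform => uniform on sphere

eps = 0.01
# Condition on the same great circle, described two different ways:
along_equator = lon[np.abs(lat) < eps]    # position along a thin equatorial band
along_meridian = lat[np.abs(lon) < eps]   # position along a thin meridian lune

# Along the equator the conditional is uniform; along the meridian it is
# proportional to cos(latitude), so the middle bins collect more mass.
print(np.histogram(along_equator, bins=4, range=(-np.pi, np.pi))[0])
print(np.histogram(along_meridian, bins=4, range=(-np.pi / 2, np.pi / 2))[0])
```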
BAYESIAN NONPARAMETRICS IN REAL-WORLD APPLICATIONS: STATISTICAL MACHINE TRANSLATION AND LANGUAGE MODELLING ON BIG DATASETS
Slides from a seminar introducing statistical machine translation and language modelling as real-world applications of Bayesian nonparametrics. We give a friendly introduction to statistical machine translation and language modelling, and then describe how recent developments in the field of Bayesian nonparametrics can be exploited for these tasks. The first part of the presentation is based on the lecture notes by Dr Phil Blunsom.
Yarin Gal. MLG Seminar, 2013 [Presentation]