On Explainability of Deep Neural Networks

During a discussion yesterday with software architect extraordinaire David Lazar about how everything old is new again, the topic of deep neural networks and their amazing success came up. Unless one has been living under a rock for the past five years, the advancements in artificial neural networks (ANN) have been significant and noteworthy. Since the thaw of the AI winter, the once frowned-upon approach has come a long way to become a successful and relied-upon technique in multiple problem spaces. From an interesting apocryphal story which sums up the state of ANN back in the day to the current state of ConvNets, with Google Translate squeezing deep learning onto a phone, there has been significant progress. We have all seen the dreamy images of Inceptionism: Going Deeper into Neural Networks, along with great results in image classification and speech recognition while fine-tuning network parameters. Beyond the classical feats of Reading Digits in Natural Images with Unsupervised Feature Learning, Deep Neural Networks (DNNs) have shown outstanding performance on image classification tasks. We now have excellent results on MNIST, ImageNet Classification with Deep Convolutional Neural Networks, and effective use of Deep Neural Networks for Object Detection.

Otavio Good of Google puts it quite well:

Five years ago, if you gave a computer an image of a cat or a dog, it had trouble telling which was which. Thanks to convolutional neural networks, not only can computers tell the difference between cats and dogs, they can even recognize different breeds of dogs.

Geoffrey Hinton et al. noted that

Best system in 2010 competition got 47% error for its first choice and 25% error for its top 5 choices. A very deep neural net (Krizhevsky et al. 2012) gets less than 40% error for its first choice and less than 20% for its top 5 choices.

Courtesy: XKCD and http://pekalicious.com/blog/training/

So with all this fanfare, what could possibly go wrong?

In deep learning systems, where both the classifiers and the features are learned automatically, neural networks possess a grey side: the explainability problem.

Explainability and determinism in ML systems is a larger discussion, but to limit the scope to neural nets: when you see the Unreasonable Effectiveness of Recurrent Neural Networks, it is important to pause and ponder why it works. Is it good enough that I can peek into this black box by getting strategic heuristics out of the network, or infer the concept of a cat from a trained neural network by Building High-level Features Using Large Scale Unsupervised Learning? Does it become a ‘grey box’ if we can extract word embeddings from the network in high-dimensional space, and therefore exploit similarities among languages for machine translation? The non-deterministic nature is itself problematic: the choice of initial parameters, such as the starting point for gradient descent when training with back-propagation, is of key importance. How about retain-ability? The imperviousness makes troubleshooting harder, to say the least.
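The point about starting points can be made concrete with a toy that is much simpler than a neural network. The sketch below runs plain gradient descent on a one-dimensional non-convex function; the function, learning rate, and starting points are all assumptions chosen for illustration, not anything taken from a real training run. The same procedure lands in two different minima depending only on where it starts.

```python
def grad(x):
    # Derivative of the toy non-convex loss f(x) = x**4 - 3*x**2 + x
    # (an illustrative function, not a real network loss).
    return 4 * x**3 - 6 * x + 1


def descend(x0, lr=0.01, steps=2000):
    """Plain gradient descent from starting point x0."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


right = descend(2.0)   # settles in the shallower minimum, near x = 1.13
left = descend(-2.0)   # settles in the deeper minimum, near x = -1.30
print(right, left)
```

A real network has millions of parameters rather than one, so the set of reachable end points is far larger, which is part of why two training runs of the same architecture can disagree.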

If you haven’t noticed, I am trying hard not to make this a pop-science alarmist post, but here is the leap I am going to take: the relative lack of explainability and transparency inherent in neural networks (and the research community’s relative complacency towards the approach ‘because it just works’), this idea of black-boxed intelligence, is probably what may lead to the larger issues identified by Gates, Hawking, and Musk. I would be the first to state that this argument might be a stretch, an over-generalization of the shortcomings of a specific technique into a doomsday scenario, and that we might be able to ‘decrypt’ the sigmoid and all these fears will go away. However, my fundamental argument stays: if the technique isn’t explainable, then with ML proliferating as it is today, the unintended consequences might be too real to ignore.

With strong AI being assembled as an ensemble of weak AI, the concern about explainability grows. There is no denying that it can be challenging to understand what a neural network is really doing under those layers of approximating functions. In the happy-path scenario, when a network is trained well, we have seen repeatedly that it achieves high-quality results. However, it is still perplexing to comprehend how it does so. Even more alarmingly, if the network fails, it is hard to understand what went wrong. Can we really shrug off the skeptics fearful about the dangers that seemingly sentient Artificial Intelligence (AI) poses? As Bill Gates articulately said (practically refuting Eric Horvitz's position):

I am in the camp that is concerned about super intelligence. First the machines will do a lot of jobs for us and not be super intelligent. That should be positive if we manage it well. A few decades after that though the intelligence is strong enough to be a concern. I agree with Elon Musk and some others on this and don’t understand why some people are not concerned.

The non-deterministic nature of a technique like neural networks poses a larger concern when it comes to understanding the confidence of the classifier. The convergence of a neural network isn’t really clear, whereas for an SVM it is fairly trivial to validate. Depicting the approximation of an ‘undocumented’ function as a black box is most probably a fundamentally flawed idea in itself. If we equate this with the biological thought process, the signals and the corresponding trained behavior, we have an expected output based on the training set as an observer. However, in the non-identifiable model, the approximation provided by the neural network is fairly impenetrable for all intents and purposes.

I don’t think anyone with a deep understanding of AI and machine learning is really worried about Skynet at this point. As Andrew Ng said:

“Fearing a rise of killer robots is like worrying about overpopulation on Mars.”

The concern is more about adhering to the “but it works!” (aka If-I-fits-I-sits) approach (the mandatory cat meme goes here).

The sociological challenges associated with self-driving trucks, taxis, delivery people and employment are real, but these are regulatory issues. The key issue lies at the heart of the technology and our understanding of its internals. Stanford's Katie Malone said it quite well in the Linear Digressions episode on neural nets.

Even though it sounds like common sense that we would like to have controls in place so that automation is not allowed to engage targets without human intervention, and luminaries like Hawking, Musk and Wozniak would like to ban autonomous weapons, as AI experts urge, our default reliance on black-box approaches may make this nothing more than wishful thinking. As Stephen Hawking said:

“The primitive forms of artificial intelligence we already have, have proved very useful. But I think the development of full artificial intelligence could spell the end of the human race. Once humans develop artificial intelligence it would take off on its own and redesign itself at an ever-increasing rate. Humans, who are limited by slow biological evolution, couldn’t compete and would be superseded.”

It might be fair to say that since we don’t completely understand a new technique, it makes us afraid (of change), and that this fear will fade as the research moves forward. As great as the results are, non-black-box or interpretable models such as regression (closed-form approximation) and decision trees / belief nets (graphical representations of deterministic and probabilistic beliefs) offer the comfort of determinism and understanding. We know today that small changes to the input of a NN can lead to significant changes in its output, one of the “Intriguing” properties of neural networks. In their paper, the authors demonstrated that small changes can cause larger issues:

We find that deep neural networks learn input-output mappings that are fairly discontinuous to a significant extent. We can cause the network to misclassify an image by applying a certain hardly perceptible perturbation, which is found by maximizing the network’s prediction error…

We demonstrated that deep neural networks have counter-intuitive properties both with respect to the semantic meaning of individual units and with respect to their discontinuities.

The existence of the adversarial negatives appears to be in contradiction with the network’s ability to achieve high generalization performance. Indeed, if the network can generalize well, how can it be confused by these adversarial negatives, which are indistinguishable from the regular examples? Possible explanation is that the set of adversarial negatives is of extremely low probability… However, we don’t have a deep understanding of how often adversarial negatives appears…
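To get a feel for how a hardly perceptible perturbation can flip a prediction, here is a minimal sketch on a linear logistic classifier rather than a deep network. It uses a single gradient-sign step, which is a simpler stand-in and not the optimization procedure described in the quote. The weights, input, dimension, and step size are all hypothetical values picked for illustration.

```python
import math

N = 100  # input dimension (hypothetical)

# Hypothetical fixed weights of an already-trained linear classifier.
w = [1.0 if i % 2 == 0 else -1.0 for i in range(N)]


def logit(x):
    return sum(wi * xi for wi, xi in zip(w, x))


def predict(x):
    """Return (label, probability of class 1) under a logistic model."""
    p = 1.0 / (1.0 + math.exp(-logit(x)))
    return (1 if p >= 0.5 else 0), p


# An input the model assigns to class 1 (its logit is about 2).
x = [0.5 + 0.02 * wi for wi in w]

# For true label 1, the gradient of the log-loss with respect to x is
# (p - 1) * w, so a step along its sign moves each coordinate by
# -eps * sign(w).
eps = 0.05
x_adv = [xi - eps * math.copysign(1.0, wi) for xi, wi in zip(x, w)]

label, p = predict(x)
label_adv, p_adv = predict(x_adv)
max_change = max(abs(a - b) for a, b in zip(x, x_adv))
print(label, label_adv, max_change)
```

No coordinate moves by more than `eps`, yet the logit shifts by `eps` times the sum of the absolute weights, which here is 5. The more input dimensions there are, the more these tiny per-coordinate nudges add up.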

Let’s be clear that when we discuss the black-box nature of ANN, we are not talking about the single-unit perceptron only being capable of learning linearly separable patterns (Minsky et al., 69). It is well established that the inability of single-layer networks to learn the XOR function does not extend to the multi-layer perceptron (MLP). Convolutional Neural Networks (CNN) are therefore working proof to the contrary: biologically inspired variants of MLPs with the explicit assumption that the input consists of images, so certain properties can be embedded into the architecture. The point here is against the rapid adoption of a technique which is black-box in nature, with greater computational burden, inherent non-determinism, and proneness to over-fitting, over its “better” counterparts. To paraphrase Jitendra Malik, without being an NN skeptic, there is no reason that multi-layer random forests or SVMs cannot achieve the same results. During the AI winter we made ANN a pariah; aren’t we repeating the same mistake with other techniques now?
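The XOR point above is easy to check by hand. The sketch below, with a step activation and hand-picked weights that are assumptions for illustration, searches a small grid of weights for a single threshold unit that reproduces XOR and finds none, then shows a two-layer arrangement that does. The grid search is only a finite spot check; the general impossibility follows from XOR not being linearly separable.

```python
from itertools import product

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def step(z):
    return 1 if z > 0 else 0


def single_unit(w1, w2, b, x):
    """One threshold unit: a single linear decision boundary."""
    return step(w1 * x[0] + w2 * x[1] + b)


def some_single_unit_fits_xor(grid):
    """Brute-force search over a small grid of weights and biases."""
    for w1, w2, b in product(grid, repeat=3):
        if all(single_unit(w1, w2, b, x) == y for x, y in XOR.items()):
            return True
    return False


def two_layer_xor(x):
    h_or = step(x[0] + x[1] - 0.5)    # fires if at least one input is 1
    h_and = step(x[0] + x[1] - 1.5)   # fires only if both inputs are 1
    return step(h_or - h_and - 0.5)   # OR and not AND


grid = [i / 2 for i in range(-8, 9)]  # -4.0, -3.5, ..., 4.0
found = some_single_unit_fits_xor(grid)
outputs = {x: two_layer_xor(x) for x in XOR}
print(found, outputs)
```

Note that this hand-built network is fully readable, since each hidden unit has an obvious meaning. The explainability concern in this post is about what happens when such weights are learned at scale instead of written down.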

Recently Elon Musk tweeted:

Worth reading Superintelligence by Bostrom. We need to be super careful with AI. Potentially more dangerous than nukes.

And even though things might not be so bad right now, let’s conclude with the following quote from Michael Jordan in IEEE Spectrum.

Sometimes those go beyond where the achievements actually are. Specifically on the topic of deep learning, it’s largely a rebranding of neural networks, which go back to the 1980s. … In the current wave, the main success story is the convolutional neural network, but that idea was already present in the previous wave. And one of the problems … is that people continue to infer that something involving neuroscience is behind it, and that deep learning is taking advantage of an understanding of how the brain processes information, learns, makes decisions, or copes with large amounts of data. And that is just patently false.

Now this also leaves the other fundamental question: is the pseudo-mimicry of biological neural nets actually a good approach to emulating intelligence? Or maybe Noam Chomsky on Where Artificial Intelligence Went Wrong?

That we will talk about some other day.

Last edited: 2017.11.27 03:18:28
