問:什么樣的算法可以稱為是可非局部泛化的揩抡?
A: By a non-locally generalizing algorithm I mean one that provides good generalizations even for inputs far from those it has seen during training. Such an algorithm must be able to generalize to new combinations of the underlying concepts that explain the data. Nearest-neighbor methods and related ones, like kernel SVMs and decision trees, can only generalize in some neighborhood around the training examples, in a simple way (like linear interpolation or linear extrapolation). Because the number of possible configurations of the underlying concepts that explain the data is exponentially large, this kind of generalization is good but not sufficient at all. Non-local generalization refers to the ability to generalize to a huge space of possible configurations of the underlying causes of the data, potentially very far from the observed training data, going beyond linear combinations of the training examples seen in the neighborhood of a given input.
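To make the contrast concrete, here is a minimal sketch of my own (an illustrative example, not taken from the answer above; the toy data, the sum-of-factors target, and the use of scikit-learn's KNeighborsRegressor and LinearRegression are all assumptions). The underlying "concepts" are d binary factors and the target is simply their sum. Training covers only inputs with at most 3 active factors, while the test inputs activate combinations of 8 or more factors, far from every training example:

```python
# A toy contrast between local generalization (k-nearest neighbors) and a model
# that captures the underlying factorized structure (linear regression).
# Hypothetical setup: d binary "concept" factors, target = number of active factors.
import itertools

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

d = 10

# Training set: every binary vector with at most 3 active factors.
train = [v for v in itertools.product([0, 1], repeat=d) if sum(v) <= 3]
X_train = np.array(train, dtype=float)
y_train = X_train.sum(axis=1)

# Test set: vectors with 8+ active factors -- new combinations of the same
# concepts, far (in input space) from every training example.
test = [v for v in itertools.product([0, 1], repeat=d) if sum(v) >= 8]
X_test = np.array(test, dtype=float)
y_test = X_test.sum(axis=1)

knn = KNeighborsRegressor(n_neighbors=5).fit(X_train, y_train)
lin = LinearRegression().fit(X_train, y_train)

# k-NN can only output values seen among nearby training examples (all <= 3),
# so it cannot reach the true targets (all >= 8).
print("k-NN mean error:   ", np.abs(knn.predict(X_test) - y_test).mean())
# The linear model recovers the per-factor additive rule and extrapolates to
# unseen combinations.
print("linear mean error: ", np.abs(lin.predict(X_test) - y_test).mean())
```

Running this, the k-NN error stays around 5 or more while the linear model's error is essentially zero: the neighborhood-based method can only interpolate among the outputs it has seen nearby, whereas a model that captures how the underlying factors combine generalizes to configurations it has never observed, which is the local-versus-non-local distinction in the answer above.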