A Survey on Deep Learning for Data-driven Soft Sensors
(Qingqiang Sun and Zhiqiang Ge, Senior Member, IEEE)
This post covers a 2021 survey on deep learning for soft sensing from Prof. Zhiqiang Ge's group at Zhejiang University. The paper gives a detailed summary of current deep learning work in the soft sensor field, together with future research hotspots and outlooks.
This post can serve as reading notes, or rather as a translation of the key content, to help readers quickly understand current deep learning research in soft sensing and the research directions ahead. The references are attached at the end for readers who wish to study the individual works in more detail.
Abstract
Soft sensing techniques are widely used in the process industries to enable many important applications such as process monitoring and quality prediction. With the development of hardware and software technologies, industrial processes have taken on new characteristics that degrade the performance of traditional soft sensor modeling methods. Deep learning, as a data-driven approach, has shown great potential in many fields, including soft sensing scenarios. After a period of development, especially over the past five years, many new issues have emerged that deserve investigation. This paper therefore first demonstrates the necessity and importance of deep learning for soft sensor applications by analyzing the advantages of deep learning and the trends in industrial processes. Next, the mainstream deep learning models, tricks, and frameworks/toolkits are summarized and discussed to help designers advance soft sensing technology. Existing work is then reviewed and analyzed, and the demands and problems arising in practical applications are discussed. Finally, outlooks and conclusions are given.
1. Introduction
At present, owing to the development of information technology and growing customer demands, the process industries are becoming increasingly complex. As a result, the cost and difficulty of directly measuring and analyzing key quality variables are both rising [1-3]. However, to monitor the operating state of a system, achieve smooth process control, and improve product quality, those key variables or quality indicators must be obtained as quickly and accurately as possible.
A soft sensor is a mathematical model that takes easy-to-measure auxiliary variables as inputs and hard-to-measure variables as outputs; such models have been developed over the past few decades to quickly estimate or predict important variables [4].
There are three main approaches to building soft sensor models: mechanism-based (first-principles) methods, knowledge-based methods, and data-driven methods [5]. The first two can achieve good results if detailed and accurate process mechanisms are known, or if rich experience and knowledge about the process are available. However, the growing complexity of industrial processes means these prerequisites can no longer be easily satisfied. Data-driven modeling has consequently become the mainstream approach to soft sensor modeling [6,7].
Traditional data-driven soft sensor modeling methods mainly comprise a variety of statistical inference and machine learning techniques, such as principal component regression (principal component analysis combined with a regression model), partial least squares regression, support vector machines, and artificial neural networks [8-12]. Over the past 20 years, with technical breakthroughs on several key problems, networks with a sufficient number of hidden layers or sufficiently complex structures have become available; these are known as deep learning (DL) techniques [13,14]. DL allows computational models composed of multiple processing layers to learn data representations with multiple levels of abstraction. These methods have dramatically improved the state of the art in speech recognition, object detection, and many other domains such as drug discovery and genomics [15]. In recent years, more and more studies have applied deep learning methods to soft sensing. Many objective differences exist between the traditional artificial intelligence domains and the soft sensor field, and many questions need to be investigated and discussed, including but not limited to the following: Is deep learning necessary and suitable in soft sensing scenarios? Which deep learning models can be used in practical applications? How should they be applied to solve problems in real processes? What are the potential research directions for the future? The motivation of this work is therefore to answer these questions as reasonably as possible.
The rest of the paper is organized as follows. Section II discusses the distinctive advantages of DL and demonstrates its necessity for soft sensor modeling. Section III outlines several typical DL models and core training tricks. Section IV then surveys the state of the art of soft sensor applications using DL methods. Discussion and outlooks are given in Section V. Finally, the conclusions of this work are presented in Section VI.
2. The Significance of Deep Learning for Soft Sensing
Detailed reviews of conventional methods can be found in existing work, e.g., [7,16]. Although these methods have seen many applications, they may suffer from drawbacks such as the heavy workload of handcrafted feature engineering or inefficiency when handling large amounts of data. To demonstrate the significance of DL for soft sensor modeling, the distinctive advantages of DL and the trends or characteristics of industrial processes should both be discussed.
A. Advantages of Deep Learning Techniques
First, the structure of a simple network with a single hidden layer is shown in Fig. 1. It has three layers: an input layer, a hidden layer, and an output layer. The input layer contains the variables x1 to xm and a constant node "1". The hidden layer has a number of nodes, each with an activation function. The feature of each node is extracted through an affine transformation of the original inputs followed by the activation function, i.e., h_j = f(Σ_i w_ij x_i + b_j).
According to the universal approximation theorem, the function represented by the network shown in Fig. 1 can approximate any continuous function if there are enough nodes in the hidden layer [17-19]. Moreover, representing some functions becomes much simpler when multiple layers of neurons are used.
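As a toy illustration of this approximation property (my own sketch, not from the paper), the following code builds a single hidden layer of random tanh units and fits only the output weights by least squares, which is enough to approximate a smooth function such as sin(x):

```python
import numpy as np

# Single hidden layer: random input weights/biases, tanh activations,
# output weights solved in closed form (an ELM-style shortcut chosen
# here for brevity; the survey itself does not prescribe this fit).
rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(x)

n_hidden = 50
W = rng.normal(size=(1, n_hidden))      # input-to-hidden weights
b = rng.normal(size=n_hidden)           # hidden biases
H = np.tanh(x @ W + b)                  # hidden features f(Wx + b)

beta, *_ = np.linalg.lstsq(H, y, rcond=None)   # output weights
y_hat = H @ beta
rmse = float(np.sqrt(np.mean((y - y_hat) ** 2)))
print(rmse)
```

With only 50 hidden units the fit is already very close, which is the practical content of the universal approximation theorem.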
Compared with traditional soft sensor modeling methods, deep learning has its own advantages. Here, the traditional methods are roughly divided into three categories: rule-based systems, classical machine learning, and shallow representation learning. The differences among them are illustrated in Fig. 2, where the green boxes indicate components able to learn information from data [21].
In summary, compared with traditional algorithms, the advantages of deep learning lie mainly in (i) learning representations without requiring knowledge or experience, and (ii) making full use of massive data to improve performance. Based on a thorough literature study and to the best of our knowledge, two main trends in the development of industrial processes can be identified: (i) they are becoming increasingly complex and constantly changing; (ii) large amounts of process data are being generated and stored. In this context, the characteristics of deep learning discussed in Section II fit these two trends perfectly. First, deep learning can avoid complex feature engineering and automatically learn abstract representations (Fig. 2). Second, deep learning can make full use of massive data to effectively improve modeling performance (Fig. 3). This is why deep learning techniques are significant, and will become increasingly important, for soft sensor applications.
The paper then introduces the common deep learning model families, such as the SAE, RBM, DBN, and CNN, which are not repeated one by one here.
3. General Tricks for Developing DL Models
Although deep learning has great potential, effectively training a deep model with satisfactory generalization performance can be very challenging. The reasons lie mainly in the overfitting and vanishing-gradient problems caused by deep structures. To overcome or alleviate these problems, the following tricks should be helpful when training deep models.
Regularization
Regularization is an effective tool for overcoming the high-variance problem, i.e., overfitting. A straightforward approach is to regularize the cost function with a parameter norm penalty, such as L1 or L2 regularization. When the cost function is minimized, the parameters are then also constrained from growing too large.
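A minimal sketch of the L2 (weight-decay) idea, assuming a plain linear model for simplicity: the penalized least-squares problem ||Xw - y||^2 + lam*||w||^2 has the closed-form solution w = (X^T X + lam*I)^{-1} X^T y, and a larger penalty coefficient shrinks the weight norm:

```python
import numpy as np

# Ridge (L2-penalized) regression on synthetic data: the penalty term
# lam*||w||^2 keeps the fitted weights from growing too large.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=100)

def ridge(X, y, lam):
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

w_small_pen = ridge(X, y, lam=0.01)    # almost ordinary least squares
w_large_pen = ridge(X, y, lam=100.0)   # strongly shrunken weights
print(np.linalg.norm(w_small_pen), np.linalg.norm(w_large_pen))
```

The same principle carries over to deep networks, where the penalty is added to the training loss and handled by the optimizer (often exposed as a "weight decay" option).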
Dataset Augmentation
Obtaining more data for training a machine learning model is the best way to improve its generalization performance. Although collecting large amounts of data from real scenarios may not be easy, creating new synthetic data is meaningful for some specific tasks, such as object recognition [69] and speech recognition [70]. Injecting noise into the input layer can also be regarded as a form of data augmentation [71,72].
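The noise-injection form of augmentation can be sketched as follows; the data, the number of copies, and the noise scale (0.01) are all illustrative assumptions:

```python
import numpy as np

# Input-noise augmentation: each original sample spawns several noisy
# copies, and the original labels are reused for those copies.
rng = np.random.default_rng(2)
X = rng.normal(size=(50, 4))        # 50 original process samples
y = X.sum(axis=1)                   # synthetic target

n_copies = 5
X_aug = np.vstack([X + 0.01 * rng.normal(size=X.shape)
                   for _ in range(n_copies)])
y_aug = np.tile(y, n_copies)
print(X_aug.shape, y_aug.shape)
```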
Early Stopping
The validation cost during training typically first decreases and then increases as learning proceeds further, which indicates that overfitting has occurred. To avoid this problem, the parameter settings should be saved every time a better validation error appears, so that after all training steps are finished, the model can return to the best-performing point [73]. The early stopping strategy thus prevents the parameters from over-learning.
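The strategy can be sketched as a simple loop; the validation-error sequence and the `patience` value below are made up for illustration:

```python
# Early stopping: keep the parameters from the epoch with the best
# validation error and stop after `patience` epochs without improvement.
val_errors = [0.9, 0.7, 0.55, 0.50, 0.52, 0.56, 0.61]  # rise -> overfitting

best_err, best_epoch, patience, waited = float("inf"), -1, 2, 0
for epoch, err in enumerate(val_errors):
    if err < best_err:
        best_err, best_epoch, waited = err, epoch, 0   # "save parameters" here
    else:
        waited += 1
        if waited >= patience:
            break                                      # stop training
print(best_epoch, best_err)
```

In a real training loop the "save parameters" step would checkpoint the network weights, and those of `best_epoch` would be restored at the end.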
Sparse Representations
Another penalty-based method constrains the activation units, which indirectly penalizes the complexity of the parameters. Similar to ordinary regularization, a penalty term based on the activation states of the hidden units is added to the cost function. To keep the cost relatively small, the probability of a neuron being activated should be as small as possible [74]. Other approaches, such as a KL-divergence penalty or a hard constraint on the activation values, have also been applied.
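A hedged sketch of the KL-divergence sparsity penalty mentioned above, where `rho` is a target activation rate and `rho_hat` is the measured mean activation of each hidden unit; the activation values are invented for illustration:

```python
import numpy as np

# KL sparsity penalty between a target rate rho and the mean activation
# rho_hat of each hidden unit; it is near zero when units fire rarely
# (close to rho) and large when they fire often.
def kl_sparsity(rho, rho_hat):
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

rho = 0.05
dense_units = np.array([0.5, 0.6, 0.4])     # units that fire often
sparse_units = np.array([0.06, 0.05, 0.04]) # units close to the target
print(kl_sparsity(rho, dense_units), kl_sparsity(rho, sparse_units))
```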
Dropout
Dropout is an ensemble-style strategy. The basic principle is to remove non-output units (e.g., by multiplying their outputs by zero) from the base network, thereby forming several sub-networks. Each input and hidden unit is included according to a sampling probability, which guarantees the randomness and diversity of the sub-models. The ensemble weights are usually obtained according to the probabilities of the sub-models [78]. Another notable advantage is that dropout imposes almost no restrictions on the applicable models or training procedures. However, it does not work very well when only a small amount of data is available [79].
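A minimal sketch of inverted dropout at training time; the layer values and the drop probability are illustrative:

```python
import numpy as np

# Inverted dropout: zero each unit with probability p during training and
# rescale the survivors by 1/(1-p) so the expected activation is unchanged;
# at test time the layer is used as-is.
rng = np.random.default_rng(3)

def dropout(h, p, train=True):
    if not train:
        return h
    mask = (rng.random(h.shape) >= p).astype(h.dtype)
    return h * mask / (1.0 - p)

h = np.ones(10000)
h_train = dropout(h, p=0.5)
print(h_train.mean())   # close to 1.0 in expectation
```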
Batch Normalization
Batch normalization is an adaptive reparameterization method designed to make very deep networks easier to train [80]. During training, the parameters of the hidden layers in a deep network keep changing, which causes the internal covariate shift problem. In general, the overall distribution gradually drifts toward the upper and lower limits of the value range of the nonlinear function, so the gradient easily vanishes during backpropagation. Batch normalization standardizes the mean and variance of each unit to stabilize learning, while still allowing the relationships between units and the nonlinear statistics of a single unit to change.
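The per-unit standardization can be sketched as follows; the learnable scale and shift (`gamma`, `beta`) are left at their defaults here, and the activations are synthetic:

```python
import numpy as np

# Batch normalization of one layer's pre-activations: standardize each
# unit over the mini-batch, then apply a learnable scale/shift.
def batch_norm(z, gamma=1.0, beta=0.0, eps=1e-5):
    mu = z.mean(axis=0)                  # per-unit batch mean
    var = z.var(axis=0)                  # per-unit batch variance
    z_hat = (z - mu) / np.sqrt(var + eps)
    return gamma * z_hat + beta

rng = np.random.default_rng(4)
z = 5.0 + 3.0 * rng.normal(size=(64, 8))   # shifted/scaled activations
z_bn = batch_norm(z)
print(z_bn.mean(), z_bn.std())
```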
Frameworks for Developing Deep Learning Algorithms
To facilitate the development of deep learning algorithms, several open-source frameworks are available that include state-of-the-art algorithms or well-designed low-level network elements, e.g., TensorFlow [81], Caffe [82], Theano [83], CNTK [84], Keras [85], and PyTorch [86]. A comparison of these platforms is shown in Table 2.
In short, PyTorch, a deep learning framework that has risen in recent years, is concise, fast, and easy to use, has an active community, and is increasingly favored by academic researchers.
Successfully developing a deep learning algorithm is in fact a highly iterative process, which can be summarized as Fig. 8. For soft sensor applications, the first step is to identify the demands or problems existing in real industrial processes (e.g., semi-supervised learning, dynamic modeling, missing data, etc.) and to come up with a new idea worth trying. The next step is to code it with an open-source framework or toolkit. After that, data are collected and fed into the program to obtain a result that tells the designer how well this particular algorithm or configuration works. Based on the result, the designer refines the idea, adjusts the strategy, and searches for a better network. The process is then repeated, iteratively improving the scheme until the desired performance is achieved.
To help readers follow the latest progress and better develop high-performance soft sensor models, this paper reviews applications based on deep learning techniques. Existing works are introduced and discussed, mainly highlighting factors such as motivation, strategy, and effectiveness. The following content is organized according to the mainstream model family to which each work belongs.
A. Applications Based on Autoencoders (AE)
AEs and their variants are widely used to build soft sensor models for semi-supervised learning and to handle the missing-data problem in industrial processes. Moreover, excellent performance can be obtained by combining them with traditional machine learning algorithms.
Since the AE is an unsupervised learning model, it is usually modified into a semi-supervised or supervised form to accomplish prediction tasks. For example, a semi-supervised probabilistic latent variable regression model was built with a variational autoencoder (VAE) in [87]. A common approach is to introduce the supervision of the label variable into the encoding and decoding processes. In [88], a variable-wise weighted SAE (VW-SAE) was proposed, which introduces the linear Pearson correlation between each hidden layer's input and the quality label during pretraining so that features are extracted in a semi-supervised manner. Techniques based on nonlinear relationships, such as mutual information [89], have also been adopted to extract feature representations better. However, both linear and nonlinear relationships are manually specified and may be insufficient or inappropriate. A relatively more intelligent and automated way is therefore to add the prediction loss of the quality label to the pretraining cost [90]. Other strategies can also be adopted to build connections between the hidden layers and the label values. Sun et al. used gated units to measure the contributions of features in different hidden layers, better controlling the information flow between the hidden layers and the output layer [91]. Furthermore, for semi-supervised scenarios with only a few labeled samples and an excess of unlabeled samples, a double-ensemble learning method considering both data diversity and structural diversity was proposed [92].
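The idea of adding the quality label's prediction loss to the pretraining cost (the spirit of approaches like [90]) can be sketched as a combined objective J = ||x - x_hat||^2 + lam*||y - y_hat||^2. The weights below are random stand-ins, not a trained model, and the data are synthetic:

```python
import numpy as np

# Quality-relevant autoencoder cost: reconstruction error plus a
# label-prediction term computed from the same hidden features.
rng = np.random.default_rng(5)
x = rng.normal(size=(32, 10))     # process (input) variables
y = rng.normal(size=(32, 1))      # quality label

W_enc, W_dec, W_out = (rng.normal(size=s) * 0.1
                       for s in [(10, 6), (6, 10), (6, 1)])
h = np.tanh(x @ W_enc)            # encoded features
x_hat = h @ W_dec                 # reconstruction branch
y_hat = h @ W_out                 # label-prediction branch

lam = 0.5                         # trade-off weight (illustrative)
J = np.mean((x - x_hat) ** 2) + lam * np.mean((y - y_hat) ** 2)
print(J)
```

Training would minimize J over all three weight matrices, so the learned features are forced to be both reconstructive and quality-relevant.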
Missing data is one of the most common problems in industrial soft sensor design. As a variant of the autoencoder, the VAE performs well in learning data distributions and handling missing-data problems. For example, based on the VAE and the Wasserstein GAN, a generative model named VA-WGAN was proposed that can generate realistic data with the same distribution as the industrial process, which is difficult for traditional regression models [93]. In [94], a VAE was used to extract the distribution of each feature variable for a just-in-time modeling method, and its effectiveness was verified on numerical cases and an industrial process. The authors further proposed an output-relevant VAE for just-in-time soft sensor applications aimed at handling missing data [95], enriching the theory. Differently from the former, two VAEs were used in a novel soft sensor framework that also focuses on missing data [96]. The first, named the supervised deep VAE, is designed to obtain the distribution of the latent features, which is used as the prior for the second, called the modified unsupervised deep VAE. The whole framework is constructed by combining the first encoder with the second decoder, and it works well under missing-data conditions.
In some cases, AEs can work better when combined with other methods or when their learning strategies are improved. For example, Yao et al. implemented a deep autoencoder network for unsupervised feature extraction and then used an extreme learning machine for the regression task [97]. Wang et al. adopted the limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm to optimize the weights learned by an SAE and then fed the extracted features into a support vector regression (SVR) model to estimate the rotor deformation of air preheaters [98]. Instead of using a purely data-driven model, Wang et al. combined a knowledge-driven model (KDM) called the Lab model with a data-driven model (DDM), namely a stacked autoencoder; experimental results verified that this hybrid approach outperforms using the KDM or DDM alone [99]. Yan et al. proposed a DAE-based method with an improved gradient descent algorithm, which is effective compared with traditional shallow learning methods [100]. In addition, a just-in-time fine-tuning framework for SAE-based soft sensor structures was proposed for adaptively modeling time-varying processes.
B. Applications Based on Restricted Boltzmann Machines
Nonlinearity is a ubiquitous characteristic of industrial processes. For this reason, RBMs and their variants, especially the DBN, are commonly used as unsupervised nonlinear feature extractors in industrial process modeling. A predictor can then exploit the features learned by the RBM or DBN; SVR and the BPNN are two common choices. For example, to address the strong nonlinearity and strong correlation among multiple variables in a coal-fired boiler process [102], a novel deep structure using a continuous RBM (CRBM) and the SVR algorithm was proposed. A related work by Lian et al. used a DBN and SVR with improved particle swarm optimization for the task of rotor thermal deformation prediction [103]. In [104], a soft sensor model based on a DBN and a BPNN was proposed to predict the concentration of 4-carboxybenzaldehyde (4-CBA) in the industrial production process of purified terephthalic acid. Facing the complexity and nonlinearity of nonlinear system modeling, an improved BPNN based on an RBM was proposed in [105]; in that work, the structure of the BPNN was optimized using sensitivity analysis and mutual information theory, and the parameters were initialized with an RBM. In [106], a DBN was used to learn hierarchical features for a BPNN, built to model the relationship between the extracted features and the mill level in a ball mill production process. Besides SVR and the BPNN, the extreme learning machine (ELM) can also serve as the predictor on features extracted by a DBN; this approach was applied to determine the composition of nutrient solutions in soilless cultivation [107].
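The "unsupervised feature extractor plus predictor" pattern described above can be sketched with a tiny Bernoulli RBM trained by one-step contrastive divergence (CD-1) and a ridge regressor standing in for SVR/BPNN. This is a simplified illustration on synthetic data: biases are omitted and the hyperparameters are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(6)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = (rng.random((200, 12)) > 0.5).astype(float)   # binarized process data
y = X[:, :4].sum(axis=1)                          # synthetic target

# CD-1 training of a 12-visible / 8-hidden RBM (biases omitted).
n_hid, lr = 8, 0.05
W = 0.01 * rng.normal(size=(12, n_hid))
for _ in range(50):
    ph = sigmoid(X @ W)                           # positive phase
    h = (rng.random(ph.shape) < ph).astype(float) # sample hidden units
    Xr = sigmoid(h @ W.T)                         # reconstruction
    phr = sigmoid(Xr @ W)                         # negative phase
    W += lr * (X.T @ ph - Xr.T @ phr) / len(X)

# Learned hidden probabilities become features for a simple regressor.
H = sigmoid(X @ W)
beta = np.linalg.solve(H.T @ H + 0.1 * np.eye(n_hid), H.T @ y)
rmse = float(np.sqrt(np.mean((H @ beta - y) ** 2)))
print(rmse)
```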
To overcome the data-rich-but-information-poor problem, RBMs can be used in ensemble learning. For example, Zheng et al. proposed integrating an ensemble strategy, a DBN, and correntropy kernel regression into a unified soft sensor framework [108]. Similarly, an ensemble deep kernel learning model, which uses a DBN for unsupervised information extraction, was proposed for an industrial polymerization process [109]. In another case, the lack of labeled samples also causes an information shortage, which can be addressed by semi-supervised learning with a DBN, as in the work proposed in [110]. In [111], a DBN-based soft sensor framework was designed to address labeled-data scarcity, reduced computational complexity, and unsupervised feature extraction.
There are also some other interesting applications of RBMs. Graziani et al. designed a DBN-based soft sensor for a plant process to estimate the unknown measurement delay rather than a quality variable [112]. Another DBN-based model was applied to flame images, rather than the usual structured data, for oxygen content prediction in an industrial combustion process [113]. Zhu et al. studied the choice of DBN structure for soft sensor applications in an industrial polymerization process; compared with a feedforward neural network, the DBN-based method predicts the polymer melt index more accurately [114].
C. Applications Based on Convolutional Neural Networks
CNNs are mainly used to process grid-like data, especially image data. They can also be developed to capture the local dynamic characteristics of industrial process data or of process signals in the frequency domain. By processing image data, CNNs can be used to build soft sensor systems. For example, Horn et al. used a CNN to extract features from froth flotation measurements, showing good feature extraction speed and prediction performance [115]. However, compared with the common data forms, images are still rarely used for soft sensor construction.
Regarding dynamics, Yuan et al. proposed a multichannel CNN (MCNN) for soft sensor applications in an industrial debutanizer and a hydrocracking process, which can learn the interactions of different variable combinations and various local correlations [116]. In addition, Wang et al. used two CNN-based soft sensor models to handle abundant process data, keeping the complexity low while capturing the process dynamics [117]. A soft sensor system using a convolutional neural network was proposed in [118]; it predicts the next measurement by extracting time-dependent correlations from a moving window.
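The moving-window intuition can be sketched with a 1-D convolution: each output value mixes a local window of past measurements, which is how a convolutional layer captures local dynamics. The signal and the smoothing kernel below are arbitrary illustrations:

```python
import numpy as np

# A 1-D kernel sliding over a process signal: every output is a weighted
# combination of a 3-sample window, i.e., a local dynamic feature.
signal = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
kernel = np.array([0.25, 0.5, 0.25])          # illustrative smoothing kernel

features = np.convolve(signal, kernel, mode="valid")
print(features)
```

A CNN-based soft sensor stacks many such learned kernels (plus nonlinearities and pooling) instead of this single fixed one.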
In the frequency domain, CNNs can achieve high invariance to signal translation, scaling, and distortion. In [119], a pair of convolutional and max-pooling layers at the end of the network was used to extract high-level abstract features from the vibration spectral features of ball mill bearings; an ELM then learned the mapping from the extracted features to the mill level. In the field of aerospace engineering, a virtual sensor model using a CNN with partial vibration measurements was proposed to estimate structural responses, which is important for structural health monitoring and damage detection but where physical sensors are limited under the corresponding operating conditions [120].
D. Applications Based on Recurrent Neural Networks
RNNs are widely used for dynamic modeling, and variants such as the LSTM have also been applied in practical cases. RNN-based soft sensors have been developed to estimate variables with strong dynamics, such as the curing of epoxy/graphite fiber composites [121], the tire-road contact patch of automobiles [122], the indoor air quality (IAQ) of subways [123], the melt flow length in injection molding [124], biomass concentration [125], and the product concentration of a reactive distillation column [126].
Besides plain-RNN-based methods, the LSTM is also a popular model in soft sensor applications; because it weakens the long-term dependency problem, the LSTM can be made deeper and more powerful. For example, an LSTM-based soft sensor model was proposed in [127] to cope with the strong nonlinearity and dynamics of the process. In addition, Yuan et al. proposed a supervised LSTM network that uses both input variables and quality variables to learn dynamic hidden states; the method was applied effectively to a penicillin fermentation process and an industrial debutanizer column [128]. An LSTM network was also used to predict the content of nitrogen-source components in a wastewater treatment plant [129].
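A minimal LSTM cell forward pass (single unit, random weights, my own sketch) shows the gating mechanism that lets the cell state carry information across time steps:

```python
import numpy as np

rng = np.random.default_rng(7)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W):
    # One affine map produces all four gate pre-activations.
    z = W @ np.concatenate([x, h])
    f = sigmoid(z[0:1])               # forget gate
    i = sigmoid(z[1:2])               # input gate
    o = sigmoid(z[2:3])               # output gate
    g = np.tanh(z[3:4])               # candidate cell update
    c_new = f * c + i * g             # keep old state, add new information
    h_new = o * np.tanh(c_new)
    return h_new, c_new

W = 0.5 * rng.normal(size=(4, 2))     # input dim 1 + hidden dim 1
h = c = np.zeros(1)
for x_t in [0.1, 0.4, -0.2, 0.3]:     # a short process-signal sequence
    h, c = lstm_step(np.array([x_t]), h, c, W)
print(h, c)
```

Real soft sensors stack many such units (via `torch.nn.LSTM` or similar) and attach a regression head to the final hidden state.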
Other variants have been designed for specific industrial applications. For example, a two-stream network structure with batch normalization and dropout tricks was designed to learn different features of various process data [130]. In [131], another kind of RNN, the time-delay neural network (TDNN), was implemented for inferential state estimation of an ideal reactive distillation column. The echo state network (ESN), as a type of RNN, has also been used for soft sensor applications in high-density polyethylene (HDPE) and purified terephthalic acid (PTA) production processes [132]; singular value decomposition (SVD) was used there to resolve the collinearity and overfitting problems. Recently, an ensemble semi-supervised model combining an SAE with a bidirectional LSTM (BLSTM) was proposed in [133]; it not only extracts and exploits the transient behaviors in both labeled and unlabeled data but also considers the temporal dependencies hidden in the quality measurements themselves. Meanwhile, a GRU-based method for robust automatic deep extraction of dynamic features was proposed in [134] and achieved good performance on a debutanizer distillation process.
Semi-supervised Modeling
In [135], manifold embedding was integrated into a deep neural network (DNN) to build a semi-supervised framework, in which the manifold embedding exploits the local neighborhood relationships among industrial data and improves the efficiency with which the DNN uses unlabeled data. In addition, [136] proposed a just-in-time semi-supervised soft sensor based on the extreme learning machine for the online estimation of Mooney viscosity under multiple recipes.
Dynamic Modeling
Besides CNNs and RNNs, some other neural networks are used for dynamic modeling. Graziani et al. proposed a dynamic DNN-based soft sensor to estimate the octane number of a refinery powerformer unit, investigating a nonlinear finite impulse response model [137]. Wang et al. proposed a dynamic network named NARX-DNN, which interprets the quality prediction errors on validation data from different aspects and automatically determines the most suitable delay of historical data [138]. Moreover, a dynamic strategy was adopted in [139] to improve the dynamics-capturing performance of the extreme learning machine, combined with PLS.
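The lag-augmentation idea behind NARX-style dynamic models can be sketched as building a matrix of delayed inputs, so that a static learner regresses y_t on [u_t, u_{t-1}, ..., u_{t-d}] and thus sees the recent history; the series and lag depth are illustrative:

```python
import numpy as np

# Build a lagged regressor matrix: each row holds the current input
# followed by its n_lags delayed values (newest first).
def lagged_matrix(u, n_lags):
    rows = [u[t - n_lags:t + 1][::-1] for t in range(n_lags, len(u))]
    return np.array(rows)

u = np.arange(6.0)                 # input series 0, 1, ..., 5
X = lagged_matrix(u, n_lags=2)
print(X)
```

Choosing the delay depth `n_lags` automatically is exactly the problem NARX-DNN [138] addresses.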
Data Generation
Because of the harsh environments of industrial processes, collecting data directly can be difficult. Therefore, a data generation method based on generative adversarial networks was proposed in [140].
Redundancy Elimination
In [141], the double least absolute shrinkage and selection operator (dLASSO) algorithm was integrated into a multilayer perceptron (MLP) network to address two kinds of redundancy: input-variable redundancy and model-structure redundancy.
Inference and Approximation
Thanks to their strong learning capability, deep neural networks can be used for intelligent control purposes. For example, a soft sensor model based on Levenberg-Marquardt training and an adaptive linear network was designed in [142] and applied to the inferential control of a multicomponent distillation process. In addition, a radial basis function (RBF) neural network evolved with an adaptive fuzzy means algorithm was used to approximate an unknown system [143].
Summary of Existing Applications
The purposes of developing novel DL-based soft sensors include feature extraction, handling missing values, dynamic feature capture, semi-supervised modeling, and so on (as shown in Table 1). It is worth noting that only existing applications in the soft sensor field are discussed in detail; this does not mean that applications that have not yet appeared in the soft sensor field are infeasible. For example, although the VAE is the mainstream DL approach to the missing-value problem in soft sensor applications, RBM- and GAN-based methods are also feasible in other fields [144,145]. Different strategies have been adopted to design feasible models, such as optimizing the network structure, improving the training algorithm, and integrating different algorithms.
From the applications discussed in the subsections above, some further points can be summarized. First, statistics on soft sensor applications using DL methods are shown in Fig. 9, based on the 57 references discussed and cited in Section IV. From subfigure (a), it can be seen that algorithms based on DL theories have increased in recent years, reflecting the growing demand for DL models in practical industrial process modeling. Moreover, compared with the other three main model families, CNN-based methods are applied less often. This is because grid-like data such as images are used more for classification than for regression tasks. In addition, although the AE looks simpler than the other mainstream models, it is easier to develop and extend, so it also has great potential.
As shown in subfigure (b), soft sensor models based on DL theories have been built in a variety of scenarios, including the chemical industry, the power industry, machinery manufacturing, aerospace engineering, and so on. Among them, chemical industry applications account for the largest share, about 66.7%.
The effectiveness of most works introduced in this survey was verified by numerical simulation experiments (e.g., [95], [116]), by using publicly available benchmark datasets (e.g., [139]), or by modeling datasets from real processes (e.g., [93], [94], [95], [110], [116], [123]). The most common situation is the third type, which reflects the characteristics of real processes as much as possible. For example, in the chemical field, actual operating data were collected from a debutanizer process [96], a polymerization process [109], and a hydrocracking process [116]. However, when applying these soft sensors to real scenarios, more detailed and specific factors need to be considered.
Although deep learning has made great progress in many fields, much work remains to be done to better apply advanced methods to the soft sensor field, especially to meet the demands of real industrial processes. Data and structure are the two most important issues that must always be considered. Around these two topics, several hot research directions deserve more attention in the future.
Scarcity of Labeled Samples
Although data are easy to obtain under the big-data trend, labeling remains very expensive. We therefore always hope to train a model with good generalization ability using fewer labeled samples. The traditional solution to this problem is semi-supervised learning, but the increasingly severe imbalance between unlabeled and labeled data makes it less satisfactory. Self-supervised learning (SSL), an unsupervised strategy, is another feasible solution [146]. Unlike transfer learning [32,33], useful feature representations are learned from pretext tasks designed on the unlabeled input data (rather than from other similar datasets). The contrastive approach is one of the most popular types of SSL and has achieved great success in the speech, image, text, and reinforcement learning domains [148]. However, much research and exploration remain to be done on its soft sensor applications.
Hyperparameter Optimization
How to optimize the hyperparameters and structure of a network has long been a difficult problem for researchers and engineers [106,114,141], and most such work requires manual trial and error. To avoid the heavy workload and randomness, meta-learning theory, also known as "learning to learn," has been proposed and studied [148]. The motivation is to equip machines with human-like learning ability. Instead of learning a single function for a specific task, meta-learning learns a function that outputs functions for several subtasks. Accordingly, meta-learning requires many subtasks, each with its own training and test sets. After effective training, the machine can optimize hyperparameters and select the network structure by itself. This is attractive for multimode and constantly changing processes.
Model Reliability
Deep learning methods learn features in an end-to-end manner, which makes it harder for engineers or designers to understand what has been learned and how. Moreover, the dependence of the learning process on data amplifies the inaccuracy caused by poor data quality. Both factors threaten the reliability of DL models. It is therefore important to improve model reliability; model visualization [149,150] and the incorporation of experience or knowledge [151] are two feasible approaches. Model visualization helps researchers understand what has been learned, while introducing experience or knowledge helps reduce the inaccuracy of relying on data alone. However, both points require more research in practical industrial applications.
Distributed and Parallel Modeling
With the trend of industrial big data discussed in Section II, how to model a process efficiently from massive data is an important and urgent problem. One feasible solution is to convert the original deep learning model to distributed and parallel modeling. By splitting a large dataset into several small distributed blocks, data processing can be carried out simultaneously, which benefits large-scale data modeling [152,153]. However, there is still a long way to go.
Conclusion
Deep learning techniques have shown great potential in many fields, including soft sensing. To summarize the past, analyze the present, and look into the future, this work makes the following contributions on the application of deep learning theories in the soft sensor field: (i) the advantages of deep learning over traditional algorithms and the development trends of industrial processes are discussed in detail, demonstrating the necessity and significance of deep learning algorithms for soft sensor modeling; (ii) the main DL models, tricks, and frameworks/toolkits are discussed and summarized to help readers better develop DL-based soft sensors; (iii) practical application scenarios are analyzed by reviewing and discussing existing works and publications; (iv) possible research hotspots for future work are investigated.
We hope this paper serves as a taxonomy and a tutorial clarifying the progress made in the large body of deep-learning-based soft sensor work, and provides the community with a roadmap and a blueprint for future efforts.
References:
[1] B. Huang, and R. Kadali, Dynamic Modeling, Predictive Control and
Performance Monitoring, Springer London, 2008.
[2] X. Wang, B. Huang, and T. Chen, “Multirate Minimum Variance Control
Design and Control Performance Assessment: A Data-Driven Subspace
Approach,” IEEE. T. Contr. Syst. T., vol. 15, no. 1, pp. 65-74, 2006.
[3] Z. Chen, S. X. Ding, T. Peng, C. Yang, and W. Gui, “Fault Detection for
Non-Gaussian Processes Using Generalized Canonical Correlation
Analysis and Randomized Algorithms,” IEEE. T. Ind. Electron., vol. 65,
no. 2, pp. 1559-1567, 2018.
[4] Y. Jiang, S. Yin, J. Dong, O. Kaynak, “A Review on Soft Sensors for
Monitoring, Control and Optimization of Industrial Processes,” IEEE
Sensors Journal, 2020, doi: 10.1109/JSEN.2020.3033153.
[5] V. Venkatasubramanian, R. Rengaswamy, S. N. Kavuri, “A review of
process fault detection and diagnosis: Part II: Qualitative models and
search strategies,” Computers & Chemical Engineering, vol. 27, no. 3,
pp. 313-326, 2003.
[6] P. Kadlec, B. Gabrys, S. Strandt, “Data-driven soft sensors in the process
industry,” Comput. Chem. Eng. vol. 33, pp. 795-814, 2009.
[7] M. Kano, M. Ogawa, “The state of the art in chemical process control in
Japan: good practice and questionnaire survey,” J. Process Control, vol.
20, pp. 969-982, 2010.
[8] K. Pearson, “LIII. On lines and planes of closest fit to systems of points
in space,” Philosophical Magazine, vol. 2, no. 11, pp. 559-572, 1901.
[9] H. Wold, “Estimation of principal components and related models by
iterative least squares,” Multivar. Anal., Vol. 1, pp. 391-420, 1966.
[10] Q. Jiang, X. Yan, H. Yi and F. Gao, “Data-Driven Batch-End Quality
Modeling and Monitoring Based on Optimized Sparse Partial Least
Squares,” IEEE Transactions on Industrial Electronics, vol. 67, no. 5, pp.
4098-4107, May 2020, doi: 10.1109/TIE.2019.2922941.
[11] W. Yan, H. Shao, X. Wang, “Soft sensing modeling based on support
vector machine and Bayesian model selection,” Comput, Chem. Eng.
vol. 28, pp. 1489-1498, 2004.
[12] K. Desai, Y. Badhe, S.S. Tambe, B.D. Kulkarni, “Soft-sensor
development for fed-batch bioreactors using support vector regression,”
Biochem. Eng. J., vol. 27, pp. 225-239, 2006.
[13] G. Hinton, S. Osindero, Y-W. Teh, “A Fast Learning Algorithm for Deep
Belief Nets,” Neural Comput., vol. 18, no. 7, pp. 1527-1554, 2006.
[14] X. Glorot and Y. Bengio, “Understanding the difficulty of training deep
feedforward neural networks,” J. Mach. Learn. Res., vol. 9, pp. 249–256,
2010.
[15] Y. LeCun, Y. Bengio, G. Hinton, “Deep learning,” Nature, vol. 521, no.
7553, pp. 436-444, 2015.
[16] F.A.A. Souza, R. Araújo, and J. Mendes, “Review of soft sensor methods
for regression applications,” Chemometrics and Intelligent Laboratory
Systems, vol. 152, pp.69-79, 2016.
[17] K. Hornik, et al. “Multilayer feedforward networks are universal
approximations,” Neural Networks, vol. 2, pp. 359-366, 1989.
[18] G. Cybenko, “Approximation by superpositions of a sigmoidal
function,” Math. Control Signals System, vol. 2, pp. 303-314, 1989.
[19] K. Hornik, “Approximation capabilities of multilayer feedforward
networks,” Neural Networks, vol. 4, pp. 251-257, 1991.
[20] K. He, X. Zhang, S. Ren, J. Sun, “Deep Residual Learning for Image
Recognition,” arXiv:1512.03385v1, 2015.
[21] I. Goodfellow, Y. Bengio, A. Courville, Deep learning, vol. 1,
Cambridge, MA, USA: the MIT press, 2016.
[22] C. Grosan, A. Abraham, “Rule-Based Expert Systems,” Intelligent
Systems, vol. 17, pp. 149-185, 2011.
[23] A. Ligęza, Logical Foundations for Rule-based Systems. 2nd edn.
Springer, Heidelberg, 2006.
[24] J. Durkin, Expert Systems: Design and Development. Prentice Hall, New
York, 1994.
[25] C. R. Turner, A. Fuggetta, L. Lavazza, A. L. Wolf, “A conceptual basis
for feature engineering,” Journal of Systems and Software, vol. 49, no. 1,
pp. 3-15, 1999.
[26] F. Nargesian, H. Samulowitz, U. Khurana, E. B. Khalil, D. Turaga,
“Learning Feature Engineering for Classification,” Presented at
Proceedings of the Twenty-Sixth International Joint Conference on
Artificial Intelligence, Aug. 2017, doi: 10.24963/ijcai.2017/352.
[27] Y. Bengio, A. Courville, and P. Vincent, “Representation learning: A
review and new perspectives,” IEEE Transactions on Pattern Analysis
and Machine Intelligence, vol.35, no. 8, pp. 1798-1828, 2013.
[28] Andrew Ng, “Scale drives machine learning progress,” in Machine
Learning Yearning, pp. 10-12. [online]. Available:
https://www.deeplearning.ai/machine-learning-yearning/.
[29] S. J. Pan, Q. Yang, “A survey on transfer learning,” IEEE Transactions
on Knowledge and Data Engineering, vol. 22, no.10, pp. 1345-1359,
Oct. 2010.
[30] Y. Bengio, “Deep Learning of Representations for Unsupervised and
Transfer Learning,” Proceedings of ICML workshop on unsupervised
and transfer learning, pp. 17-36, 2012.
[31] W. Shao, Z. Song, and L. Yao, “Soft sensor development for multimode
processes based on semisupervised Gaussian mixture models,”
IFAC-PapersOnLine, vol. 51, no. 18, pp. 614–619, 2018.
[32] F. A. A. Souza and R. Araújo, “Mixture of partial least squares experts
and application in prediction settings with multiple operating modes,”
Chemometrics Intell. Lab. Syst., vol. 130, no. 15, pp. 192–202, 2014.
[33] H. Jin, X. Chen, L. Wang, K. Yang, and L. Wu, “Dual learning-based
online ensemble regression approach for adaptive soft sensor modeling
of non-linear time-varying processes,” Chemometrics Intell. Lab. Syst.,
vol. 151, pp. 228–244, 2016.
[34] M. Kano, and K. Fujiwara, “Virtual sensing technology in process
industries: trends and challenges revealed by recent industrial
applications,” Journal of Chemical Engineering of Japan, 2012, doi:
10.1252/jcej.12we167.
[35] L. X. Yu, “Pharmaceutical Quality by Design: Product and Process
Development, Understanding, and Control,” Pharm Res, vol. 25, pp.
781–791, 2008, doi: 10.1007/s11095-007-9511-1.
[36] S. J. Qin, “Process Data Analytics in the Era of Big Data,” AIChE
Journal, vol. 60, no. 9, pp. 3092-3100, 2014.
[37] N. Stojanovic, M. Dinic, L. Stojanovic, “Big data process analytics for
continuous process improvement in manufacturing,” 2015 IEEE
International Conference on Big Data, 2015, doi:
10.1109/BigData.2015.7363900.
[38] L. Yao, Z. Ge, “Big data quality prediction in the process industry: A
distributed parallel modeling framework,” J. Process Contr., vol. 68, pp.
1-13, 2018.
[39] M. S. Reis, and G. Gins, “Industrial Process Monitoring in the Big
Data/Industry 4.0 Era: from Detection, to Diagnosis, to Prognosis,”
Processes, vol. 5, no. 3, 35, 2017, doi:10.3390/pr5030035.
[40] S. W. Roberts, “Control charts tests based on geometric moving
averages,” Technometrics, vol. 1, pp. 239-250, 1959.
[41] C. A. Lowry, W. H. Woodall, C. W. Champ, C. E. Rigdon, “A
multivariate exponentially weighted moving average control chart,”
Technometrics, vol. 34, pp. 46–53, 1992.
[42] T. Kourti, J. F. MacGregor, “Multivariate SPC methods for process and
product monitoring,” J. Qual. Technol., vol. 28, pp. 409–428, 1996.
[43] M. S. Reis, P. M. Saraiva, “Prediction of profiles in the process
industries,” Ind. Eng. Chem. Res., vol. 51, pp. 4254–4266, 2012.
[44] C. Duchesne, J. J. Liu, J. F. MacGregor, “Multivariate image analysis in
the process industries: A review,” Chemom. Intell. Lab. Syst., vol. 117,
pp. 116-128, 2012.
[45] D. C. Montgomery, C. M. Mastrangelo, “Some statistical process control
methods for autocorrelated data,” J. Qual. Technol., vol. 23, pp. 179–
193, 1991.
[46] T. J. Rato, M. S. Reis, “Advantage of using decorrelated residuals in
dynamic principal component analysis for monitoring large-scale
systems,” Ind. Eng. Chem. Res., vol. 52, pp. 13685–13698, 2013.
[47] G. E. Hinton, and J. L. McClelland, “Learning representations by
recirculation,” In NIPS’ 1987, pp. 358–366, 1988.
[48] D. E. Rumelhar, G. E. Hinton, R. J. Williams, “Learning representations
by back-propagating errors,” Nature, vol. 323, no. 6088, pp. 533-536,
1986.
[49] H. Larochelle, I. Lajoie, Y. Bengio, P. A. Manzagol, “Stacked denoising
autoencoders: learning useful representations in a deep network with a
local denoising criterion,” J Mach Learn Res, vol. 11, no. 12, pp.
3371-3408, 2010.
[50] B. Schölkopf, J. Platt, T. Hofmann, “Efficient learning of sparse
representations with an energy-Based model,” Proceedings of advances
in neural information processingsystems, pp. 1137-1144, 2006.
[51] M. A. Ranzato, Y. L. Boureau, Y. Lecun, “Sparse feature learning for
deep belief networks,” Proceedings of international conference on neural
information processing systems, vol. 20, pp. 1185-1192, 2007.
[52] A. Hassanzadeh, A. Kaarna, T. Kauranne, “Unsupervised multi-manifold
classification of hyperspectral remote sensing images with contractive
Autoencoder,” Neurocomputing, vol. 257, pp.67-78.
[53] Y. Bengio, “Learning deep architectures for AI,” Foundations and Trends
in Machine Learning, vol. 2, no. 1, pp. 1–127, 2009.
[54] G. E. Hinton, “A practical guide to training restricted Boltzmann
machines,” Neural networks: Tricks of the trade. Springer, Berlin,
Heidelberg, pp. 599-619, 2012.
[55] G. E. Hinton, R. R. Salakhutdinov, “Deep Boltzmann machines,” J Mach
Learn Res, vol. 5, no. 2, pp. 1967-2006, 2009.
[56] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification
with deep convolutional neural networks,” Advances in neural
information processing systems. 2012.
[57] Y. Zhou, and R. Chellappa, “Computation of optical flow using a neural
network,” IEEE 1988 International Conference on Neural Networks,
1988, doi: 10.1109/ICNN.1988.23914.
[58] Y. LeCun, L. Bottou, Y. Bengio, et al. “Gradient-based learning applied
to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp.
2278–2324, 1998.
[59] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification
with deep convolutional neural networks,” In Advances in Neural
Information Processing Systems, pp. 1097–1105, 2012.
[60] K. Simonyan, A. Zisserman, “Very deep convolutional networks for
large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
[61] P. J. Werbos, “Backpropagation through time: What it does and how to
do it,” Proc. IEEE, vol. 78, no. 10, pp. 1550–1560, Oct. 1990.
[62] Y. Bengio, P. Simard, and P. Frasconi, “Learning long-term dependencies
with gradient descent is difficult,” IEEE Transactions on Neural
Networks, vol. 5, no. 2, pp. 157–166, 1994.
[63] R. Pascanu, T. Mikolov, Y. Bengio, “On the difficulty of training
recurrent neural networks,” In Proceedings of International Conference
on Machine Learning, pp. 1310-1318, 2013.
[64] F. A. Gers, J. Schmidhuber, and F. Cummins, “Learning to forget:
Continual prediction with LSTM,” Neural computation, vol. 12, no. 10,
pp. 2451–2471, 2000.
[65] R. Pascanu, C. Gulcehre, K. Cho, and Y. Bengio, “How to construct deep
recurrent neural networks,” arXiv preprint arXiv:1312.6026, 2013.
[66] K. Cho, B. V. Merriënboer, C. Gulcehre, F. Bougares, H. Schwenk, and
Y. Bengio, “Learning phrase representations using RNN
encoder-decoder for statistical machine translation,” In Proceedings of
the Empiricial Methods in Natural Language Processing 2014, 2014.
[67] G. Chrupala, A. Kadar, and A. Alishahi, “Learning language through
pictures,” arXiv: 1506.03694, 2015.
[68] F. Girosi, M. Jones, and T. Poggio, “Regularization theory and neural
networks architectures,” Neural computation, vol. 7, no. 2, pp. 219-269,
1995.
[69] D. M. Montserrat, Q. Lin, J. Allebach, E. J. Delp, “Training object
detection and recognition CNN models using data augmentation,”
Electronic Imaging, vol. 2017, no. 10, pp. 27-36, 2017.
[70] N. Jaitly, and G. E. Hinton, “Vocal tract length perturbation (VTLP)
improves speech recognition,” Proc. ICML Workshop on Deep Learning
for Audio, Speech and Language, Vol. 117, 2013.
[71] P. Vincent, H. Larochelle, Y. Bengio, et al. “Extracting and composing
robust features with denoising autoencoders,” Proceedings of the 25th
international conference on Machine learning, pp. 1096-1103, 2008.
[72] B. Poole, J. Sohl-Dickstein, and S. Ganguli, “Analyzing noise in
autoencoders and deep networks,” arXiv preprint arXiv: 1406.1831,
2014.
[73] R. Caruana, S. Lawrence, and C. L. Giles, “Overfitting in neural nets:
Backpropagation, conjugate gradient, and early stopping,” Advances in
neural information processing systems, 2001.
[74] Z. Zhang, Y. Xu, J. Yang, X. Li, D. Zhang, “A survey of sparse
representation: algorithms and applications,” IEEE access, vol. 3, pp.
[75] H. Larochelle, Y. Bengio, “Classification using discriminative restricted
Boltzmann machines,” Proceedings of the 25th international conference
on Machine learning, pp. 536-543, 2008.
[76] Y. Pati, R. Rezaiifar, and P. Krishnaprasad, “Orthogonal matching
pursuit: Recursive function approximation with applications to wavelet
decomposition,” In Proceedings of the 27 th Annual Asilomar
Conference on Signals, Systems, and Computers, pp. 40–44, 1993.
[77] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R.
Salakhutdinov, “Dropout: A simple way to prevent neural networks from
overfitting,” Journal of Machine Learning Research, vol. 15, pp. 1929–
1958, 2014.
[78] G. E. Hinton, N. Srivastava, A. Krizhevsky, et al. “Improving neural
networks by preventing co-adaptation of feature detectors,” arXiv
preprint arXiv:1207.0580, 2012.
[79] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R.
Salakhutdinov, “Dropout: A simple way to prevent neural networks from
overfitting,” Journal of Machine Learning Research, vol. 15, pp. 1929–
1958, 2014.
[80] S. Ioffe, C. Szegedy, “Batch normalization: Accelerating deep network
training by reducing internal covariate shift,” arXiv preprint
arXiv:1502.03167, 2015.
[81] M. Abadi, P. Barham, J. Chen, et al. “Tensorflow: A system for
large-scale machine learning,” 12th Symposium on Operating Systems
Design and Implementation, pp.265-283, 2016.
[82] Y. Jia, E. Shelhamer, J. Donahue, et al. “Caffe: Convolutional
architecture for fast feature embedding,” Proceedings of the 22nd ACM
international conference on Multimedia, pp. 675-678, 2014.
[83] F. Bastien, P. Lamblin, R. Pascanu, et al. “Theano: new features and
speed improvements,” arXiv preprint arXiv:1211.5590, 2012.
[84] F. Seide, A. Agarwal, “CNTK: Microsoft's open-source deep-learning
toolkit,” Proceedings of the 22nd ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, pp. 2135-2135,
2016.
[85] A. Gulli, S. Pal, Deep learning with Keras. Packt Publishing Ltd, 2017.
[86] A. Paszke, S. Gross, F. Massa, et al. “Pytorch: An imperative style,
high-performance deep learning library,” Advances in Neural
Information Processing Systems, pp. 8026-8037, 2019.
[87] B. Shen, L. Yao, Z. Ge, “Nonlinear probabilistic latent variable
regression models for soft sensor application: From shallow to deep
structure,” Control Engineering Practice, vol. 94, 2020, doi:
10.1016/j.conengprac.2019.104198.
[88] X. Yuan, B. Huang, Y. Wang, et al. “Deep learning-based feature
representation and its application for soft sensor modeling with
variable-wise weighted SAE,” IEEE Transactions on Industrial
Informatics, vol. 14, no. 7, pp. 3235-3243, 2018.
[89] X. Yan, J. Wang, and Q. Jiang, “Deep relevant representation learning for
soft sensing,” Information Sciences, vol. 514, pp. 263-274, 2020.
[90] X. Yuan, J. Zhou, B. Huang, et al. “Hierarchical quality-relevant feature
representation for soft sensor modeling: a novel deep learning strategy,”
IEEE Transactions on Industrial Informatics, vol. 16, no. 6, pp.
3721-3730, 2019.
[91] Q. Sun, Z. Ge, “Gated Stacked Target-Related Autoencoder: A Novel
Deep Feature Extraction and Layerwise Ensemble Method for Industrial
Soft Sensor Application,” IEEE transactions on cybernetics, 2020, doi:
10.1109/TCYB.2020.3010331.
[92] Q. Sun, Z. Ge, “Deep Learning for Industrial KPI Prediction: When
Ensemble Learning Meets Semi-Supervised Data,” IEEE Transactions
on Industrial Informatics, 2020, doi: 10.1109/TII.2020.2969709.
[93] X. Wang, H. Liu, “Data supplement for a soft sensor using a new
generative model based on a variational autoencoder and Wasserstein
GAN,” Journal of Process Control, vol. 85, pp. 91-99, 2020.
[94] F. Guo, R. Xie, B. Huang, “A deep learning just-in-time modeling
approach for soft sensor based on variational autoencoder,”
Chemometrics and Intelligent Laboratory Systems, vol. 197, 2020, doi:
10.1016/j.chemolab.2019.103922.
[95] F. Guo, W. Bai, B. Huang, “Output-relevant variational autoencoder for
just-in-time soft sensor modeling with missing data,” Journal of Process
Control, vol. 92, pp. 90-97, 2020.
[96] R. Xie, N. M. Jan, K. Hao, et al. “Supervised Variational Autoencoders
for Soft Sensor Modeling with Missing Data,” IEEE Transactions on
Industrial Informatics, vol. 16, no. 4, pp. 2820-2828, 2019.
[97] L. Yao, Z. Ge, “Deep learning of semisupervised process data with
hierarchical extreme learning machine and soft sensor application,”
IEEE Transactions on Industrial Electronics, vol. 65, no. 2, pp.
1490-1498, 2018.
[98] X. Wang, H. Liu, “Soft sensor based on stacked auto-encoder deep
neural network for air preheater rotor deformation prediction,”
Advanced Engineering Informatics, vol. 36, pp. 112-119, 2018.
[99] X. Wang and H. Liu, “A Knowledge- and Data-Driven Soft Sensor Based
on Deep Learning for Predicting the Deformation of an Air Preheater
Rotor,” in IEEE Access, vol. 7, pp. 159651-159660, 2019.
[100] W. Yan, D. Tang and Y. Lin, “A Data-Driven Soft Sensor Modeling
Method Based on Deep Learning and its Application,” in IEEE
Transactions on Industrial Electronics, vol. 64, no. 5, pp. 4237-4245,
May 2017, doi: 10.1109/TIE.2016.2622668.
[101] Y. Wu, D. Liu, X. Yuan and Y. Wang, “A just-in-time fine-tuning
framework for deep learning of SAE in adaptive data-driven modeling of
time-varying industrial processes,” IEEE Sensors Journal, doi:
10.1109/JSEN.2020.3025805.
[102] W. Fan, F. Si, S. Ren, et al. “Integration of continuous restricted
Boltzmann machine and SVR in NOx emissions prediction of a
tangential firing boiler,” Chemometrics and Intelligent Laboratory
Systems, vol. 195, 2019, doi: 10.1016/j.chemolab.2019.103870.
[103] P. Lian, H. Liu, X. Wang, et al. “Soft sensor based on DBN-IPSO-SVR
approach for rotor thermal deformation prediction of rotary
air-preheater,” Measurement, vol. 165, 2020, doi:
10.1016/j.measurement.2020.108109.
[104] R. Liu, Z. Rong, B. Jiang, Z. Pang and C. Tang, “Soft Sensor of 4-CBA
Concentration Using Deep Belief Networks with Continuous Restricted
Boltzmann Machine,” 2018 5th IEEE International Conference on Cloud
Computing and Intelligence Systems (CCIS), Nanjing, China, pp.
421-424, 2018, doi: 10.1109/CCIS.2018.8691166.
[105] J. Qiao, L. Wang, “Nonlinear system modeling and application based on
restricted Boltzmann machine and improved BP neural network,”
Applied Intelligence, 2020, doi: 10.1007/s10489-019-01614-1.
[106] M. Lu, Y. Kang, X. Han and G. Yan, “Soft sensor modeling of mill level
based on Deep Belief Network,” The 26th Chinese Control and Decision
Conference (2014 CCDC), Changsha, pp. 189-193, 2014, doi:
10.1109/CCDC.2014.6852142.
[107] X. Wang, W. Hu, K. Li, L. Song and L. Song, “Modeling of Soft Sensor
Based on DBN-ELM and Its Application in Measurement of Nutrient
Solution Composition for Soilless Culture,”2018 IEEE International
Conference of Safety Produce Informatization (IICSPI), Chongqing,
China, pp. 93-97, 2018, doi: 10.1109/IICSPI.2018.8690373.
[108] S. Zheng, K. Liu, Y. Xu, et al. “Robust soft sensor with deep kernel
learning for quality prediction in rubber mixing processes,” Sensors, vol.
20, no. 3, 2020, doi: 10.3390/s20030695.
[109] Y. Liu, C. Yang, Z. Gao, et al. “Ensemble deep kernel learning with
application to quality prediction in industrial polymerization processes,”
Chemometrics and Intelligent Laboratory Systems, vol. 174, pp. 15-21,
2018.
[110] C. Shang, F. Yang, D. Huang, et al. “Data-driven soft sensor
development based on deep learning technique,” Journal of Process
Control, vol. 24, no. 3, pp. 223-233, 2014.
[111] S. Graziani, and M. G. Xibilia, “Deep Learning for Soft Sensor Design,”
Development and Analysis of Deep Learning Architectures. Springer,
Cham, pp. 31-59, 2020.
[112] S. Graziani and M. G. Xibilia, “Design of a Soft Sensor for an Industrial
Plant with Unknown Delay by Using Deep Learning,” 2019 IEEE
International Instrumentation and Measurement Technology Conference
(I2MTC), Auckland, New Zealand, pp. 1-6, 2019, doi:
10.1109/I2MTC.2019.8827074.
[113] Y. Liu, Y. Fan, J. Chen, “Flame images for oxygen content prediction of
combustion systems using DBN,” Energy & Fuels, vol. 31, no. 8, pp.
8776-8783, 2017.
[114] C. H. Zhu, J. Zhang, “Developing Soft Sensors for Polymer Melt Index
in an Industrial Polymerization Process Using Deep Belief Networks,”
International Journal of Automation and Computing, vol. 17, no. 1, pp.
44-54, 2020.
[115] Z.C. Horn, et al. “Performance of convolutional neural networks for
feature extraction in froth flotation sensing,” IFAC-PapersOnLine, vol.
50, no. 2, pp. 13-18, 2017.
[116] X. Yuan, S. Qi, Y. Shardt, et al. “Soft sensor model for dynamic
processes based on multichannel convolutional neural network,”
Chemometrics and Intelligent Laboratory Systems, 2020: 104050.
[117] K. Wang, C. Shang, L. Liu, et al. “Dynamic soft sensor development
based on convolutional neural networks,” Industrial & Engineering
Chemistry Research, vol. 58, no. 26, pp. 11521-11531, 2019.
[118] W. Zhu, et al. “Deep learning based soft sensor and its application on a
pyrolysis reactor for compositions predictions of gas phase
components,” Computer Aided Chemical Engineering, Elsevier, vol. 44,
pp. 2245-2250, 2018.
[119] J. Wei, L. Guo, X. Xu and G. Yan, “Soft sensor modeling of mill level
based on convolutional neural network,” The 27th Chinese Control and
Decision Conference (2015 CCDC), Qingdao, pp. 4738-4743, 2015, doi:
10.1109/CCDC.2015.7162762.
[120] S. Sun, Y. He, S. Zhou, et al. “A data-driven response virtual sensor
technique with partial vibration measurements using convolutional
neural network,” Sensors, vol. 17, no. 12, 2017, doi:
10.3390/s17122888.
[121] H.B. Su, L.T. Fan, J.R. Schlup, “Monitoring the process of curing of
epoxy/graphite fiber composites with a recurrent neural network as a soft
sensor,” Engineering Applications of Artificial Intelligence, vol. 11, no.
2, pp. 293-306, 1998.
[122] C.A. Duchanoy, M.A. Moreno-Armendáriz, L. Urbina, et al. “A novel
recurrent neural network soft sensor via a differential evolution training
algorithm for the tire contact patch,” Neurocomputing, vol. 235, pp.
71-82, 2017.
[123] J. Loy-Benitez, S.K. Heo, C.K. Yoo, “Soft sensor validation for
monitoring and resilient control of sequential subway indoor air quality
through memory-gated recurrent neural networks-based autoencoders,”
Control Engineering Practice, vol. 97: 104330, 2020.
[124] X. Chen, F. Gao, G. Chen, “A soft-sensor development for
melt-flow-length measurement during injection mold filling,” Materials
Science and Engineering: A, vol. 384, no. 1-2, pp. 245-254, 2004.
[125] L.Z. Chen, S.K. Nguang, X.M. Li, et al. “Soft sensors for on-line
biomass measurements,” Bioprocess and Biosystems Engineering, vol.
26, no. 3, pp. 191-195, 2004.
[126] G. Kataria, K. Singh, “Recurrent neural network based soft sensor for
monitoring and controlling a reactive distillation column,” Chemical
Product and Process Modeling, vol. 13, no. 3, 2017, doi:
10.1515/cppm-2017-0044.
[127] W. Ke, D. Huang, F. Yang and Y. Jiang, “Soft sensor development and
applications based on LSTM in deep neural networks,” 2017 IEEE
Symposium Series on Computational Intelligence (SSCI), Honolulu, HI,
pp. 1-6, 2017, doi: 10.1109/SSCI.2017.8280954.
[128] X. Yuan, L. Li and Y. Wang, “Nonlinear Dynamic Soft Sensor Modeling
with Supervised Long Short-Term Memory Network,” in IEEE
Transactions on Industrial Informatics, vol. 16, no. 5, pp. 3168-3176,
May 2020, doi: 10.1109/TII.2019.2902129.
[129] I. Pisa, I. Santín, J.L. Vicario, et al. “ANN-based soft sensor to predict
effluent violations in wastewater treatment plants,” Sensors, vol. 19, no.
6, 2019: 1280.
[130] R. Xie, K. Hao, B. Huang, L. Chen and X. Cai, “Data-Driven Modeling
Based on Two-Stream λ Gated Recurrent Unit Network with Soft Sensor
Application,” in IEEE Transactions on Industrial Electronics, vol. 67, no.
8, pp. 7034-7043, Aug. 2020, doi: 10.1109/TIE.2019.2927197.
[131] S.R. V. Raghavan, T.K. Radhakrishnan, K. Srinivasan, “Soft sensor
based composition estimation and controller design for an ideal reactive
distillation column,” ISA transactions, vol. 50, no. 1, pp. 61-70, 2011.
[132] Y.L. He, Y. Tian, Y. Xu, et al. “Novel soft sensor development using echo
state network integrated with singular value decomposition: Application
to complex chemical processes,” Chemometrics and Intelligent
Laboratory Systems, vol. 200, 2020: 103981, doi:
10.1016/j.chemolab.2020.103981.
[133] X. Yin, Z. Niu, Z. He, et al. “Ensemble deep learning based
semi-supervised soft sensor modeling method and its application on
quality prediction for coal preparation process,” Advanced Engineering
Informatics, vol. 46, 2020: 101136.
[134] X. Zhang and Z. Ge, “Automatic Deep Extraction of Robust Dynamic
Features for Industrial Big Data Modeling and Soft Sensor Application,”
in IEEE Transactions on Industrial Informatics, vol. 16, no. 7, pp.
4456-4467, July 2020, doi: 10.1109/TII.2019.2945411.
[135] W. Yan, R. Xu, K. Wang, et al. “Soft Sensor Modeling Method Based on
Semisupervised Deep Learning and Its Application to Wastewater
Treatment Plant,” Industrial & Engineering Chemistry Research, vol. 59,
no. 10, pp.4589-4601, 2020.
[136] W. Zheng, Y. Liu, Z. Gao, et al. “Just-in-time semi-supervised soft sensor
for quality prediction in industrial rubber mixers,” Chemometrics and
Intelligent Laboratory Systems, vol.180, pp. 36-41, 2018.
[137] S. Graziani, M.G. Xibilia, “Deep structures for a reformer unit soft
sensor,” 2018 IEEE 16th International Conference on Industrial
Informatics (INDIN). IEEE, pp. 927-932, 2018.
[138] K. Wang, C. Shang, F. Yang, Y. Jiang and D. Huang, “Automatic
hyper-parameter tuning for soft sensor modeling based on dynamic deep
neural network,” 2017 IEEE International Conference on Systems, Man,
and Cybernetics (SMC), Banff, AB, pp. 989-994, 2017, doi:
10.1109/SMC.2017.8122739.
[139] Y. He, Y. Xu, and Q. Zhu, “Soft-sensing model development using
PLSR-based dynamic extreme learning machine with an enhanced
hidden layer,” Chemometrics and Intelligent Laboratory Systems, vol.
154, pp. 101-111, 2016.
[140] X. Wang, “Data Preprocessing for Soft Sensor Using Generative
Adversarial Networks,” 2018 15th International Conference on Control,
Automation, Robotics and Vision (ICARCV), Singapore, pp. 1355-1360,
2018, doi: 10.1109/ICARCV.2018.8581249.
[141] Y. Fan, B. Tao, Y. Zheng and S. Jang, “A Data-Driven Soft Sensor Based
on Multilayer Perceptron Neural Network with a Double LASSO
Approach,” in IEEE Transactions on Instrumentation and Measurement,
vol. 69, no. 7, pp. 3972-3979, July 2020, doi:
10.1109/TIM.2019.2947126.
[142] A. Rani, V. Singh, J.R.P. Gupta, “Development of soft sensor for neural
network based control of distillation column,” ISA transactions, vol. 52,
no. 3, pp. 438-449, 2013.
[143] A. Alexandridis, “Evolving RBF neural networks for adaptive
soft-sensor design,” International journal of neural systems, vol. 23, no.
6, 2013: 1350029.
[144] M.D. Zeiler, et al. “Modeling pigeon behavior using a Conditional
Restricted Boltzmann Machine.” ESANN, 2009.
[145] Y. Luo, et al. “Multivariate time series imputation with generative
adversarial networks,” Advances in Neural Information Processing
Systems, 2018.
[146] L. Jing and Y. Tian, “Self-supervised Visual Feature Learning with Deep
Neural Networks: A Survey,” in IEEE Transactions on Pattern Analysis
and Machine Intelligence, 2020, doi: 10.1109/TPAMI.2020.2992393.
[147] A. Oord, Y. Li, O. Vinyals, “Representation learning with contrastive
predictive coding,” arXiv preprint arXiv: 1807.03748, 2018.
[148] C. Finn, P. Abbeel, S. Levine, “Model-agnostic meta-learning for fast
adaptation of deep networks,” arXiv preprint arXiv:1703.03400, 2017.
[149] L. Maaten, G. Hinton, “Visualizing data using t-SNE,” Journal of
Machine Learning Research, vol. 9, pp. 2579-2605, Nov. 2008.
[150] M.D. Zeiler, R. Fergus, “Visualizing and understanding convolutional
networks,” European conference on computer vision. Springer, Cham,
pp. 818-833, 2014.
[151] S. Kabir, R. U. Islam, M. S. Hossain, et al. “An Integrated Approach of
Belief Rule Base and Deep Learning to Predict Air Pollution.” Sensors,
vol. 20, no. 7: 1956, 2020.
[152] Q. Jiang, S. Yan, H. Cheng and X. Yan, “Local-Global Modeling and
Distributed Computing Framework for Nonlinear Plant-Wide Process
Monitoring with Industrial Big Data,” IEEE Transactions on Neural
Networks and Learning Systems, doi: 10.1109/TNNLS.2020.2985223.
[153] Z. Yang, Z. Ge, “Monitoring and Prediction of Big Process Data with
Deep Latent Variable Models and Parallel Computing,” Journal of
Process Control, vol. 92, pp. 19-34, 2020.