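Graph embedding is a method for generating unsupervised node features from a graph; the resulting features can be used in a wide range of machine learning tasks. Modern graphs, especially in industrial applications, often contain billions of nodes and trillions of edges, which exceeds the capacity of existing embedding systems. Facebook has open-sourced an embedding system, PyTorch-BigGraph (PBG), which makes several modifications to traditional multi-relation embedding systems so that it scales to graphs with billions of nodes and trillions of edges.
This series is a translation of the official PyTorch-BigGraph manual, intended to help readers get started quickly with GNNs and their use. The series has fifteen parts; if you spot an error in the text, please get in touch.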
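(1) Facebook's open-source graph neural network: PyTorch-BigGraph
(2) Facebook: BigGraph Chinese documentation - Data model (PyTorch)
(3) Facebook: BigGraph Chinese documentation - From entity embeddings to edge scores (PyTorch)
(4) Facebook: BigGraph Chinese documentation - I/O format (PyTorch)
(5) Facebook: BigGraph Chinese documentation - Batch preparation
(6) Facebook: BigGraph Chinese documentation - Distributed mode (PyTorch)
(7) Facebook: BigGraph Chinese documentation - Loss calculation (PyTorch)
(8) Facebook: BigGraph Chinese documentation - Evaluation (PyTorch)
(9) Facebook: BigGraph Chinese documentation - Dynamic relations (PyTorch)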
Dynamic relations
Caution
This is an advanced topic!
Enabling the dynamic_relations flag in the configuration activates an alternative mode intended for graphs with a large number of relations (more than roughly 100). In dynamic relation mode, PBG runs with several modifications to its "standard" operation in order to support the large number of relations.
The differences are:
The number of relations isn't provided in the config but is instead found in the input data, namely in the entity path, inside a dynamic_rel_count.txt file. The settings of the relations, however, are still provided in the config file. This is done by providing a single relation config that acts as a "template" for all the others and is duplicated the appropriate number of times. One can think of this as the one relation in the config being "broadcast" to the size of the relation list found in the dynamic_rel_count.txt file.
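As a rough illustration of the template idea, the sketch below pairs a config containing a single relation entry with a helper that duplicates that entry according to the count read from dynamic_rel_count.txt. The paths, the relation name, and the broadcast_relation_template helper are hypothetical, and the sketch assumes the file holds a single integer; PBG does the equivalent work internally and its actual code may differ.

```python
import copy
from pathlib import Path


def get_torchbiggraph_config():
    # Illustrative values only: every path and name here is a placeholder.
    return dict(
        entity_path="data/example",
        edge_paths=["data/example/edges"],
        checkpoint_path="model/example",
        entities={"all": {"num_partitions": 1}},
        # Exactly one relation entry: the template shared by all relation types.
        relations=[
            {"name": "all_edges", "lhs": "all", "rhs": "all", "operator": "translation"}
        ],
        dynamic_relations=True,
        dimension=100,
    )


def broadcast_relation_template(config, entity_path):
    """Duplicate the single relation template once per relation type.

    Hypothetical helper; assumes dynamic_rel_count.txt contains one integer.
    """
    count_file = Path(entity_path) / "dynamic_rel_count.txt"
    count = int(count_file.read_text().strip())
    if len(config["relations"]) != 1:
        raise ValueError("dynamic relations expect exactly one relation template")
    template = config["relations"][0]
    # Deep copies keep the duplicated entries independent of one another.
    return [copy.deepcopy(template) for _ in range(count)]
```

Every expanded entry shares the template's settings (entity types and operator type); only the learned parameters can differ between relation types.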
The batches of positive edges that are passed from the training loop into the model contain edges of multiple relation types at the same time (instead of each batch coming entirely from a single relation type). This introduces some performance challenges in how the operators are applied to the embeddings: instead of a single operator with a single set of parameters applied to all edges, there might be a different one for each edge. The previous point (a single relation template) ensures that all the operators are of the same type, so only their parameters can differ from one row to another. To account for this, the operators for dynamic relations are implemented differently, with a single operator object containing the parameters for all relation types. This implementation detail should be transparent as far as applying the operators to the embeddings is concerned, but it might come up when retrieving the parameters at the end of training.
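To make the "one operator object, parameters for all relation types" idea concrete, here is a minimal pure-Python sketch of a translation-style operator that looks up a different parameter row for each edge in a mixed batch. The class name PerRelationTranslation and its methods are hypothetical and are not PBG's API; a real implementation would operate on tensors rather than lists.

```python
from typing import List, Sequence


class PerRelationTranslation:
    """Hypothetical operator storing one translation vector per relation type."""

    def __init__(self, num_relations: int, dim: int) -> None:
        self.dim = dim
        # Row r holds the parameters of relation type r.
        self.translations: List[List[float]] = [
            [0.0] * dim for _ in range(num_relations)
        ]

    def apply(
        self, embeddings: Sequence[Sequence[float]], rel_ids: Sequence[int]
    ) -> List[List[float]]:
        """Transform row i using the parameters of relation rel_ids[i]."""
        if len(embeddings) != len(rel_ids):
            raise ValueError("need exactly one relation id per embedding row")
        return [
            [e + t for e, t in zip(row, self.translations[r])]
            for row, r in zip(embeddings, rel_ids)
        ]


op = PerRelationTranslation(num_relations=3, dim=2)
op.translations[0] = [1.0, 0.0]
op.translations[2] = [0.0, -1.0]

# A mixed batch: rows 0 and 2 belong to relation 0, row 1 to relation 2.
batch = [[0.5, 0.5], [0.5, 0.5], [2.0, 2.0]]
transformed = op.apply(batch, rel_ids=[0, 2, 0])
```

Because the operator type is the same for every row, the only per-edge work is selecting the right parameter row, which is also why the trained parameters end up stored together in one object.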
With non-dynamic relations, the operator is applied to the embedding of the right-hand side entity of the edge, whereas the embedding of the left-hand side entity is left unchanged. In a given batch, denote the $i$-th positive edge by $(x_i, r, y_i)$ ($x_i$ and $y_i$ being the left- and right-hand side entities, $r$ being the relation type). For each of the positive edges, denote its $j$-th negative sample by $(x_i, r, y'_{i,j})$. Due to same-batch negative sampling, it may occur that the same right-hand side entity is used as a negative for several positives, that is, that $y'_{i_1,j_1} = y'_{i_2,j_2}$ for $i_1 \neq i_2$. However, since all negatives share the same relation type $r$, all the right-hand side entities are transformed in the same way (i.e., passed through $r$'s operator) no matter which positive edge they are a negative for. The operator of $r$ has to be applied to all of them, hence the total number of operator evaluations is equal to the number of positives and negatives.
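The counting argument for the non-dynamic case can be sketched as follows: each right-hand side embedding (positive or negative) is passed through the operator once, and the transformed negatives are then reused for every positive. The function name, the shared negative pool, the dot-product comparator, and the toy scaling operator are all assumptions made for illustration, not PBG's implementation.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def score_shared_relation(
    lhs: Sequence[Vector],
    rhs_pos: Sequence[Vector],
    rhs_neg: Sequence[Vector],
    operator: Callable[[Vector], Vector],
) -> Tuple[List[float], List[List[float]], int]:
    """Score positives and negatives when every edge has the same relation.

    Returns (positive scores, negative scores per positive, operator calls).
    """
    calls = 0

    def counted(v: Vector) -> Vector:
        nonlocal calls
        calls += 1
        return operator(v)

    # One operator evaluation per right-hand side entity.
    pos_t = [counted(y) for y in rhs_pos]
    neg_t = [counted(y) for y in rhs_neg]

    pos_scores = [dot(x, y) for x, y in zip(lhs, pos_t)]
    # Each transformed negative is reused for every positive edge.
    neg_scores = [[dot(x, y) for y in neg_t] for x in lhs]
    return pos_scores, neg_scores, calls


def scale_by_two(v: Vector) -> Vector:
    # Toy stand-in for the relation's operator.
    return [2.0 * c for c in v]


pos_scores, neg_scores, calls = score_shared_relation(
    lhs=[[1.0, 0.0], [0.0, 1.0]],
    rhs_pos=[[1.0, 1.0], [2.0, 3.0]],
    rhs_neg=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    operator=scale_by_two,
)
```

With 2 positives and 3 pooled negatives the operator runs 5 times, even though 6 positive-negative pairs are scored.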