Created: Jul 4, 2020 3:07 PM
Tags: KDD'19, attention mechanism, knowledge graph
1 Target
Most existing works are unaware of the relationships between these entities and users.
We investigate how to explore these relationships, which are essentially determined by the interactions among entities.
2 Model
Interactions among entities are divided into two types: inter-entity-interaction and intra-entity-interaction.
The paper proposes a novel model named Attention-enhanced Knowledge-aware User Preference Model (AKUPM).
AKUPM uses a self-attention network to capture the inter-entity-interaction by learning the appropriate importance of each entity w.r.t. the user.
The intra-entity-interaction is modeled by projecting each entity into its connected relation spaces to obtain suitable characteristics.
- click-through rate (CTR) prediction
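The two mechanisms above can be illustrated with a minimal sketch. It is not the paper's implementation: the projection matrices, the 2-dimensional vectors, and the use of the user vector as the attention query are all assumptions made for illustration, and the function names `project_entity` and `attend` are hypothetical.

```python
import math

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def project_entity(entity_vec, relation_matrix):
    """Intra-entity-interaction (assumed form): view one entity in one relation's space."""
    return matvec(relation_matrix, entity_vec)

def attend(user_vec, entity_vecs):
    """Inter-entity-interaction (simplified): weight each entity by its relevance to the user."""
    scale = math.sqrt(len(user_vec))
    weights = softmax([dot(user_vec, e) / scale for e in entity_vecs])
    dim = len(entity_vecs[0])
    pooled = [sum(w * e[i] for w, e in zip(weights, entity_vecs)) for i in range(dim)]
    return weights, pooled

# Hypothetical toy values: the same entity looks different in two relation spaces.
chaplin = [0.5, 2.0]
director_space = [[1, 0], [0, 0]]
actor_space = [[0, 0], [0, 1]]
as_director = project_entity(chaplin, director_space)
as_actor = project_entity(chaplin, actor_space)

user = [1.0, 0.0]
weights, pooled = attend(user, [as_director, as_actor])
```

With these toy values, the user vector aligns with the "director" view of the entity, so that view receives the larger attention weight.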
3 Contributions
- We propose a knowledge-aware model to alleviate the sparsity problem in recommendation systems based on a novel adaptive self-attention modeling design.
- This is the first work to fully explore the relationships between users and incorporated entities by categorizing the interactions of entities into two types: intra-entity-interaction and inter-entity-interaction.
- AKUPM is able to identify the most relevant part of the incorporated entities for each user, so as to make better CTR predictions. Experiments use 2 real-world public datasets.
INTRODUCTION
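Collaborative-filtering methods have a long history, but they perform poorly when the interaction matrix is sparse, so auxiliary side information has to be added. In recent years, knowledge graphs have stood out as such a source.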
knowledge graph
A knowledge graph is a type of multi-relational directed graph composed of a large number of entities and relations. More specifically, each edge in the knowledge graph is represented as a triple of the form (head entity, relation, tail entity), also called a fact, indicating that the head entity and the tail entity are connected through the relation.
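As a small illustration of the triple form, the sketch below stores facts as (head, relation, tail) records and indexes them by head entity. The relation names `directed_by` and `starring` are hypothetical labels chosen for this example.

```python
from collections import defaultdict
from typing import NamedTuple

class Triple(NamedTuple):
    head: str
    relation: str
    tail: str

# Two facts in (head entity, relation, tail entity) form; relation names are illustrative.
facts = [
    Triple("City Lights", "directed_by", "Charles Chaplin"),
    Triple("Modern Times", "starring", "Charles Chaplin"),
]

def build_adjacency(triples):
    """Index the directed, multi-relational graph by head entity."""
    adjacency = defaultdict(list)
    for t in triples:
        adjacency[t.head].append((t.relation, t.tail))
    return adjacency

graph = build_adjacency(facts)
```

Because edges are directed and labeled, the same tail entity can be reached from different heads through different relations.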
In the figure, users are on the far left and entities are on the far right.
- Inter-entity-interaction: the importance of an entity varies considerably depending on the entity set it is included in, because of the interactions among entities. As shown in Figure 1, all movies incorporated for Bob are from the USA, whereas the movies incorporated for Steph are from various countries. In this case, USA is highly important for identifying Bob's interests but much less so for identifying Steph's.
- Intra-entity-interaction: for a given user, an entity may show different characteristics when involved in different relations. For example, in Figure 1, Alice may like City Lights because Charles Chaplin directed it, whereas she may like Modern Times because Charles Chaplin plays the leading role in it.
CTR prediction and recommendation systems
A CTR prediction model can be used in a recommendation system because the recommender uses the CTR values the model produces as the basis for ranking.
/
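CTR prediction is only one stage of a recommendation system.
A typical recommendation system consists of recall (candidate retrieval), fine ranking (CTR prediction), and rerank (mechanism and strategy rules). Recall and fine ranking score different item sets: recall targets all items, whereas fine ranking targets only the items output by recall. Recall therefore usually builds its training samples over the full item set, while fine ranking usually builds them from impression (displayed) samples, which are themselves biased. Even if the fine-ranking model were able to score every item, it is trained only on impression samples, so its predictions for items that were never displayed will be biased and may be problematic. For now, the system therefore depends on recall, through its various recall strategies, to filter the items. Once enough compute is available, will this problem still be unsolvable? I think not necessarily. It is just that, while the compute is not there yet, people's effort is focused elsewhere for the time being.
In addition, the ranking objectives of a recommendation system are usually diverse. Taking Alibaba's e-commerce recommendation as an example, one main ranking objective is GMV (ctr × cvr × price), while user experience and metrics such as diversity also have to be taken into account. A CTR prediction model alone cannot achieve all of these.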
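The GMV objective mentioned above can be sketched as a scoring function. The candidate values below are invented for illustration, and the names `Candidate`, `expected_gmv`, and `rank_by_gmv` are hypothetical; a production ranker would also fold in the user-experience and diversity terms the note mentions.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    item_id: str
    ctr: float    # predicted click-through rate
    cvr: float    # predicted conversion rate after a click
    price: float  # item price

def expected_gmv(c):
    """Score from the note: ctr x cvr x price."""
    return c.ctr * c.cvr * c.price

def rank_by_gmv(candidates):
    return sorted(candidates, key=expected_gmv, reverse=True)

def rank_by_ctr(candidates):
    return sorted(candidates, key=lambda c: c.ctr, reverse=True)

# Made-up candidates for illustration only.
items = [
    Candidate("a", ctr=0.10, cvr=0.05, price=20.0),
    Candidate("b", ctr=0.02, cvr=0.10, price=500.0),
    Candidate("c", ctr=0.30, cvr=0.01, price=10.0),
]

gmv_order = [c.item_id for c in rank_by_gmv(items)]
ctr_order = [c.item_id for c in rank_by_ctr(items)]
```

In this toy data the two orderings are exact opposites, which is the point of the paragraph: ranking by CTR alone does not optimize a GMV-style objective.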
/
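A common industrial recommendation system generally contains two main parts: recall and ranking. Recall exists because, in industry, the content to be recommended is often on the scale of millions, tens of millions, or even hundreds of millions of items; fetching features and running ranking predictions on all of them without a recall pre-filter would take an unrealistic amount of time. Recall methods fall roughly into a few classes: one large class is recall based on user interests, including long-term interests, real-time interests, and so on; a second class is collaborative recall, such as collaboration based on the user's session chain or on the user's social relationships applied to content; and a third class is similarity recall over embeddings learned by neural networks. Ranking is the model-based sorting stage that merges the data from all recall channels and orders it by one or more specific objectives. The most common objective is predicted click-through rate, and others include dwell time, likes, comments, and so on. Commonly used models include LR, GBDT, DNN, and various improved deep learning models, as well as multi-objective models such as Alibaba's ESMM.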
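The recall-then-ranking flow described above can be sketched as follows. This is a toy illustration under stated assumptions: the catalog, the user record, the co-click table, and the stored `ctr` field standing in for a model's prediction are all hypothetical.

```python
def recall_by_interest(user, catalog, limit):
    """Recall channel 1: items whose tag matches one of the user's interests."""
    hits = [item for item in catalog if item["tag"] in user["interests"]]
    return hits[:limit]

def recall_by_co_click(user, catalog, co_clicked, limit):
    """Recall channel 2: items co-clicked with the user's recent clicks."""
    wanted = set()
    for clicked in user["recent_clicks"]:
        wanted.update(co_clicked.get(clicked, []))
    return [item for item in catalog if item["id"] in wanted][:limit]

def merge_candidates(*channels):
    """Merge recall channels, dropping duplicates while keeping first-seen order."""
    seen, merged = set(), []
    for channel in channels:
        for item in channel:
            if item["id"] not in seen:
                seen.add(item["id"])
                merged.append(item)
    return merged

def rank(candidates, predict_ctr, top_k):
    """Ranking stage: sort only the recalled candidates by predicted CTR."""
    return sorted(candidates, key=predict_ctr, reverse=True)[:top_k]

# Hypothetical data; "ctr" stands in for the output of a trained CTR model.
catalog = [
    {"id": 1, "tag": "comedy", "ctr": 0.12},
    {"id": 2, "tag": "drama", "ctr": 0.30},
    {"id": 3, "tag": "comedy", "ctr": 0.08},
    {"id": 4, "tag": "sci-fi", "ctr": 0.25},
    {"id": 5, "tag": "drama", "ctr": 0.40},
]
user = {"interests": {"comedy"}, "recent_clicks": [1]}
co_clicked = {1: [4, 3]}

candidates = merge_candidates(
    recall_by_interest(user, catalog, limit=10),
    recall_by_co_click(user, catalog, co_clicked, limit=10),
)
top = rank(candidates, predict_ctr=lambda item: item["ctr"], top_k=2)
top_ids = [item["id"] for item in top]
```

Item 5 has the highest stored CTR but is never recalled, so the ranker never sees it; the ranking stage can only order what recall passes on.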