User research found that Netflix members lose interest after evaluating 10-20 titles (one to two screens) over roughly 60-90 seconds. The recommender system's goal is therefore to surface something interesting within two screens.
Signals used include how each member watches (e.g., the device, time of day, day of week, intensity of watching).
Netflix uses several recommendation algorithms:
1)Personalized Video Ranker
orders the entire catalog of videos (or subsets selected by genre or other filtering) for each member profile in a personalized way.
Because we use PVR so widely, it must be good at general-purpose relative rankings throughout the entire catalog; this limits how personalized it can actually be.
In other words, PVR has to rank every video within a category, and do so across all categories, and this general-purpose requirement effectively limits personalization.
2) Top-N Video Ranker
find the best few personalized recommendations in the entire catalog for each member, that is, focusing only on the head of the ranking, a freedom that PVR does not have because it gets used to rank arbitrary subsets of the catalog
TVR only ranks the head of the catalog to pick out the top N, so it has more methodological freedom than PVR. Still, the two rankers share many of the same attributes.
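The head-only focus can be sketched with a partial sort. This assumes a precomputed per-member score per title; the titles and scores below are made up for illustration:

```python
import heapq

def top_n(scores, n=10):
    """Return the n highest-scoring titles for one member.

    `scores` maps title -> personalized score. Because only the head
    of the ranking matters, a partial sort (heapq.nlargest) suffices;
    there is no need to order the whole catalog as PVR must.
    """
    return heapq.nlargest(n, scores, key=scores.get)

member_scores = {"title_a": 0.9, "title_b": 0.4, "title_c": 0.7}
print(top_n(member_scores, 2))  # → ['title_a', 'title_c']
```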
3) Trending Now
used to drive the Trending Now row; it captures two kinds of trends especially well:
- seasonal trends, e.g., around Valentine's Day
- short-term real-time events, e.g., a hurricane
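A toy version of a short-term trend score, assuming we track recent plays and decay them by age; the half-life and the seasonal multiplier are invented for illustration, not Netflix's actual model:

```python
def trending_score(recent_plays, hours_ago, seasonal_boost=1.0, half_life=6.0):
    """Toy trending score: recent play counts decayed by their age in
    hours, scaled by a seasonal multiplier (e.g., boosted for romance
    titles near Valentine's Day). All weights are illustrative."""
    decay = 0.5 ** (hours_ago / half_life)
    return recent_plays * decay * seasonal_boost

print(trending_score(100, 6))        # one half-life old → 50.0
print(trending_score(100, 6, 1.5))   # same signal with a seasonal boost → 75.0
```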
4)Continue Watching
the continue watching ranker sorts the subset of recently viewed titles based on our best estimate of whether the member intends to resume watching or rewatch. The main features include:
- time elapsed since the last viewing
- where the member abandoned the title (beginning, middle, or end)
- the device used
- whether other related titles have been watched
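The features above could feed a model like this hand-weighted sketch; every weight and threshold here is invented for illustration, not Netflix's model:

```python
def resume_score(hours_since_view, abandon_point, same_device, watched_related):
    """Toy resume-intent estimate built from the features listed above.

    abandon_point is the fraction of the title watched when the member
    stopped (0.0 = beginning, 1.0 = end). Weights are illustrative.
    """
    score = 0.0
    score += 2.0 if hours_since_view < 24 else 0.5       # recent views rank higher
    score += 1.5 if 0.1 < abandon_point < 0.9 else 0.0   # stopped mid-title: likely to resume
    score += 0.5 if same_device else 0.0                 # same device as the last session
    score -= 1.0 if watched_related else 0.0             # may have moved on to related titles
    return score
```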
5)Video-Video Similarity
an unpersonalized algorithm that computes a ranked list of videos (the "similars") for every video in our catalog; the choice of which BYW (Because You Watched) rows make it onto a homepage is personalized.
6) Page Generation: Row Selection and Ranking
select and order rows from a large pool of candidates to create an ordering optimized for relevance and diversity. (How are relevance and diversity assessed? See the blog post "Learning a Personalized Homepage".)
7) Evidence
Evidence selection algorithms evaluate all the possible evidence items that we can display for every recommendation, to select the few that we think will be most helpful to the member viewing the recommendation. In other words, they choose which supporting reasons to display alongside a recommendation.
For example, they decide whether to show that a certain movie won an Oscar or instead show the member that the movie is similar to another video they recently watched.
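Given some upstream estimate of helpfulness, the selection step reduces to an argmax over candidate evidence items; the scores below are hypothetical:

```python
def best_evidence(candidates, n=1):
    """Pick the n evidence items predicted most helpful for this
    member/recommendation pair. `candidates` maps an evidence string
    to a hypothetical helpfulness score from an upstream model."""
    return sorted(candidates, key=candidates.get, reverse=True)[:n]

print(best_evidence({"won an Oscar": 0.2, "similar to a recently watched title": 0.6}))
# → ['similar to a recently watched title']
```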
8)Search
a) search recommends videos for a given query as alternative results for a failed search.
b) what we know about the searching member's taste is also especially important. Search combines several algorithms:
- one attempts to find the videos that match a given query
- another predicts interest in a concept given a partial query
- a third finds video recommendations for a given concept
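The partial-query step can be approximated by prefix matching against known concepts ranked by popularity; this is a stand-in of mine for the learned model the paper alludes to, and all data below is made up:

```python
def complete_concept(partial, concept_popularity):
    """Guess which concept a partial query points at: take the most
    popular known concept sharing the typed prefix (None if no match)."""
    matches = [c for c in concept_popularity if c.startswith(partial.lower())]
    return max(matches, key=concept_popularity.get, default=None)

concepts = {"french movies": 120, "frightening": 30, "drama": 500}
print(complete_concept("fr", concepts))  # → 'french movies'
```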
Business Value
The effective catalog size (ECS) is a metric that describes how spread viewing is across the items in our catalog. It tells us how many videos are required to account for a typical hour streamed.
ECS is computed as:

ECS(p) = 2 * (p_1·1 + p_2·2 + … + p_N·N) − 1

where p_i is the share of total streamed hours from the i-th most-watched video. Note that p_i ≥ p_{i+1} for i = 1, …, N−1, and the p_i sum to 1.
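The formula is easy to sanity-check in code: viewing concentrated on one title gives ECS = 1, and viewing spread uniformly over N titles gives ECS = N (assuming the paper's definition ECS = 2·Σ p_i·i − 1):

```python
def effective_catalog_size(hours):
    """ECS from per-title streamed hours: ECS = 2 * sum_i(p_i * i) - 1,
    where p_i is the viewing share of the i-th most-watched title
    (sorted so p_1 >= p_2 >= ..., and the shares sum to 1)."""
    total = sum(hours)
    p = sorted((h / total for h in hours), reverse=True)
    return 2 * sum(i * pi for i, pi in enumerate(p, start=1)) - 1

print(effective_catalog_size([100]))             # all viewing on one title → 1.0
print(effective_catalog_size([10, 10, 10, 10]))  # uniform over 4 titles → 4.0
```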
Evaluation Metrics
Intuition does not always match online results: for "House of Cards", related recommendations that looked more similar actually performed worse than broader results.
we have observed that improving engagement—the time that our members spend viewing Netflix content—is strongly correlated with improving retention.
Statistical significance depends heavily on the number of members per test cell. For example, if we find that 50% of the members in the test have retained when we compute our retention metric, then we need roughly 2 million members per cell to measure a retention delta of 0.1% (50.05% vs. 49.95%) with statistical confidence. This type of plot can be used as a guide to choose the sample size for the cells in a test: for example, detecting a retention delta of 0.2% requires the sample size traced by the black line labeled 0.2%, which changes as a function of the average retention rate when the experiment stops, peaking (just south of 500k members per cell) when the retention rate is 50%.
[figure: required members per cell vs. average retention rate, one curve per detectable retention delta]
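The 2-million-per-cell figure matches a standard two-proportion sample-size approximation. A sketch, with the assumption (mine, not stated in the text) that only significance is required, with no power term:

```python
from math import ceil
from statistics import NormalDist

def members_per_cell(p, delta, alpha=0.05):
    """Approximate members needed per cell to detect a retention
    difference `delta` around base rate `p` with a two-sided,
    two-proportion z-test at significance level `alpha`:
    n = z^2 * 2 * p * (1 - p) / delta^2 (no power term)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(z * z * 2 * p * (1 - p) / delta ** 2)

print(members_per_cell(0.5, 0.001))  # ≈ 1.9 million, close to the 2M quoted
print(members_per_cell(0.5, 0.002))  # just under 500k, as in the plot
```

Note how the required size peaks at p = 0.5, since p(1 − p) is maximized there, matching the shape of the curves described above.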
Offline testing speeds up iteration: offline experiments allow us to iterate quickly on algorithm prototypes, and to prune the candidate variants that we use in actual A/B experiments.
Key Open Problems
1)Better Experimentation Protocols
We still need better offline and online evaluation metrics that capture overall benefit, e.g., trading off long-term against short-term gains.
2)Global Algorithms
3)Controlling for Presentation Bias
introduce randomness into the recommendations
4)Page Construction
It took us a couple of years to find a fully personalized algorithm to construct a page of recommendations that A/B tested better than a page based on a template (itself optimized through years of A/B testing)
5)Member Coldstarting
Today, our member coldstart approach has evolved into a survey given during the sign-up process, during which we ask new members to select videos from an algorithmically populated set that we use as input into all of our algorithms.
6)Choosing the Best Evidence to Support Each Recommendation
highlight different aspects of a video, such as an actor or director involved in it.
Further Reading
Learning a Personalized Homepage
We want our recommendations to be accurate in that they are relevant to the tastes of our members, but they also need to be diverse so that we can address the spectrum of a member’s interests versus only focusing on one. We want to be able to highlight the depth in the catalog we have in those interests and also the breadth we have across other areas to help our members explore and even find new interests. We want our recommendations to be fresh and responsive to the actions a member takes, such as watching a show, adding to their list, or rating; but we also want some stability so that people are familiar with their homepage and can easily find videos they’ve been recommended in the recent past.
The homepage is a two-dimensional grid of rows: the horizontal dimension naturally serves relevance, the vertical dimension naturally serves diversity.
Factors we consider important:
- the quality of the videos in the row
- the amount of diversity on the page
- the affinity of members for specific kinds of rows
- the quality of the evidence we can surface for each video
A simple way to add diversity is to switch from a row-ranking approach to a stage-wise approach, using a scoring function that considers both a row and its relationship to the previous rows and the previous videos already chosen for the page. Other approaches that greedily add diversity based on submodular function maximization can also be used.
Diversity can also be incorporated directly into the scoring model by comparing a row's features to the rest of the page: how similar the row is to the other rows, or how similar its videos are to the videos elsewhere on the page.
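The stage-wise approach can be sketched as a greedy loop: at each step, pick the candidate row maximizing relevance minus an overlap penalty against videos already placed. The row names, scores, and penalty form below are all invented for illustration:

```python
def build_page(rows, n_rows=3, diversity_weight=0.5):
    """Greedy stage-wise page construction.

    `rows` maps row name -> (relevance score, set of videos in the row).
    Each step picks the candidate maximizing relevance minus a penalty
    proportional to its overlap with videos already on the page.
    """
    page, seen = [], set()
    pool = dict(rows)
    while pool and len(page) < n_rows:
        def gain(name):
            relevance, videos = pool[name]
            overlap = len(videos & seen) / len(videos) if videos else 0.0
            return relevance - diversity_weight * overlap
        best = max(pool, key=gain)
        page.append(best)
        seen |= pool.pop(best)[1]
    return page

rows = {
    "Popular": (0.9, {"a", "b"}),
    "Because You Watched X": (0.8, {"a", "b"}),  # fully redundant with Popular
    "Comedies": (0.6, {"c", "d"}),
}
print(build_page(rows, n_rows=2))  # → ['Popular', 'Comedies']
```

Even though "Because You Watched X" has the second-highest relevance, its complete overlap with the first row lets the more diverse "Comedies" row win the second slot.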