These are my study notes on collaborative filtering recommender systems.
- Formula
[Figure: formula]
- Logic diagram
[Figures: two logic diagrams]
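- Understanding the principle
- The users' ratings of items are factorized into two sets of features: how interested each user is in each item type, and how strongly each item scores on each item type. For example, suppose movies are divided into an action type and a romance type. A given movie might score 9 on action and 1 on romance. Likewise, a given user might score 1 for action movies and 9 for romance movies. I think of these as item-to-item-type features and user-to-item-type features.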
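To make the action/romance example above concrete, here is a minimal sketch in plain Python. It assumes a user's affinity for a movie is the dot product of the two feature vectors, which is the usual matrix factorization setup. The second movie, the second user, and the names `affinity` and `rank_items` are hypothetical additions for illustration, and the numbers are hand-picked rather than learned from ratings.

```python
# Hypothetical type features in the order [action, romance].
# Values mirror the example in the note; they are not learned.
movie_features = {
    "action_movie": [9, 1],
    "romance_movie": [1, 9],  # assumed second movie, for contrast
}
user_features = {
    "romance_fan": [1, 9],
    "action_fan": [9, 1],  # assumed second user, for contrast
}


def affinity(user_vec, item_vec):
    """Dot product of a user's and an item's type features."""
    return sum(u * i for u, i in zip(user_vec, item_vec))


def rank_items(user_vec, items):
    """Return item names sorted from highest to lowest affinity."""
    return sorted(
        items,
        key=lambda name: affinity(user_vec, items[name]),
        reverse=True,
    )


for user, vec in user_features.items():
    print(user, rank_items(vec, movie_features))
```

In a real model these feature vectors are not written by hand. They are learned so that the dot products agree with the observed ratings as closely as possible.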
- Using LightFM
- Using LightFM is fairly simple: give it the users' movie rating data, and LightFM computes each user's score for every item.
- The following code is pasted from the LightFM website
from lightfm import LightFM
from lightfm.datasets import fetch_movielens
from lightfm.evaluation import precision_at_k
import numpy as np

# Load the MovieLens 100k dataset. Only five
# star ratings are treated as positive.
data = fetch_movielens(data_home='./data', min_rating=5.0)
print(data['train'])

# Instantiate and train the model
model = LightFM(loss='warp')
model.fit(data['train'], epochs=30, num_threads=2)

# Evaluate the trained model
train_precision = precision_at_k(model, data['train'], k=5).mean()
test_precision = precision_at_k(model, data['test'], k=5).mean()
print("Train precision: %.2f" % train_precision)
print("Test precision: %.2f" % test_precision)


def sample_recommendation(model, data, user_ids):
    n_users, n_items = data['train'].shape
    # Convert once to CSR so that rows (users) can be indexed.
    train = data['train'].tocsr()

    for user_id in user_ids:
        # Column indices of this user's row are the items rated positively.
        known_positives = data['item_labels'][train[user_id].indices]

        # Debug output: inspect the sparse structures used above.
        print(train)
        print(train[user_id])
        print(train[user_id].indices)

        # Score every item for this user, then sort from highest to lowest.
        scores = model.predict(user_id, np.arange(n_items))
        top_items = data['item_labels'][np.argsort(-scores)]

        print("User %s" % user_id)
        print("     Known positives:")
        for x in known_positives[:3]:
            print("        %s" % x)

        print("     Recommended:")
        for x in top_items[:3]:
            print("        %s" % x)


sample_recommendation(model, data, [3, 25, 450])
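The script above relies on two ideas: ranking items by descending score (`np.argsort(-scores)`) and precision at k, which is the fraction of the first k ranked items that are known positives. The sketch below shows both for a single user in plain Python. It illustrates the metric and is not LightFM's implementation. The scores, the positive set, and the function name `manual_precision_at_k` are made up for this example.

```python
def manual_precision_at_k(ranked_items, positives, k):
    """Fraction of the first k ranked items that are known positives."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in positives)
    return hits / k


# Hypothetical scores for five items (list index = item id); not model output.
scores = [0.2, 1.5, -0.3, 0.9, 0.1]

# Sort item ids by descending score, the same idea as np.argsort(-scores).
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Hypothetical set of items this user is known to like.
positives = {1, 4}

print(ranked)
print(manual_precision_at_k(ranked, positives, k=3))
```

Here only item 1 of the top three ranked items is a known positive, so the precision at 3 is one third. The script above averages this per-user value over all users with `.mean()`.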
- GitHub repository with the corresponding source code and the required data
https://github.com/wengmingdong/tf2-stu/tree/master/recommender