The correct way to set up user-item interaction data in lightfm

Question

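What is the correct way to set up the input to a lightfm model when I have additional implicit data for items? For example, I have interaction data for 100k users x 200 items, but in the actual application I want the model to recommend only 50 of those 200 items. How should I set up the data? I have thought of two cases, but I am not sure which one is correct.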

Case 1: Feed the entire matrix (100k users x 200 items) directly into lightfm as the interactions argument. This is better suited to collaborative learning.

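Case 2: Feed only the (100k users x 50 items) matrix to the interactions argument, and use the (100k users x 150 items) matrix as user_features. This approach is closer to content-based learning.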

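Which approach is correct? Also, for Case 1, is there a way to make the model evaluation utility functions (precision, recall, etc.) consider only the selected items? For example, draw the top-k recommendations from the 50 items only, never recommend the others, and compute precision, recall, etc. from those.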

Answer 1

You should go with Case 1. When making predictions, you can pass the indices of the desired (50) items as an argument to model.predict.
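To make that concrete, here is a minimal sketch of picking the top-k items from a restricted candidate set. The helper name `recommend_from_subset` and its parameters are my own, not part of lightfm; it only assumes that `model` is a LightFM model already fitted on the full interaction matrix (or any object with a compatible `predict(user_ids, item_ids)` method).

```python
import numpy as np


def recommend_from_subset(model, user_id, allowed_item_ids, k=10):
    """Return the k highest-scoring items for one user, chosen only from allowed_item_ids.

    Hypothetical helper: `model` is assumed to be fitted on the full
    interaction matrix (Case 1) and to expose predict(user_ids, item_ids).
    """
    # The predict docstring asks for item ids as an np.int32 array.
    allowed = np.asarray(allowed_item_ids, dtype=np.int32)
    # A single integer user id is scored against every item id passed in.
    scores = model.predict(user_id, allowed)
    # Sort candidate positions by descending score and keep the first k.
    top_positions = np.argsort(-scores)[:k]
    return allowed[top_positions]
```

With a fitted model you would call something like `recommend_from_subset(model, user_id=0, allowed_item_ids=np.arange(50), k=5)`, where `np.arange(50)` stands in for whatever the internal indices of your 50 recommendable items actually are. Items outside that list are never scored, so they can never be recommended.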

From the lightfm documentation, you can see that model.predict takes item ids as an argument (in this case, the ids of the 50 items).

https://making.lyst.com/lightfm/docs/_modules/lightfm/lightfm.html#LightFM.predict

def predict(self, user_ids, item_ids, item_features=None, user_features=None, num_threads=1):
    """
    Compute the recommendation score for user-item pairs.

    Arguments
    ---------

    user_ids: integer or np.int32 array of shape [n_pairs,]
         single user id or an array containing the user ids for the
         user-item pairs for which a prediction is to be computed
    item_ids: np.int32 array of shape [n_pairs,]
         an array containing the item ids for the user-item pairs for which
         a prediction is to be computed
    user_features: np.float32 csr_matrix of shape [n_users, n_user_features], optional
         Each row contains that user's weights over features
    item_features: np.float32 csr_matrix of shape [n_items, n_item_features], optional
         Each row contains that item's weights over features
    num_threads: int, optional
         Number of parallel computation threads to use. Should
         not be higher than the number of physical cores.
    """
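For the evaluation part of the question, one option is to recompute the metric by hand over the allowed items only. The sketch below is an assumption on my part, not a lightfm utility: `precision_at_k_subset` is a hypothetical helper that takes a dense score matrix (for example, assembled from model.predict calls) and a dense held-out interaction matrix, and it skips users who have no held-out positives among the allowed items.

```python
import numpy as np


def precision_at_k_subset(scores, test_interactions, allowed_item_ids, k=10):
    """Mean precision@k when only items in allowed_item_ids may be recommended.

    scores: array [n_users, n_items] of predicted scores (hypothetical input).
    test_interactions: array [n_users, n_items], nonzero where a held-out
        interaction exists.
    """
    allowed = np.asarray(allowed_item_ids)
    # Keep only the columns of the items that may be recommended.
    sub_scores = np.asarray(scores)[:, allowed]
    sub_hits = np.asarray(test_interactions)[:, allowed] > 0
    # Column positions of the k best allowed items for every user.
    top_k = np.argsort(-sub_scores, axis=1)[:, :k]
    hits_in_top_k = np.take_along_axis(sub_hits, top_k, axis=1)
    # Skip users with no held-out positives among the allowed items.
    has_positive = sub_hits.any(axis=1)
    return hits_in_top_k[has_positive].sum(axis=1).mean() / k
```

If your test interactions are stored as a scipy sparse matrix, convert them with `.toarray()` first. Recall can be computed the same way by dividing each user's hit count by that user's number of held-out positives among the allowed items instead of by k.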
