Adding data from pandas to a list


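Python newbie here. I want to extract values from a DataFrame into a list, but I am getting extra information that I don't need. Is there a better way to do this?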

rating1 = []
rating2 = []
for value in person1["Movie"]:
    for value2 in person2["Movie"]:
        if value == value2:
            rating1.append(person1[person1["Movie"] == value]["Rating"])
            rating2.append(person2[person2["Movie"] == value2]["Rating"])

When I print rating1, I get this:

print(rating1)
[0    2.5
Name: Rating, dtype: float64, 1    3.5
Name: Rating, dtype: float64, 2    2.5
Name: Rating, dtype: float64, 5    3.0
Name: Rating, dtype: float64, 22    3.5
Name: Rating, dtype: float64, 23    3.0
Name: Rating, dtype: float64]

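My goal is to extract just the ratings, without the index and other metadata, so I can compute the Manhattan and Euclidean distances. Something like this: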

[2.5, 3.5, 2.5, 3.0, 3.5, 3.0]
python-3.x pandas list
1 Answer

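I found the answer to my own question, so here it is for future reference. Selecting rows with a boolean mask returns a pandas Series, and append adds that whole Series object to the list as a single element. Switching from append to extend adds the values inside the Series instead, which is exactly the result I wanted.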
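To make the difference concrete, here is a minimal sketch on a single selection. The DataFrame contents are made up for illustration, since the original data is not shown in the post.

```python
import pandas as pd

# Hypothetical data; the original DataFrames are not shown in the post.
person1 = pd.DataFrame({"Movie": ["A", "B", "C"], "Rating": [2.5, 3.5, 3.0]})

# Boolean-mask selection returns a one-element Series, not a bare float.
selected = person1[person1["Movie"] == "B"]["Rating"]

appended = []
appended.append(selected)  # stores the Series object itself

extended = []
extended.extend(selected)  # iterates over the Series and stores its values

print(type(appended[0]).__name__)  # the list element is a Series
print(extended)                    # the list holds only the rating values
```

This is why the appended list prints the index, Name and dtype for every element, while the extended list prints plain numbers.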

rating1 = []
rating2 = []
for value in person1["Movie"]:
    for value2 in person2["Movie"]:
        if value == value2:
            rating1.extend(person1[person1["Movie"] == value]["Rating"])
            rating2.extend(person2[person2["Movie"] == value2]["Rating"])

print(rating1)
[2.5, 3.5, 2.5, 3.0, 3.5, 3.0]
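As for a better way: the nested loop can probably be replaced by an inner join on the "Movie" column. The following is only a sketch with made-up data, and it assumes each movie appears at most once per person. The suffixes "_1" and "_2" are names chosen here for illustration.

```python
import pandas as pd

# Hypothetical ratings; the original DataFrames are not shown in the post.
person1 = pd.DataFrame({"Movie": ["A", "B", "C", "D"],
                        "Rating": [2.5, 3.5, 3.0, 4.0]})
person2 = pd.DataFrame({"Movie": ["B", "C", "E"],
                        "Rating": [3.0, 3.5, 5.0]})

# An inner join on "Movie" keeps only the movies that both people rated.
# The suffixes distinguish the two "Rating" columns in the result.
common = person1.merge(person2, on="Movie", how="inner", suffixes=("_1", "_2"))

# tolist() converts each column to a plain Python list of floats.
rating1 = common["Rating_1"].tolist()
rating2 = common["Rating_2"].tolist()

print(rating1)
print(rating2)
```

The join keeps the two lists aligned row by row, so rating1[i] and rating2[i] always refer to the same movie.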

With plain lists of numbers, I can call the Euclidean and Manhattan distance functions like this:

import numpy as np
from scipy.spatial import distance

# Convert the rating lists to NumPy arrays
r1 = np.array(rating1)
r2 = np.array(rating2)

euclidean = distance.euclidean(r1, r2)
manhattan = distance.cityblock(r1, r2)
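For reference, the two SciPy calls above compute standard formulas that can also be written directly in NumPy. In this sketch r1 is taken from the printed output in the post, while r2 is hypothetical, since the second person's ratings are not shown.

```python
import numpy as np

# r1 comes from the printed output in the post; r2 is made up for illustration.
r1 = np.array([2.5, 3.5, 2.5, 3.0, 3.5, 3.0])
r2 = np.array([3.0, 3.5, 1.5, 5.0, 3.0, 3.5])

diff = r1 - r2

# Euclidean distance: square root of the sum of squared differences.
euclidean = np.sqrt(np.sum(diff ** 2))

# Manhattan (city block) distance: sum of absolute differences.
manhattan = np.sum(np.abs(diff))

print(euclidean)
print(manhattan)
```

Both distances require r1 and r2 to have the same length and the same movie order, which is why the ratings are collected pairwise.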