I have a frame like this:
import polars as pl

src = pl.DataFrame(
    {
        "c1": ["a", "b", "c", "d"],
        "c2": [[0], [2, 3, 4], [3, 4, 7, 9], [3, 9]],
    }
)
...and a list of targets:
targets = pl.Series([3, 7, 9])
...and I want to count, for each row, how many elements of "c2" are targets:
dst = pl.DataFrame(
    {
        "c1": ["a", "b", "c", "d"],
        "c2": [[0], [2, 3, 4], [3, 4, 7, 9], [3, 9]],
        "match_count": [0, 1, 3, 2],
    }
)
What is the most efficient way to do this?
I have seen count_matches, but it does not work with multiple options:
src["c2"].list.count_matches(3)  # OK.
src["c2"].list.count_matches([3, 7, 9])  # No way.
You can use Python's built-in any() function. Here is a link explaining how this function works. This is the code I wrote, which might help:
import pandas as pd

dst = pd.DataFrame(
    {
        "c1": ["a", "b", "c", "d"],
        "c2": [[0], [2, 3, 4], [3, 4, 7, 9], [3, 9]],
        "match_count": [0, 1, 3, 2],
    }
)
targets = pd.Series([3, 7, 9])

# Count the rows whose "c2" list contains at least one target.
count = 0
for i in dst["c2"]:
    # check whether any target appears in the list i
    if any(j in i for j in targets):
        count += 1
print(count)  # 3
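Note that the loop above counts how many rows contain at least one target, while the question asks for a count per row. A minimal plain-Python sketch of the per-row version follows. It assumes the lists are small enough to iterate in Python, and count_targets is a hypothetical helper name, not part of Polars or pandas:

```python
# Per-row count of how many list elements are targets (plain Python sketch).
rows = [[0], [2, 3, 4], [3, 4, 7, 9], [3, 9]]
targets = {3, 7, 9}  # a set makes each membership check O(1) on average


def count_targets(values, wanted):
    """Return how many items of `values` are present in `wanted`."""
    return sum(1 for v in values if v in wanted)


match_count = [count_targets(row, targets) for row in rows]
print(match_count)  # [0, 1, 3, 2]
```

The resulting list has one entry per row, so it could be attached to the frame as the "match_count" column. Duplicates inside a row are counted each time they occur.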