Pandas - complex lookup in a code-value-pair-like DataFrame

Question

I am struggling with pandas.

I have a DataFrame of code-value pairs (from an external source) that looks like this:

import pandas as pd

# Lookup table: one row per (attribute_name, attribute_code) pair
c = {'attribute_name': ['heat_production', 'heat_production', 'heat_production', 'heat_emission', 'heat_emission'],
     'attribute_code': [7410, 7420, 7430, 4211, 4220],
     'attribute_text': ['Heating with oil', 'Heating with gas', 'Central heating', 'Radiator', 'Floor']}
codes = pd.DataFrame(data=c)

Then there is a separate DataFrame like this:

b = {'adress': ['Teststreet 1', ' Teststreet 2'],
     'heat_production': [7420, 7410]}
buildings = pd.DataFrame(data=b)

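The goal is to add a column to the buildings DataFrame that contains the text for the heating code.

If possible, I do not want to change the structure of the DataFrame that holds the codes.

A simple "static" lookup in the codes DataFrame like this works fine: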

codes['attribute_text'][(codes['attribute_name'] == 'heat_production') & (codes['attribute_code'] == 7410)]

But as soon as I try to apply it dynamically to the whole DataFrame, for example:

buildings.assign(heating_txt = codes['attribute_text'][(codes['attribute_name'] == 'heat_production') & (codes['attribute_code'] == buildings['heat_production'])])

I get an error like this: ValueError: Can only compare identically-labeled Series objects

Answer 1

Example

s1 = pd.Series([0, 1])
s2 = pd.Series([0, 1, 2])
s1 == s2

ValueError: Can only compare identically-labeled Series objects

This error occurs because s1 and s2 do not have the same index.
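To make the index mismatch visible, here is a minimal sketch. The names same_index, raised and s3 are introduced only for this illustration and are not part of the original answer.

```python
import pandas as pd

s1 = pd.Series([0, 1])
s2 = pd.Series([0, 1, 2])

# The two indexes differ: s2 has an extra label, 2.
same_index = s1.index.equals(s2.index)
print(same_index)

# == refuses to compare Series whose indexes differ.
try:
    s1 == s2
    raised = False
except ValueError as err:
    raised = True
    print(err)

# With identical indexes, == compares element by element.
s3 = pd.Series([0, 5])  # hypothetical Series sharing s1's index
print((s1 == s3).tolist())
```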

In that case you can use eq instead of ==:

s1.eq(s2)

Output:

0     True
1     True
2    False
dtype: bool

The following part of your code caused the error:

codes['attribute_code'] == buildings['heat_production']

However, you do not need to replace == with eq to fix the error. Given your logic, fixing the error would not give you the output you want.
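As a rough illustration of that point, the sketch below swaps == for eq in the question's expression. eq aligns the two Series by index label (row 0 with row 0, row 1 with row 1) rather than looking each building's code up in the codes table, so with the sample data nothing matches. The names mask and result are introduced only for this example.

```python
import pandas as pd

codes = pd.DataFrame({
    'attribute_name': ['heat_production', 'heat_production', 'heat_production',
                       'heat_emission', 'heat_emission'],
    'attribute_code': [7410, 7420, 7430, 4211, 4220],
    'attribute_text': ['Heating with oil', 'Heating with gas', 'Central heating',
                       'Radiator', 'Floor'],
})
buildings = pd.DataFrame({
    'adress': ['Teststreet 1', ' Teststreet 2'],
    'heat_production': [7420, 7410],
})

# eq compares label by label: codes row 0 (7410) against buildings row 0 (7420),
# codes row 1 (7420) against buildings row 1 (7410); rows 2-4 have no partner.
mask = (codes['attribute_name'] == 'heat_production') & \
       codes['attribute_code'].eq(buildings['heat_production'])
print(mask.any())

# The error is gone, but the new column holds no text at all.
result = buildings.assign(heating_txt=codes.loc[mask, 'attribute_text'])
print(result)
```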

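In future, ask how to produce the desired output rather than how to fix the error. In most cases, fixing the error alone will not get you the output you expect.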
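One possible way to get the desired output without changing the codes DataFrame is sketched below. It is not taken from this answer: it assumes that attribute_code is unique within the heat_production rows, and the names heat_codes, lookup and result are hypothetical.

```python
import pandas as pd

codes = pd.DataFrame({
    'attribute_name': ['heat_production', 'heat_production', 'heat_production',
                       'heat_emission', 'heat_emission'],
    'attribute_code': [7410, 7420, 7430, 4211, 4220],
    'attribute_text': ['Heating with oil', 'Heating with gas', 'Central heating',
                       'Radiator', 'Floor'],
})
buildings = pd.DataFrame({
    'adress': ['Teststreet 1', ' Teststreet 2'],
    'heat_production': [7420, 7410],
})

# Build a code -> text Series from the heat_production rows only.
# codes itself is left untouched.
heat_codes = codes[codes['attribute_name'] == 'heat_production']
lookup = heat_codes.set_index('attribute_code')['attribute_text']

# Series.map looks up each building's code in the index of lookup;
# codes without a match would become NaN.
result = buildings.assign(heating_txt=buildings['heat_production'].map(lookup))
print(result)
```

The same pattern could be repeated for other attributes, such as heat_emission, by changing the filter value and the column being mapped.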

Answer 2

Could you try a merge? I am not sure this is exactly what you are after, but it might help.

buildings.merge(codes[['attribute_code','attribute_text']],left_on = ['heat_production'], right_on = ['attribute_code'], how = 'inner')

This ends up printing:

          adress  heat_production  attribute_code    attribute_text
0   Teststreet 1             7420            7420  Heating with gas
1   Teststreet 2             7410            7410  Heating with oil

The duplicated key column can always be removed with

.drop(columns=['attribute_code'])

if it bothers you.
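Putting the pieces together, a variant of this merge might look like the sketch below. It makes a few assumptions that go beyond the answer: the codes are first filtered to attribute_name == 'heat_production' so that a code reused by another attribute cannot match, how='left' is used so that buildings with an unknown code are kept, and the text column is renamed to heating_txt. The names heat_codes and merged are hypothetical.

```python
import pandas as pd

codes = pd.DataFrame({
    'attribute_name': ['heat_production', 'heat_production', 'heat_production',
                       'heat_emission', 'heat_emission'],
    'attribute_code': [7410, 7420, 7430, 4211, 4220],
    'attribute_text': ['Heating with oil', 'Heating with gas', 'Central heating',
                       'Radiator', 'Floor'],
})
buildings = pd.DataFrame({
    'adress': ['Teststreet 1', ' Teststreet 2'],
    'heat_production': [7420, 7410],
})

# Keep only the heat_production codes and the two columns needed for the join.
heat_codes = codes.loc[codes['attribute_name'] == 'heat_production',
                       ['attribute_code', 'attribute_text']]

merged = (
    buildings
    .merge(heat_codes, left_on='heat_production', right_on='attribute_code', how='left')
    .drop(columns=['attribute_code'])                  # remove the duplicated key
    .rename(columns={'attribute_text': 'heating_txt'})
)
print(merged)
```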
