Why does df.at raise a KeyError with a MultiIndex, while df.loc does not?

Question

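After switching from a regular index to a MultiIndex, one of my functions broke, and I don't know how to fix it. Let me use the DataFrame from the pandas documentation for pandas.DataFrame.at to illustrate the problem: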

>>> import pandas as pd
>>> df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],
...                   index=[4, 5, 6], columns=['A', 'B', 'C'])
>>> df
    A   B   C
4   0   2   3
5   0   4   1
6  10  20  30
>>> df.at[4, 'B']
2

If you now convert it to a MultiIndex, the same call fails with a KeyError:

>>> df = df.set_index("A", append=True)
>>> df
       B   C
  A
4 0    2   3
5 0    4   1
6 10  20  30
>>> df.at[4, 'B']
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    df.at[4, "B"]
     ~~~~~^^^^^^^^
  File "/.../pandas/core/indexing.py", line 2419, in __getitem__
    return super().__getitem__(key)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../pandas/core/indexing.py", line 2371, in __getitem__
    return self.obj._get_value(*key, takeable=self._takeable)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../pandas/core/frame.py", line 3882, in _get_value
    loc = engine.get_loc(index)
          ^^^^^^^^^^^^^^^^^^^^^
  File "pandas/_libs/index.pyx", line 822, in pandas._libs.index.BaseMultiIndexCodesEngine.get_loc
KeyError: 4

This behavior would be fine if loc behaved the same way, but it doesn't:

>>> df.loc[4, 'B']
A
0    2
Name: B, dtype: int64

Of course, you can work around this by specifying all levels of the index:

>>> df.at[(4, 0), 'B']
2

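But given that I have quite a few MultiIndex levels, this doesn't seem like a viable solution. Using loc and then appending .iloc[0] doesn't feel very Pythonic either. Does anyone know how to make .at work without specifying more than the first level?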

Answer 1

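at is designed to select a single value from a DataFrame. The documentation describes it as: "Access a single value for a row/column label pair."

You therefore have to provide all the indexers, that is, a key for every level of the index.

As your example shows, loc with an incomplete indexer returns a Series, not a value: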

>>> df.loc[4, 'B']
A
0    2
Name: B, dtype: int64

This is incompatible with at, whose purpose is to select a single value.
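
If the goal is a scalar lookup by the first level only, a few workarounds are possible. The following is a minimal sketch rather than an official recipe: it assumes the first-level key matches exactly one row, and the helper name at_first_level is hypothetical, introduced here only for illustration.

```python
import pandas as pd

# Same DataFrame as in the question, with "A" appended as a second index level.
df = pd.DataFrame(
    [[0, 2, 3], [0, 4, 1], [10, 20, 30]],
    index=[4, 5, 6],
    columns=["A", "B", "C"],
).set_index("A", append=True)

# Option 1: partial lookup with loc, then take the first element by position.
value_loc = df.loc[4, "B"].iat[0]

# Option 2: drop the levels you do not want to specify, then use at as before.
# Assumption: the remaining level identifies the row uniquely.
value_drop = df.droplevel("A").at[4, "B"]


def at_first_level(frame, key, column):
    """Hypothetical helper: scalar lookup on a MultiIndex frame by first level only.

    Raises KeyError if the key does not match exactly one row, so an
    ambiguous partial key is not silently resolved to its first match.
    """
    matches = frame.loc[key, column]  # Series indexed by the remaining levels
    if len(matches) != 1:
        raise KeyError(f"{key!r} matches {len(matches)} rows, expected exactly 1")
    return matches.iat[0]


print(value_loc, value_drop, at_first_level(df, 6, "C"))
```

Option 2 builds a new DataFrame on every call, so for repeated lookups it may be cheaper to drop the levels once and reuse the result. The helper makes the uniqueness assumption explicit, which .iloc[0] on its own would hide.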
