Pandas DataFrame KeyError: 1

Question (votes: 1, answers: 1)

I am a beginner, so I cannot pinpoint the cause of the error in the code below when train.jsonl uses a format like this:

{"claim": "But he said if people really want to know if they have CHIP they can get a blood test that costs a few MONEYc1", "evidence": "sentenceID100037", "label": "0"}
{"claim": "This is rather a courtly formulation and would doubtless trigger further eyerolling if uttered in", "evidence": "sentenceID100038", "label": "0"}

The first cell runs without problems and displays the data.

import pandas as pd

prefix = '/content/'
train_df = pd.read_json(prefix + 'train.jsonl', orient='records', lines=True)
train_df.head()
[See my Colab Notebook](https://colab.research.google.com/gist/lenyabloko/0e17ebe0f3a0e808779bc1fa95e9b24d/semeval2020-delex.ipynb)

I even tried this extra trick, which explains the comments about column 0:

prefix = '/content/'
train_df = pd.read_json(prefix + 'train_delex.jsonl', orient='columns')

train_df.to_csv(prefix+'train.tsv', sep='\t', index=False, header=False)
train_df = pd.read_csv(prefix + 'train.tsv', header=None)

train_df.head()

What I now see in the output above is a single column labeled '0' instead of the three original columns {"claim": "...", "evidence": "...", "label": "..."} from the JSONL file (why?).

But when I add the DataFrame code, it raises an error:

train_df = pd.DataFrame({
    'id': train_df[1],
    'text': train_df[0],
    'labels':train_df[2]
})

Given a column named '0', this cannot work. But where did that column come from?

KeyError                                  Traceback (most recent call last)
2 frames
<ipython-input-16-0537eda6b397> in <module>()
      6 
      7 train_df = pd.DataFrame({
----> 8     'id': train_df[1],
      9     'text': train_df[0],
     10     'labels':train_df[2]

/usr/local/lib/python3.6/dist-packages/pandas/core/frame.py in __getitem__(self, key)
   2993             if self.columns.nlevels > 1:
   2994                 return self._getitem_multilevel(key)
-> 2995             indexer = self.columns.get_loc(key)
   2996             if is_integer(indexer):
   2997                 indexer = [indexer]

/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2897                 return self._engine.get_loc(key)
   2898             except KeyError:
-> 2899                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2900         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2901         if indexer.ndim > 1 or indexer.size > 1:


json pandas dataframe keyerror
1 answer
0 votes

Here is the solution that worked for me:
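
As a minimal sketch (not necessarily the exact code from this answer), assuming the frame is loaded as in the first snippet of the question: `pd.read_json(..., lines=True)` names the columns after the JSON keys, so they have to be selected by name (or by position with `.iloc`). `train_df[1]` looks for a column labelled `1`, finds none, and raises `KeyError: 1`. If the data is round-tripped through a TSV file, `read_csv` also needs the same `sep='\t'` that `to_csv` used, because its default separator is a comma. The shortened sample records and the in-memory `io.StringIO` buffers below are stand-ins for the real files.

```python
import io

import pandas as pd

# Stand-in for train.jsonl: two records in the same shape as the question's data
# (claim text shortened for illustration).
jsonl = io.StringIO(
    '{"claim": "first claim, with a comma", "evidence": "sentenceID100037", "label": "0"}\n'
    '{"claim": "second claim", "evidence": "sentenceID100038", "label": "0"}\n'
)

# lines=True parses one JSON object per line; columns are named after the keys.
train_df = pd.read_json(jsonl, orient='records', lines=True)

# Integer keys do not match the string column labels, hence the KeyError.
try:
    train_df[1]
except KeyError as err:
    print('KeyError:', err)

# Select by column name instead (train_df.iloc[:, 1] would select by position).
renamed = pd.DataFrame({
    'id': train_df['evidence'],
    'text': train_df['claim'],
    'labels': train_df['label'],
})
print(renamed.head())

# Optional TSV round trip: the separator must match when reading back.
buf = io.StringIO()  # stand-in for train.tsv
train_df.to_csv(buf, sep='\t', index=False, header=False)
buf.seek(0)
round_trip = pd.read_csv(buf, sep='\t', header=None)

# With header=None the columns are labelled 0, 1, 2, so integer keys work here.
print(round_trip[1].tolist())
```

With `header=None` and a matching separator, the reloaded frame has integer column labels, which is the only case in which `train_df[0]`, `train_df[1]` and `train_df[2]` from the question would resolve.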
