Original text is lost when calling to_string()

Question (votes: 1, answers: 1)

I have a JSON object, and one of its keys is:

"transcript": "The universe is bustling with matter and energy. Even in the vast apparent emptiness of intergalactic space, there's one hydrogen atom per cubic meter. That's not the mention a barrage of particles and electromagnetic radiation passing every which way from stars, galaxies, and into black holes. There's even radiation left over from the Big Bang...

Loading it into a DataFrame:

import pandas as pd

# Initialize the DataFrame for the universe transcript
dfJson = pd.read_json('test1.json')

This is the code I used to try to extract it:

import pprint

dfJsonTranscript = dfJson.get('transcript').to_string()
pprint.pprint(dfJsonTranscript)

# Write the extracted string to a text file
with open("sample.txt", "wt") as text_file:
    text_file.write(dfJsonTranscript)

My output:

0      The universe is bustling with matter and energ...
1      The universe is bustling with matter and energ...
2      The universe is bustling with matter and energ...
3      The universe is bustling with matter and energ...
4      The universe is bustling with matter and energ...
5      The universe is bustling with matter and energ...
6      The universe is bustling with matter and energ...
7      The universe is bustling with matter and energ...
8      The universe is bustling with matter and energ...

The original JSON:

"transcript": "The universe is bustling with matter and energy. Even in the vast apparent emptiness of intergalactic space, there's one hydrogen atom per cubic meter. That's not the mention a barrage of particles and electromagnetic radiation passing every which way from stars, galaxies, and into black holes. There's even radiation left over from the Big Bang... universe. ",
  "words": [
    {
      "alignedWord": "the",
      "case": "success",
      "end": 6.31,
      "endOffset": 3,
      "phones": [
        {
          "duration": 0.09,
          "phone": "dh_B"
        },
        {
          "duration": 0.05,
          "phone": "iy_E"
        }
      ],
      "start": 6.17,
      "startOffset": 0,
      "word": "The"
    },
    {
      "alignedWord": "universe",
      "case": "success",
      "end": 6.83,
      "endOffset": 12,
      "phones": [
        {
          "duration": 0.08,
          "phone": "y_B"
        },

Why do I lose the key's original value when I call to_string() on it? Am I losing it because I turned it into a DataFrame with pandas?

python json pandas csv tostring
1 answer
0 votes

Try this:

dfJsonTranscript = dfJson.get('transcript').to_string(index=False)

Setting index=False tells the to_string method not to print the index (row) labels.
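As a minimal sketch of what index=False changes, the snippet below builds a small stand-in Series rather than loading test1.json. The sample text and the variable names (transcript, with_index, without_index) are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical stand-in for the 'transcript' column: the same short
# string repeated on every row, as in the question's output.
transcript = pd.Series(["The universe is bustling."] * 3, name="transcript")

# Default rendering: each line starts with the row label (0, 1, 2).
with_index = transcript.to_string()

# index=False drops the row labels and keeps only the values.
without_index = transcript.to_string(index=False)

print(with_index)
print(without_index)
```

Note that this only removes the row labels; the value is still printed once per row.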

Edit:

To keep the strings from being truncated, you can set the pandas display.max_colwidth option. It has to be set before calling to_string:

pd.set_option("display.max_colwidth", 10000)
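One way to try this without changing the setting globally is pd.option_context, which applies the option only inside a with block. The sketch below assumes pandas 1.0 or later, where None means "no limit" for display.max_colwidth. The sample data and the names long_text, full and first_value are hypothetical. Because the transcript appears to be repeated on every row, the last lines also show reading a single cell directly, which skips to_string altogether.

```python
import pandas as pd

# Hypothetical stand-in for the transcript: a string longer than the
# default column width of 50 characters, repeated on two rows.
long_text = ("The universe is bustling with matter and energy. " * 3).strip()
transcript = pd.Series([long_text] * 2, name="transcript")

# Lift the column-width limit only inside this block
# (None = no limit, assuming pandas >= 1.0).
with pd.option_context("display.max_colwidth", None):
    full = transcript.to_string(index=False)

# Alternative: read one cell directly; this is the plain Python string,
# with no display formatting applied.
first_value = transcript.iloc[0]

print(full)
print(first_value)
```

If only the transcript text is needed for sample.txt, writing first_value is probably simpler than formatting the whole column.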