How can I use commas to separate multiple float values in a CSV column?


I have a problem that is closely related to this question:

How to convert a .wav file into a pandas DataFrame so it can be fed to a neural network?

I created a pandas DataFrame with the following code:

df = pd.DataFrame(data={"wavsamples": pd.Series(wavsamples), "wavsamplerate": pd.Series(wavsamplerate), "wavname": pd.Series(wavname)}, copy=False, columns = ['wavsamples','wavsamplerate','wavname'])
df.index.name = 'filenumber'

If I print the wavsamples column of the pandas DataFrame:

print(df.wavsamples.to_string(index=False))

it shows me the pandas Series "wavsamples" like this:

[0.02709961, 0.06796265, -0.011810303, -0.23361...
[0.0068969727, 0.04547119, 0.043029785, -0.1025...
[-0.005432129, 0.021057129, 0.078063965, 0.0270...
[0.00079345703, 0.064941406, 0.09710693, -0.088...
[-0.0067749023, 0.008087158, 0.06536865, 0.0219...
[-0.008758545, 0.015106201, 0.08139038, 0.02600...
[-0.0034179688, 0.039733887, 0.07711792, 0.1164...
[-0.0008087158, -0.000579834, -0.00062561035, -...
[0.021026611, 0.029907227, 0.040527344, 0.05448...
[0.017288208, 0.026321411, 0.0340271, 0.0403137...
[0.019561768, 0.026611328, 0.03668213, 0.047576...
[0.022827148, 0.03414917, 0.056289673, 0.078018...

Each of these 12 rows holds the raw float sample values of one .wav file. Now, if I write these arrays into a single column of a CSV file:

df.to_csv("./test.csv", sep=',', columns = ['wavsamples','wavsamplerate','wavname'])

I get the following CSV file:

filenumber,wavsamples,wavsamplerate,wavname
0,"[ 0.02709961  0.06796265 -0.0118103  ... -0.36627197 -0.36645508
 -0.3657837 ]",44100,Audio1.wav
1,"[ 0.00689697  0.04547119  0.04302979 ... -0.03359985 -0.03244019
 -0.03167725]",44100,Audio2.wav
2,"[-0.00543213  0.02105713  0.07806396 ...  0.45645142  0.45541382
  0.45510864]",44100,Audio3.wav
3,[0.00079346 0.06494141 0.09710693 ... 0.22116089 0.22421265 0.22741699],44100,Audio4.wav
4,"[-0.0067749   0.00808716  0.06536865 ...  0.24209595  0.23977661
  0.23754883]",44100,Audio5.wav
5,"[-0.00875854  0.0151062   0.08139038 ... -0.0256958  -0.0184021
 -0.01156616]",44100,Audio6.wav
6,"[-0.00341797  0.03973389  0.07711792 ...  0.41384888  0.41375732
  0.41348267]",44100,Audio7.wav
7,"[-0.00080872 -0.00057983 -0.00062561 ...  0.0100708   0.0100708
  0.01000977]",44100,Audio8.wav
8,[0.02102661 0.02990723 0.04052734 ... 0.00976562 0.00965881 0.00990295],44100,Audio9.wav
9,[0.01728821 0.02632141 0.0340271  ... 0.01344299 0.01341248 0.01325989],44100,Audio10.wav
10,[0.01956177 0.02661133 0.03668213 ... 0.0141449  0.01400757 0.01402283],44100,Audio11.wav
11,[0.02282715 0.03414917 0.05628967 ... 0.01019287 0.01037598 0.01025391],44100,Audio12.wav

So the "wavsamples" column has lost all of its commas. If I now read and print that column from the CSV file with:

import csv

with open("./test.csv", "r") as csv_file:
    reader = csv.reader(csv_file)
    rows = list(reader)
    # rows[0] is the header, so rows[12] is the last of the 12 files
    audiofile = rows[12][1]
    print(audiofile)

I just get:

[0.02282715 0.03414917 0.05628967 ... 0.01019287 0.01037598 0.01025391]

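Not only have all the commas been removed, but because the wavsamples column is written as a string, the three dots are stored as literal dot characters, so all the sample values in between are lost when the column is written to the CSV...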
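A minimal sketch of what appears to be happening, assuming each cell is written using the NumPy array's string form. The `samples` array below is made-up stand-in data, not the real .wav contents. By default NumPy summarizes arrays with more than 1000 elements when converting them to text, which produces the space-separated form with a literal "...":

```python
import numpy as np

# Hypothetical stand-in for one file's samples: 2000 values, which is
# above NumPy's default print threshold of 1000 elements.
samples = np.linspace(-1.0, 1.0, 2000, dtype=np.float32)

# The string form of the array: space separated and summarized.
text = str(samples)
print(text)

has_ellipsis = "..." in text   # middle values replaced by "..."
has_commas = "," in text       # NumPy's str() does not use commas

# Converting to a plain Python list first keeps every value and
# yields a comma-separated string.
full_text = str(samples.tolist())
n_values = len(full_text.strip("[]").split(","))
print(has_ellipsis, has_commas, n_values)
```

The summarized string is all that reaches the file, so the dropped values cannot be recovered from the CSV afterwards.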

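I know CSV is probably the worst format for storing .wav data, as has been pointed out many times on Stack Overflow... but I'm just curious: is there a way to store the audio arrays in a CSV column with commas between the float values?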

When I read from the CSV, I would like to get something like this:

[0.022827148, 0.03414917, 0.056289673, 0.078018...

instead of this:

[0.02282715 0.03414917 0.05628967 ... 0.01019287 0.01037598 0.01025391]

How do I write the CSV column so that it can be read back correctly afterwards?

python pandas csv audio wav
1 Answer

The following code reads your .csv file into a pandas DataFrame:

import pandas as pd

# read_csv accepts the path directly, so a separate open() is not needed
data = pd.read_csv("waves.csv")
# temporarily lift the display limits so every row and column is printed
with pd.option_context('display.max_rows', None, 'display.max_columns', None):
    print(data)
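
Note that reading the file this way returns the wavsamples column as plain strings. One possible way to get comma-separated values that survive the round trip is to serialize each array to a JSON list before writing and parse it back while reading. This is only a sketch under assumptions: the sample arrays are made up, the in-memory buffer stands in for a file such as "./test.csv", and using JSON is a choice made for illustration, not something from the question.

```python
import io
import json

import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real .wav contents.
wavsamples = [
    np.linspace(-1.0, 1.0, 1500, dtype=np.float32),
    np.linspace(0.0, 0.5, 1200, dtype=np.float32),
]

# Serialize every array to a JSON list string ("[v1, v2, ...]") so the
# cell contains all values, separated by commas, with no "..." summary.
df = pd.DataFrame({
    "wavsamples": [json.dumps(a.tolist()) for a in wavsamples],
    "wavsamplerate": [44100, 44100],
    "wavname": ["Audio1.wav", "Audio2.wav"],
})
df.index.name = "filenumber"

buffer = io.StringIO()  # in-memory stand-in for a CSV file on disk
df.to_csv(buffer)       # cells containing commas are quoted automatically
buffer.seek(0)

# Parse the JSON strings back into Python lists while reading.
restored = pd.read_csv(
    buffer,
    index_col="filenumber",
    converters={"wavsamples": json.loads},
)

# Turn the lists back into float32 arrays.
arrays = [np.asarray(v, dtype=np.float32) for v in restored["wavsamples"]]
print(arrays[0][:4], len(arrays[0]))
```

Real audio files can produce very large cells. If the file is read with the csv module instead of pandas, the default field size limit may need to be raised with csv.field_size_limit.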