Loss of precision when interpolating between valid points


My test.csv contains many NaNs:

"Time","Y1","Y2","Y3"
"s","celsius","celsius","celsius"
"0.193","","",""
"0.697","","1",""
"1.074","","","-27"
"1.579","10","",""
"2.083","","5",""
"3.123","15","","-28"
"5.003","","",""

When I try to fill the missing data between valid points by interpolating, the gaps are filled with whole integers:

import pandas as pd
df = pd.read_csv("test.csv")
df.loc[1:, "Y3"] = pd.to_numeric(df.loc[1:, "Y3"])
df.loc[1:, "Y3"] =  df.loc[1:, "Y3"].interpolate(method='linear').ffill()  #method='time' , method='index'

>>> print (df)
    Time       Y1       Y2       Y3
0      s  celsius  celsius  celsius
1  0.193      NaN      NaN      NaN
2  0.697      NaN        1      NaN
3  1.074      NaN      NaN      -27
4  1.579       10      NaN      -27  <<-----
5  2.083      NaN        5      -27  <<-----
6  3.123       15      NaN      -28
7  5.003      NaN      NaN      -28

I can fix the NaNs at the start of the column with bfill, but how do I fill the points between -27 and -28 with fractional values such as -27.3 and -27.6?

python pandas dataframe nan
1 Answer

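The problem is that the first data row holds strings (the units), so pandas does not read the columns as numbers.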

df.loc[1:, "Y3"] = pd.to_numeric(df.loc[1:, "Y3"])
does not change the column's dtype to a numeric type.
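
A minimal sketch of why the assignment has no effect on the dtype. The Series `s` is a hypothetical stand-in for the `Y3` column, built by hand with `dtype=object` (an assumption about how a column mixing text and numbers is stored), not the output of `read_csv`:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the "Y3" column: the units string sits in row 0,
# so the Series is stored with object dtype.
s = pd.Series(
    ["celsius", np.nan, np.nan, "-27", np.nan, np.nan, "-28", np.nan],
    dtype=object,
)

# Converting only the rows below the units string does give a float Series ...
tail = pd.to_numeric(s.iloc[1:])
print(tail.dtype)

# ... but writing it back into part of the column leaves the column's dtype
# unchanged, because row 0 still holds a string.
s.iloc[1:] = tail.to_numpy()
print(s.dtype)
```

The conversion only sticks if the whole column is numeric, which is why the units row has to leave the data.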

You should not keep the units header as a data row; read both header lines into a MultiIndex instead:

df = pd.read_csv("test.csv", header=[0, 1])

Then:

df['Y3'] = df['Y3'].interpolate(method='linear').ffill()

Output:

    Time      Y1      Y2         Y3
       s celsius celsius    celsius
0  0.193     NaN     NaN        NaN
1  0.697     NaN     1.0        NaN
2  1.074     NaN     NaN -27.000000
3  1.579    10.0     NaN -27.333333
4  2.083     NaN     5.0 -27.666667
5  3.123    15.0     NaN -28.000000
6  5.003     NaN     NaN -28.000000
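
For reference, a self-contained sketch of the same idea that reads the sample data from a string instead of a file. The names `csv_text`, `y3`, `linear` and `by_time` are illustrative. The last step is an optional variant, not part of the answer above: it assumes you want the fill weighted by the `Time` values rather than by row position, as the `method='index'` comment in the question suggests.

```python
import io

import pandas as pd

csv_text = '''"Time","Y1","Y2","Y3"
"s","celsius","celsius","celsius"
"0.193","","",""
"0.697","","1",""
"1.074","","","-27"
"1.579","10","",""
"2.083","","5",""
"3.123","15","","-28"
"5.003","","",""
'''

# Both header lines become column levels, so the data rows parse as floats.
df = pd.read_csv(io.StringIO(csv_text), header=[0, 1])
y3 = df[("Y3", "celsius")]

# Linear interpolation by row position: equal steps between -27 and -28.
linear = y3.interpolate(method="linear")
print(linear)

# Variant (assumption): weight the fill by the Time values instead, by using
# Time as the index and interpolating with method="index".
by_time = pd.Series(y3.to_numpy(), index=df[("Time", "s")].to_numpy())
by_time = by_time.interpolate(method="index")
print(by_time)
```

With `method="linear"` the two gaps are filled in equal steps regardless of the timestamps; with `method="index"` each filled value depends on how far its `Time` lies between 1.074 and 3.123.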