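My goal is to take data by a specific column and a specific type and interpolate the missing values. I managed to do that, but I am struggling to get back to the shape the DataFrame had before interpolation.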
import pandas as pd

data = [
    {"type": "Car", "avg_speed": 30, "max_speed": 200},
    {"type": "Car", "avg_speed": 20, "max_speed": 100},
    {"type": "Car", "avg_speed": 25, "max_speed": None},
    {"type": "Plane", "avg_speed": 300, "max_speed": 2000},
    {"type": "Plane", "avg_speed": 200, "max_speed": 1000},
    {"type": "Plane", "avg_speed": 250, "max_speed": None}
]
df = pd.DataFrame(data)
print(df)
# Within each type, index by avg_speed and interpolate max_speed along that index
post_interp = df.groupby("type").apply(lambda x: x.set_index(
    'avg_speed').sort_index().interpolate(method='index'))
print(post_interp)
First print:
    type  avg_speed  max_speed
0    Car         30      200.0
1    Car         20      100.0
2    Car         25        NaN
3  Plane        300     2000.0
4  Plane        200     1000.0
5  Plane        250        NaN
Second print:
                  type  max_speed
type  avg_speed
Car   20           Car      100.0
      25           Car      150.0
      30           Car      200.0
Plane 200        Plane     1000.0
      250        Plane     1500.0
      300        Plane     2000.0
I want to get back the shape of the DataFrame from the first print, but with the interpolated values filled in.
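Pass group_keys=False so that the group label is not added as an extra index level, then call reset_index() to turn avg_speed back into a column: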
post_interp = (df.groupby("type", group_keys=False)
                 .apply(lambda x: x.set_index('avg_speed')
                                   .sort_index()
                                   .interpolate(method='index'))
                 .reset_index())
Or:
post_interp = (df.set_index('avg_speed')
                 .sort_index()
                 .groupby("type", group_keys=False)
                 .apply(lambda x: x.interpolate(method='index'))
                 .reset_index())
print(post_interp)
   avg_speed   type  max_speed
0         20    Car      100.0
1         25    Car      150.0
2         30    Car      200.0
3        200  Plane     1000.0
4        250  Plane     1500.0
5        300  Plane     2000.0
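
Note that the result above is sorted by avg_speed and has avg_speed as the first column. If the original row order, index, and column order from the first print should be kept as well, one possible approach is to fill max_speed group by group and write the values back in place. This is only a sketch under some assumptions: the helper name interpolate_by_group is made up for illustration, and it assumes the avg_speed values are unique within each type.

```python
import pandas as pd

data = [
    {"type": "Car", "avg_speed": 30, "max_speed": 200},
    {"type": "Car", "avg_speed": 20, "max_speed": 100},
    {"type": "Car", "avg_speed": 25, "max_speed": None},
    {"type": "Plane", "avg_speed": 300, "max_speed": 2000},
    {"type": "Plane", "avg_speed": 200, "max_speed": 1000},
    {"type": "Plane", "avg_speed": 250, "max_speed": None},
]
df = pd.DataFrame(data)


def interpolate_by_group(frame, group_col, x_col, y_col):
    """Hypothetical helper: interpolate y_col against x_col within each group,
    keeping the original row order, index, and column order.

    Assumes x_col values are unique within each group.
    """
    out = frame.copy()
    out[y_col] = out[y_col].astype(float)
    # .groups maps each group label to the original row labels of that group
    for _, rows in frame.groupby(group_col).groups.items():
        x = frame.loc[rows, x_col]
        # Build a Series of y indexed by x, sorted so interpolation runs along x
        y_by_x = pd.Series(
            out.loc[rows, y_col].to_numpy(), index=x.to_numpy()
        ).sort_index()
        filled = y_by_x.interpolate(method="index")
        # Look up the filled value for each row's x and write it back in place
        out.loc[rows, y_col] = x.map(filled).to_numpy()
    return out


result = interpolate_by_group(df, "type", "avg_speed", "max_speed")
print(result)
```

With the sample data, the rows stay in their original order (index 0 to 5, columns type, avg_speed, max_speed) and the two missing max_speed values become 150.0 and 1500.0.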