I am building a list of validation losses for a regression model, and they come in this format:
mylist = ['[72.49191836014535]', '[83.83327374257702]', '[72.48327325617225]',
'[66.98897377186994]', '[71.13875892170039]', '[64.3820106481657]',
'[73.28287317220448]', '[79.7119315804787]', '[79.55777844179023]',
'[89.62320741326292]']
How can I put them into a plain list so that I can compute the mean/deviation?
[eval(i)[0] for i in mylist]
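Since eval executes whatever code the string contains, a safer variant may be worth considering when the strings come from a file or another untrusted source. The following is a minimal sketch that swaps in ast.literal_eval from the standard library (not part of the original answer); the shortened sample list is only for illustration.

```python
import ast

# Sample strings in the same format as the question's list (shortened here).
mylist = ['[72.49191836014535]', '[83.83327374257702]', '[72.48327325617225]']

# ast.literal_eval only accepts Python literals (numbers, lists, strings, ...),
# so a string containing arbitrary code raises an exception instead of running.
clean_list = [ast.literal_eval(s)[0] for s in mylist]
print(clean_list)
```

Each string parses to a one-element list, so indexing with [0] yields the float, exactly as in the eval version.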
Full example with numpy:
import numpy as np
mylist = ['[72.49191836014535]', '[83.83327374257702]', '[72.48327325617225]',
'[66.98897377186994]', '[71.13875892170039]', '[64.3820106481657]',
'[73.28287317220448]', '[79.7119315804787]', '[79.55777844179023]',
'[89.62320741326292]']
clean_list = np.array([eval(i)[0] for i in mylist])
#array([72.49191836, 83.83327374, 72.48327326, 66.98897377, 71.13875892,
# 64.38201065, 73.28287317, 79.71193158, 79.55777844, 89.62320741])
clean_list.mean()
# 75.34939993083671
Full example without numpy:
mylist = ['[72.49191836014535]', '[83.83327374257702]', '[72.48327325617225]',
'[66.98897377186994]', '[71.13875892170039]', '[64.3820106481657]',
'[73.28287317220448]', '[79.7119315804787]', '[79.55777844179023]',
'[89.62320741326292]']
clean_list = [eval(i)[0] for i in mylist]
#[72.49191836, 83.83327374, 72.48327326, 66.98897377, 71.13875892,
# 64.38201065, 73.28287317, 79.71193158, 79.55777844, 89.62320741]
average_list = sum(clean_list) / len(clean_list)
# average_list is approximately 75.349
Without eval:
import numpy as np
mylist = ['[72.49191836014535]', '[83.83327374257702]', '[72.48327325617225]',
'[66.98897377186994]', '[71.13875892170039]', '[64.3820106481657]',
'[73.28287317220448]', '[79.7119315804787]', '[79.55777844179023]',
'[89.62320741326292]']
mylist = np.array([float(i[1:-1]) for i in mylist])
mylist.mean()
Output:
75.34939993083671
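The question also asks about the deviation, which the example above does not compute. As a rough sketch, assuming "deviation" means the standard deviation, numpy's std method can be used; note that it defaults to the population formula (ddof=0), while ddof=1 gives the sample standard deviation. The variable names here are illustrative.

```python
import numpy as np

mylist = ['[72.49191836014535]', '[83.83327374257702]', '[72.48327325617225]',
          '[66.98897377186994]', '[71.13875892170039]', '[64.3820106481657]',
          '[73.28287317220448]', '[79.7119315804787]', '[79.55777844179023]',
          '[89.62320741326292]']

# Strip the surrounding brackets by slicing, then convert to float.
values = np.array([float(s[1:-1]) for s in mylist])

mean = values.mean()
pop_std = values.std()           # population standard deviation (ddof=0, the default)
sample_std = values.std(ddof=1)  # sample standard deviation (divides by n - 1)

print(mean, pop_std, sample_std)
```

For a handful of validation runs treated as a sample, ddof=1 is usually the more appropriate choice, and it matches what statistics.stdev returns.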
You can use a regular expression to remove the [ and ], then convert the values to float:
import re
mylist = [
"[72.49191836014535]",
"[83.83327374257702]",
"[72.48327325617225]",
"[66.98897377186994]",
"[71.13875892170039]",
"[64.3820106481657]",
"[73.28287317220448]",
"[79.7119315804787]",
"[79.55777844179023]",
"[89.62320741326292]",
]
data = [float(re.sub(r"[\[\]]", "", v)) for v in mylist]
Output:
[72.49191836014535,
83.83327374257702,
72.48327325617225,
66.98897377186994,
71.13875892170039,
64.3820106481657,
73.28287317220448,
79.7119315804787,
79.55777844179023,
89.62320741326292]
Another solution, similar to the ones above:
from statistics import mean
mylist = ['[72.49191836014535]', '[83.83327374257702]', '[72.48327325617225]',
'[66.98897377186994]', '[71.13875892170039]', '[64.3820106481657]',
'[73.28287317220448]', '[79.7119315804787]', '[79.55777844179023]',
'[89.62320741326292]']
# Use a name other than "list" so the built-in type is not shadowed.
values = [eval(i)[0] for i in mylist]
print(mean(values))
Result:
75.3493999308367
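Expanding on the answers above by 马吉斯 and 埃里克日:
import statistics
# Parse each string into a one-element list and take its single value.
data = [eval(a)[0] for a in mylist]
stats = [sum, statistics.mean, statistics.stdev, statistics.variance]
# Print the name of each statistic next to its value.
for stat in stats:
    print(stat.__name__, stat(data))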