我需要在Pandas数据框中沿行计算成1列的各种百分位数。例如:
df['P90'] = df[['col1','col2','col3','col4','col5']].apply(quantile(0.9), axis=1)
df['P50'] = df[['col1','col2','col3','col4','col5']].apply(quantile(0.5), axis=1)
我有以下数据框:
ID 2019/31 2019/32 2019/33 2019/34 2019/35 2019/36 2019/37 2019/38 2019/39 2019/40
258101 67000
258102 56750 19105 35990 41250 44425 51275 1071 8125 16375
258103 8528 6853 3291 3000 5640 11248
258104 27532 19523 12092 7933 8675 435 1045 5115 1450
258105 40000 285500 16500
我需要以下格式的输出:
ID 2019/31 2019/32 2019/33 2019/34 2019/35 2019/36 2019/37 2019/38 2019/39 2019/40 P_50 P_90
258101 67000 x1 x2
258102 56750 19105 35990 41250 44425 51275 1071 8125 16375 x3 x4
258103 8528 6853 3291 3000 5640 11248 x5 x6
258104 27532 19523 12092 7933 8675 435 1045 5115 1450 x7 x8
258105 40000 285500 16500 x9 x10
我尝试了以下操作:
cols = ['2019/31', '2019/32', '2019/33', '2019/34', '2019/35', '2019/36', '2019/37', '2019/38', '2019/39', '2019/40']
df['P_50'] = df[cols].apply(np.median, axis=1)
df['P_50'] = df[cols].apply(np.quantile(0.5), axis=1)
perc99 = np.vectorize(lambda x: np.percentile(x, 50))
df['P_50'] = perc99(df[cols].values)
均未提供所需的输出。
我从发帖中刮了你的df。这是您要尝试执行的操作吗?:
df['P_50'] = df.iloc[:,1:-1].median(axis=1)
df['P_50'] = df.iloc[:,1:-1].quantile(0.5, axis=1)
此不包括ID列
我认为,这将起作用:
df['P_50'] = np.median(df.values, axis=1)
df['P_90'] = np.quantile(df.values, 0.9, axis=1)