I have the following table with the columns id, score1, score2, score3, and score4.
| id | score1 | score2 | score3 | score4 |
|----|--------|---------|----------|---------|
| 1 | 0.05 | 0.0608 | 0.476464 | 0.53535 |
| 1 | 0.08 | 0.0333 | 0.8263 | 0.9463 |
| 1 | 0.05 | 0.0926 | 0.8694 | 0.9903 |
| 2 | 0.08 | 0.0425 | 0.1948 | 0.3958 |
| 2 | 0.09 | 0.0992 | 0.1238 | 0.1937 |
| 4 | 0.1 | 0.0627 | 0.0738 | 0.0987 |
| 4 | 0.05 | 0.06262 | 0.721 | 0.12 |
| 4 | 0.04 | 0.05227 | 0.0825 | 0.283 |
| 4 | 0.02 | 0.04728 | 0.0628 | 0.0936 |
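I want to create one plot per id (here, separate plots for ids 1, 2, and 4). Each plot should show how many rows of that id fall into each range of the 4 scores.
This is what I have tried so far.
Using SQL, I created separate columns that categorize each score, and then plotted them with the Python code below. With this approach I can only generate a single plot.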
plt.figure(figsize=(10, 6))
sns.countplot(x='score', data=df1)
plt.xlabel('Score')
plt.xticks(rotation=90)
plt.ylabel('Count')
plt.title('Distribution per score - id 1')
plt.show()
Is there a simpler way to do all of this in Python alone and produce multiple plots?
This can be done with a loop over the ids and matplotlib's histogram binning, as follows:
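import matplotlib.pyplot as plt
import pandas as pd

# Rebuild the table from the question as a DataFrame.
ids = [1, 1, 1, 2, 2, 4, 4, 4, 4]
score1 = [0.05, 0.08, 0.05, 0.08, 0.09, 0.1, 0.05, 0.04, 0.02]
score2 = [0.0608, 0.0333, 0.0926, 0.0425, 0.0992, 0.0627, 0.06262, 0.05227, 0.04728]
score3 = [0.476464, 0.8263, 0.8694, 0.1948, 0.1238, 0.0738, 0.721, 0.0825, 0.0628]
score4 = [0.53535, 0.9463, 0.9903, 0.3958, 0.1937, 0.0987, 0.12, 0.283, 0.0936]
scoredf = pd.DataFrame({'id': ids, 'score1': score1, 'score2': score2,
                        'score3': score3, 'score4': score4})

uniqueids = scoredf['id'].unique()
for i in uniqueids:
    # Keep only the rows for this id and drop the id column,
    # so that only the four score columns are binned.
    tempdf = scoredf[scoredf['id'] == i].drop(['id'], axis=1)
    # Each score column is drawn as its own series, using 3 shared bins.
    plt.hist(tempdf, bins=3, label=list(tempdf.columns))
    plt.xlabel('Score')
    plt.xticks(rotation=90)
    plt.ylabel('Count')
    plt.title('Distribution per score - id ' + str(i))
    plt.legend()
    plt.show()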
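If you would rather control the score ranges yourself (as you did with the SQL categories) instead of letting `plt.hist` pick 3 bins per plot, one option is to count the rows per range first and then draw a bar chart per id. The following is only a sketch: the bin edges and labels (`BIN_EDGES`, `BIN_LABELS`) and the helper names `bin_counts` and `plot_counts` are my own assumptions, not something from the question, so replace the ranges with whatever categories you actually use.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same data as in the question.
scoredf = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 4, 4, 4, 4],
    "score1": [0.05, 0.08, 0.05, 0.08, 0.09, 0.1, 0.05, 0.04, 0.02],
    "score2": [0.0608, 0.0333, 0.0926, 0.0425, 0.0992, 0.0627, 0.06262, 0.05227, 0.04728],
    "score3": [0.476464, 0.8263, 0.8694, 0.1948, 0.1238, 0.0738, 0.721, 0.0825, 0.0628],
    "score4": [0.53535, 0.9463, 0.9903, 0.3958, 0.1937, 0.0987, 0.12, 0.283, 0.0936],
})

# Assumed score ranges (hypothetical); np.histogram treats each bin as
# [low, high), except the last one, which also includes its upper edge.
BIN_EDGES = [0.0, 0.25, 0.5, 0.75, 1.0]
BIN_LABELS = ["0-0.25", "0.25-0.5", "0.5-0.75", "0.75-1"]


def bin_counts(df, edges=BIN_EDGES, labels=BIN_LABELS):
    """Return {id: DataFrame}, rows = score ranges, columns = score columns."""
    score_cols = [c for c in df.columns if c != "id"]
    result = {}
    for id_value, group in df.groupby("id"):
        # np.histogram returns (counts, edges); only the counts are needed.
        result[id_value] = pd.DataFrame(
            {col: np.histogram(group[col], bins=edges)[0] for col in score_cols},
            index=labels,
        )
    return result


def plot_counts(counts_by_id):
    """Draw one grouped bar chart per id, side by side in a single figure."""
    n = len(counts_by_id)
    fig, axes = plt.subplots(1, n, figsize=(5 * n, 4), sharey=True, squeeze=False)
    for ax, (id_value, counts) in zip(axes[0], counts_by_id.items()):
        counts.plot(kind="bar", ax=ax)
        ax.set_xlabel("Score range")
        ax.set_ylabel("Count")
        ax.set_title(f"Distribution per score - id {id_value}")
    fig.tight_layout()
    return fig


counts_by_id = bin_counts(scoredf)
```

Calling `plot_counts(counts_by_id)` followed by `plt.show()` should give one panel per id in a single window; if you prefer separate windows as in the loop above, create a new figure per id instead. Because every id uses the same fixed ranges, the panels are directly comparable, which is not the case when each histogram chooses its own bins.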