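I'm writing code for exploratory data analysis, and as part of that I want to plot some of the variables in the dataset. I want a function that generates a plot object, which I can then call and display in a Jupyter Notebook as needed. In R I can do this: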
# install.packages("tidyverse")
# install.packages("ggpubr")
supress_all <- function(e) {suppressPackageStartupMessages(suppressWarnings(e))}
supress_all(library(tidyverse))
supress_all(library(ggpubr))
# Adjust size of the plots in jupyter
options(repr.plot.width = 10, repr.plot.height = 4)
make_me_a_plot <- function(data, x_name, y_name) {
  res <- ggplot() +
    geom_point(aes(x = data[[x_name]], y = data[[y_name]])) +
    labs(title = paste0(x_name, " vs ", y_name), x = x_name, y = y_name)
  return(res)
}
p1 <- make_me_a_plot(mtcars, "mpg", "hp")
p2 <- make_me_a_plot(mtcars, "mpg", "wt")
p3 <- make_me_a_plot(mtcars, "mpg", "qsec")
Then, when I want to display my plots, I can do something like this:
# Plot just 2 plots and ignore the last one generated - p3
ggarrange(p1, p2, ncol = 2, nrow = 1)
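The plots p1, p2, and p3 remain available and can be reused multiple times or modified on the fly. How can I achieve the same thing in Python? Below is sample code that does not work.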
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
plt.ioff()
mtcars = sns.load_dataset('mpg')
x_name = 'mpg'
y_name = 'horsepower'
def make_me_a_plot(data, x_name, y_name):
    res = plt.scatter(x=data[[x_name]], y=data[[y_name]])
    return res
p1 = make_me_a_plot(mtcars, 'mpg', 'horsepower')
p2 = make_me_a_plot(mtcars, 'mpg', 'weight')
p3 = make_me_a_plot(mtcars, 'mpg', 'acceleration')
# What next? plt.show() will just draw all of the plots on the same figure.
Calling plt.show() just displays all of the plots together, which is not what I want.
You can use plotnine to get ggplot functionality and syntax in Python, and patchworklib as a replacement for ggarrange from ggpubr (not a 1:1 equivalent).
With plotnine:
from plotnine import ggplot, aes, geom_point, labs
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import patchworklib as pw
plt.ioff()
mtcars = sns.load_dataset("mpg")
x_name = "mpg"
y_name = "horsepower"
def make_me_a_plot(data: pd.DataFrame, x_name: str, y_name: str) -> pw.Brick:
    res = (
        ggplot(data)
        + geom_point(aes(x=x_name, y=y_name))
        + labs(title=f"{x_name} vs {y_name}", x=x_name, y=y_name)
    )
    return pw.load_ggplot(res)
p1 = make_me_a_plot(mtcars, "mpg", "horsepower")
p2 = make_me_a_plot(mtcars, "mpg", "weight")
p3 = make_me_a_plot(mtcars, "mpg", "acceleration")
p1 | p2
With matplotlib/seaborn:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import patchworklib as pw
plt.ioff()
sns.set_style("darkgrid")
mtcars = sns.load_dataset("mpg")
x_name = "mpg"
y_name = "horsepower"
def make_me_a_plot(
    data: pd.DataFrame,
    x_name: str,
    y_name: str,
) -> pw.Brick:
    ax = pw.Brick(figsize=(6, 4))
    sns.scatterplot(
        x=x_name,
        y=y_name,
        data=data,
        ax=ax,
        s=4**2,
    )
    ax.set_title(f"{x_name} vs. {y_name}")
    return ax
p1 = make_me_a_plot(mtcars, "mpg", "horsepower")
p2 = make_me_a_plot(mtcars, "mpg", "weight")
p3 = make_me_a_plot(mtcars, "mpg", "acceleration")
p1 | p2
You can use | to place plots next to each other (in a row) and / to place them below each other (in a column).
Note: patchworklib, and more specifically load_ggplot, does not work with the latest version of plotnine. Install plotnine==0.12.4.
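To catch a mismatched install early, a notebook can check the installed plotnine version before calling load_ggplot. This is a small sketch using only the standard library; the helper name plotnine_pin_status is hypothetical, and it only tests for an exact match with the version named in the note, since the note does not say which other versions work:

```python
from importlib.metadata import PackageNotFoundError, version

# The version named in the note; other versions are untested here.
PINNED_PLOTNINE = "0.12.4"


def plotnine_pin_status(pinned: str = PINNED_PLOTNINE) -> str:
    """Report whether the installed plotnine matches the pinned version."""
    try:
        installed = version("plotnine")
    except PackageNotFoundError:
        return "missing"
    if installed == pinned:
        return "ok"
    return f"mismatch ({installed})"


print(plotnine_pin_status())
```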