Creating a heatmap in R with a unique gradient for each column

Question

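I am currently using ggplot2 to create a heatmap in R. Each column in my dataset represents a different variable, and I would like to assign a distinct color gradient to each variable. However, I am having trouble achieving this.

I have a dataset, data, containing information on different species and various attributes. Here is a sample subset of my data: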

Species                     Total   LSE     Ortholog    Truncated   Pseudogenes
Wirenia_argentea           258     115     143         19          10
Gymnomenia_pellucida       260     96      164         7           3
Epimenia_babai             511     350     161         15          68
Acanthochitona_crinita     220     52      168         10          0
Acanthopleura_granulata     157     31     126          2           9
Mopalia_swanii             527     278     249         31          104
Mopalia_vespertina         491     249     242         13          57
Nautilus_pompilius         411     146     265         14          35
Argonauta_argo             137      5      132          0           4
Octopus_bimaculoides       192      11     181          2           4
Octopus_minor              236      43     193          3          NA
Octopus_sinensis           203      23     180          7           2
Octopus_vulgaris           329      51     278          2           5
Octopus_maya               161      33     128          2           8
Octopus_mimus               78      10      68          1           4
Octopus_insularis          170      43     127          1          10
Octopus_rubescens          197      42     155          5          16
Hapalochlaena_maculosa     172       3     169          0           0
Muusoctopus_leioderma*     152      10     142          1           2
Muusoctopus_longibrachus*  125       5     120          1           0
Japetella_diaphana          56      13      43          1           2
Sepia_pharaonis            162      35     127          0           7
Euprymna_scolopes          282      29     253          2           4
Octopoteuthis_deletron       46       6      40          0          NA
Watasenia_scintillans       52      10      42          3           3
Architeuthis_dux           323      34     289          3           7
Laevipilina_antarctica     164      62     102          6          NA
Gadila_tolmiei             140      53      87          9           1
Alviniconcha_marisindica   378     188     190          5          57
Batillaria_attramentaria   444     163     281         19          15
Melanoides_tuberculata     226     111     115         12          42
Babylonia_areolata         878     645     233         16          74
Conus_betulinus           1210    1071     139        265         773
Conus_consors             1560    1226     334        200         418

I have melted this data into long format for ggplot2, producing a data frame melted_data with the columns Species, variable and value. For example:

Species                     variable    value
Wirenia argentea           Total       258
Gymnomenia pellucida       Total       260
Epimenia babai             Total       511
Acanthochitona crinita     Total       220
Acanthopleura granulata    Total       157
Mopalia swanii             Total       527
Mopalia vespertina         Total       491
Laevipilina antarctica     Total       164
Gadila tolmiei             Total       140
Wirenia argentea           LSE         115
Gymnomenia pellucida       LSE         96
Epimenia babai             LSE         350
Acanthochitona crinita     LSE         52
Acanthopleura granulata    LSE         31
Mopalia swanii             LSE         278
Mopalia vespertina         LSE         249
Laevipilina antarctica     LSE         62
Gadila tolmiei             LSE         53
Wirenia argentea           Ortholog    143
Gymnomenia pellucida       Ortholog    164
Epimenia babai             Ortholog    161
Acanthochitona crinita     Ortholog    168
Acanthopleura granulata    Ortholog    126
Mopalia swanii             Ortholog    249
Mopalia vespertina         Ortholog    242
Laevipilina antarctica     Ortholog    102
Gadila tolmiei             Ortholog    87
Wirenia argentea           Truncated   19
Gymnomenia pellucida       Truncated   7
Epimenia babai             Truncated   15
Acanthochitona crinita     Truncated   10
Acanthopleura granulata    Truncated   2
Mopalia swanii             Truncated   31
Mopalia vespertina         Truncated   13
Laevipilina antarctica     Truncated   6
Gadila tolmiei             Truncated   9
Wirenia argentea           Pseudogenes 10
Gymnomenia pellucida       Pseudogenes 3
Epimenia babai             Pseudogenes 68
Acanthochitona crinita     Pseudogenes 0
Acanthopleura granulata    Pseudogenes 9
Mopalia swanii             Pseudogenes 104
Mopalia vespertina         Pseudogenes 57
Laevipilina antarctica     Pseudogenes NA
Gadila tolmiei             Pseudogenes 1
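
For anyone who wants to reproduce the problem, the nine-species excerpt above can be rebuilt roughly as follows. This is only a sketch in base R: the name wide is hypothetical, and the reshaping is done by hand rather than with melt(), so that no extra package is needed. It should yield the same three columns as melted_data.

```r
# Wide-format excerpt; values copied from the tables in the question
wide <- data.frame(
  Species = c("Wirenia_argentea", "Gymnomenia_pellucida", "Epimenia_babai",
              "Acanthochitona_crinita", "Acanthopleura_granulata",
              "Mopalia_swanii", "Mopalia_vespertina",
              "Laevipilina_antarctica", "Gadila_tolmiei"),
  Total       = c(258, 260, 511, 220, 157, 527, 491, 164, 140),
  LSE         = c(115,  96, 350,  52,  31, 278, 249,  62,  53),
  Ortholog    = c(143, 164, 161, 168, 126, 249, 242, 102,  87),
  Truncated   = c( 19,   7,  15,  10,   2,  31,  13,   6,   9),
  Pseudogenes = c( 10,   3,  68,   0,   9, 104,  57,  NA,   1)
)

# Names of the value columns, in their original order
vars <- setdiff(names(wide), "Species")

# Stack the value columns on top of each other (long format)
melted_data <- data.frame(
  Species  = rep(gsub("_", " ", wide$Species), times = length(vars)),
  variable = factor(rep(vars, each = nrow(wide)), levels = vars),
  value    = unlist(wide[vars], use.names = FALSE)
)

str(melted_data)
```

Keeping variable as a factor with levels in the original column order means the heatmap columns are drawn in that order instead of alphabetically.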

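I want to create a heatmap with the species names on the y-axis and the variables on the x-axis. Each column (variable) must have its own gradient. For example, I want the gradient of the Total column to differ from that of the LSE column, and so on (otherwise the Total column will always hold the highest values in the heatmap).

I tried to create the heatmap with the following code: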

library(ggplot2)
library(reshape2)  # melt() is assumed to come from reshape2

# Melt the data frame to long format for ggplot
melted_data <- melt(data, id.vars = "Species")

# Remove underscores from species names
melted_data$Species <- gsub("_", " ", melted_data$Species)

# Define the breakpoints and corresponding colors
# (note: breaks is never passed to the fill scale below, so it has no effect)
breaks <- c(0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1)
colors <- c("#e9eca7", "#c9f6c7", "#a8ecd2", "#92dfdc", "#8bd0df", "#5b8dce", "#4575b4", "#fca562", "#fc8d59")

# Plot the heatmap
heatmap_plot <- ggplot(melted_data, aes(x = variable, y = Species, fill = value)) +
  geom_tile(color = "white") +
  geom_text(aes(label = round(value, 2)), color = "black") +
  scale_fill_gradientn(colors = colors, na.value = "white") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_x_discrete() +
  coord_fixed(ratio = 1)

# Print the plot
print(heatmap_plot)

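However, this code applies the same gradient to all columns. I am looking for a way to assign a unique gradient to each column.

I want to create a heatmap in which each column has a unique gradient, which would allow better visualization of how the data are distributed across variables.

I am looking for guidance on modifying the code to achieve this. Specifically, I need help assigning a separate gradient to each column of the heatmap. Any suggestions or insights would be greatly appreciated.

Thanks!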

r ggplot2 heatmap
1 Answer

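One possible approach is to split the data by variable and create several plots, each with its own color gradient, then arrange them side by side with cowplot, patchwork or a similar package. The catch is the legend: I had to drop it entirely, because I could not see an easy way to show all 5 gradients when the variables are on different scales.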

library(ggplot2)
library(RColorBrewer)

# df is assumed to be the long-format data frame from the question
df <- melted_data

# One sequential palette per variable (add names here if there are more than
# 5 variables). These ColorBrewer palettes provide at most 9 colors.
palettes <- c("Blues", "Greens", "Reds", "Purples", "YlOrRd")
colors <- lapply(palettes, function(p) brewer.pal(9, p))

# One data frame per variable
df_split <- split(df, df$variable)

gg <- lapply(seq_along(df_split), function(i) {
  g <- ggplot(df_split[[i]], aes(x = 1, y = Species, fill = value)) +
    geom_tile(color = "white") +
    geom_text(aes(label = round(value, 2)), color = "grey40") +
    scale_fill_gradientn(colors = colors[[i]], na.value = "white") +
    scale_x_discrete(expand = c(0, 0)) +
    # coord_fixed(ratio = 1) +
    theme_minimal() +
    guides(fill = "none", x = "none") +
    labs(x = names(df_split)[i]) +  # variable name as the column label
    theme(axis.title.x = element_text(angle = 90, hjust = 0, vjust = 0.5),
          plot.margin = unit(c(1, 0, 1, 0), "cm"))
  # Keep the species labels on the first panel only
  if (i > 1) {
    g <- g + theme(axis.title.y = element_blank(),
                   axis.text.y = element_blank(),
                   axis.ticks.y = element_blank())
  }
  g
})

# The first panel is wider because it carries the species labels
cowplot::plot_grid(plotlist = gg, nrow = 1, align = "h",
                   rel_widths = c(3, rep(1, length(gg) - 1)))
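
If separate hues per column are not strictly required, a different route (not the method of the answer above) is to rescale each variable to the range 0 to 1 within its own column and keep a single plot with one shared gradient. The sketch below assumes a small illustrative data frame with values copied from the question; rescale01 and scaled are hypothetical names introduced only for this example.

```r
library(ggplot2)

# Small illustrative long-format data frame (values taken from the question)
df <- data.frame(
  Species  = rep(c("Wirenia argentea", "Epimenia babai", "Gadila tolmiei"),
                 times = 2),
  variable = factor(rep(c("Total", "Pseudogenes"), each = 3),
                    levels = c("Total", "Pseudogenes")),
  value    = c(258, 511, 140, 10, 68, 1)
)

# Map a numeric vector onto [0, 1] using its own minimum and maximum
rescale01 <- function(v) {
  rng <- range(v, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    # Constant column: put every non-missing value in the middle of the scale
    return(ifelse(is.na(v), NA_real_, 0.5))
  }
  (v - rng[1]) / (rng[2] - rng[1])
}

# Apply the rescaling separately within each variable
df$scaled <- ave(df$value, df$variable, FUN = rescale01)

# Fill by the rescaled value, but label the tiles with the raw value
p <- ggplot(df, aes(x = variable, y = Species, fill = scaled)) +
  geom_tile(color = "white") +
  geom_text(aes(label = value), color = "black") +
  scale_fill_gradientn(colors = c("#e9eca7", "#8bd0df", "#4575b4"),
                       na.value = "white",
                       name = "Relative to\ncolumn range") +
  theme_minimal()

print(p)
```

With this variant the Total column no longer dominates, and a single legend is enough because every column shares the same 0 to 1 scale. The trade-off is that colors are only comparable within a column, not across columns.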
