Generating a 96- or 384-well plate layout in R


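I'm trying to write some code that takes a .csv file containing sample names as input and outputs a data.frame containing the sample names together with their well positions in 96-well or 384-well plate format (A1, B1, C1, ...). For those who don't know, a 96-well plate has 8 rows labeled alphabetically (A, B, C, D, E, F, G, H) and 12 columns labeled numerically (1:12); a 384-well plate has 16 rows labeled alphabetically (A:P) and 24 columns labeled numerically (1:24). I'm trying to write code that generates either format (two separate functions would be fine), with samples assigned either DOWN (A1, B1, C1, D1, E1, F1, G1, H1, A2, ...) or ACROSS (A1, A2, A3, A4, A5, ...).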

So far I've figured out how to get the row names fairly easily:

rowLetter <- rep(LETTERS[1:8], length.out = variable)
#variable will be based on how many samples I have

I just can't figure out how to apply the numeric column names correctly... I've tried:

colNumber <- rep(1:12, times = variable) 

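But it's not that simple. Going "down", all 8 rows have to be filled before the column number increases by 1; going "across", all 12 columns have to be filled before the row letter advances by 1.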

Edit:

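Here's a clunky version. It takes the number of samples you have, a "plate format" (which doesn't work yet), and a direction, and returns a data frame with the well and plate number. Next I'm going to a) fix the plate format so it works properly, and b) make the function take a list of sample names or IDs or whatever and return the sample name, well position, and plate number!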

plateLayout <- function(numOfSamples, plateFormat = 96, direction = "DOWN"){
  #This assumes that each well will be filled in order. I may need to change this, but     lets get it working first.

  #Calculate the number of plates required
  platesRequired <- ceiling(numOfSamples/plateFormat)
  rowLetter <- character(0)
  colNumber <- numeric(0)
  plateNumber <- numeric(0)

  #The following will work if the samples are going DOWN
  if(direction == "DOWN"){
    for(k in 1:platesRequired){
     rowLetter <- c(rowLetter, rep(LETTERS[1:8], length.out = 96))  
      for(i in 1:12){
       colNumber <- c(colNumber, rep(i, times = 8))
      }
     plateNumber <- c(plateNumber, rep(k, times = 96))
    }  
  plateLayout <- paste0(rowLetter, colNumber)
  plateLayout <- data.frame(plateLayout, plateNumber)
  plateLayout <- plateLayout[1:numOfSamples,]
  return(plateLayout)
  }

  #The following will work if the samples are going ACROSS 
  if(direction == "ACROSS"){
    for(k in 1:platesRequired){
      colNumber <- c(colNumber, rep(1:12, times = 8))
      for(i in 1:8){
        rowLetter <- c(rowLetter, rep(LETTERS[i], times = 12))
        }
      plateNumber <- c(plateNumber, rep(k, times = 96))
      }
    plateLayout <- paste0(rowLetter, colNumber)
    plateLayout <- data.frame(plateLayout, plateNumber)
    plateLayout <- plateLayout[1:numOfSamples,]
    return(plateLayout)
  }
}

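Does anyone have ideas for anything else that would make this cool? I'll be using this function to generate .csv or .txt files used as sample-name imports for different instruments, so I'm somewhat limited in terms of "cool features", but I think it would be neat to use ggplot to make a figure showing the plate and the sample names?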

r bioinformatics
5 Answers

7 votes

You don't need a for loop. Here's a start:

#some sample ids
ids <- c(LETTERS, letters)
#plate size:
n <- 96
nrow <- 8
samples <- character(n)
samples[seq_along(ids)] <- ids

samples <- matrix(samples, nrow=nrow)
colnames(samples) <- seq_len(n/nrow)
rownames(samples) <- LETTERS[seq_len(nrow)]

#   1   2   3   4   5   6   7   8  9  10 11 12
# A "A" "I" "Q" "Y" "g" "o" "w" "" "" "" "" ""
# B "B" "J" "R" "Z" "h" "p" "x" "" "" "" "" ""
# C "C" "K" "S" "a" "i" "q" "y" "" "" "" "" ""
# D "D" "L" "T" "b" "j" "r" "z" "" "" "" "" ""
# E "E" "M" "U" "c" "k" "s" ""  "" "" "" "" ""
# F "F" "N" "V" "d" "l" "t" ""  "" "" "" "" ""
# G "G" "O" "W" "e" "m" "u" ""  "" "" "" "" ""
# H "H" "P" "X" "f" "n" "v" ""  "" "" "" "" ""

library(reshape2)
samples <- melt(samples)
samples$position <- paste0(samples$Var1, samples$Var2)

#    Var1 Var2 value position
# 1     A    1     A       A1
# 2     B    1     B       B1
# 3     C    1     C       C1
# 4     D    1     D       D1
# 5     E    1     E       E1
# 6     F    1     F       F1
# 7     G    1     G       G1
# 8     H    1     H       H1
# 9     A    2     I       A2
# 10    B    2     J       B2
# 11    C    2     K       C2
# 12    D    2     L       D2
# 13    E    2     M       E2
# 14    F    2     N       F2
# 15    G    2     O       G2
# 16    H    2     P       H2
# 17    A    3     Q       A3
# 18    B    3     R       B3
# 19    C    3     S       C3
# 20    D    3     T       D3
# 21    E    3     U       E3
# 22    F    3     V       F3
# 23    G    3     W       G3
# 24    H    3     X       H3
# 25    A    4     Y       A4
# 26    B    4     Z       B4
# 27    C    4     a       C4
# 28    D    4     b       D4
# 29    E    4     c       E4
# 30    F    4     d       F4
# 31    G    4     e       G4
# 32    H    4     f       H4
# 33    A    5     g       A5
# 34    B    5     h       B5
# 35    C    5     i       C5
# 36    D    5     j       D5
# 37    E    5     k       E5
# 38    F    5     l       F5
# 39    G    5     m       G5
# 40    H    5     n       H5
# 41    A    6     o       A6
# 42    B    6     p       B6
# 43    C    6     q       C6
# 44    D    6     r       D6
# 45    E    6     s       E6
# 46    F    6     t       F6
# 47    G    6     u       G6
# 48    H    6     v       H6
# 49    A    7     w       A7
# 50    B    7     x       B7
# 51    C    7     y       C7
# 52    D    7     z       D7
# 53    E    7             E7
# 54    F    7             F7
# 55    G    7             G7
# 56    H    7             H7
# 57    A    8             A8
# 58    B    8             B8
# 59    C    8             C8
# 60    D    8             D8
# 61    E    8             E8
# 62    F    8             F8
# 63    G    8             G8
# 64    H    8             H8
# 65    A    9             A9
# 66    B    9             B9
# 67    C    9             C9
# 68    D    9             D9
# 69    E    9             E9
# 70    F    9             F9
# 71    G    9             G9
# 72    H    9             H9
# 73    A   10            A10
# 74    B   10            B10
# 75    C   10            C10
# 76    D   10            D10
# 77    E   10            E10
# 78    F   10            F10
# 79    G   10            G10
# 80    H   10            H10
# 81    A   11            A11
# 82    B   11            B11
# 83    C   11            C11
# 84    D   11            D11
# 85    E   11            E11
# 86    F   11            F11
# 87    G   11            G11
# 88    H   11            H11
# 89    A   12            A12
# 90    B   12            B12
# 91    C   12            C12
# 92    D   12            D12
# 93    E   12            E12
# 94    F   12            F12
# 95    G   12            G12
# 96    H   12            H12

Use the byrow argument to fill the matrix in the other direction:

#applied to the padded character vector, i.e. before the melt() step
samples <- matrix(samples, nrow=nrow, byrow=TRUE)

To fill multiple plates you can use basically the same idea, but with an array instead of a matrix.
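The array idea could look roughly like the sketch below. It is only an illustration in base R, assuming 96-well plates filled "down"; the sample ids and the names `padded`, `plates`, `idx` and `wells` are made up here and are not part of the answer above.

```r
# Hypothetical sample ids: enough to spill onto a second 96-well plate
ids <- sprintf("s%03d", 1:100)

nRow <- 8
nCol <- 12
wellsPerPlate <- nRow * nCol
nPlates <- ceiling(length(ids) / wellsPerPlate)

# Pad with empty strings so the ids fill whole plates
padded <- character(wellsPerPlate * nPlates)
padded[seq_along(ids)] <- ids

# A 3-d array is filled rows first, then columns, then plates ("down" order)
plates <- array(padded, dim = c(nRow, nCol, nPlates))

# Recover (row, column, plate) indices of the occupied wells
idx <- which(plates != "", arr.ind = TRUE)
wells <- data.frame(
  sample   = plates[idx],
  plate    = idx[, 3],
  position = paste0(LETTERS[idx[, 1]], idx[, 2]),
  stringsAsFactors = FALSE
)

head(wells, 3)
```

For the "across" direction, one option is to build the array with `dim = c(nCol, nRow, nPlates)` instead, so that the first index is the column and the second is the row.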


1 vote

I've never written this in R before, but it should be the same as in Perl, Python, or Java.

For row-major order (going across), the pseudocode algorithm is simple:

for each( i : 0..totalNumWells - 1){
   column   = (i % numColumns)
   row      = ((i % totalNumWells) / numColumns)   // integer division
}

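where numColumns is 12 for a 96-well plate or 24 for a 384-well plate, and totalNumWells is 96 or 384, respectively. This gives you the column and row indices in 0-based coordinates, which is ideal for accessing arrays.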

 wellName   = ABCs[row], column + 1

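where ABCs is an array of all the valid letters on your plate (or A-Z). The +1 converts from 0-based to 1-based; otherwise the first well would be A0 instead of A1.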
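Translated to R, the pseudocode above might look like the following sketch. The function name `wellNameAcross` is made up for illustration; `%%` is R's modulo operator and `%/%` is integer division. Because R vectors are 1-based, the row index also needs a +1 before looking up the letter.

```r
# Hypothetical helper: name of the i-th well (0-based i) in row-major order
wellNameAcross <- function(i, numColumns = 12, totalNumWells = 96) {
  column <- i %% numColumns                    # 0-based column index
  row <- (i %% totalNumWells) %/% numColumns   # 0-based row index
  paste0(LETTERS[row + 1], column + 1)         # shift both to 1-based
}

wellNameAcross(0:13)
# "A1" "A2" ... "A12" "B1" "B2"

# 384-well plate: last well
wellNameAcross(383, numColumns = 24, totalNumWells = 384)
# "P24"
```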

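I'd also like to point out that 384-well plates are often not filled in row-major order. I often see sequencing centers prefer a "checkerboard" pattern (A01, A03, A05, ... then A02, A04, A06, ..., B01, B03, ... etc.), so that four 96-well plates can be combined into a single 384-well plate without changing the layout, which also simplifies the picking robot's work. Computing the i-th well for that pattern is a much harder algorithm.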
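As a rough illustration only, here is one literal reading of the order listed above: within each row, the odd columns are visited first and then the even columns, before moving on to the next row. Real sequencing-center layouts may differ (for example, they may also interleave rows), so treat the helper `checkerboardWell` as a hypothetical sketch rather than a standard.

```r
# Hypothetical helper: i-th well (1-based i) when each row is visited
# odd columns first (01, 03, ...), then even columns (02, 04, ...)
checkerboardWell <- function(i, nCol = 24) {
  i0 <- i - 1
  row <- i0 %/% nCol    # 0-based row index
  k <- i0 %% nCol       # 0-based position within the row
  half <- nCol %/% 2
  col <- ifelse(k < half, 2 * k + 1, 2 * (k - half) + 2)
  sprintf("%s%02d", LETTERS[row + 1], col)
}

checkerboardWell(c(1, 2, 12, 13, 24, 25))
# "A01" "A03" "A23" "A02" "A24" "B01"
```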


0 votes

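The following code does what I set out to do. You can use it to lay out as many plates as you need, provided your import list is in order. It adds a "plateNumber" column indicating which batch each sample belongs to. It can only handle 96- or 384-well plates, but that's all I deal with, so that's fine.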

plateLayout <- function(numOfSamples, plateFormat = 96, direction = "DOWN"){
  #This assumes that each well will be filled in order.

  #Calculate the number of plates required
  platesRequired <- ceiling(numOfSamples/plateFormat)
  rowLetter <- character(0)
  colNumber <- numeric(0)
  plateNumber <- numeric(0)

  #define the number of columns and number of rows based on plate format (96 or 384 well plate)
  switch(as.character(plateFormat),
         "96" = {numberOfColumns = 12; numberOfRows = 8},
         "384" = {numberOfColumns = 24; numberOfRows = 16})

  #The following will work if the samples are going DOWN
  if(direction == "DOWN"){
    for(k in 1:platesRequired){
      rowLetter <- c(rowLetter, rep(LETTERS[1:numberOfRows], length.out = plateFormat))
      for(i in 1:numberOfColumns){
        colNumber <- c(colNumber, rep(i, times = numberOfRows))
      }
      plateNumber <- c(plateNumber, rep(k, times = plateFormat))
    }
    plateLayout <- paste0(rowLetter, colNumber)
    plateLayout <- data.frame(plateNumber, plateLayout)
    plateLayout <- plateLayout[1:numOfSamples,]
    return(plateLayout)
  }

  #The following will work if the samples are going ACROSS
  if(direction == "ACROSS"){
    for(k in 1:platesRequired){
      colNumber <- c(colNumber, rep(1:numberOfColumns, times = numberOfRows))
      for(i in 1:numberOfRows){
        rowLetter <- c(rowLetter, rep(LETTERS[i], times = numberOfColumns))
      }
      plateNumber <- c(plateNumber, rep(k, times = plateFormat))
    }
    plateLayout <- paste0(rowLetter, colNumber)
    plateLayout <- data.frame(plateNumber, plateLayout)
    plateLayout <- plateLayout[1:numOfSamples,]
    return(plateLayout)
  }
}

An example of how to use it follows:

#load whatever data you're going to use to get a plate layout on (sample ID's or names or whatever)
thisData <- read.csv("data.csv")

#make a data.frame containing your sample names and the function's output
    #alternatively you can use length() if you have a list
plateLayoutDataFrame <- data.frame(thisData$sampleNames, plateLayout(nrow(thisData), plateFormat = 96, direction = "DOWN"))

#It will return something similar to the following, depending on your selections
#data plateNumber plateLayout
#sample1           1          A1
#sample2           1          B1
#sample3           1          C1
#sample4           1          D1
#sample5           1          E1
#sample6           1          F1
#sample7           1          G1
#sample8           1          H1
#sample9           1          A2
#sample10          1          B2
#sample11          1          C2
#sample12          1          D2
#sample13          1          E2
#sample14          1          F2
#sample15          1          G2

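That wraps up this function for now. Roland provided a nice approach that is less verbose, but I wanted to avoid external packages as much as possible. I'm now working on a shiny app that actually uses this! I'd like it to automatically subset by "plateNumber" and write each plate to its own file... For more on that, see: Automatic multi-file download in R-Shiny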
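Outside of shiny, splitting by plate and writing one file per plate could be sketched in base R as below. The `layout` data frame is a hypothetical stand-in shaped like the function's output, and the `plate_<n>.csv` file names and the use of `tempdir()` are assumptions for illustration only.

```r
# Hypothetical layout shaped like the output above: 100 samples, two plates
layout <- data.frame(
  sampleName  = sprintf("sample%d", 1:100),
  plateNumber = rep(1:2, times = c(96, 4)),
  plateLayout = c(paste0(rep(LETTERS[1:8], times = 12), rep(1:12, each = 8)),
                  paste0(LETTERS[1:4], 1)),
  stringsAsFactors = FALSE
)

outDir <- tempdir()   # assumption: write to a temporary directory

# One data frame per plate, then one .csv per data frame
byPlate <- split(layout, layout$plateNumber)
files <- file.path(outDir, sprintf("plate_%s.csv", names(byPlate)))
for (j in seq_along(byPlate)) {
  write.csv(byPlate[[j]], files[j], row.names = FALSE)
}
```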


0 votes

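A bit late to the thread, but using expand.grid can be very helpful. In addition, I've found that downstream processing sometimes benefits from being able to sort the well names. The leading zeros in this example help make sure that "A2" comes before "A10" in "A1", "A2", ... "A10", "A11".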

plateLayout <- function(nSamples, nPlates, plateFormat = c("96", "384"),
  direction = c("down", "across")) {

# process arguments
  nSamples <- as.integer(nSamples)
  plateFormat <- match.arg(plateFormat)
  plateFormat <- as.integer(plateFormat)
  direction <- match.arg(direction)
  nCol <- ifelse(plateFormat == 96, 12, 24)
  nRow <- ifelse(plateFormat == 96, 8, 16)
  if (missing(nPlates))
    nPlates <- ceiling(nSamples/plateFormat)
  
# use expand.grid and organize as 'plate', 'row' and 'column' 
  if (direction == "across") {
    v <- expand.grid(column = seq_len(nCol), row = LETTERS[1:nRow],
      plate = seq_len(nPlates), stringsAsFactors = FALSE)
    v <- v[c(3, 2, 1)]
  }
  else {
    v <- expand.grid(row = LETTERS[1:nRow], column = seq_len(nCol), 
      plate = seq_len(nPlates), stringsAsFactors = FALSE)
    v <- v[c(3, 1, 2)]
  }

# assemble data.frame
# note that the format string for sprintf provides a leading '0'
# change to "%s%d" to NOT use a leading zero
  well <- apply(v, 1, function(x) sprintf("%s%02d", x[2], as.integer(x[3])))
  plate <- data.frame(plate = v[[1]], well)[seq_len(nSamples),]
  return(plate)
}
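
The sorting benefit of the leading zero can be checked with a small standalone sketch. The well names here are made up, and `method = "radix"` is used only so the comparison does not depend on the session's locale.

```r
wells <- c("A1", "A2", "A10", "A11")

# Same names with a zero-padded, two-digit column number
padded <- sprintf("%s%02d", substr(wells, 1, 1), as.integer(substring(wells, 2)))

sort(wells, method = "radix")
# "A1"  "A10" "A11" "A2"    (A2 lands after A10 and A11)

sort(padded, method = "radix")
# "A01" "A02" "A10" "A11"   (plate order is preserved)
```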

-1 votes

This is how I do it.

put_samples_in_plates = function(sample_list, nwells=96, direction="across")
  {
  if(!nwells %in% c(96, 384)){
    stop("Invalid plate size")
  }
  nsamples = nrow(sample_list)
  nplates  = ceiling(nsamples/nwells);

  if(nwells==96){
    rows = LETTERS[1:8]
    cols = 1:12
  }else if(nwells==384){
    rows = LETTERS[1:16]
    cols = 1:24  
  }else{
    stop("Unrecognized nwells")
  }
  nrows = length(rows)
  ncols = length(cols)

  if(tolower(direction)=="down"){
    single_plate_df = data.frame(row = rep(rows, times=ncols), 
                                 col = rep(cols, each=nrows))    
  }else if(tolower(direction)=="across"){
    single_plate_df = data.frame(row = rep(rows, each=ncols), 
                                 col = rep(cols, times=nrows))    
  }else{
    stop("Unrecognized direction")
  }
  single_plate_df   = transform(single_plate_df, 
                                well = sprintf("%s%02d", row, col))  
  toobig_plate_df   = cbind(data.frame(plate=rep(1:nplates, each=nwells)), 
                            do.call("rbind", replicate(nplates, 
                                                       single_plate_df, 
                                                       simplify=FALSE)))
  res = cbind(sample_list, toobig_plate_df[1:nsamples,])
  return(res)}

# Quick test

a_sample_list = data.frame(x=1:386, y=rnorm(386))

r.096.across = put_samples_in_plates(sample_list = a_sample_list,
                                     nwells= 96, 
                                     direction="across")
r.096.down   = put_samples_in_plates(sample_list = a_sample_list,
                                     nwells= 96, 
                                     direction="down")
r.384.across = put_samples_in_plates(sample_list = a_sample_list,
                                     nwells=384, 
                                     direction="across")
r.384.down   = put_samples_in_plates(sample_list = a_sample_list,
                                     nwells=384, 
                                     direction="down")

Two things are worth noting in the function above:

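  • the use of the times and each arguments of the rep function to distinguish the "across" and "down" directions, and
  • the use of replicate to repeat the single plate as many times as needed, together with do.call to call rbind on the result.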
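
Both points can be seen on a toy 2 x 3 "plate" (a made-up size chosen so the output is easy to read; the variable names are for illustration only):

```r
rows <- LETTERS[1:2]
cols <- 1:3

# "down": the row letter cycles fastest, each column number is held for a full column
down <- paste0(rep(rows, times = length(cols)), rep(cols, each = length(rows)))
# "A1" "B1" "A2" "B2" "A3" "B3"

# "across": the column number cycles fastest, each row letter is held for a full row
across <- paste0(rep(rows, each = length(cols)), rep(cols, times = length(rows)))
# "A1" "A2" "A3" "B1" "B2" "B3"

# replicate() builds a list of identical plates; do.call("rbind", ...) stacks them
onePlate <- data.frame(well = across, stringsAsFactors = FALSE)
twoPlates <- do.call("rbind", replicate(2, onePlate, simplify = FALSE))
nrow(twoPlates)
# 6
```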