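I created a function named getExpressionLevel, and the assignment asks me to use it to replace the numbers in my data frame with the labels below. What do I need to use to achieve this?
The getExpressionLevel function: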
function(a) {
if (a<5) {
cat ("none")
}
if (a>=5&a<20) {
cat ("low")
}
if (a>=20&a<60) {
cat ("medium")
}
if (a>=60) {
cat ("high")
}
}
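The assignment is:
Create a data.frame named expression_levels that has 10 rows (one per gene) and 3 columns (one per cell line). Then compute the mean expression of each gene in each cell line and use the getExpressionLevel function to label the corresponding expression.
This is my current data.frame. Its values need to be replaced with the results of the getExpressionLevel function.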
genename Kc167 BG3 S2
1 Clic 7.333333 48.33333 75.00000
2 Treh 24.666667 12.66667 52.33333
3 bib 31.333333 79.33333 82.00000
4 CalpC 65.000000 69.33333 63.66667
5 tud 59.666667 81.66667 16.33333
6 cort 74.333333 50.66667 28.66667
7 S2P 72.000000 39.66667 50.66667
8 Mitofilin 38.333333 29.00000 54.66667
9 Oxp 73.666667 49.33333 42.66667
10 Ada1-2 87.333333 42.00000 28.00000
This is the expected data.frame:
Kc167 BG3 S2
Clic low medium high
Treh medium low medium
bib medium high high
CalpC high high high
tud medium high low
cort high medium medium
S2P high medium medium
Mitofilin medium medium medium
Oxp high medium medium
Ada1-2 high medium medium
A functional approach. It always helps to know how to use functions.
## sample data
library(data.table)
df <- data.table(genename = c('Clic','Treh','bib','CalpC'),
                 Kc167 = c(7.333,24.666,31.3333,65),
                 BG3 = c(48.33,12.66,79.33,69.33),
                 S2 = c(75.00,52.33,82.00,63.66))
## this function returns a label based on the following criteria
get_values <- function(x)
{
  if (x < 5) return('none')
  else if ((x >= 5) && (x < 20)) return('low')
  else if ((x >= 20) && (x < 60)) return('medium')
  else if (x >= 60) return('high')
}
## creating a new data table with the answers
## (start from a table holding genename, not a bare character vector)
df2 <- data.table(genename = df$genename)
df2$Kc167 <- sapply(df$Kc167, get_values)
df2$BG3 <- sapply(df$BG3, get_values)
df2$S2 <- sapply(df$S2, get_values)
df2
genename Kc167 BG3 S2
1: Clic low medium high
2: Treh medium low medium
3: bib medium high high
4: CalpC high high high
Hope this helps!
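The per-column calls in the answer above can also be collapsed into a single pass over all cell-line columns. The following is only a sketch in base R, assuming a rewritten getExpressionLevel that returns its label: cat() only prints and returns NULL, so a cat()-based version cannot be used to fill a data frame. The function body and the four-gene sample are illustrative assumptions, not the asker's final code.

```r
# Assumed return-value version of getExpressionLevel (the original used cat(),
# which prints the label and returns NULL)
getExpressionLevel <- function(a) {
  if (a < 5) "none"
  else if (a < 20) "low"
  else if (a < 60) "medium"
  else "high"
}

# First four genes from the question, as a plain data.frame
df <- data.frame(genename = c("Clic", "Treh", "bib", "CalpC"),
                 Kc167 = c(7.333, 24.666, 31.333, 65),
                 BG3 = c(48.33, 12.66, 79.33, 69.33),
                 S2 = c(75.00, 52.33, 82.00, 63.66))

# Label every numeric column at once; vapply insists on exactly one string per value
expression_levels <- df[-1]
expression_levels[] <- lapply(df[-1], function(col) {
  vapply(col, getExpressionLevel, character(1))
})

# Use gene names as row names, as in the expected data.frame
rownames(expression_levels) <- df$genename
expression_levels
```

Because the function handles one number at a time, vapply is what walks over each column; a vectorised function would not need it.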
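bin_breaks <- c(-Inf, 5, 20, 60, Inf)
bin_labels <- c("none", "low", "medium", "high")
## right = FALSE makes each bin [lower, upper), e.g. 5 <= x < 20 is "low";
## lapply keeps every column as a factor with the four labels as levels
df[,-1] <- lapply(df[,-1], function(x) cut(x,
                                           breaks = bin_breaks,
                                           labels = bin_labels,
                                           right = FALSE))
df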
The output is:
genename Kc167 BG3 S2
1 Clic low medium high
2 Treh medium low medium
3 bib medium high high
4 CalpC high high high
5 tud medium high low
6 cort high medium medium
7 S2P high medium medium
8 Mitofilin medium medium medium
9 Oxp high medium medium
10 Ada1-2 high medium medium
Sample data:
df <- structure(list(genename = c("Clic", "Treh", "bib", "CalpC", "tud",
"cort", "S2P", "Mitofilin", "Oxp", "Ada1-2"), Kc167 = c(7.333333,
24.666667, 31.333333, 65, 59.666667, 74.333333, 72, 38.333333,
73.666667, 87.333333), BG3 = c(48.33333, 12.66667, 79.33333,
69.33333, 81.66667, 50.66667, 39.66667, 29, 49.33333, 42), S2 = c(75,
52.33333, 82, 63.66667, 16.33333, 28.66667, 50.66667, 54.66667,
42.66667, 28)), .Names = c("genename", "Kc167", "BG3", "S2"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"))
Edit: added the appropriate right argument to the code so that it meets the boundary conditions and the OP's requirements (thanks to @drf).
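To see why the right argument matters, it may help to run cut() on values that sit exactly on the bin edges. This small check reuses the breaks and labels from the answer; the variable names edges, left_closed, and right_closed are made up for illustration.

```r
bin_breaks <- c(-Inf, 5, 20, 60, Inf)
bin_labels <- c("none", "low", "medium", "high")

# Values lying exactly on the interior break points
edges <- c(5, 20, 60)

# right = FALSE: bins are [lower, upper), matching conditions like a >= 5 & a < 20
left_closed <- as.character(cut(edges, breaks = bin_breaks,
                                labels = bin_labels, right = FALSE))

# right = TRUE (the default): bins are (lower, upper], so each edge falls one bin lower
right_closed <- as.character(cut(edges, breaks = bin_breaks,
                                 labels = bin_labels))

left_closed   # "low" "medium" "high"
right_closed  # "none" "low" "medium"
```

With the default setting, a mean expression of exactly 5, 20, or 60 would get the lower label, which disagrees with the thresholds in getExpressionLevel.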