我正在尝试运行此代码#甲基化数据:
prefix_ad <- data.frame(
Chromosome = c("Ecoli_K12", "Ecoli_K12", "Ecoli_K12", "Ecoli_K12", "Ecoli_K12", "Ecoli_K12", "Ecoli_K12", "Ecoli_K12", "Ecoli_K12", "Ecoli_K12"),
start = c(91, 153, 244, 249, 257, 259, 284, 298, 312, 1645),
end = c(91, 153, 244, 249, 257, 259, 284, 298, 312, 1645),
count = c(10, 20, 0, 11, 1, 15, 2, 5, 9, 8)
)
# Gene annotations
annotation <- data.frame(
Chromosome = c("Ecoli_K12", "Ecoli_K12", "Ecoli_K12"),
start = c(50, 210, 400),
end = c(200, 350, 600),
gene = c("gene_1", "gene_2", "gene_3")
)
library(GenomicRanges)
library(dplyr)
# Create GRanges objects for methylation data and gene annotations
gr_methylation <- GRanges(
seqnames = prefix_ad$Chromosome,
ranges = IRanges(start = prefix_ad$start, end = prefix_ad$end)
)
gr_genes <- GRanges(
seqnames = annotation$Chromosome,
ranges = IRanges(start = annotation$start, end = annotation$end),
id = annotation$gene)
# Match CpG sites to genes
hits <- findOverlaps(gr_methylation, gr_genes)
# Extract the overlapping data
methylation_within_genes <- prefix_ad[hits]
methylated_counts <- tapply(methylation_within_genes$count, methylation_within_genes$, sum, na.rm = TRUE)
我想要这样的输出:
chromosome gene start end methylated_count
Ecoli_K12 gene_1 50 200 30
Ecoli_K12 gene_2 210 350 34
我有两个问题:
methylation_within_genes <- prefix_ad[hits]
(prefix_ad,hits)中的错误:无效的下标类型“S4”[.default
您的
hits
对象不能直接用于子集化操作,这里是子集 prefix_ad
。这就是错误告诉您的内容(诚然,以一种最无益的方式)。 hits
包含两个输入之间重叠的信息,称为 from
和 to
。当您将 hits
打印到控制台时,您会看到这一点。您可以使用 from()
和 to()
访问值,然后使用输出进行子集化,例如:
prefix_ad[from(hits),]
prefix_ad[to(hits),]