Downloading Zika virus genome data with Bioconductor


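I'm currently taking a DataCamp course on Bioconductor packages. One section works with the Zika virus genome through the Biostrings package. Where can I load this variable from? The course says the genome was downloaded from https://www.ncbi.nlm.nih.gov/nuccore/NC_012532.1. Inside DataCamp, if I run dput(zikaVirus) (the variable in the session is called zikaVirus), I get: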

new("DNAStringSet", pool = new("SharedRaw_Pool", xp_list = list(
    <pointer: (nil)>), .link_to_cached_object_list = list(<environment>)), 
    ranges = new("GroupedIRanges", group = 1L, start = 1L, width = 10794L, 
        NAMES = "NC_012532.1 Zika virus isolate ZIKV/Monkey/Uganda/MR766/1947, complete genome", 
        elementType = "ANY", elementMetadata = NULL, metadata = list()), 
    elementType = "DNAString", elementMetadata = NULL, metadata = list())

I can't use this output in R to recreate the variable.

r bioconductor
1 answer

Try this:

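library(rentrez)     # client for the NCBI Entrez API
library(Biostrings)  # DNAStringSet and FASTA readers

# Temporary file to hold the downloaded FASTA record
tmp <- tempfile()

# Fetch the record as FASTA text from the nuccore database and write it to disk
writeLines(
  entrez_fetch(
    db = "nuccore",
    id = "NC_012532.1",
    rettype = "fasta",
    retmode = "text"
  ),
  tmp
)

# Parse the FASTA file into a DNAStringSet
dna <- readDNAStringSet(tmp, format = "fasta")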

Check the result:

> print(dna)
DNAStringSet object of length 1:
    width seq                                                                   names               
[1] 10794 AGTTGTTGATCTGTGTGAGTCAG...TCGGCGGCCGGTGTGGGGAAATCCATGGTTTCT NC_012532.1 Zika ...

> dput(dna)
new("DNAStringSet", pool = new("SharedRaw_Pool", xp_list = list(
    <pointer: (nil)>), .link_to_cached_object_list = list(<environment>)), 
    ranges = new("GroupedIRanges", group = 1L, start = 1L, width = 10794L, 
        NAMES = "NC_012532.1 Zika virus, complete genome", elementType = "ANY", 
        elementMetadata = NULL, metadata = list()), elementType = "DNAString", 
    elementMetadata = NULL, metadata = list())
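
A note on why the dput() output cannot simply be pasted back into R: it shows an external pointer (`<pointer: (nil)>`) and an `<environment>` placeholder instead of the sequence itself, so it is not code that R can re-evaluate. If the goal is to keep a copy of the object for later sessions, a minimal sketch of two ways to save and reload it follows. It assumes Biostrings is installed; the short toy sequence and the names toy_seq, fasta_path and rds_path are placeholders for illustration, not the Zika data.

```r
library(Biostrings)

# Toy stand-in for the downloaded genome (hypothetical sequence, not Zika data)
dna <- DNAStringSet(c(toy_seq = "AGTTGTTGATCTGTGTGAGTCAG"))

# Option 1: write the sequences to a FASTA file and read them back
fasta_path <- tempfile(fileext = ".fasta")
writeXStringSet(dna, fasta_path)
from_fasta <- readDNAStringSet(fasta_path, format = "fasta")

# Option 2: serialize the R object itself and restore it
rds_path <- tempfile(fileext = ".rds")
saveRDS(dna, rds_path)
from_rds <- readRDS(rds_path)

# Both round trips should give back the same names and sequences
stopifnot(
  identical(as.character(from_fasta), as.character(dna)),
  identical(as.character(from_rds), as.character(dna))
)
```

The FASTA route gives a file that other tools can read as well; the RDS route is only meant for R and needs Biostrings loaded to work with the restored object.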