How do I fill a slot with multiple objects of the same type in R?

Question

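Suppose I want to define two classes, Sentence and Word. Each Word object has a character string and a part of speech (pos). Each Sentence contains some number of Words and has an additional slot for data.

The Word class is straightforward to define.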

wordSlots <- list(word = "character", pos = "character")
wordProto <- list(word = "", pos = "")
setClass("Word", slots = wordSlots, prototype = wordProto)
# Default arguments let Word() be called with no arguments, as the prototypes below do
Word <- function(word = "", pos = "") new("Word", word=word, pos=pos)

Now I want to make a Sentence class that can contain some Words and some numerical data.

If I define the Sentence class like this:

sentenceSlots <- list(words = "Word", stats = "numeric")
sentenceProto <- list(words = Word(), stats = 0)
setClass("Sentence", slots = sentenceSlots, prototype = sentenceProto)

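then a sentence can contain only one word. I could obviously define it with many slots, one for each word, but then its length would be limited.

However, if I define the Sentence class like this: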

sentenceSlots <- list(words = "list", stats = "numeric")
sentenceProto <- list(words = list(Word()), stats = 0)
setClass("Sentence", slots = sentenceSlots, prototype = sentenceProto)

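it can contain as many words as I want, but the words slot can also hold objects that are not of class Word.

Is there a way to accomplish this? It would be similar to C++, where you can have a vector of objects of the same type.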

Answer 1

Remembering that R works well on vectors, a first step is to think of 'Words' rather than 'Word'.

## constructor, accessors, subset (also need [[, [<-, [[<- methods)
.Words <- setClass("Words",
    representation(words="character", parts="character"))
words <- function(x) x@words
parts <- function(x) x@parts
setMethod("length", "Words", function(x) length(words(x)))
setMethod("[", c("Words", "ANY", "missing"), function(x, i, j, ...) {
    initialize(x, words=words(x)[i], parts=parts(x)[i], ...)
})

## validity
setValidity("Words", function(object) {
    if (length(words(object)) == length(parts(object)))
        NULL
    else
        "'words()' and 'parts()' are not the same length"
})
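As a quick check of the vectorized class above, the following minimal sketch repeats the definitions so that it runs on its own, then builds an instance, subsets it, and triggers the validity method. The example words and part-of-speech tags are made up for illustration.

```r
library(methods)

# Same definitions as above, repeated so the example is self-contained
.Words <- setClass("Words",
    representation(words = "character", parts = "character"))
words <- function(x) x@words
parts <- function(x) x@parts
setMethod("length", "Words", function(x) length(words(x)))
setMethod("[", c("Words", "ANY", "missing"), function(x, i, j, ...) {
    initialize(x, words = words(x)[i], parts = parts(x)[i], ...)
})
setValidity("Words", function(object) {
    if (length(words(object)) == length(parts(object)))
        NULL
    else
        "'words()' and 'parts()' are not the same length"
})

# One object holds many words; the data are illustrative only
w <- .Words(words = c("the", "cat", "sat"), parts = c("DET", "NOUN", "VERB"))
length(w)
words(w[2:3])

# Slots of different lengths are rejected by the validity method;
# the error message is captured instead of stopping the script
res <- tryCatch(.Words(words = c("a", "b"), parts = "X"),
                error = function(e) conditionMessage(e))
res
```

Because the `[` method goes through initialize(), the validity method also runs on every subset, so the two slots cannot drift out of sync through subsetting.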

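@nicola's suggestion to have a list of Words is formalized in the IRanges package (actually, in S4Vectors in the 'devel' / 3.0 branch of Bioconductor). There, a 'SimpleList' takes the 'naive' approach of requiring all elements of the list to have the same class, whereas a 'CompressedList' has similar behavior but is actually implemented as a vector-like object (one with length(), [, and [[ methods) that is 'partitioned' (either by end or by width) into groups.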

library(IRanges)
.Sentences = setClass("Sentences",
    contains="CompressedList",    
    prototype=c(elementType="Words"))

One would then write a more user-friendly constructor, but the basic functionality is

## 0 Sentences
.Sentences()
## 1 sentence of 0 words
.Sentences(unlistData=.Words(), partitioning=PartitioningByEnd(0))
## 3 sentences of 2, 0, and 3 words
s3 <- .Sentences(unlistData=.Words(words=letters[1:5], parts=LETTERS[1:5]), 
    partitioning=PartitioningByEnd(c(2, 2, 5)))

leading to

> s3[[1]]
An object of class "Words"
Slot "words":
[1] "a" "b"

Slot "parts":
[1] "A" "B"

> s3[[2]]
An object of class "Words"
Slot "words":
character(0)

Slot "parts":
character(0)

> s3[[3]]
An object of class "Words"
Slot "words":
[1] "c" "d" "e"

Slot "parts":
[1] "C" "D" "E"

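Note that some typical operations are fast because they can operate on the 'unlisted' elements without creating or destroying S4 instances, e.g., coercing all 'words' to upper case: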

setMethod("toupper", "Words", function(x) { x@words <- toupper(x@words); x })
setMethod("toupper", "Sentences", function(x) relist(toupper(unlist(x)), x))

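This is 'fast' for large collections of sentences because unlist / relist amounts to a slot access and the creation of a single 'Words' instance. Scalable Genomics with R and Bioconductor outlines this and other strategies.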
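The partitioning idea can be illustrated in base R without any Bioconductor package. This is only a sketch of the concept, not how IRanges or S4Vectors implement CompressedList internally; the variable names (flat_words, ends, sentence_id) are made up for the example.

```r
# Flat storage: the words of all sentences in one character vector
flat_words <- c("a", "b", "c", "d", "e")
# Cumulative end positions: three sentences of 2, 0, and 3 words
ends <- c(2L, 2L, 5L)

# Recover each sentence's width, then a sentence index for every word
widths <- diff(c(0L, ends))
sentence_id <- factor(rep(seq_along(widths), widths),
                      levels = seq_along(widths))

# One vectorized call on the flat data instead of one call per sentence
upper_flat <- toupper(flat_words)

# Re-partition only when a per-sentence view is needed;
# keeping all factor levels preserves the empty second sentence
per_sentence <- unname(split(upper_flat, sentence_id))
per_sentence
```

The cost of the transformation depends on the total number of words rather than on the number of sentences, which is the property the unlist / relist approach relies on.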

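In an answer, @nicola says 'R is not perfectly suited for OO programming style', but it is probably more helpful to realize that R's S4 object-oriented style differs from that of C++ and Java, just as R differs from C. In particular, it is really valuable to keep thinking in terms of vectors when working with S4: Words rather than Word, People rather than Person...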

Answer 2

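I suggest just a workaround for this class of problems. Keep in mind that R is not perfectly suited for an OO programming style, and any solution will hardly show the solidity of other languages such as Java or C++. However, you can declare your Sentence class with the words slot as a list. Then you define your constructor like this: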

   Sentence<-function(words,stats) {
     #check that words is a list and that every element is a Word
     #(is() also accepts subclasses of Word; vapply() is safe for an empty list)
     if (!is.list(words) || !all(vapply(words,is,logical(1),"Word"))) stop("Not valid words argument")
     #create the object
     new("Sentence", words=words, stats=stats)
   }

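An example of this kind of constructor can be found in the sp package for the Polygons class. You can look at the body of that function.

If you want to prevent users from setting the words slot incorrectly, you can redefine the @<- operator, like: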

    "@<-.Sentence"<-function(sentence,...) invisible(sentence)

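I don't think this last step is necessary. No matter what you do, the user can always mess things up. For instance, they could call the new function directly, bypassing your constructor. Or they could set the Word class on an arbitrary object and then pass it to Sentence. As I said, R is not perfect for this style of programming, so you often have to adopt some kind of non-optimal solution.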
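One way to narrow the "bypass the constructor" gap is to attach the element check to the class itself as a validity function, so that new() also runs it. The sketch below is an assumption layered on the list-based design from the question, not part of the original answer; the error message text and the example words are invented for illustration.

```r
library(methods)

setClass("Word", slots = list(word = "character", pos = "character"),
         prototype = list(word = "", pos = ""))
Word <- function(word = "", pos = "") new("Word", word = word, pos = pos)

# The validity function is run by new() whenever slots are supplied
setClass("Sentence", slots = list(words = "list", stats = "numeric"),
         prototype = list(words = list(), stats = 0),
         validity = function(object) {
             ok <- vapply(object@words, is, logical(1), "Word")
             if (all(ok)) TRUE
             else "all elements of 'words' must be Word objects"
         })

good <- new("Sentence", words = list(Word("cat", "NOUN"), Word("sat", "VERB")))

# Calling new() directly with a non-Word element now fails
bad <- tryCatch(new("Sentence", words = list(Word("cat", "NOUN"), "oops")),
                error = function(e) conditionMessage(e))

# Direct slot assignment is not re-validated automatically,
# so validObject() has to be called explicitly afterwards
good@words <- list(1, 2)
still_bad <- tryCatch(validObject(good),
                      error = function(e) conditionMessage(e))
```

This does not make the class tamper-proof, since slot assignment with @<- only checks that the value is a list, but it moves the check from one constructor to every call of new() and validObject().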
