R: Subscripted assignment either overwrites nothing or changes every value (half figured out)


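My R code sometimes runs into a problem where I try to overwrite the values of a variable using subscripted assignment, and some or all of the values are not overwritten. (I have since figured out half of the problem, but the second half still applies.)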

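Here is a simplified example of the code. It compares two variables to see which is bigger, then finds the rows where they are equal and sets the "bigger" variables to -1 there to indicate that neither is bigger.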

a <- rep(0:2,96)
b <- rep(0:3,72)
dataset <- data.frame(cbind(a,b))
dim(dataset) # Show dimensions

> [1] 288   2

# Add a few random NAs
dataset$a[15] <- NA
dataset$b[27] <- NA
dataset$a_bigger <- (dataset$a > dataset$b)
dataset$b_bigger <- (dataset$b > dataset$a)
table(dataset[,c('a_bigger','b_bigger')],useNA='ifany')

>        b_bigger
>a_bigger FALSE TRUE <NA>
>   FALSE    70  144    0
>   TRUE     72    0    0
>   <NA>      0    0    2

dataset$same <- (dataset$a == dataset$b) # Find values where they are the same and neither is bigger
table(dataset$same,useNA='ifany') # Show that there are NAs in dataset$same.

> FALSE  TRUE  <NA>
>  216    70     2

dataset$same[is.na(dataset$a) | is.na(dataset$b)] <- 0 # Fix the NAs. A and B can't be the same if one of them is NA.
table(dataset$same,useNA='ifany') # Show that there are no longer NAs

>   0   1
> 218  70

dataset$a_bigger[dataset$same] <- -1
dataset$b_bigger[dataset$same] <- -1
table(dataset[,c('a_bigger','b_bigger')],useNA='ifany') # Wait, there should be 70 changed, not 1...?

>         b_bigger
> a_bigger  -1   0   1 <NA>
>    -1     1   0   0    0
>    0      0  69 144    0
>    1      0  72   0    0
>    <NA>   0   0   0    2

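So far, I understand what is happening. Setting some values of 'same' to 0 changed it from logical TRUE/FALSE to numeric 0/1, and when I then used it to index another variable, the 1s were interpreted as "overwrite the first row" rather than as logical TRUE.

This confuses me, because in other contexts R does treat 0/1 as FALSE/TRUE (in fact, if I rewrite the assignment line as dataset$a_bigger[dataset$same & dataset$same] <- -1, it works), but at least I can see what is going on now.

But I still don't understand why this happens: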

dataset$even_weirder[dataset$same] <- -1 # But now if I do the assignment on a column/variable that's not initialized...
table(dataset[,'even_weirder'],useNA='ifany') # They all change!!!

>  -1
> 288

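If R really thinks that dataset$somevar[dataset$same] refers to position 0 (which it ignores) and position 1 (which it overwrites over and over), then why, when I do this with an uninitialized column, is -1 assigned to every row instead of being assigned to the first row with the rest left as NA?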

r boolean variable-assignment overwrite subscript
1 Answer

The basic problem is the class of the index column:

class(dataset$same)
#[1] "numeric"

It is not logical but numeric, i.e. it holds 0 and 1:

head(dataset$same)
#[1] 1 1 1 0 0 0

The index should be

as.logical(dataset$same)

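because with a numeric index the assignment happens at position 1, i.e. the value -1 is written to the first element and nowhere else. With the conversion: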

dataset$a_bigger[as.logical(dataset$same)] <- -1
dataset$b_bigger[as.logical(dataset$same)] <- -1

table(dataset[,c('a_bigger','b_bigger')],useNA='ifany')
#        b_bigger
#a_bigger  -1   0   1 <NA>
#    -1    70   0   0    0   #### 70 is showing up now
#    0      0   0 144    0
#    1      0  72   0    0
#    <NA>   0   0   0    2
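
The difference between a numeric 0/1 index and a logical index can be seen on a small vector. This is only a minimal sketch: the names x, flags_num, flags_lgl, and same are made up for illustration and do not come from the question's dataset. The last part shows one possible way to avoid the problem at its source, assuming you control the step that fixes the NAs: assigning FALSE instead of 0 keeps the column logical.

```r
x <- c(10, 20, 30, 40)
flags_num <- c(1, 0, 1, 0)          # numeric 0/1, like dataset$same after the NA fix
flags_lgl <- as.logical(flags_num)  # TRUE FALSE TRUE FALSE

# Numeric index: 0 is dropped and every 1 means "position 1"
x_num <- x
x_num[flags_num] <- -1
x_num
# [1] -1 20 30 40

# Logical index: TRUE selects the element at the same position
x_lgl <- x
x_lgl[flags_lgl] <- -1
x_lgl
# [1] -1 20 -1 40

# Assigning 0 into a logical vector coerces the whole vector to numeric;
# assigning FALSE keeps it logical
same <- c(TRUE, NA, FALSE)
same_num <- same
same_num[is.na(same_num)] <- 0
same_lgl <- same
same_lgl[is.na(same_lgl)] <- FALSE
class(same_num)
# [1] "numeric"
class(same_lgl)
# [1] "logical"
```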

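As for 'even_weirder', the column is created on the fly. The subscripted assignment therefore starts from NULL and produces a single element equal to -1, and that one value is then recycled to the full length of the column: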

dataset$even_weirder[dataset$same]
#NULL
dataset$even_weirder[dataset$same] <- -1
sum(dataset$same)
#[1] 70
table(dataset[,'even_weirder'],useNA='ifany')

# -1 
#288 
dataset$even_weirder
#  [1] -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
# [39] -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
# [77] -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
#[115] -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
#[153] -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
#[191] -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
#[229] -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
#[267] -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
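
The two steps behind this can be reproduced separately. The sketch below uses a hypothetical four-row data frame df and a temporary vector tmp, neither of which is part of the original code. It also shows that initializing the column with NA first gives the result the question expected, where only the first row is changed.

```r
# Step 1: sub-assigning into NULL with the index c(1, 1, 0) yields a length-1 vector
tmp <- NULL
tmp[c(1, 1, 0)] <- -1
tmp
# [1] -1

# Step 2: assigning a length-1 value to a data frame column recycles it to every row
df <- data.frame(id = 1:4)
df$new_col <- tmp
df$new_col
# [1] -1 -1 -1 -1

# If the column already exists, only position 1 is overwritten
df$old_col <- NA
df$old_col[c(1, 1, 0)] <- -1
df$old_col
# [1] -1 NA NA NA
```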