R: create an additional column in a data frame/data.table from a mapping table under certain conditions


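I have these two tables, and I need to create the column that is missing from the first one.

This is an extension of a question I asked earlier. The requirements have changed somewhat, so I am posting it as a separate question.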

df <- data.table::as.data.table(structure(list(types=c("x","y","x","x","x","x","x","z","z","y","z","z"),class = c("a", "c", "v", "f", "r", "b", "t", "o","z","c","l","e"), value = c(0.76, 0.91, 1.94, 0.37, 1.35, 0.75, 1.95, 1.69,0,0.9,0.1,0.17), vehicle = c("we", "df", "rt", "yh", "uj", "er", "ed", "we","x","rt","xt","ca"), carbon = c(0.984, 0.27, 0.419, 0.469, 0.132, 0.865, 0.562, 0.133,0.05,0.001,0.7,0.37), cap = c("3", "2", "1", "6", "y", "t", "4", "6","e","4",6,"e"), up = c(4, 2, 3, "d", "t", "y", "u", "i","d","y",4,"t"), down = c("t", "e", "r", 3, 4, 5 ,2,1,2,5,1,2), amt = c(34, 23, 12, 67, 87, 43, 23, 12,34,15,14,45)), row.names = c(NA, -12L), class = c("data.table", "data.frame" )));
map <- data.table::as.data.table(structure(list(types=c("x","x","x","y","z","z"),up = c("d", "y", 4,"","",""), vehicle = c("yh", "er", "we","","",""), exercise = c("ty", 45, "k","cf","th","sh"),class = c("","","","c","",""),cap = c("","","","",6,"e"),down = c("","","","",1,2)), class = c("data.table", "data.frame"), row.names = c(NA, -6L)));

In short, I need to create the vector `exercise` in `df`.

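The conditions that need to be met are as follows.

The mapping needs to be built from columns that are present in `df`.

Create the mapping from the vectors `up`, `vehicle`, `class`, `cap` and `down`, look those items up in `df`, and fill in the corresponding values of `exercise` in `df`.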

The complication is that not every column of the mapping table `map` is filled in for every row. For example:

Example 1. Match on the columns `up` and `vehicle` to get the `exercise` value to insert into `df`.
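Example 1 can be sketched as a plain update join on `up` and `vehicle`. This is only an illustration on a trimmed-down, made-up copy of the type `x` rows (the names `df_x` and `map_x` are hypothetical), not the full solution.

```r
library(data.table)

# Hypothetical, trimmed-down copies of the question's tables (type "x" only)
df_x  <- data.table(up = c("4", "3", "d", "y"), vehicle = c("we", "rt", "yh", "er"))
map_x <- data.table(up = c("d", "y", "4"), vehicle = c("yh", "er", "we"),
                    exercise = c("ty", "45", "k"))

# Update join: each row of df_x takes exercise from the matching row of map_x
df_x[map_x, on = .(up, vehicle), exercise := i.exercise]
df_x$exercise
```

Rows of `df_x` with no matching `up`/`vehicle` pair in `map_x` are left as `NA`.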

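Example 2. In `map`, the value `c` appears in the column `class`, and nothing appears in any other key column of that row. So wherever `c` occurs in the vector `class` of table `df`, the value of `exercise` should be `cf`.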

Example 3. Similarly, use the columns `cap` and `down` of `map` to get the value of `exercise`.

More explanation follows.

There will not be any conflicts: every entry of `df` that has a mapping has exactly one.

There is also an additional column, `types`, that separates these different kinds of mapping.

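For `x`, the columns `up` and `vehicle` will always be present. For `y`, the column `class` will always be present. For `z`, the columns `cap` and `down` will always be present.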

The code can be written with or without using the `types` column.
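As an illustration of the variant that ignores `types`, here is a minimal sketch that treats each row of the mapping table as its own rule and joins only on the columns that row actually fills in. The tables `df_demo` and `map_demo` are trimmed-down, made-up stand-ins for the real data, and the row-by-row loop is an assumption chosen for readability, not the approach taken in the answer.

```r
library(data.table)

# Hypothetical, trimmed-down stand-ins for df and map
df_demo <- data.table(
  class = c("a", "c", "o"), up = c("4", "2", "i"), vehicle = c("we", "df", "we"),
  cap = c("3", "2", "6"), down = c("t", "e", "1")
)
map_demo <- data.table(
  up = c("4", "", ""), vehicle = c("we", "", ""), class = c("", "c", ""),
  cap = c("", "", "6"), down = c("", "", "1"), exercise = c("k", "cf", "th")
)
key_cols <- c("up", "vehicle", "class", "cap", "down")

for (r in seq_len(nrow(map_demo))) {
  rule <- map_demo[r]
  # Join only on the key columns that this rule fills in
  on_cols <- key_cols[sapply(key_cols, function(col) rule[[col]] != "")]
  # Update join: rows of df_demo matching the rule receive its exercise value
  df_demo[rule, on = on_cols, exercise := i.exercise]
}
df_demo$exercise
```

Looping over rules is slower than a handful of grouped joins on a large mapping table, but it makes the "use only the filled-in columns" rule explicit.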

So if there is nothing in `up`, `vehicle`, `cap` and `down`, only the column `class` of table `map` should be used.

Only those columns that actually hold a value in `map` should be used to create the mapping.
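One way to make this rule concrete is to list, for each value of `types`, the key columns that hold at least one non-empty value. The sketch below uses a small made-up table `map_demo` with one row per type; the names `key_cols` and `used` are illustrative only.

```r
library(data.table)

# Hypothetical mapping table with one rule per type
map_demo <- data.table(
  types = c("x", "y", "z"),
  up = c("d", "", ""), vehicle = c("yh", "", ""),
  class = c("", "c", ""), cap = c("", "", "6"), down = c("", "", "1"),
  exercise = c("ty", "cf", "th")
)
key_cols <- c("up", "vehicle", "class", "cap", "down")

# For each type, keep the names of the key columns with at least one non-empty value
used <- lapply(split(map_demo, by = "types"), function(m) {
  key_cols[sapply(key_cols, function(col) any(m[[col]] != ""))]
})
used
```

For this table, `used` lists `up` and `vehicle` for `x`, `class` for `y`, and `cap` and `down` for `z`, which matches the rule stated above.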

Ideally, the new vector in `df` would look like this:
exercise<-c("k","cf","","ty","",45,"","th","sh","cf","th","sh")

Tags: r, data.table

1 Answer

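Quoting the question: "For x, the columns up and vehicle will always be present. For y, the column class will always be present. For z, the columns cap and down will always be present."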

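This allows us to split the `map` table by `types` and then drop the empty columns from each piece. That leaves three consistent, distinct mapping tables, each of which we natural-join with the original table `df`. We then combine the `exercise` results of all three joins with `fcoalesce`.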

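library(data.table)

# One mapping table per value of `types`
maps <- split(map, by = "types")

# Per type: drop the all-empty columns, natural-join to df, then coalesce the results
df[, exercise := fcoalesce(lapply(maps, \(m) {
  m[, .SD, .SDcols = \(col) !all(col == "")][df, on = .NATURAL]$exercise
}))]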

Result

df

    types class value vehicle carbon cap up down amt exercise
 1:     x     a  0.76      we  0.984   3  4    t  34        k
 2:     y     c  0.91      df  0.270   2  2    e  23       cf
 3:     x     v  1.94      rt  0.419   1  3    r  12     <NA>
 4:     x     f  0.37      yh  0.469   6  d    3  67       ty
 5:     x     r  1.35      uj  0.132   y  t    4  87     <NA>
 6:     x     b  0.75      er  0.865   t  y    5  43       45
 7:     x     t  1.95      ed  0.562   4  u    2  23     <NA>
 8:     z     o  1.69      we  0.133   6  i    1  12       th
 9:     z     z  0.00       x  0.050   e  d    2  34       sh
10:     y     c  0.90      rt  0.001   4  y    5  15       cf
11:     z     l  0.10      xt  0.700   6  4    1  14       th
12:     z     e  0.17      ca  0.370   e  t    2  45       sh
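Note that rows without a mapping come back as `NA`, whereas the desired vector in the question uses empty strings. If empty strings are preferred, one possible follow-up is a second `fcoalesce` call. The sketch below uses a small made-up table `res` in place of the joined `df`.

```r
library(data.table)

# Hypothetical stand-in for the joined result: NA marks rows without a mapping
res <- data.table(exercise = c("k", "cf", NA, "ty"))

# Replace the missing values with empty strings
res[, exercise := fcoalesce(exercise, "")]
res$exercise
```

Keeping `NA` is usually the safer default, since it distinguishes "no mapping found" from a genuinely empty value.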
