Adding permutations with a unique new categorical level to a dataset

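I have a dataset (below) containing several categorical and numeric variables. The categorical variables occur in a set of unique combinations, but not all permutations are present in the dataset (nor do they need to be, since those combinations have no values).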

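I intend to add a categorical level named "[No Selection]" to all (categorical) variables, and I need to create permutations so that every existing permutation has a unique combination with that new level.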

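The numeric variables (only a few are included in the sample data) should all have the value 0 in the new rows, but if empty values are supplied instead, I can insert that value later.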

structure(list(`Primary Outcome` = structure(c(1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L), levels = c("Laboratory-Confirmed Influenza", 
"Influenza-Like Illness", "Complications from Influenza"), class = "factor"), 
    `Virus (Sub)Type` = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 
    4L, 5L, 5L, 6L), levels = c("Influenza A and B", "Influenza A", 
    "Influenza B", "A(H1N1)pdm09", "A(H3N2)", "B/Yamagata", "B/Victoria", 
    "B/Wisconsin"), class = "factor"), Season = structure(c(1L, 
    1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L), levels = c("All Seasons", 
    "2019/20", "2011/12", "2014/15", "2018/19", "2017/18", "2013/14", 
    "2012/13", "2015/16", "2016/17", "2012", "2013", "2018", 
    "2014", "2019", "2016", "2017", "2010-2019", "2015"), class = "factor"), 
    `Meta-Analysis Model` = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 
    1L, 2L, 1L, 2L, 1L), levels = c("Fixed Effect Model", "Random Effects Model"
    ), class = "factor"), k = c(398, 398, 126, 126, 212, 7, 2, 
    2, 10, 10, 2), Q = c(1236.50029339945, 1236.50029339945, 
    352.741540498925, 352.741540498925, 380.231921977758, 11.3144127452121, 
    2.68421098744275, 2.68421098744275, 5.75656937709005, 5.75656937709005, 
    1.74656825505281), p = c(5.89614495232528e-87, 5.89614495232528e-87, 
    1.38614414562279e-23, 1.38614414562279e-23, 8.10697738675785e-12, 
    0.0791317973899664, 0.101347413621071, 0.101347413621071, 
    0.764012962020228, 0.764012962020228, 0.186308732770133), 
    `Vaccine Matching Status` = c("No Stratification", "No Stratification", 
    "No Stratification", "No Stratification", "No Stratification", 
    "No Stratification", "No Stratification", "No Stratification", 
    "No Stratification", "No Stratification", "No Stratification"
    ), `Risk Group` = c("General Population (Aged 5-64)", "General Population (Aged 5-64)", 
    "General Population (Aged 5-64)", "General Population (Aged 5-64)", 
    "General Population (Aged 5-64)", "General Population (Aged 5-64)", 
    "General Population (Aged 5-64)", "General Population (Aged 5-64)", 
    "General Population (Aged 5-64)", "General Population (Aged 5-64)", 
    "General Population (Aged 5-64)")), row.names = c(NA, -11L
), class = c("tbl_df", "tbl", "data.frame"))

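expand.grid() does not work here, because I don't need all permutations of the existing variables, only the existing permutations that have values plus the new permutations that include the new categorical level.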
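One possible reading of this requirement is: take only the combinations that already exist, and for every non-empty subset of the categorical columns, create a copy in which those columns are set to "[No Selection]". The base R sketch below follows that reading and is not a confirmed solution. The toy data frame `df`, its column names, and the helper names `cat_cols`, `num_cols`, `combos`, and `subsets` are all hypothetical stand-ins for the real dataset.

```r
# Toy data standing in for the real dataset (hypothetical and much smaller).
df <- data.frame(
  outcome = factor(c("LCI", "LCI", "ILI")),
  model   = factor(c("Fixed", "Random", "Fixed")),
  k       = c(398, 398, 126)
)

new_level <- "[No Selection]"
cat_cols  <- c("outcome", "model")           # assumed categorical columns
num_cols  <- setdiff(names(df), cat_cols)    # everything else is treated as numeric

# Existing combinations only (no expand.grid), as character for easy editing.
combos <- unique(df[cat_cols])
combos[] <- lapply(combos, as.character)

# Every non-empty subset of categorical columns that may be overwritten.
subsets <- unlist(
  lapply(seq_along(cat_cols), function(m) combn(cat_cols, m, simplify = FALSE)),
  recursive = FALSE
)

# For each subset, copy the existing combinations and overwrite those columns.
new_rows <- do.call(rbind, lapply(subsets, function(cols) {
  out <- combos
  out[cols] <- new_level
  out
}))
new_rows <- unique(new_rows)   # drop duplicates created by the overwriting
new_rows[num_cols] <- 0        # or NA_real_ if the zeros are filled in later

# Append to the original rows and restore factors with the extra level.
df_chr <- df
df_chr[cat_cols] <- lapply(df_chr[cat_cols], as.character)
result <- rbind(df_chr, new_rows[names(df_chr)])
result[cat_cols] <- lapply(cat_cols, function(col) {
  factor(result[[col]], levels = c(levels(df[[col]]), new_level))
})
```

If only one column at a time should receive the new level, restricting `subsets` to `as.list(cat_cols)` would give that narrower variant. With five categorical columns there are 31 subsets, so the number of added rows stays bounded by 31 times the number of existing combinations.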

Any help is much appreciated.

Edit: Here is an example in Excel (I have already started doing this manually).

Cheers

r permutation categorical-data