How to fill NAs in R with the mean of surrounding values, by group

Question

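I can't figure out how to fill NAs with the mean of their surrounding values, computed by group. In other words, I don't want data from other groups to be included in the surrounding mean.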

I have a stock dataset like this:

Key | Company_Name | Price    |
--  | --------     | -------- |
1   | A            | 12       |
2   | A            | 13       |
3   | A            | 12       |
4   | A            | NA       |
5   | A            | NA       |
6   | B            | 20       |
7   | B            | 21       |
8   | B            | NA       |

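I want to impute these NAs with the mean of their 4 surrounding values. The challenge is that I don't know how to do this by group. For example, for the 5th observation, I don't want it to be affected by B's prices.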

My expected output is:

Key | Company_Name | Price    |
--  | --------     | -------- |
1   | A            | 12       |
2   | A            | 13       |
3   | A            | 12       |
4   | A            | 12.33    |
5   | A            | 12.33    |
6   | B            | 20       |
7   | B            | 21       |
8   | B            | 20.5     |

Any help is appreciated, thanks!

What I tried:

Without grouping, I used na_ma() from the imputeTS package as follows, and it works:

stock$Price = na_ma(stock$Price, k=2, weighting = 'simple')

But when I tried to bring group_by() in to make the imputation more accurate, it raised an error:

stock2 = stock %>% 
      group_by(Company_Name) %>%
      mutate(Price = na_ma(stock$Price, k=2, weighting = 'simple'))

Error in mutate():
ℹ In argument: Price = na_ma(stock$Price, k=2, weighting = 'simple').
ℹ In group 1: Name = "A.".
Caused by error:
! Stock must be size 76 or 1, not 120471.
Backtrace:
bm_no_na %>% group_by(公司名称) %>% ...
dplyr:::dplyr_internal_error(...)

Answer 1

If you want to replace the missing values with a simple group-wise mean():

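library(dplyr)

# Replace each NA with the mean of the non-missing values in x
fill_na <- function(x) ifelse(is.na(x), mean(x, na.rm = TRUE), x)

# group_by() makes mutate() call fill_na() once per company
stock2 <- stock %>%
  group_by(Company_Name) %>%
  mutate(Price = fill_na(Price))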

stock2
   Key Company_Name Price
  <int> <chr>        <dbl>
1     1 A             12  
2     2 A             13  
3     3 A             12  
4     4 A             12.3
5     5 A             12.3
6     6 B             20  
7     7 B             21  
8     8 B             20.5
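
For comparison, the same group-wise mean fill can probably be done without dplyr. The sketch below is only an illustration in base R: it rebuilds the sample data inline so that it runs on its own, and the Price_filled column name is made up for this example.

```r
# Sample data rebuilt inline (same values as in the question)
stock <- data.frame(
  Key = 1:8,
  Company_Name = c("A", "A", "A", "A", "A", "B", "B", "B"),
  Price = c(12, 13, 12, NA, NA, 20, 21, NA)
)

# Replace each NA with the mean of the non-missing values in x
fill_na <- function(x) ifelse(is.na(x), mean(x, na.rm = TRUE), x)

# ave() calls fill_na() separately on the prices of each company,
# so B's prices never enter A's mean.
# Price_filled is a hypothetical column name used only in this sketch.
stock$Price_filled <- ave(stock$Price, stock$Company_Name, FUN = fill_na)

print(stock)
```

Note that this uses the whole group's mean, not only the nearest neighbours, so it matches the expected output in the question only because the groups are short.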

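Or use imputeTS (note that, as GuedesBF mentioned, the problem with your code is the use of stock$ inside dplyr verbs; refer to the column by its bare name, Price, instead):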

library(imputeTS)
stock2 <- stock %>% 
  group_by(Company_Name) %>% 
  mutate(Price = na_ma(Price, k=2, weighting = 'simple'))

stock2

   Key Company_Name Price
  <int> <chr>        <dbl>
1     1 A             12  
2     2 A             13  
3     3 A             12  
4     4 A             12.5
5     5 A             12.5
6     6 B             20  
7     7 B             21  
8     8 B             20.5
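
To make it clearer what "surrounding mean within a group" means, here is a rough base R sketch. The helper window_mean_fill() is hypothetical and is a simplified stand-in, not imputeTS's actual implementation: it assumes a window of k positions on each side of the gap and widens that window until at least two observed values are found.

```r
# Hypothetical helper (not part of imputeTS): fill each NA with the plain
# mean of the observed values within k positions on either side.
window_mean_fill <- function(x, k = 2) {
  n <- length(x)
  out <- x
  for (i in which(is.na(x))) {
    w <- k
    repeat {
      idx <- max(1, i - w):min(n, i + w)
      vals <- x[idx][!is.na(x[idx])]
      # Stop once two observed values are found, or once the window
      # already covers the whole vector
      if (length(vals) >= 2 || (i - w <= 1 && i + w >= n)) break
      w <- w + 1
    }
    # Leave the NA in place if the vector has no observed values at all
    if (length(vals) > 0) out[i] <- mean(vals)
  }
  out
}

price   <- c(12, 13, 12, NA, NA, 20, 21, NA)
company <- c("A", "A", "A", "A", "A", "B", "B", "B")

# Per group: the window is built from one company's prices only
grouped <- ave(price, company, FUN = window_mean_fill)

# Whole column: the window around A's gap reaches into B's prices
ungrouped <- window_mean_fill(price)

print(data.frame(company, price, grouped, ungrouped))
```

In the ungrouped run, the window around rows 4 and 5 includes B's prices of 20 and 21, which is exactly the leakage the question wants to avoid.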

Sample data

stock <- read.table(text = "Key Company_Name    Price
1   A   12
2   A   13
3   A   12
4   A   NA
5   A   NA
6   B   20
7   B   21
8   B   NA
", header = TRUE)
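
With this sample data, the size mismatch behind the original error can be inspected directly. The following is a small sketch, assuming dplyr is installed; the sizes object and its column names are made up for illustration. Inside a grouped verb, the bare column name Price gives only the current group's rows, while stock$Price always gives the whole column.

```r
library(dplyr)

stock <- read.table(text = "Key Company_Name Price
1 A 12
2 A 13
3 A 12
4 A NA
5 A NA
6 B 20
7 B 21
8 B NA
", header = TRUE)

# Hypothetical summary table comparing vector lengths per group
sizes <- stock %>%
  group_by(Company_Name) %>%
  summarise(
    rows_in_group = n(),                 # size mutate() expects per group
    bare_column   = length(Price),       # the current group's slice
    dollar_column = length(stock$Price)  # the whole, ungrouped column
  )

print(sizes)
```

The dollar_column value is the same for every group and does not match rows_in_group, which is the same kind of mismatch the error message reports ("must be size 76 or 1, not 120471").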
