Problem creating a key-value store in R

Question (votes: 0, answers: 1)

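I am trying to create a key-value store where each key is an entity and each value is that entity's average sentiment score across news articles.

I have a data frame of news articles, news_us, and a list of entities called organization1 that a classifier identified in those articles. The first element of organization1 contains the entities identified in the article in the first row of news_us. I am trying to iterate over the organization1 list and build a key-value store in which the keys are the entities and the values are the sentiment scores of the news descriptions that mention each entity. My code does not change the scores in the sentiment list, and I don't know why. My first guess was that I have to use the $ operator on the sentiment list to add values, but that did not change anything either. This is my code so far: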

library(syuzhet)
sentiment <- list()
organization1 <- list(NULL, "US", "Bath", "Animal Crossing", "World Health Organization", 
    NULL, c("Microsoft", "Facebook"))
news_us <- structure(list(title = c("Stocks making the biggest moves after hours: Bed Bath & Beyond, JC Penney, United Airlines and more - CNBC", 
"Los Angeles mayor says 'very difficult to see' large gatherings like concerts and sporting events until 2021 - CNN", 
"Bed Bath & Beyond shares rise as earnings top estimates, retailer plans to maintain some key investments - CNBC", 
"6 weeks with Animal Crossing: New Horizons reveals many frustrations - VentureBeat", 
"Timeline: How Trump And WHO Reacted At Key Moments During The Coronavirus Crisis : Goats and Soda - NPR", 
"Michigan protesters turn out against Whitmer’s strict stay-at-home order - POLITICO"
), description = c("Check out the companies making headlines after the bell.", 
"Los Angeles Mayor Eric Garcetti said Wednesday large gatherings like sporting events or concerts may not resume in the city before 2021 as the US grapples with mitigating the novel coronavirus pandemic.", 
"Bed Bath & Beyond said that its results in 2020 \"will be unfavorably impacted\" by the crisis, and so it will not be offering a first-quarter nor full-year outlook.", 
"Six weeks with Animal Crossing: New Horizons has helped to illuminate some of the game's shortcomings that weren't obvious in our first review.", 
"How did the president respond to key moments during the pandemic? And how did representatives of the World Health Organization respond during the same period?", 
"Many demonstrators, some waving Trump campaign flags, ignored organizers‘ pleas to stay in their cars and flooded the streets of Lansing, the state capital."
), name = c("CNBC", "CNN", "CNBC", "Venturebeat.com", "Npr.org", 
"Politico")), na.action = structure(c(`35` = 35L, `95` = 95L, 
`137` = 137L, `154` = 154L, `213` = 213L, `214` = 214L, `232` = 232L, 
`276` = 276L, `321` = 321L), class = "omit"), row.names = c(NA, 
6L), class = "data.frame")
i = as.integer(0)
for(index in organization1){
  i <- i+1
   if(is.character(index)) { #if entity is not null/NA
     val <- get_sentiment(news_us$description[i], method = "afinn")
     #print(val)
     print(sentiment[[index[1]]])
     sentiment[[index[1]]] <- sentiment[[index[1]]]+val
   }
}

This is the sentiment list after running the code block above:

$US
integer(0)

$Bath
integer(0)

$`Animal Crossing`
integer(0)

$`World Health Organization`
integer(0)

$`Apple TV`
integer(0)

$`Pittsburgh Steelers`
integer(0)

And I would like it to look like this:

$US
1.3

$Bath
0.3

$`Animal Crossing`
2.4

$`World Health Organization`
1.2

$`Apple TV`
-0.7

$`Pittsburgh Steelers`
0.3

The value column can hold multiple values when an article contains multiple entities.

r machine-learning data-science sentiment-analysis
1 answer
1 vote

I am not sure how organization1 and news_us$description are related, but perhaps you meant to use them something like this?

library(syuzhet)

setNames(lapply(news_us$description, get_sentiment), unlist(organization1))

#$US
#[1] 0

#$Bath
#[1] -0.4

#$`Animal Crossing`
#[1] -0.1

#$`World Health Organization`
#[1] 1.1

#$Microsoft
#[1] -0.6

#$Facebook
#[1] -1.9
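
As for why the loop in the question leaves every value empty: a key that is not yet in a list comes back as NULL, and NULL + val is a zero-length vector, so nothing is ever accumulated. Below is a minimal sketch of one way to collect scores per entity and then average them. The helper name collect_scores and the toy scores are made up for illustration (they are not syuzhet output), and the sketch assumes the entity list has exactly one element per article.

```r
# Hypothetical helper: collect every score recorded for each entity.
# `entities`: list with one element per article (NULL or a character vector).
# `scores`:   numeric vector with one sentiment score per article.
collect_scores <- function(entities, scores) {
  stopifnot(length(entities) == length(scores))
  store <- list()
  for (i in seq_along(entities)) {
    # A NULL element yields zero iterations, so no is.character() check is needed.
    for (entity in entities[[i]]) {
      # c(NULL, x) is just x, so the first score for a new key is stored as-is.
      store[[entity]] <- c(store[[entity]], scores[i])
    }
  }
  store
}

# Why the original loop stayed empty: a missing key is NULL,
# and arithmetic on NULL gives a zero-length vector.
empty <- list()
missing_len <- length(empty[["US"]] + 1.5)

# Toy input with made-up scores (assumption: not real sentiment values).
entities <- list(NULL, "US", c("Microsoft", "Facebook"), "US")
scores   <- c(0.5, 1.0, -2.0, 3.0)

store <- collect_scores(entities, scores)
avg   <- lapply(store, mean)  # average score per entity
```

With the data from the question, the scores could come from get_sentiment(news_us$description, method = "afinn"), provided organization1 is first aligned so that it has one element per row of news_us (as posted it has 7 elements for 6 rows).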