How to create a loop that only accepts words starting with a capital letter

Question (votes: 0, answers: 1)

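I have an Excel sheet with many rows. I want to split the rows of one specific column by commas (the column describes ancestry and contains numbers and commas), then create a function that keeps only the words starting with a capital letter. I then want to extract these words and put them in a loop, so that I can build a list of consecutive words that start with capital letters. After that, I want to create a list showing the frequency of each of these words.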

I used the function str_extract_all(data$INITIAL SAMPLE DESCRIPTION, "\\b[A-Z]\\w*") |> unique(), where INITIAL SAMPLE DESCRIPTION is the name of the column I am interested in.

r loops analytics capitalization
1 Answer
0 votes

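Something like this? Extract the words that begin with a capital letter followed by zero or more alphabetic characters. Apply the code below to each column element.
To tabulate the result, unlist it and then table it.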

x <- 'I have an excel sheet with a lot of rows and i want: to split the rows in a specific column by commas (this column describes ancestry and it has numbers and commas), then create a function where i only take words that start with capital letters. Then abstract these words and put them in a loop, so I can create a list of words that go together in a row that start with capital letters. After that i want to create a list where i can see the frequencies of each of these words.
I used the function str_extract_all(data$INITIAL SAMPLE DESCRIPTION, "\\b[A-Z]\\w*") |> unique() Where INITIAL SAMPLE DESCRIPTION is the name of the column of my interest.
'
cap <- stringr::str_extract_all(x, "[A-Z][[:alpha:]]*")
cap
#> [[1]]
#>  [1] "I"           "Then"        "I"           "After"       "I"          
#>  [6] "INITIAL"     "SAMPLE"      "DESCRIPTION" "A"           "Z"          
#> [11] "Where"       "INITIAL"     "SAMPLE"      "DESCRIPTION"

cap |> unlist() |> table()
#> 
#>           A       After DESCRIPTION           I     INITIAL      SAMPLE 
#>           1           1           2           3           2           2 
#>        Then       Where           Z 
#>           1           1           1
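
To apply the same idea to a whole column, a minimal sketch could look like the following. The data frame and its values are invented for illustration; only the column name comes from the question. It also assumes that the commas to split on are followed by a space, so that thousands separators such as "1,200" are left alone.

```r
# Hypothetical data frame standing in for the Excel sheet; the column name
# comes from the question, the values are invented for illustration.
data <- data.frame(
  `INITIAL SAMPLE DESCRIPTION` = c(
    "1,200 European ancestry cases, 3,400 European ancestry controls",
    "500 Han Chinese ancestry individuals, 250 European ancestry individuals"
  ),
  check.names = FALSE
)

# Split each row on a comma followed by whitespace (assumption: commas inside
# numbers have no space after them and must not be split).
pieces <- strsplit(data[["INITIAL SAMPLE DESCRIPTION"]], ",\\s+")

# Extract capitalised words from every piece, then flatten and count them.
cap <- stringr::str_extract_all(unlist(pieces), "[A-Z][[:alpha:]]*")
freq <- cap |> unlist() |> table() |> sort(decreasing = TRUE)
freq
```

Because str_extract_all is vectorised over its input, no explicit loop is needed; it returns one list element per input string, which unlist flattens before table counts the words.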

Created on 2023-12-22 with reprex v2.0.2
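
If "words that go together in a row" means runs of adjacent capitalised words, one possible variation is to let the pattern continue across single spaces. This is only a sketch under that reading of the question, and the example strings are invented.

```r
# Invented example strings; only the regular expression matters here.
x <- c("250 Han Chinese ancestry individuals", "120 African American cases")

# One capitalised word, optionally followed by further capitalised words,
# each separated by a single space, matched as one unit.
pattern <- "[A-Z][[:alpha:]]*(?: [A-Z][[:alpha:]]*)*"
runs <- stringr::str_extract_all(x, pattern)

# Frequency of each run across all strings.
run_freq <- table(unlist(runs))
run_freq
```

With this pattern a lone capitalised word still matches on its own, so single words and multi-word runs end up in the same frequency table.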
