How to list the most frequent three-word strings in Google Sheets

Question  Votes: 2  Answers: 2

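I have a list with an adjacent column of industry categories that classify each entry. I want to find out which industry is the most common, but I can't get Sheets to interpret a two-word category as a single item.

First, I want to know the 5 most common categories overall. I also want to know the top 5 one-word (black), two-word (red), and three-word (blue) categories.

In addition, I want to get rid of the commas.

Here is what I am trying to achieve, followed by a link to the Google Sheets document where I have laid out all the data: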

Document Screenshot

https://docs.google.com/spreadsheets/d/13N8gc4POPhFhTvyqq-UugWS5GCgcONwliacSL8-MAr8/edit#gid=0

How can I group and list these categories?

regex google-sheets transpose array-formulas google-sheets-query
2 Answers
2
votes

All words:

=ARRAYFORMULA(QUERY(TRANSPOSE(SPLIT(QUERY(B3:B11&",",,99^99), ", ")), 
 "select Col1,count(Col1) 
  group by Col1
  order by count(Col1) desc
  limit 5
  label count(Col1)''"))

All phrases:

=ARRAYFORMULA(QUERY(TRIM(TRANSPOSE(SPLIT(QUERY(B3:B11&",",,99^99), ","))), 
 "select Col1,count(Col1) 
  group by Col1
  order by count(Col1) desc
  limit 5
  label count(Col1)''"))

One word:

=ARRAYFORMULA(QUERY(TRIM(TRANSPOSE(SPLIT(QUERY(B3:B11&",",,99^99), ","))), 
 "select Col1,count(Col1)
  where not Col1 contains ' '
  group by Col1
  order by count(Col1) desc
  limit 5
  label count(Col1)''"))

Two words:

=ARRAYFORMULA(QUERY(TRIM(TRANSPOSE(SPLIT(QUERY(B3:B11&",",,99^99), ","))), 
 "select Col1,count(Col1)
  where Col1 matches '\w+ \w+'
  group by Col1
  order by count(Col1) desc
  limit 5
  label count(Col1)''"))

Three words:

=ARRAYFORMULA(QUERY(TRIM(TRANSPOSE(SPLIT(QUERY(B3:B11&",",,99^99), ","))), 
 "select Col1,count(Col1)
  where Col1 matches '\w+ \w+ \w+'
  group by Col1
  order by count(Col1) desc
  limit 5
  label count(Col1)''"))
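
As a cross-check outside Sheets, the split, trim, filter, and count logic of the formulas above can be sketched in Python. This is only an illustration: the sample rows and the helper names `split_categories` and `top_categories` are hypothetical and do not come from the asker's spreadsheet, and the one-word case uses a `\w+` pattern rather than the "does not contain a space" test.

```python
import re
from collections import Counter

# Hypothetical sample rows standing in for B3:B11 (not the asker's data).
rows = [
    "Retail, Health Care",
    "Health Care, Real Estate Services",
    "Retail, Consumer Goods",
    "Health Care",
]

def split_categories(cells):
    """Split each cell on commas and trim whitespace, like TRIM(SPLIT(..., ","))."""
    return [part.strip() for cell in cells for part in cell.split(",") if part.strip()]

def top_categories(cells, n_words=None, limit=5):
    """Count categories, optionally keeping only those with exactly n_words words."""
    cats = split_categories(cells)
    if n_words is not None:
        # Same idea as "where Col1 matches '\w+ \w+'": the whole string must match.
        pattern = re.compile(r"\w+( \w+){%d}" % (n_words - 1))
        cats = [c for c in cats if pattern.fullmatch(c)]
    return Counter(cats).most_common(limit)

print(top_categories(rows))             # overall top 5
print(top_categories(rows, n_words=2))  # top 5 two-word categories
```

Because commas are only used as separators here, they never appear in the output, which also covers the "get rid of the commas" part of the question.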

1
vote

Breaking the problem into 3 formulas lets you support any number of "words".

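Step 1: put this formula in D29. It treats each multi-word category as a single term (judging from your question, this seems to be the only step you actually need).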

=query(arrayformula(trim(substitute(transpose(split(query({substitute(B3:B," ","_")},"select * where Col1 is not null",counta(B3:B)),", ")),"_"," "))),"select Col1, count(Col1) group by Col1 order by count(Col1) desc label Col1 'Descriptions', count(Col1) 'Frequency'")

Step 2: put the next formula in F29, right beside the table produced by the formula above. If you use a different range, replace D30:D accordingly.

=arrayformula({"Words";if(D30:D="","",1+LEN(D30:D)-len(SUBSTITUTE(D30:D," ","")))})
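
The word count in Step 2 relies on a simple identity: the number of words equals one plus the number of spaces, and the number of spaces is the length of the text minus its length with spaces removed. A minimal Python sketch of the same arithmetic, assuming words are separated by single spaces (which the TRIM in Step 1 should ensure); the function name `word_count` is hypothetical:

```python
def word_count(text):
    """Mirror 1 + LEN(x) - LEN(SUBSTITUTE(x, " ", "")); a blank cell stays blank."""
    if text == "":
        return ""
    return 1 + len(text) - len(text.replace(" ", ""))

for sample in ["Retail", "Health Care", "Real Estate Services"]:
    print(sample, word_count(sample))
```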

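Step 3: put this formula in G29. It outputs the table sorted by word count, with the highest frequency first within each word count [if you used a different location, replace D29:F accordingly].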

=query({D29:F},"select * where Col1 is not null order by Col3,Col2 desc")

The advantage of this approach is that it supports frequencies for 1-, 2-, 3-, 4-word (and longer) categories.
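
The ordering used in Step 3 ("order by Col3,Col2 desc") sorts by word count ascending and then by frequency descending. A small Python sketch of that ordering, using a hypothetical table of (description, frequency, words) tuples rather than real data from the sheet:

```python
# Hypothetical rows shaped like the D:F table: (description, frequency, words).
table = [
    ("Health Care", 3, 2),
    ("Retail", 2, 1),
    ("Real Estate Services", 1, 3),
    ("Consumer Goods", 1, 2),
    ("Energy", 4, 1),
]

def sort_by_words_then_frequency(rows):
    """Word count ascending, then frequency descending within each word count."""
    return sorted(rows, key=lambda r: (r[2], -r[1]))

for row in sort_by_words_then_frequency(table):
    print(row)
```

Since the sort key is the word count itself, categories with four or more words fall into their own groups without any extra formulas.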
