Word Frequency

Question

I am writing a Java program that counts how often each word occurs in the input. The initial integer says how many words follow. This is my code so far:

import java.util.Scanner; 
public class LabProgram {
   public static void main(String[] args) {

      Scanner scnr = new Scanner(System.in);
      int numberWords = scnr.nextInt();
      String[] wordsList = new String[numberWords];
      int i;
      int j;
      int[] frequency = new int[numberWords];
      

      for (i = 0; i < numberWords; ++i) {
         wordsList[i] = scnr.next();
         frequency[i] = 0;
         for (j = 0; j < numberWords; ++j) {
            if (wordsList[i].equals(wordsList[j])) {
               frequency[i] = frequency[i] + 1;
            }
         }
      }
      for (i = 0; i < numberWords; ++i) {
         System.out.print(wordsList[i] + " - " + frequency[i]);
         System.out.print("\n");
      }
      
   }
}

When I enter the following:

6 pickle test rick Pickle test pickle

This is the output:

pickle - 1
test - 1
rick - 1
Pickle - 1
test - 2
pickle - 2

However, this is the expected output:

pickle - 2
test - 2
rick - 1
Pickle - 1
test - 2
pickle - 2

It looks like the frequency is correct for the later occurrences of a word, but not for its first occurrence.

Answer 1

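For this case you can use a map to hold the frequency of each word, or even use streams with grouping. You do not even need to know the number of words in advance, assuming they are simply separated by spaces.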

With a stream, it is basically a one-liner:

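// Requires: java.util.Arrays, java.util.Map, java.util.stream.Collectors
String input = "pickle test rick Pickle test pickle";
// With a stream: group equal words together and count each group.
Map<String, Long> result = Arrays.stream(input.split(" "))
    .collect(Collectors.groupingBy(s -> s, Collectors.counting()));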

The map contains:

{Pickle=1, test=2, rick=1, pickle=2}

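If you do not like streams, just iterate over the words manually and increment the value stored for each word, using the word as the key in the map.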
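The manual iteration described above could look roughly like the sketch below. The class name `SplitWordCount` and the method `countWords` are hypothetical names chosen for illustration, and the sketch assumes the words are separated by whitespace, so no leading word count is needed.

```java
import java.util.HashMap;
import java.util.Map;

class SplitWordCount {
    // Hypothetical helper: counts how often each whitespace-separated word occurs.
    static Map<String, Integer> countWords(String input) {
        Map<String, Integer> counts = new HashMap<>();
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return counts; // splitting an empty string would yield one empty "word"
        }
        for (String word : trimmed.split("\\s+")) {
            Integer previous = counts.get(word);
            // First occurrence starts at 1, later occurrences add 1 to the stored value.
            counts.put(word, previous == null ? 1 : previous + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("pickle test rick Pickle test pickle");
        System.out.println("pickle occurs " + counts.get("pickle") + " times");
        System.out.println("Pickle occurs " + counts.get("Pickle") + " times");
    }
}
```

The comparison is case-sensitive, so `pickle` and `Pickle` are counted as different words, matching the expected output in the question.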

Answer 2

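I would use a map for this. A map stores a value for a given key. You can use your words as the keys and their counts as the values. Using a map makes your code easier to understand, shorter, and less error-prone. For example: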

import java.util.Scanner;
import java.util.Map;
import java.util.HashMap;

public class MyClass {
    public static void main(String args[]) {
        HashMap<String, Integer> frequencies = new HashMap<>();
        Scanner s = new Scanner(System.in);

        int wordCount = s.nextInt();
  
        for (int i = 0; i < wordCount; ++i) {
            String word = s.next();
            int count = frequencies.getOrDefault(word, 0);
            frequencies.put(word, count + 1);
        }
  
        for (Map.Entry<String, Integer> item : frequencies.entrySet()) {
            System.out.println(item.getKey() + ": " + item.getValue());
        }
    }
}
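A map prints each distinct word once, whereas the expected output in the question has one line per input word, duplicates included. A minimal sketch of one way to get that format is to keep the words in input order and look each count up in the map. The names `PerWordFrequency` and `formatFrequencies` are hypothetical, and the word list is hard-coded here instead of being read from a `Scanner`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PerWordFrequency {
    // Hypothetical helper: returns one "word - count" line per input word, in input order.
    static List<String> formatFrequencies(List<String> words) {
        // First pass: count every word.
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.put(word, counts.getOrDefault(word, 0) + 1);
        }
        // Second pass: emit a line for each input position, looking up the final count.
        List<String> lines = new ArrayList<>();
        for (String word : words) {
            lines.add(word + " - " + counts.get(word));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pickle", "test", "rick", "Pickle", "test", "pickle");
        for (String line : formatFrequencies(words)) {
            System.out.println(line);
        }
    }
}
```

Because counting finishes before any line is produced, the first occurrence of `pickle` already sees the count 2.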

Answer 3

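Building a HashMap that holds the frequency of each word is the most efficient approach, since it needs only a single pass over the words.

One way to do this is with the merge() method, which was added to the Map interface in Java 8. It takes three arguments: a key, a value to associate with that key if it is not yet in the map, and a function that is evaluated when the key is already present, in order to merge the previous value with the new one.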

// Requires: java.util.HashMap, java.util.Map, java.util.Scanner
public static void main(String[] args) {
    Scanner scanner = new Scanner(System.in);
    int numberWords = scanner.nextInt();
    
    Map<String, Integer> frequencies = new HashMap<>();
    for (int i = 0; i < numberWords; i++) {
        frequencies.merge(scanner.next(), 1, Integer::sum);
    }
    
    frequencies.forEach((k, v) -> System.out.println(k + " -> " + v));
}

Output:

Pickle -> 1
test -> 2
rick -> 1
pickle -> 2

If you are comfortable with the Stream API, you can solve this with the built-in collector toMap() (note that the input can be read directly inside the stream pipeline):

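// Requires: java.util.Map, java.util.Scanner, java.util.function.Function,
//           java.util.stream.Collectors, java.util.stream.IntStream
public static void main(String[] args) {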
    Scanner scanner = new Scanner(System.in);
    
    Map<String, Integer> frequencies = IntStream.range(0,scanner.nextInt())
        .mapToObj(i -> scanner.next())
        .collect(Collectors.toMap(
            Function.identity(),
            i -> 1,
            Integer::sum
        ));
    
    frequencies.forEach((k, v) -> System.out.println(k + " -> " + v));
}

Output:

Pickle -> 1
test -> 2
rick -> 1
pickle -> 2

Answer 4

The problem is in this statement:

if (wordsList[i].equals(wordsList[j]))

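If j > i, then wordsList[j] is null. This happens because you read the words inside the outer for loop, so inside the inner for loop the words at indexes greater than i have not been read yet.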

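So when the comparison runs, the second word, at any index j > i, is null. Calling equals(null) simply returns false, so later occurrences of the current word are never counted for the earlier positions.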
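The effect can be reproduced with a small sketch that keeps the loop structure from the question but takes the words from an array instead of a `Scanner`. The class name `PartialArrayDemo` and the method `countWhileReading` are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;

class PartialArrayDemo {
    // Mirrors the question's loops: each word is stored and counted in the same iteration.
    static int[] countWhileReading(String[] input) {
        String[] wordsList = new String[input.length];
        int[] frequency = new int[input.length];
        for (int i = 0; i < input.length; ++i) {
            wordsList[i] = input[i]; // stands in for scnr.next()
            for (int j = 0; j < input.length; ++j) {
                // For j > i, wordsList[j] is still null and equals(null) is false,
                // so words that appear later in the input are not counted yet.
                if (wordsList[i].equals(wordsList[j])) {
                    frequency[i]++;
                }
            }
        }
        return frequency;
    }

    public static void main(String[] args) {
        String[] input = {"pickle", "test", "rick", "Pickle", "test", "pickle"};
        System.out.println(Arrays.toString(countWhileReading(input))); // [1, 1, 1, 1, 2, 2]
    }
}
```

The counts match the output reported in the question: only the later occurrences see the earlier ones.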

Answer 5

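I used another String variable named currWord to hold the value of userString[i], and then compared currWord with userString[j]. All words are read into the array before any counting starts. Here is the code: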

import java.util.Scanner; 

public class LabProgram {
    public static void main(String[] args) {
        Scanner scnr = new Scanner(System.in);
        // The arrays are allocated below, once the word count is known.
        String[] userString;
        int[] wordFreq;
        String currWord;
        int stringLength;
        int i;
        int j;
      
        stringLength = scnr.nextInt();
        userString = new String[stringLength];
        wordFreq = new int[stringLength];
      
        for (i = 0; i < stringLength; ++i) {
            userString[i] = scnr.next();
            wordFreq[i] = 0;
        }
      
        for (i = 0; i < stringLength; ++i) {
            currWord = userString[i];
            for (j = 0; j < stringLength; ++j) {
                if (userString[j].compareTo(currWord) == 0) {
                    wordFreq[i] = wordFreq[i] + 1;
                }
            }
        }
         
        for (i = 0; i < stringLength; ++i) {
            System.out.println(userString[i] + " - " + wordFreq[i]);
        }
    }
}

Answer 6

import java.util.Scanner; 

public class LabProgram {
   public static void main(String[] args) {
      // Read all words first, then count matches in a separate pass.
      Scanner scnr = new Scanner(System.in);
      int numWords = scnr.nextInt();
      String[] wordsList = new String[numWords];
      int i;
      int j;
      int[] numOccur = new int[numWords];
      

      for (i = 0; i < numWords; ++i) {
         wordsList[i] = scnr.next();
      }
      
      for (i = 0; i < numWords; ++i) {
         for (j = 0; j < numWords; ++j) {
            if (wordsList[i].equals(wordsList[j])) {
               numOccur[i]++;
            }
         }
      }
      
      for (i = 0; i < numWords; ++i) {
         System.out.println(wordsList[i] + " - " + numOccur[i]);
      }
      
   }
}
