// This code compares two files and deletes the stop-list words from the file Algorithm.txt
FileReader reader = new FileReader("C:\\Users\\Sara\\Desktop\\IRP\\Information Retrieval\\Algorithm.txt");
BufferedReader bufferedReader = new BufferedReader(reader);
FileReader readerStopList = new FileReader("C:/Users/Sara/Desktop/IRP/stopwords2.txt");
BufferedReader bufferedReaderStopList = new BufferedReader(readerStopList);
String word, stopword, newWord = "";
while ((word = bufferedReader.readLine()) != null) {
for (int k = 0; k < word.split(" ").length; k++) {
int count = 0;
newWord = word.split(" ")[k];
int n = newWord.length();
if (n > 2) { // this statement skips words of length 2 or less
while ((stopword = bufferedReaderStopList.readLine()) != null) {
for (int j = 0; j < stopword.split(" ").length; j++) {
if (newWord.equalsIgnoreCase(stopword.split(" ")[j])) {
count++;
}
}
}
if (count == 0) {
System.out.println(newWord);
}
}
}
}
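The first time n > 2 is true, all lines are read from bufferedReaderStopList until EOF is reached. This means that every later time n > 2 is true, the inner loop over bufferedReaderStopList is never entered, because readLine() returns null from then on. As a result, count stays 0 and every remaining word longer than two characters is printed, whether it is a stop word or not.
For starters, your code needs to be better structured: at the very least, read the contents of bufferedReaderStopList into an array first. Also avoid splitting the word string multiple times. Do it once, then use the resulting array.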
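The suggestions above could be sketched roughly as follows. This is only one possible arrangement, not the only way to do it: the class name StopWordFilter and the methods loadStopWords and filter are made up for illustration, a HashSet stands in for the array because lookups are simpler, and in-memory StringReader text replaces the two files so the example runs on its own. It also splits on any whitespace rather than a single space and lowercases words instead of calling equalsIgnoreCase, which is an assumption about what you want.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopWordFilter {

    // Reads the stop list exactly once and keeps every token in memory (lowercased).
    static Set<String> loadStopWords(BufferedReader in) throws IOException {
        Set<String> stopWords = new HashSet<>();
        String line;
        while ((line = in.readLine()) != null) {
            for (String token : line.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    stopWords.add(token.toLowerCase(Locale.ROOT));
                }
            }
        }
        return stopWords;
    }

    // Returns the words longer than two characters that are not in the stop list.
    static List<String> filter(BufferedReader in, Set<String> stopWords) throws IOException {
        List<String> kept = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null) {
            String[] words = line.trim().split("\\s+"); // split each line only once
            for (String word : words) {
                if (word.length() > 2 && !stopWords.contains(word.toLowerCase(Locale.ROOT))) {
                    kept.add(word);
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data; replace the StringReaders with FileReaders for real files.
        String stopListText = "the is of\nand a";
        String documentText = "The algorithm is the finite sequence of steps";

        try (BufferedReader stopReader = new BufferedReader(new StringReader(stopListText));
             BufferedReader docReader = new BufferedReader(new StringReader(documentText))) {
            Set<String> stopWords = loadStopWords(stopReader);
            System.out.println(filter(docReader, stopWords)); // prints [algorithm, finite, sequence, steps]
        }
    }
}
```

Because the stop list is held in memory, it no longer matters that the reader hits EOF after the first pass: every word is checked against the same set.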