Remove stop words from a string in ASP.NET C#

Question (votes: 0, answers: 4)
I'm having trouble writing code that removes stop words from a string. Here is my code:

    String Review = "The portfolio is fine except for the fact that the last movement of sonata #6 is missing. What should one expect?";
    string[] arrStopword = new string[] {
        "a", "i", "it", "am", "at", "on", "in", "to", "too", "very",
        "of", "from", "here", "even", "the", "but", "and", "is", "my",
        "them", "then", "this", "that", "than", "though", "so", "are"
    };
    StringBuilder sbReview = new StringBuilder(Review);
    foreach (string word in arrStopword)
    {
        sbReview.Replace(word, "");
    }
    Label1.Text = sbReview.ToString();

When I run it, I get Label1.Text = "The portfolo s fne except for fct tht lst movement st #6 s mssng. Wht should e expect? "

I expected it to return "portofolio fine except for fact last movement sonata #6 is missing. what should one expect?"

Does anyone know how to fix this?


c# asp.net stop-words
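4 Answers
0
votes
You could search for " a ", " i ", and so on (each stop word with a space on both sides), so that the program only removes these strings when they appear as separate words. Replace each match with a single space to keep the spacing intact.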
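
A minimal sketch of that idea, assuming a hypothetical helper class `PaddedReplace` (not from the question). It pads the text with spaces so the first and last words can match too, and repeats each replacement because two identical stop words in a row share one space. The match is case-sensitive, so "The" survives when the list only contains "the", and a word with punctuation attached (such as "is.") is not matched either.

```csharp
using System;
using System.Text;

public static class PaddedReplace
{
    // Hypothetical helper: removes a stop word only where it has a space on both sides.
    public static string RemoveStopWords(string text, string[] stopWords)
    {
        // Pad both ends so the first and last words are also surrounded by spaces.
        var sb = new StringBuilder(" " + text + " ");
        foreach (string word in stopWords)
        {
            string padded = " " + word + " ";
            int lengthBefore;
            do
            {
                // One pass misses the second of two adjacent identical stop words
                // (they share a space), so repeat until the length stops changing.
                lengthBefore = sb.Length;
                sb.Replace(padded, " ");
            } while (sb.Length != lengthBefore);
        }
        // Drop the padding added at the start.
        return sb.ToString().Trim();
    }
}
```

With this, "this" and "missing" are left alone even though both contain "is", which is the behaviour the question was after.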

2
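votes
You can solve this with LINQ. First use the Split function to turn the string into a list of strings separated by a space (" "), then use Except to get the words the result should contain, and finally apply string.Join.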
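
A rough sketch of the Split / Except / string.Join steps, assuming a hypothetical class name `LinqStopWords`. The case-insensitive comparer is my addition so that "The" is treated like "the". One caveat worth knowing: Except is a set operation, so a non-stop word that occurs more than once in the input appears only once in the output.

```csharp
using System;
using System.Linq;

public static class LinqStopWords
{
    // Hypothetical helper following the Split -> Except -> Join steps.
    public static string RemoveStopWords(string text, string[] stopWords)
    {
        // Split on spaces and skip the empty entries produced by double spaces.
        string[] words = text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // Set difference: drops every stop word, ignoring case.
        // Note that Except also collapses repeated words into one.
        var kept = words.Except(stopWords, StringComparer.OrdinalIgnoreCase);

        // Put the remaining words back together with single spaces.
        return string.Join(" ", kept);
    }
}
```

In the question's page this would be used as something like `Label1.Text = LinqStopWords.RemoveStopWords(Review, arrStopword);`.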

1
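votes
The problem is that you are comparing substrings, not words. You need to split the original text, remove the matching items, and then join the rest back together.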
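
One way to sketch split, remove, and join so that repeated words are kept, assuming a hypothetical class `WordFilter` and a case-insensitive `HashSet<string>` for the lookup (both are my choices, not the answer's). Words are compared as whole tokens, so "missing." with its full stop is not treated as a stop word.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordFilter
{
    // Hypothetical helper: filters whole words and keeps duplicates and order.
    public static string RemoveStopWords(string text, IEnumerable<string> stopWords)
    {
        // A hash set gives fast, case-insensitive membership checks.
        var stopSet = new HashSet<string>(stopWords, StringComparer.OrdinalIgnoreCase);

        // Split into words, keep only the ones that are not stop words.
        IEnumerable<string> kept = text
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(word => !stopSet.Contains(word));

        // Rejoin with single spaces.
        return string.Join(" ", kept);
    }
}
```

Because the comparison is per word, the "i" inside "portfolio" or the "is" inside "this" is never touched, which is what went wrong with the substring-based Replace.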

0
votes
Or you can use the dotnet-stop-words package. Just call its RemoveStopWords method.