ConcurrentHashMap值未更新

问题描述 投票:-1回答:1

我正在尝试使用一组包含单词的文档来创建一个简单的多线程字典/索引。字典存储在带有字符串键和Vector值的ConcurrentHashMap中。对于字典中的每个单词,都有一个外观列表,该外观列表是带有一系列Tuple对象(自定义对象)的向量。(在我的情况下,Tuple是2个数字的组合)。

每个线程将一个文档作为输入,在其中找到所有单词,然后尝试更新ConcurrentHashMap。另外,我必须指出,2个线程可能会尝试通过添加其值(新的元组)来更新Map的相同键。我只对Vector进行写操作。

下面您将看到用于提交新线程的代码。如您所见,我将字典作为输入,该字典是带有字符串键和向量值的ConcurrentHashMap

public void run(Crawler crawler) throws InterruptedException {
        while (!crawler.getFinishedPages().isEmpty()) {
            this.INDEXING_SERVICE.submit(new IndexingTask(this.dictionary, sources, 
                                                          crawler.getFinishedPages().take()));
        }
        this.INDEXING_SERVICE.shutdown();
}

下面您可以看到和索引线程的代码:

public class IndexingTask implements Runnable {

    private ConcurrentHashMap<String, Vector<Tuple>> dictionary;
    private HtmlDocument document;

    public IndexingTask(ConcurrentHashMap<String, Vector<Tuple>> dictionary,
                        ConcurrentHashMap<Integer, String> sources, HtmlDocument document) {
        this.dictionary = dictionary;
        this.document = document;
        sources.putIfAbsent(document.getDocId(), document.getURL());
    }

    @Override
    public void run() {

        for (String word : document.getTerms()) {

            this.dictionary.computeIfAbsent(word, k -> new Vector<Tuple>())
                    .add(new Tuple(document.getDocId(), document.getWordFrequency(word)));

        }
    }
}

该代码似乎正确,但是字典未正确更新。我的意思是原始词典中缺少某些单词(键),而其他一些键在其Vector中的项目较少。

我已经进行了一些调试,我发现在终止线程实例之前,它已计算出正确的键和值。尽管线程中作为输入提供的原始词典(请看第一段代码)没有正确更新,您有任何想法或建议吗?

java dictionary java-threads concurrenthashmap
1个回答
0
投票
import java.util.Arrays; import java.util.List; import java.util.Vector; import java.util.concurrent.*; import java.util.concurrent.atomic.AtomicInteger; class Tuple { private Integer key; private String value; public Tuple(Integer key, String value) { this.key = key; this.value = value; } @Override public String toString() { return "(" + key + ", " + value + ")"; } } class HtmlDocument { private int docId; private String URL; private List<String> terms; public int getDocId() { return docId; } public void setDocId(int docId) { this.docId = docId; } public String getURL() { return URL; } public void setURL(String URL) { this.URL = URL; } public List<String> getTerms() { return terms; } public void setTerms(List<String> terms) { this.terms = terms; } public String getWordFrequency(String word) { return "query"; } } class IndexingTask implements Runnable { private ConcurrentHashMap<String, Vector<Tuple>> dictionary; private HtmlDocument document; public IndexingTask(ConcurrentHashMap<String, Vector<Tuple>> dictionary, ConcurrentHashMap<Integer, String> sources, HtmlDocument document) { this.dictionary = dictionary; this.document = document; sources.putIfAbsent(document.getDocId(), document.getURL()); } @Override public void run() { for (String word : document.getTerms()) { this.dictionary.computeIfAbsent(word, k -> new Vector<Tuple>()) .add(new Tuple(document.getDocId(), document.getWordFrequency(word))); } Crawler.RUNNING_TASKS.decrementAndGet(); } } class Crawler { protected BlockingQueue<HtmlDocument> finishedPages = new LinkedBlockingQueue<>(); public static final AtomicInteger RUNNING_TASKS = new AtomicInteger(); public BlockingQueue<HtmlDocument> getFinishedPages() { return finishedPages; } } public class ConcurrentHashMapExample { private ConcurrentHashMap<Integer, String> sources = new ConcurrentHashMap<>(); private ConcurrentHashMap<String, Vector<Tuple>> dictionary = new ConcurrentHashMap<>(); private static final ExecutorService INDEXING_SERVICE = Executors.newSingleThreadExecutor(); public void run(Crawler crawler) throws InterruptedException { while (!crawler.getFinishedPages().isEmpty()) { Crawler.RUNNING_TASKS.incrementAndGet(); this.INDEXING_SERVICE.submit(new IndexingTask(this.dictionary, sources, crawler.getFinishedPages().take())); } //when you call ```this.INDEXING_SERVICE.shutdown()``` may 'IndexingTask' has not run yet while (Crawler.RUNNING_TASKS.get() > 0) Thread.sleep(3); this.INDEXING_SERVICE.shutdown(); } public ConcurrentHashMap<Integer, String> getSources() { return sources; } public ConcurrentHashMap<String, Vector<Tuple>> getDictionary() { return dictionary; } public static void main(String[] args) throws Exception { ConcurrentHashMapExample example = new ConcurrentHashMapExample(); Crawler crawler = new Crawler(); HtmlDocument document = new HtmlDocument(); document.setDocId(1); document.setURL("http://127.0.0.1/abc"); document.setTerms(Arrays.asList("hello", "world")); crawler.getFinishedPages().add(document); example.run(crawler); System.out.println("source: " + example.getSources()); System.out.println("dictionary: " + example.getDictionary()); } }

输出:

source: {1=http://127.0.0.1/abc}
dictionary: {world=[(1, query)], hello=[(1, query)]}

我认为,在您的企业中,应该使用“生产者”,“消费者”设计模式

© www.soinside.com 2019 - 2024. All rights reserved.