Java树形集的奇怪行为

问题描述 投票:0回答:1

我想用平衡BST代替优先队列来实现Prim的最小生成树算法。我的实现是在Java中进行的。由于Java已经有一个以TreeSet形式的红黑树库实现,我想用它来代替我自己的自定义实现。

一个典型的使用Min Priority Queue的Prim的实现需要O(ElogE),而我这个实现背后的意图是将时间复杂度降低到O(ElogV)。我相信这也可以使用索引优先队列(IPQ)来实现,但我选择了BST版本,因为有自平衡BST的库实现(在Java和C++中都有)。

对于我测试过的例子,这个实现似乎工作得很好,并且产生了正确的结果。但当我做了更深层次的检查,以确保TC实际上是O(ElogV)时,我发现Java TreeSet对我来说表现得很奇怪,原因是什么。

这是我的实现。

    package graph;

    import java.util.Comparator;
    import java.util.TreeSet;

    /**
     * Implementation of Prim's algorithm (eager version) to
     * find Minimum Spanning Tree using a self-balancing BST
     * Time Complexity: O(ElogV)
     * 
     * This implementation uses a self-balancing BST (specifically Red-Black Tree)
     * 
     * We can do eager Prim's implementation using an Indexed Priority Queue (IPQ) as well
     * 
     * Comparison: IPQ vs BST
     * To get next best edge in IPQ, we pop the min element from root, and 
     * then heapify the tree, which overall takes O(lgn). To get next best edge in 
     * BST, it would take O(lgn) as well, and then we’ll have to delete that entry 
     * which would take another O(lgn), but overall it is still O(lgn)
     * 
     * Insertion into both BST and IPQ takes O(lgn) anyway
     * 
     * Update in IPQ takes O(lgn). Update in BST as well can be done in 
     * O(lgn) [search the element in O(lgn) then delete that entry in another 
     * O(lgn) and then insert new entry with updated edge weight (and source node) 
     * in yet another O(lgn). Intotal, it takes 3*logn but overall still O(lgn)]
     *
     */
    public class PrimsMstUsingSelfBalancingBST extends Graph {

        private int n, m, edgeCount;
        private boolean[] visited;
        private Edge[] mst;
        private double cost;
        private TreeSet<Edge> sortedSet;

        public PrimsMstUsingSelfBalancingBST(int numberOfVertices) {
            super(numberOfVertices);
            n = numberOfVertices;
        }

        public Double mst(int s) {
            m = n - 1; // number of expected edges in mst
            edgeCount = 0;
            mst = new Edge[m];
            visited = new boolean[n];
            sortedSet = new TreeSet<>(getComparator());

            relaxEdgesAtNode(s);

            while (!sortedSet.isEmpty() && edgeCount != m) {
                System.out.println(sortedSet);
                // pollFirst() retrieves and removes smallest element from TreeSet
                Edge edge = sortedSet.pollFirst();
                int nodeIndex = edge.to;

                // skip edges pointing to already visited nodes
                if (visited[nodeIndex]) continue;

                mst[edgeCount] = edge;
                edgeCount++;
                cost += edge.wt;

                relaxEdgesAtNode(nodeIndex);
            }

            return (edgeCount == m) ? cost : null;
        }

        private void relaxEdgesAtNode(int index) {
            visited[index] = true;

            for (Edge edge : adjList.get(index))  {
                int to = edge.to;

                if (visited[to]) continue;

                if (!sortedSet.contains(edge)) {
                    sortedSet.add(edge);
                }
                else {
                    Edge existingEdge = search(edge);
                    if (existingEdge.wt > edge.wt) {
                        sortedSet.remove(existingEdge);
                        sortedSet.add(edge);
                    }
                }
            }
        }

        private Comparator<Edge> getComparator() {
            return new Comparator<Edge>() {
                @Override
                public int compare(Edge e1, Edge e2) {
                    // Java TreeSet is implemented in a way that it uses 
                    // Comparable/Comparator logics for all comparisons.

                    // i.e., it will use this comparator to do comparison 
                    // in contains() method.

                    // It will use this same comparator to do comparison 
                    // during remove() method.

                    // It will also use this same comparator, to perform 
                    // sorting during add() method.

                    // While looking up an edge from contains() or remove(), 
                    // we need to perform check based on destinationNodeIndex,

                    // But to keep elements in sorted order during add() operation
                    // we need to compare elements based on their edge weights

                    // For correct behavior of contains() and remove()
                    if (e1.to == e2.to) return 0;

                    // For correct sorting behavior
                    if (e1.wt > e2.wt) return 1;
                    else if (e1.wt < e2.wt) return -1;

                    // Return -1 or 1 to make sure that different edges with equal 
                    // weights are not ignored by TreeSet.add()
                    // this check can be included in either 'if' or 'else' part 
                    // above. Keeping this separate for readability.
                    return -1;
                }
            };
        }

        // O(log n) search in TreeSet
        private Edge search(Edge e) {
            Edge ceil  = sortedSet.ceiling(e); // smallest element >= e
            Edge floor = sortedSet.floor(e);   // largest element <= e

            return ceil.equals(floor) ? ceil : null; 
        }

        public static void main(String[] args) {
            example1();
        }

        private static void example1() {
            int n = 8;
            PrimsMstUsingSelfBalancingBST graph = 
                    new PrimsMstUsingSelfBalancingBST(n);

            graph.addEdge(0, 1, true, 10);
            graph.addEdge(0, 2, true, 1);
            graph.addEdge(0, 3, true, 4);
            graph.addEdge(2, 1, true, 3);
            graph.addEdge(2, 5, true, 8);
            graph.addEdge(2, 3, true, 2);
            graph.addEdge(3, 5, true, 2);
            graph.addEdge(3, 6, true, 7);
            graph.addEdge(5, 4, true, 1);
            graph.addEdge(5, 7, true, 9);
            graph.addEdge(5, 6, true, 6);
            graph.addEdge(4, 1, true, 0);
            graph.addEdge(4, 7, true, 8);
            graph.addEdge(6, 7, true, 12);

            int s = 0;
            Double cost = graph.mst(s);
            if (cost != null) {
                System.out.println(cost); // 20.0
                for (Edge e : graph.mst)
                    System.out.println(String.format("Used edge (%d, %d) with cost: %.2f", e.from, e.to, e.wt));
                /*
                 * Used edge (0, 2) with cost: 1.00
                 * Used edge (2, 3) with cost: 2.00
                 * Used edge (3, 5) with cost: 2.00
                 * Used edge (5, 4) with cost: 1.00
                 * Used edge (4, 1) with cost: 0.00
                 * Used edge (5, 6) with cost: 6.00
                 * Used edge (4, 7) with cost: 8.00
                 */
            }
            else {
                System.out.println("MST not found!");
            }
        }
    }

下面是我测试的不定向加权图(代码中也使用了同样的例子)

example image

我面临的问题是,TreeSet似乎增加了重复的条目,因为它的contains()方法有时会返回false,即使已经存在一个具有相同键的对应条目(在这种情况下是边的目的节点)。

下面是上述程序的输出。

[{from=0, to=2, weight=1.00}, {from=0, to=3, weight=4.00}, {from=0, to=1, weight=10.00}]
[{from=2, to=3, weight=2.00}, {from=2, to=1, weight=3.00}, {from=2, to=5, weight=8.00}, {from=0, to=1, weight=10.00}]
[{from=3, to=5, weight=2.00}, {from=2, to=1, weight=3.00}, {from=3, to=6, weight=7.00}, {from=0, to=1, weight=10.00}]
[{from=5, to=4, weight=1.00}, {from=2, to=1, weight=3.00}, {from=5, to=6, weight=6.00}, {from=5, to=7, weight=9.00}, {from=0, to=1, weight=10.00}]
[{from=4, to=1, weight=0.00}, {from=5, to=6, weight=6.00}, {from=4, to=7, weight=8.00}, {from=0, to=1, weight=10.00}]
[{from=5, to=6, weight=6.00}, {from=4, to=7, weight=8.00}, {from=0, to=1, weight=10.00}]
[{from=4, to=7, weight=8.00}, {from=0, to=1, weight=10.00}]
20.0
Used edge (0, 2) with cost: 1.00
Used edge (2, 3) with cost: 2.00
Used edge (3, 5) with cost: 2.00
Used edge (5, 4) with cost: 1.00
Used edge (4, 1) with cost: 0.00
Used edge (5, 6) with cost: 6.00
Used edge (4, 7) with cost: 8.00

可以清楚地看到,即使目的节点1已经有一个条目{from=0, to=1, weight=10.00})[输出的第1行],它也会为其添加另一个条目{from=2, to=1, weight=3.00}[输出的第2行],而不是更新现有的条目。

当我通过在自定义比较器中添加一个断点来调试时,我注意到比较器从来没有为现有的条目被调用,因此与现有条目的比较没有发生。例如在本例中,当处理边缘{from=2,to=1,weight=3.00}时,Comparator.compare()对条目{from=2,to=3,weight=2.00}和{from=2,to=5,weight=8.00}被调用,但对条目{from=0,to=1,weight=10. 00},因此它的结论是没有目的节点1的条目,所以它增加了一个新的条目,因此我得到目的节点1的两个条目[输出的第2行]。

我怀疑这与Java Collections框架中对象的不可变性和并发修改限制有关。但我无法理解问题的根本原因。

感谢任何帮助。

java treeset minimum-spanning-tree prims-algorithm
1个回答
2
投票

Comparator 违反了其合同,例如。

实施者必须确保 sgn(compare(x, y)) == -sgn(compare(y, x)) 对于所有 xy. (这意味着 compare(x, y) 必须在以下情况下抛出异常 compare(y, x) 抛出一个异常)。)

这就是 compare 方法,不含所有注释。

public int compare(Edge e1, Edge e2) {
    if (e1.to == e2.to) return 0;

    if (e1.wt > e2.wt) return 1;
    else if (e1.wt < e2.wt) return -1;

    return -1;
}

例如,你有两条重量为1的边。

a = {from=0, to=2, weight=1.00}
b = {from=5, to=4, weight=1.00}

由于它们的重量不同 to 值,但相同 wt 值,该方法返回 -1,而不考虑参数顺序,即 compare(a, b) = -1compare(b, a) = -1.

这违反了上面列出的规则,将导致不可预知的行为。

© www.soinside.com 2019 - 2024. All rights reserved.