Storing a large map in memory

Question description  Votes: 3  Answers: 2

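First, the context: I have a very large graph that takes about 4GB to store, with about 3M nodes and 34M edges. My program takes this large graph and recursively builds smaller graphs from it. At each level of recursion I have two graphs - the original graph and the graph created from the original. The recursion continues until the graph is reduced to a very small graph of, say, about 10 nodes.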

Since I need these graphs for the entire execution of the program, memory efficiency is critical to my application.

Now, here is the problem I am currently having. This is the algorithm for creating the smaller graph from the larger one:

public static Graph buildByTriples(Graph g, ArrayList<Integer> seeds) {
    ArrayList<Edge> edges = new ArrayList<>(g.getEdgeCount());
    for (int i = 0; i < g.size(); i++) {
        for (Edge e : g.adj(i)) {
            int v = e.getEndpoint(i);
            if (i < v) {
                edges.add(e);
            }
        }
    }

    Table<Integer, Integer, Double> coarseEgdes = HashBasedTable.create(seeds.size(),seeds.size());
    //compute coarse weights
    edges.stream().forEach((e) -> {
        int v = e.getV();
        int u = e.getU();
        if (g.isC(u) && g.isC(v)) {
            addToTable(coarseEgdes, u, v, e.getWeight());
        }else if(!g.isC(u) && g.isC(v)){ //F-C
            for(Edge cEdge: g.cAdj(u)){//get coarse neighbors of the fine edges
                int nb = cEdge.getEndpoint(u);
                if(nb != v){
                    addToTable(coarseEgdes, v, nb, cEdge.getPij() * e.getWeight());

                }
            }
        }else if(g.isC(u) && !g.isC(v)){//C-F
            for(Edge cEdge: g.cAdj(v)){//get coarse neighbors of the fine edges
                int nb = cEdge.getEndpoint(v);
                if(nb != u){
                    addToTable(coarseEgdes, u, nb, cEdge.getPij() * e.getWeight());
                }
            }
        }else{//F-F
            for(Edge cEdgeU: g.cAdj(u)){//get coarse neighbors of the fine edges
                int uNb = cEdgeU.getEndpoint(u);
                for(Edge cEdgeV: g.cAdj(v)){
                    int vNb = cEdgeV.getEndpoint(v);
                    if(uNb != vNb){
                        addToTable(coarseEgdes, uNb, vNb, cEdgeU.getPij() * e.getWeight() * cEdgeV.getPij());
                    }
                }
            }
        }
    });

    return createGraph(g, coarseEgdes); //use the edges to build new graph. Basically loops through coarseEdges and add edge and weight to the new graph.
}

private static void addToTable(Table<Integer, Integer,Double> tbl, int r, int c, double val){
    int mn = Math.min(r, c);//the smaller of the two nodeIds
    int mx = Math.max(r, c);//the larger of the two nodeIds
    if(tbl.contains(mn, mx)){
        tbl.put(mn, mx, tbl.get(mn, mx) + val);
    }else{
        tbl.put(mn, mx,val);
    }
}

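Now, when I do this, I quickly run out of memory. I profiled the application with YourKit, and memory usage goes through the roof (> 6GB before it runs out), and as a result CPU usage does too. coarseEdges can get really big. Is there a more memory-efficient Map implementation that scales with large datasets? Or is there a better way to do this without storing coarseEdges?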

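PS: Please note that my graph cannot retrieve edge (u,v) in constant time. It is basically a list of lists, which gives better performance for other critical parts of my application.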

**Also see my graph implementation code below:**
public class Graph{
    private final int SIZE;
    private final EdgeList[] nodes;
    private final float[] volumes;
    private final double[] weightedSum;
    private final double[] weightedCoarseSum;
    private final int[] nodeDegrees;
    private final int[] c_nodeDegrees;
    private int edge_count=0;
    private final boolean[] coarse;
    private final EdgeList[] coarse_neighbors;
    public Graph(int SIZE){
        this.SIZE =SIZE;
        nodes = new EdgeList[SIZE];
        coarse_neighbors = new EdgeList[SIZE];

        volumes = new float[SIZE];
        coarse = new boolean[SIZE];

        //initialize data
        weightedSum = new double[SIZE];
        weightedCoarseSum = new double[SIZE];
        nodeDegrees= new int[SIZE];
        c_nodeDegrees = new int[SIZE];

        for(int i=0;i<SIZE;i++){
            nodes[i]=new EdgeList();
            coarse_neighbors[i] = new EdgeList();
            volumes[i]=1;
        }
    }

    public void addEdge(int u, int v, double w){
        //graph is undirected
        //In order to traverse edges in order such that u < v. We store edge u,v such that u<v
        Edge e=null;
        if(u<v){
            e = new Edge(u,v,w);
        }else if(u>v){
            e = new Edge(v,u,w);
        }else{
            throw new UnsupportedOperationException("Self loops not allowed in graph"); //TODO: Need a graph validation routine
        }

        nodes[u].add(e);
        nodes[v].add(e);

        //update the weighted sum of each edge
        weightedSum[u] += w;
        weightedSum[v] += w;

        //update the degree of each edge
        ++nodeDegrees[u];
        ++nodeDegrees[v];

        ++edge_count;
    }

    public int size(){
        return SIZE;
    }

    public EdgeList adj(int v){
        return nodes[v];
    }

    public EdgeList cAdj(int v){
        return coarse_neighbors[v];
    }

    public void sortAdj(int u, Comparator<Edge> c){
        nodes[u].sort(c);
    }

    public void sortCoarseAdj(int u, Comparator<Edge> c){
        coarse_neighbors[u].sort(c);
    }

    public void setCoarse(int node, boolean c){
        coarse[node] = c;
        if(c){
            //update the neighborHood of node
            for(Edge e: adj(node)){
                int v = e.getEndpoint(node);
                coarse_neighbors[v].add(e);
                weightedCoarseSum[v] += e.getWeight();
                ++c_nodeDegrees[v];
            }
        }
    }

    public int getEdgeCount(){
        return edge_count;
    }

    public boolean isC(int id){
        return coarse[id];
    }

    public double weightedDegree(int node){
        return weightedSum[node];
    }

    public double weightedCoarseDegree(int node){
        return weightedCoarseSum[node];
    }

    public int degree(int u){
        return nodeDegrees[u];
    }

    public int cDegree(int u){
        return c_nodeDegrees[u];
    }

    public Edge getCNeighborAt(int u,int idx){
        return coarse_neighbors[u].getAt(idx);
    }

    public float volume(int u){
        return volumes[u];
    }

    public void setVolume(int node, float v){
        volumes[node] = v;
    }

    @Override
    public String toString() {
        return "Graph[nodes:"+SIZE+",edges:"+edge_count+"]";
    }

}


//Edges are first class objects.
public class Edge {
    private boolean deleted=false;
    private int u;
    private int v;
    private double weight;
    private double pij;
    private double algebraicDist = (1/Constants.EPSILON);

    public Edge(int u, int v, double weight) {
        this.u = u;
        this.v = v;
        this.weight = weight;
    }

    public Edge() {
    }

    public int getU() {
        return u;
    }

    public void setU(int u) {
        this.u = u;
    }

    public int getV() {
        return v;
    }

    public void setV(int v) {
        this.v = v;
    }

    public int getEndpoint(int from){
        if(from == v){
            return u;
        }

        return v;
    }

    public double getPij() {
        return pij;
    }

    public void setPij(double pij) {
        this.pij = pij;
    }

    public double getAlgebraicDist() {
        return algebraicDist;
    }

    public void setAlgebraicDist(double algebraicDist) {
        this.algebraicDist = algebraicDist;
    }

    public boolean isDeleted() {
        return deleted;
    }

    public void setDeleted(boolean deleted) {
        this.deleted = deleted;
    }

    public double getWeight() {
        return weight;
    }

    public void setWeight(double weight) {
        this.weight = weight;
    }

    @Override
    public String toString() {
        return "Edge[u:"+u+", v:"+v+"]";
    }
}


// The Edge iterable
public class EdgeList implements Iterable<Edge>{
    private final ArrayList<Edge> data = new ArrayList<>();

    public void add(Edge e){
        data.add(e);
    }

    @Override
    public Iterator<Edge> iterator() {
        Iterator<Edge> it = new IteratorImpl();
        return it;
    }

    private class IteratorImpl implements Iterator<Edge> {

        public IteratorImpl() {
        }
        private int currentIndex = 0;
        private final int N = data.size();
        @Override
        public boolean hasNext() {

            //skip deleted
            while(currentIndex < N && data.get(currentIndex).isDeleted()){
                currentIndex++;
            }

            return currentIndex < N;
        }

        @Override
        public Edge next() {
            return data.get(currentIndex++);
        }

        @Override
        public void remove() {
            throw new UnsupportedOperationException();
        }
    }

    public Edge getAt(int idx){
        return data.get(idx);
    }

    public void sort(Comparator<Edge> c){
        data.sort(c);
    }
}
java performance graph guava
2 Answers
4 votes

A few stabs in the dark here - you will need to implement them to see how much they help.

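1) You could try using a composite key (int, int) with a HashMap instead of the Guava Table. For plain edge weights this would certainly be more efficient. If you need to query the edges going out of a given vertex it is less clear-cut, and you will have to weigh the CPU vs. memory tradeoff.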

2) If you go with a plain HashMap, you could consider using one of the off-heap implementations. Take a look at https://github.com/OpenHFT/Chronicle-Map for example, it might

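3) If you stay in memory and want to squeeze out some extra space, you can do some dirty tricks with primitive maps. Use a long->double map, e.g. http://labs.carrotsearch.com/download/hppc/0.4.1/api/com/carrotsearch/hppc/LongDoubleMap.html or http://trove4j.sourceforge.net/javadocs/gnu/trove/map/hash/TLongDoubleHashMap.html, encode your 2 x int vertex pair as a long, and see how much it helps. On a 64-bit JVM, an Integer can take 16 bytes (assuming compressed oops) and a Double 24 bytes - that gives 32 + 24 = 56 bytes per entry for the boxed key pair and value alone, versus 8 + 8 with a primitive map.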
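
To make option 3 concrete, here is a minimal sketch of the idea, not the HPPC or Trove API. The class `PackedEdgeWeights` and its methods `pack`, `first`, `second`, `add` and `get` are hypothetical names made up for illustration. It assumes node ids are non-negative ints (so -1 can mark an empty slot) and stores the smaller id in the high 32 bits, mirroring what addToTable does with min/max. A real primitive-map library would replace the hand-rolled hash table.

```java
import java.util.Arrays;

/** Sketch of a long -> double accumulator backed by primitive arrays (hypothetical, for illustration). */
final class PackedEdgeWeights {
    private static final long EMPTY = -1L; // assumption: node ids are non-negative, so no packed key equals -1

    private long[] keys;
    private double[] values;
    private int size;

    PackedEdgeWeights(int expectedEntries) {
        int cap = 16;
        while (cap < 2L * expectedEntries) cap <<= 1; // power-of-two capacity, at most half full
        keys = new long[cap];
        values = new double[cap];
        Arrays.fill(keys, EMPTY);
    }

    /** Encodes an undirected edge as one long: smaller id in the high 32 bits, larger id in the low 32 bits. */
    static long pack(int u, int v) {
        return ((long) Math.min(u, v) << 32) | (Math.max(u, v) & 0xFFFFFFFFL);
    }

    static int first(long key)  { return (int) (key >>> 32); } // smaller node id
    static int second(long key) { return (int) key; }          // larger node id

    /** Adds delta to the weight stored under key, inserting the key if it is absent. */
    void add(long key, double delta) {
        if (size * 2 >= keys.length) grow();
        int mask = keys.length - 1;
        int i = slot(key, mask);
        while (keys[i] != EMPTY && keys[i] != key) i = (i + 1) & mask; // linear probing
        if (keys[i] == EMPTY) { keys[i] = key; size++; }
        values[i] += delta;
    }

    double get(long key, double defaultValue) {
        int mask = keys.length - 1;
        int i = slot(key, mask);
        while (keys[i] != EMPTY) {
            if (keys[i] == key) return values[i];
            i = (i + 1) & mask;
        }
        return defaultValue;
    }

    int size() { return size; }

    private static int slot(long key, int mask) {
        long h = key * 0x9E3779B97F4A7C15L; // multiplicative mixing to spread nearby keys
        return (int) (h >>> 32) & mask;
    }

    private void grow() {
        long[] oldKeys = keys;
        double[] oldValues = values;
        keys = new long[oldKeys.length * 2];
        values = new double[keys.length];
        Arrays.fill(keys, EMPTY);
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldKeys[i] != EMPTY) add(oldKeys[i], oldValues[i]);
        }
    }
}
```

With something like this, the body of addToTable would reduce to a single call such as `weights.add(PackedEdgeWeights.pack(r, c), val)`. Each slot costs 16 bytes (one long plus one double) with no per-entry objects; the actual cost per stored edge depends on how full the table is kept, so it is worth measuring against the Table version.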


1 vote

I'd suggest giving Guava's ValueGraph a look for cases like these.

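It may also be possible to make the data structure you use for the recursive graphs more efficient; how many recursion steps are there for your data set, and how does the size of the graph change from step to step?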
