Is there a Spark GraphX constructor with a merge function for duplicate vertices?
I have a graph with many duplicate vertices that have different attributes (Long).
    val vertices: RDD[(VertexId, Long)] ...
    val edges: RDD[Edge[Long]] ...
    val graph = Graph(vertices, edges, 0L)
By default, GraphX merges the attributes of duplicate vertices with a default function:
    VertexRDD(vertices, edges, defaultVal, (a, b) => a)
So which attribute stays in the final graph depends on the order of the vertices.
I wonder whether there is a way to set the merge function. For example, I need to merge duplicate vertices using the following function:
    (a, b) => math.min(a, b)
I did not find a public constructor or anything else that allows this.
Do I need to create the graph with the following code?
    val edgeRDD = EdgeRDD.fromEdges(edges)(classTag[ED], classTag[VD])
      .withTargetStorageLevel(edgeStorageLevel).cache()
    val vertexRDD = VertexRDD(vertices, edgeRDD, defaultVertexAttr, (a, b) => math.min(a, b))
      .withTargetStorageLevel(vertexStorageLevel).cache()
    GraphImpl(vertexRDD, edgeRDD)
You've answered your own question, if you are looking for a way to control the merge. Otherwise, you can still use the existing constructor and do:
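    val vertices: RDD[(VertexId, Long)] ...
    val edges: RDD[Edge[Long]] ...
    // VertexRDD.apply expects an EdgeRDD, so wrap the raw edge RDD first
    val edgeRDD = EdgeRDD.fromEdges[Long, Long](edges)
    val mergedVertices = VertexRDD(vertices, edgeRDD, default, mergeFun)
    val graph = Graph(mergedVertices, edges, 0L)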
This is possible since VertexRDD is a subclass of RDD[(VertexId, VD)] (in this case VD is Long).
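For reference, here is a minimal end-to-end sketch of that approach with `math.min` as the merge function. The object name, the local master setting, and the sample vertex and edge data are all made up for illustration, and the sketch assumes a Spark version in which `VertexRDD.apply` takes an `EdgeRDD[_]` as its second argument, which is why the raw edges are wrapped with `EdgeRDD.fromEdges` first.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, EdgeRDD, Graph, VertexId, VertexRDD}
import org.apache.spark.rdd.RDD

object MinMergeGraph {
  def main(args: Array[String]): Unit = {
    // App name and local master are placeholders for illustration only.
    val conf = new SparkConf().setAppName("min-merge-graph").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical data: vertex 1 appears twice with different attributes.
      val vertices: RDD[(VertexId, Long)] =
        sc.parallelize(Seq((1L, 7L), (1L, 3L), (2L, 5L)))
      val edges: RDD[Edge[Long]] =
        sc.parallelize(Seq(Edge(1L, 2L, 1L), Edge(2L, 3L, 1L)))

      // VertexRDD.apply takes an EdgeRDD, so wrap the raw edge RDD first.
      val edgeRDD = EdgeRDD.fromEdges[Long, Long](edges)

      // Duplicate vertex ids are combined with min instead of the default (a, b) => a.
      val mergedVertices: VertexRDD[Long] =
        VertexRDD(vertices, edgeRDD, 0L, (a: Long, b: Long) => math.min(a, b))

      // The vertices are already deduplicated, so the default merge in Graph()
      // has nothing left to merge.
      val graph = Graph[Long, Long](mergedVertices, edgeRDD, 0L)

      graph.vertices.collect().sortBy(_._1).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

If the edges are not needed to decide which vertices exist, deduplicating up front with `vertices.reduceByKey((a, b) => math.min(a, b))` and passing the result to `Graph(...)` should give the same merged attributes; that is a different technique from the `VertexRDD` constructor discussed above, mentioned only as an alternative.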