Is there any Spark GraphX constructor with a merge function for duplicate vertices?


I have a graph with many duplicate vertices that carry different attributes (`Long`).

    val vertices: RDD[(VertexId, Long)] = ...
    val edges: RDD[Edge[Long]] = ...
    val graph = Graph(vertices, edges, 0L)

By default, GraphX merges duplicate vertices' attributes with the default function:

    VertexRDD(vertices, edges, defaultVal, (a, b) => a)

So which attribute ends up in the final graph depends on the order of the vertices.

I wonder if there is a way to set the merge function, because, for example, I need to merge duplicate vertices with the following function:

    (a, b) => math.min(a, b)

I did not find a public constructor or anything else for this.

Do I need to create the graph with the following code?

    val edgeRDD = EdgeRDD.fromEdges(edges)(classTag[ED], classTag[VD])
      .withTargetStorageLevel(edgeStorageLevel).cache()
    val vertexRDD = VertexRDD(vertices, edgeRDD, defaultVertexAttr, (a, b) => math.min(a, b))
      .withTargetStorageLevel(vertexStorageLevel).cache()
    GraphImpl(vertexRDD, edgeRDD)

You've answered part of your own question. If you're looking for a way to control the merge, that is it; otherwise you can still use the existing constructor and do:

    val vertices: RDD[(VertexId, Long)] = ...
    val edges: RDD[Edge[Long]] = ...
    val mergedVertices = VertexRDD(vertices, edges, default, mergeFun)
    val graph = Graph(mergedVertices, edges, 0L)

This is possible since `VertexRDD` is a subclass of `RDD[(VertexId, VD)]` (in this case `VD` is `Long`).
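Putting the answer together, here is a minimal end-to-end sketch. It assumes an existing `SparkContext` named `sc`; the vertex IDs and attribute values are made up for illustration. Note that `VertexRDD.apply` expects an `EdgeRDD`, not a plain `RDD[Edge[_]]`, so the edges are first wrapped with `EdgeRDD.fromEdges`:

```scala
import org.apache.spark.graphx.{Edge, EdgeRDD, Graph, VertexId, VertexRDD}
import org.apache.spark.rdd.RDD

// Vertex 1L appears twice with different attributes (10L and 3L).
val vertices: RDD[(VertexId, Long)] =
  sc.parallelize(Seq((1L, 10L), (1L, 3L), (2L, 7L)))
val edges: RDD[Edge[Long]] =
  sc.parallelize(Seq(Edge(1L, 2L, 0L)))

// Wrap the edges so they can be passed to VertexRDD.apply.
val edgeRDD = EdgeRDD.fromEdges[Long, Long](edges)

// Pre-merge duplicates, keeping the minimum attribute per vertex id.
val mergedVertices: VertexRDD[Long] =
  VertexRDD(vertices, edgeRDD, 0L, (a: Long, b: Long) => math.min(a, b))

// VertexRDD is itself an RDD[(VertexId, Long)], so the public
// Graph constructor accepts it directly; vertex 1L keeps attribute 3L.
val graph = Graph(mergedVertices, edges, 0L)
```

Since `mergedVertices` already contains one entry per vertex id, the default keep-first merge inside `Graph.apply` never sees a duplicate, so the result is deterministic regardless of partition order.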

