hadoop - Does Spark not support ArrayList when writing to Elasticsearch?


I have the following structure:

mylist = [{"key1": "val1"}, {"key2": "val2"}]
myrdd = value_counts.map(lambda item: ('key', {'field': mylist}))

I get the following error:

15/02/10 15:54:08 INFO scheduler.TaskSetManager: Lost task 1.0 in stage 2.0 (TID 6) on executor ip-10-80-15-145.ec2.internal: org.apache.spark.SparkException (Data of type java.util.ArrayList cannot be used) [duplicate 1]

The write call is:

myrdd.saveAsNewAPIHadoopFile(
    path='-',
    outputFormatClass="org.elasticsearch.hadoop.mr.EsOutputFormat",
    keyClass="org.apache.hadoop.io.NullWritable",
    valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
    conf={
        "es.nodes": "localhost",
        "es.port": "9200",
        "es.resource": "mboyd/mboydtype"
    })

What I want the document to end up as when it is written to ES is:

{ "field": [{"key1":"val1"}, {"key2":"val2"}] }

I just had this problem, and the solution is to convert the lists to tuples before writing. Converting the whole document to JSON achieves the same thing.
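A minimal sketch of both workarounds, assuming value_counts is the RDD from the question. es.input.json is a standard elasticsearch-hadoop setting that tells the connector the values are already serialized JSON:

import json

mylist = [{"key1": "val1"}, {"key2": "val2"}]

# Workaround 1: tuples survive the Python-to-Writable conversion where
# lists fail with "Data of type java.util.ArrayList cannot be used".
tuple_rdd = value_counts.map(lambda item: ('key', {'field': tuple(mylist)}))

# Workaround 2: serialize each document to a JSON string and set
# es.input.json so the connector indexes the string as-is.
json_rdd = value_counts.map(lambda item: ('key', json.dumps({'field': mylist})))
json_rdd.saveAsNewAPIHadoopFile(
    path='-',
    outputFormatClass="org.elasticsearch.hadoop.mr.EsOutputFormat",
    keyClass="org.apache.hadoop.io.NullWritable",
    valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
    conf={
        "es.nodes": "localhost",
        "es.port": "9200",
        "es.resource": "mboyd/mboydtype",
        "es.input.json": "true"
    })

With the tuple version, the original saveAsNewAPIHadoopFile call from the question can be reused unchanged.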

