hadoop - Does Spark not support ArrayList when writing to Elasticsearch?
I have the following structure:
mylist = [{"key1":"val1"}, {"key2":"val2"}] myrdd = value_counts.map(lambda item: ('key', { 'field': somelist }))
I get this error:

15/02/10 15:54:08 INFO scheduler.TaskSetManager: Lost task 1.0 in stage 2.0 (TID 6) on executor ip-10-80-15-145.ec2.internal: org.apache.spark.SparkException (Data of type java.util.ArrayList cannot be used) [duplicate 1]
The code that writes the RDD is:

rdd.saveAsNewAPIHadoopFile(
    path='-',
    outputFormatClass="org.elasticsearch.hadoop.mr.EsOutputFormat",
    keyClass="org.apache.hadoop.io.NullWritable",
    valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
    conf={
        "es.nodes": "localhost",
        "es.port": "9200",
        "es.resource": "mboyd/mboydtype"
    })
What I want the document to end up as when written to ES is:
{ field:[{"key1":"val1"}, {"key2":"val2"}] }
I just had this problem, and the solution is to convert the lists to tuples before writing. Converting the document to JSON works the same way.
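A minimal sketch of that workaround, reusing the myrdd and conf from the question (the helper name lists_to_tuples and the variable safe_rdd are my own, not from the original post): every list inside the value is replaced by a tuple, so the connector never sees a java.util.ArrayList.

def lists_to_tuples(obj):
    # Recursively turn lists into tuples and descend into dicts,
    # leaving all other values unchanged.
    if isinstance(obj, list):
        return tuple(lists_to_tuples(x) for x in obj)
    if isinstance(obj, dict):
        return {k: lists_to_tuples(v) for k, v in obj.items()}
    return obj

# Apply the conversion to the value of each (key, value) pair,
# then write exactly as in the question.
safe_rdd = myrdd.mapValues(lists_to_tuples)
safe_rdd.saveAsNewAPIHadoopFile(
    path='-',
    outputFormatClass="org.elasticsearch.hadoop.mr.EsOutputFormat",
    keyClass="org.apache.hadoop.io.NullWritable",
    valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
    conf={
        "es.nodes": "localhost",
        "es.port": "9200",
        "es.resource": "mboyd/mboydtype"
    })

The JSON route mentioned above would instead serialize each document with json.dumps and set "es.input.json": "true" in the conf, which bypasses the Writable conversion entirely.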