How to use ConvertToWritableTypes in Scala Spark?
I'm looking at BasicSaveSequenceFile and tried to follow it in Scala.

So I had:

val input = Seq(("coffee", 1), ("coffee", 2), ("pandas", 3))
val inputRDD = sc.parallelize(input) // there is no parallelizePairs in Scala
But when I try:

val outputRDD = inputRDD.map(new ConvertToWritableTypes()) // there is no mapToPair; how do I write this instead?

How do I use ConvertToWritableTypes in Scala Spark?
Right now I get:

Error:(29, 38) type mismatch;
 found   : SparkExampleWriteSeqLZO.ConvertToWritableTypes
 required: ((String, Int)) => ?
    val outputRDD = inputRDD.map(new ConvertToWritableTypes())
                                 ^
So you're looking at the Java version, but you should be looking at the Scala version; the APIs are different. For the example you give, you don't need mapToPair and can use a normal map without a static class:
import org.apache.hadoop.io.IntWritable
import org.apache.hadoop.io.Text

val input = Seq(("coffee", 1), ("coffee", 2), ("pandas", 3))
val inputRDD = sc.parallelize(input)
val outputRDD = inputRDD.map(record => (new Text(record._1), new IntWritable(record._2)))
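To then write that RDD of Writables out as a sequence file, the usual route is saveAsHadoopFile with SequenceFileOutputFormat. A minimal sketch, assuming outputFile is a path string (and if you specifically need LZO compression, as the class name SparkExampleWriteSeqLZO suggests, there is an overload that additionally takes a codec class):

import org.apache.hadoop.io.{IntWritable, Text}
import org.apache.hadoop.mapred.SequenceFileOutputFormat

// Write the (Text, IntWritable) pairs as a Hadoop sequence file.
// outputFile is an assumed path, e.g. "hdfs:///tmp/coffee-seq".
outputRDD.saveAsHadoopFile(
  outputFile,
  classOf[Text],
  classOf[IntWritable],
  classOf[SequenceFileOutputFormat[Text, IntWritable]]
)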
You don't even need to do that, though; the linked Scala example shows you:

val data = sc.parallelize(List(("Holden", 3), ("Kay", 6), ("Snail", 2)))
data.saveAsSequenceFile(outputFile)
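As a quick sanity check, a sketch under the same assumption about outputFile: you can read the file back with sc.sequenceFile. Note the map to plain Scala types before collecting, since Hadoop reuses Writable instances when reading:

import org.apache.hadoop.io.{IntWritable, Text}

// Load the sequence file back as (Text, IntWritable) pairs and convert to
// plain Scala types immediately, because Hadoop reuses Writable objects.
val loaded = sc.sequenceFile(outputFile, classOf[Text], classOf[IntWritable])
  .map { case (k, v) => (k.toString, v.get) }
loaded.collect().foreach(println)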