sql - Regexp in PySpark


I am trying to reproduce the results of a Django ORM query in PySpark:

social_filter = '(facebook|flipboard|linkedin|pinterest|reddit|twitter)'
collection.objects.filter(social__iregex=social_filter)

My main problem is that the match should be case-insensitive.

I have tried this:

social_filter = "social ilike 'facebook' or social ilike 'flipboard' or social ilike 'linkedin' or social ilike 'pinterest' or social ilike 'reddit' or social ilike 'twitter'"
df = sessions.filter(social_filter)

which results in the following error:

Py4JJavaError: An error occurred while calling o31.filter.
: java.lang.RuntimeException: [1.22] failure: end of input expected

social ilike 'facebook' or social ilike 'flipboard' or social ilike 'linkedin' or social ilike 'pinterest' or social ilike 'reddit' or social ilike 'twitter'

And the following expression:

social_filter = "social  ~* (facebook|flipboard|linkedin|pinterest|reddit|twitter)"
df = sessions.filter(social_filter)

crashes with this:

Py4JJavaError: An error occurred while calling o31.filter.
: java.lang.RuntimeException: [1.17] failure: identifier expected

social  ~* (facebook|flipboard|linkedin|pinterest|reddit|twitter)
                ^
    at scala.sys.package$.error(package.scala:27)
    at org.apache.spark.sql.catalyst.SqlParser.parseExpression(SqlParser.scala:45)
    at org.apache.spark.sql.DataFrame.filter(DataFrame.scala:652)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

Please help!

How about the following:

>>> from pyspark.sql import Row
>>> rdd = sc.parallelize([Row(name='bob', social='twitter'),
...                       Row(name='steve', social='facebook')])
>>> df = sqlContext.createDataFrame(rdd)
>>> # lower() normalizes the column so the comparison ignores case
>>> df.where("lower(social) LIKE 'twitter'").collect()
[Row(name=u'bob', social=u'twitter')]

You can use RLIKE with an alternation of the social networks you want if you need an actual regular expression. Otherwise, if the match is exact, you can do this:

>>> df.where("lower(social) in ('twitter', 'facebook')").collect()
[Row(name=u'bob', social=u'twitter'), Row(name=u'steve', social=u'facebook')]
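If you do need the regular-expression variant, one option is to build the filter string from the list of networks. This is only a minimal sketch: it assumes the column is called social as in the question, that your Spark version's SQL parser accepts RLIKE, and that the network names are plain alphanumeric strings. The names build_social_filter and filter_social are hypothetical helpers, not part of PySpark.

```python
SOCIAL_NETWORKS = ["facebook", "flipboard", "linkedin",
                   "pinterest", "reddit", "twitter"]


def build_social_filter(networks, column="social"):
    """Build a SQL expression string that matches any of the networks, ignoring case.

    Hypothetical helper: only plain alphanumeric names are accepted, so that
    no regex metacharacters or quotes end up inside the SQL string.
    """
    for name in networks:
        if not name.isalnum():
            raise ValueError("unsupported network name: %r" % (name,))
    # Lowercase the names because the column is lowercased before matching.
    alternation = "|".join(name.lower() for name in networks)
    return "lower({0}) rlike '({1})'".format(column, alternation)


def filter_social(df, networks=SOCIAL_NETWORKS):
    """Apply the expression to a DataFrame (assumed to have a 'social' column)."""
    return df.filter(build_social_filter(networks))


print(build_social_filter(SOCIAL_NETWORKS))
# lower(social) rlike '(facebook|flipboard|linkedin|pinterest|reddit|twitter)'
```

With the DataFrame from the question this would be called as df = filter_social(sessions). Like Django's iregex, RLIKE matches anywhere in the value, so a value such as 'm.Facebook.com' would also pass the filter.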
