I have a Hive table with the following schema:
COOKIE  | PRODUCT_ID | CAT_ID     | QTY
1234123 | [1,2,3]    | [r,t,null] | [2,1,null]
How
If you are using Spark 2.4 or later from PySpark, use arrays_zip with posexplode:
from pyspark.sql.functions import arrays_zip, posexplode

df = (df
      .withColumn('zipped', arrays_zip('col1', 'col2'))
      .select('id', posexplode('zipped')))
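
To make it clearer what the two functions do to a single row, here is a rough plain-Python emulation using the sample row from the question. It does not use Spark at all: `zip_and_posexplode` is a hypothetical helper written only for illustration, with `zip_longest` standing in for arrays_zip (element-wise pairing) and `enumerate` standing in for posexplode (one output row per position).

```python
from itertools import zip_longest


def zip_and_posexplode(cookie, *arrays):
    """Emulate arrays_zip followed by posexplode for one row (illustration only)."""
    # Like arrays_zip: pair up the N-th elements of every array;
    # shorter arrays are padded with None.
    zipped = list(zip_longest(*arrays, fillvalue=None))
    # Like posexplode: emit one output row per element, with its position.
    return [(cookie, pos, *values) for pos, values in enumerate(zipped)]


# Sample row from the question: COOKIE, PRODUCT_ID, CAT_ID, QTY
rows = zip_and_posexplode(1234123, [1, 2, 3], ["r", "t", None], [2, 1, None])
for row in rows:
    print(row)
```

Each printed tuple is (cookie, position, product_id, cat_id, qty), so the null entries stay aligned with their product. In Spark itself, posexplode returns the position and the zipped struct as two columns named pos and col by default.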