ARRAY_CONTAINS multiple values in Hive

Anonymous (unverified), submitted on 2019-12-03 10:24:21

Question:

Is there a convenient way to use the ARRAY_CONTAINS function in Hive to search for multiple entries in an array column rather than just one? So rather than:

WHERE ARRAY_CONTAINS(array, val1) OR ARRAY_CONTAINS(array, val2) 

I would like to write:

WHERE ARRAY_CONTAINS(array, val1, val2) 

The full problem is that I need to read val1 and val2 dynamically from the command-line arguments when I run the script, and I generally don't know in advance how many values there will be. So you can think of vals as a comma-separated list (or array) containing the values val1, val2, ..., and I want to write:

WHERE ARRAY_CONTAINS(array, vals) 

Thanks in advance!

Answer 1:

There is a UDF here that will let you take the intersection of two arrays. Assuming your values have the structure

values_array = [val1, val2, ..., valn] 

You could then do

where array_intersection(array, values_array)[0] is not null 

If the two arrays have no elements in common, [] is returned, and therefore [][0] is null.
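
If adding a UDF is not an option, a similar check can be written with built-in functions only. The following is a minimal sketch, not the method from the answer above. It assumes a hypothetical table my_table with a key column id and an array<string> column arr, and it assumes the values are passed on the command line as a comma-separated string in a variable named vals (also a hypothetical name).

```sql
-- Hypothetical names: my_table, id, arr (array<string>), vals.
-- Assumed invocation: hive --hivevar vals=val1,val2,val3 -f script.hql

SELECT DISTINCT t.id
FROM my_table t
-- Produce one row per array element so each element can be tested on its own.
LATERAL VIEW explode(t.arr) e AS elem
-- split() turns the comma-separated variable into an array<string>;
-- a row is kept when the current element appears in that array.
WHERE array_contains(split('${hivevar:vals}', ','), e.elem);
```

DISTINCT is there because a row whose array matches several of the values would otherwise appear once per matching element. The values must not have spaces around the commas, since split() keeps them as part of each string.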


