Does anyone out there know how to improve this function? I'm not worried about shortening the code (I'm sure this could be done with a better regex); I am more concerned about cor
In Hive, SSN or ITIN validation can be written like this:
select case when '078051120' rlike '^(?!1{9}|2{9}|3{9}|4{9}|5{9}|6{9}|7{9}|8{9}|9{9}|219099999|078051120|123456789)(?!666|000|9[0-9]{2})[0-9]{3}(?!00)[0-9]{2}(?!0{4})[0-9]{4}$' then 'SSN'
when '078051120' rlike '^9[0-9]{2}(7[0-9]|80|81|82|83|84|85|86|87|88)[0-9]{4}$' then 'ITIN'
else 'INVLD'
end as ssn_flg;
Replace the dummy value '078051120' (in both places) with the column name from your table.
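
If you want to sanity-check the two patterns outside Hive before running them against a table, here is a minimal Python sketch. It assumes Python's `re` module handles the lookaheads, quantifiers and character classes used here the same way Hive's Java regex engine does; the `classify` function name and the sample values are made up for illustration and are not from the query above.

```python
import re

# Patterns copied unchanged from the Hive query.
SSN_RE = re.compile(
    r"^(?!1{9}|2{9}|3{9}|4{9}|5{9}|6{9}|7{9}|8{9}|9{9}|219099999|078051120|123456789)"
    r"(?!666|000|9[0-9]{2})[0-9]{3}(?!00)[0-9]{2}(?!0{4})[0-9]{4}$"
)
ITIN_RE = re.compile(r"^9[0-9]{2}(7[0-9]|80|81|82|83|84|85|86|87|88)[0-9]{4}$")


def classify(value):
    """Hypothetical helper mirroring the CASE expression: SSN, ITIN or INVLD."""
    if SSN_RE.match(value):
        return "SSN"
    if ITIN_RE.match(value):
        return "ITIN"
    return "INVLD"


if __name__ == "__main__":
    # Made-up sample values covering each branch and a few rejected shapes.
    samples = [
        "078051120",  # explicitly blacklisted in the first lookahead
        "123456780",  # ordinary 9-digit value
        "912701234",  # starts with 9, middle group 70
        "912691234",  # starts with 9, middle group 69 (outside 70-88)
        "123006789",  # middle group 00
        "123450000",  # last four digits 0000
    ]
    for s in samples:
        print(s, classify(s))
```

Note that the dummy value '078051120' is itself on the blacklist in the first pattern, so the query as posted returns 'INVLD' until you swap in a real column.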