I\'m trying to search for the word Gadaffi. What\'s the best regular expression to search for this?
My best attempt so far is:
\\b[KG]h?add?af?fi$\\
I know this is an old question, but...
Neither of these two regexes is the prettiest, but they are optimized and both match ALL the variations in the original post.
"Little Beauty" #1
(?:G(?:a(?:d(?:d(?:af[iy]|hafi)|af(?:f?i|y)|hafi)|thafi)|h(?:ad(?:daf[iy]|af?fi)|eddafi))|K(?:a(?:d(?:['dh]a|af?)|zza)fi|had(?:af?fy|dafi))|Q(?:a(?:d(?:(?:(?:hd)?|t)h|d)?|th)|u(?:at|d)h)afi)
"Little Beauty" #2
(?:(?:Gh|[GK])adaff|(?:(?:Gh|[GKQ])ad|(?:Ghe|(?:[GK]h|[GKQ])a)dd|(?:Gadd|(?:[GKQ]a|Q(?:adh|u))d|(?:Qad|(?:Qu|[GQ])a)t)h|Ka(?:zz|d'))af)i|(?:Khadaff|(?:(?:Kh|G)ad|Gh?add)af)y
Rest in Peace, Muammar.
What else starts with Q, G, or K, has a d, z or t in the middle, and ends in "fi" the people actually search for?
/\b[GQK].+[dzt].+fi\b/i
Done.
>>> print re.search(a, "Gadasadasfiasdas") != None
False
>>> print re.search(a, "Gadasadasfi") != None
True
>>> print re.search(a, "Qa'dafi") != None
True
Interesting that I'm getting downvoted. Can someone leave some false positives in the comments?
If you've got a concrete listing of all 30 possibilities, just concatenate them all together with a bunch of "ors". Then you can be sure that it only matches the exact things you've listed, and no more. Your RE engine will probably be able to optimize in further, and, well, with 30 choices even if it doesn't it's still not a big deal. Trying to fiddle around with manually turning it into a "clever" RE can't possibly turn out better and may turn out worse.
(G|Gh|K|Kh|Q|Qh|Q|Qu)(a|au|e|u)(dh|zz|th|d|dd)(dh|th|a|ha|)(\x27|)(a|)(ff|f)(i|y)
Certainly not the most optimized version, split on syllables to maximize matches while trying to make sure we don't get false positives.
A possible alternative is the online tool for generate regular expressions from examples http://regex.inginf.units.it. Give it a chance!
Just an addendum: you should add "Gheddafi" as alternate spelling. So the RE should be
\b[KG]h?[ae]dd?af?fi$\b