I'm trying to match SHA1's in generic text with a regular expression.
Ideally I want to avoid matching words.
It's safe to say that full SHA1's have a d
I'm going to assume you want to match against the printed hexadecimal representation of a SHA1, and not against the equivalent 20 raw bytes. Furthermore, I'm going to assume that the SHA1's in question use only lower-case letters to represent hex digits. You'll have to adjust the regular expression if your requirements differ (for example, add A-F to the character class to accept upper-case digits).
grep -o -E -e "[0-9a-f]{40}"
This will match such a SHA1. You'll need to translate the above regular expression from egrep's dialect to whatever tool you happen to be using. Since a match must be 40 characters long, I don't think you're in danger of accidentally matching words: I don't know of any 40-character words that consist only of the letters a through f.
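As a rough illustration of that translation, here is a minimal sketch of the same pattern in Python's standard `re` module. The name `find_sha1s` and the sample text are made up for the example, and it keeps the lower-case-only assumption from above.

```python
import re

# Same pattern as the grep command: a run of 40 lower-case hex digits.
SHA1_PATTERN = re.compile(r"[0-9a-f]{40}")

def find_sha1s(text):
    """Return every 40-character run of lower-case hex digits in text.

    find_sha1s is a hypothetical helper name used only for this sketch.
    """
    return SHA1_PATTERN.findall(text)

# Ordinary words made of a-f ("deadbeef", "cafe") are too short to match.
sample = "commit da39a3ee5e6b4b0d3255bfef95601890afd80709 fixed the deadbeef cafe"
print(find_sha1s(sample))
```

Like `grep -o`, `findall` returns only the matched text rather than the whole line.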
Edit:
Better yet: use the answer "A Regex to match a SHA1", as his solution includes checking for word boundaries at both ends. I overlooked that above.
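To see why the word boundaries matter, here is a small Python sketch comparing the two patterns. It assumes the boundary check is written as `\b` on each side, which may differ in detail from the linked answer; the helper names and sample strings are invented for the example.

```python
import re

# Without boundaries: any 40 hex digits, even inside a longer hex string.
UNBOUNDED_SHA1 = re.compile(r"[0-9a-f]{40}")
# With \b on both sides: the 40 hex digits must stand alone as a "word".
BOUNDED_SHA1 = re.compile(r"\b[0-9a-f]{40}\b")

def find_unbounded(text):
    """Hypothetical helper: matches using the pattern without boundaries."""
    return UNBOUNDED_SHA1.findall(text)

def find_bounded(text):
    """Hypothetical helper: matches using the word-boundary pattern."""
    return BOUNDED_SHA1.findall(text)

sha1 = "da39a3ee5e6b4b0d3255bfef95601890afd80709"
# A 64-digit hex string (the length of a SHA-256 digest), so not a SHA1.
longer_hex = sha1 + "0123456789abcdef01234567"

print(find_unbounded(longer_hex))  # picks up the first 40 digits anyway
print(find_bounded(longer_hex))    # no standalone 40-digit run, so no match
print(find_bounded(f"see {sha1} for details"))
```

With GNU grep, the `-w` option should give a similar whole-word effect without changing the pattern itself.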