I'm trying to match SHA1's in generic text with a regular expression.
Ideally I want to avoid matching words.
It's safe to say that full SHA1's have a d
What exactly are you trying to do? You shouldn't need to parse anything git outputs with heuristics -- you can always request exactly the data you need.
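As a rough sketch of what "requesting exactly the data you need" could look like, assuming the hashes come from a Git repository and that `git` is on your PATH: `git log --format=%H` prints one full commit hash per line and nothing else, so no pattern matching is needed. The helper names `commit_ids_command` and `list_commit_ids` are made up for this illustration.

```python
import subprocess

def commit_ids_command(rev_range="HEAD", max_count=5):
    """Build a git command that prints only full commit hashes, one per line."""
    return ["git", "log", f"--max-count={max_count}", "--format=%H", rev_range]

def list_commit_ids(repo_dir=".", rev_range="HEAD", max_count=5):
    """Run the command in repo_dir (assumed to be a Git work tree)."""
    result = subprocess.run(
        commit_ids_command(rev_range, max_count),
        cwd=repo_dir,
        check=True,           # raise if git reports an error
        capture_output=True,
        text=True,
    )
    # Each non-empty output line is one 40-character hash.
    return result.stdout.split()
```

Calling `list_commit_ids("/path/to/repo")` should then give you a list of hashes directly, with the path being whatever repository you are working in.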
If you want to match a full hex representation of an SHA1 sum, try:
/\b([a-f0-9]{40})\b/
That is, a word of exactly 40 characters, each of which is either a digit or one of the lowercase letters a through f.
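Here is a minimal sketch of that pattern in use, written in Python purely for illustration (the question does not say which regex engine is involved). The function name `find_full_sha1s` and the sample text are made up.

```python
import re

# Same pattern as above: exactly 40 lowercase hex characters
# delimited by word boundaries on both sides.
FULL_SHA1 = re.compile(r"\b([a-f0-9]{40})\b")

def find_full_sha1s(text):
    """Return every standalone 40-character lowercase hex string in text."""
    return FULL_SHA1.findall(text)

sample = (
    "fixed in da39a3ee5e6b4b0d3255bfef95601890afd80709, "
    "see also e78fd98 and the word 'deadbeef'"
)
print(find_full_sha1s(sample))
# ['da39a3ee5e6b4b0d3255bfef95601890afd80709']
```

The word boundaries keep the pattern from matching part of a longer hex run, and the short strings are ignored. Uppercase hex is not matched as written; if your input may contain it, passing `re.IGNORECASE` is one option.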
If you only have a few characters and don't know where they are, then you are pretty much out of luck. Is "e78fd98" an abbreviated commit ID? Maybe, but what about "1234567"? Is that a commit ID? A problem ticket number? A number that makes a test fail?
Without context, you can't really know what the data means.
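To make the ambiguity concrete, here is a small sketch of what happens if you loosen the pattern to accept abbreviated IDs. The 7 to 40 length range is an assumption chosen for illustration, and `find_hex_candidates` is a hypothetical helper.

```python
import re

# Loosened pattern: any standalone run of 7 to 40 lowercase hex characters.
# The lower bound of 7 is an assumption for this sketch, not a rule.
HEX_CANDIDATE = re.compile(r"\b[a-f0-9]{7,40}\b")

def find_hex_candidates(text):
    """Return every token that could be an abbreviated hash, going by shape alone."""
    return HEX_CANDIDATE.findall(text)

sample = "commit e78fd98 closes ticket 1234567; the facade was defaced"
print(find_hex_candidates(sample))
# ['e78fd98', '1234567', 'defaced']
```

The pattern cannot tell the commit ID from the ticket number or from an ordinary English word that happens to use only the letters a through f. Only something outside the text, such as the repository itself, can settle which is which.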
To answer your direct question, there is no property of SHA1 that would make the first three characters (in hex form) digits. You are just lucky, or perhaps unlucky, depending on how you look at it.
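You can check this yourself with a couple of well-known digests. The sketch below uses Python's standard `hashlib`; `leading_hex` is a made-up helper name. If you treat each hex character as uniformly random, which is an idealization, the chance that the first three are all digits is (10/16)^3, a little under one in four, so seeing it a few times in a row is not surprising.

```python
import hashlib

def leading_hex(data, n=3):
    """Return the first n hex characters of the SHA-1 digest of data."""
    return hashlib.sha1(data).hexdigest()[:n]

print(leading_hex(b""))     # da3
print(leading_hex(b"abc"))  # a99

# Idealized chance that three uniformly random hex characters are all digits.
p_three_digits = (10 / 16) ** 3
print(p_three_digits)
```

Both sample digests start with a letter, which is enough to show that nothing forces leading digits.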