I know that there are similar questions to mine that have been answered, but after reading through them I still don't have the solution I'm looking for.
Using Pyth
You have this regular expression:
pattern = "(January|February|March|April|May|June|July|August|September|October|November|December)[,][ ](0[1-9]|[12][0-9]|3[01])[,][ ]((19|20)[0-9][0-9])"
One feature of regular expressions is the "character class". Characters in square brackets form a character class, which matches any single one of the listed characters. Thus [,] matches exactly one character, a comma, and [ ] matches exactly one space. You might as well just write the comma and the space directly.
Perhaps you wanted to make the comma optional? You can do that by putting a question mark after it: ,?
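A quick way to convince yourself of both points is to try them. This is only a small sketch; the "May, 05" strings are made-up test inputs, not part of your data:

```python
import re

# [,] and a bare comma match exactly the same text.
same_a = re.fullmatch(r"May[,] 05", "May, 05") is not None
same_b = re.fullmatch(r"May, 05", "May, 05") is not None

# ,? makes the comma optional, so both spellings match.
with_comma = re.fullmatch(r"May,? 05", "May, 05") is not None
without_comma = re.fullmatch(r"May,? 05", "May 05") is not None

print(same_a, same_b, with_comma, without_comma)
```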
Anything you put into parentheses makes a capturing group (a "match group"). I think the mysterious extra "19" came from a group you didn't mean to capture, most likely (19|20). You can make a non-capturing group using this syntax: (?:...)
So, for example:
r'(?:red|blue) socks'
This matches "red socks" or "blue socks" but does not create a capturing group. If you then wrap it in plain parentheses:
r'((?:red|blue) socks)'
That creates one capturing group, whose value would be "red socks" or "blue socks".
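To see the difference between the two socks patterns, compare what groups() returns for each. The sample sentence here is just an illustration:

```python
import re

sentence = "I like blue socks"

# Only a non-capturing group: the match succeeds, but nothing is captured.
m1 = re.search(r'(?:red|blue) socks', sentence)
print(m1.group(0))  # the whole match
print(m1.groups())  # tuple of captured groups (empty here)

# Plain parentheses around the whole thing: exactly one captured group.
m2 = re.search(r'((?:red|blue) socks)', sentence)
print(m2.group(1))
print(m2.groups())
```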
I think if you apply these comments to your regular expression, it will work. It is already mostly correct.
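Here is roughly what I mean, as a sketch rather than a definitive version. I kept the commas mandatory, as in your pattern, and only changed (19|20) to a non-capturing group; the names original_pattern, fixed_pattern, and text are mine, and the sample text is invented:

```python
import re

months = (
    "January|February|March|April|May|June|July|August|"
    "September|October|November|December"
)

# Your pattern as posted: the inner (19|20) is a fourth capturing group.
original_pattern = (
    "(" + months + ")[,][ ](0[1-9]|[12][0-9]|3[01])[,][ ]((19|20)[0-9][0-9])"
)

# Same pattern with plain ", " and a non-capturing (?:19|20).
fixed_pattern = (
    "(" + months + "), (0[1-9]|[12][0-9]|3[01]), ((?:19|20)[0-9][0-9])"
)

text = "Born on January, 05, 1999 in Ohio"

before = re.findall(original_pattern, text)
after = re.findall(fixed_pattern, text)
print(before)  # includes the stray century group
print(after)   # month, day, year only
```

With re.findall, every capturing group adds an element to each result tuple, which is why the extra group shows up in the output.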
As for validating the day against the month, that is beyond what a regular expression can reasonably do. Your pattern will happily match "February, 31", and there is no easy way to fix that within the regex itself.
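One way around this, if it suits your case, is to let the regex find candidate strings and then check each one with the standard datetime module. This is a sketch under some assumptions: is_real_date is a name I made up, the format string "%B, %d, %Y" assumes the "Month, DD, YYYY" layout your pattern matches, and %B relies on the locale using English month names (the default "C" locale does):

```python
from datetime import datetime

def is_real_date(text):
    """Return True if text like 'February, 28, 2001' is a real calendar date."""
    try:
        # strptime raises ValueError for impossible dates such as February 31.
        datetime.strptime(text, "%B, %d, %Y")
    except ValueError:
        return False
    return True

print(is_real_date("February, 28, 2001"))
print(is_real_date("February, 31, 2001"))
```

This also handles leap years for you, which would be painful to express in a pattern.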