My input consists of user-posted strings.
What I want to do is create a dictionary with words, and how often they’ve been used. This means I want to parse a string,
You should look into Natural Language Processing (NLP), not regular expressions, and if you are targeting more than one spoken language, you need to factor that in as well. Since you're using C#, check out the SharpNLP project.
Edit: This approach is only necessary if you care about the semantic content of the words you're trying to split up.