In general, what you're looking at is n-gram identification. Since this is a Python question, you might take a look at http://github.com/koblas/ngramj-python, which is a pure-Python port of the Java ngram library (another open source project).
The documentation is lacking, but the library itself has really good accuracy.
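
To give a feel for how n-gram identification works, here is a minimal sketch in plain Python. It is not ngramj-python's API (I'm not going to guess at that, given the state of the docs). It just ranks character n-grams by frequency and compares rankings with an "out-of-place" distance, in the style of Cavnar and Trenkle. I'm assuming the goal is guessing the language of a piece of text. The function names, the `SAMPLES` texts and the `top=300` cutoff are all made up for illustration, and real profiles would be built from far larger corpora.

```python
from collections import Counter


def ngram_profile(text, n_max=3, top=300):
    """Return the most frequent character n-grams (lengths 1..n_max), best first."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"  # mark word boundaries so prefixes/suffixes count
        for n in range(1, n_max + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    return [gram for gram, _ in counts.most_common(top)]


def out_of_place(doc_profile, ref_profile):
    """Sum of rank differences; n-grams missing from ref get the maximum penalty."""
    ref_rank = {gram: rank for rank, gram in enumerate(ref_profile)}
    max_penalty = len(ref_profile)
    return sum(
        abs(rank - ref_rank[gram]) if gram in ref_rank else max_penalty
        for rank, gram in enumerate(doc_profile)
    )


def identify(text, profiles):
    """Return the label whose reference profile is closest to the text's profile."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda label: out_of_place(doc, profiles[label]))


# Toy training data (hypothetical, far too small for real use).
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and the cat "
               "sat on the mat with this and that",
    "german": "der schnelle braune fuchs springt über den faulen hund "
              "und die katze sitzt auf der matte",
}
PROFILES = {label: ngram_profile(text) for label, text in SAMPLES.items()}

if __name__ == "__main__":
    print(identify("the dog and the cat", PROFILES))
    print(identify("der hund und die katze", PROFILES))
```

With training text this small it only works on inputs that look a lot like the samples. The accuracy of this kind of approach comes from having large, clean reference profiles per category.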