I'm not sure much of your class's curriculum will be useful for either problem 1 or problem 2. Some of the better techniques for these kinds of problems do only simple linguistic processing (part-of-speech tagging, removing stop words, and looking at bigrams and trigrams) and pair it with a machine-learning text classification component that isn't very sophisticated on its own. Standard techniques such as Naive Bayes classifiers, Maximum Entropy classifiers, and Support Vector Machines can be treated as black boxes algorithm-wise, and they perform well. Have a look at these survey papers about topical text classification and authorship detection to get an idea of where to start.
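To give a feel for how little linguistic machinery that pipeline needs, here is a minimal sketch in Python of stop-word removal, unigram and bigram features, and a multinomial Naive Bayes classifier with add-one smoothing. The stop-word list, the `features` helper, and the `NaiveBayes` class are all made up for illustration; this is not taken from the survey papers or from any particular library.

```python
import math
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "was", "on"}


def features(text):
    """Lowercase, drop stop words, and return unigrams plus bigrams."""
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    bigrams = [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()               # label -> number of documents
        self.feat_counts = defaultdict(Counter)   # label -> feature -> count
        self.vocab = set()                        # all features seen in training

    def train(self, text, label):
        self.doc_counts[label] += 1
        for f in features(text):
            self.feat_counts[label][f] += 1
            self.vocab.add(f)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.feat_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Work in log space: log prior plus smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for f in features(text):
                score += math.log((counts[f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

You would call `train(text, label)` once per labeled document and then `classify(text)` on unseen text. Swapping in a Maximum Entropy classifier or an SVM changes only the classifier class; the feature extraction stays the same, which is why the classifier can be treated as a black box.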
Something better suited to the curriculum you've described might be to construct a morphological analyzer for a foreign language that you're familiar with, or to construct a stemmer (a poor man's version of a morphological analyzer) that maps morphologically related terms to the same entry in an index, which is something search engines can use.
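As a rough illustration of the stemmer idea, here is a sketch in Python of a naive suffix stripper feeding an inverted index. The suffix list, the `MIN_STEM` threshold, and the function names are assumptions chosen for the example; a real stemmer such as Porter's has many more rules and conditions.

```python
from collections import defaultdict

# Hypothetical English suffix list, longest first so "ers" is tried before "s".
SUFFIXES = ["ations", "ation", "ings", "ing", "ers", "er", "ed", "es", "s"]
MIN_STEM = 3  # assumed threshold: never strip a word down to fewer than 3 letters


def stem(word):
    """Strip the first matching suffix, if enough of the word remains."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Map each stem to the set of document ids containing a word with that stem."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[stem(token)].add(doc_id)
    return index
```

At query time you would run the query terms through the same `stem` function, so that a search for "indexes" finds documents containing "indexing" or "indexed". The sketch also shows why this is only a poor man's morphological analyzer: it maps "parsing" to "pars" but leaves "parse" unchanged, so the two never meet in the index.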
If you don't need to come up with a new technique for your class (i.e., if you're an undergrad), then there are many standard NLP tasks that you could implement in OCaml: for example, a parser trained on the Penn Treebank, a parser for some other grammar formalism, a part-of-speech tagger, or any of dozens of other applications.