I have to write a not-so-large program in C++, using boost::thread.
The problem at hand is to process a large (maybe thousands or tens of thousands. Hundreds and millon
If the workload is anywhere near as I/O bound as it sounds, you will probably get maximum throughput with about as many threads as you have spindles. If you have more than one disk but all the data is on the same RAID 0 array, you probably don't want more than one thread. When several threads access non-sequential parts of the disk, the OS must stop reading one file, even though the data may be right under the head, and seek to another part of the disk to service another thread so that thread doesn't starve. With only one thread, the disk never needs to stop reading to move the head.
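To make the "one thread per spindle" idea concrete, here is a rough sketch. It uses std::thread so it compiles on its own; boost::thread has a near-identical interface. The grouping of files by physical disk (files_by_spindle) and both function names are my own assumptions for illustration. Nothing here detects the disk layout for you.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <thread>
#include <vector>

// Reads one file front to back and returns the number of bytes read.
// Returns 0 if the file cannot be opened.
std::size_t read_whole_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::vector<char> buf(1 << 16);
    std::size_t total = 0;
    while (in) {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        total += static_cast<std::size_t>(in.gcount());
    }
    return total;
}

// files_by_spindle[i] holds the paths that live on physical disk i.
// This grouping is hypothetical: the caller has to know the layout.
// Each disk gets exactly one reader thread, which reads its files
// one after another so the head is never pulled away by a sibling.
std::size_t read_one_thread_per_spindle(
        const std::vector<std::vector<std::string>>& files_by_spindle) {
    std::vector<std::size_t> bytes(files_by_spindle.size(), 0);
    std::vector<std::thread> readers;
    for (std::size_t i = 0; i < files_by_spindle.size(); ++i) {
        readers.emplace_back([&, i] {
            for (const std::string& path : files_by_spindle[i])
                bytes[i] += read_whole_file(path);  // each thread owns bytes[i]
        });
    }
    assert(readers.size() == files_by_spindle.size());  // one reader per disk
    for (std::thread& t : readers) t.join();

    std::size_t total = 0;
    for (std::size_t b : bytes) total += b;
    return total;
}
```

For the single RAID 0 case you would pass one group containing every file, which degenerates to a single reader thread.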
Obviously that depends on the access patterns being very linear (such as with video recoding) and, to a large degree, on the data actually being unfragmented on disk. If the workload is more CPU bound, the thread count won't matter quite as much and you can use more threads, since the disk will be twiddling its thumbs anyway.
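For the CPU-bound case, one common arrangement (not necessarily what fits your program) is a single producer that reads sequentially and hands chunks to a pool of worker threads. The sketch below is only that, a sketch: ChunkQueue, process_with_workers and default_worker_count are names I made up, and the in-memory chunks stand in for data read sequentially from disk. Again std::thread is used so the example is self-contained; the boost::thread equivalents look much the same.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Minimal blocking queue: one producer (the reader), many consumers.
class ChunkQueue {
public:
    void push(std::string chunk) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(chunk));
        }
        cv_.notify_one();
    }
    // Signals that no more chunks will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        cv_.notify_all();
    }
    // Blocks until a chunk is available; returns false once the
    // queue has been closed and fully drained.
    bool pop(std::string& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::string> q_;
    bool closed_ = false;
};

// Runs `work` on every chunk using `workers` threads while the calling
// thread acts as the single producer. Returns the sum of the results.
std::size_t process_with_workers(
        const std::vector<std::string>& chunks, unsigned workers,
        const std::function<std::size_t(const std::string&)>& work) {
    assert(workers >= 1);
    ChunkQueue queue;
    std::vector<std::size_t> partial(workers, 0);
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&, w] {
            std::string chunk;
            while (queue.pop(chunk)) partial[w] += work(chunk);
        });
    }
    for (const std::string& c : chunks) queue.push(c);  // stand-in for disk reads
    queue.close();
    for (std::thread& t : pool) t.join();

    std::size_t total = 0;
    for (std::size_t p : partial) total += p;
    return total;
}

// hardware_concurrency() may return 0 when the value is unknown.
unsigned default_worker_count() {
    unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1 : n;
}
```

Note that the queue is unbounded, so a fast reader can run far ahead of slow workers. A real program would probably cap its size.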
As other posters suggest, profile first!
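If you don't have a profiler handy, a crude first measurement is to time the read phase and the processing phase separately for a representative file. The sketch below assumes the whole file fits in memory, and PhaseTimes, seconds_taken and time_phases are names I invented for it. Keep in mind that the OS file cache can make a second run over the same file look far less I/O bound than a cold run.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <functional>
#include <iterator>
#include <string>
#include <vector>

struct PhaseTimes {
    double io_seconds = 0.0;   // wall-clock time spent reading the file
    double cpu_seconds = 0.0;  // wall-clock time spent in the callback
    std::size_t bytes = 0;     // number of bytes read
};

// Runs f() and returns the elapsed wall-clock time in seconds.
template <class F>
double seconds_taken(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Reads `path` fully into memory, then runs `process` on the data,
// timing each phase separately.
PhaseTimes time_phases(
        const std::string& path,
        const std::function<void(const std::vector<char>&)>& process) {
    assert(static_cast<bool>(process));  // a callback must be supplied
    PhaseTimes t;
    std::vector<char> data;
    t.io_seconds = seconds_taken([&] {
        std::ifstream in(path, std::ios::binary);
        data.assign(std::istreambuf_iterator<char>(in),
                    std::istreambuf_iterator<char>());
    });
    t.cpu_seconds = seconds_taken([&] { process(data); });
    t.bytes = data.size();
    return t;
}
```

If io_seconds dominates on a cold cache, extra threads are unlikely to help. If cpu_seconds dominates, more workers probably will.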