I\'d like to analyze a continuous stream of data (accessed over HTTP) using a MapReduce approach, so I\'ve been looking into Apache Hadoop. Unfortunately, it appears that Hadoop
Your use case sounds similar to the issue of writing a web crawler using Hadoop - the data streams back (slowly) from sockets opened to fetch remote pages via HTTP.
If so, then see Why fetching web pages doesn't map well to map-reduce. And you might want to check out the FetcherBuffer class in Bixo, which implements a threaded approach in a reducer (via Cascading) to solve this type of problem.