I am currently doing some data-analysis tests, and in the first, really simple one, I got very strange results.
The idea is the following: from an internet access log (a col
I think your results are quite normal, and I will try to justify them:
1. MySQL uses a binary storage format that is optimized for processing, while MongoDB works with JSON, so parsing time is added on top of the processing itself. I would estimate that at a factor of 10x at least.
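To give a feel for that parsing overhead, here is a small Python sketch (the record layout and field names are invented for illustration) comparing `json.loads` against `struct.unpack` on an equivalent fixed-width binary record:

```python
import json
import struct
import timeit

# Hypothetical log record with three numeric fields, once as JSON text
# and once in a fixed binary layout (three unsigned 32-bit little-endian ints).
record_json = '{"ip": 167772161, "bytes": 5120, "status": 200}'
record_bin = struct.pack("<III", 167772161, 5120, 200)

def parse_json():
    d = json.loads(record_json)
    return d["ip"], d["bytes"], d["status"]

def parse_binary():
    return struct.unpack("<III", record_bin)

# Both parsers recover the same values from their respective encodings.
assert parse_json() == tuple(parse_binary()) == (167772161, 5120, 200)

n = 100_000
t_json = timeit.timeit(parse_json, number=n)
t_bin = timeit.timeit(parse_binary, number=n)
print(f"JSON: {t_json:.3f}s, binary: {t_bin:.3f}s, ratio: {t_json / t_bin:.1f}x")
```

The exact ratio depends on the machine and the record shape, but the binary decode consistently comes out several times faster, which is the effect described above.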
2. JavaScript is indeed much slower than C. I think a factor of at least 10 can be assumed.
Together we get about x100 - similar to what you see: 20 minutes x 100 is 2000 minutes, or about 33 hours.
3. Hadoop is also not efficient for data processing, but it is capable of using all the cores you have, and that makes a difference. Java also has a JIT that has been developed and optimized for more than 10 years.
4. I would suggest looking not at MySQL but at TPC-H benchmark Q1, which is pure aggregation. I think systems like VectorWise will show the maximum possible throughput per core.
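For reference, here is a scaled-down sketch of a TPC-H-Q1-style aggregation run through SQLite from Python. The column names follow the TPC-H `lineitem` schema, but the three rows of data are made up, and this uses a subset of Q1's output columns - it is only meant to show the shape of the query, not a real benchmark:

```python
import sqlite3

# In-memory SQLite database with a tiny synthetic lineitem table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE lineitem (
        l_returnflag TEXT, l_linestatus TEXT,
        l_quantity REAL, l_extendedprice REAL,
        l_discount REAL, l_tax REAL, l_shipdate TEXT
    )
""")
conn.executemany(
    "INSERT INTO lineitem VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("N", "O", 10, 1000.0, 0.05, 0.02, "1998-08-01"),
        ("N", "O", 20, 2000.0, 0.10, 0.04, "1998-07-15"),
        ("R", "F", 30, 3000.0, 0.00, 0.06, "1997-01-01"),
    ],
)

# Q1-style query: scan, filter by ship date, then group and aggregate.
rows = conn.execute("""
    SELECT l_returnflag, l_linestatus,
           SUM(l_quantity)                                       AS sum_qty,
           SUM(l_extendedprice)                                  AS sum_base_price,
           SUM(l_extendedprice * (1 - l_discount))               AS sum_disc_price,
           SUM(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
           AVG(l_quantity)                                       AS avg_qty,
           COUNT(*)                                              AS count_order
    FROM lineitem
    WHERE l_shipdate <= '1998-09-02'
    GROUP BY l_returnflag, l_linestatus
    ORDER BY l_returnflag, l_linestatus
""").fetchall()

for r in rows:
    print(r)
```

Because the whole query is a single scan plus grouped aggregation with no joins, it is a good yardstick for the raw per-core aggregation throughput of an engine.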