MapReduce with MongoDB really, really slow (30 hours vs 20 minutes in MySQL for an equivalent database)

悲哀的现实 · 2021-01-02 11:21

I am now doing some data analysis tests, and in the first, really simple one, I got very strange results.

The idea is the following: from an internet access log (a col

3 Answers
慢半拍i · 2021-01-02 12:13

I think your results are quite normal, and I will try to justify them:

1. MySQL uses a binary format that is optimized for processing, while MongoDB works with JSON documents, so parsing time is added on top of the processing itself. I would estimate that at a factor of 10x at least.
2. JS is indeed much slower than C; a factor of at least 10 can be assumed. Together we get about x100, which is similar to what you see: 20 minutes x 100 is 2,000 minutes, or about 33 hours. (A mongo-shell sketch of this JavaScript overhead follows this list.)
3. Hadoop is also not efficient at data processing, but it can use all the cores you have, and that makes a difference. Java also has a JIT that has been developed and optimized for more than 10 years.
4. I would suggest looking not at MySQL but at TPC-H benchmark query Q1, which is pure aggregation. I think systems like VectorWise will show the maximum possible throughput per core.
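To make the JavaScript overhead in points 1 and 2 concrete, here is a minimal mongo-shell sketch. Since the original question is truncated, the collection name (`access_log`) and field name (`host`) are hypothetical; the shape of the two queries is what matters.

```js
// Hypothetical schema: collection "access_log", one document per request,
// with a "host" field identifying the client.

// mapReduce version: the map and reduce functions are user-supplied
// JavaScript, invoked once per document on a single thread, which is
// where the per-document interpretation cost comes from.
db.access_log.mapReduce(
  function () { emit(this.host, 1); },                    // map: one (host, 1) pair per log entry
  function (key, values) { return Array.sum(values); },   // reduce: sum the partial counts
  { out: { inline: 1 } }
);

// Equivalent aggregation pipeline: executed by MongoDB's native C++
// engine, so it avoids the per-document JavaScript overhead entirely.
db.access_log.aggregate([
  { $group: { _id: "$host", hits: { $sum: 1 } } }
]);
```

On a counting workload like this, the aggregation pipeline is usually the faster of the two for exactly the reasons listed above, and MongoDB has since deprecated mapReduce in favor of the aggregation pipeline.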
