Hadoop comparison to RDBMS

囚心锁ツ 2020-12-16 07:59

I really do not understand the actual reason behind Hadoop scaling better than an RDBMS. Can anyone please explain at a granular level? Has this got something to do with unde

4 answers
  •  猫巷女王i
    2020-12-16 08:16

    The one issue I have with the description above is the claim that parallel RDBMSs require expensive hardware. Teradata and Netezza need special hardware, but Greenplum and Vertica can run on commodity hardware. (I will admit I am biased, like everyone else.) I have seen Greenplum scan petabytes of information daily. (Walmart was up to 2.5 petabytes, last I heard.) I have dealt with both HAWQ and Impala. They both require about 30% more hardware to do the same job on structured data. HBase is less efficient.

    There is no silver bullet. It has been my experience that both structured and unstructured storage have their place. Hadoop is great for ingesting large amounts of data and scanning through it a small number of times. We use it as part of our load procedures. An RDBMS is great at scanning the same data over and over with highly complex queries.

    You always have to structure the data to make use of it. That structuring takes time somewhere. You either structure it before you put it into an RDBMS, or you structure it at query time.
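
To make that trade-off concrete, here is a minimal sketch (not taken from any of the products named above) that answers the same question both ways. The sample lines, the `sales` table, and the function names are all made up for illustration, and SQLite stands in for the RDBMS only because it ships with Python.

```python
import sqlite3

# Hypothetical raw input: one comma-separated sale per line.
RAW_LINES = [
    "2020-12-01,store_1,19.99",
    "2020-12-01,store_2,5.00",
    "2020-12-02,store_1,7.50",
]


def parse(line):
    """Turn one raw text line into a typed (day, store, amount) record."""
    day, store, amount = line.split(",")
    return day, store, float(amount)


def total_structured_at_load(lines, store):
    """RDBMS style: parse once while loading, then query the typed table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (day TEXT, store TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)", [parse(line) for line in lines]
    )
    (total,) = conn.execute(
        "SELECT SUM(amount) FROM sales WHERE store = ?", (store,)
    ).fetchone()
    conn.close()
    return total


def total_structured_at_query(lines, store):
    """Hadoop style: keep the raw text and parse it again on every query."""
    return sum(amount for _, s, amount in map(parse, lines) if s == store)
```

Both functions return the same total for a given store. The difference is where the parsing cost lands: once at load time in the first case, and on every scan in the second, which is why the second suits data that is only read a few times.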
