问题
I know what is Data Warehouse & what is Big Data. But I am confused with Data Warehouse Vs Big Data. Both are same with different names or both are different(Conceptually & Physically).
回答1:
I know that this is an older thread but there have been some developments in the last year or so. Comparing the data warehouse to Hadoop is like comparing apples to oranges. The data warehouse is a concept: clean, integrated data of high quality. I don't think the need for a data warehouse will go away anytime soon. Hadoop on the other hand is a technology. It is a distributed compute framework to process large volumes of data. In the past data warehouses were typically built on relational databases and data warehouse appliances. However, over the last couple of years various limitations of the RDBMS have emerged (exploding license costs in the face of growing data volumes, poor fit for purpose for querying graphs and hierarchies and ingesting unstructured data types etc.). At the same time MPP SQL query engines on Hadoop have appeared such as Apache Drill that now make it possible to query data that sits on Hadoop.
I have written a whole series of posts on the subject if you are interested in all of the details. Data Warehousing in the age of big data. The end of an era?
回答2:
I think you will find the following article very usefull to your thoughts.
It’s important to divide the techniques of data warehousing from the implementation. Hadoop (and the advent of NoSQL databases) will auger the demise of data warehousing appliances and the “traditional” single database implementation of a data warehouse.
It is safe to say that traditional, single server relational databases or database appliances are not the future of big data or data warehouses.
On the other hand, the techniques of data warehousing to include Extract-Transform-and-Load (ETL), dimensional modeling and business intelligence will be adapted to the new Hadoop/NoSQL environments.
From: http://gcn.com/blogs/reality-check/2014/01/hadoop-vs-data-warehousing.aspx
回答3:
I have some great slides describing the difference between Hadoop and Data Warehouse, and how both complement each other:
http://www.kai-waehner.de/blog/2014/05/13/hadoop-and-data-warehouse-dwh-friends-enemies-or-profiteers-what-about-real-time-slides-including-tibco-examples-from-jax-2014-online/
回答4:
I found this http://www.b-eye-network.com/view/17017 which describes the difference of big data and data ware house
when we compare a big data solution to a data warehouse, what do we find? We find that a big data solution is a technology and that data warehousing is an architecture. They are two very different things. A technology is just that – a means to store and manage large amounts of data. A data warehouse is a way of organizing data so that there is corporate credibility and integrity. When someone takes data from a data warehouse, that person knows that other people are using the same data for other purposes. There is a basis for reconcilability of data when there is a data warehouse.
回答5:
Maybe this viewpoint can help you: Basically Data Warehouse is an architecture, while Big Data is a technology. The first one became a well-known trend in the recent 20 years, while the latter one gained popularity only in the last decade.
Big Data and Data Warehouse are both used for reporting and can be called subject-oriented technologies. This means that they are aimed to provide information about a certain subject (f.e. a customer, supplier, employee or even a product). Data Warehouse is more advanced when it comes to holistic data analysis, while the main advantage of Big Data is that you can gather and process information from almost all well-known sources (f.e. social media or even specific machine data).
More here gbksoft.com/blog/big-data-and-data-warehouse/
回答6:
The warehouse stores the actual data. It stores some of the entire cluster data. Data Warehouse is a system used for reporting and data analysis. It is central repositories of integrated data from one or more disparate sources. They store current and historical data in one single place that are used for creating analytical reports.
vs.
Big data refers to large-scale data that is generated in digital environment. This big data is generally large in size and has a short generation cycle. It includes not only numeric data but also text and image data. Big data environment is more diverse than previous ones. As data types are diverse and the amount of size is huge, It is even possible to analyze and predict people's opinions and behaviors. In addition, Machbase database will launch the enterprise edition which has a warehouse concept.
来源:https://stackoverflow.com/questions/19043747/what-is-the-actual-difference-between-data-warehouse-big-data