What is Big Data & What classifies as Big data? [closed]

Submitted by 霸气de小男生 on 2019-12-04 21:26:15

Big data is an assortment of data so huge and complex that it becomes very tedious to capture, store, process, retrieve and analyze.

From an ibmbigdatahub article and an edureka article:

Big data can be defined in terms of four Vs.

  1. Volume: The main characteristic that makes data “big” is its sheer volume. It can amount to hundreds of terabytes or even petabytes of information. For instance, 15 terabytes of Facebook posts or 400 billion annual medical records could count as Big Data!

  2. Velocity: Velocity is the rate at which data flows into an organization. Big data requires fast processing, and time is a crucial factor for many organizations. For instance, processing 2 million records on the stock market, or evaluating the results of millions of students who applied for competitive exams, could count as Big Data!

  3. Variety: Big Data need not belong to a specific format. It can take any form: structured or unstructured, text, images, audio, video, log files, emails, simulations, 3D models, etc.

  4. Veracity: Veracity refers to the uncertainty of the available data. Data can be messy and may be difficult to trust. With many forms of big data, quality and accuracy are difficult to control.
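To make the four Vs a bit more concrete, here is a minimal Python sketch that checks a dataset profile against each of them. Everything in it is an assumption made for illustration: the `DatasetProfile` fields, the `flagged_vs` helper and, above all, the numeric thresholds are hypothetical. Neither of the articles above defines fixed cut-offs.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration; the quoted
# articles do not define any fixed cut-offs.
VOLUME_TB = 100.0          # "hundreds of terabytes"
VELOCITY_RPS = 10_000.0    # incoming records per second
VARIETY_FORMATS = 3        # number of distinct data formats
VERACITY_BAD_RATIO = 0.05  # share of records that fail validation


@dataclass
class DatasetProfile:
    size_tb: float
    records_per_second: float
    formats: set[str]
    bad_record_ratio: float


def flagged_vs(p: DatasetProfile) -> list[str]:
    """Return the names of the Vs on which the profile meets a threshold."""
    checks = {
        "volume": p.size_tb >= VOLUME_TB,
        "velocity": p.records_per_second >= VELOCITY_RPS,
        "variety": len(p.formats) >= VARIETY_FORMATS,
        "veracity": p.bad_record_ratio >= VERACITY_BAD_RATIO,
    }
    return [name for name, hit in checks.items() if hit]


profile = DatasetProfile(250.0, 500.0, {"text", "images", "logs"}, 0.01)
print(flagged_vs(profile))  # ['volume', 'variety']
```

A dataset does not need to trip all four checks; in practice the Vs are a way of describing why ordinary tooling struggles, not a strict test.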

Big data is:

When a big boss believes this is a big opportunity because data is the new oil and gold, and gets a big pile of money to throw out of a window and flush down the bowels. And then your data warehouses and silos turn into a data lake, and the data lake full of synergy into a data swamp full of bit rot; where the big vision hits the reality that not everything that shines is gold. And then the gates of doom open and there it comes, the big bubble that is about to burst. The bridge over the trough of disillusionment is small, and thou shalt not pass, but tumble into the big abyss where all useless data go, no matter how eagerly it was collected and mapped and reduced without plan or objective. Bingo!

The Big Data Definitions & Taxonomies Subgroup of the NIST Big Data Public Working Group released a volume on definitions: NIST Big Data Interoperability Framework: Volume 1, Definitions.

Quotes:

Big Data refers to the inability of traditional data architectures to efficiently handle the new datasets. Characteristics of Big Data that force new architectures are:

  • Volume (i.e., the size of the dataset);
  • Variety (i.e., data from multiple repositories, domains, or types);
  • Velocity (i.e., rate of flow); and
  • Variability (i.e., the change in other characteristics).

These characteristics—volume, variety, velocity, and variability—are known colloquially as the ‘Vs’ of Big Data

and:

Big Data consists of extensive datasets—primarily in the characteristics of volume, variety, velocity, and/or variability—that require a scalable architecture for efficient storage, manipulation, and analysis.
