Dealing with these huge amounts of data is a real challenge for the IT community. Traditional data processing technologies cannot handle them properly, which is why new technologies (Big Data technologies) were needed. The question now is: when do we need these new technologies? Does Big Data just mean “a huge amount of data”? Does the “Big” in “Big Data” just refer to the volume of the data?
If you read several articles about Big Data, you will very likely find the 3 V’s definition in most of them (or at least in half of them). The 3 V’s definition was first introduced in 2001 by Doug Laney, then an analyst at META Group (which Gartner Inc. later acquired), and you can find the latest version of this definition in Gartner’s IT Glossary (Gartner n.d.):
“Big data is high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation.”
The 3 V’s here are volume, velocity and variety:
- Volume refers to vast amounts of data, which can be generated, for instance, from cell phones, social media, photographs…
- Velocity refers to the speed at which these incredible amounts of data are being generated, collected and analysed.
- Variety refers to the different types of data: we no longer have only structured data (data that fits neatly into a table, such as name, phone number, ID, etc.). Most current data is unstructured: images, audio, social media updates, etc.
In other words, Big Data is data that contains greater variety and is arriving in increasing volumes and with ever-higher velocity (Oracle n.d.). The challenges of Big Data (and therefore the need for Big Data technologies) result from the expansion of all three of these properties, not from volume alone.
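To make the three properties a little more concrete, here is a minimal Python sketch that computes a rough figure for each V over a small batch of incoming data. It is only an illustration: the `Event` structure, its field names, the `summarize_three_vs` function and the sample values are all hypothetical and do not come from Gartner, Oracle or any Big Data tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Event:
    """One incoming piece of data (hypothetical structure, for illustration only)."""
    timestamp: float  # seconds since the start of the observation window
    kind: str         # e.g. "table_row", "image", "social_post"
    size_bytes: int


def summarize_three_vs(events):
    """Return rough volume, velocity and variety figures for a batch of events."""
    if not events:
        return {"volume_bytes": 0, "velocity_events_per_s": None, "variety": {}}

    # Volume: how much data arrived in total.
    volume = sum(e.size_bytes for e in events)

    # Velocity: how fast events arrived over the observed time span.
    span = max(e.timestamp for e in events) - min(e.timestamp for e in events)
    # A rate is undefined for a zero-length window, so report None in that case.
    velocity = len(events) / span if span > 0 else None

    # Variety: how many events of each kind were seen.
    variety = dict(Counter(e.kind for e in events))

    return {
        "volume_bytes": volume,
        "velocity_events_per_s": velocity,
        "variety": variety,
    }


# Made-up sample: one structured row plus unstructured images and a post.
sample = [
    Event(0.0, "table_row", 200),
    Event(1.0, "image", 2_000_000),
    Event(2.0, "social_post", 500),
    Event(4.0, "image", 1_500_000),
]
summary = summarize_three_vs(sample)
print(summary)
```

In this toy batch, almost all of the volume comes from the unstructured items, which hints at why the three properties tend to grow together rather than independently.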
Apart from this broadly accepted 3 V’s definition, several authors claim that something is missing from it. The IT industry seems to love acronyms, so a quick search turns up the 4 V’s of Big Data, the 5 V’s, the 8 V’s … and, why not, even the 42 V’s of Big Data (Shafer 2017)!
Among all these additional V’s, which I encourage you to explore, there is one that I think is really worth mentioning: Value. This particular V is very well explained by Cano (2014):
“When we talk about value, we’re referring to the worth of the data being extracted. Having endless amounts of data is one thing, but unless it can be turned into value it is useless. While there is a clear link between data and insights, this does not always mean there is value in Big Data. The most important part of embarking on a big data initiative is to understand the costs and benefits of collecting and analyzing the data to ensure that ultimately the data that is reaped can be monetized.”
And that’s the point: before investing a lot of money in Big Data management and technologies, make sure first that they will be used properly, and second that the results are likely to be valuable for your company.
We encourage you to come to MIUC to learn more about Big Data, including Big Data’s main technologies and when these can be really helpful for your company.
- Cano, J. (2014) The V’s of Big Data: Velocity, Volume, Value, Variety, and Veracity [Online]. Available at: https://www.xsnet.com/blog/bid/205405/the-v-s-of-big-data-velocity-volume-value-variety-and-veracity [Accessed: 14 January 2020].
- Gartner (n.d.) Information Technology Gartner Glossary [Online]. Available at: https://www.gartner.com/en/information-technology/glossary/big-data [Accessed: 14 January 2020].
- Oracle (n.d.) What Is Big Data? [Online]. Available at: https://www.oracle.com/big-data/guide/what-is-big-data.html [Accessed: 14 January 2020].
- Shafer, T. (2017) The 42 V’s of Big Data and Data Science [Online]. Available at: https://www.kdnuggets.com/2017/04/42-vs-big-data-data-science.html [Accessed: 14 January 2020].
- Featured Photo by Joshua Sortino on Unsplash