Tech Terminology De-mystified – Big Data

Big data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process the data within a tolerable elapsed time. Big data sizes are a constantly moving target, as of 2012 ranging from a few dozen terabytes to many petabytes of data in a single data set. The target moves due to constant improvement in traditional DBMS technology as well as new databases like NoSQL and their ability to handle larger amounts of data. With this difficulty, new platforms of “big data” tools are being developed to handle various aspects of large quantities of data.

That is how Wikipedia defined big data as of 2012.
 
What is the buzz about big data?
There is a lot of excitement about the possibility of analyzing big data to discover patterns – patterns that businesses hope will give them insight into customer behavior and a competitive advantage. Big data may also be used in scientific research, possibly revealing new connections in existing scientific data that could lead to cures for diseases or new drug combinations.
 
The threshold for what counts as big data keeps rising as technology advances and tools and databases become able to handle larger volumes. Merely possessing big data is of no value to a business. The companies that recognize they hold big data, and know where to focus their analytical tools, may be the ones that come up trumps.
 
Are there any open source tools that businesses can try?
Yes. Apache Hadoop is one that has attracted a lot of attention, but there are others. According to this article, Spark, Drill, D3.js, and HCatalog are promising tools worth a look.
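To give a flavor of the map-reduce style of processing that Hadoop popularized, here is a toy word count written as plain local Python. It is only an illustrative sketch of the idea – it does not use Hadoop's actual API, and the sample text is made up.

```python
# A minimal local sketch of the map-reduce idea behind Hadoop
# (pure Python, not Hadoop's real API; sample data is hypothetical).
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in every line.
    for line in lines:
        for word in line.lower().split():
            yield word, 1

def reduce_phase(pairs):
    # Reduce: sum the counts for each distinct word.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["big data big insights", "data at scale"]
print(reduce_phase(map_phase(lines)))
# {'big': 2, 'data': 2, 'insights': 1, 'at': 1, 'scale': 1}
```

In a real Hadoop cluster the map and reduce steps run in parallel across many machines, with the framework shuffling the intermediate (word, count) pairs between them – that distribution is what makes the model suitable for data sets too large for a single machine.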