
Category: Big Data Basics

Overview of various Hadoop Distributions

Most enterprises working with big data today use Apache Hadoop in one way or another. Hadoop is by no means an out-of-the-box solution, though. It is an open source system, and to simplify working with it, several enterprise distributions such as Cloudera, MapR and Hortonworks are available in the market. These vendor distributions aim to overcome the issues that users typically encounter with the standard Apache release. Think of the various distributions of Linux, such as Red Hat.


Big Data concepts in simple language

Let us try to understand the concept of Big Data in simple language.

Your friend brought you a hard disk of, say, 2 TB, which contained lots of data in the form of .dat files (exported from a relational database), Excel files and CSV files. He told you that he needed some critical reports based on this data. You created a new database and the related schemas. Then you took all this data and imported it into your database with a standard tool. You used a query language like SQL and fetched the required reports. Mission accomplished.
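The steps in this story (create a database, import the files, query with SQL) can be sketched in a few lines of Python. This is only an illustration under assumptions: the `sales` table, its `region` and `amount` columns, the sample rows and the helper names `load_csv` and `sales_report` are all made up here, and SQLite stands in for whatever database and import tool you would actually use.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for one CSV file from the hard disk.
SAMPLE_CSV = """region,amount
north,120.5
south,80.0
north,40.0
"""


def load_csv(conn, csv_text):
    """Create the (assumed) sales table and import the CSV rows into it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (region TEXT, amount REAL)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(row["region"], float(row["amount"])) for row in reader]
    conn.executemany(
        "INSERT INTO sales (region, amount) VALUES (?, ?)", rows
    )
    conn.commit()


def sales_report(conn):
    """Fetch a simple report: total amount per region."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return cursor.fetchall()


# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
load_csv(conn, SAMPLE_CSV)
report = sales_report(conn)
print(report)
```

With a real disk you would read each file from its path instead of a string, and the table layout would follow the actual columns in the data.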
