Category: BIG DATA

Overview of various Hadoop Distributions

Most big data enterprises today use Apache Hadoop in one way or another. Hadoop is by no means an out-of-the-box solution. It is an open source system, but to simplify working with it, several enterprise distributions such as Cloudera, MapR and Hortonworks are available in the market. Vendor distributions aim to overcome the issues that users typically encounter with the standard edition. Think of the various distributions of Linux, such as Red Hat.

Big Data concepts in simple language

Let us try to understand the concept of Big Data in simple language.

Your friend brought you a hard disk of, say, 2 TB containing lots of data: dat files (exported from a relational database), Excel files and CSV files. He told you that he needs some critical reports based on this data. You created a new database with the related schemas, then imported all of the data into your database using a standard tool. You used a query language like SQL to fetch the required reports. Mission accomplished.
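The import-then-query workflow described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `sales` table, its `region` and `amount` columns, and the sample rows are hypothetical, and an in-memory SQLite database stands in for whatever database and import tool you would actually use.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for one of the CSV files on the disk.
SAMPLE_CSV = """region,amount
north,120.50
south,80.00
north,30.25
"""


def load_csv(conn, csv_text):
    """Create the (assumed) sales table and import the CSV rows into it."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (region TEXT, amount REAL)")
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["region"], float(r["amount"])) for r in reader]
    conn.executemany("INSERT INTO sales (region, amount) VALUES (?, ?)", rows)
    conn.commit()


def report_by_region(conn):
    """Return the total amount per region, sorted by region name."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return cur.fetchall()


# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
load_csv(conn, SAMPLE_CSV)
print(report_by_region(conn))
```

With a few files this approach is enough: one database, one import step, one SQL query per report.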

What do Hadoop Administrators do?

Database admins and other professionals who are planning to switch to the Hadoop world as Hadoop Administrators should be aware of what is expected of them. Below are just some of the basic requirements that this job demands. (These are not Hadoop Developer activities but the admin side of Hadoop.)

  • Responsible for implementation and administration of Hadoop infrastructure.
  • Keep expanding existing Hadoop environments by deploying new hardware and software as required.