Module 1: Apache Hive and HiveQL
What is Hive, Hive DDL - Create/Show database, Hive DDL - Create/Show/Drop tables, Hive DML – Load files & Insert data, Hive SQL - Select, Filter, Join, Group By, Hive architecture & components, Difference between Hive and RDBMS
Module 2: Advanced HiveQL
Multi-Table Inserts, Joins, Grouping Sets, Cubes, Rollups, Custom Map and Reduce scripts, Hive SerDe, Hive UDF, Hive UDAF.
Module 3: Apache Flume, Sqoop, Oozie
Sqoop - How Sqoop works, Sqoop architecture, Flume complex flow – Multiplexing, Oozie - Simple/Complex flow, Oozie service/Scheduler, Use cases - Time and data triggers.
Module 4: NoSQL Databases
CAP theorem, RDBMS vs NoSQL, Key-value stores: Memcached, Riak, Redis, DynamoDB, Column family: Cassandra, HBase, Graph store: Neo4j, Document store: MongoDB, CouchDB.
Module 5: Apache HBase
When/Why to use HBase, HBase architecture/Storage, HBase data model, HBase families/column families, HBase Master, HBase vs RDBMS, Access HBase data.
Module 6: Apache ZooKeeper
What is ZooKeeper, ZooKeeper data model, Znode types, Sequential znodes, Installing and configuring, Running ZooKeeper, ZooKeeper use cases.
Module 7: Hadoop 2.0
YARN, MRv2, MapReduce limitations, HDFS 2: Architecture, HDFS 2: High availability, HDFS 2: Federation, YARN Architecture, Classic vs YARN, YARN multitenancy, YARN capacity scheduler.
Who Should Attend?
After completing this course and successfully passing the certification examination, the student will be awarded the “Big Data Analytics with Hadoop2” certification.
If a learner chooses not to take the examination, they will still receive a 'Participation Certificate'.