1. INTRODUCTION
- What is Hadoop?
- History of Hadoop
- Building Blocks – Hadoop Eco-System
- Who is behind Hadoop?
- What Hadoop is good for, and why
- Configuring HDFS
- Interacting With HDFS
- HDFS Permissions and Security
- Additional HDFS Tasks
- HDFS Overview and Architecture
- HDFS Installation
- Hadoop File System Shell
- File System Java API
- Map/Reduce Overview and Architecture
- Installation
- Developing Map/Reduce Jobs
- Input and Output Formats
- Job Configuration
- Job Submission
- Practicing Map/Reduce Programs (at least 10 Map/Reduce Algorithms)
- Configuring Hadoop API on Eclipse IDE
- Connecting Eclipse IDE to HDFS
6. Advanced MapReduce Features
- Custom Data Types
- Input Formats
- Output Formats
- Partitioning Data
- Reporting Custom Metrics
- Distributing Auxiliary Job Data
8. Using Yahoo Web Services
9. Pig
- Pig Overview
- Installation
- Pig Latin
- Pig with HDFS
10. Hive
- Hive Overview
- Installation
- Hive QL
- Analyzing Unstructured Data with Hive
- Analyzing Semi-structured Data with Hive
11. HBase
- HBase Overview and Architecture
- HBase Installation
- HBase Shell
- CRUD operations
- Scanning and Batching
- Filters
- HBase Key Design
12. ZooKeeper
- ZooKeeper Overview
- Installation
- Server Maintenance
13. Sqoop
- Sqoop Overview
- Installation
- Imports and Exports
14. CONFIGURATION
- Basic Setup
- Important Directories
- Selecting Machines
- Cluster Configurations
- Small Clusters: 2-10 Nodes
- Medium Clusters: 10-40 Nodes
- Large Clusters: Multiple Racks
16. Putting it all together
- Distributed installations
- Best Practices