Posts

Hadoop - Introduction

Hadoop is an Apache open source framework written in java that allows distributed processing of large datasets across clusters of computers using simple programming models. A Hadoop frame-worked application works in an environment that provides distributed storage and computation across clusters of computers. Hadoop is designed to scale up from single server to thousands of machines, each offering local computation and storage. Hadoop Architecture Hadoop framework incorporates following four modules:  Hadoop Common: These are Java libraries and utilities required by other Hadoop modules. These libraries gives filesystem and OS level deliberations and contains the important Java documents and contents required to begin Hadoop.  Hadoop YARN: This is a structure for work booking and group asset administration.  Hadoop Distributed File System (HDF): A dispersed record framework that gives high-throughput access to application information.  Ha...
Recent posts