Hadoop DP Notes
What is the difference between the Secondary NameNode, Checkpoint Node and Backup Node?
The Secondary NameNode is a poorly named component of Hadoop.
The NameNode stores the metadata of the HDFS. The state of HDFS is stored in a file called
Secondary Namenode
The Backup Node in Hadoop is an extended Checkpoint Node that performs checkpointing and also supports online streaming of file system edits.
Checkpoint Node
HDFS High Availability Architecture
To provide a hot backup and a consistent solution for NameNode failure, the
concept of using two NameNodes (one Active and one Standby) was introduced.
The diagram below describes the architecture of HDFS high availability.
In a cluster, two nodes can be configured as NameNodes. Each
NameNode is assigned a role, either Active or Standby. The Active NameNode
handles the client requests in the cluster, and the Standby NameNode acts as a
backup node and maintains enough state to provide a consistent FS-Image if the
Active NameNode fails.
To keep the state of the two NameNodes in sync, the edit logs from the
Active NameNode need to be shared with the Standby NameNode. Hadoop offers two
state synchronization methods: the Quorum Journal Manager, or shared storage on
a Network File System.
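The quorum idea behind the Quorum Journal Manager can be illustrated with a small sketch: an edit counts as durable only when a majority of journal nodes acknowledge it. This is a simplified model, not Hadoop code. The `JournalNode` class and `write_edit` function are hypothetical names invented for illustration, and the real protocol involves more than this (for example, recovery of partially written edits).

```python
from dataclasses import dataclass, field


@dataclass
class JournalNode:
    """Hypothetical stand-in for one journal node; not a Hadoop class."""
    name: str
    up: bool = True
    edits: list = field(default_factory=list)

    def append(self, txid: int, op: str) -> bool:
        # A node that is down cannot acknowledge the write.
        if not self.up:
            return False
        self.edits.append((txid, op))
        return True


def write_edit(journals, txid, op):
    """Return True if a strict majority of journal nodes acknowledged the edit."""
    acks = sum(1 for j in journals if j.append(txid, op))
    return acks > len(journals) // 2


journals = [JournalNode("jn1"), JournalNode("jn2"), JournalNode("jn3")]

journals[2].up = False  # one of three nodes is down: 2 acks, still a majority
ok_with_one_down = write_edit(journals, 1, "mkdir /data")

journals[1].up = False  # two of three nodes are down: 1 ack, no majority
ok_with_two_down = write_edit(journals, 2, "mkdir /logs")

print(ok_with_one_down, ok_with_two_down)  # True False
```

With three journal nodes the sketch tolerates one failure, which is why an odd number of nodes is the usual choice for a quorum.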
The DataNodes send block location information and heartbeats to
both NameNodes. At any point in time, exactly one of the NameNodes
should be in the Active state. If both NameNodes are Active at once, the
result is a "split-brain" scenario. To avoid this, an
administrator should configure a fencing method.
If the Active NameNode fails, the state of the Standby NameNode is
changed to Active. This transition from Standby to Active is either
manual or automatic. After a successful transition, client requests are
redirected to the new Active NameNode.
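The ordering that matters in a failover (fence the old Active first, then promote the Standby) can be sketched as follows. This is a toy model under stated assumptions: `NameNode`, `fence`, `failover` and `active_nodes` are hypothetical names, and `fence` simply flips a flag where a real fencing method would cut the old node off from the shared edit log.

```python
class NameNode:
    """Hypothetical model of a NameNode's HA role; not a Hadoop class."""

    def __init__(self, name, state="standby"):
        self.name = name
        self.state = state
        self.alive = True


def fence(node):
    """Stand-in for a fencing method: ensure the node can no longer act as Active."""
    node.state = "standby"
    return True


def failover(active, standby):
    """Promote the standby only after the old active has been fenced."""
    if not fence(active):
        raise RuntimeError("fencing failed; refusing to promote standby")
    standby.state = "active"
    return standby


def active_nodes(nodes):
    """Names of all nodes currently in the active state."""
    return [n.name for n in nodes if n.state == "active"]


nn1 = NameNode("nn1", state="active")
nn2 = NameNode("nn2")

nn1.alive = False  # simulated failure of the Active NameNode
new_active = nn1
if not nn1.alive:
    new_active = failover(nn1, nn2)

print(active_nodes([nn1, nn2]))  # ['nn2']
```

Because the promotion happens only after fencing succeeds, the model never has two nodes in the active state, which is the property that prevents split-brain.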