Hadoop DP Notes: What Is the Difference between the Secondary NameNode, Checkpoint Node, and Backup Node? The Secondary NameNode is a poorly named component of Hadoop.


inode

Files and directories are represented on the NameNode by inodes. Inodes record attributes such as permissions, modification and access times, and namespace and disk space quotas.
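As a rough illustration of the attributes listed above, an inode can be pictured as a small record. This is only a sketch: the class name `ToyInode` and its field names are made up for the example and do not mirror Hadoop's internal classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyInode:
    """Hypothetical record of the per-file or per-directory attributes."""
    path: str
    permissions: int                        # POSIX-style mode bits, e.g. 0o755
    modification_time: float                # seconds since the epoch
    access_time: float                      # seconds since the epoch
    namespace_quota: Optional[int] = None   # max names allowed under a directory
    diskspace_quota: Optional[int] = None   # max bytes allowed under a directory


inode = ToyInode(
    path="/user/alice",
    permissions=0o755,
    modification_time=1700000000.0,
    access_time=1700000000.0,
    namespace_quota=1000,
)
print(oct(inode.permissions))  # 0o755
```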

NameNode

The NameNode stores the metadata of HDFS. The state of HDFS is stored in a file called `fsimage`, which is the base of the metadata. At runtime, modifications are only appended to a log file called `edits`. On the next start-up of the NameNode, the state is read from `fsimage`, the changes from `edits` are applied to it, and the new state is written back to `fsimage`. After this, `edits` is cleared and is ready for new log entries.
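The start-up sequence can be pictured with a toy model. This is an illustration only, not Hadoop code: the dictionary-based namespace, the tuple-shaped log entries, and the function names `apply_edits` and `startup` are all assumptions made for the example.

```python
# Toy model of NameNode start-up: replay `edits` on top of `fsimage`.
# Data shapes and function names are hypothetical, not Hadoop's.

def apply_edits(fsimage, edits):
    """Return a new namespace with every logged change applied in order."""
    namespace = dict(fsimage)  # copy so the old image is left untouched
    for op, path, value in edits:
        if op == "create":
            namespace[path] = value
        elif op == "delete":
            namespace.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return namespace


def startup(fsimage, edits):
    """Merge the log into the image and return (new_fsimage, empty_edits)."""
    new_fsimage = apply_edits(fsimage, edits)
    return new_fsimage, []  # the log starts empty again


fsimage = {"/data/a.txt": {"replication": 3}}
edits = [
    ("create", "/data/b.txt", {"replication": 2}),
    ("delete", "/data/a.txt", None),
]
fsimage, edits = startup(fsimage, edits)
print(fsimage)  # {'/data/b.txt': {'replication': 2}}
print(edits)    # []
```

The longer the log, the more entries `apply_edits` has to replay, which is why a large `edits` file slows down a restart.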

Secondary Namenode
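Because `edits` is merged into `fsimage` only when the NameNode starts, the log can grow very large on a long-running cluster, and the next restart becomes slow. The Secondary NameNode is the solution to this issue. It runs on another machine that has connectivity to the NameNode. It periodically copies `fsimage` and the `edits` log from the NameNode, merges the log into the image, and moves the updated `fsimage` back to the NameNode. The Secondary NameNode is not meant to provide high availability for the NameNode. The high-level tasks performed by the Secondary NameNode are: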

1. Receives the edit logs from the NameNode and merges them into `fsimage`.

2. Copies the updated `fsimage` back to the NameNode.

3. The updated `fsimage` reduces the NameNode's start-up time.

The whole purpose of the Secondary NameNode is to create checkpoints in HDFS.
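The checkpoint cycle described in this section can be sketched as follows. This is a simplified model under stated assumptions: `ToyNameNode` and `ToySecondaryNameNode` are hypothetical classes, the namespace is a plain dictionary, and details of the real protocol (such as how the log is rolled during a checkpoint) are left out.

```python
# Toy model of a Secondary NameNode checkpoint cycle.
# Class and method names are hypothetical, not Hadoop's API.

class ToyNameNode:
    def __init__(self):
        self.fsimage = {}   # last checkpointed namespace
        self.edits = []     # changes logged since that checkpoint

    def log(self, op, path):
        self.edits.append((op, path))


class ToySecondaryNameNode:
    def checkpoint(self, namenode):
        # 1. Fetch copies of the current image and edit log.
        image = dict(namenode.fsimage)
        edits = list(namenode.edits)
        # 2. Merge the log into the image.
        for op, path in edits:
            if op == "create":
                image[path] = True
            elif op == "delete":
                image.pop(path, None)
        # 3. Send the merged image back and drop the entries it now covers.
        namenode.fsimage = image
        namenode.edits = namenode.edits[len(edits):]
        return len(edits)


nn = ToyNameNode()
nn.log("create", "/a")
nn.log("create", "/b")
nn.log("delete", "/a")

merged = ToySecondaryNameNode().checkpoint(nn)
print(merged)      # 3
print(nn.fsimage)  # {'/b': True}
print(nn.edits)    # []
```

After the checkpoint the NameNode holds a fresh image and an empty log, so a restart at that moment would have nothing to replay.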

Backup Node

The Backup Node in Hadoop is an extended Checkpoint Node that performs checkpointing and also supports online streaming of file system edits.
Its advantage over the Checkpoint Node is that the namespace held in its main memory is always in sync with the primary NameNode's file system, since it maintains an in-memory, up-to-date copy of the namespace.
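The effect of streaming edits can be shown with another toy model. The class `ToyBackupNode` and its methods are hypothetical names chosen for this sketch; the point is only that each edit is applied on arrival, so a checkpoint does not need a download or replay step.

```python
# Toy contrast: a backup-style node applies each edit as it is streamed,
# so its in-memory namespace is already current when a checkpoint is taken.
# All names here are hypothetical.

class ToyBackupNode:
    def __init__(self):
        self.namespace = {}      # in-memory copy kept in sync with the NameNode
        self.saved_image = None  # last image written by checkpoint()

    def on_edit(self, op, path):
        """Called for every edit the NameNode streams over."""
        if op == "create":
            self.namespace[path] = True
        elif op == "delete":
            self.namespace.pop(path, None)

    def checkpoint(self):
        """No download or replay needed: persist the in-memory state."""
        self.saved_image = dict(self.namespace)
        return self.saved_image


backup = ToyBackupNode()
for op, path in [("create", "/a"), ("create", "/b"), ("delete", "/a")]:
    backup.on_edit(op, path)

print(backup.checkpoint())  # {'/b': True}
```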

Checkpoint Node
A Checkpoint Node creates checkpoints on its local file system: it downloads the `fsimage` and `edits` files from the active primary NameNode, merges the two, and saves the new image locally.
Because the Backup Node already holds the current namespace in memory and does not need to download these files, checkpoint creation on a Backup Node is generally faster than on a Checkpoint Node.