Hadoop DP Notes: Are MapReduce job reducer output file blocks also replicated? If yes, how many copies are maintained?

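Yes. Unlike mapper output, which is written to the local file system of the node running the map task, reducer output files (for example, part-r-00000) are stored on HDFS. Each block of a reducer output file is therefore replicated according to the HDFS replication factor (dfs.replication), which is 3 by default, so three copies of each block are maintained. For each HDFS block of the reduce output, the first replica is stored on the local node (the node running the reducer), and the other replicas are stored on off-rack nodes: with the default placement policy, the second replica goes to a node in a different rack and the third to another node in that same remote rack.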
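The placement described above can be illustrated with a small toy model. This is only a sketch of the idea, not Hadoop's actual code: the function name `place_replicas`, the `topology` dictionary, and the node and rack names are all hypothetical, and the real NameNode policy also considers factors such as free space and node load that are ignored here.

```python
import random

def place_replicas(writer, topology, replication=3, seed=None):
    """Toy model: choose nodes for the replicas of one HDFS block.

    writer   -- node running the reducer (assumed to also be a DataNode)
    topology -- dict mapping rack name -> list of node names (hypothetical)
    """
    rng = random.Random(seed)
    rack_of = {n: r for r, nodes in topology.items() for n in nodes}

    # 1st replica: the local node that writes the reducer output.
    replicas = [writer]

    # 2nd replica: a node in a different rack than the writer.
    remote = [n for n in rack_of if rack_of[n] != rack_of[writer]]
    if replication >= 2 and remote:
        second = rng.choice(remote)
        replicas.append(second)

        # 3rd replica: another node in the same rack as the 2nd replica.
        same_rack = [n for n in remote
                     if rack_of[n] == rack_of[second] and n != second]
        if replication >= 3 and same_rack:
            replicas.append(rng.choice(same_rack))

    # Any replicas still missing go to random unused nodes (simplification).
    unused = [n for n in rack_of if n not in replicas]
    rng.shuffle(unused)
    replicas += unused[:max(0, replication - len(replicas))]
    return replicas[:replication]

# Example with two racks of two nodes each; the reducer runs on n1.
topology = {"rack1": ["n1", "n2"], "rack2": ["n3", "n4"]}
print(place_replicas("n1", topology))
```

With this two-rack layout and the default factor of 3, the result always starts with n1 and is followed by n3 and n4 in some order, which matches the "one local copy, two off-rack copies" description.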
