Hadoop DP Notes
What is side data distribution in the MapReduce framework?
The extra read-only data that a MapReduce job needs in order to process the main data set is called side data.
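There are two ways to make side data available to all the map and reduce tasks:
Job Configuration: the side data is set as key-value pairs in the job's configuration and read back by each task. This suits only small amounts of data, because every task loads the whole configuration into memory.
Distributed cache: the side data is stored in files that the framework copies to the nodes where the tasks run. This suits larger side data.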
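
The job configuration route can be sketched as follows. A job configuration holds string values, so a small lookup table has to be flattened into a string in the driver and rebuilt in each task. This is only a minimal sketch under stated assumptions: `java.util.Properties` stands in for Hadoop's `Configuration` so the example runs without a cluster, the class name `SideDataSketch` and the property name `sidedata.country.codes` are hypothetical, and keys and values are assumed not to contain `;` or `=`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

public class SideDataSketch {
    // Hypothetical property name, chosen for this example only.
    public static final String KEY = "sidedata.country.codes";

    // Driver side: flatten a small lookup table into one string value.
    // Assumes keys and values contain neither ';' nor '='.
    public static String encode(Map<String, String> table) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : table.entrySet()) {
            if (sb.length() > 0) {
                sb.append(';');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    // Task side: rebuild the lookup table from the string value.
    public static Map<String, String> decode(String value) {
        Map<String, String> table = new LinkedHashMap<>();
        if (value == null || value.isEmpty()) {
            return table;
        }
        for (String pair : value.split(";")) {
            int i = pair.indexOf('=');
            table.put(pair.substring(0, i), pair.substring(i + 1));
        }
        return table;
    }

    public static void main(String[] args) {
        Map<String, String> codes = new LinkedHashMap<>();
        codes.put("IN", "India");
        codes.put("US", "United States");

        // Stand-in for the job configuration (an assumption of this sketch).
        Properties conf = new Properties();
        conf.setProperty(KEY, encode(codes));

        // What a map or reduce task would do once, before processing records.
        Map<String, String> sideData = decode(conf.getProperty(KEY));
        System.out.println(sideData.get("IN"));
    }
}
```

In a real job the encoded string would typically be stored with `Configuration.set(name, value)` in the driver and read with `context.getConfiguration().get(name)` in the task's `setup` method. Side data too large for this is better shipped as a file through the distributed cache, for example with `Job.addCacheFile(uri)`.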