We can control/increase/decrease speed of copying by configuring the number of map tasks to be run for each sqoop copying process. We can do this by providing argument -m 10 or –num-mappers 10 argument to sqoop import command. If we specify -m 10 then it will submit 10 map tasks parallel at a time. Based on our requirement we can increase/decrease this number to control the copy speed.
How can we control the parallel copying of RDBMS tables into hadoop ?
We can control/increase/decrease speed of copying by configuring the number of map tasks to be run for each sqoop copying process. We can do this by providing argument -m 10 or –num-mappers 10 argument to sqoop import command. If we specify -m 10 then it will submit 10 map tasks parallel at a time. Based on our requirement we can increase/decrease this number to control the copy speed.
-
Main driver class which provides job configuration parameters. Mapper class which must extend org.apache.hadoop.mapredu...
-
This will be used to extract various date formats. The available date formats as follows. Syntax: to_char ( date , fo...