When the source data is updated frequently, how do you keep the data that Sqoop imported into HDFS in sync with it?



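Sqoop offers two approaches, both based on incremental imports.
a − Use the --incremental parameter with the append option. Sqoop compares the column named by --check-column (typically an auto-incrementing key) against --last-value and imports only rows with a greater value, adding them as new rows. This suits tables where rows are added but existing rows are not changed.
b − Use the --incremental parameter with the lastmodified option. Sqoop checks a date or timestamp column in the source and imports the records that were updated after the last import.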
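
As a rough illustration of the two options, the sketch below builds the corresponding `sqoop import` command lines in Python and prints them. The flags (`--incremental`, `--check-column`, `--last-value`, `--merge-key`) are standard Sqoop 1 import options, but the JDBC URL, table name, target directory, column names, and last values are all hypothetical placeholders, and the helper function itself is only an assumption made for this example.

```python
import shlex


def incremental_import_args(mode, check_column, last_value, merge_key=None):
    """Build the argument list for an incremental `sqoop import` (illustrative only)."""
    if mode not in ("append", "lastmodified"):
        raise ValueError("mode must be 'append' or 'lastmodified'")
    args = [
        "sqoop", "import",
        "--connect", "jdbc:mysql://db.example.com/sales",  # hypothetical database
        "--table", "orders",                               # hypothetical table
        "--target-dir", "/data/orders",                    # hypothetical HDFS path
        "--incremental", mode,
        "--check-column", check_column,
        "--last-value", str(last_value),
    ]
    # With lastmodified, a merge key lets Sqoop reconcile updated rows
    # with the copies that are already in HDFS.
    if mode == "lastmodified" and merge_key:
        args += ["--merge-key", merge_key]
    return args


# a: append mode, keyed on a growing id column
append_cmd = incremental_import_args("append", "order_id", 1000)

# b: lastmodified mode, keyed on a timestamp column
lastmod_cmd = incremental_import_args(
    "lastmodified", "updated_at", "2024-01-01 00:00:00", merge_key="order_id"
)

print(shlex.join(append_cmd))
print(shlex.join(lastmod_cmd))
```

In practice the last value is usually not tracked by hand: a saved Sqoop job (`sqoop job --create ...`) records it after each run, which is generally the more convenient way to repeat incremental imports.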