1. Connect to the MySQL database and check the content of the tables.
2. Copy the "retail_db.categories" table to HDFS without specifying a directory name.
3. Copy the "retail_db.categories" table to HDFS, into a directory named "categories_target".
4. Copy the "retail_db.categories" table to HDFS, under a warehouse directory named "categories_warehouse".
Solution :
Step 1 : Connect to the existing MySQL database
mysql --user=root --password=root retail_db
Step 2 : Show all the available tables
show tables;
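The same listing can also be obtained through Sqoop itself, which doubles as a check that the JDBC connection string used in the later steps works (a quick sketch; host and credentials assumed to match the quickstart VM):
sqoop list-tables --connect jdbc:mysql://quickstart:3306/retail_db --username=root --password=root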
Step 3 : View/Count data from a table in MySQL
select count(1) from categories;
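To eyeball a few rows in addition to the count, a simple LIMIT query is enough (the limit of 5 is arbitrary):
select * from categories limit 5;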
Step 4 : Check the data currently available in your HDFS home directory
hdfs dfs -ls
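With no path argument, hdfs dfs -ls lists your HDFS home directory; the explicit equivalent below assumes the cloudera user of the quickstart VM:
hdfs dfs -ls /user/cloudera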
Step 5 : Import a single table (without specifying a target directory).
sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=root --password=root --table=categories
Note : Make sure there is no space before or after the '=' sign.
Sqoop uses the MapReduce framework to copy data from the RDBMS to HDFS.
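Since no mapper count was given, Sqoop falls back to its default of 4 mappers, so the import should produce up to four part files (part-m-00000 through part-m-00003) in the categories directory. You can confirm with a listing:
hdfs dfs -ls categories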
Step 6 : Read the data from one of the partition files created by the above command.
hdfs dfs -cat categories/part-m-00000
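To dump all the partition files at once instead of one at a time, a glob also works with -cat:
hdfs dfs -cat categories/part-m-*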
Step 7 : Specify a target directory in the import command (we are using 1 mapper here; you can change it accordingly).
sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=root --password=root --table=categories --target-dir=categories_target -m 1
Step 8 : Check the content of one of the partition files.
hdfs dfs -cat categories_target/part-m-00000
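A quick sanity check is to compare the number of imported lines against the MySQL count from Step 3 (with -m 1 there is only a single part file, but the glob keeps the check general):
hdfs dfs -cat categories_target/part-m-* | wc -l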
Step 9 : Specify a warehouse (parent) directory so that more than one table can be imported under the same parent directory. Command to specify the warehouse directory:
sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=root --password=root --table=categories --warehouse-dir=categories_warehouse -m 1
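To see why a warehouse directory is useful, a second table can be imported under the same parent; the sketch below assumes retail_db also contains a customers table, as in the standard quickstart dataset:
sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=root --password=root --table=customers --warehouse-dir=categories_warehouse -m 1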
Step 10 : See the content of one of the partition files.
hdfs dfs -cat categories_warehouse/categories/part-m-00000
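Listing the warehouse directory shows one subdirectory per imported table, which is the key difference from --target-dir:
hdfs dfs -ls categories_warehouse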