SQOOP

INTERVIEW QUESTIONS

● How will you get data from RDBMS into HDFS?

● Can we store MySQL table data as a sequence file in HDFS via Sqoop?

● Does Sqoop support compression techniques for storing data in HDFS?

● Can we load all the tables in a database into HDFS in a single shot?

● Can we copy a subset of data from an RDBMS table into HDFS (based on some criteria)?

● How many reduce tasks run by default for a Sqoop import command? How many mappers?

● If we get a Java heap space error and we have already given the maximum memory, what is the possible solution?

● What is the default port for connecting to MySQL server?

● How can we resolve a Communications Link Failure when connecting to MySQL?

● How do we verify that we can connect to the database from the node where we are running Sqoop?

● Can we provide SQL queries in the Sqoop import command?

● What is the use of --fetch-size in Sqoop?

● What is the best file format for Sqoop imports?

● What is the best compression technique for Sqoop?

● How do we uncompress files in HDFS?

● How do we exclude columns from a table when importing?

● How do we load two tables at a time?

● What is the difference between --enclosed-by and --optionally-enclosed-by?

● What is the use of --escaped-by in Sqoop?

● How do we merge two imported part files in HDFS?

● What is Sqoop merge?

● What is Sqoop codegen?

● Mention the best features of Apache Sqoop.

● What is Sqoop Import? Explain its purpose.

● What is the default file format to import data using Apache Sqoop?

● How can I import large objects (BLOB and CLOB objects) in Apache Sqoop?

● How can you execute a free-form SQL query in Sqoop to import the rows in a sequential manner?

● Does Apache Sqoop have a default database?

● How will you list all the columns of a table using Apache Sqoop?

● If the source data gets updated every now and then, how will you synchronize the data in HDFS that is imported by Sqoop?

● Name a few import control commands. How can Sqoop handle large objects?

● How can we import data from a particular row or column? What are the destination types allowed in the Sqoop import command?

● When to use --target-dir and when to use --warehouse-dir while importing data?

● What is the process to perform an incremental data load in Sqoop?

● What is the significance of using the --compression-codec parameter?

● Can free-form SQL queries be used with Sqoop import command? If yes, then how can they be used?

● What is the importance of the eval tool?

● How can you import only a subset of rows from a table?

● What are the limitations of importing RDBMS tables into HCatalog directly?

● What is the advantage of using --password-file rather than the -P option to keep the password out of the Sqoop import statement?

● What do you mean by Free Form Import in Sqoop?

● What is the role of the JDBC driver in Sqoop?

● Is the JDBC driver enough to connect Sqoop to databases?

● What is InputSplit in Hadoop?

● What does the export tool do in Sqoop?

● What is the use of the codegen command in Sqoop?

● What is the use of the help command in Sqoop?

● How can you schedule a Sqoop job using Oozie?

● What is the importance of the --split-by clause in running parallel import tasks in Sqoop?

● What is a Sqoop metastore?

● What is the purpose of sqoop-merge?

● How can you see the list of stored jobs in the Sqoop metastore?

● Which database does the Sqoop metastore run on?

● Where can the metastore database be hosted?

● Give the Sqoop command to see the contents of the job named myjob.

● How can you control the mapping between SQL data types and Java types?

● Is it possible to add a parameter while running a saved job?

● What is the use of the options file in Sqoop?

● How can you avoid importing tables one-by-one when importing a large number of tables from a database?

● How can you control the number of mappers used by the Sqoop command?

● What is the default extension of the files produced by a Sqoop import?

● What is a disadvantage of using the --direct parameter for faster data loads in Sqoop?

● How will you update the rows that are already exported?

● What are the basic commands in Apache Sqoop and their uses?

● Where does the name Sqoop come from? What type of tool is Sqoop, and what is its main use?

● What is Sqoop validation?

● What is the purpose of validation in Sqoop?

● What is a Sqoop job?

● What is the Sqoop import-mainframe tool and its purpose?

● What is the purpose of the Sqoop list-tables tool?

● What is the difference between Apache Sqoop and Flume?

● I am getting a connection failure exception while connecting to MySQL through Sqoop. What is the root cause, and how do I fix it?

● I am getting java.lang.IllegalArgumentException while importing tables from an Oracle database. What might be the root cause, and how do I fix it?

● Is Sqoop the same as distcp in Hadoop?

● For each Sqoop copy into HDFS, how many MapReduce jobs and tasks are submitted?

● How can Sqoop be used in Java programs?

● The incoming value from HDFS for a particular column is NULL. How will you load

● List all the basic commands in Apache Sqoop along with their applications.

● How can we import a subset of rows from a table without using the where clause?

● What are the common delimiters and escape characters in Sqoop?

● When loading tables from MySQL into HDFS, what can we do to copy them at the maximum possible speed?

● How can you sync an exported table with HDFS data in which some rows are deleted?

● How do you clear the data in a staging table before loading it with Sqoop?

● You have data in HDFS. If you put more data into the same table, will it be appended or will it overwrite the existing data?

● What is the difference between the parameters sqoop.export.records.per.statement and sqoop.export.statements.per.transaction?

● How can we import only the updated rows from a table into HDFS using Sqoop, assuming the source has a last-update timestamp for each row?

● By using the lastmodified mode of incremental import. Rows whose check column holds a timestamp more recent than the one specified with --last-value are imported.

● Write the Sqoop import command.

● Why are reducers not used in a Sqoop command?

● What is the default number of mappers in Sqoop?

● Sqoop basics: what is the use of -m and of $CONDITIONS?
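
The lastmodified incremental import mentioned in the answer above can be sketched as follows. This is a minimal illustration rather than a definitive command: the connection string, user, table, column names, paths and timestamp are all hypothetical placeholders, and the Python helper (build_lastmodified_import, a name invented here) only assembles the argument list for the sqoop CLI so that it can be inspected before being run.

```python
import shlex


def build_lastmodified_import(connect, username, password_file, table,
                              check_column, last_value, merge_key, target_dir):
    """Assemble the argv for a Sqoop incremental import in lastmodified mode."""
    return [
        "sqoop", "import",
        "--connect", connect,
        "--username", username,
        "--password-file", password_file,   # keeps the password off the command line
        "--table", table,
        "--incremental", "lastmodified",    # import rows changed since --last-value
        "--check-column", check_column,     # timestamp column examined by Sqoop
        "--last-value", last_value,
        "--merge-key", merge_key,           # updated rows replace older copies by key
        "--target-dir", target_dir,
    ]


# Every value below is a hypothetical placeholder.
args = build_lastmodified_import(
    connect="jdbc:mysql://dbhost:3306/sales",
    username="etl_user",
    password_file="/user/etl/.mysql.password",
    table="orders",
    check_column="last_update",
    last_value="2024-01-01 00:00:00",
    merge_key="order_id",
    target_dir="/data/sales/orders",
)

# Print a shell-safe version of the command for review.
print(shlex.join(args))
```

On a node with Sqoop installed, the list could be handed to subprocess.run(args, check=True). Using --merge-key asks Sqoop to reconcile updated rows with those already in the target directory; --append is the alternative when older copies should simply be kept alongside the new rows.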