{"id":79615,"date":"2020-08-24T09:00:21","date_gmt":"2020-08-24T03:30:21","guid":{"rendered":"https:\/\/techvidvan.com\/tutorials\/?p=79615"},"modified":"2020-08-24T09:00:21","modified_gmt":"2020-08-24T03:30:21","slug":"apache-sqoop-import","status":"publish","type":"post","link":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/","title":{"rendered":"Sqoop Import Queries with Examples"},"content":{"rendered":"<p>We all know that for transferring data from RDBMS to HDFS or vice-versa, we use Apache Sqoop. In this Sqoop Import article, we will discuss the Sqoop Import tool used for importing tables from the RDBMS to the HDFS.<\/p>\n<p>In this article, you will explore how to import tables to HDFS, Hive, HBase, and Accumulo. You will also learn the syntax as well as the different arguments.<\/p>\n<p>Moreover, you will study the purpose of Sqoop Import as well as examples of the Sqoop import query to understand it well.<\/p>\n<p>Let us first explore what Sqoop Import is.<\/p>\n<p>&nbsp;<\/p>\n<h3>Introduction to Sqoop Import<\/h3>\n<p>The Sqoop import is a tool that imports an individual table from the relational database to the Hadoop Distributed File System. 
Each row of the table being imported is represented as a separate record in HDFS.<\/p>\n<p>In HDFS, these records can be stored either as text files (one record per line) or in a binary representation as Avro data files or SequenceFiles.<\/p>\n<h4>Sqoop Import Syntax<\/h4>\n<p>The syntax for the Sqoop import command is:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import (generic-args) (import-args)<\/pre>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop-import (generic-args) (import-args)<\/pre>\n<p>We can pass import arguments in any order with respect to each other, but the Hadoop generic arguments must precede the import arguments.<\/p>\n<p>Let us first see the common arguments.<\/p>\n<table>\n<tbody>\n<tr>\n<td><b>Argument<\/b><\/td>\n<td><b>Description<\/b><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;connect &lt;jdbc-uri&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It specifies the JDBC connect string<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;connection-manager &lt;class-name&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It specifies the connection manager class to use<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;driver &lt;class-name&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Manually specify the JDBC driver class to use<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;hadoop-mapred-home &lt;dir&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Override $HADOOP_MAPRED_HOME<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;help<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will print usage instructions<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;password-file<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will set the path for a file containing authentication 
password<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">-P<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will read the password from the console<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;password &lt;password&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will set the authentication password<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;username &lt;username&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will set the authentication username<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;verbose<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will print more information while working<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;connection-param-file &lt;filename&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Specify the Optional properties file who provides connection parameters<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;relaxed-isolation<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will set the connection transaction isolation to read the uncommitted for the mappers.<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>These are the common arguments.<\/p>\n<p>Now the article enlists the syntax and arguments for the various steps for importing data.<\/p>\n<h3>Connecting to the Database Server<\/h3>\n<p>Apache Sqoop is basically designed for importing tables from the database into the HDFS. For doing so, we have to specify the connect string, which describes how to connect to the relational database.<\/p>\n<p>This connect string should be similar to the URL and is communicated to Apache Sqoop via the &#8211;connect argument. 
This argument specifies the server, the port, and the database to connect to.<\/p>\n<p><strong>For example:<\/strong><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import --connect jdbc:mysql:\/\/localhost\/demo_db\n<\/pre>\n<p>The above example will connect to the MySQL database named demo_db on the localhost.<\/p>\n<p>Sometimes we also have to authenticate against the relational database before accessing it. We can use the argument<strong> &#8211;username<\/strong> for supplying a username to the database. Sqoop provides several ways of supplying a password, in both secure and non-secure modes.<\/p>\n<p>Generally, we use the <strong>-P<\/strong> argument, which reads the password from the console.<\/p>\n<p>The validation arguments are:<\/p>\n<table>\n<tbody>\n<tr>\n<td><b>Argument<\/b><\/td>\n<td><b>Description<\/b><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;validate<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will enable the validation of the data copied. It supports a single table copy only.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;validator &lt;class-name&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will specify the validator class to use.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;validation-threshold &lt;class-name&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will specify the validation threshold class to use.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;validation-failurehandler &lt;class-name&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will specify the validation failure handler class to use.<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3>Selecting the Data to Import<\/h3>\n<p><strong>1.<\/strong> Apache Sqoop imports data in a table-centric fashion. 
We can use the argument &#8211;table for selecting the table to be imported.<\/p>\n<p><strong>For example,<\/strong><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">--table emp_info<\/pre>\n<p>The <strong>&#8211;table<\/strong> argument can also identify a VIEW or another table-like entity in the database.<br \/>\nWith this argument, by default, all the columns in the table are selected for import. The imported data is written to HDFS in its &#8220;natural order&#8221;.<\/p>\n<p>For example, a table containing the columns A, B, and C will result in an import of the data such as:<br \/>\nA1,B1,C1<br \/>\nA2,B2,C2<br \/>\n\u2026<\/p>\n<p><strong>2.<\/strong> In Sqoop, we can also select a subset of columns and control their ordering by using the<strong> &#8211;columns<\/strong> argument. This argument takes a comma-delimited list of all the columns to be imported.<\/p>\n<p><strong>For example:<\/strong><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">--columns \"emp_name,emp_id,emp_jobtitle\"<\/pre>\n<p><strong>3.<\/strong> We can also control the rows to be imported by adding a SQL WHERE clause to the import statement. By default, Sqoop generates statements of the form SELECT &lt;column list&gt; FROM &lt;table name&gt;. 
We can append the WHERE clause to this statement with the argument &#8211;where.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">--where \"emp_id &gt; 400\"<\/pre>\n<p>Only those rows whose emp_id column has a value greater than 400 will be imported.<\/p>\n<p>The other Sqoop import control arguments are:<\/p>\n<table>\n<tbody>\n<tr>\n<td><b>Argument<\/b><\/td>\n<td><b>Description<\/b><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;append<\/span><\/td>\n<td><span style=\"font-weight: 400\">Append data to the existing dataset in the HDFS<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;as-avrodatafile<\/span><\/td>\n<td><span style=\"font-weight: 400\">Imports the data to the Avro Data Files<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;as-sequencefile<\/span><\/td>\n<td><span style=\"font-weight: 400\">Imports the data to the SequenceFiles<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;as-textfile<\/span><\/td>\n<td><span style=\"font-weight: 400\">Imports the data as the plain text (default)<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;as-parquetfile<\/span><\/td>\n<td><span style=\"font-weight: 400\">Imports the data to the Parquet Files<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;boundary-query &lt;statement&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Specify the boundary query to use for creating splits<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;columns &lt;col,col,col\u2026&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Specify the columns to be imported from the table<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;delete-target-dir<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will delete the import target directory if it exists<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 
400\">&#8211;direct<\/span><\/td>\n<td><span style=\"font-weight: 400\">Use the direct connector if exists for a database<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;fetch-size &lt;n&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Specify the number of entries to be read from the database at once.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;inline-lob-limit &lt;n&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will set the maximum size for the inline LOB<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">-m,&#8211;num-mappers &lt;n&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Specify the number of mapper. Use <\/span><i><span style=\"font-weight: 400\">n<\/span><\/i><span style=\"font-weight: 400\"> map tasks for importing in parallel<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">-e,&#8211;query &lt;statement&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Import the results of the statement.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;split-by &lt;column-name&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Specify the table column to be used to split the work units. We cannot use it with\u00a0 &#8211;autoreset-to-one-mapper option.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;autoreset-to-one-mapper<\/span><\/td>\n<td><span style=\"font-weight: 400\">It specifies that import should use only one mapper if the table does not have a primary key and no split-by column is provided. 
This option cannot be used with the &#8211;split-by &lt;col&gt; option.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;table &lt;table-name&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Table to read<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;target-dir &lt;dir&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">HDFS destination dir<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;warehouse-dir &lt;dir&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">HDFS parent for the table destination<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;where &lt;where clause&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Specify the WHERE clause to be used during import<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">-z,&#8211;compress<\/span><\/td>\n<td><span style=\"font-weight: 400\">Enable compression<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;compression-codec &lt;c&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Use Hadoop codec (default gzip)<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;null-string &lt;null-string&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">The string to be written for the null value for the string columns<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;null-non-string &lt;null-string&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">The string to be written for the null value for the non-string columns<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The arguments <strong>&#8211;null-string<\/strong> and the <strong>&#8211;null-non-string<\/strong> are optional. If they are not specified, then string &#8220;null&#8221; will be used.<\/p>\n<h3>Free-form Query Imports<\/h3>\n<p>Apache Sqoop can import the result set of the arbitrary SQL query. 
Rather than using the arguments <strong>&#8211;table<\/strong>, <strong>&#8211;columns<\/strong> and <strong>&#8211;where<\/strong>, we can use the <strong>&#8211;query<\/strong> argument for specifying a SQL statement.<\/p>\n<p><strong><em>Note:<\/em><\/strong> While importing via a free-form query, we have to specify the destination directory with the &#8211;target-dir argument. If the query is to be imported in parallel, it must also include the token $CONDITIONS in its WHERE clause, and a splitting column must be given with &#8211;split-by.<\/p>\n<h3>Controlling Parallelism in Sqoop<\/h3>\n<p><strong>1.<\/strong> Apache Sqoop can import data in parallel from most database sources. We can specify the number of map tasks (parallel processes) to be used for performing the import with the argument &#8211;num-mappers or -m.<\/p>\n<p>Both forms take an integer value that corresponds to the degree of parallelism to employ.<\/p>\n<p><strong>2.<\/strong> Four tasks are used by default. With some databases, increasing this value to 8 or 16 may improve performance.<\/p>\n<p><em><strong>Note:<\/strong><\/em> The degree of parallelism should not be increased beyond what is available within our MapReduce cluster: the extra tasks would run serially, which will likely increase the import time. Likewise, it should not be increased beyond what our database can reasonably support, because too many concurrent connections may load the database server to the point where performance suffers.<\/p>\n<p><strong>3.<\/strong> For performing parallel imports, Apache Sqoop requires a criterion by which it can split the workload. For this, it uses a <em>splitting column<\/em>. 
By default, Sqoop identifies the primary key column of the table (if present) and uses it as the splitting column.<\/p>\n<p>The low and high values of the splitting column are fetched from the database, and the map tasks operate on evenly-sized components of the total range.<\/p>\n<p><strong>For example<\/strong>, suppose we have a table with the primary key column emp_id whose minimum value is 0 and maximum value is 1000, and Sqoop is directed to use 4 map tasks.<\/p>\n<p>Sqoop will run four processes, each of which executes a SQL statement of the form SELECT * FROM emp_info WHERE emp_id &gt;= lo AND emp_id &lt; hi, with (lo, hi) set to (0, 250), (250, 500), (500, 750), and (750, 1001) in the different tasks.<\/p>\n<p><strong>4.<\/strong> We can explicitly choose a different splitting column via the argument &#8211;split-by.<\/p>\n<p><strong>For example,<\/strong><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">--split-by emp_id<\/pre>\n<p><strong>5.<\/strong> At present, Sqoop cannot split on multi-column indices. If the table has no index column or has a multi-column key, then we have to choose the splitting column manually.<\/p>\n<h3>Controlling Distributed Cache in Sqoop<\/h3>\n<p>Apache Sqoop copies the jars in the <strong>$SQOOP_HOME\/lib<\/strong> folder to the job cache every time a Sqoop job starts. When Sqoop is launched by Oozie, this is unnecessary because Oozie uses its own Sqoop share lib, which keeps the Sqoop dependencies in the distributed cache.<\/p>\n<p>Passing the argument <strong>&#8211;skip-dist-cache<\/strong> in the Sqoop command when it is launched by Oozie skips the step in which Sqoop copies its dependencies to the job cache, which saves a large amount of I\/O.<\/p>\n<h3>Controlling Import Process in Sqoop<\/h3>\n<p>By default, the Sqoop import process uses JDBC, which provides a reasonable cross-vendor import channel. 
However, some databases can perform imports in a higher-performance manner by using database-specific data movement tools.<\/p>\n<p>For example, MySQL provides the mysqldump tool, which can export data from MySQL to other systems very quickly.<\/p>\n<p>We can pass the <strong>&#8211;direct<\/strong> argument to specify that Sqoop should attempt the direct import channel. The direct import channel may offer higher performance than JDBC.<\/p>\n<h3>Controlling Type Mapping in Sqoop<\/h3>\n<p>Apache Sqoop is preconfigured to map most SQL types to appropriate Java or Hive representations.<\/p>\n<p>The default mapping may not be suitable for every case and can be overridden via the <strong>&#8211;map-column-java<\/strong> argument (for changing the Java mapping) or via <strong>&#8211;map-column-hive<\/strong> (for changing the Hive mapping).<\/p>\n<table>\n<tbody>\n<tr>\n<td><span style=\"font-weight: 400\">Argument<\/span><\/td>\n<td><span style=\"font-weight: 400\">Description<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;map-column-java &lt;mapping&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Overrides the mapping from SQL to Java type for the configured columns.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;map-column-hive &lt;mapping&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Overrides the mapping from SQL to Hive type for the configured columns.<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Sqoop expects a comma-separated list of mappings in the form &lt;name of column&gt;=&lt;new type&gt;.<\/p>\n<p><strong>For example:<\/strong><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import ... 
--map-column-java emp_id=String,value=Integer<\/pre>\n<p>Let us now talk about Incremental Import.<\/p>\n<h3>Incremental Imports in Sqoop<\/h3>\n<p>Sqoop provides the facility of incremental import mode, which retrieves only those rows which are newer than the previously-imported set of rows.<br \/>\nThe following arguments control incremental imports:<\/p>\n<table>\n<tbody>\n<tr>\n<td><b>Argument<\/b><\/td>\n<td><b>Description<\/b><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;check-column (col)<\/span><\/td>\n<td><span style=\"font-weight: 400\">It specifies the columns which are to be examined while determining which rows to import. Note that the column must not be of the type CHAR\/NCHAR\/VARCHAR\/VARNCHAR\/LONGVARCHAR\/LONGNVARCHAR.\u00a0<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;incremental (mode)<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will Specify how Sqoop determines which rows are new. The legal values for the mode includes append and lastmodified.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;last-value (value)<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will specify the maximum value of the check column from the previous import.<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Apache Sqoop supports 2 types of incremental imports. They are append and <strong>lastmodified<\/strong>. For specifying the type of incremental import to be performed, we have to use <strong>&#8211;incremental<\/strong> argument.<\/p>\n<p>We can specify the append mode when we are importing a table in which the new rows were added continually with the increasing row id values.<\/p>\n<p>We can specify the lastmodified mode when the rows of the table may be updated, and this update sets the value of the last-modified column to the current timestamp.<\/p>\n<h3>Sqoop File Formats<\/h3>\n<p>We can import the data in any of the two file formats. 
These two file formats are <strong>delimited text<\/strong> and <strong>SequenceFiles.<\/strong><\/p>\n<p><strong>1. Delimited text:<\/strong> It is the default import format. We can also specify it explicitly via the argument &#8211;as-textfile.<br \/>\n<strong>2. SequenceFiles:<\/strong> This is a binary format which stores individual records in custom record-specific data types.<\/p>\n<h3>Large Objects<\/h3>\n<p>Apache Sqoop handles large objects such as BLOB and CLOB columns in a particular way. Unlike the other columns, large columns are not fully materialized in memory for manipulation. Instead, large object data is handled in a streaming manner.<\/p>\n<p>Large objects can be stored inline with the rest of the data, in which case they are fully materialized in memory on every access, or they can be stored in a secondary storage file that is linked to the primary data storage.<\/p>\n<p>In Sqoop, by default, large objects which are less than 16 MB in size are stored inline with the rest of the data. This limit can be changed with the &#8211;inline-lob-limit argument.<\/p>\n<p>The output line formatting arguments are:<\/p>\n<table>\n<tbody>\n<tr>\n<td><b>Argument<\/b><\/td>\n<td><b>Description<\/b><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;enclosed-by &lt;char&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will set the required field enclosing character<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;escaped-by &lt;char&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will set the escape character<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;fields-terminated-by &lt;char&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will set the field separator character<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;lines-terminated-by &lt;char&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will set the end-of-line character<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 
400\">&#8211;mysql-delimiters<\/span><\/td>\n<td><span style=\"font-weight: 400\">Uses the MySQL\u2019s default delimiter set: fields: , lines: \\n escaped-by: \\ optionally-enclosed-by: &#8216;<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;optionally-enclosed-by &lt;char&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will set the field enclosing character<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The Input parsing arguments are:<\/p>\n<table>\n<tbody>\n<tr>\n<td><span style=\"font-weight: 400\">Argument<\/span><\/td>\n<td><span style=\"font-weight: 400\">Description<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;input-enclosed-by &lt;char&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Sets a required field encloser<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;input-escaped-by &lt;char&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Sets the input escape character<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;input-fields-terminated-by &lt;char&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Sets the input field separator<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;input-lines-terminated-by &lt;char&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Sets the input end-of-line character<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;input-optionally-enclosed-by &lt;char&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Sets a field enclosing character<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3>Importing Data Into Hive<\/h3>\n<p>Sqoop is used mainly for uploading table data into HDFS. But if we have a Hive metastore associated with our HDFS cluster, then also we can use Apache Sqoop.<\/p>\n<p>Sqoop imports the data into the Hive by generating and executing the CREATE TABLE statement for defining data\u2019s layout in the Hive. 
We can import data into Hive just by adding the option <strong>&#8211;hive-import<\/strong> to our Sqoop command line.<\/p>\n<p>If the Hive table already exists, we can specify the <strong>&#8211;hive-overwrite<\/strong> option to indicate that the existing table in Hive should be replaced.<\/p>\n<p>After the data is imported into HDFS, Sqoop generates a Hive script that contains a CREATE TABLE operation defining our columns using Hive\u2019s types, and a LOAD DATA INPATH statement for moving the data files into Hive\u2019s warehouse directory.<\/p>\n<p>This script is executed by calling an installed copy of Hive on the machine where Sqoop is run.<\/p>\n<p>If we have multiple Hive installations, or hive is not in our $PATH, then we use the option <strong>&#8211;hive-home<\/strong> to identify the Hive installation directory. Apache Sqoop will use <strong>$HIVE_HOME\/bin\/hive<\/strong> from here.<\/p>\n<p>We can import data for Hive into a particular partition by specifying the <strong>&#8211;hive-partition-key<\/strong> and <strong>&#8211;hive-partition-value<\/strong> arguments.<\/p>\n<p>By using the <strong>&#8211;compress<\/strong> and <strong>&#8211;compression-codec<\/strong> options, we can import compressed tables into Hive.<\/p>\n<p>The Hive arguments are:<\/p>\n<table>\n<tbody>\n<tr>\n<td><b>Argument<\/b><\/td>\n<td><b>Description<\/b><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;hive-home &lt;dir&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Override $HIVE_HOME<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;hive-import<\/span><\/td>\n<td><span style=\"font-weight: 400\">Import tables into Hive. 
It uses the Hive\u2019s default delimiters if none are set.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;hive-overwrite<\/span><\/td>\n<td><span style=\"font-weight: 400\">Overwrite the existing data in the Hive table.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;create-hive-table<\/span><\/td>\n<td><span style=\"font-weight: 400\">If we set this option, then the Sqoop job will fail if the target Hive table exists. This property, by default, is set to false.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;hive-table &lt;table-name&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will set the table name to use while importing to Hive.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;hive-drop-import-delims<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will drop the <\/span><i><span style=\"font-weight: 400\">\\n<\/span><\/i><span style=\"font-weight: 400\">, <\/span><i><span style=\"font-weight: 400\">\\r<\/span><\/i><span style=\"font-weight: 400\">, and <\/span><i><span style=\"font-weight: 400\">\\01<\/span><\/i><span style=\"font-weight: 400\"> from the string fields while importing to Hive.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;hive-delims-replacement<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will replace the <\/span><i><span style=\"font-weight: 400\">\\n<\/span><\/i><span style=\"font-weight: 400\">, <\/span><i><span style=\"font-weight: 400\">\\r<\/span><\/i><span style=\"font-weight: 400\">, and <\/span><i><span style=\"font-weight: 400\">\\01<\/span><\/i><span style=\"font-weight: 400\"> in the string fields with the user-defined string while importing to Hive.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;hive-partition-key<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will specify the name of the Hive field that the partitions are sharded 
on<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;hive-partition-value &lt;v&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will specify the String-value which serves as a partition key for this imported into the hive in this job.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;map-column-hive &lt;map&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Override the default mapping from the SQL type to the Hive type for the configured columns.<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3>Importing Data Into HBase using Sqoop<\/h3>\n<p>Apache Sqoop can import the records into the table in HBase as well. For importing a table to HBase instead of any directory in HDFS, we have to specify the &#8211;hbase-table option in the Sqoop command.<\/p>\n<p>Apache Sqoop will import the data to a table specified as an argument to the <strong>&#8211;hbase-table<\/strong> option. Each row of an input table is transformed into the HBase Put operation to the row of the output table. For each row, the key is taken from the column of the input.<\/p>\n<p>Sqoop, by default, uses a split-by column as a row key column.<\/p>\n<p>If the split-by column is not specified, then it will try to find the primary key column. We can also manually specify the row key column with the <strong>&#8211;hbase-row-key<\/strong>. Every output column is placed in the same column family specified with <strong>&#8211;column-family<\/strong>.<\/p>\n<p>While importing data into HBase, if the target table and the column family don\u2019t exist, then the Sqoop job exits with an error. 
So if you are importing using the <strong>&#8211;hbase-table<\/strong> option, then you have to create the target table and the column family before running the import.<\/p>\n<p>If we specify the option <strong>&#8211;hbase-create-table<\/strong>, then Sqoop will itself create the target table and the column family if they don\u2019t exist.<\/p>\n<p>The HBase arguments are:<\/p>\n<table>\n<tbody>\n<tr>\n<td><b>Argument<\/b><\/td>\n<td><b>Description<\/b><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;column-family &lt;family&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will set a target column family for the import<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;hbase-create-table<\/span><\/td>\n<td><span style=\"font-weight: 400\">If it is specified, then it will create missing HBase tables<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;hbase-row-key &lt;col&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will specify which input column is to be used as the row key. If the input table contains a composite key, then &lt;col&gt; must be a comma-separated list of the composite key attributes<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;hbase-table &lt;table-name&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will specify an HBase table to be used as the target instead of HDFS<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;hbase-bulkload<\/span><\/td>\n<td><span style=\"font-weight: 400\">Enables bulk loading<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3>Importing Data Into Accumulo<\/h3>\n<p>Apache Sqoop also provides support for importing records into a table in Accumulo. It can be done by specifying the<strong> &#8211;accumulo-table<\/strong> option. 
Apache Sqoop imports the data to the table specified as the argument to &#8211;accumulo-table.<\/p>\n<p>Each row of the input table is transformed into an Accumulo Mutation operation on a row of the output table. For each row, the key is taken from a column of the input.<\/p>\n<p>Sqoop, by default, uses the split-by column as the row key column. If the split-by column is not specified, then it tries to find the primary key column. We can also manually specify the row key column via <strong>&#8211;accumulo-row-key<\/strong>.<\/p>\n<p>Each output column is placed in the same column family, which is specified with <strong>&#8211;accumulo-column-family<\/strong>.<br \/>\nSpecify the &#8211;accumulo-create-table parameter if you have not created the target table.<\/p>\n<p>The Accumulo arguments are:<\/p>\n<table>\n<tbody>\n<tr>\n<td><b>Argument<\/b><\/td>\n<td><b>Description<\/b><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;accumulo-table &lt;table-name&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will specify the Accumulo table to be used as the target instead of the HDFS<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;accumulo-column-family &lt;family&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will set the target column family for the import<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;accumulo-create-table<\/span><\/td>\n<td><span style=\"font-weight: 400\">If it is specified, then it will create the missing Accumulo tables<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;accumulo-row-key &lt;col&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It will specify which input column is to be used as the row key<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;accumulo-visibility &lt;vis&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">It is optional. It will specify the visibility token to be applied to all the rows 
inserted into Accumulo. The default value is the empty string.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;accumulo-batch-size &lt;size&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Optional. Sets the size of Accumulo\u2019s write buffer in bytes. The default is 4MB.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;accumulo-max-latency &lt;ms&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Optional. Sets the maximum latency in milliseconds for the Accumulo batch writer. The default value is 0.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;accumulo-zookeepers &lt;host:port&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Specifies a comma-separated list of the ZooKeeper servers used by the Accumulo instance<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;accumulo-instance &lt;table-name&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Specifies the name of the target Accumulo instance<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;accumulo-user &lt;username&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Specifies the name of the Accumulo user to import as<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">&#8211;accumulo-password &lt;password&gt;<\/span><\/td>\n<td><span style=\"font-weight: 400\">Specifies the password for the Accumulo user<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3>Additional Import Configuration Properties<\/h3>\n<p>There are some additional properties which we can configure by modifying the <strong>conf\/sqoop-site.xml<\/strong> file. 
We can specify the properties in the same manner as we do in Hadoop configuration files.<\/p>\n<p><strong>For example:<\/strong><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">&lt;property&gt;\n    &lt;name&gt;property.name&lt;\/name&gt;\n    &lt;value&gt;property.value&lt;\/value&gt;\n  &lt;\/property&gt;\n<\/pre>\n<p>We can also specify them on the command line in the generic arguments.<br \/>\n<strong>For example:<\/strong><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">sqoop import -D property.name=property.value ...<\/pre>\n<p>The additional import configuration properties are:<\/p>\n<table>\n<tbody>\n<tr>\n<td><b>Argument<\/b><\/td>\n<td><b>Description<\/b><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">sqoop.bigdecimal.format.string<\/span><\/td>\n<td><span style=\"font-weight: 400\">Controls how BigDecimal columns are formatted when they are stored as a String. The default value is true, which uses toPlainString to store them without an exponent component (0.0000001). If set to false, toString is used, which may include an exponent (1E-7).<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">sqoop.hbase.add.row.key<\/span><\/td>\n<td><span style=\"font-weight: 400\">It is set to false by default, which means that Sqoop will not add the column used as a row key to the row data in HBase. 
If we set this property to true, then the column used as a row key will be added to the row data in HBase.<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3>Sqoop Import Example Invocations<\/h3>\n<p>The examples below illustrate how we can use the Sqoop import tool in a variety of situations.<\/p>\n<p><strong>1:<\/strong> In this example, we import a table named emp_info from the demo_db database:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import --connect jdbc:mysql:\/\/localhost\/demo_db --table emp_info<\/pre>\n<p>A basic import that requires a login can be done as:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import --connect jdbc:mysql:\/\/localhost\/demo_db --table emp_info \\\n    --username SomeUser -P\nEnter password: (hidden)\n<\/pre>\n<p><strong>2:<\/strong> In this example, we import specific columns from the emp_info table:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import --connect jdbc:mysql:\/\/localhost\/demo_db --table emp_info \\\n    --columns \"emp_id,emp_name,emp_jobtitle\"\n<\/pre>\n<p><strong>3:<\/strong> Controlling the import parallelism by using 8 parallel tasks:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import --connect jdbc:mysql:\/\/localhost\/demo_db --table emp_info \\\n    -m 8\n<\/pre>\n<p><strong>4:<\/strong> Storing data in SequenceFiles, and setting the generated class name to example.Emp:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import --connect jdbc:mysql:\/\/localhost\/demo_db --table emp_info \\\n    --class-name example.Emp --as-sequencefile\n<\/pre>\n<p><strong>5:<\/strong> Specifying the delimiters to be used in a text-mode import:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import --connect jdbc:mysql:\/\/localhost\/demo_db --table emp_info \\\n    
--fields-terminated-by '\\t' --lines-terminated-by '\\n' \\\n    --optionally-enclosed-by '\\\"'\n<\/pre>\n<p><strong>6:<\/strong> Importing the data into Hive:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import --connect jdbc:mysql:\/\/localhost\/demo_db --table emp_info \\\n    --hive-import\n<\/pre>\n<p><strong>7:<\/strong> Importing only the new employees:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import --connect jdbc:mysql:\/\/localhost\/demo_db --table emp_info \\\n    --where \"start_date &gt; '2020-06-11'\"\n<\/pre>\n<p><strong>8:<\/strong> Changing the splitting column from the default:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import --connect jdbc:mysql:\/\/localhost\/demo_db --table emp_info \\\n    --split-by dept_id\n<\/pre>\n<p><strong>9:<\/strong> We can verify that an import was successful by using:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ hadoop fs -ls emp_info<\/pre>\n<p><strong>10:<\/strong> In the example below, we perform an incremental import of the new data. 
This is after importing the first 1000 rows of the table:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import --connect jdbc:mysql:\/\/localhost\/demo_db --table emp_info \\\n    --where \"id &gt; 1000\" --target-dir \/incremental_dataset --append\n<\/pre>\n<p><strong>11:<\/strong> Importing a table named emp_info from the demo_db database, with validation that compares the table row count against the number of rows copied into HDFS:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import --connect jdbc:mysql:\/\/localhost\/demo_db \\\n    --table emp_info --validate\n<\/pre>\n<h3>Summary<\/h3>\n<p>I hope after reading this article, you clearly understand how we can import tables from relational databases to HDFS, Hive, HBase, and Accumulo.<\/p>\n<p>The article explained the different types of arguments as well as the syntax for the Sqoop import tool.<\/p>\n<p>If you have any doubt related to Sqoop import, then please share it with us in the comment section.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We all know that for transferring data from RDBMS to HDFS or vice-versa, we use Apache Sqoop. 
In this Sqoop Import article, we will discuss the Sqoop Import tool used for importing tables from&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":79658,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3163],"tags":[3166],"class_list":["post-79615","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-sqoop","tag-sqoop-import"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.7 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Sqoop Import Queries with Examples - TechVidvan<\/title>\n<meta name=\"description\" content=\"Apache Sqoop Import - Learn how to import tables from relational tables to HDFS, Hive, HBase, and Accumulo. Learn different types of arguments with syntax\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Sqoop Import Queries with Examples - TechVidvan\" \/>\n<meta property=\"og:description\" content=\"Apache Sqoop Import - Learn how to import tables from relational tables to HDFS, Hive, HBase, and Accumulo. 
Learn different types of arguments with syntax\" \/>\n<meta property=\"og:url\" content=\"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/\" \/>\n<meta property=\"og:site_name\" content=\"TechVidvan\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/TechVidvan\/\" \/>\n<meta property=\"article:published_time\" content=\"2020-08-24T03:30:21+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/08\/sqoop-import-tv.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"TechVidvan Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:site\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"TechVidvan Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"17 minutes\" \/>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Sqoop Import Queries with Examples - TechVidvan","description":"Apache Sqoop Import - Learn how to import tables from relational tables to HDFS, Hive, HBase, and Accumulo. Learn different types of arguments with syntax","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/","og_locale":"en_US","og_type":"article","og_title":"Sqoop Import Queries with Examples - TechVidvan","og_description":"Apache Sqoop Import - Learn how to import tables from relational tables to HDFS, Hive, HBase, and Accumulo. 
Learn different types of arguments with syntax","og_url":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/","og_site_name":"TechVidvan","article_publisher":"https:\/\/www.facebook.com\/TechVidvan\/","article_published_time":"2020-08-24T03:30:21+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/08\/sqoop-import-tv.jpg","type":"image\/jpeg"}],"author":"TechVidvan Team","twitter_card":"summary_large_image","twitter_creator":"@vidvantech","twitter_site":"@vidvantech","twitter_misc":{"Written by":"TechVidvan Team","Est. reading time":"17 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/#article","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/"},"author":{"name":"TechVidvan Team","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22"},"headline":"Sqoop Import Queries with Examples","datePublished":"2020-08-24T03:30:21+00:00","mainEntityOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/"},"wordCount":3696,"commentCount":0,"publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/08\/sqoop-import-tv.jpg","keywords":["sqoop import"],"articleSection":["Sqoop Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/","url":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/","name":"Sqoop Import Queries with Examples - 
TechVidvan","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/#website"},"primaryImageOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/#primaryimage"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/08\/sqoop-import-tv.jpg","datePublished":"2020-08-24T03:30:21+00:00","description":"Apache Sqoop Import - Learn how to import tables from relational tables to HDFS, Hive, HBase, and Accumulo. Learn different types of arguments with syntax","breadcrumb":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/#primaryimage","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/08\/sqoop-import-tv.jpg","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/08\/sqoop-import-tv.jpg","width":1200,"height":628,"caption":"sqoop import"},{"@type":"BreadcrumbList","@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-import\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/techvidvan.com\/tutorials\/"},{"@type":"ListItem","position":2,"name":"Sqoop Import Queries with Examples"}]},{"@type":"WebSite","@id":"https:\/\/techvidvan.com\/tutorials\/#website","url":"https:\/\/techvidvan.com\/tutorials\/","name":"TechVidvan 
Blogs","description":"","publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/techvidvan.com\/tutorials\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/techvidvan.com\/tutorials\/#organization","name":"TechVidvan","url":"https:\/\/techvidvan.com\/tutorials\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","width":200,"height":50,"caption":"TechVidvan"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/TechVidvan\/","https:\/\/x.com\/vidvantech"]},{"@type":"Person","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22","name":"TechVidvan Team","description":"The TechVidvan Team delivers practical, beginner-friendly tutorials on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Our experts are here to help you upskill and excel in today\u2019s tech industry."}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/79615","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/comments?post=79615"}],"version-history":[{"count":0,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/79615\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media\/79658"}],"wp:attachment":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media?parent=79615"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/categories?post=79615"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/tags?post=79615"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}