{"id":79620,"date":"2020-08-24T09:00:16","date_gmt":"2020-08-24T03:30:16","guid":{"rendered":"https:\/\/techvidvan.com\/tutorials\/?p=79620"},"modified":"2020-08-24T09:00:16","modified_gmt":"2020-08-24T03:30:16","slug":"apache-sqoop-validation","status":"publish","type":"post","link":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/","title":{"rendered":"Sqoop Validation &#8211; How Sqoop Validates Copied Data"},"content":{"rendered":"<p>Sqoop validation checks the data copied by a Sqoop import or export. This article first gives a short introduction to Sqoop validation.<\/p>\n<p>It then explains the purpose of Sqoop validation, its syntax, and its configuration. Finally, it covers the validation interfaces, examples, and limitations.<\/p>\n<h3>What is Sqoop Validation?<\/h3>\n<p>Sqoop validation verifies the data copied by an import or an export. After the copy completes, it compares the row count of the source with the row count of the target.<\/p>\n<h3>Interfaces of Sqoop Validation<\/h3>\n<p>Sqoop validation is built on three interfaces:<\/p>\n<h4>a. ValidationThreshold<\/h4>\n<p>This interface determines whether the error margin between the source and the target is acceptable, for example Absolute or Percentage Tolerant. The default implementation is AbsoluteValidationThreshold, which ensures that the row counts from the source and the target are the same.<\/p>\n<h4>b. ValidationFailureHandler<\/h4>\n<p>This interface is responsible for handling failures, for example by logging an error or warning, or by aborting. Its default implementation is LogOnFailureHandler, which logs a warning message to the configured logger.<\/p>\n<h4>c. 
Validator<\/h4>\n<p>This interface drives the validation logic by delegating the decision to the ValidationThreshold and the failure handling to the ValidationFailureHandler. Its default implementation is RowCountValidator, which validates the row counts from the source and the target.<\/p>\n<h3>Syntax of Sqoop Validation<\/h3>\n<p>The syntax of Sqoop validation is:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import (generic-args) (import-args)\n$ sqoop export (generic-args) (export-args)\n<\/pre>\n<p>The validation arguments are part of the import and export arguments.<\/p>\n<h3>Sqoop Validation Configuration<\/h3>\n<p>The Sqoop validation framework is pluggable and extensible. It ships with default implementations, but the interfaces can be extended with custom implementations, which are passed as command-line arguments as described below.<\/p>\n<h3>Validator<\/h3>\n<p><strong>Property:<\/strong> validator<br \/>\n<strong>Description:<\/strong> Driver for validation. It must implement org.apache.sqoop.validation.Validator<br \/>\n<strong>Supported values:<\/strong> The value must be a fully qualified class name.<br \/>\n<strong>Default value:<\/strong> org.apache.sqoop.validation.RowCountValidator<\/p>\n<h3>Validation Threshold<\/h3>\n<p><strong>Property:<\/strong> validation-threshold<br \/>\n<strong>Description:<\/strong> Drives the decision on whether the validation meets the threshold. 
It must implement org.apache.sqoop.validation.ValidationThreshold<br \/>\n<strong>Supported values:<\/strong> The value must be a fully qualified class name.<br \/>\n<strong>Default value:<\/strong> org.apache.sqoop.validation.AbsoluteValidationThreshold<\/p>\n<h3>Validation Failure Handler<\/h3>\n<p><strong>Property:<\/strong> validation-failurehandler<br \/>\n<strong>Description:<\/strong> Responsible for handling failures. It must implement org.apache.sqoop.validation.ValidationFailureHandler<br \/>\n<strong>Supported values:<\/strong> The value must be a fully qualified class name.<br \/>\n<strong>Default value:<\/strong> org.apache.sqoop.validation.AbortOnFailureHandler<\/p>\n<h3>Limitations of Sqoop Validation<\/h3>\n<p>Currently, Sqoop validation only validates data copied from a single table into HDFS. 
As a result, the current implementation does not support validation for the following:<\/p>\n<ul>\n<li>the all-tables option<\/li>\n<li>the free-form query option<\/li>\n<li>data imported into Hive, Accumulo, or HBase<\/li>\n<li>table imports with the --where argument<\/li>\n<li>incremental imports<\/li>\n<\/ul>\n<h3>Example Invocations of the Sqoop Validation<\/h3>\n<p><strong>1:<\/strong> This example imports a table named emp_info from the demo_db database and uses Sqoop validation to validate the row counts.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import --connect jdbc:mysql:\/\/localhost\/demo_db \\\n--table emp_info --validate\n<\/pre>\n<p><strong>2:<\/strong> This example exports data to a table named com_info with Sqoop validation enabled:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop export --connect jdbc:mysql:\/\/localhost\/demo_db --table com_info \\\n--export-dir \/results\/com_info_data --validate\n<\/pre>\n<p><strong>3:<\/strong> This example overrides the Sqoop validation arguments:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">$ sqoop import --connect jdbc:mysql:\/\/localhost\/demo_db --table emp_info \\\n--validate --validator org.apache.sqoop.validation.RowCountValidator \\\n--validation-threshold \\\norg.apache.sqoop.validation.AbsoluteValidationThreshold \\\n--validation-failurehandler \\\norg.apache.sqoop.validation.AbortOnFailureHandler\n<\/pre>\n<h3>Summary<\/h3>\n<p>I hope this article has given you a clear picture of Sqoop validation. It listed several examples to make the concepts easier to follow. You are now familiar with the syntax, the interfaces, and the configuration of Sqoop validation.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Sqoop Validation refers to the validation of the data copied. 
In this Sqoop Validation article, you will explore the entire concept of Sqoop validation in detail. The article first gives a short introduction to&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":79665,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3163],"tags":[3169,3170],"class_list":["post-79620","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-sqoop","tag-sqoop-validate-data","tag-sqoop-validation"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Sqoop Validation - How Sqoop Validates Copied Data - TechVidvan<\/title>\n<meta name=\"description\" content=\"Sqoop Validation - Validate the data copied, either import or export by comparing row counts from source and target post copy. Learn with syntax &amp; examples\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Sqoop Validation - How Sqoop Validates Copied Data - TechVidvan\" \/>\n<meta property=\"og:description\" content=\"Sqoop Validation - Validate the data copied, either import or export by comparing row counts from source and target post copy. 
Learn with syntax &amp; examples\" \/>\n<meta property=\"og:url\" content=\"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/\" \/>\n<meta property=\"og:site_name\" content=\"TechVidvan\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/TechVidvan\/\" \/>\n<meta property=\"article:published_time\" content=\"2020-08-24T03:30:16+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/08\/Sqoop-Validation-tv.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"TechVidvan Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:site\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"TechVidvan Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Sqoop Validation - How Sqoop Validates Copied Data - TechVidvan","description":"Sqoop Validation - Validate the data copied, either import or export by comparing row counts from source and target post copy. Learn with syntax & examples","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/","og_locale":"en_US","og_type":"article","og_title":"Sqoop Validation - How Sqoop Validates Copied Data - TechVidvan","og_description":"Sqoop Validation - Validate the data copied, either import or export by comparing row counts from source and target post copy. 
Learn with syntax & examples","og_url":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/","og_site_name":"TechVidvan","article_publisher":"https:\/\/www.facebook.com\/TechVidvan\/","article_published_time":"2020-08-24T03:30:16+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/08\/Sqoop-Validation-tv.jpg","type":"image\/jpeg"}],"author":"TechVidvan Team","twitter_card":"summary_large_image","twitter_creator":"@vidvantech","twitter_site":"@vidvantech","twitter_misc":{"Written by":"TechVidvan Team","Est. reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/#article","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/"},"author":{"name":"TechVidvan Team","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22"},"headline":"Sqoop Validation &#8211; How Sqoop Validates Copied Data","datePublished":"2020-08-24T03:30:16+00:00","mainEntityOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/"},"wordCount":595,"commentCount":0,"publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/08\/Sqoop-Validation-tv.jpg","keywords":["sqoop validate data","sqoop validation"],"articleSection":["Sqoop Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/","url":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/","name":"Sqoop Validation - How Sqoop Validates Copied Data - 
TechVidvan","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/#website"},"primaryImageOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/#primaryimage"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/08\/Sqoop-Validation-tv.jpg","datePublished":"2020-08-24T03:30:16+00:00","description":"Sqoop Validation - Validate the data copied, either import or export by comparing row counts from source and target post copy. Learn with syntax & examples","breadcrumb":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/#primaryimage","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/08\/Sqoop-Validation-tv.jpg","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2020\/08\/Sqoop-Validation-tv.jpg","width":1200,"height":628,"caption":"Sqoop Validation"},{"@type":"BreadcrumbList","@id":"https:\/\/techvidvan.com\/tutorials\/apache-sqoop-validation\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/techvidvan.com\/tutorials\/"},{"@type":"ListItem","position":2,"name":"Sqoop Validation &#8211; How Sqoop Validates Copied Data"}]},{"@type":"WebSite","@id":"https:\/\/techvidvan.com\/tutorials\/#website","url":"https:\/\/techvidvan.com\/tutorials\/","name":"TechVidvan 
Blogs","description":"","publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/techvidvan.com\/tutorials\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/techvidvan.com\/tutorials\/#organization","name":"TechVidvan","url":"https:\/\/techvidvan.com\/tutorials\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","width":200,"height":50,"caption":"TechVidvan"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/TechVidvan\/","https:\/\/x.com\/vidvantech"]},{"@type":"Person","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22","name":"TechVidvan Team","description":"The TechVidvan Team delivers practical, beginner-friendly tutorials on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Our experts are here to help you upskill and excel in today\u2019s tech industry."}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/79620","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/comments?post=79620"}],"version-history":[{"count":0,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/79620\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media\/79665"}],"wp:attachment":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media?parent=79620"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/categories?post=79620"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/tags?post=79620"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}