{"id":241,"date":"2017-09-29T11:50:15","date_gmt":"2017-09-29T11:50:15","guid":{"rendered":"http:\/\/techvidvan.com\/tutorials\/?p=241"},"modified":"2017-09-29T11:50:15","modified_gmt":"2017-09-29T11:50:15","slug":"hadoop-outputformat-introduction","status":"publish","type":"post","link":"https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/","title":{"rendered":"What is Hadoop OutputFormat in MapReduce?"},"content":{"rendered":"<p>In our previous <strong>Hadoop tutorial<\/strong>, we gave a detailed description of <b>InputFormat.<\/b>\u00a0Now in this blog, we are going to cover the Hadoop OutputFormat.<\/p>\n<p>We will discuss what OutputFormat is in Hadoop and what RecordWriter is in MapReduce. We will also cover the types of OutputFormat in MapReduce.<\/p>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2019\/11\/OutputFormat-in-Hadoop-MapReduce-01-1.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-73212\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2019\/11\/OutputFormat-in-Hadoop-MapReduce-01-1.jpg\" alt=\"OutputFormat in Hadoop MapReduce\" width=\"1200\" height=\"628\" \/><\/a><\/p>\n<h3>Introduction to Hadoop OutputFormat<\/h3>\n<p><strong>OutputFormat<\/strong> checks the output specification for execution of the MapReduce job. It also provides the RecordWriter implementation that is used to write the job's output files.<\/p>\n<p>Before we start with OutputFormat, let us first learn what RecordWriter is and what it does in MapReduce.<\/p>\n<h4>1. 
RecordWriter in Hadoop MapReduce<\/h4>\n<p>As we know, the <strong>Reducer<\/strong> takes the intermediate output of the <strong>Mappers<\/strong> as input.\u00a0It then runs a reduce function on it to generate output that is again zero or more key-value pairs.<\/p>\n<p>RecordWriter writes these output key-value pairs from the Reducer phase to output files.<\/p>\n<h4>2. Hadoop OutputFormat<\/h4>\n<p>As described above, RecordWriter takes output data from the Reducer and writes it to output files. OutputFormat determines how RecordWriter writes these output key-value pairs to the output files.<\/p>\n<p>OutputFormat plays a role similar to that of\u00a0InputFormat, but on the output side. OutputFormat instances are used to write to files on the local disk or in\u00a0<strong>HDFS.<\/strong> During MapReduce job execution, on the basis of the output specification:<\/p>\n<ul>\n<li>The Hadoop MapReduce job checks that the output directory does not already exist.<\/li>\n<li>OutputFormat provides the RecordWriter implementation used to write the output files of the job. The output files are then stored in a FileSystem.<\/li>\n<\/ul>\n<p>For file-based output formats, the output directory is set in the job driver with the <strong>FileOutputFormat.setOutputPath()<\/strong> method.<\/p>\n<h3>Types of OutputFormat in MapReduce<\/h3>\n<p>There are various types of OutputFormat, which are as follows:<\/p>\n<h4>1. TextOutputFormat<\/h4>\n<p>The default OutputFormat is TextOutputFormat. It writes (key, value) pairs on individual lines of text files. Its keys and values can be of any type, because TextOutputFormat turns them into strings by calling <strong>toString()<\/strong> on them.<\/p>\n<p>It separates the key and value with a tab character. We can change the separator with the <strong>mapreduce.output.textoutputformat.separator<\/strong> property.<\/p>\n<p>KeyValueTextInputFormat can be used to read these output text files back, since it splits each line into a key and a value at a configurable separator.<\/p>\n<h4>2. 
SequenceFileOutputFormat<\/h4>\n<p>This OutputFormat writes sequence files as its output. It is also a good intermediate format for use between MapReduce jobs, because it serializes arbitrary data types to the file.<\/p>\n<p>The corresponding SequenceFileInputFormat deserializes the file into the same types. It presents the data to the next <strong>mapper<\/strong> in the same manner as it was emitted by the previous reducer. Static methods on SequenceFileOutputFormat control the compression.<\/p>\n<h4>3. SequenceFileAsBinaryOutputFormat<\/h4>\n<p>It is a variant of SequenceFileOutputFormat that writes keys and values to a sequence file in raw binary format.<\/p>\n<h4>4. MapFileOutputFormat<\/h4>\n<p>It is another form of FileOutputFormat that writes output as map files. Keys must be added to a MapFile in order, so we need to ensure that the reducer emits keys in sorted order.<\/p>\n<h4>5. MultipleOutputs<\/h4>\n<p>MultipleOutputs allows writing data to files whose names are derived from the output keys and values.<\/p>\n<h4>6. LazyOutputFormat<\/h4>\n<p>In MapReduce job execution, FileOutputFormat subclasses create output files even if they are empty. LazyOutputFormat is a wrapper OutputFormat that creates an output file only when the first record is emitted for a given partition.<\/p>\n<h4>7. DBOutputFormat<\/h4>\n<p>It is the OutputFormat for writing to relational databases. It sends the reduce output to a SQL table. It accepts key-value pairs in which the key has a type implementing DBWritable.<\/p>\n<h3>Conclusion<\/h3>\n<p>Thus, different OutputFormats are used according to need. We hope you found this blog helpful. If you have any queries about Hadoop OutputFormat, please leave a comment in the comment box. We will be glad to answer them.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In our previous Hadoop tutorial, we have provided you a detailed description of InputFormat.\u00a0Now in this blog, we are going to cover the Hadoop OutputFormat. 
We will discuss What is OutputFormat in Hadoop, What&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":73212,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[570],"tags":[538,457,539,571,572,543,573],"class_list":["post-241","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-mapreduce","tag-apache-hadoop","tag-big-data","tag-big-data-hadoop","tag-hadoop-outputformat","tag-hadoop-recordwriter","tag-hadoop-tutorial","tag-hdaoop"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.7 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>What is Hadoop OutputFormat in MapReduce? - TechVidvan<\/title>\n<meta name=\"description\" content=\"Introduction to Hadoop OutputFormat Cover What is OutputFormat in MapReduce, What is Hadoop MapReduce RecordWriter,Types of MapReduce OutputFormat in Hadoop\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"What is Hadoop OutputFormat in MapReduce? 
- TechVidvan\" \/>\n<meta property=\"og:description\" content=\"Introduction to Hadoop OutputFormat Cover What is OutputFormat in MapReduce, What is Hadoop MapReduce RecordWriter,Types of MapReduce OutputFormat in Hadoop\" \/>\n<meta property=\"og:url\" content=\"https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/\" \/>\n<meta property=\"og:site_name\" content=\"TechVidvan\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/TechVidvan\/\" \/>\n<meta property=\"article:published_time\" content=\"2017-09-29T11:50:15+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/OutputFormat-in-Hadoop-MapReduce-01-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"TechVidvan Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:site\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"TechVidvan Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"What is Hadoop OutputFormat in MapReduce? 
- TechVidvan","description":"Introduction to Hadoop OutputFormat Cover What is OutputFormat in MapReduce, What is Hadoop MapReduce RecordWriter,Types of MapReduce OutputFormat in Hadoop","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/","og_locale":"en_US","og_type":"article","og_title":"What is Hadoop OutputFormat in MapReduce? - TechVidvan","og_description":"Introduction to Hadoop OutputFormat Cover What is OutputFormat in MapReduce, What is Hadoop MapReduce RecordWriter,Types of MapReduce OutputFormat in Hadoop","og_url":"https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/","og_site_name":"TechVidvan","article_publisher":"https:\/\/www.facebook.com\/TechVidvan\/","article_published_time":"2017-09-29T11:50:15+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/OutputFormat-in-Hadoop-MapReduce-01-1.jpg","type":"image\/jpeg"}],"author":"TechVidvan Team","twitter_card":"summary_large_image","twitter_creator":"@vidvantech","twitter_site":"@vidvantech","twitter_misc":{"Written by":"TechVidvan Team","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/#article","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/"},"author":{"name":"TechVidvan Team","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22"},"headline":"What is Hadoop OutputFormat in MapReduce?","datePublished":"2017-09-29T11:50:15+00:00","mainEntityOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/"},"wordCount":599,"commentCount":0,"publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/OutputFormat-in-Hadoop-MapReduce-01-1.jpg","keywords":["apache hadoop","big data","big data hadoop","Hadoop OutputFormat","Hadoop RecordWriter","hadoop tutorial","Hdaoop"],"articleSection":["MapReduce Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/","url":"https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/","name":"What is Hadoop OutputFormat in MapReduce? 
- TechVidvan","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/#website"},"primaryImageOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/#primaryimage"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/OutputFormat-in-Hadoop-MapReduce-01-1.jpg","datePublished":"2017-09-29T11:50:15+00:00","description":"Introduction to Hadoop OutputFormat Cover What is OutputFormat in MapReduce, What is Hadoop MapReduce RecordWriter,Types of MapReduce OutputFormat in Hadoop","breadcrumb":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/#primaryimage","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/OutputFormat-in-Hadoop-MapReduce-01-1.jpg","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/OutputFormat-in-Hadoop-MapReduce-01-1.jpg","width":1200,"height":628,"caption":"OutputFormat in Hadoop MapReduce"},{"@type":"BreadcrumbList","@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-outputformat-introduction\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/techvidvan.com\/tutorials\/"},{"@type":"ListItem","position":2,"name":"What is Hadoop OutputFormat in MapReduce?"}]},{"@type":"WebSite","@id":"https:\/\/techvidvan.com\/tutorials\/#website","url":"https:\/\/techvidvan.com\/tutorials\/","name":"TechVidvan 
Blogs","description":"","publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/techvidvan.com\/tutorials\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/techvidvan.com\/tutorials\/#organization","name":"TechVidvan","url":"https:\/\/techvidvan.com\/tutorials\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","width":200,"height":50,"caption":"TechVidvan"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/TechVidvan\/","https:\/\/x.com\/vidvantech"]},{"@type":"Person","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22","name":"TechVidvan Team","description":"The TechVidvan Team delivers practical, beginner-friendly tutorials on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Our experts are here to help you upskill and excel in today\u2019s tech industry."}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/241","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/comments?post=241"}],"version-history":[{"count":0,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/241\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media\/73212"}],"wp:attachment":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media?parent=241"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/categories?post=241"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/tags?post=241"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}