{"id":1997,"date":"2017-10-04T11:36:28","date_gmt":"2017-10-04T11:36:28","guid":{"rendered":"http:\/\/techvidvan.com\/tutorials\/?p=347"},"modified":"2017-10-04T11:36:28","modified_gmt":"2017-10-04T11:36:28","slug":"map-only-job-in-hadoop","status":"publish","type":"post","link":"https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/","title":{"rendered":"What is Map Only job in Hadoop?"},"content":{"rendered":"<p>In our previous <strong>Hadoop<\/strong> blogs, we studied each <strong>component of the Hadoop<\/strong> MapReduce process in detail. In this one, we discuss a very interesting topic: the Map-only job in Hadoop.<\/p>\n<p>First, we briefly introduce the <strong>Map<\/strong> and <strong>Reduce<\/strong> phases of Hadoop MapReduce, and then we discuss what a Map-only job is.<\/p>\n<p>Finally, we discuss the advantages of a Hadoop Map-only job.<\/p>\n<h3>What is Hadoop Map Only Job?<\/h3>\n<p>A <strong>Map-only job<\/strong> in Hadoop is a job in which the <strong>mapper<\/strong> does all the work. The <strong>reducer<\/strong> does nothing, and the mapper\u2019s output is the final output.<\/p>\n<p>MapReduce is the data processing layer of Hadoop. It processes large volumes of structured and unstructured data stored in <strong>HDFS<\/strong>, and it does so in parallel.<\/p>\n<p>It achieves this by dividing the submitted job into a set of independent tasks (sub-jobs). In Hadoop, MapReduce breaks the processing into two phases: <strong>Map<\/strong> and <strong>Reduce<\/strong>.<\/p>\n<ul>\n<li><strong>Map:<\/strong> It is the first phase of processing, where we specify all the complex logic. It takes a set of data and converts it into another set of data, breaking individual elements into tuples (<strong>key-value pairs<\/strong>).<\/li>\n<li><strong>Reduce:<\/strong> It is the second phase of processing. 
Here we specify light-weight processing such as aggregation or summation. It takes the output of the map phase as its input and combines those tuples based on the key.<\/li>\n<\/ul>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2019\/11\/Map-Only-Job-in-Hadoop-01.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-73201\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2019\/11\/Map-Only-Job-in-Hadoop-01.jpg\" alt=\"Hadoop Map Only Job\" width=\"1200\" height=\"628\" \/><\/a><br \/>\nFrom this word-count example, we can see that there are two sets of parallel processes: map and reduce. In the map process, the input is first split to distribute the work among all the map nodes, as shown above.<\/p>\n<p>Then the mapper identifies each word and maps it to the number 1, creating pairs called tuples (key-value pairs).<\/p>\n<p>The first mapper node receives three words: lion, tiger, and river. It therefore produces 3 key-value pairs as its output, one per word, each with the value set to 1. The same process repeats on all nodes.<\/p>\n<p>These tuples are then passed to the reducer nodes. The partitioner decides which reducer each key is sent to, and <strong>shuffling<\/strong> moves the data so that all tuples with the same key go to the same node.<\/p>\n<p>The reduce process is essentially an aggregation of values, or more generally an operation on the values that share the same key.<\/p>\n<p>Now, let us consider a scenario where we only need to apply an operation to each record and do not need any aggregation. In such a case, we prefer a \u2018<strong>Map-Only job<\/strong>\u2019.<\/p>\n<p>In a Map-Only job, each map task does all the work on its <strong>InputSplit<\/strong>. The reducer does nothing, and the mapper\u2019s output is the final output.<\/p>\n<h3>How to avoid Reduce Phase in MapReduce?<\/h3>\n<p>By calling <strong>job.setNumReduceTasks(0)<\/strong> on the job in the driver, we can avoid the reduce phase. 
This sets the number of reducers to <strong>0<\/strong>, so the mappers alone do the complete task.<\/p>\n<h3>Advantages of Map only job in Hadoop<\/h3>\n<p>In MapReduce job execution, there is a shuffle and sort phase between the map and reduce phases. <strong>Shuffling and sorting<\/strong> are responsible for sorting the keys in ascending order and then grouping the values that share the same key. This phase is very expensive.<\/p>\n<p>If the reduce phase is not required, we should avoid it, because avoiding the reduce phase eliminates the sort and shuffle phase as well. This also reduces network congestion.<\/p>\n<p>The reason is that during shuffling, the output of the mapper travels over the network to the reducer. When the data size is huge, a large amount of data needs to travel to the reducer.<\/p>\n<p>In a regular job, the output of the mapper is written to local disk before it is sent to the reducer. In a map-only job, this output is written directly to HDFS. This further saves time and reduces cost.<\/p>\n<h3>Conclusion<\/h3>\n<p>Hence, we have seen that a Map-only job reduces network congestion by avoiding the shuffle, sort, and reduce phases. The map phase alone takes care of the overall processing and produces the output. Calling <strong>job.setNumReduceTasks(0)<\/strong> in the driver achieves this.<\/p>\n<p>We hope you now understand the Hadoop Map-only job and its significance. If you have any queries, you can share them with us in the comment section.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In our previous Hadoop blogs, we studied each component of the Hadoop MapReduce process in detail. In this one, we are going to discuss a very interesting topic, i.e. 
Map Only job in Hadoop.&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":73201,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[570],"tags":[538,457,541,575,543,618],"class_list":["post-1997","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-mapreduce","tag-apache-hadoop","tag-big-data","tag-hadoop","tag-hadoop-mapreduce","tag-hadoop-tutorial","tag-map-only-job-in-hadoop"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.7 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>What is Map Only job in Hadoop? - TechVidvan<\/title>\n<meta name=\"description\" content=\"Hadoop Tutorial cover Map Only Job in Hadoop,What is map phase,what is Reduce phase,avoid Reduce Phase in Hadoop, Hadoop Map only job benefits &amp; Limitations\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"What is Map Only job in Hadoop? 
- TechVidvan\" \/>\n<meta property=\"og:description\" content=\"Hadoop Tutorial cover Map Only Job in Hadoop,What is map phase,what is Reduce phase,avoid Reduce Phase in Hadoop, Hadoop Map only job benefits &amp; Limitations\" \/>\n<meta property=\"og:url\" content=\"https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/\" \/>\n<meta property=\"og:site_name\" content=\"TechVidvan\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/TechVidvan\/\" \/>\n<meta property=\"article:published_time\" content=\"2017-10-04T11:36:28+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Map-Only-Job-in-Hadoop-01.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"TechVidvan Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:site\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"TechVidvan Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"What is Map Only job in Hadoop? - TechVidvan","description":"Hadoop Tutorial cover Map Only Job in Hadoop,What is map phase,what is Reduce phase,avoid Reduce Phase in Hadoop, Hadoop Map only job benefits & Limitations","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/","og_locale":"en_US","og_type":"article","og_title":"What is Map Only job in Hadoop? 
- TechVidvan","og_description":"Hadoop Tutorial cover Map Only Job in Hadoop,What is map phase,what is Reduce phase,avoid Reduce Phase in Hadoop, Hadoop Map only job benefits & Limitations","og_url":"https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/","og_site_name":"TechVidvan","article_publisher":"https:\/\/www.facebook.com\/TechVidvan\/","article_published_time":"2017-10-04T11:36:28+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Map-Only-Job-in-Hadoop-01.jpg","type":"image\/jpeg"}],"author":"TechVidvan Team","twitter_card":"summary_large_image","twitter_creator":"@vidvantech","twitter_site":"@vidvantech","twitter_misc":{"Written by":"TechVidvan Team","Est. reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/#article","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/"},"author":{"name":"TechVidvan Team","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22"},"headline":"What is Map Only job in Hadoop?","datePublished":"2017-10-04T11:36:28+00:00","mainEntityOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/"},"wordCount":701,"commentCount":0,"publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Map-Only-Job-in-Hadoop-01.jpg","keywords":["apache hadoop","big data","hadoop","hadoop mapreduce","hadoop tutorial","map only job in Hadoop"],"articleSection":["MapReduce 
Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/","url":"https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/","name":"What is Map Only job in Hadoop? - TechVidvan","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/#website"},"primaryImageOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/#primaryimage"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Map-Only-Job-in-Hadoop-01.jpg","datePublished":"2017-10-04T11:36:28+00:00","description":"Hadoop Tutorial cover Map Only Job in Hadoop,What is map phase,what is Reduce phase,avoid Reduce Phase in Hadoop, Hadoop Map only job benefits & Limitations","breadcrumb":{"@id":"https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/#primaryimage","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Map-Only-Job-in-Hadoop-01.jpg","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Map-Only-Job-in-Hadoop-01.jpg","width":1200,"height":628,"caption":"Hadoop Map Only Job"},{"@type":"BreadcrumbList","@id":"https:\/\/techvidvan.com\/tutorials\/map-only-job-in-hadoop\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/techvidvan.com\/tutorials\/"},{"@type":"ListItem","position":2,"name":"What is Map Only job in 
Hadoop?"}]},{"@type":"WebSite","@id":"https:\/\/techvidvan.com\/tutorials\/#website","url":"https:\/\/techvidvan.com\/tutorials\/","name":"TechVidvan Blogs","description":"","publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/techvidvan.com\/tutorials\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/techvidvan.com\/tutorials\/#organization","name":"TechVidvan","url":"https:\/\/techvidvan.com\/tutorials\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","width":200,"height":50,"caption":"TechVidvan"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/TechVidvan\/","https:\/\/x.com\/vidvantech"]},{"@type":"Person","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22","name":"TechVidvan Team","description":"The TechVidvan Team delivers practical, beginner-friendly tutorials on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Our experts are here to help you upskill and excel in today\u2019s tech industry."}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/1997","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/comments?post=1997"}],"version-history":[{"count":0,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/1997\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media\/73201"}],"wp:attachment":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media?parent=1997"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/categories?post=1997"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/tags?post=1997"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}