{"id":315,"date":"2017-10-04T05:10:50","date_gmt":"2017-10-04T05:10:50","guid":{"rendered":"http:\/\/techvidvan.com\/tutorials\/?p=315"},"modified":"2017-10-04T05:10:50","modified_gmt":"2017-10-04T05:10:50","slug":"hadoop-partitioner-introduction","status":"publish","type":"post","link":"https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/","title":{"rendered":"Hadoop Partitioner &#8211; Learn the Basics of MapReduce Partitioner"},"content":{"rendered":"<p>The main goal of this <strong>Hadoop Tutorial<\/strong> is to give you a detailed description of each component\u00a0that Hadoop uses. In this tutorial, we are going to cover the Partitioner in Hadoop.<\/p>\n<p>What is the Hadoop Partitioner? Why does Hadoop need a Partitioner? What is the default Partitioner in MapReduce? How many Partitioners are used in Hadoop?<\/p>\n<p>We will answer all these questions in this MapReduce tutorial.<\/p>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2019\/11\/partitioner-in-hadoop-MapReduce.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-73215\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2019\/11\/partitioner-in-hadoop-MapReduce.jpg\" alt=\"hadoop MapReduce partitioner\" width=\"1200\" height=\"628\" \/><\/a><\/p>\n<h3>What is Hadoop Partitioner?<\/h3>\n<p>The Partitioner in MapReduce job execution\u00a0controls the partitioning of the keys of the intermediate map outputs. The key (or a subset of the key) is used to derive the partition, typically with a hash function. The total number of partitions is equal to the number of reduce tasks.<\/p>\n<p>The framework partitions each <strong>mapper<\/strong> output on the basis of the <strong>key<\/strong>. Records that have the same key go into the same partition (within each mapper). 
Then each partition is sent to a <strong>reducer<\/strong>.<\/p>\n<p>The Partitioner class decides which partition a given (key, value) pair goes to. The partition phase in the MapReduce data flow takes place after the map phase and before the reduce phase.<\/p>\n<h3>Need of MapReduce Partitioner in Hadoop<\/h3>\n<p>A MapReduce job\u00a0takes an input data set and produces a list of key-value pairs. These key-value pairs are the result of the map phase: the input data is split, each map task processes one split, and each map outputs a list of key-value pairs.<\/p>\n<p>Then the framework sends the map output to the reduce tasks. Each reduce task applies the user-defined reduce function to the map outputs. Before the reduce phase, the map output is partitioned on the basis of the key.<\/p>\n<p>Hadoop partitioning ensures that all the values for each key are grouped together. It also makes sure that all the values of a single key go to the same reducer. When the keys hash uniformly, this spreads the map output roughly evenly over the reducers.<\/p>\n<p>The Partitioner in a MapReduce job redirects the mapper output to the reducers by determining which reducer handles a particular key.<\/p>\n<h3>Hadoop Default Partitioner<\/h3>\n<p><strong>Hash Partitioner<\/strong> is the default Partitioner. It computes a hash value for the key and assigns the partition based on this result, taking the hash modulo the number of reduce tasks.<\/p>\n<h3>How many Partitioners in Hadoop?<\/h3>\n<p>The total number of partitions depends on the number of reducers. The Hadoop Partitioner divides the data according to the number of reducers, which is set by the <strong>JobConf.setNumReduceTasks()<\/strong> method.<\/p>\n<p>Thus each reducer processes the data from a single partition. The important thing to notice is that the framework uses the partitioner only when there is more than one reducer.<\/p>\n<h3>Poor Partitioning in Hadoop MapReduce<\/h3>\n<p>Suppose that in the input data of a MapReduce job one key appears more often than any other key. 
In such a case, the data is sent to the partitions as follows:<\/p>\n<ul>\n<li>The key that appears most often will be sent to one partition.<\/li>\n<li>All the other keys will be sent to partitions on the basis of their <strong>hashCode()<\/strong>.<\/li>\n<\/ul>\n<p>If the <strong>hashCode()<\/strong> method does not distribute the other keys evenly over the partition range, then the data will not be sent evenly to the reducers.<\/p>\n<p>Poor partitioning of data means that some reducers will have more input data than others. They will have more work to do than the other reducers. Thus the entire job has to wait for one reducer to finish its extra-large share of the load.<\/p>\n<p><strong>How to overcome poor partitioning in MapReduce?<\/strong><\/p>\n<p>To overcome poor partitioning in Hadoop MapReduce, we can create a custom partitioner. This allows sharing the workload across different reducers.<\/p>\n<h3>Conclusion<\/h3>\n<p>In conclusion, the Partitioner distributes the map output over the reducers, ideally uniformly. In the MapReduce Partitioner, partitioning of the map output takes place on the basis of the key.<\/p>\n<p>Hence, we have covered an overview of the Partitioner in this blog. Hope you liked it. If you have any doubts about the Hadoop Partitioner, don&#8217;t forget to share them with us.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The main goal of this Hadoop Tutorial is to provide you a detailed description of each component\u00a0that is used in Hadoop working. 
In this tutorial, we are going to cover the Partitioner in Hadoop.&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":73215,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[570],"tags":[538,457,539,541,575,581,463,582],"class_list":["post-315","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-mapreduce","tag-apache-hadoop","tag-big-data","tag-big-data-hadoop","tag-hadoop","tag-hadoop-mapreduce","tag-hadoop-partitioner","tag-mapreduce","tag-mapreduce-partitioner"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.7 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Hadoop Partitioner - Learn the Basics of MapReduce Partitioner - TechVidvan<\/title>\n<meta name=\"description\" content=\"Hadoop Partitioner cover What is partitioner, why Partitioning,Default partitioner in Hadoop,poor partitioning in MapReduce,Working of MapReduce Partitioner\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Hadoop Partitioner - Learn the Basics of MapReduce Partitioner - TechVidvan\" \/>\n<meta property=\"og:description\" content=\"Hadoop Partitioner cover What is partitioner, why Partitioning,Default partitioner in Hadoop,poor partitioning in MapReduce,Working of MapReduce Partitioner\" \/>\n<meta property=\"og:url\" content=\"https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/\" \/>\n<meta property=\"og:site_name\" content=\"TechVidvan\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/TechVidvan\/\" \/>\n<meta 
property=\"article:published_time\" content=\"2017-10-04T05:10:50+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/partitioner-in-hadoop-MapReduce.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"TechVidvan Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:site\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"TechVidvan Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Hadoop Partitioner - Learn the Basics of MapReduce Partitioner - TechVidvan","description":"Hadoop Partitioner cover What is partitioner, why Partitioning,Default partitioner in Hadoop,poor partitioning in MapReduce,Working of MapReduce Partitioner","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/","og_locale":"en_US","og_type":"article","og_title":"Hadoop Partitioner - Learn the Basics of MapReduce Partitioner - TechVidvan","og_description":"Hadoop Partitioner cover What is partitioner, why Partitioning,Default partitioner in Hadoop,poor partitioning in MapReduce,Working of MapReduce 
Partitioner","og_url":"https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/","og_site_name":"TechVidvan","article_publisher":"https:\/\/www.facebook.com\/TechVidvan\/","article_published_time":"2017-10-04T05:10:50+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/partitioner-in-hadoop-MapReduce.jpg","type":"image\/jpeg"}],"author":"TechVidvan Team","twitter_card":"summary_large_image","twitter_creator":"@vidvantech","twitter_site":"@vidvantech","twitter_misc":{"Written by":"TechVidvan Team","Est. reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/#article","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/"},"author":{"name":"TechVidvan Team","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22"},"headline":"Hadoop Partitioner &#8211; Learn the Basics of MapReduce Partitioner","datePublished":"2017-10-04T05:10:50+00:00","mainEntityOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/"},"wordCount":650,"commentCount":0,"publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/partitioner-in-hadoop-MapReduce.jpg","keywords":["apache hadoop","big data","big data hadoop","hadoop","hadoop mapreduce","Hadoop partitioner","MapReduce","MapReduce Partitioner"],"articleSection":["MapReduce 
Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/","url":"https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/","name":"Hadoop Partitioner - Learn the Basics of MapReduce Partitioner - TechVidvan","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/#website"},"primaryImageOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/#primaryimage"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/partitioner-in-hadoop-MapReduce.jpg","datePublished":"2017-10-04T05:10:50+00:00","description":"Hadoop Partitioner cover What is partitioner, why Partitioning,Default partitioner in Hadoop,poor partitioning in MapReduce,Working of MapReduce Partitioner","breadcrumb":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/#primaryimage","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/partitioner-in-hadoop-MapReduce.jpg","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/partitioner-in-hadoop-MapReduce.jpg","width":1200,"height":628,"caption":"hadoop partitioner in 
MapReduce"},{"@type":"BreadcrumbList","@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-partitioner-introduction\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/techvidvan.com\/tutorials\/"},{"@type":"ListItem","position":2,"name":"Hadoop Partitioner &#8211; Learn the Basics of MapReduce Partitioner"}]},{"@type":"WebSite","@id":"https:\/\/techvidvan.com\/tutorials\/#website","url":"https:\/\/techvidvan.com\/tutorials\/","name":"TechVidvan Blogs","description":"","publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/techvidvan.com\/tutorials\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/techvidvan.com\/tutorials\/#organization","name":"TechVidvan","url":"https:\/\/techvidvan.com\/tutorials\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","width":200,"height":50,"caption":"TechVidvan"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/TechVidvan\/","https:\/\/x.com\/vidvantech"]},{"@type":"Person","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22","name":"TechVidvan Team","description":"The TechVidvan Team delivers practical, beginner-friendly tutorials on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Our experts are here to help you upskill and excel in today\u2019s tech industry."}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/315","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/comments?post=315"}],"version-history":[{"count":0,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/315\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media\/73215"}],"wp:attachment":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media?parent=315"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/categories?post=315"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/tags?post=315"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}