{"id":1994,"date":"2017-10-03T06:11:18","date_gmt":"2017-10-03T06:11:18","guid":{"rendered":"http:\/\/techvidvan.com\/tutorials\/?p=255"},"modified":"2017-10-03T06:11:18","modified_gmt":"2017-10-03T06:11:18","slug":"hadoop-mapreduce-shuffling-and-sorting","status":"publish","type":"post","link":"https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/","title":{"rendered":"MapReduce Shuffling and Sorting in Hadoop"},"content":{"rendered":"<p>This <strong>Hadoop tutorial<\/strong> covers MapReduce shuffling and sorting. It describes the Hadoop shuffling and sorting phases in detail.<\/p>\n<p>We first discuss MapReduce shuffling, then MapReduce sorting, and finally secondary sorting.<\/p>\n<p><a href=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2019\/11\/Hadoop-Shuffling-Sorting.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-73148\" src=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/sites\/2\/2019\/11\/Hadoop-Shuffling-Sorting.jpg\" alt=\"Shuffling &amp; Sorting in Hadoop MapReduce\" width=\"1200\" height=\"628\" \/><\/a><\/p>\n<h3>What is MapReduce Shuffling and Sorting?<\/h3>\n<p><strong>Shuffling<\/strong> is the process by which the framework transfers the <strong>mappers'<\/strong> intermediate output to the <strong>reducers<\/strong>. Each reducer receives one or more keys and their associated values, depending on the number of reducers and how the keys are partitioned among them.<\/p>\n<p>The intermediate key-value pairs generated by the mappers are sorted automatically by key. In the sort phase, the map outputs are merged and sorted.<\/p>\n<p>Shuffling and sorting in Hadoop occur simultaneously: map outputs are merged as they are fetched.<\/p>\n<h3>Shuffling in MapReduce<\/h3>\n<p>Shuffling is the process of transferring data from the mappers to the reducers. It is also the process by which the system performs the sort and then passes the map output to the reducers as input. 
This is why the shuffle phase is necessary for the reducers.<\/p>\n<p>Without it, they would have no input, or would be missing input from some mappers. Shuffling can start even before the map phase has finished, which saves time and lets the job complete sooner.<\/p>\n<h3>Sorting in MapReduce<\/h3>\n<p>The MapReduce framework automatically sorts the keys generated by the mappers. Thus, before a reducer starts, all intermediate key-value pairs are sorted by key, not by value. The framework does not sort the values passed to each reducer; they can arrive in any order.<\/p>\n<p>Sorting in a MapReduce job helps the reducer easily distinguish when a new reduce call should start.<\/p>\n<p>This saves time for the reducer. The reducer starts a new reduce call when the next key in the sorted input differs from the previous one. Each reduce call takes a key and its associated values as input and generates key-value pairs as output.<\/p>\n<p>Note that shuffling and sorting in Hadoop MapReduce do not take place at all if you specify zero reducers (setNumReduceTasks(0)).<\/p>\n<p>With zero reducers, the MapReduce job stops at the map phase. The map phase then does not include any kind of sorting, which also makes it faster.<\/p>\n<h3>Secondary Sorting in MapReduce<\/h3>\n<p>If we want to sort the values passed to each reducer, we use the secondary sorting technique. It enables us to sort those values in ascending or descending order.<\/p>\n<h3>Conclusion<\/h3>\n<p>In conclusion, MapReduce shuffling and sorting occur simultaneously to group the mappers' intermediate output by key for the reducers. They do not take place if you specify zero reducers (setNumReduceTasks(0)).<\/p>\n<p>The framework sorts all intermediate key-value pairs by key, not by value. Sorting by value requires secondary sorting. 
If you have any suggestions or queries related to the MapReduce shuffling and sorting phase, please leave a comment in the comment box.<\/p>\n<p>We will be happy to answer them.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This Hadoop tutorial is all about MapReduce Shuffling and Sorting. Here we will provide you a detailed description of Hadoop\u00a0Shuffling and Sorting phase. Firstly we will discuss what is MapReduce Shuffling, next with MapReduce\u00a0Sorting,&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":73148,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[570],"tags":[538,457,541,575,598,599,600],"class_list":["post-1994","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-mapreduce","tag-apache-hadoop","tag-big-data","tag-hadoop","tag-hadoop-mapreduce","tag-mapreduce-shuffling","tag-mapreduce-shuffling-and-sorting","tag-mapreduce-sorting"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.7 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>MapReduce Shuffling and Sorting in Hadoop - TechVidvan<\/title>\n<meta name=\"description\" content=\"Hadoop MapReduce Shuffling and Sorting Tutorial cover What is data Shuffling in MapReduce,MapReduce Sorting in Hadoop,Secondary sorting in Hadoop MapReduce\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"MapReduce Shuffling and Sorting in Hadoop - TechVidvan\" \/>\n<meta property=\"og:description\" content=\"Hadoop MapReduce Shuffling and Sorting Tutorial cover What is data Shuffling in MapReduce,MapReduce 
Sorting in Hadoop,Secondary sorting in Hadoop MapReduce\" \/>\n<meta property=\"og:url\" content=\"https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/\" \/>\n<meta property=\"og:site_name\" content=\"TechVidvan\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/TechVidvan\/\" \/>\n<meta property=\"article:published_time\" content=\"2017-10-03T06:11:18+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Hadoop-Shuffling-Sorting.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"TechVidvan Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:site\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"TechVidvan Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"MapReduce Shuffling and Sorting in Hadoop - TechVidvan","description":"Hadoop MapReduce Shuffling and Sorting Tutorial cover What is data Shuffling in MapReduce,MapReduce Sorting in Hadoop,Secondary sorting in Hadoop MapReduce","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/","og_locale":"en_US","og_type":"article","og_title":"MapReduce Shuffling and Sorting in Hadoop - TechVidvan","og_description":"Hadoop MapReduce Shuffling and Sorting Tutorial cover What is data Shuffling in MapReduce,MapReduce Sorting in Hadoop,Secondary sorting in Hadoop MapReduce","og_url":"https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/","og_site_name":"TechVidvan","article_publisher":"https:\/\/www.facebook.com\/TechVidvan\/","article_published_time":"2017-10-03T06:11:18+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Hadoop-Shuffling-Sorting.jpg","type":"image\/jpeg"}],"author":"TechVidvan Team","twitter_card":"summary_large_image","twitter_creator":"@vidvantech","twitter_site":"@vidvantech","twitter_misc":{"Written by":"TechVidvan Team","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/#article","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/"},"author":{"name":"TechVidvan Team","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22"},"headline":"MapReduce Shuffling and Sorting in Hadoop","datePublished":"2017-10-03T06:11:18+00:00","mainEntityOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/"},"wordCount":472,"commentCount":0,"publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Hadoop-Shuffling-Sorting.jpg","keywords":["apache hadoop","big data","hadoop","hadoop mapreduce","MapReduce Shuffling","MapReduce Shuffling and Sorting","MapReduce Sorting"],"articleSection":["MapReduce Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/","url":"https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/","name":"MapReduce Shuffling and Sorting in Hadoop - 
TechVidvan","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/#website"},"primaryImageOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/#primaryimage"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Hadoop-Shuffling-Sorting.jpg","datePublished":"2017-10-03T06:11:18+00:00","description":"Hadoop MapReduce Shuffling and Sorting Tutorial cover What is data Shuffling in MapReduce,MapReduce Sorting in Hadoop,Secondary sorting in Hadoop MapReduce","breadcrumb":{"@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/#primaryimage","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Hadoop-Shuffling-Sorting.jpg","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Hadoop-Shuffling-Sorting.jpg","width":1200,"height":628,"caption":"Shuffling & Sorting in Hadoop MapReduce"},{"@type":"BreadcrumbList","@id":"https:\/\/techvidvan.com\/tutorials\/hadoop-mapreduce-shuffling-and-sorting\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/techvidvan.com\/tutorials\/"},{"@type":"ListItem","position":2,"name":"MapReduce Shuffling and Sorting in Hadoop"}]},{"@type":"WebSite","@id":"https:\/\/techvidvan.com\/tutorials\/#website","url":"https:\/\/techvidvan.com\/tutorials\/","name":"TechVidvan 
Blogs","description":"","publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/techvidvan.com\/tutorials\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/techvidvan.com\/tutorials\/#organization","name":"TechVidvan","url":"https:\/\/techvidvan.com\/tutorials\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","width":200,"height":50,"caption":"TechVidvan"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/TechVidvan\/","https:\/\/x.com\/vidvantech"]},{"@type":"Person","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22","name":"TechVidvan Team","description":"The TechVidvan Team delivers practical, beginner-friendly tutorials on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Our experts are here to help you upskill and excel in today\u2019s tech industry."}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/1994","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/comments?post=1994"}],"version-history":[{"count":0,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/1994\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media\/73148"}],"wp:attachment":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media?parent=1994"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/categories?post=1994"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/tags?post=1994"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}