{"id":1637,"date":"2018-04-21T10:39:33","date_gmt":"2018-04-21T10:39:33","guid":{"rendered":"https:\/\/techvidvan.com\/tutorials\/?p=1637"},"modified":"2018-04-21T10:39:33","modified_gmt":"2018-04-21T10:39:33","slug":"apache-pig-operators","status":"publish","type":"post","link":"https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/","title":{"rendered":"Apache Pig Operators with Syntax and Examples"},"content":{"rendered":"<p><span style=\"font-weight: 400\">Apache Pig provides a large set of operators. In this article, we discuss the main types of <strong>Apache Pig<\/strong> operators in detail. <\/span><\/p>\n<p><span style=\"font-weight: 400\">These include Diagnostic operators, Grouping &amp; Joining, Combining &amp; Splitting, and more. <\/span><span style=\"font-weight: 400\">Most of these types also have subtypes.<\/span><\/p>\n<p><span style=\"font-weight: 400\">We cover each operator in depth, along with its syntax and examples.<\/span><\/p>\n<h3>What are Apache Pig Operators?<\/h3>\n<p><span style=\"font-weight: 400\">Apache Pig offers operators for many kinds of operations. They fall into the following types:<\/span><\/p>\n<ol>\n<li><span style=\"font-weight: 400\">Diagnostic Operators<\/span><\/li>\n<li><span style=\"font-weight: 400\">Grouping &amp; Joining<\/span><\/li>\n<li><span style=\"font-weight: 400\">Combining &amp; Splitting<\/span><\/li>\n<li><span style=\"font-weight: 400\">Filtering<\/span><\/li>\n<li><span style=\"font-weight: 400\">Sorting<\/span><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400\">Let\u2019s discuss each type of Apache Pig operator in detail.<\/span><\/p>\n<h3>Types of Pig Operators<\/h3>\n<h4><span style=\"font-weight: 400\">i. Diagnostic Operators: Apache Pig Operators<\/span><\/h4>\n<p><span style=\"font-weight: 400\">We use diagnostic operators to verify the execution of the LOAD statement. 
There are four different types of diagnostic operators:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Dump operator<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Describe operator<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Explain operator<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Illustrate operator<\/span><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400\">Let\u2019s discuss each of these operators in depth.<\/span><\/p>\n<h5><span style=\"font-weight: 400\">a. Dump Operator<\/span><\/h5>\n<p><span style=\"font-weight: 400\">We use the Dump operator to run Pig Latin statements and display the results on the screen. Generally, we use it for debugging purposes.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Syntax<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The syntax of the Dump operator is:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Dump Relation_Name;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Example<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">In the following example, a dump is performed after each statement. The FILTER keeps only the names that start with S.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">A = LOAD 'Employee' AS (name:chararray, age:int, gpa:float);\nDUMP A;\n(Shubham,18,4.0F)\n(Pulkit,19,3.7F)\n(Shreyash,20,3.9F)\n(Mehul,22,3.8F)\n(Rishabh,20,4.0F)\nB = FILTER A BY name matches 'S.+';\nDUMP B;\n(Shubham,18,4.0F)\n(Shreyash,20,3.9F)<\/pre>\n<h5><span style=\"font-weight: 400\">b. 
Describe operator<\/span><\/h5>\n<p><span style=\"font-weight: 400\">To view the schema of a relation, we use the describe operator.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Syntax<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The syntax of the describe operator is:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Describe Relation_name;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Example<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Let\u2019s suppose we have a file Employee_data.txt in HDFS with the following content:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">001,mehul,chourey,9848022337,Hyderabad\n002,Ankur,Dutta,9848022338,Kolkata\n003,Shubham,Sengar,9848022339,Delhi\n004,Prerna,Tripathi,9848022330,Pune\n005,Sagar,Joshi,9848022336,Bhubaneswar\n006,Monika,sharma,9848022335,Chennai<\/pre>\n<p><span style=\"font-weight: 400\">Using the LOAD operator, we have read it into a relation named Employee.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Employee = LOAD 'hdfs:\/\/localhost:9000\/pig_data\/Employee_data.txt' USING PigStorage(',')\n  as ( id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray );<\/pre>\n<p><span style=\"font-weight: 400\">Now, let\u2019s describe the relation named Employee. 
This lets us verify the schema.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; describe Employee;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Output<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Executing the above Pig Latin statement produces the following output.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">Employee: { id: int,firstname: chararray,lastname: chararray,phone: chararray,city: chararray }<\/pre>\n<h5><span style=\"font-weight: 400\">c. Explain operator<\/span><\/h5>\n<p><span style=\"font-weight: 400\">To display the logical, physical, and MapReduce execution plans of a relation, we use the explain operator.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Syntax<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The syntax of the explain operator is:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; explain Relation_name;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Example<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Let\u2019s suppose we have a file Employee_data.txt in HDFS. 
Its content is:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">001,mehul,chourey,9848022337,Hyderabad\n002,Ankur,Dutta,9848022338,Kolkata\n003,Shubham,Sengar,9848022339,Delhi\n004,Prerna,Tripathi,9848022330,Pune\n005,Sagar,Joshi,9848022336,Bhubaneswar\n006,Monika,sharma,9848022335,Chennai<\/pre>\n<p><span style=\"font-weight: 400\">Using the LOAD operator, we have read it into a relation named Employee.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Employee = LOAD 'hdfs:\/\/localhost:9000\/pig_data\/Employee_data.txt' USING PigStorage(',')\n  as ( id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray );<\/pre>\n<p><span style=\"font-weight: 400\">Now, let\u2019s use the explain operator on the relation named Employee.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; explain Employee;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Output<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">It will produce the following output.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; explain Employee;\n2015-10-05 11:32:43,660 [main]\n2015-10-05 11:32:43,660 [main] INFO  org.apache.pig.newplan.logical.optimizer\n.LogicalPlanOptimizer -\n{RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator,\nGroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter,\nMergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer,\nPushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}\n#-----------------------------------------------\n# New Logical Plan:\n#-----------------------------------------------\nEmployee: (Name: LOStore Schema:\nid#31:int,firstname#32:chararray,lastname#33:chararray,phone#34:chararray,city#\n35:chararray)\n|\n|---Employee: (Name: LOForEach 
Schema:\nid#31:int,firstname#32:chararray,lastname#33:chararray,phone#34:chararray,city#\n35:chararray)\n   | |\n   | (Name: LOGenerate[false,false,false,false,false] Schema:\nid#31:int,firstname#32:chararray,lastname#33:chararray,phone#34:chararray,city#\n35:chararray)ColumnPrune:InputUids=[34, 35, 32, 33,\n31]ColumnPrune:OutputUids=[34, 35, 32, 33, 31]\n   | |   |\n   | |   (Name: Cast Type: int Uid: 31)\n   | |   |\n   | |   |---id:(Name: Project Type: bytearray Uid: 31 Input: 0 Column: (*))\n   | |   |\n   | |   (Name: Cast Type: chararray Uid: 32)\n   | |   |\n   | |   |---firstname:(Name: Project Type: bytearray Uid: 32 Input: 1\nColumn: (*))\n   | |   |\n   | |   (Name: Cast Type: chararray Uid: 33)\n   | |   |\n   | |   |---lastname:(Name: Project Type: bytearray Uid: 33 Input: 2\nColumn: (*))\n   | |   |\n   | |   (Name: Cast Type: chararray Uid: 34)\n   | |   |\n   | |   |---phone:(Name: Project Type: bytearray Uid: 34 Input: 3 Column:\n(*))\n   | |   |\n   | |   (Name: Cast Type: chararray Uid: 35)\n   | |   |\n   | |   |---city:(Name: Project Type: bytearray Uid: 35 Input: 4 Column:\n(*))\n   | |\n   | |---(Name: LOInnerLoad[0] Schema: id#31:bytearray)\n   | |\n   | |---(Name: LOInnerLoad[1] Schema: firstname#32:bytearray)\n   | |\n   | |---(Name: LOInnerLoad[2] Schema: lastname#33:bytearray)\n   | |\n   | |---(Name: LOInnerLoad[3] Schema: phone#34:bytearray)\n   | |\n   | |---(Name: LOInnerLoad[4] Schema: city#35:bytearray)\n   |\n   |---Employee: (Name: LOLoad Schema:\nid#31:bytearray,firstname#32:bytearray,lastname#33:bytearray,phone#34:bytearray\n,city#35:bytearray)RequiredFields:null\n#-----------------------------------------------\n# Physical Plan:\n#-----------------------------------------------\nEmployee: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-36\n|\n|---Employee: New For Each(false,false,false,false,false)[bag] - scope-35\n   | |\n   | Cast[int] - scope-21\n   | |\n   | |---Project[bytearray][0] - scope-20\n   | |\n   | 
Cast[chararray] - scope-24\n   | |\n   | |---Project[bytearray][1] - scope-23\n   | |\n   | Cast[chararray] - scope-27\n   | |\n   | |---Project[bytearray][2] - scope-26\n   | |\n   | Cast[chararray] - scope-30\n   | |\n   | |---Project[bytearray][3] - scope-29\n   | |\n   | Cast[chararray] - scope-33\n   | |\n   | |---Project[bytearray][4] - scope-32\n   |\n   |---Employee: Load(hdfs:\/\/localhost:9000\/pig_data\/Employee_data.txt:PigStorage(',')) - scope-19\n2015-10-05 11:32:43,682 [main]\nINFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler\nFile concatenation threshold: 100 optimistic? false\n2015-10-05 11:32:43,684 [main]\nINFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer -\nMR plan size before optimization: 1\n2015-10-05 11:32:43,685 [main]\nINFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.\nMultiQueryOptimizer - MR plan size after optimization: 1\n#--------------------------------------------------\n# Map Reduce Plan\n#--------------------------------------------------\nMapReduce node scope-37\nMap Plan\nEmployee: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-36\n|\n|---Employee: New For Each(false,false,false,false,false)[bag] - scope-35\n   | |\n   | Cast[int] - scope-21\n   | |\n   | |---Project[bytearray][0] - scope-20\n   | |\n   | Cast[chararray] - scope-24\n   | |\n   | |---Project[bytearray][1] - scope-23\n   | |\n   | Cast[chararray] - scope-27\n   | |\n   | |---Project[bytearray][2] - scope-26\n   | |\n   | Cast[chararray] - scope-30\n   | |\n   | |---Project[bytearray][3] - scope-29\n   | |\n   | Cast[chararray] - scope-33\n   | |\n   | |---Project[bytearray][4] - scope-32\n   |\n   |---Employee:\nLoad(hdfs:\/\/localhost:9000\/pig_data\/Employee_data.txt:PigStorage(',')) - scope-19--------\nGlobal sort: false\n----------------<\/pre>\n<h5><span style=\"font-weight: 400\">d. 
Illustrate operator<\/span><\/h5>\n<p><span style=\"font-weight: 400\">This operator shows the step-by-step execution of a sequence of statements on a small sample of the data.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Syntax<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The syntax of the illustrate operator is:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; illustrate Relation_name;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Example<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Let\u2019s suppose we have a file Employee_data.txt in HDFS. Its content is:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">001,mehul,chourey,9848022337,Hyderabad\n002,Ankur,Dutta,9848022338,Kolkata\n003,Shubham,Sengar,9848022339,Delhi\n004,Prerna,Tripathi,9848022330,Pune\n005,Sagar,Joshi,9848022336,Bhubaneswar\n006,Monika,sharma,9848022335,Chennai<\/pre>\n<p><span style=\"font-weight: 400\">Using the LOAD operator, we have read it into a relation named Employee.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Employee = LOAD 'hdfs:\/\/localhost:9000\/pig_data\/Employee_data.txt' USING PigStorage(',')\n  as ( id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray );<\/pre>\n<p><span style=\"font-weight: 400\">Now, let\u2019s illustrate the relation named Employee.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; illustrate Employee;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Output<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Executing the above statement produces the following output.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; illustrate Employee;\nINFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases\nbeing processed per 
job phase (AliasName[line,offset]): M: Employee[1,10] C:  R:<\/pre>\n<table>\n<tbody>\n<tr>\n<td><span style=\"font-weight: 400\">Employee<\/span><\/td>\n<td><span style=\"font-weight: 400\">id:int<\/span><\/td>\n<td><span style=\"font-weight: 400\">firstname:chararray<\/span><\/td>\n<td><span style=\"font-weight: 400\">lastname:chararray<\/span><\/td>\n<td><span style=\"font-weight: 400\">phone:chararray<\/span><\/td>\n<td><span style=\"font-weight: 400\">city:chararray<\/span><\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><span style=\"font-weight: 400\">002<\/span><\/td>\n<td><span style=\"font-weight: 400\">Ankur<\/span><\/td>\n<td><span style=\"font-weight: 400\">Dutta<\/span><\/td>\n<td><span style=\"font-weight: 400\">9848022338<\/span><\/td>\n<td><span style=\"font-weight: 400\">Kolkata<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h4><span style=\"font-weight: 400\">ii. Grouping &amp; Joining: Apache Pig Operators<\/span><\/h4>\n<p><span style=\"font-weight: 400\">There are four types of Grouping and Joining operators:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Group Operator<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Cogroup Operator<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Join Operator<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Cross Operator<\/span><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400\">Let\u2019s discuss them in depth:<\/span><\/p>\n<h5><span style=\"font-weight: 400\">a. 
Group Operator<\/span><\/h5>\n<p><span style=\"font-weight: 400\">To group the data in one or more relations, we use the GROUP operator.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Syntax<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The syntax of the group operator, here grouping by a field named age, is:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Group_data = GROUP Relation_name BY age;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Example<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Let\u2019s suppose that we have a file named Employee_details.txt in the HDFS directory \/pig_data\/.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">Employee_details.txt\n001,mehul,chourey,21,9848022337,Hyderabad\n002,Ankur,Dutta,22,9848022338,Kolkata\n003,Shubham,Sengar,22,9848022339,Delhi\n004,Prerna,Tripathi,21,9848022330,Pune\n005,Sagar,Joshi,23,9848022336,Bhubaneswar\n006,Monika,sharma,23,9848022335,Chennai\n007,pulkit,pawar,24,9848022334,trivandrum\n008,Roshan,Shaikh,24,9848022333,Chennai<\/pre>\n<p><span style=\"font-weight: 400\">We have loaded this file into Apache Pig under the relation name Employee_details.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Employee_details = LOAD 'hdfs:\/\/localhost:9000\/pig_data\/Employee_details.txt' USING PigStorage(',')\n  as (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);<\/pre>\n<p><span style=\"font-weight: 400\">Now, let\u2019s group the records\/tuples in the relation by age.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; group_data = GROUP Employee_details by age;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Verification<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Then, using the DUMP operator, 
verify the relation group_data.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Dump group_data;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Output<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The output displays the contents of the relation named group_data. We can observe that the resulting schema has two columns:<\/span><\/p>\n<p><span style=\"font-weight: 400\">The first is age, the key by which the relation is grouped.<\/span><\/p>\n<p><span style=\"font-weight: 400\">The second is a bag, which contains the group of tuples, that is, the Employee records with the respective age.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">(21,{(4,Prerna,Tripathi,21,9848022330,Pune),(1,mehul,chourey,21,9848022337,Hyderabad)})\n(22,{(3,Shubham,Sengar,22,9848022339,Delhi),(2,Ankur,Dutta,22,9848022338,Kolkata)})\n(23,{(6,Monika,sharma,23,9848022335,Chennai),(5,Sagar,Joshi,23,9848022336,Bhubaneswar)})\n(24,{(8,Roshan,Shaikh,24,9848022333,Chennai),(7,pulkit,pawar,24,9848022334,trivandrum)})<\/pre>\n<p><span style=\"font-weight: 400\">After grouping the data, we can use the describe command to see the schema of the grouped relation.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Describe group_data;\ngroup_data: {group: int,Employee_details: {(id: int,firstname: chararray,\n              lastname: chararray,age: int,phone: chararray,city: chararray)}}<\/pre>\n<p><span style=\"font-weight: 400\">Similarly, using the illustrate command we can get a sample illustration of the schema.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; illustrate group_data;<\/pre>\n<p><span style=\"font-weight: 400\">The output is:<\/span><\/p>\n<table>\n<tbody>\n<tr>\n<td><span style=\"font-weight: 400\">group_data<\/span><\/td>\n<td><span style=\"font-weight: 400\">group:int<\/span><\/td>\n<td><span style=\"font-weight: 
400\">Employee_details:bag{:tuple(id:int,firstname:chararray,lastname:chararray,age:int,phone:chararray,city:chararray)}<\/span><\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><span style=\"font-weight: 400\">21<\/span><\/td>\n<td><span style=\"font-weight: 400\">{(4,Prerna,Tripathi,21,9848022330,Pune),(1,mehul,chourey,21,9848022337,Hyderabad)}<\/span><\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><span style=\"font-weight: 400\">22<\/span><\/td>\n<td><span style=\"font-weight: 400\">{(2,Ankur,Dutta,22,9848022338,Kolkata),(3,Shubham,Sengar,22,9848022339,Delhi)}<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<ul>\n<li><span style=\"font-weight: 400\">Grouping by Multiple Columns<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Next, let\u2019s group the relation by age and city.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; group_multiple = GROUP Employee_details by (age, city);<\/pre>\n<p><span style=\"font-weight: 400\">Now, using the Dump operator, we can verify the content of the relation named group_multiple.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Dump group_multiple;\n((21,Pune),{(4,Prerna,Tripathi,21,9848022330,Pune)})\n((21,Hyderabad),{(1,mehul,chourey,21,9848022337,Hyderabad)})\n((22,Delhi),{(3,Shubham,Sengar,22,9848022339,Delhi)})\n((22,Kolkata),{(2,Ankur,Dutta,22,9848022338,Kolkata)})\n((23,Chennai),{(6,Monika,sharma,23,9848022335,Chennai)})\n((23,Bhubaneswar),{(5,Sagar,Joshi,23,9848022336,Bhubaneswar)})\n((24,Chennai),{(8,Roshan,Shaikh,24,9848022333,Chennai)})\n((24,trivandrum),{(7,pulkit,pawar,24,9848022334,trivandrum)})<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Group All<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Using the ALL keyword, we can place every tuple of a relation into a single group.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; group_all = GROUP 
Employee_details All;<\/pre>\n<p><span style=\"font-weight: 400\">Now, verify the content of the relation group_all.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Dump group_all;\n(all,{(8,Roshan,Shaikh,24,9848022333,Chennai),(7,pulkit,pawar,24,9848022334,trivandrum),\n(6,Monika,sharma,23,9848022335,Chennai),(5,Sagar,Joshi,23,9848022336,Bhubaneswar),\n(4,Prerna,Tripathi,21,9848022330,Pune),(3,Shubham,Sengar,22,9848022339,Delhi),\n(2,Ankur,Dutta,22,9848022338,Kolkata),(1,mehul,chourey,21,9848022337,Hyderabad)})<\/pre>\n<h5><span style=\"font-weight: 400\">b. Cogroup Operator<\/span><\/h5>\n<p><span style=\"font-weight: 400\">The COGROUP operator works in much the same way as the GROUP operator. The only difference is that we normally use the group operator with one relation, whereas we use the cogroup operator in statements involving two or more relations.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Grouping Two Relations using Cogroup<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Let\u2019s suppose we have two files, Employee_details.txt and Clients_details.txt, in the HDFS directory \/pig_data\/. 
<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">Employee_details.txt\n001,mehul,chourey,21,9848022337,Hyderabad\n002,Ankur,Dutta,22,9848022338,Kolkata\n003,Shubham,Sengar,22,9848022339,Delhi\n004,Prerna,Tripathi,21,9848022330,Pune\n005,Sagar,Joshi,23,9848022336,Bhubaneswar\n006,Monika,sharma,23,9848022335,Chennai\n007,pulkit,pawar,24,9848022334,trivandrum\n008,Roshan,Shaikh,24,9848022333,Chennai\nClients_details.txt\n001,Kajal,22,new york\n002,Vaishnavi,23,Kolkata\n003,Twinkle,23,Tokyo\n004,Manish,25,London\n005,Purva,23,Bhubaneswar\n006,Vishal,22,Chennai<\/pre>\n<p><span style=\"font-weight: 400\">We have loaded these files into Pig under the relation names Employee_details and Clients_details, respectively.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Employee_details = LOAD 'hdfs:\/\/localhost:9000\/pig_data\/Employee_details.txt' USING PigStorage(',')\n  as (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);\ngrunt&gt; Clients_details = LOAD 'hdfs:\/\/localhost:9000\/pig_data\/Clients_details.txt' USING PigStorage(',')\n  as (id:int, name:chararray, age:int, city:chararray);<\/pre>\n<p><span style=\"font-weight: 400\">Now, let\u2019s group the records\/tuples of the relations Employee_details and Clients_details with the key age.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; cogroup_data = COGROUP Employee_details by age, Clients_details by age;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Verification<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Using the DUMP operator, verify the relation cogroup_data.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Dump cogroup_data;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Output<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Now, 
dumping the relation named cogroup_data produces the following output.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">(21,{(4,Prerna,Tripathi,21,9848022330,Pune),(1,mehul,chourey,21,9848022337,Hyderabad)},\n  { })\n(22,{(3,Shubham,Sengar,22,9848022339,Delhi),(2,Ankur,Dutta,22,9848022338,Kolkata)},\n  {(6,Vishal,22,Chennai),(1,Kajal,22,new york)})\n(23,{(6,Monika,sharma,23,9848022335,Chennai),(5,Sagar,Joshi,23,9848022336,Bhubaneswar)},\n  {(5,Purva,23,Bhubaneswar),(3,Twinkle,23,Tokyo),(2,Vaishnavi,23,Kolkata)})\n(24,{(8,Roshan,Shaikh,24,9848022333,Chennai),(7,pulkit,pawar,24,9848022334,trivandrum)},\n  { })\n(25,{ },\n  {(4,Manish,25,London)})<\/pre>\n<p><span style=\"font-weight: 400\">Here, the cogroup operator groups the tuples from each relation according to age, and each group represents a particular age value.<\/span><\/p>\n<p><span style=\"font-weight: 400\">For example, consider the 1st tuple of the result, which is grouped by age 21. It contains two bags:<\/span><\/p>\n<p><span style=\"font-weight: 400\">The first bag holds all the tuples from the first relation (Employee_details in this case) having age 21.<\/span><\/p>\n<p><span style=\"font-weight: 400\">The second bag holds all the tuples from the second relation (Clients_details in this case) having age 21.<\/span><\/p>\n<p><span style=\"font-weight: 400\">When a relation has no tuples with a given age value, cogroup returns an empty bag for it. Here, Clients_details has no tuples with age 21, so the second bag is empty.<\/span><\/p>\n<h5><span style=\"font-weight: 400\">c. Join Operator<\/span><\/h5>\n<p><span style=\"font-weight: 400\">To combine records from two or more relations, we use the JOIN operator. While performing a join operation, we declare one field (or a group of fields) from each relation as keys. <\/span><\/p>\n<p><span style=\"font-weight: 400\">When these keys match, the two corresponding tuples are matched; otherwise, the records are dropped. 
<\/span><\/p>\n<p><span style=\"font-weight: 400\">There are several types of joins:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Self-join<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Inner-join<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Outer-join: left join, right join, and full join<\/span><\/li>\n<\/ol>\n<h5><span style=\"font-weight: 400\">d. Cross Operator<\/span><\/h5>\n<p><span style=\"font-weight: 400\">It computes the cross-product of two or more relations. <\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Syntax<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The syntax of the CROSS operator is:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Relation3_name = CROSS Relation1_name, Relation2_name;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Example<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Let\u2019s suppose we have two files, Users.txt and orders.txt, in the \/pig_data\/ directory of HDFS.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">Users.txt\n1,Sanjeev,32,Ahmedabad,2000.00\n2,Ankit,25,Delhi,1500.00\n3,Raj,23,Kota,2000.00\n4,Sumit,25,Mumbai,6500.00\n5,Pankaj,27,Bhopal,8500.00\n6,Vishnu,22,MP,4500.00\n7,Ravi,24,Indore,10000.00\norders.txt\n102,2009-10-08 00:00:00,3,3000\n100,2009-10-08 00:00:00,3,1500\n101,2009-11-20 00:00:00,2,1560\n103,2008-05-20 00:00:00,4,2060<\/pre>\n<p><span style=\"font-weight: 400\">We have loaded these two files into Pig as the relations Users and orders.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Users = LOAD 
'hdfs:\/\/localhost:9000\/pig_data\/Users.txt' USING PigStorage(',')\n  as (id:int, name:chararray, age:int, address:chararray, salary:int);\ngrunt&gt; orders = LOAD 'hdfs:\/\/localhost:9000\/pig_data\/orders.txt' USING PigStorage(',')\n  as (oid:int, date:chararray, customer_id:int, amount:int);<\/pre>\n<p><span style=\"font-weight: 400\">Now, let\u2019s use the cross operator to get the cross-product of these two relations.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; cross_data = CROSS Users, orders;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Verification<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Now, using the DUMP operator, verify the relation cross_data.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Dump cross_data;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Output<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Dumping the relation cross_data produces the following output: each of the 7 Users tuples is paired with each of the 4 orders tuples, giving 28 tuples.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">(7,Ravi,24,Indore,10000,103,2008-05-20 00:00:00,4,2060)\n(7,Ravi,24,Indore,10000,101,2009-11-20 00:00:00,2,1560)\n(7,Ravi,24,Indore,10000,100,2009-10-08 00:00:00,3,1500)\n(7,Ravi,24,Indore,10000,102,2009-10-08 00:00:00,3,3000)\n(6,Vishnu,22,MP,4500,103,2008-05-20 00:00:00,4,2060)\n(6,Vishnu,22,MP,4500,101,2009-11-20 00:00:00,2,1560)\n(6,Vishnu,22,MP,4500,100,2009-10-08 00:00:00,3,1500)\n(6,Vishnu,22,MP,4500,102,2009-10-08 00:00:00,3,3000)\n(5,Pankaj,27,Bhopal,8500,103,2008-05-20 00:00:00,4,2060)\n(5,Pankaj,27,Bhopal,8500,101,2009-11-20 00:00:00,2,1560)\n(5,Pankaj,27,Bhopal,8500,100,2009-10-08 00:00:00,3,1500)\n(5,Pankaj,27,Bhopal,8500,102,2009-10-08 00:00:00,3,3000)\n(4,Sumit,25,Mumbai,6500,103,2008-05-20 00:00:00,4,2060)\n(4,Sumit,25,Mumbai,6500,101,2009-11-20 00:00:00,2,1560)\n(4,Sumit,25,Mumbai,6500,100,2009-10-08 00:00:00,3,1500)\n(4,Sumit,25,Mumbai,6500,102,2009-10-08 00:00:00,3,3000)\n(3,Raj,23,Kota,2000,103,2008-05-20 00:00:00,4,2060)\n(3,Raj,23,Kota,2000,101,2009-11-20 00:00:00,2,1560)\n(3,Raj,23,Kota,2000,100,2009-10-08 00:00:00,3,1500)\n(3,Raj,23,Kota,2000,102,2009-10-08 00:00:00,3,3000)\n(2,Ankit,25,Delhi,1500,103,2008-05-20 00:00:00,4,2060)\n(2,Ankit,25,Delhi,1500,101,2009-11-20 00:00:00,2,1560)\n(2,Ankit,25,Delhi,1500,100,2009-10-08 00:00:00,3,1500)\n(2,Ankit,25,Delhi,1500,102,2009-10-08 00:00:00,3,3000)\n(1,Sanjeev,32,Ahmedabad,2000,103,2008-05-20 00:00:00,4,2060)\n(1,Sanjeev,32,Ahmedabad,2000,101,2009-11-20 00:00:00,2,1560)\n(1,Sanjeev,32,Ahmedabad,2000,100,2009-10-08 00:00:00,3,1500)\n(1,Sanjeev,32,Ahmedabad,2000,102,2009-10-08 00:00:00,3,3000)<\/pre>\n<h4><span style=\"font-weight: 400\">iii. Combining &amp; Splitting: Apache Pig Operators<\/span><\/h4>\n<p><span style=\"font-weight: 400\">These are of two types:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Union<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Split<\/span><\/li>\n<\/ol>\n<h5><span style=\"font-weight: 400\">a. Union Operator<\/span><\/h5>\n<p><span style=\"font-weight: 400\">To merge the content of two relations, we use the UNION operator of Pig Latin. 
Note that for a meaningful UNION, the two relations should have identical columns and domains; Pig itself does not enforce this.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Syntax<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The syntax of the UNION operator is:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Relation_name3 = UNION Relation_name1, Relation_name2;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Example<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Let\u2019s suppose we have two files, namely Employee_data1.txt and Employee_data2.txt, in the \/pig_data\/ directory of HDFS.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">Employee_data1.txt\n001,mehul,chourey,9848022337,Hyderabad\n002,Ankur,Dutta,9848022338,Kolkata\n003,Shubham,Sengar,9848022339,Delhi\n004,Prerna,Tripathi,9848022330,Pune\n005,Sagar,Joshi,9848022336,Bhubaneswar\n006,Monika,sharma,9848022335,Chennai\nEmployee_data2.txt\n7,Prachi,Yadav,9848022334,trivendram\n8,Avikal,Singh,9848022333,Chennai<\/pre>\n<p><span style=\"font-weight: 400\">We have loaded these two files into Pig as the relations Employee1 and Employee2.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Employee1 = LOAD 'hdfs:\/\/localhost:9000\/pig_data\/Employee_data1.txt' USING PigStorage(',')\n  as (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);\ngrunt&gt; Employee2 = LOAD 'hdfs:\/\/localhost:9000\/pig_data\/Employee_data2.txt' USING PigStorage(',')\n  as (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);<\/pre>\n<p><span style=\"font-weight: 400\">Using the UNION operator, let\u2019s now merge the contents of these two relations.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Employee = UNION Employee1, 
Employee2;<\/pre>\n<ul>\n<li 
style=\"font-weight: 400\"><span style=\"font-weight: 400\">Verification<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Now, using the DUMP operator, verify the relation Employee.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Dump Employee;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Output<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Displaying the contents of the relation Employee produces the following output.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">(1,mehul,chourey,9848022337,Hyderabad)\n(2,Ankur,Dutta,9848022338,Kolkata)\n(3,Shubham,Sengar,9848022339,Delhi)\n(4,Prerna,Tripathi,9848022330,Pune)\n(5,Sagar,Joshi,9848022336,Bhubaneswar)\n(6,Monika,sharma,9848022335,Chennai)\n(7,Prachi,Yadav,9848022334,trivendram)\n(8,Avikal,Singh,9848022333,Chennai)<\/pre>\n<h5><span style=\"font-weight: 400\">b. Split Operator<\/span><\/h5>\n<p><span style=\"font-weight: 400\">To split a relation into two or more relations, we use the SPLIT operator.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Syntax<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The syntax of the SPLIT operator is:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; SPLIT Relation1_name INTO Relation2_name IF (condition1), Relation3_name IF (condition2);<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Example<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Let\u2019s suppose that we have a file named Employee_details.txt in the HDFS directory \/pig_data\/ as shown below.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" 
data-enlighter-language=\"null\">Employee_details.txt\n001,mehul,chourey,21,9848022337,Hyderabad\n002,Ankur,Dutta,22,9848022338,Kolkata\n003,Shubham,Sengar,22,9848022339,Delhi\n004,Prerna,Tripathi,21,9848022330,Pune\n005,Sagar,Joshi,23,9848022336,Bhubaneswar\n006,Monika,sharma,23,9848022335,Chennai\n007,pulkit,pawar,24,9848022334,trivandrum\n008,Roshan,Shaikh,24,9848022333,Chennai<\/pre>\n<p><span style=\"font-weight: 400\">We have loaded this file into Pig with the relation name Employee_details.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">Employee_details = LOAD 'hdfs:\/\/localhost:9000\/pig_data\/Employee_details.txt' USING PigStorage(',')\n  as (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);<\/pre>\n<p><span style=\"font-weight: 400\">Now, let\u2019s split the relation into two:<\/span><\/p>\n<p><span style=\"font-weight: 400\">the first listing the employees whose age is less than 23, <\/span><\/p>\n<p><span style=\"font-weight: 400\">the second listing the employees whose age is between 22 and 25.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">SPLIT Employee_details into Employee_details1 if age&lt;23, Employee_details2 if (22&lt;age and age&lt;25);<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Verification<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Using the DUMP operator, verify the relations Employee_details1 and Employee_details2.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Dump Employee_details1;\ngrunt&gt; Dump Employee_details2;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Output<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Displaying the contents of the relations Employee_details1 and Employee_details2 respectively produces the following output.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" 
data-enlighter-language=\"null\">grunt&gt; Dump Employee_details1;\n(1,mehul,chourey,21,9848022337,Hyderabad)\n(2,Ankur,Dutta,22,9848022338,Kolkata)\n(3,Shubham,Sengar,22,9848022339,Delhi)\n(4,Prerna,Tripathi,21,9848022330,Pune)\ngrunt&gt; Dump Employee_details2;\n(5,Sagar,Joshi,23,9848022336,Bhubaneswar)\n(6,Monika,sharma,23,9848022335,Chennai)\n(7,pulkit,pawar,24,9848022334,trivandrum)\n(8,Roshan,Shaikh,24,9848022333,Chennai)<\/pre>\n<h4><span style=\"font-weight: 400\">iv. Filtering:\u00a0Apache Pig Operators<\/span><\/h4>\n<p><span style=\"font-weight: 400\">These are of three types:<\/span><\/p>\n<ol>\n<li style=\"list-style-type: none\">\n<ol>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Filter<\/span><\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<ol>\n<li style=\"list-style-type: none\">\n<ol>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Distinct<\/span><\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<ol>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">For Each<\/span><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400\">Now, let\u2019s discuss each in detail:<\/span><\/p>\n<h5><span style=\"font-weight: 400\">a. 
Filter Operator<\/span><\/h5>\n<p><span style=\"font-weight: 400\">To select the required tuples from a relation based on a condition, we use the FILTER operator.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Syntax<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The syntax of the FILTER operator is:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Relation2_name = FILTER Relation1_name BY (condition);<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Example<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Let\u2019s suppose we have a file named Employee_details.txt in the HDFS directory \/pig_data\/ as shown below.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">Employee_details.txt\n001,mehul,chourey,21,9848022337,Hyderabad\n002,Ankur,Dutta,22,9848022338,Kolkata\n003,Shubham,Sengar,22,9848022339,Delhi\n004,Prerna,Tripathi,21,9848022330,Pune\n005,Sagar,Joshi,23,9848022336,Bhubaneswar\n006,Monika,sharma,23,9848022335,Chennai\n007,pulkit,pawar,24,9848022334,trivandrum\n008,Roshan,Shaikh,24,9848022333,Chennai<\/pre>\n<p><span style=\"font-weight: 400\">We have loaded this file into Pig with the relation name Employee_details.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Employee_details = LOAD 'hdfs:\/\/localhost:9000\/pig_data\/Employee_details.txt' USING PigStorage(',')\n  as (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);<\/pre>\n<p><span style=\"font-weight: 400\">Now, to get the details of the employees who belong to the city Chennai, let\u2019s use the FILTER operator.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">filter_data = FILTER Employee_details BY city == 'Chennai';<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Verification<\/span><\/li>\n<\/ul>\n<p><span 
style=\"font-weight: 400\">Using the DUMP operator, verify the relation filter_data.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Dump filter_data;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Output<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Displaying the contents of the relation filter_data produces the following output.<\/span><\/p>\n<p><span style=\"font-weight: 400\">(6,Monika,sharma,23,9848022335,Chennai)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(8,Roshan,Shaikh,24,9848022333,Chennai)<\/span><\/p>\n<h5><span style=\"font-weight: 400\">b. The DISTINCT operator<\/span><\/h5>\n<p><span style=\"font-weight: 400\">To remove redundant (duplicate) tuples from a relation, we use the DISTINCT operator.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Syntax<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The syntax of the DISTINCT operator is:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Relation_name2 = DISTINCT Relation_name1;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Example<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Let\u2019s suppose that we have a file named Employee_details.txt in the HDFS directory \/pig_data\/ as shown below.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">Employee_details.txt\n001,mehul,chourey,9848022337,Hyderabad\n002,Ankur,Dutta,9848022338,Kolkata\n002,Ankur,Dutta,9848022338,Kolkata\n003,Shubham,Sengar,9848022339,Delhi\n003,Shubham,Sengar,9848022339,Delhi\n004,Prerna,Tripathi,9848022330,Pune\n005,Sagar,Joshi,9848022336,Bhubaneswar\n006,Monika,sharma,9848022335,Chennai\n006,Monika,sharma,9848022335,Chennai<\/pre>\n<p><span style=\"font-weight: 400\">We have loaded this file into Pig with the relation name Employee_details.<\/span><\/p>\n<pre 
class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Employee_details = LOAD 'hdfs:\/\/localhost:9000\/pig_data\/Employee_details.txt' USING PigStorage(',')\n  as (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);<\/pre>\n<p><span style=\"font-weight: 400\">Now, using the DISTINCT operator, let\u2019s remove the redundant (duplicate) tuples from the relation named Employee_details and store the result in another relation named distinct_data.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; distinct_data = DISTINCT Employee_details;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Verification<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Using the DUMP operator, verify the relation distinct_data.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Dump distinct_data;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Output<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Displaying the contents of the relation distinct_data produces the following output.<\/span><\/p>\n<p><span style=\"font-weight: 400\">(1,mehul,chourey,9848022337,Hyderabad)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(2,Ankur,Dutta,9848022338,Kolkata)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(3,Shubham,Sengar,9848022339,Delhi)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(4,Prerna,Tripathi,9848022330,Pune)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(5,Sagar,Joshi,9848022336,Bhubaneswar)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(6,Monika,sharma,9848022335,Chennai)<\/span><\/p>\n<h5><span style=\"font-weight: 400\">c. 
FOREACH Operator<\/span><\/h5>\n<p><span style=\"font-weight: 400\">To generate specified data transformations based on the column data, we use the FOREACH operator.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Syntax<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The syntax of the FOREACH operator is:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Relation_name2 = FOREACH Relation_name1 GENERATE (required data);<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Example<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Let\u2019s suppose we have a file named Employee_details.txt in the HDFS directory \/pig_data\/. <\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">Employee_details.txt\n001,mehul,chourey,21,9848022337,Hyderabad\n002,Ankur,Dutta,22,9848022338,Kolkata\n003,Shubham,Sengar,22,9848022339,Delhi\n004,Prerna,Tripathi,21,9848022330,Pune\n005,Sagar,Joshi,23,9848022336,Bhubaneswar\n006,Monika,sharma,23,9848022335,Chennai\n007,pulkit,pawar,24,9848022334,trivandrum\n008,Roshan,Shaikh,24,9848022333,Chennai<\/pre>\n<p><span style=\"font-weight: 400\">We have loaded this file into Pig with the relation name Employee_details.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Employee_details = LOAD 'hdfs:\/\/localhost:9000\/pig_data\/Employee_details.txt' USING PigStorage(',')\n  as (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);<\/pre>\n<p><span style=\"font-weight: 400\">Now, using the FOREACH operator, let\u2019s get the id, age, and city values of each employee from the relation Employee_details and store them in another relation named foreach_data.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; foreach_data = FOREACH Employee_details GENERATE id,age,city;<\/pre>\n<ul>\n<li 
style=\"font-weight: 400\"><span style=\"font-weight: 400\">Verification<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Also, using the DUMP operator, verify the relation foreach_data.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Dump foreach_data;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Output<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">By displaying the contents of the relation foreach_data, it will produce the following output.<\/span><\/p>\n<p><span style=\"font-weight: 400\">(1,21,Hyderabad)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(2,22,Kolkata)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(3,22,Delhi)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(4,21,Pune) <\/span><\/p>\n<p><span style=\"font-weight: 400\">(5,23,Bhubaneswar)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(6,23,Chennai) <\/span><\/p>\n<p><span style=\"font-weight: 400\">(7,24,trivandrum)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(8,24,Chennai) <\/span><\/p>\n<h4><span style=\"font-weight: 400\">v. Sorting:\u00a0Apache Pig Operators<\/span><\/h4>\n<p><span style=\"font-weight: 400\">These are of two types, <\/span><\/p>\n<ol>\n<li style=\"list-style-type: none\">\n<ol>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Order By<\/span><\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<ol>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Limit<\/span><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400\">Let\u2019s discuss both in detail:<\/span><\/p>\n<h5><span style=\"font-weight: 400\">a. 
ORDER BY operator<\/span><\/h5>\n<p><span style=\"font-weight: 400\">To display the contents of a relation in sorted order based on one or more fields, we use the ORDER BY operator.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Syntax<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The syntax of the ORDER BY operator is:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Relation_name2 = ORDER Relation_name1 BY field_name (ASC|DESC);<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Example<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Let\u2019s suppose we have a file named Employee_details.txt in the HDFS directory \/pig_data\/.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">Employee_details.txt\n001,mehul,chourey,21,9848022337,Hyderabad\n002,Ankur,Dutta,22,9848022338,Kolkata\n003,Shubham,Sengar,22,9848022339,Delhi\n004,Prerna,Tripathi,21,9848022330,Pune\n005,Sagar,Joshi,23,9848022336,Bhubaneswar\n006,Monika,sharma,23,9848022335,Chennai\n007,pulkit,pawar,24,9848022334,trivandrum\n008,Roshan,Shaikh,24,9848022333,Chennai<\/pre>\n<p><span style=\"font-weight: 400\">We have loaded this file into Pig with the relation name Employee_details.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Employee_details = LOAD 'hdfs:\/\/localhost:9000\/pig_data\/Employee_details.txt' USING PigStorage(',')\n  as (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);<\/pre>\n<p><span style=\"font-weight: 400\">Now, let\u2019s sort the relation in descending order based on the age of the employee. 
Using the ORDER BY operator, we store the result in another relation named order_by_data.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; order_by_data = ORDER Employee_details BY age DESC;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Verification<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Further, using the DUMP operator, verify the relation order_by_data.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Dump order_by_data;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Output<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Displaying the contents of the relation order_by_data produces the following output.<\/span><\/p>\n<p><span style=\"font-weight: 400\">(8,Roshan,Shaikh,24,9848022333,Chennai)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(7,pulkit,pawar,24,9848022334,trivandrum)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(6,Monika,sharma,23,9848022335,Chennai)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(5,Sagar,Joshi,23,9848022336,Bhubaneswar)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(3,Shubham,Sengar,22,9848022339,Delhi)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(2,Ankur,Dutta,22,9848022338,Kolkata)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(4,Prerna,Tripathi,21,9848022330,Pune)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(1,mehul,chourey,21,9848022337,Hyderabad)<\/span><\/p>\n<h5><span style=\"font-weight: 400\">b. 
LIMIT operator<\/span><\/h5>\n<p><span style=\"font-weight: 400\">To get a limited number of tuples from a relation, we use the LIMIT operator.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Syntax<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">The syntax of the LIMIT operator is:<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Result = LIMIT Relation_name required number of tuples;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Example<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Assume that we have a file named Employee_details.txt in the HDFS directory \/pig_data\/ as shown below.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">Employee_details.txt\n001,mehul,chourey,21,9848022337,Hyderabad\n002,Ankur,Dutta,22,9848022338,Kolkata\n003,Shubham,Sengar,22,9848022339,Delhi\n004,Prerna,Tripathi,21,9848022330,Pune\n005,Sagar,Joshi,23,9848022336,Bhubaneswar\n006,Monika,sharma,23,9848022335,Chennai\n007,pulkit,pawar,24,9848022334,trivandrum\n008,Roshan,Shaikh,24,9848022333,Chennai<\/pre>\n<p><span style=\"font-weight: 400\">We have loaded this file into Pig with the relation name Employee_details.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Employee_details = LOAD 'hdfs:\/\/localhost:9000\/pig_data\/Employee_details.txt' USING PigStorage(',')\n  as (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);<\/pre>\n<p><span style=\"font-weight: 400\">Now, using the LIMIT operator, let\u2019s get four tuples from the relation and store them in another relation named limit_data. Note that without a preceding ORDER BY, Pig does not guarantee which tuples are returned. 
<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; limit_data = LIMIT Employee_details 4;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Verification<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Further, using the DUMP operator, verify the relation limit_data.<\/span><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">grunt&gt; Dump limit_data;<\/pre>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Output<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Displaying the contents of the relation limit_data produces the following output.<\/span><\/p>\n<p><span style=\"font-weight: 400\">(1,mehul,chourey,21,9848022337,Hyderabad)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(2,Ankur,Dutta,22,9848022338,Kolkata)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(3,Shubham,Sengar,22,9848022339,Delhi)<\/span><\/p>\n<p><span style=\"font-weight: 400\">(4,Prerna,Tripathi,21,9848022330,Pune)<\/span><\/p>\n<p>This was all on Apache Pig Operators.<\/p>\n<h3>Conclusion: Apache Pig Operators<\/h3>\n<p><span style=\"font-weight: 400\">In this article, we have seen the Apache Pig operators in detail, along with their examples. If you have any queries, feel free to share them.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>There is a huge set of Apache Pig Operators available in Apache Pig. In this article, \u201cIntroduction to Apache Pig Operators\u201d we will discuss all types of Apache Pig Operators in detail. 
Such as&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":73041,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[544],"tags":[1002,1003,1004,1005,1006,1007],"class_list":["post-1637","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-hadoop","tag-apache-pig-operators-tutorial","tag-describe-operator","tag-dump-operator","tag-explanation-operator","tag-introduction-to-apache-pig-operators","tag-types-of-pig-operators"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.7 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Apache Pig Operators with Syntax and Examples - TechVidvan<\/title>\n<meta name=\"description\" content=\"Apache Pig Operators: Learn Pig Operators and Types of operators in Pig along like Dump Operators, Describe Operators, Explanations Pig Operators.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Apache Pig Operators with Syntax and Examples - TechVidvan\" \/>\n<meta property=\"og:description\" content=\"Apache Pig Operators: Learn Pig Operators and Types of operators in Pig along like Dump Operators, Describe Operators, Explanations Pig Operators.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/\" \/>\n<meta property=\"og:site_name\" content=\"TechVidvan\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/TechVidvan\/\" \/>\n<meta property=\"article:published_time\" content=\"2018-04-21T10:39:33+00:00\" \/>\n<meta property=\"og:image\" 
content=\"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Apache-Pig-Operators-01.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"TechVidvan Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:site\" content=\"@vidvantech\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"TechVidvan Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"14 minutes\" \/>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Apache Pig Operators with Syntax and Examples - TechVidvan","description":"Apache Pig Operators: Learn Pig Operators and Types of operators in Pig along like Dump Operators, Describe Operators, Explanations Pig Operators.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/","og_locale":"en_US","og_type":"article","og_title":"Apache Pig Operators with Syntax and Examples - TechVidvan","og_description":"Apache Pig Operators: Learn Pig Operators and Types of operators in Pig along like Dump Operators, Describe Operators, Explanations Pig Operators.","og_url":"https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/","og_site_name":"TechVidvan","article_publisher":"https:\/\/www.facebook.com\/TechVidvan\/","article_published_time":"2018-04-21T10:39:33+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Apache-Pig-Operators-01.jpg","type":"image\/jpeg"}],"author":"TechVidvan 
Team","twitter_card":"summary_large_image","twitter_creator":"@vidvantech","twitter_site":"@vidvantech","twitter_misc":{"Written by":"TechVidvan Team","Est. reading time":"14 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/#article","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/"},"author":{"name":"TechVidvan Team","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22"},"headline":"Apache Pig Operators with Syntax and Examples","datePublished":"2018-04-21T10:39:33+00:00","mainEntityOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/"},"wordCount":2150,"commentCount":0,"publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Apache-Pig-Operators-01.jpg","keywords":["Apache Pig Operators Tutorial","Describe operator","Dump Operator","Explanation operator","Introduction to Apache Pig Operators","Types of Pig Operators"],"articleSection":["Hadoop Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/","url":"https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/","name":"Apache Pig Operators with Syntax and Examples - 
TechVidvan","isPartOf":{"@id":"https:\/\/techvidvan.com\/tutorials\/#website"},"primaryImageOfPage":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/#primaryimage"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/#primaryimage"},"thumbnailUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Apache-Pig-Operators-01.jpg","datePublished":"2018-04-21T10:39:33+00:00","description":"Apache Pig Operators: Learn Pig Operators and Types of operators in Pig along like Dump Operators, Describe Operators, Explanations Pig Operators.","breadcrumb":{"@id":"https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/#primaryimage","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Apache-Pig-Operators-01.jpg","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2019\/11\/Apache-Pig-Operators-01.jpg","width":1200,"height":628},{"@type":"BreadcrumbList","@id":"https:\/\/techvidvan.com\/tutorials\/apache-pig-operators\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/techvidvan.com\/tutorials\/"},{"@type":"ListItem","position":2,"name":"Apache Pig Operators with Syntax and Examples"}]},{"@type":"WebSite","@id":"https:\/\/techvidvan.com\/tutorials\/#website","url":"https:\/\/techvidvan.com\/tutorials\/","name":"TechVidvan 
Blogs","description":"","publisher":{"@id":"https:\/\/techvidvan.com\/tutorials\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/techvidvan.com\/tutorials\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/techvidvan.com\/tutorials\/#organization","name":"TechVidvan","url":"https:\/\/techvidvan.com\/tutorials\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/","url":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","contentUrl":"https:\/\/techvidvan.com\/tutorials\/wp-content\/uploads\/2024\/03\/techvidvan-logo-200x50-1.webp","width":200,"height":50,"caption":"TechVidvan"},"image":{"@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/TechVidvan\/","https:\/\/x.com\/vidvantech"]},{"@type":"Person","@id":"https:\/\/techvidvan.com\/tutorials\/#\/schema\/person\/e9c26e74dd3d87421f7ada9433b8cd22","name":"TechVidvan Team","description":"The TechVidvan Team delivers practical, beginner-friendly tutorials on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Our experts are here to help you upskill and excel in today\u2019s tech industry."}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/1637","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/comments?post=1637"}],"version-history":[{"count":0,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/posts\/1637\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media\/73041"}],"wp:attachment":[{"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/media?parent=1637"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/categories?post=1637"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techvidvan.com\/tutorials\/wp-json\/wp\/v2\/tags?post=1637"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}