
Hadoop Pig UDF Schema

mysql 搞代码 (2022-01-09)


When a UDF returns a tuple with more than one field, you must specify a schema for it. Without a schema, Pig treats the multiple fields as a single field.

<code>register myudfs.jar;
A = load 'student_data' as (name: chararray, age: int, gpa: float);
B = foreach A generate flatten(myudfs.Swap(name, age)), gpa;
C = foreach B generate $2;
D = limit B 20;
dump D;</code>

This script results in the following error, caused by line 4 (C = foreach B generate $2;).

<code>java.io.IOException: Out of bound access. Trying to access non-existent column: 2. Schema {bytearray,gpa: float} has 2 column(s).</code>

This is because Pig is only aware of two columns in B, while line 4 requests the third column of the tuple. (Column indexing in Pig starts at 0.) A function that declares its schema looks like this:

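The UDF below declares a schema. It takes four input parameters and returns two output fields. On Android, the ukey is generated from the MAC address and the IMEI; on iOS, it is generated from the MAC address and the OpenUDID.

The final return value is a (platform, ukey) tuple.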

<code>package kload;

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;
import org.apache.pig.impl.logicalLayer.schema.Schema;

/**
 * Translates mac, imei and openudid into a ukey.
 * Input: (platform, mac, imei, openudid). Output: (platform, ukey).
 */
public class KoudaiFormateUkey extends EvalFunc<Tuple> {
    // Per-row results, filled in by getUkey().
    private String ukey = null;
    private String platform = null;

    public Tuple exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0)
            return null;
        try {
            // Clear the previous row's results so they cannot leak into this row.
            this.platform = null;
            this.ukey = null;

            String platform = (String) input.get(0);
            String mac = (String) input.get(1);
            String imei = (String) input.get(2);
            String openudID = (String) input.get(3);
            this.getUkey(platform, mac, imei, openudID);
            if (this.platform == null || this.ukey == null) {
                return null;
            }
            Tuple output = TupleFactory.getInstance().newTuple(2);
            output.set(0, this.platform);
            output.set(1, this.ukey);
            return output;
        } catch (Exception e) {
            throw new IOException("Caught exception processing input row ", e);
        }
    }

    // Normalizes the platform name and builds the upper-cased ukey for it.
    private String getUkey(String platform, String mac, String imei, String openudID) {
        String tmpStr = null;
        String ukey = null;
        int pType = -1;
        if (platform == null) {
            return null;
        }
        tmpStr = platform.toUpperCase();
        if (tmpStr.indexOf("IPHONE") != -1) {
            this.platform = "iphone";
            pType = 1001;
        } else if (tmpStr.indexOf("ANDROID") != -1) {
            this.platform = "android";
            pType = 1002;
        } else if (tmpStr.indexOf("IPAD") != -1) {
            this.platform = "ipad";
            pType = 1003;
        } else {
            this.platform = "unknow";
            pType = 1004;
        }
        switch (pType) {
            case 1001:
            case 1003:
                // iOS: mac + openudid
                if (mac == null && openudID == null) {
                    return null;
                }
                ukey = String.format("%s_%s", mac, openudID);
                break;
            case 1002:
                // Android: mac + imei
                if (mac == null && imei == null) {
                    return null;
                }
                ukey = String.format("%s_%s", mac, imei);
                break;
            case 1004:
                // Unknown platform: use all three identifiers
                if (mac == null && imei == null && openudID == null) {
                    return null;
                }
                ukey = String.format("%s_%s_%s", mac, imei, openudID);
                break;
            default:
                break;
        }
        if (ukey == null || ukey.length() == 0) {
            return null;
        }
        this.ukey = ukey.toUpperCase();
        return this.ukey;
    }

    // Declares the output as a tuple of two chararray fields so that Pig
    // sees two columns after flatten instead of one bytearray.
    public Schema outputSchema(Schema input) {
        try {
            Schema tupleSchema = new Schema();
            // Named explicitly: reusing input.getField(1) would label the ukey column with the mac field's alias.
            tupleSchema.add(new Schema.FieldSchema("platform", DataType.CHARARRAY));
            tupleSchema.add(new Schema.FieldSchema("ukey", DataType.CHARARRAY));
            return new Schema(new Schema.FieldSchema(
                    getSchemaName(this.getClass().getName().toLowerCase(), input),
                    tupleSchema, DataType.TUPLE));
        } catch (Exception e) {
            return null;
        }
    }
}</code>
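
The key-building rule can also be checked without a Pig runtime. What follows is only a minimal sketch, not the author's code: `UkeyRule` and `buildUkey` are hypothetical names introduced here, and `Locale.ROOT` is an assumption added to keep upper-casing independent of the machine's locale. The method follows the same branch order as `getUkey` (iPhone, Android, iPad, unknown) but returns the (platform, ukey) pair directly instead of storing it in instance fields, so no value can carry over from one row to the next.

```java
import java.util.Locale;

// Hypothetical stand-alone helper; it does not depend on any Pig classes.
public class UkeyRule {

    /** Returns {platform, ukey}, or null when no key can be built. */
    public static String[] buildUkey(String platform, String mac,
                                     String imei, String openudid) {
        if (platform == null) {
            return null;
        }
        String upper = platform.toUpperCase(Locale.ROOT);
        String name;
        String ukey;
        if (upper.contains("IPHONE") || (!upper.contains("ANDROID") && upper.contains("IPAD"))) {
            // iOS devices: mac + openudid
            name = upper.contains("IPHONE") ? "iphone" : "ipad";
            if (mac == null && openudid == null) {
                return null;
            }
            ukey = String.format("%s_%s", mac, openudid);
        } else if (upper.contains("ANDROID")) {
            // Android devices: mac + imei
            name = "android";
            if (mac == null && imei == null) {
                return null;
            }
            ukey = String.format("%s_%s", mac, imei);
        } else {
            // Unknown platform: all three identifiers ("unknow" matches the UDF's literal)
            name = "unknow";
            if (mac == null && imei == null && openudid == null) {
                return null;
            }
            ukey = String.format("%s_%s_%s", mac, imei, openudid);
        }
        return new String[] { name, ukey.toUpperCase(Locale.ROOT) };
    }

    public static void main(String[] args) {
        String[] row = buildUkey("Android 4.4", "aa:bb", "123", null);
        System.out.println(row[0] + "," + row[1]); // prints "android,AA:BB_123"
    }
}
```

Because the helper holds no state, a row with a null platform simply yields null, whatever the previous call returned.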
