How to write a Pig UDF in Scala

Posted by 自闭症网瘾萝莉.ら on 2019-12-13 16:13:00

Question


I am trying to write a Pig UDF in Scala (using Eclipse). I have added pig.jar as a library in the Java build path, which seems to resolve the two imports below:

  • import org.apache.pig.EvalFunc
  • import org.apache.pig.data.Tuple

However, I get two errors that I cannot resolve:

  1. org.apache.pig.EvalFunc[T] does not have a constructor
  2. value get is not a member of org.apache.pig.data.Tuple (though I am sure that Tuple has the get method)

Here is the full code:

package datesUDFs
import org.apache.pig.EvalFunc
import org.apache.pig.data.Tuple
class getYear extends EvalFunc {
  val extractDate = """^(\d\d\d\d)-\d\d-\d\d \d\d:\d\d:\d\d""".r
  def isDate(dtString: String): Boolean = extractDate.findFirstIn(dtString).nonEmpty

  override def exec(input: Tuple): Int = input.get(0) match {
    case dtString: String =>
      if (!isDate(dtString)) throw new IllegalArgumentException("Invalid date string!")
      else (for (extractDate(year) <- extractDate.findFirstIn(dtString)) yield year).head.toInt
    case _ => throw new IllegalArgumentException("Invalid function call!")
  }
}

Can anybody help me resolve this issue?

Thanks in advance!!!


Answer 1:


Besides having to specify the EvalFunc type parameter, your code compiles fine for me.

package datesUDFs
import org.apache.pig.EvalFunc
import org.apache.pig.data.Tuple
class getYear extends EvalFunc[Int] { // This is the only line I changed.
  val extractDate = """^(\d\d\d\d)-\d\d-\d\d \d\d:\d\d:\d\d""".r
  def isDate(dtString: String): Boolean = extractDate.findFirstIn(dtString).nonEmpty

  override def exec(input: Tuple): Int = input.get(0) match {
    case dtString: String =>
      if (!isDate(dtString)) throw new IllegalArgumentException("Invalid date string!")
      else (for (extractDate(year) <- extractDate.findFirstIn(dtString)) yield year).head.toInt
    case _ => throw new IllegalArgumentException("Invalid function call!")
  }
}

See if that helps; sometimes ScalaIDE complains about the wrong things.
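
As a side note, the year-extraction logic can be checked without Pig on the classpath, which helps separate real code errors from IDE or classpath noise. The following is only a minimal sketch: the names `YearExtraction` and `extractYear` are hypothetical and do not appear in the original code, and it returns an `Option` instead of throwing.

```scala
object YearExtraction {
  // Same timestamp shape as the regex in the UDF:
  // "yyyy-MM-dd HH:mm:ss" at the start of the string.
  private val DatePattern = """^(\d{4})-\d{2}-\d{2} \d{2}:\d{2}:\d{2}""".r

  // Hypothetical helper: Some(year) if the string starts with a
  // timestamp of that shape, None otherwise.
  def extractYear(dtString: String): Option[Int] =
    DatePattern.findFirstMatchIn(dtString).map(_.group(1).toInt)
}
```

Inside `exec`, the same behaviour as the original could then be obtained with something like `YearExtraction.extractYear(dtString).getOrElse(throw new IllegalArgumentException("Invalid date string!"))`.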




Answer 2:


Solved it! I added hadoop-common-2.2.0.jar and commons-logging-1.1.3.jar to my Java build path, and the problems were resolved.
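
A likely reason this works is that Pig's `EvalFunc` and `Tuple` refer to Hadoop and Commons Logging types in their own signatures, so the Scala compiler cannot fully resolve them until those jars are on the classpath. For anyone using a build tool instead of the Eclipse build path, the same dependencies might be declared roughly as follows. This is a sketch under assumptions: it uses sbt, which the thread never mentions, and the Pig version `0.12.0` is a guess, since the thread only refers to pig.jar.

```scala
// build.sbt (hypothetical; the original setup used the Eclipse build path)
libraryDependencies ++= Seq(
  // Pig version is an assumption; match it to the cluster's pig.jar.
  "org.apache.pig"    % "pig"             % "0.12.0" % "provided",
  // The two jars named in the answer above.
  "org.apache.hadoop" % "hadoop-common"   % "2.2.0"  % "provided",
  "commons-logging"   % "commons-logging" % "1.1.3"
)
```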



Source: https://stackoverflow.com/questions/19778232/how-to-write-a-pig-udf-in-scala
