What is the best way to define custom methods on a DataFrame?

感情败类 2020-12-13 21:51

I need to define custom methods on a DataFrame. What is the best way to do it? The solution should scale, as I intend to define a significant number of custom methods.

3 Answers
  •  执念已碎
    2020-12-13 22:20

    Your way is the way to go (see [1]). I solved it a little differently, but the approach is essentially the same:

    Possibility 1

    Implicits

    import scala.language.implicitConversions
    import org.apache.spark.sql.DataFrame

    object ExtraDataFrameOperations {
      object implicits {
        // implicit conversion: wraps a DataFrame so the extra methods become available on it
        implicit def dFWithExtraOperations(df: DataFrame): DFWithExtraOperations = DFWithExtraOperations(df)
      }
    }
    
    case class DFWithExtraOperations(df: DataFrame) {
      def customMethod(param: String): DataFrame = {
        // do something fancy with the df
        // or delegate to some implementation
        //
        // here, just as an illustrating example: do a select
        df.select(df(param))
      }
    }
    

    Usage

    To use the new customMethod method on a DataFrame:

    import ExtraDataFrameOperations.implicits._
    val df = ...
    val otherDF = df.customMethod("hello")
    

    Possibility 2

    Instead of using an implicit method (see above), you can also use an implicit class:

    Implicit class

    import org.apache.spark.sql.DataFrame

    object ExtraDataFrameOperations {
      implicit class DFWithExtraOperations(df: DataFrame) {
        def customMethod(param: String): DataFrame = {
          // do something fancy with the df
          // or delegate to some implementation
          //
          // here, just as an illustrating example: do a select
          df.select(df(param))
        }
      }
    }
    

    Usage

    import ExtraDataFrameOperations._
    val df = ...
    val otherDF = df.customMethod("hello")
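
    Since every custom method returns a DataFrame, calls can be chained, which helps once the number of methods grows. Below is a minimal end-to-end sketch, assuming Spark is on the classpath and a local session is acceptable. The method dropNulls, the object Demo, the app name, and the sample data are hypothetical and only serve as illustration; customMethod is the one from above.

    ```scala
    import org.apache.spark.sql.{DataFrame, SparkSession}

    object ExtraDataFrameOperations {
      implicit class DFWithExtraOperations(df: DataFrame) {
        // same illustrative method as above: select a single column
        def customMethod(param: String): DataFrame =
          df.select(df(param))

        // hypothetical second method: keep only rows where the column is not null
        def dropNulls(column: String): DataFrame =
          df.filter(df(column).isNotNull)
      }
    }

    object Demo {
      def main(args: Array[String]): Unit = {
        // local session, for illustration only
        val spark = SparkSession.builder()
          .master("local[1]")
          .appName("custom-methods-demo")
          .getOrCreate()
        import spark.implicits._
        import ExtraDataFrameOperations._

        // made-up sample data with a column named "hello"
        val df = Seq[(Int, String)]((1, "a"), (2, null)).toDF("id", "hello")

        // each call returns a DataFrame, so the implicit wrapper applies again
        val otherDF = df.dropNulls("hello").customMethod("hello")
        otherDF.show()

        spark.stop()
      }
    }
    ```

    Grouping related methods into several small implicit classes (for example one per concern) inside the same object keeps a single import while avoiding one very large wrapper class.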
    

    Remark

    If you want to avoid the additional import for code inside your own package, turn the object ExtraDataFrameOperations into a package object and store it in a file called package.scala within that package.
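
    As a rough sketch of that remark, assuming Scala 2 and a package named com.example.analytics (the package name and file path are hypothetical), package.scala could look like this:

    ```scala
    // File: src/main/scala/com/example/analytics/package.scala (hypothetical path)
    package com.example

    import org.apache.spark.sql.DataFrame

    // members of a package object are visible to all code in com.example.analytics
    package object analytics {
      implicit class DFWithExtraOperations(df: DataFrame) {
        // same illustrative method as above: select a single column
        def customMethod(param: String): DataFrame =
          df.select(df(param))
      }
    }
    ```

    Any file that declares package com.example.analytics can then call df.customMethod("hello") without an import. Code outside that package still needs import com.example.analytics._ to bring the implicit class into scope.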

    Official documentation / references

    [1] The original blog post "Pimp my library" by M. Odersky is available at http://www.artima.com/weblogs/viewpost.jsp?thread=179766
