Building a function to add checks to the Amazon Deequ framework

Submitted by 大兔子大兔子 on 2020-08-10 01:07:31

Question


Using the Amazon Deequ library, I'm trying to build a function that takes three parameters: the check object, a string naming the constraint to run, and another string that provides the constraint criteria. I have a set of checks that I want to read from a MySQL table. My intention is to iterate through all the checks from that table, build a check object using the function described above, and run the checks on a source DataFrame. Here is an example of Amazon Deequ usage: https://towardsdatascience.com/automated-data-quality-testing-at-scale-using-apache-spark-93bb1e2c5cd0

So the function call looks something like this:

var _check = build_check_object_function(check_object, "hasSize", "10000")

This function should add a new hasSize check to the check_object and return that.

The part where I'm stuck is how to translate the hasSize string to the hasSize function.

    var _check = Check(CheckLevel.Error, "Data Validation Check")
    val listOfFunctions = _check.getClass.getMethods.filter(!_.getName().contains('$'))
    for (function <- listOfFunctions) {
      if (function.getName().toLowerCase().contains(row(2).asInstanceOf[String].toLowerCase())) {
        _check = _check.function(row(3))
      } else {
        println("Not a match")
      }
    }

Here is the error that I'm getting

<console>:38: error: value function is not a member of com.amazon.deequ.checks.Check
   if( function.getName().toLowerCase().contains(row(2).asInstanceOf[String].toLowerCase())) {_check = _check.function(row(3))                                                          

Answer 1:


You can either use runtime reflection or build a thin translation layer between your database and the deequ declarations.
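
For the reflection route, here is a minimal sketch of how a method found by name is actually called. It uses a hypothetical stand-in class, SimpleCheck, rather than the real Deequ Check, so the names SimpleCheck, ReflectionSketch and addByName are assumptions made for illustration only. The point is that a java.lang.reflect.Method has to be called through invoke on the target object; writing _check.function(...) asks the compiler for a member literally named "function", which is what the error in the question reports.

```scala
// Hypothetical stand-in for a builder-style class; this is NOT the Deequ Check class.
case class SimpleCheck(rules: Seq[String] = Seq.empty) {
  def isComplete(column: String): SimpleCheck = copy(rules = rules :+ s"isComplete($column)")
  def isUnique(column: String): SimpleCheck = copy(rules = rules :+ s"isUnique($column)")
}

object ReflectionSketch {
  // Looks up a public method by name and calls it with a single String argument.
  def addByName(check: SimpleCheck, methodName: String, arg: String): SimpleCheck = {
    // getMethod needs the exact parameter types; it throws NoSuchMethodException otherwise.
    val method = check.getClass.getMethod(methodName, classOf[String])
    // A reflected Method is called via invoke(target, args...), not target.method(...).
    method.invoke(check, arg).asInstanceOf[SimpleCheck]
  }
}

// Usage: add two rules whose names arrive as plain strings.
val built = ReflectionSketch.addByName(
  ReflectionSketch.addByName(SimpleCheck(), "isComplete", "id"),
  "isUnique",
  "id"
)
```

With the real Check class this gets more fragile, because the parameter types have to match exactly and a method such as hasSize takes a function argument (as in hasSize(_ <= 10)), so the criteria string would still need to be converted by hand.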

I would suggest you go with translating database constraint/check strings explicitly to deequ declarations, e.g.:

if (constraint == "hasSize") {
  // as Constraint
  Constraint.sizeConstraint(_ <= 10)
  // as Check
  Check(CheckLevel.Error, "name").hasSize(_ <= 10)
}
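
The same idea can be extended to the three-parameter function from the question. The following is only a sketch under some assumptions: the object name CheckBuilder and the function name addCheck are made up for illustration, the criteria for hasSize is assumed to be an exact row count, and the criteria for isComplete and isUnique is assumed to be a column name. The rows sequence stands in for whatever is read from the MySQL table.

```scala
import com.amazon.deequ.checks.{Check, CheckLevel}

object CheckBuilder {
  // Translates a constraint name and its criteria string into an explicit Deequ call.
  // How each criteria string is interpreted is an assumption of this sketch.
  def addCheck(check: Check, constraint: String, criteria: String): Check =
    constraint match {
      case "hasSize"    => check.hasSize(_ == criteria.toLong) // criteria = expected row count
      case "isComplete" => check.isComplete(criteria)          // criteria = column name
      case "isUnique"   => check.isUnique(criteria)            // criteria = column name
      case other =>
        throw new IllegalArgumentException(s"Unsupported constraint: $other")
    }
}

// Hypothetical (constraint, criteria) pairs, standing in for rows from the MySQL table.
val rows = Seq(("hasSize", "10000"), ("isComplete", "id"))

// Fold over the rows so that each one adds a constraint to the same Check.
val check = rows.foldLeft(Check(CheckLevel.Error, "Data Validation Check")) {
  case (acc, (constraint, criteria)) => CheckBuilder.addCheck(acc, constraint, criteria)
}
```

Every supported constraint is spelled out, so a typo or an unsupported name in the table fails with a clear message instead of silently matching some other method, and no var is needed because each call returns a new Check.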


Source: https://stackoverflow.com/questions/60727853/building-a-function-to-add-checks-to-amazon-deequ-framework
