I have the same problem with @.

In our case, we solved it by flattening the DataFrame:
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.col
import scala.util.matching.Regex

// Collapse any run of '_', '.', ':' or '@' into a single '_'
val ALIAS_RE: Regex = "[_.:@]+".r
// Strip the leading '_' left behind when a field starts with '@'
val FIRST_AT_RE: Regex = "^_".r

def getFieldAlias(field_name: String): String = {
  FIRST_AT_RE.replaceAllIn(ALIAS_RE.replaceAllIn(field_name, "_"), "")
}

def selectFields(df: DataFrame, fields: List[String]): DataFrame = {
  var fields_to_select = List[Column]()
  for (field <- fields) {
    val alias = getFieldAlias(field)
    fields_to_select :+= col(field).alias(alias) // append, so the columns keep the order of `fields`
  }
  df.select(fields_to_select: _*)
}
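As a quick check (just a sketch, assuming the definitions above are in scope), the alias function turns nested paths into Spark-friendly column names:

getFieldAlias("schema.@type")   // "schema_type"
getFieldAlias("schema.name@id") // "schema_name_id"
getFieldAlias("@type")          // "type" (the leading @ is dropped)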
So the following JSON:

{
  "object": "blabla",
  "schema": {
    "@type": "blabla",
    "name@id": "blabla"
  }
}
That gives you the field paths [object, schema.@type, schema.name@id]. The @ and the dots (in your case, the =) create problems for Spark SQL, so after our selectFields you end up with [object, schema_type, schema_name_id]: a flattened DataFrame.
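A minimal end-to-end sketch, assuming a local SparkSession and that the JSON above sits in a file named data.json (a hypothetical path):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("flatten-json")
  .master("local[*]")
  .getOrCreate()

// multiLine is needed because the JSON object spans several lines
val df = spark.read.option("multiLine", "true").json("data.json")

// Flatten with the helper defined earlier
val flat = selectFields(df, List("object", "schema.@type", "schema.name@id"))

flat.printSchema()
// Expected (roughly):
// root
//  |-- object: string (nullable = true)
//  |-- schema_type: string (nullable = true)
//  |-- schema_name_id: string (nullable = true)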