Question
The option spark.sql.caseSensitive controls whether column names etc. are treated as case sensitive. It can be set, e.g., by
spark_session.sql('set spark.sql.caseSensitive=true')
and defaults to false.
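For example, a minimal PySpark sketch of setting the option per session and observing its effect (the demo DataFrame and column names are made up for illustration):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("case-sensitivity-demo").getOrCreate()

# Either of these sets the option for the current session:
spark.sql("set spark.sql.caseSensitive=true")
spark.conf.set("spark.sql.caseSensitive", "true")

# With case sensitivity on, 'ID' and 'id' are distinct column names:
df = spark.createDataFrame([(1, "a")], ["ID", "id_lower"])
df.select("ID").show()      # resolves
# df.select("id").show()    # would now fail with an unresolved-column error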
It does not seem to be possible to enable it globally in $SPARK_HOME/conf/spark-defaults.conf with
spark.sql.caseSensitive: True
though. Is that intended, or is there some other file for setting SQL options?
Also, the source code states that enabling this is highly discouraged. What is the rationale behind that advice?
Answer 1:
As it turns out, setting
spark.sql.caseSensitive: True
in $SPARK_HOME/conf/spark-defaults.conf
DOES work after all. It just has to be done in the configuration of the Spark driver as well, not only of the master or workers. Apparently I had forgotten that when I last tried.
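For reference, a sketch of the equivalent ways to hand the setting to the driver (the app name and script name below are placeholders, not from the answer):

# In $SPARK_HOME/conf/spark-defaults.conf (key and value, whitespace-separated):
#   spark.sql.caseSensitive true
#
# Or passed explicitly at submit time:
#   spark-submit --conf spark.sql.caseSensitive=true my_app.py
#
# Or set in code before the session is created:
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("my_app")                         # hypothetical app name
         .config("spark.sql.caseSensitive", "true")
         .getOrCreate())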
Answer 2:
Try sqlContext.sql("set spark.sql.caseSensitive=true") in your Python code; that worked for me.
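In context, this assumes the pre-2.0 API where a SQLContext is built from a SparkContext; a self-contained sketch, not the answerer's exact code (the app name is made up):

from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName="case-sensitivity")   # hypothetical app name
sqlContext = SQLContext(sc)

# Takes effect for all subsequent queries issued through this context:
sqlContext.sql("set spark.sql.caseSensitive=true")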
Source: https://stackoverflow.com/questions/42946104/enable-case-sensitivity-for-spark-sql-globally