I use spark-shell to do the below operations.
Recently loaded a table with an array column in spark-sql .
Here is the DDL for the same:
Edit2: For who wants to avoid udf at the expense of readability ;-)
If you really want to do it in one step, you will have to use Scala to create a lambda function returning an sequence of Column and wrap it in an array. This is a bit involved, but it's one step:
val df = List(List("Jon", "Snow", "Castle", "Black", "Ned")).toDF("emp_details")
df.withColumn("slice", array((0 until 3).map(i => $"emp_details"(i)):_*)).show(false)
+-------------------------------+-------------------+
|emp_details |slice |
+-------------------------------+-------------------+
|[Jon, Snow, Castle, Black, Ned]|[Jon, Snow, Castle]|
+-------------------------------+-------------------+
The _:* works a bit of magic to pass an list to a so-called variadic function (array in this case, which construct the sql array). But I would advice against using this solution as is. put the lambda function in a named function
def slice(from: Int, to: Int) = array((from until to).map(i => $"emp_details"(i)):_*))
for code readability. Note that in general, sticking to Column expressions (without using `udf) has better performances.
Edit: In order to do it in a sql statement (as you ask in your question...), following the same logic you would generate the sql query using scala logic (not saying it's the most readable)
def sliceSql(emp_details: String, from: Int, to: Int): String = "Array(" + (from until to).map(i => "emp_details["+i.toString+"]").mkString(",") + ")"
val sqlQuery = "select emp_details,"+ sliceSql("emp_details",0,3) + "as slice from emp_details"
sqlContext.sql(sqlQuery).show
+-------------------------------+-------------------+
|emp_details |slice |
+-------------------------------+-------------------+
|[Jon, Snow, Castle, Black, Ned]|[Jon, Snow, Castle]|
+-------------------------------+-------------------+
note that you can replace until by to in order to provide the last element taken rather than the element at which the iteration stops.