I am using spark-sql 2.4.1 and Java 8.
val country_df = Seq(
  ("us", 2001),
  ("fr", 2002),
  ("jp", 2002),
  ("in", 2001),
  ("fr", 2003),
You can try the code below.
First, collect the column names from the first dataset, keeping only the rows for the year you need:
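// Requires: import static org.apache.spark.sql.functions.col; (the Scala $"..." and === syntax is not available in Java)
List<String> columns = country_df.where(col("data_yr").equalTo(2001)).select("country").as(Encoders.STRING()).collectAsList();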
Then use those column names in selectExpr on the second dataset:
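// Requires: java.util.List, scala.collection.Seq, scala.collection.JavaConverters
public static Seq<String> convertListToSeq(List<String> inputList) {
    // Wrap the Java iterator as a Scala iterator, then materialize it as a Seq.
    return JavaConverters.asScalaIteratorConverter(inputList.iterator()).asScala().toSeq();
}
// Select the collected column names from the second dataset.
data_df.selectExpr(convertListToSeq(columns)).show(true);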
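As a possible alternative to the Seq conversion above, Java callers can pass a plain String array to the varargs overload selectExpr(String...). The sketch below is only an illustration: SelectExprArgs and toQuotedExprs are hypothetical names, not part of Spark. It also wraps each name in backticks as a precaution, on the assumption that some collected values (for example "in") might otherwise be read as SQL keywords when selectExpr parses them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Spark): prepares column names for selectExpr(String...).
final class SelectExprArgs {

    private SelectExprArgs() {}

    // Wraps each name in backticks so it is parsed as an identifier,
    // doubling any backtick that appears inside the name itself.
    static String[] toQuotedExprs(List<String> columnNames) {
        List<String> quoted = new ArrayList<>();
        for (String name : columnNames) {
            quoted.add("`" + name.replace("`", "``") + "`");
        }
        return quoted.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] exprs = toQuotedExprs(Arrays.asList("us", "in"));
        System.out.println(Arrays.toString(exprs));
        // With Spark on the classpath, the array can be passed directly:
        // data_df.selectExpr(exprs).show(true);
    }
}
```

This keeps the calling code free of Scala collection types. If the column names are known to be plain identifiers, columns.toArray(new String[0]) without the quoting step should be enough.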