I am looking for a way to select columns of my DataFrame in PySpark. For the first row, I know I can use df.first(), but I am not sure how to do this for columns, given that they do
df.first()
You can put the column names in a list and unpack it inside select:
cols = ['_2', '_4', '_5']
df.select(*cols).show()
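If you would rather pick columns by position than by name, here is a minimal sketch of one way to do it. It relies on df.columns, which lists the column names in order, and on the same unpacking into select. The names select_by_position and demo are made up for this example, and the sample rows are invented; demo assumes a local Spark installation.

```python
def select_by_position(df, positions):
    """Return df restricted to the columns at the given 0-based positions."""
    # df.columns is the list of column names, in order.
    names = [df.columns[i] for i in positions]
    # Unpack the list so each name is passed to select as its own argument.
    return df.select(*names)


def demo():
    # Hypothetical demo; imported here so the helper above has no hard
    # dependency on Spark being installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]").appName("select-demo").getOrCreate()
    )
    # Tuples created without a schema get the default names _1, _2, ...
    df = spark.createDataFrame(
        [(1, "a", 2.0, "x", True), (2, "b", 3.0, "y", False)]
    )
    df.select(*["_2", "_4", "_5"]).show()      # by name, as in the answer
    select_by_position(df, [1, 3, 4]).show()   # same columns, by position
    spark.stop()
```

Calling demo() should show the same three columns twice. select also accepts the list itself, so df.select(cols) works without the star.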