How to find the index of the maximum value in a vector column?
Question

I have a Spark DataFrame with the following structure:

root
 |-- distribution: vector (nullable = true)

+--------------------+
| topicDistribution  |
+--------------------+
|     [0.1, 0.2]     |
|     [0.3, 0.2]     |
|     [0.5, 0.2]     |
|     [0.1, 0.7]     |
|     [0.1, 0.8]     |
|     [0.1, 0.9]     |
+--------------------+

My question is: how can I add a column containing the index of the maximum value in each row? The result should look something like this:

root
 |-- distribution: vector (nullable = true)
 |-- max_index: integer (nullable = true)

+----
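
One possible approach, sketched here in PySpark, is to wrap a small argmax helper in a UDF and attach its result with `withColumn`. The names `argmax_index` and `add_max_index` are hypothetical, and the default column name `topicDistribution` is an assumption taken from the table header above (the printed schema calls the column `distribution`, so adjust as needed). On ties this sketch returns the first maximal index.

```python
def argmax_index(values):
    """Return the position of the largest element (the first one on ties),
    or None for an empty sequence."""
    values = list(values)
    if not values:
        return None
    return max(range(len(values)), key=values.__getitem__)


def add_max_index(df, vector_col="topicDistribution", out_col="max_index"):
    """Return df with an extra integer column holding the argmax of vector_col.

    vector_col and out_col are assumed names; pass the ones your DataFrame uses.
    """
    # Imported lazily so argmax_index stays usable without a Spark installation.
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import IntegerType

    # Vector.toArray() yields a NumPy array; argmax_index returns a plain
    # Python int, which is what IntegerType expects. Null vectors stay null.
    max_index_udf = udf(
        lambda v: None if v is None else argmax_index(v.toArray()),
        IntegerType(),
    )
    return df.withColumn(out_col, max_index_udf(col(vector_col)))
```

With a DataFrame like the one in the question, `add_max_index(df).show()` would add the `max_index` column. A Python UDF serializes every row through the Python worker, so on large data it may be worth checking whether built-in functions (for example `pyspark.ml.functions.vector_to_array` in Spark 3.0 and later) are available in your version instead.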