Spark ML VectorAssembler returns strange output
I am seeing very strange behaviour from VectorAssembler and was wondering whether anyone else has run into it.

My scenario is pretty straightforward. I parse data from a CSV file that has some standard Int and Double fields, and I also calculate some extra columns. My parsing function returns this:

    val joinedCounts = countPerChannel ++ countPerSource // two arrays of Doubles joined
    (label, orderNo, pageNo, Vectors.dense(joinedCounts))

My main function uses the parsing function like this:

    val parsedData = rawData.filter(row => row != header).map(parseLine)
    val data = sqlContext.createDataFrame