Spark ML VectorAssembler returns strange output

天命终不由人 2020-12-03 15:07

I am experiencing a very strange behaviour from VectorAssembler and I was wondering if anyone else has seen this.

My scenario is pretty straightforward.

1 answer
  • 2020-12-03 15:40

    There is nothing strange about the output. Your vector seems to have many zero elements, so Spark used its sparse representation.

    To explain further:

    Your vector appears to have 18 elements (its dimension).

    The indices [0,1,6,9,14,17] of the vector hold the non-zero elements, which are, in order, [17.0,15.0,3.0,1.0,4.0,2.0].
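
    As a minimal sketch of how that (size, indices, values) triple maps back to the full vector, here is a plain-Python illustration. The helper name `sparse_to_dense` is hypothetical and is not part of Spark; it only mirrors the idea.

```python
def sparse_to_dense(size, indices, values):
    """Expand a (size, indices, values) sparse triple into a plain dense list."""
    dense = [0.0] * size              # every position starts as zero
    for i, v in zip(indices, values):
        dense[i] = v                  # fill in only the stored non-zero entries
    return dense


# Values taken from the answer above: 18 positions, 6 of them non-zero.
size = 18
indices = [0, 1, 6, 9, 14, 17]
values = [17.0, 15.0, 3.0, 1.0, 4.0, 2.0]

dense = sparse_to_dense(size, indices, values)
print(dense)
```

    In PySpark the same triple corresponds to `Vectors.sparse(18, [0, 1, 6, 9, 14, 17], [17.0, 15.0, 3.0, 1.0, 4.0, 2.0])` from `pyspark.ml.linalg`, and calling `toArray()` on it gives the dense form.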

    A sparse vector representation stores only the non-zero entries, which saves memory and usually makes computation faster. More on sparse representation here.

    You can of course convert that sparse representation to a dense one, but it comes at a cost, since every zero is then stored explicitly.
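
    To get a rough feel for that cost, here is a back-of-the-envelope sketch. It assumes 8-byte doubles for values and 4-byte integers for indices and ignores object and array overhead, so the numbers are estimates, not what Spark reports. The function names are hypothetical.

```python
BYTES_PER_VALUE = 8   # assumption: values stored as 64-bit doubles
BYTES_PER_INDEX = 4   # assumption: indices stored as 32-bit ints


def dense_bytes(size):
    # A dense vector stores one value per position, zeros included.
    return BYTES_PER_VALUE * size


def sparse_bytes(nnz):
    # A sparse vector stores one index and one value per non-zero entry.
    return (BYTES_PER_INDEX + BYTES_PER_VALUE) * nnz


size, nnz = 18, 6     # the vector discussed above
dense_cost = dense_bytes(size)
sparse_cost = sparse_bytes(nnz)
print(dense_cost, sparse_cost)
```

    For a vector this small the difference is minor, but it grows with the number of zero positions across many rows.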

    If you are interested in getting feature importance, I advise you to take a look at this.
