Question
How do I create SparseVector and dense Vector representations?
If the dense vector is:
denseV = np.array([0., 3., 0., 4.])
what would the sparse vector representation be?
Answer 1:
Unless I have thoroughly misunderstood your question, the MLlib data types documentation illustrates this quite clearly:
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;
// Create a dense vector (1.0, 0.0, 3.0).
Vector dv = Vectors.dense(1.0, 0.0, 3.0);
// Create a sparse vector (1.0, 0.0, 3.0) by specifying its indices and values corresponding to nonzero entries.
Vector sv = Vectors.sparse(3, new int[] {0, 2}, new double[] {1.0, 3.0});
Here the first argument of Vectors.sparse is the size of the vector, the second argument is an array of the indices of the nonzero entries, and the third argument is an array of the values at those indices.
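To make the roles of the three arguments concrete, here is a minimal plain-Python sketch that expands a (size, indices, values) triple back into a dense list. It does not use the Spark API, and `sparse_to_dense` is a hypothetical helper name chosen for illustration.

```python
def sparse_to_dense(size, indices, values):
    """Expand a (size, indices, values) triple into a dense list of floats.

    Hypothetical helper for illustration only; it is not part of Spark MLlib.
    """
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")
    dense = [0.0] * size  # every position defaults to zero
    for i, v in zip(indices, values):
        dense[i] = float(v)  # place each stored value at its index
    return dense


# Same triple as the Vectors.sparse call above.
dense = sparse_to_dense(3, [0, 2], [1.0, 3.0])
print(dense)
```

Positions that are not listed in the indices array are simply treated as zero, which is why only the nonzero entries need to be stored.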
Answer 2:
A sparse vector is the appropriate representation when most of the values in the vector are zero, whereas a dense vector is appropriate when most of the values are nonzero.
If you have to create a sparse vector from the dense vector you specified, use the following syntax:
Vector sparseVector = Vectors.sparse(4, new int[] {1, 3}, new double[] {3.0, 4.0});
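If you want to derive the indices and values rather than write them by hand, the conversion can be sketched in plain Python as follows. This is an illustrative sketch under the assumption that the input is an ordinary Python sequence of numbers; `dense_to_sparse` is a hypothetical helper, not a Spark function.

```python
def dense_to_sparse(dense):
    """Return (size, indices, values) for the nonzero entries of a dense sequence.

    Hypothetical helper for illustration only; it is not part of Spark MLlib.
    """
    # Keep the positions whose value is not zero.
    indices = [i for i, v in enumerate(dense) if v != 0]
    # Collect the values stored at those positions.
    values = [float(dense[i]) for i in indices]
    return len(dense), indices, values


# The dense vector from the question.
size, indices, values = dense_to_sparse([0., 3., 0., 4.])
print(size, indices, values)
```

The three results line up with the three arguments passed to Vectors.sparse in the call above.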
Source: https://stackoverflow.com/questions/31522893/sparse-vector-vs-dense-vector