sparse-matrix

Implications of manually setting scipy sparse matrix shape

巧了我就是萌 submitted on 2019-12-02 07:06:41
I need to perform online training on a TF-IDF model. I found that scikit-learn's TfidfVectorizer does not support training in an online fashion, so I'm implementing my own CountVectorizer to support online training and then using scikit-learn's TfidfTransformer to update the tf-idf values after a predefined number of documents have entered the corpus. I found here that you shouldn't add rows or columns to numpy arrays, since all the data would need to be copied so that it stays in contiguous blocks of memory. But then I also found that, with a scipy sparse matrix, you can in fact manually change the matrix's
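A minimal Python sketch of the growth pattern this question is about, assuming the counts are kept as a scipy CSR matrix and that the vocabulary only ever grows. The helper name `append_document` and its parameters are hypothetical, and the sketch uses the public `resize` method plus `scipy.sparse.vstack` rather than assigning to the shape attribute directly.

```python
import numpy as np
from scipy import sparse


def append_document(counts, term_indices, term_counts, vocab_size):
    """Append one document row to a CSR count matrix (hypothetical helper).

    Assumes the vocabulary never shrinks. Note that `counts` is widened
    in place when new terms have appeared.
    """
    n_docs, n_terms = counts.shape
    vocab_size = max(vocab_size, n_terms)
    if vocab_size > n_terms:
        # resize() works in place; the added columns hold no stored entries.
        counts.resize((n_docs, vocab_size))
    row_index = np.zeros(len(term_indices), dtype=int)
    row = sparse.csr_matrix(
        (term_counts, (row_index, term_indices)), shape=(1, vocab_size)
    )
    # vstack builds a new matrix, so appending in batches is cheaper
    # than appending one document at a time.
    return sparse.vstack([counts, row], format="csr")


counts = sparse.csr_matrix(np.array([[1, 0, 2], [0, 3, 0]]))
# The third document uses term 1 plus two previously unseen terms (3 and 4).
counts = append_document(counts, [1, 3, 4], [1, 4, 1], vocab_size=5)
```

Widening is cheap because only the shape bookkeeping changes, while each `vstack` call copies the stored entries, which is why buffering several documents before stacking is usually preferable.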

Perform matrix multiplication between two arrays and get result only on masked places

五迷三道 submitted on 2019-12-02 07:06:32
Question: I have two dense matrices, A [200000,10] and B [10,100000]. I need to multiply them to get matrix C. I can't do that directly, since the resulting matrix won't fit into memory. Moreover, I need only a few elements of the resulting matrix, about 1-2% of the total number of elements. I have a third matrix W [200000,100000], which is sparse and has non-zero elements in exactly those places that interest me in matrix C. Is there a way to use W as a "mask" so that the resulting
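One possible approach, sketched in Python under the assumption that A and B are NumPy arrays and W is a scipy sparse matrix without duplicate entries: take the (row, column) coordinates of W's stored entries and compute only those dot products, in chunks so the temporary arrays stay small. The function name `masked_matmul` and the `chunk` parameter are hypothetical.

```python
import numpy as np
from scipy import sparse


def masked_matmul(A, B, W, chunk=1_000_000):
    """Compute (A @ B) only at the stored positions of the sparse mask W."""
    W = sparse.coo_matrix(W)
    vals = np.empty(W.nnz, dtype=np.result_type(A, B))
    for start in range(0, W.nnz, chunk):
        r = W.row[start:start + chunk]
        c = W.col[start:start + chunk]
        # Dot product of row r[i] of A with column c[i] of B, for each i.
        vals[start:start + chunk] = np.einsum("ij,ji->i", A[r, :], B[:, c])
    return sparse.csr_matrix((vals, (W.row, W.col)), shape=W.shape)


A = np.arange(6.0).reshape(3, 2)
B = np.arange(8.0).reshape(2, 4)
mask = np.array([[1, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]])
C = masked_matmul(A, B, sparse.csr_matrix(mask), chunk=2)
```

The work and memory scale with the number of stored entries in W times the inner dimension, not with the size of the full product.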

scipy.sparse.coo_matrix how to fast find all zeros column, fill with 1 and normalize

你说的曾经没有我的故事 submitted on 2019-12-02 06:14:32
For a matrix, I want to find the columns that are all zeros, fill them with 1s, and then normalize the matrix by column. I know how to do that with np.arrays:

    [[0 0 0 0 0]
     [0 0 1 0 0]
     [1 0 0 1 0]
     [0 0 0 0 1]
     [1 0 0 0 0]]
          |
          V
    [[0 1 0 0 0]
     [0 1 1 0 0]
     [1 1 0 1 0]
     [0 1 0 0 1]
     [1 1 0 0 0]]
          |
          V
    [[0   0.2 0 0 0]
     [0   0.2 1 0 0]
     [0.5 0.2 0 1 0]
     [0   0.2 0 0 1]
     [0.5 0.2 0 0 0]]

But how can I do the same thing when the matrix is in scipy.sparse.coo.coo_matrix form, without converting it back to np.arrays?

hpaulj: This will be a lot easier with the lil format, and working with rows
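One way to stay in COO form (not necessarily the approach the truncated answer above goes on to describe) is to work on the `row`, `col` and `data` arrays directly. The sketch below assumes non-negative entries and no explicitly stored zeros; the function name `fill_and_normalize` is hypothetical.

```python
import numpy as np
from scipy import sparse


def fill_and_normalize(M):
    """Fill all-zero columns with ones, then scale each column to sum to 1."""
    M = sparse.coo_matrix(M, dtype=float)
    n_rows, n_cols = M.shape
    col_sums = np.asarray(M.sum(axis=0)).ravel()
    # Columns that have no stored entries at all.
    empty = np.flatnonzero(np.bincount(M.col, minlength=n_cols) == 0)
    # Append one entry per row for every empty column.
    rows = np.concatenate([M.row, np.tile(np.arange(n_rows), empty.size)])
    cols = np.concatenate([M.col, np.repeat(empty, n_rows)])
    data = np.concatenate([M.data, np.ones(n_rows * empty.size)])
    col_sums[empty] = n_rows
    # Divide every stored value by the sum of its column.
    data = data / col_sums[cols]
    return sparse.coo_matrix((data, (rows, cols)), shape=M.shape)


example = sparse.coo_matrix(np.array([
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
]))
normalized = fill_and_normalize(example)
```

Each filled column becomes fully dense, so this is only economical when few columns are empty.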

how to create a single float sparse matrix in mex files

非 Y 不嫁゛ submitted on 2019-12-02 06:00:46
Question: The question "Creating sparse matrix in MEX" has a good example of mxCreateSparse. But this function returns a double sparse matrix instead of a single one. If I want to return a single sparse matrix, what should I do? Thanks!

Answer 1: As @horchler suggested, you could use the undocumented function mxCreateSparseNumericMatrix. Example: singlesparse.c

    #include "mex.h"
    #include <string.h> /* memcpy */
    /* undocumented function prototype */
    EXTERN_C mxArray *mxCreateSparseNumericMatrix(mwSize m, mwSize n, mwSize

create a sparse matrix; given the indices of non-zero elements for creation of dummy variables of a categorical column of a large dataset

為{幸葍}努か submitted on 2019-12-02 05:58:18
I'm trying to use a sparse matrix to generate dummy variables for a set of data with 5.8 million rows and two categorical columns. The structure of the data is:

mydata: data.table of 5,800,000 rows and two categorical (in integer format) variables, Var1 and Var2
nlevel(Var1): 210,000 (levels include all numbers between 1 and 210,000)
nlevel(Var2): 500 (levels include all numbers between 1 and 500)

Here's an example of mydata:

    Var_1  Var_2
    1      4
    1      2
    2      7
    5      9
    5      500
    . . .
    200    6
    200    2
    200    80
    . . .

I'm using a sparse Matrix (sparse_Mx) to create the dummy variable matrix which would be of the form:

Is sparse BLAS not included in BLAS?

柔情痞子 submitted on 2019-12-02 04:46:41
Question: I have a working LAPACK implementation which, as far as I have read, contains BLAS. I want to use SPARSE BLAS and, as far as I understand this website, SPARSE BLAS is part of BLAS. But when I tried to run the code below from the sparse blas manual using g++ -o sparse.x sparse_blas_example.c -L/usr/local/lib -lblas && ./sparse_ex.x the compiler (or linker?) asked for blas_sparse.h. When I put that file in the working directory I got: ludi@ludi-M17xR4:~/Desktop/tests$ g++ -o sparse.x sparse_blas

Implementing matching pursuit algorithm

自古美人都是妖i submitted on 2019-12-02 04:14:11
Question: I have implemented the matching pursuit algorithm but I am unable to get the required result. Here is my code:

    D=[1 6 11 16 21 26 31 36 41 46
       2 7 12 17 22 27 32 37 42 47
       3 8 13 18 23 28 33 38 43 48
       4 9 14 19 24 29 34 39 44 49
       5 10 15 20 25 30 35 40 45 50];
    b=[6;7;8;9;10];
    n=size(D);
    A1=zeros(n);
    R=b;
    H=10;
    if(H <= 0)
        error('The number of iterations needs to be greater than 0')
    end;
    for k=1:1:H
        [c,d] = max(abs(D'*R));
        A1(:,d)=D(:,d);
        D(:,d)=0;
        y = A1\b;
        R = b-A1*y;
    end

Output y= 0.8889 0 0 0 0
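For comparison, here is a minimal NumPy sketch of orthogonal matching pursuit, the variant that re-solves a least-squares problem on the selected columns as the loop above does. It is not the asker's code and not a confirmed fix for it. It assumes every dictionary column is non-zero and scales the correlations by the column norms, a common convention that stops large-norm columns from dominating the selection; the function name and the tolerance are hypothetical.

```python
import numpy as np


def orthogonal_matching_pursuit(D, b, n_iter, tol=1e-10):
    """Greedy sparse approximation of b using the columns of D."""
    D = np.asarray(D, dtype=float)
    b = np.asarray(b, dtype=float).ravel()
    norms = np.linalg.norm(D, axis=0)  # assumed non-zero for every column
    used = np.zeros(D.shape[1], dtype=bool)
    support = []
    coef = np.zeros(D.shape[1])
    residual = b.copy()
    for _ in range(min(n_iter, D.shape[1])):
        # Pick the unused column best aligned with the current residual.
        corr = np.abs(D.T @ residual) / norms
        corr[used] = -np.inf
        k = int(np.argmax(corr))
        used[k] = True
        support.append(k)
        # Least-squares fit of b on the columns selected so far.
        sol, *_ = np.linalg.lstsq(D[:, support], b, rcond=None)
        coef[:] = 0.0
        coef[support] = sol
        residual = b - D[:, support] @ sol
        if np.linalg.norm(residual) < tol:
            break
    return coef, residual


# Same dictionary and target as in the question: b equals the second column.
D = np.arange(1, 51).reshape(10, 5).T
b = np.array([6, 7, 8, 9, 10])
coef, residual = orthogonal_matching_pursuit(D, b, n_iter=3)
```

With this normalization the second column is selected first because it is parallel to b, so the loop stops after one iteration with a coefficient close to 1 on that column.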

Algorithm to create all possible combinations

纵然是瞬间 submitted on 2019-12-02 02:37:33
I'm writing a sparse grid code and need to combine N 1-dimensional grid points (written in vector form) into an array of all possible points. For example, one can mix two vectors (a,b) and (c,d,e), giving the following points: (a,c) (a,d) (a,e) (b,c) (b,d) (b,e) Matlab has a function called combvec: http://www.mathworks.co.uk/help/nnet/ref/combvec.html I'm writing this code in FORTRAN; however, I can't find the underlying algorithm. The code needs to take in N (N>1) vectors (i.e. 2, 3, ..., N), and each can be a different length. Does anyone know of an algorithm? I don't know Fortran, but since you
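One language-neutral way to enumerate all combinations is an "odometer" over the index of each vector: advance the last index, and carry into the previous one whenever an index wraps around. The Python sketch below illustrates that idea (it is not the algorithm behind combvec, which the source does not show); the function name is hypothetical, and the loop structure is meant to carry over to Fortran with plain integer arrays.

```python
def all_combinations(vectors):
    """Return every tuple that takes one element from each input vector."""
    lengths = [len(v) for v in vectors]
    if not vectors or 0 in lengths:
        return []
    idx = [0] * len(vectors)  # current position in each vector
    out = []
    while True:
        out.append(tuple(v[i] for v, i in zip(vectors, idx)))
        # Advance like an odometer: the last index moves fastest.
        pos = len(vectors) - 1
        while pos >= 0:
            idx[pos] += 1
            if idx[pos] < lengths[pos]:
                break
            idx[pos] = 0  # wrapped around: reset and carry to the left
            pos -= 1
        if pos < 0:  # every index wrapped, so all points were visited
            return out


points = all_combinations([("a", "b"), ("c", "d", "e")])
```

For the two vectors in the question this yields the six points in the order listed there. In Python itself, `itertools.product` from the standard library produces the same sequence.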

Algorithm to create all possible combinations

荒凉一梦 submitted on 2019-12-02 02:14:20
Question: I'm writing a sparse grid code and need to combine N 1-dimensional grid points (written in vector form) into an array of all possible points. For example, one can mix two vectors (a,b) and (c,d,e), giving the following points: (a,c) (a,d) (a,e) (b,c) (b,d) (b,e) Matlab has a function called combvec: http://www.mathworks.co.uk/help/nnet/ref/combvec.html I'm writing this code in FORTRAN; however, I can't find the underlying algorithm. The code needs to take in N (N>1) vectors (i.e. 2, 3, ..., N) and

Error when making a sparse matrix

◇◆丶佛笑我妖孽 submitted on 2019-12-02 01:26:01
Question: I am facing a problem I do not understand. It's a follow-up to the answers suggested here and here. I have two identically structured datasets: one that I created as a reproducible example, for which the code works, and one that is real, for which the code does not work. After staring at it for hours, I cannot find what is causing the error. The following gives an example that works:

    df <- data.table(cbind(rep(seq(1,25), each = 4 )), cbind(rep(seq(1,40), length.out = 100)))
    colnames(df) <- c("a", "b")