sparse-matrix

Create Sparse Matrix in Python

最后都变了- submitted on 2019-12-23 04:58:17
Question: I'm working with data and would like to create a sparse matrix to use later for clustering.

    fileHandle = open('data', 'r')
    for line in fileHandle:
        json_list = []
        fields = line.split('\t')
        json_list.append(fields[0])
        json_list.append(fields[1])
        json_list.append(fields[3])

Right now the data looks like this:

    term, ids, quantity
    ['buick', '123,234', '500']
    ['chevy', '345,456', '300']
    ['suv','123', '100']

The output I need would look like this:

    term, quantity, '123', '234', '345',
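A rough sketch of one way to lay this out, assuming the truncated target means one 0/1 indicator column per id (that reading, the sample `records`, and the helper name `build_coo` are illustrative assumptions, not part of the question). It collects coordinate-format triplets in plain Python:

```python
# Sample rows in the question's layout: [term, comma-separated ids, quantity].
records = [
    ["buick", "123,234", "500"],
    ["chevy", "345,456", "300"],
    ["suv", "123", "100"],
]


def build_coo(records):
    """Return (terms, ids, rows, cols, vals) for a term-by-id indicator matrix."""
    terms = [rec[0] for rec in records]
    # Sorted unique ids become the matrix columns.
    ids = sorted({id_ for rec in records for id_ in rec[1].split(",")})
    col_of = {id_: j for j, id_ in enumerate(ids)}
    rows, cols, vals = [], [], []
    for r, rec in enumerate(records):
        for id_ in rec[1].split(","):
            rows.append(r)            # row index = position of the term
            cols.append(col_of[id_])  # column index = position of the id
            vals.append(1)            # indicator: this term carries this id
    return terms, ids, rows, cols, vals


terms, ids, rows, cols, vals = build_coo(records)
```

If SciPy is available, the triplets can be passed to `scipy.sparse.coo_matrix((vals, (rows, cols)), shape=(len(terms), len(ids)))`. The quantity column would stay in a separate dense list or array.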

R spdep giant weight matrix

此生再无相见时 submitted on 2019-12-23 04:51:11
Question: I'm new to spatial statistics, and I'm trying to create a spatial weight matrix for all Census tracts in the US in R. There are around 74,000 tracts. Based on the US Census TIGER files, I created a shapefile of all tracts, and then ran the following (using the spdep package):

    # Create adjacency matrix
    am = poly2nb(us)
    is.symmetric.nb(am)

This works fine, though am is pretty large. Next:

    am = nb2mat(am, style="B", zero.policy=T)

which gives me this error:

    Error: cannot allocate vector of size 40.9 Gb

Obviously my

Efficiently assign a row to a lil_matrix

别说谁变了你拦得住时间么 submitted on 2019-12-23 04:31:41
Question: How can I efficiently assign a row to a lil_matrix? I'm currently using:

    Q[mid, :] = new_Q

where new_Q is the result of lil_matrix.getrow(x). I ran a test comparing Q.getrow(i) with Q[i, :] and found the former to be 20x faster. Here's the lil_matrix documentation.

Answer 1: These time tests on a small lil matrix (dense, but I don't think that matters) suggest that x[i,:] is not a problem when setting. Yes, for some reason, it is slow when used to fetch a row.

    In [108]: x=sparse.lil_matrix(np.arange(120)

Set all NaN elements in sparse matrix to zero

血红的双手。 submitted on 2019-12-23 02:14:20
Question: What's the equivalent of the MATLAB statement X(isnan(X))=0 in R? Note that X is of class matrix.csr in R (from the SparseM package).

Answer 1: Are you sure you want to use the matrix.csr class? It is from the SparseM package and, as far as I can tell from the package documentation, there are no is.na<- or is.na[ methods. The Matrix package does document is.na methods:

    > library(Matrix);M <- Matrix(1:6, nrow=4, ncol=3,
    +     dimnames = list(c("a", "b", "c", "d"), c("A", "B", "C")))
    > stopifnot

SQL Server: how to populate sparse data with the rest of zero values?

杀马特。学长 韩版系。学妹 submitted on 2019-12-23 00:23:17
Question: I have data reporting sales for every month and every customer. When I count the values, the zero values are not reported because of the sparse data format. Suppose there are customers 1-4, and only customers 1-2 have records. The straight table has CustomerIDs on the rows and months on the columns, such that

    |CustomerID|MonthID|Value|
    -------------------------|
    |     1    |201101 |  10 |
    |     2    |201101 | 100 |

and then they are reported in crosstab format such that

    |CustomerID|201101|201102|2011103|...|201501|
    --

Random binary matrix with row and column sum constraints

天涯浪子 submitted on 2019-12-22 11:06:02
Question: My objective is to create a randomly populated matrix with entries either 0 or 1. In this particular case:

- The matrix is 4x24.
- The row sum of each of the 4 rows is exactly 6.
- The column sum of each of the 24 columns is exactly 1.

Call the desired matrix M. Another way of looking at M:

- There are exactly 24 entries equal to 1.
- No column has more than one 1 entry.

Progress: There are 6 spots on each row with a 1 entry. The rest are zero, so the matrix is sparse. With 4 rows, this means that M
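Since every column holds exactly one 1, the constraints amount to assigning each column to one of the rows, with every row receiving the same number of columns. A minimal sketch of that idea in plain Python (the function name `random_assignment_matrix` and its parameters are illustrative, not from the question):

```python
import random


def random_assignment_matrix(n_rows=4, n_cols=24, rng=None):
    """Return a 0/1 matrix with one 1 per column and n_cols // n_rows ones per row."""
    rng = rng or random.Random()
    per_row, remainder = divmod(n_cols, n_rows)
    if remainder:
        raise ValueError("n_cols must be a multiple of n_rows")
    # One row label per column, each label repeated per_row times, then shuffled.
    labels = [r for r in range(n_rows) for _ in range(per_row)]
    rng.shuffle(labels)
    M = [[0] * n_cols for _ in range(n_rows)]
    for col, row in enumerate(labels):
        M[row][col] = 1
    return M


M = random_assignment_matrix(rng=random.Random(0))
```

Shuffling the label list makes every valid arrangement equally likely. For large cases, only the `labels` list needs to be stored, because it already encodes the row index of the single nonzero in each column.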

How to visualize a sparse matrix in MATLAB?

可紊 submitted on 2019-12-22 10:34:44
Question: I have a matrix of size 13 x 8198 (I have called it 'blah'). It is a sparse matrix, in that most of its entries are 0. When I do imagesc(blah), I get the following image: Clearly this is worthless because I cannot clearly see the non-zero elements. I have tried playing around with the color scaling, but to no avail. Is there a nicer way to visualize this matrix in MATLAB? I am designing an algorithm and would

Sparse Matrix as input to Hierarchical clustering in R

心已入冬 submitted on 2019-12-22 09:49:15
Question: I have a question about clustering using a distance matrix that is sparse. Is there a sparse distance object format that does not expand the matrix and can work with the sparse representation? Currently I'm doing the following:

    # read sparse matrix
    sparse <- readMM('sparse-matrix')
    distance <- as.dist(sparse)

sparse-matrix is already the correct distance matrix, which has NAs for entries that are not connected.

    > sparse
    [1,] . . .
    [2,] 1 . .
    [3,] 1 . .
    > as.dist(sparse)
      1 2
    2 1
    3 1 0

But

Subset of a matrix multiplication, fast, and sparse

隐身守侯 submitted on 2019-12-22 09:10:06
Question: While converting collaborative filtering code to use sparse matrices, I'm puzzling over the following problem: given two full matrices X (m by l) and Theta (n by l), and a sparse matrix R (m by n), is there a fast way to calculate the sparse inner product . The large dimensions are m and n (order 100000), while l is small (order 10). This is probably a fairly common operation for big data since it shows up in the cost function of most linear regression problems, so I'd expect a solution built into scipy
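Reading "sparse inner product" as the entries of X times Theta-transpose at the positions where R is nonzero (an assumption, since the formula is missing from the excerpt), one possible sketch gathers the matching rows of X and Theta and takes row-wise dot products, so the full m-by-n product is never formed. The name `masked_product` and the small random inputs are illustrative only:

```python
import numpy as np


def masked_product(X, Theta, rows, cols):
    """Return (X @ Theta.T)[rows[k], cols[k]] for each k without the full product."""
    # X[rows] and Theta[cols] are both (nnz, l); einsum sums over the short axis l.
    return np.einsum("kl,kl->k", X[rows], Theta[cols])


rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))      # m = 5, l = 3
Theta = rng.standard_normal((4, 3))  # n = 4, l = 3
rows = np.array([0, 2, 4])           # row indices of the wanted entries
cols = np.array([1, 3, 0])           # column indices of the wanted entries
vals = masked_product(X, Theta, rows, cols)
```

With a SciPy sparse R, `rows` and `cols` could be taken from `R.nonzero()`, and the result wrapped back up with `scipy.sparse.coo_matrix((vals, (rows, cols)), shape=R.shape)`. Memory use is proportional to the number of nonzeros times l.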

Calculate subset of matrix multiplication

心不动则不痛 submitted on 2019-12-22 08:59:06
Question: When I have two non-sparse matrices A and B, is there a way to efficiently calculate C=A.T.dot(B) when I only want a subset of the elements of C? I have the desired indices of C stored in CSC format, which is specified here.

Answer 1: If you know in advance which parts of C you want and some of these parts are contiguous and rectangular regions*, then you can use the matrix algebra rules associated with the Multiplication of Partitioned Matrices (1) or Block matrix multiplication (2) to speed up
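For wanted entries that are scattered rather than rectangular, a different approach from the block-matrix one in the answer is to expand the CSC index arrays into explicit (row, column) pairs and compute one dot product per pair. This is only a sketch: the name `subset_of_AtB` and the sample pattern are hypothetical, and `indptr` and `indices` are assumed to follow the usual CSC convention (column j owns `indices[indptr[j]:indptr[j+1]]`).

```python
import numpy as np


def subset_of_AtB(A, B, indptr, indices):
    """Compute C[i, j] = A[:, i] . B[:, j] only at positions in a CSC pattern."""
    indptr = np.asarray(indptr)
    rows = np.asarray(indices)
    n_cols = len(indptr) - 1
    # Repeat each column index once per stored entry in that column.
    cols = np.repeat(np.arange(n_cols), np.diff(indptr))
    # A[:, rows] and B[:, cols] are (k, nnz); sum over the shared axis k.
    vals = np.einsum("ki,ki->i", A[:, rows], B[:, cols])
    return rows, cols, vals


rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))  # C = A.T @ B is 3 x 4
B = rng.standard_normal((6, 4))
indptr = [0, 2, 2, 3, 4]         # column 0 has 2 entries, column 1 has none
indices = [0, 2, 1, 0]           # row index of each wanted entry
rows, cols, vals = subset_of_AtB(A, B, indptr, indices)
```

The returned `vals` are in the same order as `indices`, so they can serve directly as the data array of a CSC matrix that shares `indices` and `indptr`.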