matrix-multiplication

Why is a naïve C++ matrix multiplication 100 times slower than BLAS?

Submitted by 杀马特。学长 韩版系。学妹 on 2019-11-30 03:52:28
I am taking a look at large matrix multiplication and ran the following experiment to form a baseline test: randomly generate two 4096x4096 matrices X, Y from a standard normal distribution (0 mean, 1 stddev), compute Z = X*Y, then sum the elements of Z (to make sure they are accessed) and output the result. Here is the naïve C++ implementation:

```cpp
#include <iostream>
#include <algorithm>
#include <random>

using namespace std;

int main() {
    constexpr size_t dim = 4096;
    float* x = new float[dim*dim];
    float* y = new float[dim*dim];
    float* z = new float[dim*dim];

    random_device rd;
    mt19937 gen(rd());
    normal_distribution<float> dist(0, 1);
    for (size_t i = 0; i < dim
```
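A toy illustration of the gap the question is about, sketched in Python/NumPy (an aside, not the asker's code): a naive triple loop against NumPy's matmul, which dispatches to whatever BLAS NumPy is linked against (e.g. OpenBLAS or MKL). The dimension is deliberately tiny, since in pure Python the slowdown is dominated by interpreter overhead, whereas in the C++ case the gap comes from the BLAS's cache blocking and SIMD.

```python
import time
import numpy as np

dim = 128  # small on purpose; the pure-Python loop would take hours at 4096
X = np.random.standard_normal((dim, dim)).astype(np.float32)
Y = np.random.standard_normal((dim, dim)).astype(np.float32)

def naive_matmul(A, B):
    """Textbook i-j-k triple loop: no blocking, no vectorisation."""
    n = A.shape[0]
    C = np.zeros((n, n), dtype=A.dtype)
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                s += A[i, k] * B[k, j]
            C[i, j] = s
    return C

t0 = time.perf_counter()
Z_naive = naive_matmul(X, Y)
t1 = time.perf_counter()
Z_blas = X @ Y                      # BLAS-backed sgemm
t2 = time.perf_counter()

print(f"naive: {t1 - t0:.3f}s, BLAS: {t2 - t1:.6f}s, sum(Z) = {Z_blas.sum():.3f}")
assert np.allclose(Z_naive, Z_blas, atol=1e-2)
```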

What is R's multidimensional equivalent of rbind and cbind?

Submitted by 烂漫一生 on 2019-11-30 02:39:12
When working with matrices in R, one can put them side by side or stack them on top of each other using cbind and rbind, respectively. What is the equivalent function for stacking matrices or arrays in other dimensions? For example, the following creates a pair of 2x2 matrices, each having 4 elements:

```r
x = cbind(1:2, 3:4)
y = cbind(5:6, 7:8)
```

What is the code to combine them into a 2x2x2 array with 8 elements?

mdsumner answers: See the abind package. If you want them to bind on a 3rd dimension, do this:

```r
library(abind)
abind(x, y, along = 3)
```

See ?abind. Also, abind gives a lot more convenience, but for simple …
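For readers more at home in NumPy, the same "bind along a new third dimension" operation looks like this (a Python aside for comparison only; the abind call above is the R answer):

```python
import numpy as np

# The R matrices x = cbind(1:2, 3:4) and y = cbind(5:6, 7:8)
x = np.array([[1, 3],
              [2, 4]])
y = np.array([[5, 7],
              [6, 8]])

# Stack along a new third axis, analogous to abind(x, y, along = 3)
z = np.stack((x, y), axis=2)
print(z.shape)        # (2, 2, 2)
print(z[:, :, 0])     # recovers x, like z[,,1] in R
```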

Matrix multiplication: solve Ax = b for x

Submitted by  ̄綄美尐妖づ on 2019-11-30 01:50:22
Question: I was given a homework assignment that requires solving for the coefficients of cubic splines. I clearly understand how to do the math on paper as well as with MATLAB, but I want to solve the problem with Python. Given an equation Ax = b where I know the values of A and b, I want to be able to solve for x with Python, and I am having trouble finding a good resource for doing such a thing. Example:

```
A = | 1 0 0 |        b = |  0 |
    | 1 4 1 |            | 24 |
    | 0 0 1 |            |  0 |

x = unknown 3x1 vector
```

Solve for x.

Answer 1: In a general case, …
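One common way to do this in Python, shown as a minimal sketch using numpy.linalg.solve on the example system above (scipy.linalg.solve behaves the same way):

```python
import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 0.0, 1.0]])
b = np.array([0.0, 24.0, 0.0])

# Solve A x = b directly (LU factorisation under the hood); this is
# generally preferred over forming inv(A) and multiplying.
x = np.linalg.solve(A, b)
print(x)                     # [0. 6. 0.]

assert np.allclose(A @ x, b)
```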

How to write a matrix matrix product that can compete with Eigen?

Submitted by ﹥>﹥吖頭↗ on 2019-11-30 00:47:51
Below is a C++ implementation comparing the time taken by Eigen and by a hand-written for loop to perform matrix-matrix products. The for loop has been optimised to minimise cache misses. The for loop is faster than Eigen initially but eventually becomes slower (by up to a factor of 2 for 500 by 500 matrices). What else should I do to compete with Eigen? Is blocking the reason for the better Eigen performance? If so, how should I go about adding blocking to the for loop?

```cpp
#include <iostream>
#include <Eigen/Dense>
#include <ctime>

int main(int argc, char* argv[]) {
    srand(time(NULL));
    // Input the size of the …
```
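A sketch of what cache blocking means for this loop, written in Python/NumPy purely to show the tiling structure (the per-block products are delegated to NumPy here; in a C++ version each block product would itself be a register-tiled loop nest):

```python
import numpy as np

def blocked_matmul(A, B, bs=64):
    """Tile the i/k/j loops so that each bs x bs block of A, B and C
    is reused while it is still hot in cache."""
    n = A.shape[0]
    C = np.zeros((n, n), dtype=A.dtype)
    for i in range(0, n, bs):
        for k in range(0, n, bs):
            for j in range(0, n, bs):
                C[i:i+bs, j:j+bs] += A[i:i+bs, k:k+bs] @ B[k:k+bs, j:j+bs]
    return C

A = np.random.rand(500, 500)
B = np.random.rand(500, 500)
assert np.allclose(blocked_matmul(A, B), A @ B)
```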

Multiply two 100-Digit Numbers inside Excel Using Matrix

Submitted by 末鹿安然 on 2019-11-29 18:17:06
I want to multiply two 100-digit numbers in Excel using a matrix. The issue is that Excel keeps only 15 significant digits and shows 0 beyond that, so the output also needs to be in a matrix.

1st number: "9999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999"
2nd number: "2222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222"
Output: …
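Whatever the exact spreadsheet formulation, the arithmetic it has to implement is schoolbook long multiplication on arrays of digits: a column of digit-by-digit partial products followed by carry propagation. A small Python sketch of that scheme, as a reference for what the Excel matrix needs to reproduce:

```python
def long_multiply(a: str, b: str) -> str:
    """Multiply two non-negative integers given as decimal strings,
    using digit arrays so no digit is ever lost to float precision."""
    da = [int(c) for c in reversed(a)]   # least-significant digit first
    db = [int(c) for c in reversed(b)]
    acc = [0] * (len(da) + len(db))

    # accumulate all digit-by-digit partial products per column
    for i, x in enumerate(da):
        for j, y in enumerate(db):
            acc[i + j] += x * y

    # carry propagation
    carry = 0
    for k in range(len(acc)):
        carry, acc[k] = divmod(acc[k] + carry, 10)

    return ''.join(map(str, reversed(acc))).lstrip('0') or '0'

a = '9' * 100
b = '2' * 100
assert long_multiply(a, b) == str(int(a) * int(b))
print(long_multiply(a, b))
```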

numpy - matrix multiply 3x3 and 100x100x3 arrays?

Submitted by 旧巷老猫 on 2019-11-29 16:28:39
I have the following:

```python
import numpy as np

XYZ_to_sRGB_mat_D50 = np.asarray([
    [3.1338561, -1.6168667, -0.4906146],
    [-0.9787684, 1.9161415, 0.0334540],
    [0.0719453, -0.2289914, 1.4052427],
])

XYZ_1 = np.asarray([0.25, 0.4, 0.1])
XYZ_2 = np.random.rand(100, 100, 3)

np.matmul(XYZ_to_sRGB_mat_D50, XYZ_1)  # valid operation
np.matmul(XYZ_to_sRGB_mat_D50, XYZ_2)  # makes no sense mathematically
```

How do I perform the same operation on XYZ_2 that I did on XYZ_1? Do I somehow reshape the array first?

Answer: It seems you are trying to sum-reduce the last axis of XYZ_to_sRGB_mat_D50 (axis=1) with the last one of XYZ …
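A sketch of how that contraction can be spelled out, assuming the intent is to apply the 3x3 colour matrix to every 3-vector in the 100x100 grid: contract the last axis of the matrix with the last axis of the array, either via einsum or via a broadcasting matmul against the transpose.

```python
import numpy as np

M = np.asarray([
    [3.1338561, -1.6168667, -0.4906146],
    [-0.9787684, 1.9161415, 0.0334540],
    [0.0719453, -0.2289914, 1.4052427],
])
XYZ_1 = np.asarray([0.25, 0.4, 0.1])
XYZ_2 = np.random.rand(100, 100, 3)

# out[h, w, i] = sum_j M[i, j] * XYZ_2[h, w, j]
out_einsum = np.einsum('ij,hwj->hwi', M, XYZ_2)

# Equivalent: broadcasting matmul against the transposed matrix
out_matmul = XYZ_2 @ M.T

assert np.allclose(out_einsum, out_matmul)
# The 1-D case reduces to the ordinary matrix-vector product
assert np.allclose(np.einsum('ij,j->i', M, XYZ_1), M @ XYZ_1)
```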

Dynamic matrix multiplication with CUDA

Submitted by 牧云@^-^@ on 2019-11-29 15:37:45
The idea of the simple program I've been trying to write is to take input from the user specifying how large a matrix to multiply. I am looking to take the input as x by x; I am not currently looking to multiply two different sizes. How would you suggest I go about accomplishing this? Sorry, my question was not clear enough: I want to modify this kernel so that it can handle a matrix of any size (where x and y are equal, to keep it simple), instead of only multiples of 16. I'm not sure if you need my current code, but here is the kernel code: // CUDA Kernel …

fast matrix multiplication in Matlab

Submitted by 社会主义新天地 on 2019-11-29 15:06:15
I need to perform a matrix/vector multiplication in Matlab at very large sizes: A is a 655360 by 5 real-valued matrix that is not necessarily sparse, and B is a 655360 by 1 real-valued vector. My question is how to compute B'*A efficiently. I have noticed a slight time improvement by computing A'*B instead, which gives a column vector, but it is still quite slow (I need to perform this operation several times in the program). With a little bit of searching I found an interesting Matlab toolbox, MTIMESX by James Tursa, which I hoped would improve the above matrix multiplication performance. After …
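As an aside (checked in NumPy rather than Matlab), the reordering mentioned above works because B'*A and (A'*B)' are the same quantity, so either orientation can be computed and then transposed. At these shapes the product is a single pass over A, so memory bandwidth rather than arithmetic tends to dominate the cost.

```python
import numpy as np

# Same shapes as in the question: A is 655360 x 5, B is 655360 x 1
A = np.random.rand(655360, 5)
B = np.random.rand(655360, 1)

row = B.T @ A          # B' * A, a 1 x 5 row vector
col = A.T @ B          # A' * B, a 5 x 1 column vector

# The two are transposes of one another
assert np.allclose(row, col.T)
```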

Can UIPinchGestureRecognizer and UIPanGestureRecognizer Be Merged?

Submitted by ∥☆過路亽.° on 2019-11-29 14:06:59
Question: I am struggling a bit trying to figure out whether it is possible to create a single combined gesture recognizer that merges UIPinchGestureRecognizer with UIPanGestureRecognizer. I am using pan for view translation and pinch for view scaling. I am doing incremental matrix concatenation to derive a resultant final transformation matrix that is applied to the view; this matrix has both scale and translation. Using separate gesture recognizers leads to jittery movement/scaling, which is not what I want.

Efficient way of computing matrix product AXA'?

Submitted by 久未见 on 2019-11-29 14:05:34
I'm currently using the BLAS function DSYMM to compute Y = AX and then DGEMM for YA', but I'm wondering: is there a more efficient way of computing the matrix product AXA^T, where A is an arbitrary n×n matrix and X is a symmetric n×n matrix?

Source: https://stackoverflow.com/questions/11139933/efficient-way-of-computing-matrix-product-axa
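A small NumPy sketch of the two-step product described in the question (the DSYMM step Y = AX followed by the DGEMM step Z = YA'), plus the property that makes a cheaper route conceivable: since X is symmetric, AXA^T is symmetric as well, so in principle only one triangle of the result needs to be formed, which a plain GEMM does not exploit.

```python
import numpy as np

n = 500
A = np.random.rand(n, n)
X = np.random.rand(n, n)
X = (X + X.T) / 2          # symmetric X, as in the question

# Two-step product, mirroring DSYMM (Y = A X) followed by DGEMM (Z = Y A')
Y = A @ X
Z = Y @ A.T

# (A X A')^T = A X^T A^T = A X A'  because X is symmetric
assert np.allclose(Z, Z.T)
```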