blas

Element wise multiplication between matrices in BLAS?

Submitted by 北慕城南 on 2019-12-04 04:42:02
Question: I'm starting to use BLAS functions in C++ (specifically Intel MKL) to create faster versions of some of my old Matlab code. It's been working out well so far, but I can't figure out how to perform element-wise multiplication on two matrices (A .* B in Matlab). I know gemv does something similar between a matrix and a vector, so should I just break one of my matrices into vectors and call gemv repeatedly? I think that would work, but I feel like there should be something built in for this operation.
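For context: standard BLAS has no element-wise (Hadamard) product routine, which is why nothing like "gemm for .*" turns up. MKL's Vector Mathematics functions cover it (vdMul multiplies two double arrays element by element), and since a dense matrix is stored as one contiguous array, one call over all m*n elements handles the whole matrix. The Python below is only a reference sketch of the semantics, not the MKL API:

```python
# Reference semantics of the element-wise (Hadamard) product, the operation
# Matlab writes as A .* B. BLAS proper has no routine for this; MKL's VML
# function vdMul performs it on flat arrays.
def hadamard(a, b):
    """Element-wise product of two matrices given as nested lists."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("shape mismatch")
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
print(hadamard(A, B))  # [[5.0, 12.0], [21.0, 32.0]]
```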

How to check which BLAS is in my Ubuntu system?

Submitted by |▌冷眼眸甩不掉的悲伤 on 2019-12-04 04:11:01
In particular, I would like to know whether xianyi's OpenBLAS has been installed. I work on several PCs and installed it on several of them over the past couple of years, but I lost track of which ones have it. I need to know which PCs have it and which don't. This is how I installed it:

git clone git://github.com/xianyi/OpenBLAS
cd OpenBLAS
make FC=gfortran
sudo make PREFIX=/usr/local/ install

Note: I may have deleted the OpenBLAS source directory afterwards, so its presence is not a reliable indicator. And I have no idea how to uninstall it, so I can't try installing it on every PC and then uninstalling selectively.
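Given the `PREFIX=/usr/local/` install above, one quick cross-PC check can be scripted with only Python's standard library: ask the linker cache via `ctypes.util.find_library` and look for the installed files directly. The paths below match the install command in the question but are still assumptions for any given machine:

```python
# Look for traces of an OpenBLAS install: the system linker cache, plus the
# files a "make PREFIX=/usr/local/ install" would have left behind.
# Both locations are assumptions -- adjust for your machines.
import glob
from ctypes.util import find_library

def openblas_hints():
    hints = []
    lib = find_library("openblas")
    if lib:
        hints.append("linker cache knows: %s" % lib)
    for path in glob.glob("/usr/local/lib/libopenblas*"):
        hints.append("file present: %s" % path)
    return hints

print(openblas_hints() or "no trace of OpenBLAS found")
```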

Theano CNN on CPU: AbstractConv2d Theano optimization failed

Submitted by …衆ロ難τιáo~ on 2019-12-04 00:03:45
I'm trying to train a CNN for object detection on images from the CIFAR10 dataset for a seminar at my university, but I get the following error:

AssertionError: AbstractConv2d Theano optimization failed: there is no implementation available supporting the requested options. Did you exclude both "conv_dnn" and "conv_gemm" from the optimizer? If on GPU, is cuDNN available and does the GPU support it? If on CPU, do you have a BLAS library installed Theano can link against?

I am running Anaconda 2.7 within a Jupyter notebook (training the CNN on the CPU) on a Windows 10 machine. As I already have updated
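On a CPU-only setup this assertion usually means Theano cannot find any BLAS to link against. A common fix is to install OpenBLAS and point Theano at it through its config file; the library location below is an assumption to adapt to where OpenBLAS actually lives on your machine:

```ini
; ~/.theanorc (on Windows: %USERPROFILE%\.theanorc.txt)
; Tell Theano which BLAS to link against. The path and library name
; are placeholders -- substitute your own OpenBLAS install.
[blas]
ldflags = -LC:\openblas\lib -lopenblas

[global]
device = cpu
floatX = float32
```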

Installing BLAS on a mac OS X Yosemite

Submitted by 谁说我不能喝 on 2019-12-03 16:28:17
I'm trying to install BLAS on my Mac, but every time I run make I get the error shown below. I was trying to follow the instructions on this website:

gfortran -O3 -c isamax.f -o isamax.o
make: gfortran: No such file or directory
make: *** [isamax.o] Error 1

I have no idea what this means or how to fix it, so any help would be appreciated. I'm also trying to install CBLAS and LAPACK, so any tips or instructions for those would be welcome if you know of a good source; everything I've found so far is pretty confusing. I also tried to install ATLAS, but it kept not working. This error is

Does scipy support multithreading for sparse matrix multiplication when using MKL BLAS?

Submitted by 萝らか妹 on 2019-12-03 14:15:44
According to the MKL BLAS documentation, "All matrix-matrix operations (level 3) are threaded for both dense and sparse BLAS." (http://software.intel.com/en-us/articles/parallelism-in-the-intel-math-kernel-library) I have built SciPy with MKL BLAS. Using the test code below, I see the expected multithreaded speedup for dense, but not sparse, matrix multiplication. Are there any changes to SciPy that would enable multithreaded sparse operations?

# test dense matrix multiplication
from numpy import *
import time
x = random.random((10000, 10000))
t1 = time.time()
foo = dot(x.T, x)
print(time.time() - t1)
# test

How to make sure the numpy BLAS libraries are available as dynamically-loadable libraries?

Submitted by 放肆的年华 on 2019-12-03 08:27:38
Question: The Theano installation documentation states that Theano will by default use the BLAS libraries from NumPy, if the "BLAS libraries are available as dynamically-loadable libraries". This seems not to be working on my machine; see the error message. How do I find out whether the NumPy BLAS libraries are available as dynamically-loadable libraries? How do I recompile the NumPy BLAS libraries if they are not dynamically loadable? Please let me know if you need more information! Error message: We did not
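A library is "dynamically loadable" exactly when the runtime linker can open it, so one direct test is to try dlopen-ing the usual BLAS shared-library names with `ctypes` (standard library only). The candidate names below are assumptions for a Linux system; on macOS they would end in .dylib:

```python
# Probe whether a BLAS shared library can actually be loaded at runtime.
# ctypes.CDLL performs a real dlopen, which is the same mechanism Theano
# relies on when it links against NumPy's BLAS.
import ctypes

def first_loadable(names):
    for name in names:
        try:
            ctypes.CDLL(name)
            return name
        except OSError:
            continue
    return None

candidates = ["libblas.so.3", "libopenblas.so.0", "libblas.so"]
print(first_loadable(candidates))  # a library name, or None if nothing loads
```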

How to perform Vector-Matrix Multiplication with BLAS ?

Submitted by ≯℡__Kan透↙ on 2019-12-03 07:46:53
BLAS defines the GEMV (matrix-vector multiplication) level-2 operation. How do you use a BLAS library to perform vector-matrix multiplication? It's probably obvious, but I don't see how to use a BLAS operation for this multiplication; I would have expected a GEVM operation. The matrix-vector multiplication of an (M x N) matrix with an (N x 1) vector results in an (M x 1) vector, in short a*A(MxN)*X(Nx1) + b*Y(Mx1) -> Y(Mx1). Of course you can use INCX and INCY when your vector is embedded in a matrix. To express a vector-matrix multiplication, the vector should be transposed, i.e. a*X(1xM)*A
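The reason BLAS needs no GEVM is the identity x^T A = (A^T x)^T: a row-vector times a matrix is just GEMV on the transpose, which is what GEMV's trans argument (CblasTrans / 'T') requests without physically transposing anything. A plain-Python sketch of the y = alpha*op(A)*x + beta*y contract that gemv fulfils:

```python
# Semantics of GEMV: y = alpha*op(A)*x + beta*y, where op(A) is A or A^T
# depending on the trans flag. Vector-matrix multiplication x^T A is the
# trans=True case, since x^T A = (A^T x)^T.
def gemv(alpha, a, x, beta, y, trans=False):
    if trans:  # use A^T: one result entry per column of A
        rows = [[a[i][j] for i in range(len(a))] for j in range(len(a[0]))]
    else:
        rows = a
    return [alpha * sum(r[k] * x[k] for k in range(len(x))) + beta * y[i]
            for i, r in enumerate(rows)]

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
# vector-matrix: x (1x2) times A (2x3) -> 1x3, via the transpose trick
x = [1.0, 1.0]
print(gemv(1.0, A, x, 0.0, [0.0, 0.0, 0.0], trans=True))  # [5.0, 7.0, 9.0]
```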

Distributing Cython based extensions using LAPACK

Submitted by 生来就可爱ヽ(ⅴ<●) on 2019-12-03 05:21:56
Question: I am writing a Python module that includes Cython extensions and uses LAPACK (and BLAS). I am open to using either clapack or lapacke, or some kind of f2c or f2py solution if necessary. What is important is that I am able to call LAPACK and BLAS routines from Cython in tight loops without Python call overhead. I've found one example here. However, that example depends on SAGE. I want my module to be installable without installing SAGE, since my users are not likely to want or need SAGE for
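One SAGE-free route worth noting: SciPy (0.16 and later) ships Cython-level wrappers for the BLAS/LAPACK it links against, in scipy.linalg.cython_blas and scipy.linalg.cython_lapack, callable from nogil code with no Python overhead. A sketch of a .pyx fragment (untested here; the Fortran-style pointer signature is how these wrappers are exposed):

```cython
# Sketch: calling BLAS ddot from Cython in a tight loop with no Python
# call overhead, via the cimport interface SciPy ships. Requires SciPy
# at build time; compile as a normal Cython extension.
from scipy.linalg.cython_blas cimport ddot

cdef double dot_in_tight_loop(double[::1] x, double[::1] y) nogil:
    cdef int n = x.shape[0]
    cdef int incx = 1
    cdef int incy = 1
    # Fortran-style interface: every argument is passed by pointer
    return ddot(&n, &x[0], &incx, &y[0], &incy)
```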

What is a good free (open source) BLAS/LAPACK library for .net (C#)? [closed]

Submitted by a 夏天 on 2019-12-03 05:06:44
Question: Closed. This question is off-topic and is not currently accepting answers. Closed 7 months ago. I have a project written in C# where I need to do various linear-algebra operations on matrices (like LU factorization). Since the program is mainly a prototype created to confirm a theory, a C# implementation will suffice (compared to a possibly speedier C++ one), but I would still like a good BLAS or LAPACK

Replicating BLAS matrix multiplication performance: Can I match it?

Submitted by 六月ゝ 毕业季﹏ on 2019-12-03 04:18:22
Question: Background: If you have been following my posts, I am attempting to replicate the results in Kazushige Goto's seminal paper on square matrix multiplication, C = AB. My last post on this topic can be found here. In that version of my code, I follow Goto's memory-layering and packing strategy, with an inner kernel that computes 2x8 blocks of C using 128-bit SSE3 intrinsics. My CPU is an i5-540M with hyper-threading off. Additional info about my hardware can be found in another post and
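For readers new to this series: the core of Goto's strategy is to partition the matrices into blocks sized for the cache hierarchy, pack each block contiguously, and run a small register-level kernel over the packed tiles. A pure-Python sketch of that loop structure (no packing buffers or SSE here; block size is illustrative, where real implementations tune it to L1/L2 capacity):

```python
# Sketch of the loop nest behind Goto-style GEMM: block all three matrix
# dimensions so the working set fits in cache, then run an inner "kernel"
# that updates one tile of C per (i0, j0, p0) block combination.
def blocked_matmul(A, B, bs=2):
    m, k, n = len(A), len(B), len(B[0])
    C = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, bs):
        for j0 in range(0, n, bs):
            for p0 in range(0, k, bs):
                # inner kernel: update the C[i0:i0+bs, j0:j0+bs] tile
                for i in range(i0, min(i0 + bs, m)):
                    for j in range(j0, min(j0 + bs, n)):
                        s = 0.0
                        for p in range(p0, min(p0 + bs, k)):
                            s += A[i][p] * B[p][j]
                        C[i][j] += s
    return C

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
print(blocked_matmul(A, B))  # [[19.0, 22.0], [43.0, 50.0]]
```

The blocking changes only the traversal order, not the result, which is why the real speed comes from pairing it with packed tiles and a vectorized kernel.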