blas

memory leak in dgemm_

放肆的年华 submitted on 2019-12-05 18:23:05
I am currently working on an application which involves lots and lots of calls to BLAS routines. Routinely checking for memory leaks, I discovered that I am losing bytes in a dgemm call. The call looks like this: // I want to multiply 2 nxn matrices and put the result into C - an nxn matrix double zero = 0.0; double one = 1.0; int n; // matrix dimension char N = 'N'; dgemm_(&N, &N, &n, &n, &n, &one, A, &n, B, &n, &zero, C, &n); A, B and C are double fields of size n*n. The valgrind output is: ==78182== 18 bytes in 1 blocks are definitely lost in loss record 2 of 30 ==78182== at 0xB936:
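For reference, the same product can be formed through SciPy's low-level BLAS wrappers, which manage the integer dimension and leading-dimension arguments themselves; a minimal sketch (not the questioner's code, and it says nothing about the reported leak itself):

```python
import numpy as np
from scipy.linalg.blas import dgemm

n = 3
A = np.random.rand(n, n)
B = np.random.rand(n, n)

# C = 1.0 * A @ B + 0.0 * C, the same operation as the dgemm_ call above.
# SciPy passes the dimension arguments as integers internally, so type
# mismatches in the Fortran calling convention cannot occur here.
C = dgemm(alpha=1.0, a=A, b=B)

assert np.allclose(C, A @ B)
```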

lapack/blas/openblas proper installation from source - replace system libraries with new ones

对着背影说爱祢 submitted on 2019-12-05 17:51:59
I wanted to install the BLAS, CBLAS, LAPACK and OpenBLAS libraries from source using the packages you can download here: openblas and lapack, blas/cblas. First I removed my system blas/cblas and lapack libraries, but unfortunately the atlas library couldn't be uninstalled (I can either have both blas and lapack or atlas - I can't remove them all). I didn't mind and started compiling the downloaded libraries, because I thought that after installation I would be able to remove atlas. The build process was based on this tutorial. For completeness I will list the steps: OpenBLAS. After editing Makefile

Docker images with architecture optimisation?

拥有回忆 submitted on 2019-12-05 17:30:39
Some libraries such as BLAS/LAPACK or certain optimisation libraries get optimised for the local machine architecture at compile time. Let's take OpenBLAS as an example. There are two ways to create a Docker container with OpenBLAS: Use a Dockerfile in which you specify a git clone of the OpenBLAS library together with all necessary compilation flags and build commands. Pull and run someone else's image of Ubuntu + OpenBLAS from the Docker Hub. Option (1) guarantees that OpenBLAS is built and optimised for your machine. What about option (2)? As a Docker novice, I see images as

Link MKL to an installed Numpy in Anaconda?

杀马特。学长 韩版系。学妹 submitted on 2019-12-05 14:39:34
>>> numpy.__config__.show()
atlas_threads_info:
  NOT AVAILABLE
blas_opt_info:
    libraries = ['f77blas', 'cblas', 'atlas']
    library_dirs = ['/home/admin/anaconda/lib']
    define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
    language = c
atlas_blas_threads_info:
  NOT AVAILABLE
openblas_info:
  NOT AVAILABLE
lapack_opt_info:
    libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
    library_dirs = ['/home/admin/anaconda/lib']
    define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
    language = f77
openblas_lapack_info:
  NOT AVAILABLE
atlas_info:
    libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
    library_dirs = ['/home/admin
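A quick way to check which BLAS a NumPy build is actually linked against is `numpy.show_config()`; a minimal sketch (the dump above shows ATLAS, whereas an MKL-linked build would list 'mkl' libraries instead):

```python
import numpy as np

# Print the BLAS/LAPACK configuration NumPy was built against. With MKL
# correctly linked, the library names mention 'mkl' rather than 'atlas'.
np.show_config()
```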

Theano CNN on CPU: AbstractConv2d Theano optimization failed

半城伤御伤魂 submitted on 2019-12-05 12:30:09
Question: I'm trying to train a CNN for object detection on images from the CIFAR10 dataset for a seminar at my university, but I get the following error: AssertionError: AbstractConv2d Theano optimization failed: there is no implementation available supporting the requested options. Did you exclude both "conv_dnn" and "conv_gemm" from the optimizer? If on GPU, is cuDNN available and does the GPU support it? If on CPU, do you have a BLAS library installed Theano can link against? I am running Anaconda 2
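The last part of the error asks whether a linkable BLAS is available. One common way to point Theano at OpenBLAS, assuming libopenblas is installed and on the linker path, is a `.theanorc` fragment like this (a sketch, not the asker's configuration):

```ini
# ~/.theanorc -- assumes libopenblas is installed and on the linker path
[blas]
ldflags = -lopenblas

[global]
device = cpu
floatX = float32
```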

cblas gemm time dependent on input matrix values - Ubuntu 14.04

[亡魂溺海] submitted on 2019-12-05 07:56:09
Question: This is an extension of my earlier question, but I am asking it separately because I am getting really frustrated, so please do not down-vote it! Question: What could be the reason behind a cblas_sgemm call taking much less time for matrices with a large number of zeros compared to the same cblas_sgemm call for dense matrices? I know gemv is designed for matrix-vector multiplication, but why can't I use gemm for vector-matrix multiplication if it takes less time, especially for sparse
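A minimal timing harness for reproducing this comparison through SciPy's BLAS wrappers (a sketch; the matrix size and repeat count are arbitrary choices, and which case is faster depends on the BLAS implementation):

```python
import time
import numpy as np
from scipy.linalg.blas import sgemm

n = 512
dense = np.random.rand(n, n).astype(np.float32)
mostly_zero = np.zeros((n, n), dtype=np.float32)
mostly_zero[0, 0] = 1.0

def time_gemm(a, b, repeats=5):
    """Return the best-of-repeats wall time for a single sgemm call."""
    best = float('inf')
    for _ in range(repeats):
        t0 = time.perf_counter()
        sgemm(alpha=1.0, a=a, b=b)
        best = min(best, time.perf_counter() - t0)
    return best

t_dense = time_gemm(dense, dense)
t_sparse = time_gemm(mostly_zero, mostly_zero)
print(f"dense: {t_dense:.4f}s  mostly-zero: {t_sparse:.4f}s")
```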

iOS 4 Accelerate Cblas with 4x4 matrices

时光总嘲笑我的痴心妄想 submitted on 2019-12-05 07:03:43
I've been looking into the Accelerate framework that became available in iOS 4. Specifically, I made some attempts to use the Cblas routines in my linear algebra library in C. Now I can't get these functions to give me any performance gain over very basic routines. Specifically, the case of 4x4 matrix multiplication. Wherever I couldn't make use of affine or homogeneous properties of the matrices, I've been using this routine (abridged): float *mat4SetMat4Mult(const float *m0, const float *m1, float *target) { target[0] = m0[0] * m1[0] + m0[4] * m1[1] + m0[8] * m1[2] + m0[12] * m1
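The quoted routine computes target[col*4+row] = sum_k m0[k*4+row] * m1[col*4+k] over column-major 4x4 arrays, i.e. per-call overhead of a BLAS dispatch is comparable to the 64 multiply-adds of actual work. A sketch of the same naive multiply in Python, checked against NumPy (the helper name `mat4_mult` is made up here):

```python
import numpy as np

def mat4_mult(m0, m1):
    """Naive 4x4 multiply over flat, column-major float lists,
    mirroring the C routine quoted above."""
    target = [0.0] * 16
    for col in range(4):
        for row in range(4):
            target[col * 4 + row] = sum(
                m0[k * 4 + row] * m1[col * 4 + k] for k in range(4)
            )
    return target

m0 = list(np.random.rand(16))
m1 = list(np.random.rand(16))
ref = (np.array(m0).reshape(4, 4, order='F') @
       np.array(m1).reshape(4, 4, order='F'))
assert np.allclose(np.array(mat4_mult(m0, m1)).reshape(4, 4, order='F'), ref)
```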

Installing BLAS on Mac OS X Yosemite

不想你离开。 submitted on 2019-12-04 23:43:21
Question: I'm trying to install BLAS on my Mac, but every time I run make I get this error (shown below the link). I was trying to follow the instructions on this website: gfortran -O3 -c isamax.f -o isamax.o make: gfortran: No such file or directory make: *** [isamax.o] Error 1 I have no idea what this means or how to fix it, so any help would be appreciated. Also, I'm trying to install CBLAS and LAPACK, so any tips/instructions for that would be nice if you know of a good source... Everything I've found

Does scipy support multithreading for sparse matrix multiplication when using MKL BLAS?

不问归期 submitted on 2019-12-04 22:58:17
Question: According to the MKL BLAS documentation, "All matrix-matrix operations (level 3) are threaded for both dense and sparse BLAS." http://software.intel.com/en-us/articles/parallelism-in-the-intel-math-kernel-library I have built SciPy with MKL BLAS. Using the test code below, I see the expected multithreaded speedup for dense, but not sparse, matrix multiplication. Are there any changes to SciPy to enable multithreaded sparse operations? # test dense matrix multiplication from numpy import * import
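For context, SciPy's sparse matrix product is computed by SciPy's own compiled CSR kernels rather than by the linked BLAS, which is consistent with seeing threading for dense but not sparse products. A small sketch contrasting the two code paths (sizes and density are arbitrary):

```python
import numpy as np
import scipy.sparse as sp

n = 1000

# Dense product: dispatched to the linked BLAS dgemm
# (threaded when NumPy is built against MKL or OpenBLAS).
a = np.random.rand(n, n)
dense_prod = a @ a

# Sparse product: handled by SciPy's own C++ sparsetools kernels,
# which do not call into BLAS, so MKL's threaded sparse routines
# are never reached through this path.
s = sp.random(n, n, density=0.01, format='csr')
sparse_prod = s @ s

assert dense_prod.shape == (n, n)
assert sparse_prod.shape == (n, n)
```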

Symmetric Matrix Inversion in C using CBLAS/LAPACK

余生长醉 submitted on 2019-12-04 16:17:29
I am writing an algorithm in C that requires matrix and vector multiplications. I have a matrix Q (W x W) which is created by multiplying the transpose of a vector J (1 x W) with itself and adding the identity matrix I, scaled by a scalar a: Q = (J^T)*J + aI. I then have to multiply the inverse of Q with a vector G to get a vector M: M = (Q^(-1))*G. I am using cblas and clapack to develop my algorithm. When matrix Q is populated with random numbers (type float) and inverted using the routines sgetrf_ and sgetri_, the calculated inverse is correct. But when matrix Q is symmetrical, which
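For the symmetric positive-definite Q described here, the usual advice is to factor with the Cholesky routines (spotrf_/spotrs_) and solve Q M = G directly rather than form the inverse explicitly. A NumPy sketch of the same computation (the dimension W and scalar a are arbitrary choices):

```python
import numpy as np

W = 5
a = 0.5
J = np.random.rand(1, W)
G = np.random.rand(W)

# Q = (J^T)*J + aI is symmetric positive definite for a > 0.
Q = J.T @ J + a * np.eye(W)

# Solve Q M = G instead of computing Q^(-1) and multiplying:
# this is the numerically preferable analogue of sgetrf_/sgetri_ + gemv.
M = np.linalg.solve(Q, G)

assert np.allclose(Q @ M, G)
```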