blas

Numpy, BLAS and CUBLAS

Posted by 若如初见. on 2019-11-29 02:57:20
Question: NumPy can be "linked/compiled" against different BLAS implementations (MKL, ACML, ATLAS, GotoBLAS, etc.). That's not always straightforward to configure, but it is possible. Is it also possible to "link/compile" NumPy against NVIDIA's CUBLAS implementation? I couldn't find any resources on the web, and before I spend too much time trying it I wanted to make sure it is possible at all.

Answer 1: In a word: no, you can't do that. There is a rather good scikit which provides access to CUBLAS from …
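The "scikit" referred to is presumably scikit-cuda, which exposes CUBLAS to Python without relinking NumPy itself. A minimal sketch (not part of the original answer), assuming scikit-cuda and PyCUDA are installed and a CUDA device is available:

import numpy as np
import pycuda.autoinit              # importing this creates a CUDA context
import pycuda.gpuarray as gpuarray
from skcuda import linalg

linalg.init()                       # initializes the CUBLAS handle
a = np.random.rand(4, 4).astype(np.float32)
b = np.random.rand(4, 4).astype(np.float32)
a_gpu = gpuarray.to_gpu(a)          # copy operands to device memory
b_gpu = gpuarray.to_gpu(b)
c_gpu = linalg.dot(a_gpu, b_gpu)    # dispatched to CUBLAS sgemm
print(np.allclose(c_gpu.get(), np.dot(a, b)))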

libgfortran: version `GFORTRAN_1.4' not found

Posted by 纵饮孤独 on 2019-11-29 01:25:50
Question: I am getting the following error when I try to run a MEX file in MATLAB:

??? Invalid MEX-file 'findimps3.mexa64': /MATLAB/bin/glnxa64/../../sys/os/glnxa64/libgfortran.so.3: version `GFORTRAN_1.4' not found (required by /usr/lib/libblas.so.3gf)

Any ideas how to solve this problem? Update: I found out that "strings MATLAB/.../libgfortran.so.3 | grep GFORTRAN" outputs GFORTRAN_1.0. I tried to change the libgfortran inside MATLAB, but it didn't work. Now I think it's better to find a suitable …

Purpose of LDA argument in BLAS dgemm?

Posted by 倾然丶 夕夏残阳落幕 on 2019-11-29 00:20:37
Question: The Fortran reference implementation documentation states:

*  LDA - INTEGER.
*  On entry, LDA specifies the first dimension of A as declared
*  in the calling (sub) program. When TRANSA = 'N' or 'n' then
*  LDA must be at least max( 1, m ), otherwise LDA must be at
*  least max( 1, k ).
*  Unchanged on exit.

However, given m and k, shouldn't I be able to derive LDA? When is LDA permitted to be bigger than m (or k)?

Answer 1: The distinction is between the logical size of the first dimensions of the …
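A small NumPy sketch (not from the original thread) makes the answer concrete: when the logical matrix is a block inside a larger allocation, LDA is the column stride of the underlying storage, which m and k alone cannot tell you:

import numpy as np

# Column-major 10x10 storage; the logical operand is the top-left 5x5 block.
big = np.asfortranarray(np.arange(100, dtype=np.float64).reshape(10, 10))
sub = big[:5, :5]

# Stepping one column to the right still skips 10 doubles (the parent's
# leading dimension), even though the logical matrix has only 5 rows.
print(sub.strides)  # (8, 80): LDA = 80 // 8 = 10, while m = 5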

How to use numpy with OpenBLAS instead of Atlas in Ubuntu?

Posted by 泪湿孤枕 on 2019-11-28 19:38:24
Question: I have looked for an easy way to install/compile NumPy with OpenBLAS but didn't find an easy answer. All the documentation I have seen takes too much knowledge for granted for someone like me who is not used to compiling software. There are two packages in Ubuntu related to OpenBLAS: libopenblas-base and libopenblas-dev. Once they are installed, what should I do to install NumPy again with them? Thanks! Note that when these OpenBLAS packages are installed, NumPy doesn't work anymore; it can't be imported: ImportError: /usr/lib/liblapack.so.3gf: undefined symbol: ATL_chemv. The problem occurs as …
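Once a rebuilt NumPy imports cleanly, one generic check (not from the original answers) confirms which BLAS it was built against and roughly how it performs:

import numpy as np
import time

np.show_config()   # lists the BLAS/LAPACK libraries NumPy was linked against

# Rough sanity check: under OpenBLAS a large matmul should use all cores
# and finish much faster than under the reference BLAS.
a = np.random.rand(2000, 2000)
t0 = time.time()
np.dot(a, a)
print(time.time() - t0)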

Running Scipy on Heroku

Posted by 為{幸葍}努か on 2019-11-28 18:40:06
Question: I got NumPy and Matplotlib running on Heroku, and I'm trying to install SciPy as well. However, SciPy requires BLAS [1] to install, which is not present on the Heroku platform. After contacting Heroku support, they suggested that I build BLAS as a static library to deploy and set up the necessary environment variables. So, I compiled libblas.a on a 64-bit Linux box and set the following variables as described in [2]:

$ heroku config
BLAS            => .heroku/vendor/lib/libfblas.a
LD_LIBRARY_PATH => .heroku/vendor/lib
LIBRARY_PATH    => .heroku/vendor/lib
PATH            => bin:/usr/local/bin:/usr/bin:/bin
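Not part of the thread, but a quick way to see whether NumPy's build machinery can actually locate a BLAS in those paths (assuming NumPy is importable on the build host):

# numpy.distutils is what SciPy's build uses to find BLAS; an empty dict
# here means the BLAS/LIBRARY_PATH variables were not picked up.
import numpy.distutils.system_info as system_info
print(system_info.get_info('blas'))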

Undefined reference to LAPACK and BLAS subroutines

Posted by 不羁的心 on 2019-11-28 14:26:10
Question: I'm trying to understand how BLAS and LAPACK work in Fortran, so I wrote a program that generates a matrix and inverts it. Here's the code:

program test
  Implicit none
  external ZGETRF
  external ZGETRI
  integer :: M
  complex*16, allocatable, dimension(:,:) :: A
  complex*16, allocatable, dimension(:)   :: WORK
  integer,    allocatable, dimension(:)   :: IPIV
  integer i, j, info, error
  Print*, 'Enter size of the matrix'
  Read*, M
  Print*, 'Enter file of the matrix'
  READ(*,*), A
  OPEN(UNIT=10, FILE='(/A/)', STATUS='OLD', ACTION='READ')
  allocate(A(M,M), WORK(M), IPIV(M), stat=error)
  if (error.ne.0) then
    print *, "error:not enough …
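For comparison (not part of the original question), the same ZGETRF/ZGETRI factor-then-invert sequence can be driven from Python through SciPy's LAPACK wrappers, which sidesteps the linking step entirely; a sketch:

import numpy as np
from scipy.linalg import lapack

a = np.array([[1 + 2j, 3 + 0j],
              [0 + 1j, 2 - 1j]], dtype=np.complex128, order='F')

lu, piv, info = lapack.zgetrf(a)      # LU factorization; info == 0 on success
assert info == 0
inv_a, info = lapack.zgetri(lu, piv)  # build the inverse from the LU factors
assert info == 0
print(np.allclose(np.dot(a, inv_a), np.eye(2)))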

cuBLAS argmin — segfault if outputting to device memory?

Posted by 我的梦境 on 2019-11-28 13:53:52
Question: In cuBLAS, cublasIsamin() gives the argmin for a single-precision array. Here's the full function declaration:

cublasStatus_t cublasIsamin(cublasHandle_t handle, int n,
                            const float *x, int incx, int *result)

The cuBLAS programmer's guide provides this information about the cublasIsamin() parameters: [parameter table not reproduced in this excerpt]. If I use host (CPU) memory for result, then cublasIsamin works properly. Here's an example:

void argmin_experiment_hostOutput() {
    float h_A[4] = {1, 2, 3, 4};
    int N = 4;
    float *d_A = 0;
    CHECK_CUDART(cudaMalloc((void**)&d_A, N * sizeof(d_A[0])));
    CHECK_CUBLAS(cublasSetVector(N, sizeof(h_A[0]), h_A, 1, …
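For reference (not from the original post), scikit-cuda's low-level wrapper exercises the same routine with a host-side result, matching the working case above. A sketch, assuming PyCUDA and scikit-cuda are installed:

import numpy as np
import pycuda.autoinit          # creates a CUDA context on import
import pycuda.gpuarray as gpuarray
import skcuda.cublas as cublas

h = cublas.cublasCreate()
x_gpu = gpuarray.to_gpu(np.array([3, 1, 4, 2], dtype=np.float32))
# The wrapper receives the result in host memory and returns it; check the
# scikit-cuda docs for whether the returned index is 0- or 1-based.
idx = cublas.cublasIsamin(h, x_gpu.size, x_gpu.gpudata, 1)
print(idx)
cublas.cublasDestroy(h)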

performance of NumPy with different BLAS implementations

Posted by ∥☆過路亽.° on 2019-11-28 11:01:28
Question: I'm running an algorithm that is implemented in Python and uses NumPy. The most computationally expensive part of the algorithm involves solving a set of linear systems (i.e. a call to numpy.linalg.solve()). I came up with this small benchmark:

import numpy as np
import time

# Create two large random matrices
a = np.random.randn(5000, 5000)
b = np.random.randn(5000, 5000)

t1 = time.time()
# That's the expensive call:
np.linalg.solve(a, b)
print(time.time() - t1)

I've been running this on: my laptop, a late-2013 MacBook Pro 15" with 4 cores at 2 GHz (sysctl -n machdep.cpu.brand_string gives me …

Will installing BLAS/ATLAS/MKL/OpenBLAS speed up an R package that is written in C/C++?

Posted by 跟風遠走 on 2019-11-28 01:51:08
Question: I found that using one of BLAS/ATLAS/MKL/OpenBLAS gives a speed improvement in R. However, will it also improve an R package that is written in C or C++? For example, the R package glmnet is implemented in FORTRAN and the R package rpart is implemented in C++. Will simply installing BLAS etc. improve their execution time, or do we have to rebuild (compile new C code for) the package against BLAS etc.?

Answer 1: It is frequently stated, including in a comment here, that "you have to recompile …

Calling BLAS / LAPACK directly using the SciPy interface and Cython

Posted by 无人久伴 on 2019-11-28 01:45:41
Question: There was a post on this here: https://gist.github.com/JonathanRaiman/f2ce5331750da7b2d4e9, which shows a great speed improvement just from calling the Fortran libraries directly (BLAS / LAPACK / Intel MKL / OpenBLAS / whatever you installed NumPy with). After many hours of working on this (because of deprecated SciPy libraries) I finally got it to compile without errors. It was 2x faster than NumPy. Unfortunately, as another user pointed out, the Fortran routine always adds the newly computed result to the output matrix, so it only matches NumPy on the 1st run, i.e. A := alpha*x*y.T + A. So that …
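The accumulation described is standard BLAS semantics: GER computes A := alpha*x*y.T + A, and GEMM computes C := alpha*op(A)*op(B) + beta*C. A small sketch (not from the gist) using SciPy's direct BLAS bindings, where omitting c (or passing beta=0) yields a fresh output instead of accumulating into a stale one:

import numpy as np
from scipy.linalg import blas

a = np.asfortranarray(np.random.rand(3, 3))
b = np.asfortranarray(np.random.rand(3, 3))

# Without c, dgemm allocates a fresh output: C := alpha*A*B.
c = blas.dgemm(alpha=1.0, a=a, b=b)
print(np.allclose(c, np.dot(a, b)))

# With c and beta=1.0 the routine accumulates, C := alpha*A*B + C,
# which is exactly why repeated calls drift away from NumPy's result.
c2 = blas.dgemm(alpha=1.0, a=a, b=b, beta=1.0, c=c, overwrite_c=True)
print(np.allclose(c2, 2 * np.dot(a, b)))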