blas

lapack/blas/openblas proper installation from source - replace system libraries with new ones

非 Y 不嫁゛ submitted on 2019-12-22 07:59:57
Question: I wanted to install the BLAS, CBLAS, LAPACK and OpenBLAS libraries from source, using the packages you can download here: openblas and lapack, blas/cblas. First I removed my system blas/cblas and lapack libraries, but unfortunately the atlas library could not be uninstalled (I can either have both blas and lapack or atlas; I can't remove them all). I didn't worry about it and started compiling the downloaded libraries, because I thought that after installation I would be able to remove atlas. Building process …
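After swapping libraries like this, it helps to confirm which BLAS is actually "active" for NumPy. A minimal diagnostic sketch, assuming a Linux system (it reads /proc/self/maps, and the library names it matches are examples, not guaranteed):

    import numpy as np

    # Force NumPy to load its BLAS backend.
    np.dot(np.ones((2, 2)), np.ones((2, 2)))

    # List every BLAS-related shared object mapped into this process.
    with open("/proc/self/maps") as f:
        paths = {line.split()[-1] for line in f if ".so" in line}
    for path in sorted(paths):
        if any(name in path for name in ("blas", "lapack", "atlas")):
            print(path)

If atlas still shows up here, the linker is still resolving BLAS symbols through it despite the new installation.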

Prefetch for Intel Core 2 Duo

China☆狼群 submitted on 2019-12-21 17:51:25
Question: Has anyone had experience using prefetch instructions on the Core 2 Duo processor? I've been using the (standard?) prefetch set (prefetchnta, prefetcht1, etc.) with success on a series of P4 machines, but when running the code on a Core 2 Duo it seems that the prefetcht(i) instructions do nothing and that the prefetchnta instruction is less effective. My criterion for assessing performance is the timing of a BLAS 1 vector-vector (axpy) operation when the vector size is large …
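The benchmark used here is easy to reproduce. A minimal sketch of timing an axpy pass (y ← a·x + y) over large vectors with NumPy; the vector length and repeat count are illustrative choices, not from the question:

    import time
    import numpy as np

    n = 10**7  # large enough that x and y do not fit in cache
    x = np.random.rand(n).astype(np.float32)
    y = np.random.rand(n).astype(np.float32)
    a = np.float32(2.0)

    times = []
    for _ in range(10):
        t0 = time.perf_counter()
        y += a * x  # BLAS 1 axpy: y <- a*x + y
        times.append(time.perf_counter() - t0)

    best = min(times)
    # Rough traffic estimate: read x, read y, write y (ignores the a*x temporary).
    print(f"best: {best * 1e3:.2f} ms, ~{3 * n * 4 / best / 1e9:.1f} GB/s")

At these sizes the operation is memory-bound, which is exactly the regime where prefetch hints matter most.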

Floating point math in python / numpy not reproducible across machines

百般思念 submitted on 2019-12-20 19:44:12
Question: Comparing the results of a floating point computation across a couple of different machines, they consistently produce different results. Here is a stripped-down example that reproduces the behavior:

    import numpy as np
    from numpy.random import randn as rand

    M = 1024
    N = 2048
    np.random.seed(0)
    a = rand(M, N).astype(dtype=np.float32)
    w = rand(N, M).astype(dtype=np.float32)
    b = np.dot(a, w)
    for i in range(10):
        b = b + np.dot(b, a)[:, :1024]
        np.divide(b, 100., out=b)
    print(b[0, :3])

Different …
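A first diagnostic worth trying (an editorial suggestion, not part of the question) is to rule out BLAS multithreading, since a different number of threads changes the order of the floating point reductions inside np.dot. The variables must be set before numpy is imported:

    import os

    # Must be set before `import numpy` so the BLAS reads them at load time.
    os.environ["OMP_NUM_THREADS"] = "1"        # OpenMP-built OpenBLAS / reference BLAS
    os.environ["OPENBLAS_NUM_THREADS"] = "1"   # pthreads-built OpenBLAS
    os.environ["MKL_NUM_THREADS"] = "1"        # in case NumPy links against MKL

    import numpy as np

Even with one thread, machines whose BLAS picks different SIMD kernels can still disagree in the last bits, so this narrows the search rather than ending it.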

R detection of Blas version

我们两清 submitted on 2019-12-20 17:04:52
Question: Is there a way of detecting the version of BLAS that R is using from inside R? I am using Ubuntu, and I have a couple of BLAS versions installed - I just don't know which one is "active" from R's point of view! I am aware of http://r.789695.n4.nabble.com/is-Rs-own-BLAS-td911515.html where Brian Ripley said in June 2006 that it was not possible - but have things changed?

Answer 1: I think you cannot. R will be built against the BLAS interface, and R itself does not know which package supplies the …

Is armadillo solve() thread safe?

非 Y 不嫁゛ submitted on 2019-12-20 15:37:38
Question: In my code I have a loop in which I construct an overdetermined linear system and try to solve it:

    #pragma omp parallel for
    for (int i = 0; i < n[0]+1; i++) {
        for (int j = 0; j < n[1]+1; j++) {
            for (int k = 0; k < n[2]+1; k++) {
                arma::mat A(max_points, 2);
                arma::mat y(max_points, 1);
                // initialize A and y
                arma::vec solution = solve(A, y);
            }
        }
    }

Sometimes, quite randomly, the program hangs or the results in the solution vector are NaN. And if I do this:

    arma::vec solution;
    #pragma omp …

Why can R be linked to a shared BLAS later even if it was built with `--with-blas = lblas`?

别等时光非礼了梦想. submitted on 2019-12-19 08:17:08
Question: The BLAS section in the R Installation and Administration manual says that when R is built from source with the configuration parameter --without-blas, it will build Netlib's reference BLAS into a standalone shared library at R_HOME/lib/libRblas.so, alongside the standard R shared library R_HOME/lib/libR.so. This makes it easier for users to switch between and benchmark different tuned BLAS implementations in the R environment. The guide suggests that a researcher might use a symbolic link to libRblas.so to achieve this, and …

Set max number of threads at runtime on numpy/openblas

不羁岁月 submitted on 2019-12-18 12:29:23
Question: I'd like to know if it's possible to change, at (Python) runtime, the maximum number of threads used by OpenBLAS behind numpy. I know it's possible to set it before starting the interpreter through the environment variable OMP_NUM_THREADS, but I'd like to change it at runtime. Typically, when using MKL instead of OpenBLAS, it is possible:

    import mkl
    mkl.set_num_threads(n)

Answer 1: You can do this by calling the openblas_set_num_threads function using ctypes. I often find myself wanting to do this, …
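A minimal sketch of the answer's ctypes approach. The library name libopenblas.so.0 is an assumption — it varies by distribution, so you may need to point CDLL at the exact file your numpy links against:

    import ctypes

    # openblas_set_num_threads(int) is exported by OpenBLAS itself.
    openblas = ctypes.CDLL("libopenblas.so.0")
    openblas.openblas_set_num_threads.argtypes = [ctypes.c_int]
    openblas.openblas_set_num_threads.restype = None
    openblas.openblas_set_num_threads(4)

    import numpy as np
    a = np.random.rand(2000, 2000)
    _ = a @ a  # this matrix product should now use 4 OpenBLAS threads

Note that this only affects OpenBLAS's own thread pool; if numpy is linked against a different BLAS, the call will simply fail to resolve.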