blas | 易学教程

cuBLAS argmin: segfault if outputting to device memory?

Question: In cuBLAS, cublasIsamin() gives the argmin for a single-precision array. Here's the full function declaration:

    cublasStatus_t cublasIsamin(cublasHandle_t handle, int n, const float *x, int incx, int *result)

The cuBLAS programmer guide provides this information about the cublasIsamin() parameters:

If I use host (CPU) memory for result, then cublasIsamin works properly. Here's an example:

    void argmin_experiment_hostOutput(){
        float h_A[4] = {1, 2, 3, 4};
        int N = 4;
        float* d_A = 0;
        CHECK_CUDART
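For reference, the index convention of cublasIsamin() can be sketched on the host in plain Python. This is only an illustration, not cuBLAS code: the name isamin_reference is hypothetical, and the sketch assumes the documented BLAS behavior that the routine reports the 1-based position of the first element with the smallest absolute value (and 0 when n or incx is not positive). It says nothing about where the result pointer must live, which is the actual subject of the question.

```python
def isamin_reference(n, x, incx=1):
    """Host-side reference for the index convention assumed for cublasIsamin.

    Returns the 1-based position (counted over the strided elements) of the
    first element with the smallest absolute value, or 0 when n or incx is
    not positive. Hypothetical helper, for illustration only.
    """
    if n <= 0 or incx <= 0:
        return 0
    best_position = 1
    best_magnitude = abs(x[0])
    for i in range(1, n):
        magnitude = abs(x[i * incx])
        if magnitude < best_magnitude:  # strict comparison keeps the first minimum on ties
            best_magnitude = magnitude
            best_position = i + 1
    return best_position


h_A = [1.0, 2.0, 3.0, 4.0]       # same values as the array in the question
print(isamin_reference(4, h_A))  # prints 1, i.e. the first element, 1-based
```

Note that the comparison is on magnitudes, so for an input such as [-3.0, 0.5, -0.5] the sketch returns 2, not the position of the most negative value.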

R Packages, gcc, and BLAS on Amazon EC2

Question: I am trying to install RTextTools for R on my Amazon EC2 instance. I'm using R 3.1.1 (installed 2014-07-10) with Amazon's Linux AMI. I open R with root privileges and try the following:

    > install.packages('RTextTools')
    Installing package into ‘/root/R/x86_64-redhat-linux-gnu-library/3.1’
    (as ‘lib’ is unspecified)
    also installing the dependencies ‘slam’, ‘tm’, ‘maxent’
    trying URL 'http://cran.stat.ucla.edu/src/contrib/slam_0.1-32.tar.gz'
    Content type 'application/x-tar' length 46672 bytes (45

Calling BLAS functions

Question: Here is a simple program:

    PROGRAM MAIN
    implicit none
    integer, PARAMETER :: N=10
    real*8 :: A(N)
    real*8 :: x=0.1D0
    integer :: i=1
    Do i=1,N
    A(i)=i
    end do
    call dscal(N,x, A, 1)
    x=dasum(N,A,1)
    END PROGRAM MAIN

I compile with the command:

    gfortran test.f90 -o test -O1 -I /usr/include/ -L /usr/lib -lblas

While I have no problem calling the subroutine dscal, I get the following error for the function dasum:

    test.f90:15.2:
    x=dasum(N,A,1)
     1
    Error: Function 'dasum' at (1) has no IMPLICIT type

Should I

When is 'crossprod' preferred to '%*%', and when isn't?

Question: When exactly is crossprod(X,Y) preferred to t(X) %*% Y when X and Y are both matrices? The documentation says:

    Given matrices x and y as arguments, return a matrix cross-product. This is formally equivalent to (but usually slightly faster than) the call t(x) %*% y (crossprod) or x %*% t(y) (tcrossprod).

So when is it not faster? When searching online I found several sources that stated either that crossprod is generally preferred and should be used as default (e.g. here), or that it
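The formal equivalence quoted from the documentation can be checked with a small pure-Python sketch. The helper names below (transpose, matmul, crossprod) are illustrative stand-ins for the R operations, not R's implementation, and the sketch makes no claim about relative speed; it only shows that computing the cross-product directly, without building the transpose first, yields the same matrix as t(X) %*% Y.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def crossprod(x, y=None):
    """Compute t(x) %*% y directly, without materializing t(x).

    As in R, y defaults to x when omitted.
    """
    if y is None:
        y = x
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of rows")
    return [[sum(x[k][i] * y[k][j] for k in range(len(x)))
             for j in range(len(y[0]))]
            for i in range(len(x[0]))]


X = [[1, 2], [3, 4], [5, 6]]
Y = [[1, 0], [0, 1], [2, 2]]
print(crossprod(X, Y) == matmul(transpose(X), Y))  # prints True
```

The direct form skips the intermediate transposed copy, which is one plausible reason the documentation describes it as usually slightly faster; whether that holds in a given case depends on the BLAS in use.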

matrix multiplication for integral types using BLAS

Question: Is there an equivalent of dgemm (from BLAS) for integral types? I only know of dgemm and sgemm for double-precision and single-precision matrices, but would like to have it for matrices of an integral type such as int (or short int...). Note: I'm not looking for a solution that involves converting to float/double, and am looking for a fast library implementation. Also, same question for dgemms (using the Strassen algorithm).

Answer 1: BLAS algorithms don't natively support integer types.

Answer 2: You did
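To make concrete what an integer analogue of dgemm would compute, here is a minimal reference sketch of the update C = alpha*A*B + beta*C on integer matrices. The name igemm_reference is hypothetical and does not correspond to any BLAS routine; this is a plain triple loop for illustrating the operation, not the fast library implementation the question asks for. Python integers do not overflow, unlike a fixed-width C int, so overflow behavior is not modeled.

```python
def igemm_reference(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for integer matrices (lists of rows).

    Hypothetical reference routine mirroring the gemm update rule; it returns
    a new matrix rather than overwriting c in place.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    if len(a[0]) != inner or len(c) != rows or len(c[0]) != cols:
        raise ValueError("matrix dimensions do not match")
    result = []
    for i in range(rows):
        row = []
        for j in range(cols):
            dot = 0
            for k in range(inner):
                dot += a[i][k] * b[k][j]  # exact integer arithmetic
            row.append(alpha * dot + beta * c[i][j])
        result.append(row)
    return result


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[0, 0], [0, 0]]
print(igemm_reference(1, A, B, 0, C))  # prints [[19, 22], [43, 50]]
```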

Theano with Anaconda on Windows: how to setup BLAS?

Question: I've used Anaconda to install Theano (and Keras) on Windows 7 64-bit. Here are my steps:

1. Install the latest Anaconda for Python 3.5
2. conda install mingw libpython
3. pip install Theano
4. conda install pydot-ng
5. pip install keras
6. Edit .keras/keras.json to use "theano" instead of "tensorflow".
7. Open Jupyter, copy and paste this code: https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py

It executes fine until the call to model.fit: imports, data download, and model compilation all work.

R and nvblas.dynlib (on a mac)

Question: I have R on my Mac installed via CRAN. I also have openblas installed via homebrew. I can switch between BLAS implementations as follows:

Reference blas (netlib I think):

    ln -sf /Library/Frameworks/R.framework/Resources/lib/libRblas.0.dylib /Library/Frameworks/R.framework/Resources/lib/libRblas.dylib

vecLib (Apple's BLAS):

    ln -sf /System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Versions/Current/libBLAS.dylib /Library/Frameworks/R.framework/Resources/lib/libRblas

CFFI Not Loading Dependent Libraries?

Question: I am trying to use the BLAS/LAPACK libraries from SBCL (specifically trying to get the LLA package running). I was having a lot of trouble getting the BLAS shared library to load; eventually I discovered that it wasn't able to load its dependent libraries. In the end I was able to load BLAS by loading all of its dependencies manually:

    (setq cffi::*foreign-library-directories* '("C:/cygwin64/bin/" "C:/cygwin64/lib/lapack/"))
    (CFFI:LOAD-FOREIGN-LIBRARY "CYGWIN1.DLL")
    (CFFI:LOAD-FOREIGN-LIBRARY

Compile numpy WITHOUT Intel MKL/BLAS/ATLAS/LAPACK

Question: I am using py2exe to convert a script which uses numpy, and I am getting a very large resulting folder. Many of the large files seem to come from parts of the numpy package that I'm not using, such as numpy.linalg. To reduce the size of the folder that is created, I have been led to believe I should have numpy compiled without Intel MKL/BLAS/ATLAS/LAPACK. How would I make this change?

EDIT: In C:\Python27\Lib\site-packages\numpy\linalg I found the following files: _umath_linalg.pyd (34MB) and