blas

cuBLAS argmin: segfault if outputting to device memory?

谁说胖子不能爱 posted on 2019-12-29 01:40:09
Question: In cuBLAS, cublasIsamin() gives the argmin for a single-precision array. Here's the full function declaration:
    cublasStatus_t cublasIsamin(cublasHandle_t handle, int n, const float *x, int incx, int *result)
The cuBLAS programmer guide provides this information about the cublasIsamin() parameters: If I use host (CPU) memory for result, then cublasIsamin works properly. Here's an example: void argmin_experiment_hostOutput(){ float h_A[4] = {1, 2, 3, 4}; int N = 4; float* d_A = 0; CHECK_CUDART
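As a point of comparison for the host-output case in the question, here is a minimal pure-Python sketch of what an `i?amin`-style routine computes: the 1-based index of the first element with the smallest absolute value. The name `isamin_reference` is hypothetical, and returning 0 for a non-positive `n` or `incx` follows the reference BLAS convention for the `i?amax` family, which is an assumption here rather than something the excerpt states.

```python
def isamin_reference(n, x, incx=1):
    """Return the 1-based index of the first element with the smallest |x[i]|.

    Hypothetical reference helper, not part of cuBLAS.
    """
    # Assumption: mirror the reference BLAS convention of returning 0
    # when there is nothing to scan.
    if n <= 0 or incx <= 0:
        return 0
    best_index = 1
    best_value = abs(x[0])
    for i in range(1, n):
        value = abs(x[i * incx])
        # Strict comparison keeps the first occurrence on ties.
        if value < best_value:
            best_value = value
            best_index = i + 1  # BLAS indices are 1-based
    return best_index


h_A = [1.0, 2.0, 3.0, 4.0]  # same data as the question's host array
print(isamin_reference(4, h_A))
```

In the cuBLAS v2 API, whether `result` is treated as a host or a device pointer is controlled by `cublasSetPointerMode`, and the default mode is host, so that setting is worth checking when the output buffer lives on the GPU.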

R Packages, gcc, and BLAS on Amazon EC2

倖福魔咒の posted on 2019-12-25 03:39:12
Question: I am trying to install RTextTools for R on my Amazon EC2 instance. I'm using R 3.1.1 (installed 2014-07-10) with Amazon's Linux AMI. I open R with root privileges and try the following:
    > install.packages('RTextTools')
    Installing package into ‘/root/R/x86_64-redhat-linux-gnu-library/3.1’
    (as ‘lib’ is unspecified)
    also installing the dependencies ‘slam’, ‘tm’, ‘maxent’
    trying URL 'http://cran.stat.ucla.edu/src/contrib/slam_0.1-32.tar.gz'
    Content type 'application/x-tar' length 46672 bytes (45

Calling BLAS functions

半世苍凉 posted on 2019-12-24 15:42:30
Question: Here is a simple program:
    PROGRAM MAIN
    implicit none
    integer, PARAMETER :: N=10
    real*8 :: A(N)
    real*8 :: x=0.1D0
    integer :: i=1
    Do i=1,N
    A(i)=i
    end do
    call dscal(N,x, A, 1)
    x=dasum(N,A,1)
    END PROGRAM MAIN
I compile with the command
    gfortran test.f90 -o test -O1 -I /usr/include/ -L /usr/lib -lblas
While I have no problem calling the subroutine dscal, I get the following error for the function dasum:
    test.f90:15.2:
    x=dasum(N,A,1)
    1
    Error: Function 'dasum' at (1) has no IMPLICIT type
Should I
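To make the subroutine/function distinction in the question concrete, here is a small pure-Python sketch of what the two BLAS routines compute. `dscal` scales a vector in place and returns nothing, while `dasum` returns a value (the sum of absolute values). Under `implicit none`, an external Fortran function that returns a value needs its result type declared in the caller, which a subroutine call does not. The Python functions below only borrow the BLAS names for illustration and are not bindings to any library.

```python
def dscal(n, alpha, x, incx=1):
    """Scale the first n strided elements of x by alpha, in place."""
    for i in range(n):
        x[i * incx] *= alpha


def dasum(n, x, incx=1):
    """Return the sum of absolute values of the first n strided elements."""
    return sum(abs(x[i * incx]) for i in range(n))


N = 10
A = [float(i) for i in range(1, N + 1)]  # A(i) = i, as in the Fortran loop
dscal(N, 0.1, A)                         # in-place, no return value
total = dasum(N, A)                      # returns a value, like a function
print(total)
```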

When is 'crossprod' preferred to '%*%', and when isn't?

耗尽温柔 posted on 2019-12-24 09:24:03
Question: When exactly is crossprod(X,Y) preferred to t(X) %*% Y when X and Y are both matrices? The documentation says: "Given matrices x and y as arguments, return a matrix cross-product. This is formally equivalent to (but usually slightly faster than) the call t(x) %*% y (crossprod) or x %*% t(y) (tcrossprod)." So when is it not faster? When searching online I found several sources that stated either that crossprod is generally preferred and should be used as default (e.g. here), or that it
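The "formally equivalent" claim quoted above can be illustrated with a pure-Python sketch. The `crossprod` function here is a hypothetical stand-in for the R function, not its implementation: it accumulates the product directly from the columns of `x` without building the transpose first, which is one plausible source of the speed difference, stated here as an assumption.

```python
def transpose(m):
    """Return the transpose of a list-of-rows matrix."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def crossprod(x, y):
    """Compute t(x) %*% y without materializing the transpose of x."""
    rows = len(x)
    return [[sum(x[k][i] * y[k][j] for k in range(rows))
             for j in range(len(y[0]))]
            for i in range(len(x[0]))]


X = [[1, 2], [3, 4], [5, 6]]
Y = [[1, 0], [0, 1], [1, 1]]
print(crossprod(X, Y) == matmul(transpose(X), Y))
```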

matrix multiplication for integral types using BLAS

十年热恋 posted on 2019-12-23 12:56:30
Question: Is there an equivalent of dgemm (from BLAS) for integral types? I only know of dgemm and sgemm for double-precision and single-precision matrices, but would like to have it for matrices of an integral type such as int (or short int...). Note: I'm not looking for a solution that involves converting to float/double, and am looking for a fast library implementation. Also, same question for dgemms (using the Strassen algorithm).
Answer 1: BLAS algorithms don't natively support integer types.
Answer 2: You did
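One possible reason to avoid the float/double detour mentioned in the question is exactness: a double has a 53-bit significand, so large 64-bit integers do not always survive the round trip. The sketch below is a naive pure-Python integer product, far from a fast library implementation, and the name `int_matmul` is hypothetical. It only illustrates the difference in results.

```python
def int_matmul(a, b):
    """Naive exact matrix product on Python integers (O(n^3) triple loop)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def float_matmul_as_int(a, b):
    """Same product, but routed through doubles and converted back to int."""
    return [[int(sum(float(a[i][k]) * float(b[k][j]) for k in range(len(b))))
             for j in range(len(b[0]))]
            for i in range(len(a))]


big = 2**53 + 1  # not exactly representable as a double
A = [[big, 0], [0, 1]]
I2 = [[1, 0], [0, 1]]

exact = int_matmul(A, I2)
via_float = float_matmul_as_int(A, I2)
print(exact[0][0] - via_float[0][0])
```

For small element types such as short int this effect does not arise, so the example is relevant mainly when products or sums can exceed 2**53.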

Theano with Anaconda on Windows: how to setup BLAS?

◇◆丶佛笑我妖孽 posted on 2019-12-23 03:01:30
Question: I've used Anaconda to install Theano (and Keras) on Windows 7 64-bit. Here are my steps:
    1. Install the latest Anaconda for Python 3.5
    2. conda install mingw libpython
    3. pip install Theano
    4. conda install pydot-ng
    5. pip install keras
    6. Edit .keras/keras.json to use "theano" instead of "tensorflow".
    7. Open Jupyter, copy and paste this code: https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py
It executes fine until the call to model.fit: imports, data download, model compilation all work.

R and nvblas.dynlib (on a mac)

南楼画角 posted on 2019-12-22 10:38:23
Question: I have R on my Mac installed via CRAN. I also have openblas installed via Homebrew. I can switch between BLAS implementations as follows:
Reference BLAS (netlib, I think):
    ln -sf /Library/Frameworks/R.framework/Resources/lib/libRblas.0.dylib /Library/Frameworks/R.framework/Resources/lib/libRblas.dylib
vecLib (Apple's BLAS):
    ln -sf /System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Versions/Current/libBLAS.dylib /Library/Frameworks/R.framework/Resources/lib/libRblas

CFFI Not Loading Dependent Libraries?

寵の児 posted on 2019-12-22 10:14:47
Question: I am trying to use the BLAS/LAPACK libraries from SBCL (specifically, trying to get the LLA package running). I was having a lot of trouble getting the BLAS shared library to load; eventually I discovered that it wasn't able to load its dependent libraries. In the end I was able to load BLAS by loading all of its dependencies manually: (setq cffi::*foreign-library-directories* '("C:/cygwin64/bin/" "C:/cygwin64/lib/lapack/")) (CFFI:LOAD-FOREIGN-LIBRARY "CYGWIN1.DLL") (CFFI:LOAD-FOREIGN-LIBRARY

Compile numpy WITHOUT Intel MKL/BLAS/ATLAS/LAPACK

限于喜欢 posted on 2019-12-22 08:48:21
Question: I am using py2exe to convert a script that uses numpy, and I am getting a very large resulting folder. It seems many of the large files come from parts of the numpy package that I'm not using, such as numpy.linalg. To reduce the size of the folder that is created, I have been led to believe I should have numpy compiled without Intel MKL/BLAS/ATLAS/LAPACK. How would I make this change? EDIT: In C:\Python27\Lib\site-packages\numpy\linalg I found the following files: _umath_linalg.pyd (34MB) and