blas

Set max number of threads at runtime on numpy/openblas

回眸只為那壹抹淺笑 submitted on 2019-12-18 12:29:11
Question: Is it possible to change, at (Python) runtime, the maximum number of threads used by OpenBLAS behind numpy? I know it can be set before running the interpreter through the environment variable OMP_NUM_THREADS, but I'd like to change it at runtime. When using MKL instead of OpenBLAS, this is typically possible: import mkl; mkl.set_num_threads(n)
Answer 1: You can do this by calling the openblas_set_num_threads function using ctypes. I often find myself wanting to do this,
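The ctypes approach named in the answer might look like the sketch below. It rests on several assumptions: `set_openblas_threads` is a hypothetical helper name, the library is located through `ctypes.util.find_library`, and the library is assumed to export `openblas_set_num_threads` under exactly that name. NumPy builds that bundle their own OpenBLAS copy may load a different file, possibly with prefixed symbol names, so the path to that specific library would have to be passed as `lib_path`.

```python
import ctypes
import ctypes.util


def set_openblas_threads(n, lib_path=None):
    """Try to cap the OpenBLAS thread count.

    Returns True if the call was made, False if no usable library was found.
    Hypothetical helper: the name and the lookup strategy are assumptions.
    """
    # Fall back to a system-wide lookup when no explicit path is given.
    path = lib_path or ctypes.util.find_library("openblas")
    if path is None:
        return False
    try:
        lib = ctypes.CDLL(path)
        # Raises AttributeError if the symbol is not exported under this name.
        setter = lib.openblas_set_num_threads
    except (OSError, AttributeError):
        return False
    setter.argtypes = [ctypes.c_int]
    setter.restype = None
    setter(int(n))
    return True
```

A call such as set_openblas_threads(4) returns False when no matching library is found, so a True result only shows that the call reached some OpenBLAS copy, not necessarily the one NumPy is using.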

Element-wise vector-vector multiplication in BLAS?

…衆ロ難τιáo~ submitted on 2019-12-18 11:53:08
Question: Is there a way to do element-wise vector-vector multiplication with BLAS, GSL, or any other high-performance library?
Answer 1: (Taking the title of the question literally...) Yes, it can be done with BLAS alone (though it is probably not the most efficient way). The trick is to treat one of the input vectors as a diagonal matrix:
⎡a    ⎤ ⎡x⎤   ⎡ax⎤
⎢  b  ⎥ ⎢y⎥ = ⎢by⎥
⎣    c⎦ ⎣z⎦   ⎣cz⎦
You can then use one of the matrix-vector multiply functions that can take a diagonal matrix as input without padding, e.g.
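The excerpt breaks off before naming a routine. One BLAS routine that fits the description is the symmetric banded matrix-vector multiply (xSBMV) called with bandwidth k = 0, where the band storage is a single row holding the diagonal; whether that is the routine the answer goes on to name is an assumption. The sketch below is a pure-Python model of that routine's arithmetic for upper band storage, not a call into BLAS, and `sbmv_upper` is a hypothetical name.

```python
def sbmv_upper(k, alpha, ab, x, beta=0.0, y=None):
    """Model of y = alpha*A*x + beta*y for a symmetric band matrix A.

    `ab` uses the upper band layout: ab[k + i - j][j] holds A[i][j]
    for max(0, j - k) <= i <= j (0-based indices).
    """
    n = len(x)
    out = [0.0] * n if y is None else [beta * v for v in y]
    for j in range(n):
        for i in range(max(0, j - k), j + 1):
            a_ij = ab[k + i - j][j]
            out[i] += alpha * a_ij * x[j]
            if i != j:
                # Symmetric counterpart A[j][i] == A[i][j].
                out[j] += alpha * a_ij * x[i]
    return out


# With k = 0 the band is just the diagonal, so the product is element-wise.
a = [1.0, 2.0, 3.0]
x = [4.0, 5.0, 6.0]
elementwise = sbmv_upper(0, 1.0, [a], x)
```

With k = 0 the inner loop runs once per column, so each output entry is simply alpha * a[j] * x[j], and no zero padding of the diagonal matrix is stored.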

TensorFlow: Blas GEMM launch failed

房东的猫 submitted on 2019-12-18 11:13:15
Question: When I try to use TensorFlow with Keras on the GPU, I get this error message:
C:\Users\nicol\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\__main__.py:2: UserWarning: Update your `fit_generator` call to the Keras 2 API: `fit_generator(<keras.pre..., 37800, epochs=2, validation_data=<keras.pre..., validation_steps=4200)`
from ipykernel import kernelapp as app
Epoch 1/2
InternalError Traceback (most recent call last)
C:\Users\nicol\Anaconda3\envs\tensorflow\lib\site

Efficient way of computing matrix product AXA'?

為{幸葍}努か submitted on 2019-12-18 07:52:54
Question: I'm currently using the BLAS function DSYMM to compute Y = AX and then DGEMM for YA', but I'm wondering whether there is a more efficient way of computing the matrix product AXA', where A is an arbitrary n×n matrix and X is a symmetric n×n matrix.
Source: https://stackoverflow.com/questions/11139933/efficient-way-of-computing-matrix-product-axa
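The excerpt includes no answer, so the following is only an illustration of the two-step computation the question describes, not a recommended solution. It relies on one fact that follows from the symmetry of X: the result AXA' is itself symmetric, so the second product only needs its upper triangle. The function names are hypothetical, and plain nested lists stand in for BLAS calls.

```python
def matmul(a, b):
    """Plain dense product of two matrices stored as nested lists."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def axat_symmetric(a, x):
    """Compute A X A' for symmetric X, filling only the upper triangle directly."""
    n = len(a)
    y = matmul(a, x)  # step 1: Y = A X (the DSYMM step in the question)
    z = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # (Y A')[i][j] is the dot product of row i of Y with row j of A.
            s = sum(y[i][k] * a[j][k] for k in range(n))
            z[i][j] = s
            z[j][i] = s  # mirror, since A X A' is symmetric
    return z


A = [[1, 2], [3, 4]]
X = [[2, 1], [1, 3]]
Z = axat_symmetric(A, X)
```

Here the second step evaluates n(n+1)/2 dot products instead of n², at the cost of a mirroring pass; whether that translates into a real speedup over DGEMM depends on the BLAS implementation.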

Statically linking against LAPACK

我是研究僧i submitted on 2019-12-17 21:06:06
Question: I'm attempting to release some software and am currently working through a script for the build process. I'm stuck on something I never thought I would be: statically linking LAPACK on x86_64 Linux. During configuration AC_SEARCH_LIB([main],[lapack]) works, but compilation of the LAPACK units does not, for example undefined reference to 'dsyev_' -- no LAPACK/BLAS routine goes unnoticed. I've confirmed I have the libraries installed and even compiled them myself with the appropriate

performance of NumPy with different BLAS implementations

落花浮王杯 submitted on 2019-12-17 19:38:09
Question: I'm running an algorithm that is implemented in Python and uses NumPy. The most computationally expensive part of the algorithm involves solving a set of linear systems (i.e. a call to numpy.linalg.solve()). I came up with this small benchmark:
    import numpy as np
    import time
    # Create two large random matrices
    a = np.random.randn(5000, 5000)
    b = np.random.randn(5000, 5000)
    t1 = time.time()
    # That's the expensive call:
    np.linalg.solve(a, b)
    print(time.time() - t1)
I've been running this on: My

TensorFlow: InternalError: Blas SGEMM launch failed

南笙酒味 submitted on 2019-12-17 15:24:55
Question: When I run sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys}) I get InternalError: Blas SGEMM launch failed. Here is the full error and stack trace:
InternalError Traceback (most recent call last)
<ipython-input-9-a3261a02bdce> in <module>()
      1 batch_xs, batch_ys = mnist.train.next_batch(100)
----> 2 sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.pyc in run(self, fetches, feed_dict, options, run

Calling MATLAB's built-in LAPACK/BLAS routines

邮差的信 submitted on 2019-12-17 11:00:29
Question: I want to learn how to call the built-in LAPACK/BLAS routines in MATLAB. I have experience with MATLAB and mex files, but I have no idea how to call the LAPACK or BLAS libraries. I've found gateway routines on the File Exchange that simplify the calls, since I don't have to write a mex file for each function, such as this one. I need a toy example to learn the basic messaging between MATLAB and these built-in libraries. Any toy example such as matrix multiplication or LU decomposition is

MatLab error: cannot open with static TLS

自闭症网瘾萝莉.ら submitted on 2019-12-17 10:14:41
Question: For the past couple of days, I have constantly received the same error while using MATLAB, which happens at some point with dlopen. I am pretty new to MATLAB, so I don't know what to do, and Google doesn't seem to help either. When I try to compute eigenvectors, I get this:
Error using eig
LAPACK loading error: dlopen: cannot load any more object with static TLS
I also get this when doing a multiplication:
Error using *
BLAS loading error: dlopen: cannot load any more object with