blas

Why is Fortran slow in the Julia benchmark “rand_mat_mul”?

半城伤御伤魂 submitted on 2019-12-04 15:55:02
Benchmark test results on the home page of Julia ( http://julialang.org/ ) show that Fortran is about 4x slower than Julia/Numpy in the "rand_mat_mul" benchmark. I cannot understand why Fortran is slower when it is calling the same Fortran library (BLAS). I have also performed a simple matrix-multiplication test involving Fortran, Julia and Numpy and got similar results:

Julia

    n = 1000; A = rand(n,n); B = rand(n,n);
    @time C = A*B;
    >> elapsed time: 0.069577896 seconds (7 MB allocated)

Numpy in IPython

    from numpy import *
    n = 1000; A = random.rand(n,n); B = random.rand(n,n);
    %time
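For a like-for-like comparison, the same multiplication can also be timed by calling BLAS dgemm directly, taking the Julia/Numpy layer out of the picture. The sketch below is not from the original post; it assumes a CBLAS header and library are installed (link flags vary by distribution), and it uses clock(), which reports summed CPU time rather than wall time when the BLAS library is threaded.

    /* Hedged sketch: time C = A*B for 1000x1000 random matrices via cblas_dgemm. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <cblas.h>

    int main(void) {
        const int n = 1000;
        double *A = malloc(n * n * sizeof(double));
        double *B = malloc(n * n * sizeof(double));
        double *C = malloc(n * n * sizeof(double));
        for (int i = 0; i < n * n; i++) {
            A[i] = rand() / (double)RAND_MAX;
            B[i] = rand() / (double)RAND_MAX;
        }

        clock_t start = clock();
        /* C = 1.0*A*B + 0.0*C, row-major, no transposes */
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    n, n, n, 1.0, A, n, B, n, 0.0, C, n);
        printf("dgemm CPU time: %f s\n",
               (double)(clock() - start) / CLOCKS_PER_SEC);

        free(A); free(B); free(C);
        return 0;
    }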

Numpy SVD appears to parallelize on Mac OSX, but not on my Ubuntu virtual machine

馋奶兔 submitted on 2019-12-04 15:14:24
I want to run the following script:

    # python imports
    import time
    # 3rd party imports
    import numpy as np
    import pandas as pd

    def pd_svd(pd_dataframe):
        np_dataframe = pd_dataframe.values
        return np.linalg.svd(pd_dataframe)

    if __name__ == '__main__':
        li_times = []
        for i in range(1, 3):
            start = time.time()
            pd_dataframe = pd.DataFrame(np.random.random((3000, 252 * i)))
            pd_svd(pd_dataframe)
            li_times.append(str(time.time() - start))
        print li_times

I try it on my MacBook Air 2011 with OS X 10.9.4 and on a 16-core cloud VM running Ubuntu 12.04. For some reason, this takes approximately 4 seconds on my

Cross-Compiling Armadillo Linear Algebra Library

不羁的心 submitted on 2019-12-04 13:12:11
Question: I enjoy using the Armadillo Linear Algebra Library. It becomes extremely nice when porting Octave .m files over to C++, especially when you have to use the eigen methods. However, I ran into issues when I had to take my program from my native vanilla G++ and dump it onto my ARM processor. Since I spent a few hours muddling my way through it, I wanted to share so others might avoid some frustration. If anyone could add anything else, I would love it. This was the process I used to tackle this

How to perform Vector-Matrix Multiplication with BLAS?

眉间皱痕 submitted on 2019-12-04 12:21:17
Question: BLAS defines the GEMV (matrix-vector multiplication) level-2 operation. How can I use a BLAS library to perform vector-matrix multiplication? It's probably obvious, but I don't see how to use a BLAS operation for this multiplication. I would have expected a GEVM operation.

Answer 1: The matrix-vector multiplication of an (M x N) matrix with an (N x 1) vector results in an (M x 1) vector. In short: alpha*A(MxN)*X(Nx1) + beta*Y(Mx1) -> Y(Mx1). Of course you can use INCX and INCY when your vector is included in
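The usual trick is that a vector-matrix product x^T * A is just the transpose of A^T * x, so plain GEMV with the transpose flag covers it. The following is a minimal sketch of that idea (not part of the original answer), assuming a standard CBLAS installation and linking with something like -lcblas:

    #include <stdio.h>
    #include <cblas.h>

    int main(void) {
        /* A is a 2 x 3 matrix, stored row-major */
        double A[2 * 3] = {1, 2, 3,
                           4, 5, 6};
        double x[2] = {1, 1};    /* length M = 2 */
        double y[3] = {0, 0, 0}; /* length N = 3, receives x^T * A */

        /* y := 1.0 * A^T * x + 0.0 * y, i.e. the row vector x^T * A */
        cblas_dgemv(CblasRowMajor, CblasTrans, 2, 3,
                    1.0, A, 3, x, 1, 0.0, y, 1);

        printf("%f %f %f\n", y[0], y[1], y[2]); /* expected: 5 7 9 */
        return 0;
    }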

Matrix-vector product with dgemm/dgemv

匆匆过客 submitted on 2019-12-04 11:19:15
Using LAPACK with C++ is giving me a small headache. I found the functions defined for Fortran a bit eccentric, so I tried to write a few functions in C++ to make it easier for me to read what's going on. Anyway, I'm not getting the matrix-vector product working as I wish. Here is a small sample of the program.

smallmatlib.cpp:

    #include <cstdio>
    #include <stdlib.h>

    extern "C" {
        // product C = alpha*A.B + beta*C
        void dgemm_(char* TRANSA, char* TRANSB, const int* M, const int* N, const int* K,
                    double* alpha, double* A, const int* LDA, double* B, const int* LDB,
                    double* beta, double* C, const int* LDC)
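For the matrix-vector case specifically, dgemv_ is the Fortran BLAS routine to declare and call. Below is a minimal sketch (not from the original question): it assumes the usual Fortran BLAS conventions (column-major storage, every argument passed by pointer) and a link flag such as -lblas; some compilers also append hidden string-length arguments, which are generally safe to omit for single-character flags.

    #include <stdio.h>

    /* Fortran BLAS prototype (wrap in extern "C" when compiled as C++) */
    void dgemv_(const char* TRANS, const int* M, const int* N,
                const double* alpha, const double* A, const int* LDA,
                const double* X, const int* INCX,
                const double* beta, double* Y, const int* INCY);

    int main(void) {
        int m = 2, n = 3, inc = 1;
        double alpha = 1.0, beta = 0.0;
        /* 2 x 3 matrix stored column-major: columns are (1,4), (2,5), (3,6) */
        double A[6] = {1, 4, 2, 5, 3, 6};
        double x[3] = {1, 1, 1};
        double y[2] = {0, 0};

        /* y := 1.0*A*x + 0.0*y */
        dgemv_("N", &m, &n, &alpha, A, &m, x, &inc, &beta, y, &inc);

        printf("%f %f\n", y[0], y[1]); /* expected: 6 15 */
        return 0;
    }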

Prefetch for Intel Core 2 Duo

本秂侑毒 submitted on 2019-12-04 10:00:47
Has anyone had experience using prefetch instructions on the Core 2 Duo processor? I've been using the (standard?) prefetch set (prefetchnta, prefetcht1, etc.) with success on a series of P4 machines, but when running the code on a Core 2 Duo it seems that the prefetcht(i) instructions do nothing and that the prefetchnta instruction is less effective. My criterion for assessing performance is the timing of a BLAS 1 vector-vector (axpy) operation when the vector size is large enough for out-of-cache behaviour. Has Intel introduced new prefetch instructions? From an Intel
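For reference, the kind of loop being benchmarked looks roughly like the sketch below (not from the original post). It issues software prefetches through the SSE intrinsics that compile to prefetchnta/prefetcht0, so the hints can be swapped to compare their effect; the prefetch distance is an arbitrary assumption that would need tuning per machine.

    #include <stddef.h>
    #include <xmmintrin.h>  /* _mm_prefetch, _MM_HINT_NTA / _MM_HINT_T0 / ... */

    /* y := a*x + y with explicit software prefetch (daxpy-style loop) */
    void daxpy_prefetch(size_t n, double a, const double *x, double *y) {
        const size_t dist = 64;  /* elements ahead to prefetch (assumption) */
        for (size_t i = 0; i < n; i++) {
            if (i + dist < n) {
                _mm_prefetch((const char *)&x[i + dist], _MM_HINT_NTA); /* streamed read */
                _mm_prefetch((const char *)&y[i + dist], _MM_HINT_T0);  /* read-modify-write */
            }
            y[i] = a * x[i] + y[i];
        }
    }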

“undefined reference to 'cblas_ddot'” when using cblas library

China☆狼群 submitted on 2019-12-04 09:33:58
I was testing cblas_ddot, and the code I used is from the link; I fixed it as:

    #include <stdio.h>
    #include <stdlib.h>
    #include <cblas.h>

    int main() {
        double m[10], n[10];
        int i;
        double result;  /* cblas_ddot returns a double */
        printf("Enter the elements into first vector.\n");
        for (i = 0; i < 10; i++)
            scanf("%lf", &m[i]);
        printf("Enter the elements into second vector.\n");
        for (i = 0; i < 10; i++)
            scanf("%lf", &n[i]);
        result = cblas_ddot(10, m, 1, n, 1);
        printf("The result is %f\n", result);
        return 0;
    }

Then when I compiled it, it turned out to be:

    /tmp/ccJIpqKH.o: In function `main':
    test.c:(.text+0xbc): undefined reference to `cblas_ddot'
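This linker message usually just means the CBLAS library was never handed to the linker, or appears before the object file that needs it. A hedged example of a compile line, assuming an ATLAS-provided CBLAS; the actual library names and paths depend on which BLAS implementation is installed:

    gcc test.c -o test -lcblas
    # or, for an ATLAS layout with separate libraries:
    gcc test.c -o test -L/usr/lib64/atlas -lcblas -latlas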

Any good documentation for the cblas interface? [closed]

筅森魡賤 submitted on 2019-12-04 09:13:56
Question (closed as off-topic for Stack Overflow; not accepting answers): Can someone recommend a good reference or tutorial for the cblas interface? Nothing comes up on Google; all of the man pages I've found are for the Fortran BLAS interface, and the PDF that came with MKL literally took ten seconds to search and wasn't helpful. In particular, I'm curious why there is an extra

'Symbol lookup error' with netlib-java

有些话、适合烂在心里 submitted on 2019-12-04 08:37:52
Background & Problem

I am having a bit of trouble running the examples in Spark's MLLib on a machine running Fedora 23. I have built Spark 1.6.2 with the following options per Spark documentation:

    build/mvn -Pnetlib-lgpl -Pyarn -Phadoop-2.4 \
      -Dhadoop.version=2.4.0 -DskipTests clean package

and upon running the binary classification example:

    bin/spark-submit --class org.apache.spark.examples.mllib.BinaryClassification \
      examples/target/scala-*/spark-examples-*.jar \
      --algorithm LR --regType L2 --regParam 1.0 \
      data/mllib/sample_binary_classification_data.txt

I receive the following error:

    /usr

Correct way to point to ATLAS/BLAS/LAPACK libraries for numpy build?

送分小仙女□ submitted on 2019-12-04 06:28:02
I'm building numpy from source on CentOS 6.5 with no root access (python -V = 2.7.6). I have the latest numpy source from git. I cannot for the life of me get numpy to acknowledge the atlas libs. I have:

    ls -1 /usr/lib64/atlas
    libatlas.so.3
    libatlas.so.3.0
    libcblas.so.3
    libcblas.so.3.0
    libclapack.so.3
    libclapack.so.3.0
    libf77blas.so.3
    libf77blas.so.3.0
    liblapack.so.3
    liblapack.so.3.0
    libptcblas.so.3
    libptcblas.so.3.0
    libptf77blas.so.3
    libptf77blas.so.3.0

I don't know anything about how these libs came about, but I can only assume that the atlas builds would be faster than any standard BLAS/LAPACK
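One place numpy's build looks for this information is a site.cfg file next to setup.py. A hedged sketch follows (not from the original question): the section and key names follow numpy's site.cfg.example as I recall it, and the include path is an assumption. Note also that the listing above contains only versioned .so.3 files; linking normally needs the unversioned libatlas.so, libcblas.so, etc. names that -devel packages or symlinks provide, so without root access a local directory of symlinks may be required.

    # site.cfg (hypothetical), placed in the numpy source root next to setup.py
    [atlas]
    library_dirs = /usr/lib64/atlas
    include_dirs = /usr/include/atlas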