vectorization | 易学教程

Efficiently replace part of value from one column with value from another column in pandas using regex?

Question: I have a pandas dataframe df with dates as strings:

    Date1       Date2
    2017-08-31  1970-01-01 17:35:00
    2017-10-31  1970-01-01 15:00:00
    2017-11-30  1970-01-01 16:30:00
    2017-10-31  1970-01-01 16:00:00
    2017-10-31  1970-01-01 16:12:00

What I want to do is replace each date part in the Date2 column with the corresponding date in Date1 but leave the time untouched, so the output is:

    Date1       Date2
    2017-08-31  2017-08-31 17:35:00
    2017-10-31  2017-10-31 15:00:00
    2017-11-30  2017-11-30 16:30:00
    2017-10-31  2017-10-31 16
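
One possible way to do the replacement described above, sketched under the assumption that every Date2 value has the form "date, single space, time". This sketch uses plain string splitting rather than the regex mentioned in the title, and the three-row frame is illustrative data taken from the question.

```python
import pandas as pd

# Illustrative subset of the question's data (assumption: all values are strings).
df = pd.DataFrame({
    "Date1": ["2017-08-31", "2017-10-31", "2017-11-30"],
    "Date2": ["1970-01-01 17:35:00", "1970-01-01 15:00:00", "1970-01-01 16:30:00"],
})

# Split once on the first space and keep the time part.
time_part = df["Date2"].str.split(" ", n=1).str[1]

# Vectorized string concatenation: Date1 + " " + original time.
df["Date2"] = df["Date1"] + " " + time_part
```

Both steps operate on whole columns, so no Python-level row loop is involved.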

How to check if all elements of a numpy array are in another numpy array

Question: I have two 2D numpy arrays, for example:

    A = numpy.array([[1, 2, 4, 8], [16, 32, 32, 8], [64, 32, 16, 8]])

and

    B = numpy.array([[1, 2], [32, 32]])

I want to get all rows of A in which I can find all elements of any one row of B. Where a row of B contains the same element twice, the row of A must contain it at least twice as well. For my example, I want to achieve this:

    A_filtered = [[1, 2, 4, 8], [16, 32, 32, 8]]

I have control over the values representation so I chose numbers
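
A small reference sketch of the rule stated above (multiset containment per row), useful as a baseline for checking any faster solution. It is deliberately not vectorized; `contains_row` is a hypothetical helper name, not something from the question.

```python
from collections import Counter

import numpy as np

A = np.array([[1, 2, 4, 8], [16, 32, 32, 8], [64, 32, 16, 8]])
B = np.array([[1, 2], [32, 32]])


def contains_row(a_row, b_row):
    """True if a_row holds every element of b_row at least as many times."""
    # Counter subtraction drops non-positive counts, so an empty
    # result means nothing in b_row is missing from a_row.
    missing = Counter(b_row.tolist()) - Counter(a_row.tolist())
    return not missing


# Keep a row of A if it contains at least one complete row of B.
mask = np.array([any(contains_row(a, b) for b in B) for a in A])
A_filtered = A[mask]
```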

MATLAB Efficiently find the row that contains two of three elements in a large matrix

Question: I have a large matrix, let's call it A, which has dimension Mx3, e.g. M=4000 rows x 3 columns. Each row in the matrix contains three numbers, e.g. [241 112 478]. Out of these three numbers, we can construct three pairs, e.g. [241 112], [112 478], [241 478]. Of the other 3999 rows: for each of the three pairs, exactly one row of A will contain the same pair. However, the order of the numbers could be scrambled. For example, exactly one row will read: [333 478 112]. No other row will

Expression Template implementation not being optimized

Question: I'm trying to understand the concept of expression templates in C++, so I've cobbled together pieces of example code to produce a simple vector and the associated expression template infrastructure, supporting only binary operators (+, -, *). Everything compiles, but I've noticed that the performance difference between the standard hand-written loop and the expression template variant is quite large: ET is nearly twice as slow as the hand-written loop. I expected a difference but not that much

Vectorizing the higher dimensions in nested for loop in Matlab

Question: I have a 5D matrix A, and I need to multiply the 3rd-5th dimensions with a vector. For example, see the following sample code:

    A = rand(50,50,10,8,6);
    B = rand(10,1);
    C = rand(8,1);
    D = rand(6,1);
    for i = 1:size(A,3)
        for j = 1:size(A,4)
            for K = 1:size(A,5)
                A(:,:,i,j,K) = A(:,:,i,j,K)*B(i)*C(j)*D(K);
            end
        end
    end

I wonder if there's a better / vectorized / faster way to do this?

Answer 1: Firstly, as a note, these days in Matlab, with JIT compilation, vectorised code is not necessarily faster/better. For big
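
For comparison, the same scaling can be sketched in NumPy with broadcasting. This is a Python analogue, not the MATLAB answer from the excerpt, and the first two dimensions are shrunk to 5 (an assumption made only to keep the check fast).

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((5, 5, 10, 8, 6))  # smaller than the question's 50x50 leading dims
B = rng.random(10)
C = rng.random(8)
D = rng.random(6)

# Reference: the triple loop from the question, translated to 0-based indexing.
expected = A.copy()
for i in range(A.shape[2]):
    for j in range(A.shape[3]):
        for k in range(A.shape[4]):
            expected[:, :, i, j, k] *= B[i] * C[j] * D[k]

# Broadcasting: reshape each vector so it lines up with its own axis.
# B -> (10, 1, 1), C -> (8, 1), D -> (6,), all aligned to the trailing axes.
result = A * B[:, None, None] * C[:, None] * D
```

Whether the broadcast form is actually faster depends on array sizes and memory traffic, so it is worth timing both on real data.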

vectorize numpy mean across the slices of an array

Question: Is there a way to vectorize a function so that the output would be an array of means, where each mean is the mean of the values from index 0 of the input array up to that position? Looping over this is pretty straightforward, but I am trying to be as efficient as possible, e.g. 0 = mean(0), 1 = mean(0-1), N = mean(0-N).

Answer 1: The intended operation could be called cumulative averaging. So, an obvious solution would involve cumulative summation and dividing those summations by the number of elements
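
A minimal sketch of the cumulative-averaging idea described in the answer excerpt: a running sum divided by the running element count. The input values are illustrative, not from the question.

```python
import numpy as np

a = np.array([2.0, 4.0, 6.0, 8.0])  # illustrative input

# Running sum up to each index, divided by how many elements were summed.
counts = np.arange(1, a.size + 1)
cum_mean = np.cumsum(a) / counts
```

Entry i of `cum_mean` equals `a[:i + 1].mean()`, computed without a Python loop.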

Linspace applied on array [duplicate]

Question: Given an array like a = [-1; 0; 1];. For each a(i), I need to compute a linearly spaced vector with linspace(min(a(i),0),max(a(i),0),3);, where each linspace vector should be stored as a row of a matrix:

    A = [-1 -0.5 0;
          0  0   0;
          0  0.5 1];

With a for loop, I can do this like so:

    for i=1:3
        A(i,:) = linspace(min(a(i),0),max(a(i),0),3);
    end

How can I achieve this without using loops?

Answer 1: The
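
As a NumPy-side illustration of the same task (not the MATLAB answer, which is cut off in the excerpt): `np.linspace` accepts array-valued endpoints in NumPy 1.16 and later, so each row can be generated in one call.

```python
import numpy as np

a = np.array([-1.0, 0.0, 1.0])

# Per-element endpoints: from min(a_i, 0) to max(a_i, 0).
start = np.minimum(a, 0.0)
stop = np.maximum(a, 0.0)

# axis=-1 puts the 3 samples along the last axis, giving one row per a_i.
A = np.linspace(start, stop, 3, axis=-1)
```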

Efficiency problem of customizing numpy's vectorized operation

Question: I have a python function given below:

    def myfun(x):
        if x > 0:
            return 0
        else:
            return np.exp(x)

where np is the numpy library. I want to make the function vectorized in numpy, so I use:

    vec_myfun = np.vectorize(myfun)

I did a test to evaluate the efficiency. First I generate a vector of 100 random numbers:

    x = np.random.randn(100)

Then I run the following code to obtain the runtime:

    %timeit np.exp(x)
    %timeit vec_myfun(x)

The runtime for np.exp(x) is 1.07 µs ± 24.9 ns per loop (mean ± std. dev.
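
`np.vectorize` still calls the Python function once per element, which is the usual reason for this kind of gap. A hedged alternative is to express the branch with `np.where`; `myfun_where` is a hypothetical name, and clamping the argument of `np.exp` is an added precaution against overflow warnings, not something from the question.

```python
import numpy as np


def myfun(x):
    # Scalar version from the question.
    if x > 0:
        return 0
    else:
        return np.exp(x)


def myfun_where(x):
    """Array version: 0 where x > 0, exp(x) elsewhere."""
    x = np.asarray(x, dtype=float)
    # np.where evaluates both branches for every element, so clamp the
    # exponent to <= 0 to avoid overflow for large positive inputs.
    return np.where(x > 0, 0.0, np.exp(np.minimum(x, 0.0)))


x = np.array([-1.0, 0.0, 2.0, 800.0])
result = myfun_where(x)
# otypes=[float] keeps the reference output floating point even though
# the scalar function returns an int 0 on one branch.
reference = np.vectorize(myfun, otypes=[float])(x[:3])
```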

create a matrix from array of elements under diagonal in numpy

Question: I would like to create a matrix using a list whose elements would be the elements of the matrix under the diagonal.

    import numpy as np
    x1 = np.array([0.9375, 0.75, 0.4375, 0.0, 0.9375, 0.75, 0.4375, 0.9375, 0.75, 0.9375])
    x1

The matrix I would like to have is

    array([[ 1.    , 0.9375, 0.75  , 0.4375, 0.    ],
           [ 0.9375, 1.    , 0.9375, 0.75  , 0.4375],
           [ 0.75  , 0.9375, 1.    , 0.9375, 0.75  ],
           [ 0.4375, 0.75  , 0.9375, 1.    , 0.9375],
           [ 0.    , 0.4375, 0.75  , 0.9375, 1.    ]])

I thought you could do this with np.tril
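
One way to build the matrix shown above, sketched with `np.triu_indices` rather than the `np.tril` approach the question starts to mention. It assumes the list is ordered row by row over the upper triangle (which matches the desired output) and that the diagonal is all ones.

```python
import numpy as np

x1 = np.array([0.9375, 0.75, 0.4375, 0.0, 0.9375,
               0.75, 0.4375, 0.9375, 0.75, 0.9375])

n = 5  # n * (n - 1) / 2 == len(x1) == 10

m = np.eye(n)                       # ones on the diagonal
rows, cols = np.triu_indices(n, k=1)  # strictly upper-triangular positions
m[rows, cols] = x1                  # fill above the diagonal, row by row
m[cols, rows] = x1                  # mirror the same values below it
```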

User Warning: Your stop_words may be inconsistent with your preprocessing

Question: I am following this document clustering tutorial. As input I give a txt file which can be downloaded here. It is a combination of 3 other txt files, separated by \n. After creating a tf-idf matrix I received this warning: UserWarning: Your stop_words may be inconsistent with your preprocessing. Tokenizing the stop words generated tokens ['abov', 'afterward', 'alon', 'alreadi', 'alway', 'ani', 'anoth', 'anyon', 'anyth', 'anywher', 'becam', 'becaus', 'becom', 'befor', 'besid',