vectorization

Numpy vectorize and atomic vectors

江枫思渺然 submitted on 2019-12-11 09:08:22
Question: I would like to implement a function that works like the numpy.sum function on arrays, as one expects, e.g. np.sum([2,3],1) = [3,4] and np.sum([1,2],[3,4]) = [4,6]. Yet a trivial test implementation already behaves somewhat awkwardly:

    import numpy as np

    def triv(a, b):
        return a, b

    triv_vec = np.vectorize(triv, otypes=[np.int])
    triv_vec([1, 2], [3, 4])

with result array([0, 0]) rather than the desired result array([[1, 3], [2, 4]]). Any ideas what is going on here? Thanks.

Answer 1: You need otypes=[np.int, np…
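A minimal sketch of the fix the answer hints at: np.vectorize needs one entry in otypes per returned value when the wrapped function returns a tuple (plain int is used here, since np.int is deprecated in recent NumPy):

```python
import numpy as np

def triv(a, b):
    return a, b

# One otype per returned value; np.vectorize then returns a tuple of arrays.
triv_vec = np.vectorize(triv, otypes=[int, int])
x, y = triv_vec([1, 2], [3, 4])

# Stack the two outputs to get the desired 2x2 layout.
result = np.column_stack((x, y))  # [[1 3], [2 4]]
```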

How to vectorize searching function and Intersection in Matlab?

╄→гoц情女王★ submitted on 2019-12-11 09:06:17
Question: Here is a MATLAB coding problem (a slightly different version, with setdiff rather than intersect, here): a rating matrix A with 3 columns; the 1st column is the user ID (which may be duplicated), the 2nd column is the item ID (which may be duplicated), and the 3rd column is the rating from user to item, ranging from 1 to 5. Now I have a subset of user IDs smallUserIDList and a subset of item IDs smallItemIDList. I want to find the rows in A rated by users in smallUserIDList, collect the items those users rated, and do some…

Vectorization of min distance in kernel

社会主义新天地 submitted on 2019-12-11 08:49:40
Question: I have an Nx2 array K1 with the locations of N keypoints, and a 3-dimensional WxHx3 array Kart1(width, height, coordinates) that maps coordinates to every pixel of an image. For every keypoint in K1 I want to read the location of the pixel in Kart1, evaluate the coordinates (search for the min/max or calculate the mean) in a 3x3 kernel around it, and assign a value to the current pixel in KPCoor1. My current approach looks like this:

    for ii = 1:length(K1(:,1))  % for every keypoint in K1
        MinDist…

Matlab Function Performance - Too many loops

本小妞迷上赌 submitted on 2019-12-11 07:49:01
Question: I have to write several pieces of information to a txt file, in a LOT of rows. The result is a file like:

    result.txt:
    RED;12;7;0;2;1;4;7;0.0140
    RED;12;7;0;2;2;9;7;0.1484
    RED;12;7;0;2;3;7;4;0.1787
    RED;12;7;0;2;4;2;6;0.7891
    RED;12;7;0;2;5;9;6;0.1160
    RED;12;7;0;2;6;9;1;0.9893
    ...

which is built by the code below (with some reduced dimensions):

    /* the variables 'str1', 'num1', 'day', 'vect1', 'vect2' and 'MD' are inputs of this function
    /* str1 is a string 1x1
    /* num1 is an integer 1x1
    /* day is a vector…

Matlab: Access matrix elements using indices stored in other matrices

℡╲_俬逩灬. submitted on 2019-12-11 06:48:29
Question: I am working in MATLAB. I have five matrices in, out, out_temp, ind_i, ind_j, all of identical dimensions, say n x m. I want to implement the following loop in one line:

    out = zeros(n, m)
    out_temp = zeros(n, m)
    for i = 1:n
        for j = 1:m
            out(ind_i(i,j), ind_j(i,j)) = in(ind_i(i,j), ind_j(i,j));
            out_temp(ind_i(i,j), ind_j(i,j)) = some_scalar_value;
        end
    end

It is assured that the values in ind_i lie in the range 1:n and the values in ind_j lie in the range 1:m. I believe a way to implement line 3 would give…
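For comparison, the same one-shot assignment can be sketched in NumPy (a hypothetical analogue of the MATLAB task: fancy indexing with 0-based index arrays plays the role of MATLAB's sub2ind-style linear indexing; `in` is renamed `inp` because `in` is a Python keyword):

```python
import numpy as np

n, m = 3, 4
rng = np.random.default_rng(1)
inp = rng.integers(0, 10, (n, m))
ind_i = rng.integers(0, n, (n, m))   # 0-based here, 1-based in MATLAB
ind_j = rng.integers(0, m, (n, m))

out = np.zeros((n, m), dtype=inp.dtype)
out_temp = np.zeros((n, m))

# Both assignments of the double loop, done in one vectorized step each.
out[ind_i, ind_j] = inp[ind_i, ind_j]
out_temp[ind_i, ind_j] = 7.0         # stands in for some_scalar_value
```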

How to improve np.random.choice() looping efficiency

我们两清 submitted on 2019-12-11 06:19:47
Question: I am trying to apply np.random.choice to a big array with different weights, and am wondering whether there is any way to avoid looping and improve performance. Here len(weights) could be millions.

    weights = [[0.1, 0.5, 0.4],
               [0.2, 0.4, 0.4],
               ...
               [0.3, 0.3, 0.4]]
    choice = [1, 2, 3]
    ret = np.zeros((len(weights), 20))
    for i in range(len(weights)):
        ret[i] = np.random.choice(choice, 20, p=weights[i])

Answer 1: Here's a generalization of my answer in Fast random weighted selection across all rows of a…
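A loop-free sketch of the cumulative-sum trick this kind of answer typically uses: draw uniform numbers and compare them against each row's cumulative weights, which does the work of np.random.choice for all rows at once:

```python
import numpy as np

rng = np.random.default_rng(0)
weights = np.array([[0.1, 0.5, 0.4],
                    [0.2, 0.4, 0.4],
                    [0.3, 0.3, 0.4]])
choice = np.array([1, 2, 3])
n_draws = 20

# Row-wise cumulative weights: each row ends at 1.0.
cum = weights.cumsum(axis=1)                # shape (rows, 3)
# One uniform draw per (row, sample); broadcast against cum.
r = rng.random((len(weights), n_draws, 1))  # shape (rows, 20, 1)
# Counting how many cumulative weights each draw exceeds gives its index.
idx = (r > cum[:, None, :]).sum(axis=2)     # shape (rows, 20)
ret = choice[idx]                           # sampled values, all rows at once
```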

tfidf vectorizer process shows error

白昼怎懂夜的黑 submitted on 2019-12-11 06:15:00
Question: I am working on non-English corpus analysis and facing several problems. One of them is the tfidf_vectorizer. After importing the relevant libraries, I ran the following code to get results:

    contents = [open("D:\test.txt", encoding='utf8').read()]

    # define vectorizer parameters
    tfidf_vectorizer = TfidfVectorizer(max_df=0.8, max_features=200000,
                                       min_df=0.2, stop_words=stopwords,
                                       use_idf=True, tokenizer=tokenize_and_stem,
                                       ngram_range=(3,3))

    %time tfidf_matrix = tfidf_vectorizer.fit_transform…
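A likely cause (an assumption, since the error text is cut off): the snippet passes a single document, so every term has document frequency 1.0, and max_df=0.8 combined with min_df=0.2 prunes the entire vocabulary, which makes scikit-learn raise. A minimal working sketch with a multi-document corpus and default tokenization in place of the custom tokenize_and_stem:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Several documents, so document-frequency bounds have something to work with.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]
vectorizer = TfidfVectorizer(max_df=0.9, min_df=1, ngram_range=(1, 1))
tfidf_matrix = vectorizer.fit_transform(corpus)
# tfidf_matrix is a sparse (n_documents, n_terms) matrix.
```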

vectorizing “expansion of date range” per row in dplyr of R

拥有回忆 submitted on 2019-12-11 06:05:58
Question: I have a dataset in a tibble in R like the one below:

    # A tibble: 50,045 x 5
      ref_key start_date end_date
      <chr>   <date>     <date>
    1 123     2010-01-08 2010-01-13
    2 123     2010-01-21 2010-01-23
    3 123     2010-03-10 2010-04-14

I need to create another tibble in which each row stores only one date, like the one below:

      ref_key date
      <chr>   <date>
    1 123     2010-01-08
    2 123     2010-01-09
    3 123     2010-01-10
    4 123     2010-01-11
    5 123     2010-01-12
    6 123     2010-01-13
    7 123     2010-01-21
    8 123     2010-01-22
    9 123     2010-01-23

Currently I am writing…
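For reference, the same per-row range expansion can be sketched in pandas (a Python analogue of the dplyr task, not the R solution itself; the column names follow the question):

```python
import pandas as pd

df = pd.DataFrame({
    "ref_key": ["123", "123"],
    "start_date": pd.to_datetime(["2010-01-08", "2010-01-21"]),
    "end_date": pd.to_datetime(["2010-01-13", "2010-01-23"]),
})

# Build the per-row list of dates, then explode to one row per date.
df["date"] = [list(pd.date_range(s, e))
              for s, e in zip(df["start_date"], df["end_date"])]
long_df = df.explode("date")[["ref_key", "date"]].reset_index(drop=True)
```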

How to check if pandas dataframe rows have certain values in various columns, scalability

て烟熏妆下的殇ゞ submitted on 2019-12-11 06:05:50
Question: I have implemented the CN2 classification algorithm; it induces rules to classify the data of the form:

    IF Attribute1 = a AND Attribute4 = b THEN class = class 1

My current implementation loops through a pandas DataFrame containing the training data using the iterrows() function and returns True or False for each row depending on whether it satisfies the rule. However, I am aware this is a highly inefficient solution. I would like to vectorize the code; my current attempt is like so:

    DataFrame = df
    age…
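A minimal sketch of the vectorized check, with made-up column names and a hypothetical rule: boolean masks evaluate the rule against every row at once, replacing the iterrows() loop:

```python
import pandas as pd

df = pd.DataFrame({
    "age": [25, 40, 25, 33],
    "sex": ["m", "f", "f", "m"],
})
rule = {"age": 25, "sex": "f"}  # hypothetical rule conditions

# AND all the column == value conditions together, one mask per condition.
mask = pd.Series(True, index=df.index)
for col, val in rule.items():
    mask &= df[col] == val

# mask now holds True/False per row, without touching iterrows().
print(mask.tolist())  # [False, False, True, False]
```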

Python: Reshape 3D image series to pixel series

Deadly submitted on 2019-12-11 05:50:34
Question: I have a 3D numpy array of images with the shape (imageCount, width, height). My goal is to transform this into a 2D pixel-series array with the shape (pixelPosition, imageCount). Right now this is my solution:

    timeSeries = []
    for h in range(height):
        for w in range(width):
            timeSeries.append(images[:, h, w])

Is there a simpler way with numpy.reshape() or something like this?

Answer 1: Transpose and reshape:

    images.transpose(1,2,0).reshape(height*width, -1)

Source: https://stackoverflow.com/questions…
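The one-liner can be checked against the loop on a small example (assuming an (imageCount, height, width) layout so that images[:, h, w] indexes as in the question); row k of the result holds pixel k's values across all images:

```python
import numpy as np

image_count, height, width = 4, 2, 3
images = np.arange(image_count * height * width).reshape(image_count, height, width)

# Move the image axis last, then flatten the two pixel axes into one.
pixel_series = images.transpose(1, 2, 0).reshape(height * width, -1)

# Same (h, w) ordering as the original loop.
loop_version = np.array([images[:, h, w]
                         for h in range(height)
                         for w in range(width)])
```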