vectorization | 易学教程

Create a matrix from a vector where each row is a shifted version of the vector

Question: I have a NumPy array like this:

    import numpy as np
    ar = np.array([1, 2, 3, 4])

and I want to create an array that looks like this:

    array([[4, 1, 2, 3],
           [3, 4, 1, 2],
           [2, 3, 4, 1],
           [1, 2, 3, 4]])

Each row is ar shifted by the row index + 1. A straightforward implementation could look like this:

    ar_roll = np.tile(ar, ar.shape[0]).reshape(ar.shape[0], ar.shape[0])
    for indi, ri in enumerate(ar_roll):
        ar_roll[indi, :] = np.roll(ri, indi + 1)

which gives me the desired
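
The loop in this question can probably be replaced by index arithmetic. The sketch below is one possible approach, not necessarily the accepted answer: it relies on the identity that `np.roll(ar, k)[j]` equals `ar[(j - k) % n]`, builds the whole index matrix by broadcasting, and uses it for fancy indexing. The function name `shifted_rows` is made up for illustration.

```python
import numpy as np

def shifted_rows(ar):
    """Return a matrix whose row i is `ar` rolled right by i + 1 positions."""
    n = ar.shape[0]
    shifts = np.arange(1, n + 1)[:, None]  # column vector: shift used for each row
    cols = np.arange(n)[None, :]           # row vector: target column positions
    idx = (cols - shifts) % n              # source index for every (row, column) pair
    return ar[idx]                         # fancy indexing builds the result in one step

ar = np.array([1, 2, 3, 4])
print(shifted_rows(ar))
```

For the sample input this reproduces the matrix shown in the question. It allocates an n-by-n index array, so memory grows quadratically with the length of `ar`.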

Fast vectorized conversion from RGB to BGRA

Question: In a follow-up to some previous questions on converting RGB to RGBA and ARGB to BGR, I would like to speed up an RGB to BGRA conversion with SSE. Assume a 32-bit machine; I would like to use intrinsics. I'm having difficulty aligning both source and destination buffers to work with 128-bit registers, and I am looking for other savvy vectorization solutions. The routine to be vectorized is as follows...

    void RGB8ToBGRX8(int w, const void *in, void *out)
    {
        int i;
        int width = w;
        const unsigned char

Fast way of getting index of match in list

Question: Given a list a containing vectors of unequal length and a vector b containing some elements from the vectors in a, I want to get a vector of the same length as b containing the index in a where each element of b matches (this is a bad explanation, I know)... The following code does the job:

    a <- list(1:3, 4:5, 6:9)
    b <- c(2, 3, 5, 8)
    sapply(b, function(x, list) which(unlist(lapply(list, function(y, z) z %in% y, z=x))), list=a)
    [1] 1 1 2 3

Replacing the sapply with a for loop achieves the same of
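
The question is about R, but the underlying lookup can be illustrated in Python. The sketch below swaps the nested apply calls for a dictionary that maps every element to the position of the vector containing it, so each element of b is resolved with a single lookup. It assumes elements are hashable and that an element appears in at most one vector (the first occurrence wins otherwise); `index_of_match` is a hypothetical name.

```python
def index_of_match(a, b):
    """For each item in b, return the 1-based position of the vector in a that holds it."""
    lookup = {}
    for pos, vec in enumerate(a, start=1):  # 1-based to mirror the R output
        for item in vec:
            lookup.setdefault(item, pos)    # keep the first vector that contains the item
    return [lookup.get(item) for item in b] # None for items not found anywhere

a = [range(1, 4), range(4, 6), range(6, 10)]  # counterparts of 1:3, 4:5, 6:9
b = [2, 3, 5, 8]
print(index_of_match(a, b))  # [1, 1, 2, 3]
```

Building the dictionary costs one pass over all elements of a, after which the work per element of b no longer depends on the size of a.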

How to benchmark Matlab processes?

Question: While searching for a way to avoid loops in my Matlab code, I found the following comments under a question on SE:

    The statement "for loops are slow in Matlab" is no longer generally true since Matlab...euhm, R2008a?

and

    Have you tried to benchmark a for loop vs what you already have? sometimes it is faster than vectorized code...

So I would like to ask: is there a commonly used way to test the speed of a process in Matlab? Can the user see somewhere how much time the process takes or the only

Geographical distance by group - Applying a function on each pair of rows

Question: I want to calculate the average geographical distance between a number of houses per province. Suppose I have the following data.

    df1 <- data.frame(province = c(1, 1, 1, 2, 2, 2),
                      house = c(1, 2, 3, 4, 5, 6),
                      lat = c(-76.6, -76.5, -76.4, -75.4, -80.9, -85.7),
                      lon = c(39.2, 39.1, 39.3, 60.8, 53.3, 40.2))

Using the geosphere library I can find the distance between two houses. For instance:

    library(geosphere)
    distm(c(df1$lon[1], df1$lat[1]), c(df1$lon[2], df1$lat[2]), fun = distHaversine)
    #11429
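
As an illustration of the per-group pairwise computation, here is a NumPy sketch rather than an R answer. It computes the full haversine distance matrix for one group by broadcasting, then averages the entries above the diagonal so each pair is counted once. The Earth radius of 6378137 m is an assumption chosen to be comparable with the geosphere call in the question, and the function names are hypothetical.

```python
import numpy as np

EARTH_RADIUS_M = 6378137.0  # assumed radius in metres

def haversine_matrix(lat, lon, radius=EARTH_RADIUS_M):
    """Pairwise great-circle distances (metres) between all points, via broadcasting."""
    lat = np.radians(np.asarray(lat, dtype=float))
    lon = np.radians(np.asarray(lon, dtype=float))
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat[:, None]) * np.cos(lat[None, :]) * np.sin(dlon / 2) ** 2)
    return 2 * radius * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def mean_pairwise_distance(lat, lon):
    """Average distance over all unordered pairs of points."""
    d = haversine_matrix(lat, lon)
    upper = np.triu_indices(len(d), k=1)  # each pair once, diagonal excluded
    return d[upper].mean()

province = np.array([1, 1, 1, 2, 2, 2])
lat = np.array([-76.6, -76.5, -76.4, -75.4, -80.9, -85.7])
lon = np.array([39.2, 39.1, 39.3, 60.8, 53.3, 40.2])

# One mean per province; the loop runs over groups, not over pairs of rows.
means = {int(p): mean_pairwise_distance(lat[province == p], lon[province == p])
         for p in np.unique(province)}
print(means)
```

The distance matrix is quadratic in the group size, which is fine for a handful of houses per province but may need chunking for very large groups.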

Vectorized implementation to create multiple rows from a single row in pandas dataframe

Question: For each row in the input table, I need to generate multiple rows by splitting the date range by month (please refer to the sample output below). There is a simple iterative approach that converts row by row, but it is very slow on large dataframes. Could anyone suggest a vectorized approach, such as using apply(), map(), etc., to achieve this? The output table is a new table. Input:

    ID, START_DATE, END_DATE
    1, 2010-12-08, 2011-03-01
    2, 2010-12-10, 2011-01-12
    3, 2010-12-16, 2011

Removing rows with duplicates in a NumPy array

Question: I have an (N,3) array of numpy values:

    >>> vals = numpy.array([[1,2,3],[4,5,6],[7,8,7],[0,4,5],[2,2,1],[0,0,0],[5,4,3]])
    >>> vals
    array([[1, 2, 3],
           [4, 5, 6],
           [7, 8, 7],
           [0, 4, 5],
           [2, 2, 1],
           [0, 0, 0],
           [5, 4, 3]])

I'd like to remove rows from the array that contain a duplicate value. For example, the result for the above array should be:

    >>> duplicates_removed
    array([[1, 2, 3],
           [4, 5, 6],
           [0, 4, 5],
           [5, 4, 3]])

I'm not sure how to do this efficiently with numpy without looping (the array could
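
One loop-free way to approach this, offered as a sketch and not as the answer given on the original page: sort each row so that equal values become neighbours, compare adjacent columns, and keep only the rows where no neighbouring pair is equal. The name `drop_rows_with_repeats` is hypothetical.

```python
import numpy as np

def drop_rows_with_repeats(vals):
    """Keep only the rows of a 2-D array whose values are all distinct."""
    s = np.sort(vals, axis=1)                         # equal values end up side by side
    has_repeat = (s[:, 1:] == s[:, :-1]).any(axis=1)  # any equal neighbours in the row?
    return vals[~has_repeat]                          # boolean mask keeps original row order

vals = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 7], [0, 4, 5],
                 [2, 2, 1], [0, 0, 0], [5, 4, 3]])
print(drop_rows_with_repeats(vals))
```

The sort is applied to a copy, so the surviving rows keep their original column order, as in the expected output of the question. The same code works for any number of columns.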

Efficient colon operator for multiple start and end points

Question: Suppose I have the following two variables:

    start_idx = [1 4 7];
    end_idx = [2 6 15];

I want to efficiently (no for loop if possible) generate a single row vector by applying the colon operator between corresponding elements of start_idx and end_idx. For this example, this would result in:

    result = [1:2 4:6 7:15];

Therefore:

    result = [1 2 4 5 6 7 8 9 10 11 12 13 14 15];

The method should be usable inside Simulink's MATLAB Function block. Thank you very much!

Answer 1: Here
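
The answer text is cut off in this listing. As a language-neutral illustration of the idea, the following NumPy sketch (not MATLAB, and not addressing the Simulink constraint) concatenates several inclusive ranges without a loop: it repeats each start value once per element of its range and adds each element's offset within that range. It assumes every end is greater than or equal to its start; `multi_colon` is a made-up name.

```python
import numpy as np

def multi_colon(start_idx, end_idx):
    """Concatenate the inclusive ranges start_idx[i]..end_idx[i] without a Python loop."""
    start_idx = np.asarray(start_idx)
    end_idx = np.asarray(end_idx)
    lengths = end_idx - start_idx + 1             # number of elements in each range
    offsets = np.cumsum(lengths) - lengths        # output position where each range begins
    within = np.arange(lengths.sum()) - np.repeat(offsets, lengths)  # 0, 1, 2, ... per range
    return np.repeat(start_idx, lengths) + within

print(multi_colon([1, 4, 7], [2, 6, 15]))
```

For the values in the question this yields the fourteen numbers listed there. The same cumulative-sum idea carries over to MATLAB, although the exact functions available inside a MATLAB Function block would need checking.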

Coding practice in R : what are the advantages and disadvantages of different styles?

Question: The recent questions regarding the use of require versus :: raised the question of which programming styles are used when programming in R, and what their advantages and disadvantages are. Browsing through source code or on the net, you see a lot of different styles. The main trends in my code:

heavy vectorization: I play a lot with the indices (and nested indices), which sometimes results in rather obscure code but is generally a lot faster than other solutions. eg: x[x

Pandas: reshaping data

Question: I have a pandas Series which presently looks like this:

    14        [Yellow, Pizza, Restaurants]
    ...
    160920    [Automotive, Auto Parts & Supplies]
    160921    [Lighting Fixtures & Equipment, Home Services]
    160922    [Food, Pizza, Candy Stores]
    160923    [Hair Removal, Nail Salons, Beauty & Spas]
    160924    [Hair Removal, Nail Salons, Beauty & Spas]

And I want to radically reshape it into a dataframe that looks something like this...

            Yellow  Automotive  Pizza
    14           1           0      1
    …
    160920       0           1      0
    160921       0           0      0
    160922       0           0      1
    160923       0           0      0
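
A possible way to get such an indicator table, sketched under the assumption that each Series value is a Python list of labels: explode the lists into one label per row, one-hot encode the labels, and collapse back to one row per original index by taking the maximum. The helper name `indicator_frame` and the three-row sample are illustrative only.

```python
import pandas as pd

def indicator_frame(s):
    """Turn a Series of label lists into a 0/1 DataFrame with one column per label."""
    exploded = s.explode()                   # one row per (index, label) pair
    dummies = pd.get_dummies(exploded)       # one-hot column for each distinct label
    return dummies.groupby(level=0).max().astype(int)  # merge rows sharing an index

s = pd.Series({
    14: ["Yellow", "Pizza", "Restaurants"],
    160920: ["Automotive", "Auto Parts & Supplies"],
    160922: ["Food", "Pizza", "Candy Stores"],
})
out = indicator_frame(s)
print(out[["Yellow", "Automotive", "Pizza"]])
```

Columns come out in sorted label order, so selecting a subset, as in the last line, is how a specific layout like the one in the question would be obtained. With many distinct labels the result is wide and mostly zeros, so a sparse representation may be worth considering.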