Broadcasting/Vectorizing inner and outer for loops in python/NumPy

允我心安 submitted on 2021-02-11 06:30:32

Question


Purpose

I have turned a double for loop into a single for loop using vectorization. I would like to now get rid of the last loop.

I want to slice an Nx3 array of coordinates and calculate distances between the sliced portion and the remaining portion without using a for loop.

Two cases

(1) the slice is always 3x3.

(2) the slice is variable i.e., Mx3 where M is always significantly smaller than N

Vectorizing the interaction of one row of the slice with the remainder is straightforward. However, I am stuck using a for loop over the rows of the slice (3 iterations when the slice has 3 rows) to calculate all the distances.
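To make the one-row case concrete, here is a minimal sketch of the shapes involved. The coordinates, the seed, and the names `rij_one` and `rij_rows` are illustrative assumptions, not part of my real code.

```python
import numpy as np

rng = np.random.default_rng(0)      # seed chosen only for reproducibility
box = 10.0
r = rng.random((1000, 3)) * box     # hypothetical atom coordinates

ri = r[:3]    # the 3x3 slice (atoms of one molecule)
rj = r[3:]    # the remaining 997 atoms

# One row against the remainder: shape (3,) broadcasts against (997, 3).
rij_one = ri[0] - rj
print(rij_one.shape)                # (997, 3)

# Covering every row of the slice still needs a Python-level loop here.
rij_rows = [ri[a] - rj for a in range(ri.shape[0])]
print(len(rij_rows), rij_rows[0].shape)   # 3 (997, 3)
```

The loop over `a` is the part I would like to replace with a single array operation.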

Context:

The Nx3 array is atom coordinates, the slice is all atoms in a specific molecule. I want to calculate the energy of a given molecule interacting with the rest of the system. The first step is calculating the distances between each atom in the molecule, with all other atoms. The second part is to use those distances in a function to calculate energy, and that is outside the scope of this question.

Here is what I have for a working minimal example. I have vectorized the inner loop, but I need to (would really like to...) vectorize the outer loop as well. That loop won't always be of size 3, and Python is slow at for loops.

Minimal Working Example

import numpy as np 

box=10 # simulation box is size 10 for this example
r = np.random.rand(1000,3) * box  # avoids huge numbers later by scaling  coords

start = 0  # fixed starting index for example (first atom of the molecule)
end = 3    # exclusive ending index (one past the last atom), so the slice is 3x3

rj = np.delete(r, np.arange(start, end), 0)  # all atoms outside the molecule
ri = r[start:end, :]                         # atoms in the molecule

atoms_in_molecule, coords = np.shape(ri)
energy = 0
for a in range(atoms_in_molecule):
    rij = ri[a,:] - rj                # I want to get rid of this 'a' index dependence
    rij = rij - np.rint(rij/box)*box  # periodic boundary conditions - necessary
    rij_sq = np.sum(rij**2,axis=1)

    # perform energy calculation using rij_sq
    ener = 4 * ((1/rij_sq)**12 - (1/rij_sq)**6)  # dummy LJ, do not optimize
    energy += np.sum(ener)

print(energy)

This question is not about optimizing the vectorizing I already have. I have played around with pdist/cdist and others. All I want is to get rid of the pesky for loop over atoms. I will optimize the rest.


Answer 1:


Here is how you can do it:

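# ri has shape (M, 3) and rj has shape (N-M, 3); the inserted axes make them
# broadcast to (M, N-M, 3), i.e. every molecule atom against every other atom.
R = ri[:, None] - rj[None, :]
R = R - np.rint(R/box)*box            # periodic boundary conditions
R_sq = np.sum(np.square(R), axis=2)   # squared distances, shape (M, N-M)

energy = np.sum(4 * ((1/R_sq)**12 - (1/R_sq)**6))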
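As a sanity check, the sketch below compares the broadcast version against the original loop on a smaller random system. The helper `minimum_image`, the array `loop_sq`, the seed, and the system size of 200 atoms are assumptions made for illustration. It compares squared distances rather than the dummy energy, since the large powers in the energy expression amplify rounding differences.

```python
import numpy as np

rng = np.random.default_rng(0)      # seed chosen only for reproducibility
box = 10.0
r = rng.random((200, 3)) * box      # hypothetical coordinates, N = 200

start, end = 0, 3                   # molecule occupies rows 0..2 (M = 3)
ri = r[start:end]
rj = np.delete(r, np.arange(start, end), axis=0)

def minimum_image(d, box):
    """Wrap displacement vectors into the periodic box (illustrative helper)."""
    return d - np.rint(d / box) * box

# Loop version: one row of the molecule at a time, as in the question.
loop_sq = np.empty((ri.shape[0], rj.shape[0]))
for a in range(ri.shape[0]):
    rij = minimum_image(ri[a] - rj, box)
    loop_sq[a] = np.sum(rij**2, axis=1)

# Broadcast version: (M, 1, 3) - (1, N-M, 3) -> (M, N-M, 3).
R = minimum_image(ri[:, None, :] - rj[None, :, :], box)
R_sq = np.sum(R**2, axis=2)

print(R.shape, R_sq.shape)          # (3, 197, 3) (3, 197)
print(np.allclose(loop_sq, R_sq))   # True
```

Note that the intermediate array `R` holds M x (N-M) x 3 floats, so memory use grows with the size of the slice. With M much smaller than N, as in the question, this is usually not a concern.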


Source: https://stackoverflow.com/questions/60906051/broadcasting-vectorizing-inner-and-outer-for-loops-in-python-numpy