Question
I have written a fatigue analysis program with a GUI. The program takes strain information for unit loads for each element of a finite element model, reads in a load case using np.genfromtxt('loadcasefilename.txt') and then does some fatigue analysis and saves the result for each element in another array.
The load cases are about 32 MB each as text files, and there are 40 or so which get read and analysed in a loop. The loads for each element are interpolated by taking slices of the load case array.
The GUI and fatigue analysis run in separate threads. When you click 'Start' on the fatigue analysis it starts the loop over the load cases in the fatigue analysis.
This brings me onto my problem. If I have a lot of elements, the analysis will not finish. How early it quits depends on how many elements there are, which makes me think it might be a memory problem. I've tried fixing this by deleting the load case array at the end of each loop (after deleting all the arrays which are slices of it) and running gc.collect() but this has not had any success.
In MatLab, I'd use the 'pack' function to write the workspace to disk, clear it, and then reload it at the end of each loop. I know this isn't good practice but it would get the job done! Can I do the equivalent in Python somehow?
Code below:
for LoadCaseNo in range(len(LoadCases[0]['LoadCaseLoops'])):#range(1):#xxx
    #Get load case data
    self.statustext.emit('Opening current load case file...')
    LoadCaseFilePath=LoadCases[0]['LoadCasePaths'][LoadCaseNo][0]
    #TK: load case paths may be different
    try:
        with open(LoadCaseFilePath):
            pass
    except Exception as e:
        self.statustext.emit(str(e))
    LoadCaseLoops=LoadCases[0]['LoadCaseLoops'][LoadCaseNo,0]
    LoadCase=np.genfromtxt(LoadCaseFilePath,delimiter=',')
    LoadCaseArray=np.array(LoadCases[0]['LoadCaseLoops'])
    LoadCaseArray=LoadCaseArray/np.sum(LoadCaseArray,axis=0)
    #Loop through sections
    for SectionNo in range(len(Sections)):#range(100):#xxx
        SectionCount=len(Sections)
        #Get section data
        Elements=Sections[SectionNo]['elements']
        UnitStrains=Sections[SectionNo]['strains'][:,1:]
        Nodes=Sections[SectionNo]['nodes']
        rootdist=Sections[SectionNo]['rootdist']
        #Interpolate load case data at this section
        NeighbourFind=rootdist-np.reshape(LoadCase[0,1:],(1,-1))
        NeighbourFind[NeighbourFind<0]=1e100
        nearest=np.unravel_index(NeighbourFind.argmin(), NeighbourFind.shape)
        nearestcol=int(nearest[1])
        Distance0=LoadCase[0,nearestcol+1]
        Distance1=LoadCase[0,nearestcol+7]
        MxLow=LoadCase[1:,nearestcol+1]
        MxHigh=LoadCase[1:,nearestcol+7]
        MyLow=LoadCase[1:,nearestcol+2]
        MyHigh=LoadCase[1:,nearestcol+8]
        MzLow=LoadCase[1:,nearestcol+3]
        MzHigh=LoadCase[1:,nearestcol+9]
        FxLow=LoadCase[1:,nearestcol+4]
        FxHigh=LoadCase[1:,nearestcol+10]
        FyLow=LoadCase[1:,nearestcol+5]
        FyHigh=LoadCase[1:,nearestcol+11]
        FzLow=LoadCase[1:,nearestcol+6]
        FzHigh=LoadCase[1:,nearestcol+12]
        InterpFactor=(rootdist-Distance0)/(Distance1-Distance0)
        Mx=MxLow+(MxHigh-MxLow)*InterpFactor[0,0]
        My=MyLow+(MyHigh-MyLow)*InterpFactor[0,0]
        Mz=MzLow+(MzHigh-MzLow)*InterpFactor[0,0]
        Fx=-FxLow+(FxHigh-FxLow)*InterpFactor[0,0]
        Fy=-FyLow+(FyHigh-FyLow)*InterpFactor[0,0]
        Fz=FzLow+(FzHigh-FzLow)*InterpFactor[0,0]
        #Loop through section coordinates
        for ElementNo in range(len(Elements)):
            MaterialID=int(Elements[ElementNo,1])
            if Materials[MaterialID]['curvefit'][0,0]!=3:
                StrainHist=UnitStrains[ElementNo,0]*Mx+UnitStrains[ElementNo,1]*My+UnitStrains[ElementNo,2]*Fz
            elif Materials[MaterialID]['curvefit'][0,0]==3:
                StrainHist=UnitStrains[ElementNo,3]*Fx+UnitStrains[ElementNo,4]*Fy+UnitStrains[ElementNo,5]*Mz
            EndIn=len(StrainHist)
            Extrema=np.bitwise_or(np.bitwise_and(StrainHist[1:EndIn-1]<=StrainHist[0:EndIn-2] , StrainHist[1:EndIn-1]<=StrainHist[2:EndIn]),np.bitwise_and(StrainHist[1:EndIn-1]>=StrainHist[0:EndIn-2] , StrainHist[1:EndIn-1]>=StrainHist[2:EndIn]))
            Extrema=np.concatenate((np.array([True]),Extrema,np.array([True])),axis=0)
            Extrema=StrainHist[np.where(Extrema==True)]
            del StrainHist
            #Do fatigue analysis
        self.statustext.emit('Analysing load case '+str(LoadCaseNo+1)+' of '+str(len(LoadCases[0]['LoadCaseLoops']))+' - '+str(((SectionNo+1)*100)/SectionCount)+'% complete')
        del MxLow,MxHigh,MyLow,MyHigh,MzLow,MzHigh,FxLow,FxHigh,FyLow,FyHigh,FzLow,FzHigh,Mx,My,Mz,Fx,Fy,Fz,Distance0,Distance1
        gc.collect()
Answer 1:
There's obviously a retain cycle or other leak somewhere, but without seeing your code, it's impossible to say more than that. But since you seem to be more interested in workarounds than solutions…
> In MatLab, I'd use the 'pack' function to write the workspace to disk, clear it, and then reload it at the end of each loop. I know this isn't good practice but it would get the job done! Can I do the equivalent in Python somehow?
No, Python doesn't have any equivalent to pack. (Of course if you know exactly what set of values you want to keep around, you can always np.savetxt or pickle.dump or otherwise stash them, then exec or spawn a new interpreter instance, then np.loadtxt or pickle.load or otherwise restore those values. But then if you know exactly what set of values you want to keep around, you probably aren't going to have this problem in the first place, unless you've actually hit an unknown memory leak in NumPy, which is unlikely.)
But it has something that may be better. Kick off a child process to analyze each element (or each batch of elements, if they're small enough that the process-spawning overhead matters), send the results back in a file or over a queue, then quit.
For example, if you're doing this:
def analyze(thingy):
    a = build_giant_array(thingy)
    result = process_giant_array(a)
    return result

total = 0
for thingy in thingies:
    total += analyze(thingy)
You can change it to this:
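import multiprocessing

def wrap_analyze(thingy, q):
    # Runs in the child process; only the result is sent back to the parent.
    q.put(analyze(thingy))

total = 0
for thingy in thingies:
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=wrap_analyze, args=(thingy, q))
    p.start()
    # Read the result before join(): a child that still has queued data
    # to flush may not exit, so joining first can deadlock.
    total += q.get()
    p.join()  # the child's memory is returned to the OS when it exits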
(This assumes that each thingy and result is both smallish and pickleable. If it's a huge NumPy array, look into NumPy's shared memory wrappers, which are designed to make things much easier when you need to share memory directly between processes instead of passing it.)
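It is not clear which wrappers are meant here. As one possible reading, the following is a minimal sketch that uses the standard library's multiprocessing.shared_memory module (Python 3.8+) rather than anything inside NumPy itself. The function name roundtrip_through_shared_memory is hypothetical, and both handles live in one process purely to keep the example short; in practice the second handle would be opened in a worker process, using the block's name.

```python
import numpy as np
from multiprocessing import shared_memory


def roundtrip_through_shared_memory(values):
    """Copy `values` into a shared block, read it back through a second handle.

    Hypothetical helper for illustration; assumes `values` is non-empty.
    """
    src = np.asarray(values, dtype=np.float64)
    shm = shared_memory.SharedMemory(create=True, size=src.nbytes)
    try:
        # View the shared block as an array and fill it (no pickling involved).
        writer = np.ndarray(src.shape, dtype=src.dtype, buffer=shm.buf)
        writer[:] = src

        # A worker would attach like this, given only the block's name.
        peer = shared_memory.SharedMemory(name=shm.name)
        try:
            reader = np.ndarray(src.shape, dtype=src.dtype, buffer=peer.buf)
            copy = reader.copy()  # private copy that outlives the block
            del reader            # views must be dropped before close()
        finally:
            peer.close()
        del writer
    finally:
        shm.close()
        shm.unlink()  # free the block once every user is done with it
    return copy


if __name__ == "__main__":
    print(roundtrip_through_shared_memory([1.0, 2.0, 3.0]))
```

The array views have to be deleted before close() is called, because a block that still has exported buffers refuses to close.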
But you may want to look at what multiprocessing.Pool can do to automate this for you (and to make it easier to extend the code to, e.g., use all your cores in parallel). Notice that it has a maxtasksperchild parameter, which you can use to recycle the pool processes every, say, 10 thingies, so they don't run out of memory.
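As a rough sketch of the Pool approach, assuming each thingy and each result is small and pickleable: analyze below is a hypothetical stand-in for the real analysis, and analyze_with_pid exists only so the usage code can show that worker processes really are replaced.

```python
import multiprocessing
import os


def analyze(thingy):
    """Hypothetical stand-in: build a large temporary, reduce it to a number."""
    giant = [thingy * i for i in range(100_000)]
    return sum(giant)


def analyze_with_pid(thingy):
    """Return the worker's process id alongside the result (demo only)."""
    return os.getpid(), analyze(thingy)


if __name__ == "__main__":
    thingies = range(8)
    # maxtasksperchild=2: each worker exits after two tasks and is replaced
    # by a fresh process, so leaked memory cannot accumulate for long.
    # chunksize=1 makes every thingy its own task.
    with multiprocessing.Pool(processes=2, maxtasksperchild=2) as pool:
        results = pool.map(analyze_with_pid, thingies, chunksize=1)

    total = sum(result for _, result in results)
    worker_pids = {pid for pid, _ in results}
    print("total:", total)
    print("distinct worker processes used:", len(worker_pids))
```

With 8 tasks and a limit of 2 per worker, more than two distinct process ids should show up, which is the recycling at work. The `if __name__ == "__main__":` guard matters on platforms where child processes re-import the main module.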
But back to actually trying to solve things briefly:
> I've tried fixing this by deleting the load case array at the end of each loop (after deleting all the arrays which are slices of it) and running gc.collect() but this has not had any success.
None of that should make any difference at all. If you're just reassigning all the local variables to new values each time through the loop, and aren't keeping references to them anywhere else, then they're just going to get freed up anyway, so you'll never have more than 2 at a (brief) time. And gc.collect() only helps if there are reference cycles. So, on the one hand, it's good news that these had no effect—it means there's nothing obviously stupid in your code. On the other hand, it's bad news—it means that whatever's wrong isn't obviously stupid.
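To make that distinction concrete, here is a small sketch, assuming CPython (which frees an object as soon as its reference count reaches zero). Payload is a hypothetical stand-in for a large array, and weak references are used only to observe whether an object is still alive.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for a large array."""

    def __init__(self):
        self.data = bytearray(1_000_000)
        self.partner = None


def freed_by_rebinding():
    """Rebinding a name frees the old object immediately; no del, no gc."""
    obj = Payload()
    ref = weakref.ref(obj)
    obj = Payload()  # the first Payload loses its last reference here
    return ref() is None


def freed_only_by_gc():
    """Objects in a reference cycle survive del until the collector runs."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector out of the demo
    try:
        a, b = Payload(), Payload()
        a.partner, b.partner = b, a  # a and b now reference each other
        ref = weakref.ref(a)
        del a, b
        alive_before = ref() is not None
        gc.collect()
        alive_after = ref() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, alive_after


if __name__ == "__main__":
    print(freed_by_rebinding())  # True on CPython
    print(freed_only_by_gc())    # (True, False) on CPython
```

Plain arrays that are only reassigned in a loop fall into the first case, which is why the explicit del and gc.collect() calls change nothing.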
Usually people see this because they keep growing some data structure without realizing it. For example, maybe you vstack all the new rows onto the old version of giant_array instead of onto an empty array, then delete the old version… but it doesn't matter, because each time through the loop, giant_array isn't 5*N, it's 5*N, then 10*N, then 15*N, and so on. (That's just an example of something stupid I did not long ago… Again, it's hard to give more specific examples while knowing nothing about your code.)
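The vstack mistake described above can be sketched as follows. All names here (rows_for, leaky, fixed) are hypothetical, and the arrays are tiny so that the growth pattern is easy to see.

```python
import numpy as np


def rows_for(case_no, n=5):
    """Hypothetical per-iteration result: n rows of 3 values."""
    return np.full((n, 3), float(case_no))


def leaky(num_cases):
    """Stacks each new batch onto the previous array, so it keeps growing."""
    giant_array = np.empty((0, 3))
    row_counts = []
    for case_no in range(num_cases):
        giant_array = np.vstack([giant_array, rows_for(case_no)])
        row_counts.append(giant_array.shape[0])
    return row_counts


def fixed(num_cases):
    """Starts from an empty array every iteration, so the size stays flat."""
    row_counts = []
    for case_no in range(num_cases):
        giant_array = np.vstack([np.empty((0, 3)), rows_for(case_no)])
        row_counts.append(giant_array.shape[0])
    return row_counts


if __name__ == "__main__":
    print(leaky(3))  # [5, 10, 15]
    print(fixed(3))  # [5, 5, 5]
```

Logging the shape of each long-lived array once per iteration, as both functions do here, is a cheap way to spot this kind of growth in real code.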
Source: https://stackoverflow.com/questions/27418943/free-up-memory-by-deleting-numpy-arrays