Question
I have some code that processes around 30,000 records. The basic outline is like this:
startRecordID = 2345;
endRecordID = 32345;
for(recordID=startRecordID; recordID <= endRecordID; recordID++){
// process record...
}
Now, this processing takes a long time, and I'd like to have a thread pool of 15 threads and give each thread a list of recordIDs to process, and then join them all at the end.
In the past I accomplished this with code that looked something like this, where recordLists was an array of sub-arrays each containing 1/15 of the records to be processed:
<cfset numThreads = 15 />
<!--- keep a running list of threads so we can join them all at the end --->
<cfset threadlist = "" />
<cfloop from="1" to="#numThreads#" index="threadNum">
    <cfset threadName = "recordProcessing_#threadNum#" />
    <cfset threadlist = listAppend(threadlist, threadName) />
    <!--- pass this thread's sub-array in as an attribute; it is copied into the thread's Attributes scope --->
    <cfthread action="run" name="#threadName#" recordList="#recordLists[threadNum]#">
        <cfloop from="1" to="#ArrayLen(attributes.recordList)#" index="recordIndex">
            <cfset recordID = attributes.recordList[recordIndex] />
            ... process recordID ...
        </cfloop>
    </cfthread>
</cfloop>
<!--- Join all threads before continuing (timeout is in milliseconds) --->
<cfthread action="join" name="#threadlist#" timeout="4000"/>
This worked well (although I would also convert this old code to cfscript :) ), but creating the recordLists array of sub-arrays is not so simple. The only way I can think of is to loop through the numbers from startRecordID to endRecordID, add each one to an array, and then run an ArrayDivide function (which we have already defined in our codebase) on it to split it into numThreads (in this case 15) equal sub-arrays. Considering that I have the start of the range, the end of the range, and the number of threads I want to divide it among, isn't there a simpler way to break it up and assign it to the threads?
Answer 1:
(From comments...)
If you already have an array, why loop through it again? There are no built-in functions for this, but since a CF array is a Java List, a simple yourArray.subList(startIndex, endIndex) would do the trick. Obviously, add some error handling in case the number of records is less than the number of processing threads.
NB: Since it is a Java method, indexes start at zero (0) and the endIndex is exclusive. Also, the result is like a CF array in most respects. However, it is immutable, i.e. it cannot be modified.
<cfscript>
    // calculate how many records to process in each batch
    numOfIterations = 15;
    totalRecords = arrayLen(recordsArray);
    batchSize = ceiling(totalRecords / numOfIterations);
    for (t = 0; t < numOfIterations; t++) {
        // calculate sub array positions (zero-based, endAt is exclusive)
        startAt = t * batchSize;
        // stop once every record has been assigned; otherwise startAt could
        // exceed endAt and subList() would throw
        if (startAt >= totalRecords) {
            break;
        }
        endAt = min(startAt + batchSize, totalRecords);
        // get next batch of records
        subArray = recordsArray.subList(startAt, endAt);
        // kick off a thread and do whatever you want with the array ...
        writeOutput("<br>Batch [" & t & "] startAt=" & startAt & " endAt=" & endAt);
    }
</cfscript>
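If the record IDs really are one contiguous range, as in the question, the same batch arithmetic can probably be applied to the IDs themselves, so no array needs to be built at all. The following is only a sketch under that assumption: splitRange, numBatches, firstID and lastID are hypothetical names made up for illustration, not anything from the question or an existing library.

```cfml
<cfscript>
// Hypothetical helper (not part of the original answer): split the inclusive
// range [startID, endID] into at most numBatches contiguous sub-ranges.
function splitRange(required numeric startID, required numeric endID, required numeric numBatches) {
    var ranges = [];
    var total = endID - startID + 1;
    var batchSize = 0;
    var firstID = 0;
    // nothing to do for an empty range or a non-positive batch count
    if (total <= 0 || numBatches <= 0) {
        return ranges;
    }
    batchSize = ceiling(total / numBatches);
    for (firstID = startID; firstID <= endID; firstID += batchSize) {
        // the last batch may be shorter than batchSize
        arrayAppend(ranges, { firstID = firstID, lastID = min(firstID + batchSize - 1, endID) });
    }
    return ranges;
}

// values taken from the question
ranges = splitRange(2345, 32345, 15);
for (i = 1; i <= arrayLen(ranges); i++) {
    writeOutput("<br>Batch [" & i & "] firstID=" & ranges[i].firstID & " lastID=" & ranges[i].lastID);
}
</cfscript>
```

With the figures from the question this yields 15 ranges, the last one slightly shorter than the others. Each struct's firstID and lastID could then be passed to a thread as attributes, in the same way the question passes recordList, with the thread looping from firstID to lastID instead of over a sub-array.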
Source: https://stackoverflow.com/questions/28740376/how-can-i-split-a-range-of-values-among-a-pool-of-threads