Question
I am working on an application that has a large array containing lines of numbers:
transNum[20000][200] // the 2D array holding the numbers; I also keep track of the number of lines
I am using a nested loop to look for the most frequent items:
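for (int i = 0 /*, lineitems = 0 */; i < lineCounter; i++)
{
    for (int j = 0, shows = 1; j < lineitem1[i]; j++)
    {
        // compare item j of line i with every item in every later line
        for (int t = i + 1; t < lineCounter; t++)
        {
            for (int s = 0; s < lineitem1[t]; s++)
            {
                if (transNum[i][j] == transNum[t][s])
                    shows++;
            }
        }
        // cast to double so the ratio is not truncated by integer division
        if ((double) shows / lineCounter >= 0.2)
        {
            freItem[i][lineitem2[i]] = transNum[i][j];
            lineitem2[i]++;
        }
    }
}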
When I tested with small input arrays such as test[200][200], this loop worked fine and the computing time was acceptable. But when I try to process an array containing 12000 lines, the computing time is far too long, so I am wondering whether there is another way to compute the frequent items than this loop. I just ran a test on 10688 lines, and getting all the frequent items took 825805 ms, which is way too expensive.
Answer 1:
It depends on your input. If the same code also inserts the data, you can count how often each number occurs as you insert it.
Here is a pseudo-C solution:
int counts[1000000] = {0}; // one counter per possible value, all starting at zero
while(each number as n)
{
    counts[n]++; // assumes 0 <= n < 1000000
    // then insert number into array
}
EDIT #2: Make sure to initialize every element of the counts array to zero first, otherwise you will get unexpected results.
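Applied to the layout in the question, the same idea might look like the Java sketch below. It is only an illustration under some assumptions: every number is non-negative and no larger than a known bound (the hypothetical parameter maxValue), lineLength plays the role of lineitem1, and the class and method names are made up for this example. Unlike freItem in the question, it reports each frequent value once instead of once per line.

```java
import java.util.ArrayList;
import java.util.List;

public class FrequentItemCounter {

    /**
     * Returns, in ascending order, every value whose total number of
     * occurrences divided by lineCount is at least minSupport.
     * Assumption: all values lie in [0, maxValue].
     */
    public static List<Integer> frequentItems(int[][] transNum, int[] lineLength,
                                              int lineCount, int maxValue,
                                              double minSupport) {
        int[] counts = new int[maxValue + 1]; // Java zero-initializes int arrays

        // Single pass over the data: one increment per stored number.
        for (int i = 0; i < lineCount; i++) {
            for (int j = 0; j < lineLength[i]; j++) {
                counts[transNum[i][j]]++;
            }
        }

        // Second pass over the possible values to apply the threshold.
        List<Integer> frequent = new ArrayList<>();
        for (int value = 0; value <= maxValue; value++) {
            if ((double) counts[value] / lineCount >= minSupport) {
                frequent.add(value);
            }
        }
        return frequent;
    }

    public static void main(String[] args) {
        // Made-up data: trailing zeros are padding and are skipped via lineLength.
        int[][] transNum = {
            {1, 2, 3, 0},
            {2, 3, 0, 0},
            {2, 5, 0, 0},
            {7, 0, 0, 0},
            {2, 3, 8, 0},
        };
        int[] lineLength = {3, 2, 2, 1, 3};
        System.out.println(frequentItems(transNum, lineLength, 5, 10, 0.4)); // prints [2, 3]
    }
}
```

Each stored number is touched exactly once, so the running time grows linearly with the amount of data rather than with its square. If the values are not bounded, a map from value to count can replace the array.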
Answer 2:
Bear in mind this is an O(n^2) algorithm at best and could be worse. That means the number of operations is proportional to the square of the number of items. Beyond a certain number of lines, performance will degrade rapidly, and there is nothing you can do about it except improve the algorithm.
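To get a feel for that growth, the sketch below counts how many element comparisons the question's nested loops would perform for a given set of line lengths, next to the number of steps a single counting pass would need. The class and method names are invented for this illustration, and the line lengths in main are hypothetical (every line full at 200 items), since the question does not give the real ones.

```java
import java.util.Arrays;

public class ComparisonCount {

    /**
     * Number of transNum[i][j] == transNum[t][s] comparisons in the question's
     * loops: every item of line i is compared with every item of each later line.
     */
    public static long nestedLoopComparisons(int[] lineLengths) {
        long comparisons = 0;
        long itemsInLaterLines = 0;
        // Walk backwards so itemsInLaterLines is the total length of all lines after i.
        for (int i = lineLengths.length - 1; i >= 0; i--) {
            comparisons += (long) lineLengths[i] * itemsInLaterLines;
            itemsInLaterLines += lineLengths[i];
        }
        return comparisons;
    }

    /** A single counting pass touches each item exactly once. */
    public static long singlePassOperations(int[] lineLengths) {
        long total = 0;
        for (int length : lineLengths) {
            total += length;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] small = new int[200];
        int[] large = new int[12000];
        Arrays.fill(small, 200); // hypothetical: 200 lines of 200 items
        Arrays.fill(large, 200); // hypothetical: 12000 lines of 200 items

        System.out.println(nestedLoopComparisons(small)); // 796000000
        System.out.println(nestedLoopComparisons(large)); // 2879760000000
        System.out.println(singlePassOperations(large));  // 2400000
    }
}
```

Under these assumed sizes, going from 200 to 12000 lines multiplies the data by 60 but the comparisons by about 3,600, while a single pass grows only by the same factor of 60 as the data.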
Answer 3:
The Multiset implementation from the Google Guava project might be useful in such cases. You could store the items there and then retrieve the set of distinct values together with the count of each one.
Answer 4:
Gave the algorithm for this one some thought. Here's the solution I came up with:
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class NumberTotalizerTest {

    public static void main(String[] args) {
        Map<Integer, Integer> hashMap = new HashMap<Integer, Integer>();

        // Number input
        Random randomGenerator = new Random();
        for (int i = 1; i <= 50; ++i) {
            int randomInt = randomGenerator.nextInt(15);
            System.out.println("Generated : " + randomInt);
            Integer tempInt = hashMap.get(randomInt);
            // Counting takes place here
            hashMap.put(randomInt, tempInt == null ? 1 : (tempInt + 1));
        }

        // Sorting and display
        System.out.println("Occurrences from lowest to highest:");
        for (Integer key : sortByValue(hashMap)) {
            System.out.println("Number: " + key + ", occurrences: " + hashMap.get(key));
        }
    }

    // Returns the map's keys ordered by their associated values, lowest first.
    public static <K, V extends Comparable<V>> List<K> sortByValue(final Map<K, V> m) {
        List<K> keys = new ArrayList<K>(m.keySet());
        Collections.sort(keys, new Comparator<K>() {
            public int compare(K o1, K o2) {
                V v1 = m.get(o1);
                V v2 = m.get(o2);
                if (v1 == null) {
                    return (v2 == null) ? 0 : 1;
                }
                return v1.compareTo(v2);
            }
        });
        return keys;
    }
}
Source: https://stackoverflow.com/questions/3847079/how-to-get-the-most-frequent-items