If you store millions of strings in a single collection, the add and contains operations become very slow. A better approach is to split the data across a HashMap of ArrayLists, keyed by the first few characters of each string (with keySize = 3, for example, "apple" and "application" both land in the bucket for "app", so a lookup only scans that small bucket instead of every entry). Similar partitioning can be used for other object types; this is how I made the processing of 10 million strings roughly 1000 times faster, at the cost of 2-3 times more memory.
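In miniature, the idea looks like this (a minimal sketch; the keySize of 3, the class name BucketSketch, and the sample phrases are just illustrative):

import java.util.ArrayList;
import java.util.HashMap;

public class BucketSketch {
    public static void main(String[] args) {
        HashMap<String, ArrayList<String>> buckets = new HashMap<>();
        // with keySize = 3, both sample phrases share the bucket "app"
        buckets.computeIfAbsent("app", k -> new ArrayList<>()).add("apple pie");
        buckets.computeIfAbsent("app", k -> new ArrayList<>()).add("application form");
        // a lookup only scans the small "app" bucket, not all 10 million strings
        ArrayList<String> bucket = buckets.get("app");
        System.out.println(bucket != null && bucket.contains("apple pie"));   // true
    }
}

The class itself: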
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.HashMap;
public static class ArrayHashList {
    // maps the first keySize characters of each string to the list of strings sharing that prefix
    HashMap<String, ArrayList<String>> allKeys = new HashMap<>();
    ArrayList<String> curKeys;
    private int keySize;
    public ArrayHashList(int keySize) {
        this.keySize = keySize;
    }
    public ArrayHashList(int keySize, String fromFileName) {
        this.keySize = keySize;
        String line;
        try {
            BufferedReader br1 = new BufferedReader(new FileReader(fromFileName));
            while ((line = br1.readLine()) != null)
                addString(line);
            br1.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    // adds the string to its prefix bucket; returns false if it was already present
    public boolean addString(String strToAdd) {
        // strings shorter than keySize are bucketed by the whole string
        String key = strToAdd.length() < keySize ? strToAdd : strToAdd.substring(0, keySize);
        curKeys = allKeys.get(key);
        if (curKeys == null) {
            curKeys = new ArrayList<String>();
            allKeys.put(key, curKeys);
        }
        if (curKeys.contains(strToAdd))
            return false;
        curKeys.add(strToAdd);
        return true;
    }
    public boolean haveString(String strToCheck) {
        String key = strToCheck.length() < keySize ? strToCheck : strToCheck.substring(0, keySize);
        curKeys = allKeys.get(key);
        return curKeys != null && curKeys.contains(strToCheck);
    }
}
To initialize and use it:
ArrayHashList fullHlist = new ArrayHashList(3, filesPath + "\\allPhrases.txt");
ArrayList<String> pendingList = new ArrayList<String>();
String line, wordEnc;
BufferedReader br1 = new BufferedReader(new FileReader(filesPath + "\\processedPhrases.txt"));
while ((line = br1.readLine()) != null) {
    wordEnc = StrUtil.GetFirstToken(line, ",~~~,");
    if (!fullHlist.haveString(wordEnc))
        pendingList.add(wordEnc);
}
br1.close();
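For quick in-memory checks the no-file constructor works the same way; a small sketch with made-up phrases, assuming addString and haveString behave as in the class above:

ArrayHashList hlist = new ArrayHashList(3);
hlist.addString("hello world");
System.out.println(hlist.haveString("hello world"));   // true
System.out.println(hlist.haveString("hello there"));   // false
System.out.println(hlist.addString("hello world"));    // false, already stored

Note that contains inside each bucket is still a linear scan, so the speedup depends on keySize spreading the strings over many reasonably small buckets.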