comparing two collections for comparing two text files for additions, deletions, modifications

Posted by 自古美人都是妖i on 2019-12-18 09:36:33

Question


I have two collections as below which hold IDs for Students.

The IDs are Strings in the format 111-1111, for example 221-2534, 215-6365, etc.

 Collection<String> newKeys = new ArrayList<String>();
 Collection<String> oldKeys = new ArrayList<String>();

The IDs are stored in a fixed-width file along with other data: the first 8 characters are the ID, the next 10 the name, the next 10 the address, and so on.

I am reading the IDs into the collections as below:

String oldFile = "C:\\oldFile.dat";
String newFile = "C:\\newFile.dat";
BufferedReader in;
String str;
// Read keys from old file
in = new BufferedReader(new FileReader(oldFile));
while ((str = in.readLine()) != null) {
      oldKeys.add(str.substring(0, 8).trim());
}
in.close();

// Read keys from new file
in = new BufferedReader(new FileReader(newFile));
while ((str = in.readLine()) != null) {
    newKeys.add(str.substring(0, 8).trim());
}
in.close();   

Here the entries in the file are sorted on SSN. So I believe the collections formed will also be sorted.

Now:

Case: I want to compare the two collections and get the differences as lists. That is, I need one list of entries that were added, one of entries that were removed, and one of entries that are the same.

I will then use the list of common entries to read the corresponding data from both files and compare it for modifications.

That is, once I have the common list:

a) Take an ID from the list. Read the corresponding data for this ID from both files into Strings. Compare the Strings for any differences. If they differ, write the newFile String to a fileWithUpdates.

b) Do nothing if there is no difference.

Questions:

1) Is this the correct approach?

2) How do I compare the two collections to get the resulting lists, viz. toBeDeleted, toBeAdded and sameEntries?

3) How do I read a specific line from a file by key (the student ID in this case)?

Update:

Based on the answer below, I added this code:

Iterator<String> iOld = oldKeys.iterator();
Iterator<String> iNew = newKeys.iterator();
Map<String, String> tempMap = new HashMap<String, String>();

while (iOld.hasNext()) {
    tempMap.put(iOld.next(), "old");
}

while (iNew.hasNext()) {
    String temp = iNew.next();
    if (tempMap.containsKey(temp)) {
        tempMap.put(temp, "both");
    }
    else {
        System.out.println("here");
        tempMap.put(temp, "new");
    }
}

So now I have a map which has:

Entries to be compared: Entries in above map with value "both"

Entries to be added: Entries in above map with value "new"

Entries to be deleted: Entries in above map with value "old"

So my problem boils down to:

How do I read a specific line from a file by key, so that I can compare the data for modifications?

Thanks for reading!


Answer 1:


Overall, I don't think this is the correct approach. Instead of storing all the information in a single String, I would create an object with fields for the various things you need to store.

public class Student {
    String id; // or int, or char[8]
    String firstName, lastName;
    String address;
    // and so on

    // Constructor: given a line of input from the data file, create a Student object
    public Student(String line) {
        id = line.substring(0, 8);
        // and so on
    }
}
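
A slightly fuller sketch of such a class, assuming the 8/10/10 character layout described in the question. The class name StudentRecord, the single name field, and the padding of short lines are my own assumptions for illustration, not part of the original data format. Giving the class an equals method means two records with the same ID can be compared directly to detect a modification.

```java
import java.util.Objects;

// Hypothetical layout: 8-char id, 10-char name, 10-char address (per the question).
final class StudentRecord {
    final String id;
    final String name;
    final String address;

    StudentRecord(String line) {
        // Assumption: pad short lines to 28 chars so substring never throws.
        String padded = String.format("%-28s", line);
        id = padded.substring(0, 8).trim();
        name = padded.substring(8, 18).trim();
        address = padded.substring(18, 28).trim();
    }

    // Two records are equal only if every parsed field matches.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof StudentRecord)) {
            return false;
        }
        StudentRecord s = (StudentRecord) other;
        return id.equals(s.id) && name.equals(s.name) && address.equals(s.address);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, name, address);
    }
}
```

With this, a record from the old file and one from the new file count as "modified" when their IDs match but equals returns false.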

As for comparing the two collections, let's declare them both as ArrayLists and then keep track of the indices of what they have in common.

ArrayList<String> newKeys = new ArrayList<>();  // Java 7 syntax
ArrayList<String> oldKeys = new ArrayList<>();
// Store keys from files.

TreeMap<Integer, Integer> commonKeys = new TreeMap<Integer, Integer>();
// Maps each index in newKeys to the index of the same key in oldKeys.

ArrayList<Integer> removedKeys = new ArrayList<>();
// Stores the indices from oldKeys that are not in newKeys.

int newListIndex = 0;
int oldListIndex = 0;
// Both lists must be sorted in String.compareTo order for this walk to work.
while (newListIndex < newKeys.size() && oldListIndex < oldKeys.size()) {
    int cmp = newKeys.get(newListIndex).compareTo(oldKeys.get(oldListIndex));
    if (cmp == 0) {
        commonKeys.put(newListIndex, oldListIndex);
        oldListIndex++;
        newListIndex++;
    }
    else if (cmp > 0) {
        // The old key sorts before the current new key, so it is missing from newKeys.
        removedKeys.add(oldListIndex);
        oldListIndex++;
    }
    else {
        // This new key is not in the old list, so it was added.
        newListIndex++;
    }
}
// Old keys left over once newKeys is exhausted were removed as well.
while (oldListIndex < oldKeys.size()) {
    removedKeys.add(oldListIndex++);
}

You will need to tweak the above code a bit to make it fail-safe. Another approach is to use the contains method like this:

for (int i = 0; i < oldKeys.size(); i++) {
    String oldKey = oldKeys.get(i);
    if (newKeys.contains(oldKey))
        commonKeys.put(newKeys.indexOf(oldKey), i);
    else
        removedKeys.add(i);
}



Answer 2:


If your files are not too large, you could do the following:

  • Create a HashMap
  • For every entry in old file, add it with value 'Old'
  • For every entry in new file,
    • Check if it is in the HashMap
      • If so, then set value 'Both' (Additionally, you could add it to a HashMap of common elements)
      • If not, add it with value 'New'
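
The steps above can be sketched roughly as follows, with one extra pass that splits the tagged map into the three lists asked for in question 2. The class name KeyDiff is hypothetical, and the use of LinkedHashMap (to keep the keys in file order) is my own choice rather than something the steps require. Note that a key repeated in the new file stays tagged 'new' unless it was also seen in the old file.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: tags every key as "old", "new" or "both", then splits the tags into lists.
final class KeyDiff {
    final List<String> toBeAdded = new ArrayList<>();
    final List<String> toBeDeleted = new ArrayList<>();
    final List<String> sameEntries = new ArrayList<>();

    KeyDiff(List<String> oldKeys, List<String> newKeys) {
        // LinkedHashMap preserves insertion order, so results follow file order.
        Map<String, String> tags = new LinkedHashMap<>();
        for (String key : oldKeys) {
            tags.put(key, "old");
        }
        for (String key : newKeys) {
            String previous = tags.get(key);
            // Only a key already seen in the old file becomes "both".
            boolean onlyInNew = previous == null || previous.equals("new");
            tags.put(key, onlyInNew ? "new" : "both");
        }
        for (Map.Entry<String, String> entry : tags.entrySet()) {
            switch (entry.getValue()) {
                case "old":
                    toBeDeleted.add(entry.getKey());
                    break;
                case "new":
                    toBeAdded.add(entry.getKey());
                    break;
                default:
                    sameEntries.add(entry.getKey());
            }
        }
    }
}
```

Each key is looked up a constant number of times, so the whole comparison is linear in the total number of keys, at the cost of holding all keys in memory.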

This should hopefully address question 2. Please do let me know if it works. Thanks!




Answer 3:


You could proceed like this:

Collection<String> newKeys = new ArrayList<String>();
Collection<String> oldKeys = new ArrayList<String>();

// removeAll returns a boolean, so copy first and then remove in place.
Collection<String> toBeDeleted = new ArrayList<String>(oldKeys);
toBeDeleted.removeAll(newKeys);   // in old but not in new
Collection<String> toBeAdded = new ArrayList<String>(newKeys);
toBeAdded.removeAll(oldKeys);     // in new but not in old

Collection<String> sameEntries = new ArrayList<String>(newKeys);
sameEntries.removeAll(toBeAdded); // what is left is in both

For the third question, though, you would be better off using a HashMap (or a TreeMap if you want to keep the keys automatically sorted).

Update:

In your original file reading code, you can make the following change:

Map<String, String> oldContentMap = new HashMap<String, String>();  
while ((str = in.readLine()) != null) {       
    oldKeys.add(str.substring(0, 8).trim()); 
    oldContentMap.put(str.substring(0, 8).trim(),str.substring(8).trim());
} 
in.close(); 

and similarly for the new file:

Map<String, String> newContentMap = new HashMap<String, String>();
while ((str = in.readLine()) != null) {
    newKeys.add(str.substring(0, 8).trim());
    newContentMap.put(str.substring(0, 8).trim(), str.substring(8).trim());
}
in.close();

Now you can compare the contents:

for (Map.Entry<String, String> entry : tempMap.entrySet()) {
    if (entry.getValue().equals("both")) { // compare only keys present in both lists
        String oldContent = oldContentMap.get(entry.getKey());
        String newContent = newContentMap.get(entry.getKey());
        if (!oldContent.equals(newContent)) {
            System.out.println("Different data for key:" + entry.getKey());
        }
    }
}

You can use temporary variables as needed and move the declarations outside the loop as well.




Answer 4:


I would approach your task this way:

  • Create two HashMaps, one for each file (oldFile, newFile); your IDs will be the keys of the maps.
  • Build new ArrayLists: common, toBeAdded, toBeDeleted.
  • Loop over the keys of oldKeysHashMap: for each key, check whether it exists in newKeysHashMap. If yes, check whether the two maps hold the same value for that key (this is easy with Maps) and put the entry in the common ArrayList. If no, put the entry into toBeDeleted.
  • Loop over newKeysHashMap and fill in the toBeAdded ArrayList.
  • Merge the toBeAdded and common ArrayLists into a new one. Delete the two original files. Write a new file and fill it with the entries of the merged ArrayList. (Deleting and recreating the file should be quicker than searching for the IDs in the file and removing the lines.)

I can also provide a code snippet. If you need the entries kept sorted, use a Map implementation that does so. HashMap does not; TreeMap (a SortedMap) could be the right one.
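
As a rough sketch of the steps above, assuming each file has already been read into a list of lines: the class name RecordDiff, the index helper, and the separate modified list are my own assumptions for illustration. The steps do not say where a common key with different values should go, so this version keeps such keys apart, since detecting modifications is what the question is after.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper comparing two id -> line maps built from the old and new files.
final class RecordDiff {
    final List<String> toBeAdded = new ArrayList<>();
    final List<String> toBeDeleted = new ArrayList<>();
    final List<String> unchanged = new ArrayList<>();
    final List<String> modified = new ArrayList<>();

    RecordDiff(Map<String, String> oldRecords, Map<String, String> newRecords) {
        for (Map.Entry<String, String> entry : oldRecords.entrySet()) {
            String key = entry.getKey();
            if (!newRecords.containsKey(key)) {
                toBeDeleted.add(key);
            } else if (entry.getValue().equals(newRecords.get(key))) {
                unchanged.add(key);
            } else {
                modified.add(key);
            }
        }
        for (String key : newRecords.keySet()) {
            if (!oldRecords.containsKey(key)) {
                toBeAdded.add(key);
            }
        }
    }

    // Builds a sorted id -> full line map; the id is the first 8 characters of each line.
    static Map<String, String> index(List<String> lines) {
        Map<String, String> byId = new TreeMap<>();
        for (String line : lines) {
            if (line.length() < 8) {
                continue; // Assumption: lines too short to hold an id are skipped.
            }
            byId.put(line.substring(0, 8).trim(), line);
        }
        return byId;
    }
}
```

Because both maps are keyed by ID, looking up the line for a given ID is a single get call, which also covers question 3 without re-reading the files.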



Source: https://stackoverflow.com/questions/9766720/comparing-two-collections-for-comparing-two-text-files-for-additions-deletions
