How to compare two Lists of Maps to identify matching and non-matching records with multiple filter predicates in Java 8 Streams

Posted by  ̄綄美尐妖づ on 2021-02-10 12:48:51

Question


The requirement is to get all the matching and non-matching records from two Lists of Maps, using streams with multiple matching criteria. That is, instead of a single filter that compares only "Email", I need to compare the two lists with multiple filter predicates, so that both Email and Id are compared.

List1:

[{"Email":"naveen@domain.com","Id":"A1"}, 
 {"Email":"test@domain.com","Id":"A2"}]

List2:

[{"Email":"naveen@domain.com","Id":"A1"}, 
 {"Email":"test@domain.com","Id":"A2"}, 
 {"Email":"test1@domain.com","Id":"B1"}]

Using streams, I'm able to find the matching and non-matching records with a single filter predicate on Email.

Matching Records:

[{"Email":"naveen@domain.com","Id":"A1"}, 
 {"Email":"test@domain.com","Id":"A2"}]

Non Matching Records:

[{"Email":"test1@domain.com","Id":"B1"}]

Is there a way to compare on both Email and Id instead of just Email?

dbRecords.parallelStream().filter(searchData ->
                inputRecords.parallelStream().anyMatch(inputMap ->
                    searchData.get("Email").equals(inputMap.get("Email")))).
                collect(Collectors.toList());

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
public class ListFiltersToGetMatchingRecords {


    public static void main(String[] args) {

        long startTime = System.currentTimeMillis();
        List<Map<String, Object>> dbRecords = createDbRecords();
        List<Map<String, Object>> inputRecords = createInputRecords();

        List<Map<String,Object>> matchinRecords = dbRecords.parallelStream().filter(searchData ->
                inputRecords.parallelStream().anyMatch(inputMap ->
                    searchData.get("Email").equals(inputMap.get("Email")))).
                collect(Collectors.toList());

        List<Map<String,Object>> notMatchinRecords = inputRecords.parallelStream().filter(searchData ->
                dbRecords.parallelStream().noneMatch( inputMap ->
                        searchData.get("Email").equals(inputMap.get("Email"))
                )).collect(Collectors.toList());

        long endTime = System.currentTimeMillis();
        System.out.println("Matching Records: " + matchinRecords.size());
        matchinRecords.forEach(record -> System.out.println(record.get("Email")));

        System.out.println("Non Matching Records: " + notMatchinRecords.size());
        notMatchinRecords.forEach(record -> System.out.println(record.get("Email")));

        // Report milliseconds directly; integer division by 1000 would print 0 for short runs.
        System.out.println("TotalTimeTaken = " + (endTime - startTime) + " ms");
    }

    private static List<Map<String, Object>> createDbRecords() {
        List<Map<String, Object>> dbRecords = new ArrayList<>();
        for(int i =0; i< 100; i+=2) {
            Map<String, Object> dbRecord = new HashMap<>();
            dbRecord.put("Email","naveen" + i +"@gmail.com");
            dbRecord.put("Id", "ID" + i);
            dbRecords.add(dbRecord);
        }
        return dbRecords;
    }

    private static List<Map<String, Object>> createInputRecords() {
        List<Map<String, Object>> inputRecords = new ArrayList<>();
        for(int i =0; i< 100; i++) {
            Map<String, Object> inputRecord = new HashMap<>();
            inputRecord.put("Email", "naveen" + i +"@gmail.com");
            // Same key spelling as createDbRecords(); "ID" would never match "Id".
            inputRecord.put("Id", "ID" + i);
            inputRecords.add(inputRecord);
        }
        return inputRecords;
    }
}

Answer 1:


If you care about performance, you should not combine a linear search with another linear search; the resulting quadratic time complexity can't be fixed with parallel processing when the lists get large.

You should first build a data structure which allows efficient lookups:

// Index the input records by a composite (Id, Email) key; the key spelling follows the sample data.
Map<List<?>,Map<String, Object>> inputKeys = inputRecords.stream()
    .collect(Collectors.toMap(
        m -> Arrays.asList(m.get("Id"),m.get("Email")),
        m -> m,
        (a,b) -> { throw new IllegalStateException("duplicate "+a+" and "+b); },
        LinkedHashMap::new));

// A db record matches when its composite key is present in the index.
List<Map<String,Object>> matchinRecords = dbRecords.stream()
    .filter(m -> inputKeys.containsKey(Arrays.asList(m.get("Id"),m.get("Email"))))
    .collect(Collectors.toList());

// Whatever remains in the index after removing the matches is non-matching.
matchinRecords.forEach(m -> inputKeys.remove(Arrays.asList(m.get("Id"),m.get("Email"))));
List<Map<String,Object>> notMatchinRecords = new ArrayList<>(inputKeys.values());

This solution will keep the identity of the Maps.
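To make the lookup approach above easier to try out, here is a minimal, self-contained sketch that runs it on the two sample lists from the question. The class name CompositeKeyMatchDemo and the helpers record, key and index are hypothetical names introduced only for this illustration, and the key spelling "Id" is assumed from the sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CompositeKeyMatchDemo {

    // Hypothetical helper: builds one record with the two keys from the question.
    static Map<String, Object> record(String email, String id) {
        Map<String, Object> m = new HashMap<>();
        m.put("Email", email);
        m.put("Id", id);
        return m;
    }

    // Composite lookup key; List.equals and List.hashCode compare element by element.
    static List<Object> key(Map<String, Object> m) {
        return Arrays.asList(m.get("Id"), m.get("Email"));
    }

    // Index records by composite key, failing fast on duplicate keys.
    static Map<List<Object>, Map<String, Object>> index(List<Map<String, Object>> records) {
        return records.stream().collect(Collectors.toMap(
            CompositeKeyMatchDemo::key,
            m -> m,
            (a, b) -> { throw new IllegalStateException("duplicate " + a + " and " + b); },
            LinkedHashMap::new));
    }

    // dbRecords whose (Id, Email) pair also occurs in inputRecords.
    static List<Map<String, Object>> matching(List<Map<String, Object>> dbRecords,
                                              List<Map<String, Object>> inputRecords) {
        Map<List<Object>, Map<String, Object>> inputKeys = index(inputRecords);
        return dbRecords.stream()
            .filter(m -> inputKeys.containsKey(key(m)))
            .collect(Collectors.toList());
    }

    // inputRecords whose (Id, Email) pair does not occur in dbRecords.
    static List<Map<String, Object>> nonMatching(List<Map<String, Object>> dbRecords,
                                                 List<Map<String, Object>> inputRecords) {
        Map<List<Object>, Map<String, Object>> inputKeys = index(inputRecords);
        dbRecords.forEach(m -> inputKeys.remove(key(m)));
        return new ArrayList<>(inputKeys.values());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> dbRecords = Arrays.asList(
            record("naveen@domain.com", "A1"),
            record("test@domain.com", "A2"));
        List<Map<String, Object>> inputRecords = Arrays.asList(
            record("naveen@domain.com", "A1"),
            record("test@domain.com", "A2"),
            record("test1@domain.com", "B1"));

        List<Map<String, Object>> matched = matching(dbRecords, inputRecords);
        List<Map<String, Object>> unmatched = nonMatching(dbRecords, inputRecords);

        System.out.println("Matching Records: " + matched.size());
        matched.forEach(m -> System.out.println(m.get("Email")));
        System.out.println("Non Matching Records: " + unmatched.size());
        unmatched.forEach(m -> System.out.println(m.get("Email")));

        // The result lists hold the original Map instances, not copies.
        System.out.println(matched.get(0) == dbRecords.get(0));
    }
}
```

Each lookup is a hash lookup, so the total work grows roughly with the sum of the two list sizes rather than with their product.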

If you are only interested in the values associated with the "Email" key, it would be much simpler:

// Start with every input record as Id -> Email; matches are removed further down.
Map<Object,Object> notMatchinRecords = inputRecords.stream()
    .collect(Collectors.toMap(
        m -> m.get("Id"),
        m -> m.get("Email"),
        (a,b) -> { throw new IllegalStateException("duplicate"); },
        LinkedHashMap::new));

// Sentinel that equals nothing, so a missing Id never counts as a match.
Object notPresent = new Object();
Map<Object,Object> matchinRecords = dbRecords.stream()
    .filter(m -> notMatchinRecords.getOrDefault(m.get("Id"), notPresent)
                                  .equals(m.get("Email")))
    .collect(Collectors.toMap(
        m -> m.get("Id"),
        m -> m.get("Email"),
        (a,b) -> { throw new IllegalStateException("duplicate"); },
        LinkedHashMap::new));

notMatchinRecords.keySet().removeAll(matchinRecords.keySet());

System.out.println("Matching Records: " + matchinRecords.size());
matchinRecords.forEach((id,email) -> System.out.println(email));

System.out.println("Non Matching Records: " + notMatchinRecords.size());
notMatchinRecords.forEach((id,email) -> System.out.println(email));

The first variant can easily be extended to support more or other map entries:

List<String> keys = Arrays.asList("Id", "Email");

Function<Map<String,Object>,List<?>> getKey
    = m -> keys.stream().map(m::get).collect(Collectors.toList());

Map<List<?>,Map<String, Object>> inputKeys = inputRecords.stream()
    .collect(Collectors.toMap(
        getKey,
        m -> m,
        (a,b) -> { throw new IllegalStateException("duplicate "+a+" and "+b); },
        LinkedHashMap::new));

List<Map<String,Object>> matchinRecords = dbRecords.stream()
    .filter(m -> inputKeys.containsKey(getKey.apply(m)))
    .collect(Collectors.toList());

matchinRecords.forEach(m -> inputKeys.remove(getKey.apply(m)));
List<Map<String,Object>> notMatchinRecords = new ArrayList<>(inputKeys.values());
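As a variation on the configurable key list above, the following sketch splits the input records into matching and non-matching groups in one pass with Collectors.partitioningBy. This is not the answer's exact method: it stores only the db keys in a HashSet, tolerates duplicate keys, and both result lists contain maps from inputRecords rather than from dbRecords. The class name PartitionByKeys and the helper record are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

class PartitionByKeys {

    // Hypothetical helper: builds one record with the two keys from the question.
    static Map<String, Object> record(String email, String id) {
        Map<String, Object> m = new HashMap<>();
        m.put("Email", email);
        m.put("Id", id);
        return m;
    }

    // Returns inputRecords split into true (key found in dbRecords) and false (not found).
    static Map<Boolean, List<Map<String, Object>>> partition(
            List<Map<String, Object>> dbRecords,
            List<Map<String, Object>> inputRecords,
            List<String> keys) {
        // Composite key: the values of the configured keys, in order.
        Function<Map<String, Object>, List<Object>> getKey =
            m -> keys.stream().map(m::get).collect(Collectors.toList());

        // Hash set of all composite keys present on the db side.
        Set<List<Object>> dbKeys = dbRecords.stream()
            .map(getKey)
            .collect(Collectors.toSet());

        return inputRecords.stream()
            .collect(Collectors.partitioningBy(m -> dbKeys.contains(getKey.apply(m))));
    }

    public static void main(String[] args) {
        List<Map<String, Object>> dbRecords = Arrays.asList(
            record("naveen@domain.com", "A1"),
            record("test@domain.com", "A2"));
        List<Map<String, Object>> inputRecords = Arrays.asList(
            record("naveen@domain.com", "A1"),
            record("test@domain.com", "A2"),
            record("test1@domain.com", "B1"));

        Map<Boolean, List<Map<String, Object>>> result =
            partition(dbRecords, inputRecords, Arrays.asList("Id", "Email"));

        System.out.println("Matching Records: " + result.get(true).size());
        System.out.println("Non Matching Records: " + result.get(false).size());
    }
}
```

Passing a different key list, for example only "Email", changes the matching criteria without touching the rest of the code.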



Answer 2:


Why not use && inside anyMatch?

anyMatch(inputMap -> searchData.get("Email").equals(inputMap.get("Email")) 
                     && searchData.get("Id").equals(inputMap.get("Id")))

I doubt you actually need parallelStream. On the other hand, you do need System.nanoTime instead of currentTimeMillis for measuring elapsed time.
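Put together, the suggestion might look like the sketch below: a sequential stream, both conditions joined with && inside anyMatch, and System.nanoTime for the measurement. The class name AnyMatchBothKeys and the helpers record and sameEmailAndId are hypothetical. Using Objects.equals instead of calling equals directly is an addition made here to avoid a NullPointerException when a key is missing; note that it treats two missing values as equal.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

class AnyMatchBothKeys {

    // Hypothetical helper: builds one record with the two keys from the question.
    static Map<String, Object> record(String email, String id) {
        Map<String, Object> m = new HashMap<>();
        m.put("Email", email);
        m.put("Id", id);
        return m;
    }

    // Both conditions must hold; Objects.equals is null-safe (null equals null).
    static boolean sameEmailAndId(Map<String, Object> a, Map<String, Object> b) {
        return Objects.equals(a.get("Email"), b.get("Email"))
            && Objects.equals(a.get("Id"), b.get("Id"));
    }

    // dbRecords that have at least one inputRecord with the same Email and Id.
    static List<Map<String, Object>> matching(List<Map<String, Object>> dbRecords,
                                              List<Map<String, Object>> inputRecords) {
        return dbRecords.stream()
            .filter(searchData -> inputRecords.stream()
                .anyMatch(inputMap -> sameEmailAndId(searchData, inputMap)))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Same shape as the question's test data: even numbers on the db side, all on the input side.
        List<Map<String, Object>> dbRecords = new ArrayList<>();
        for (int i = 0; i < 100; i += 2) {
            dbRecords.add(record("naveen" + i + "@gmail.com", "ID" + i));
        }
        List<Map<String, Object>> inputRecords = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            inputRecords.add(record("naveen" + i + "@gmail.com", "ID" + i));
        }

        long start = System.nanoTime();
        List<Map<String, Object>> matched = matching(dbRecords, inputRecords);
        long elapsedNanos = System.nanoTime() - start;

        System.out.println("Matching Records: " + matched.size());
        System.out.println("Elapsed: " + elapsedNanos / 1_000_000.0 + " ms");
    }
}
```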




Answer 3:


You just need to add a second condition to the comparison:

dbRecords.parallelStream().filter(searchData -> 
                  inputRecords.parallelStream().anyMatch(inputMap ->
                                     searchData.get("Email").equals(inputMap.get("Email"))
                                     && searchData.get("Id").equals(inputMap.get("Id"))))
         .collect(Collectors.toList());

  • Add the same condition in the noneMatch().
  • Compute the average time using System.nanoTime(); it's more accurate.
  • Try with and without .parallelStream() (that is, with just .stream()), because it is not certain that it helps you.
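The three points above can be sketched as follows: the noneMatch counterpart with both conditions, a crude averaging timer built on System.nanoTime, and a flag to switch the outer stream between sequential and parallel. The class name NoneMatchTiming, the helpers record and averageNanos, and the key spelling "Id" are assumptions made for this illustration. The timer does no JIT warm-up, so its numbers are only a rough indication.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NoneMatchTiming {

    // Hypothetical helper: builds one record with the two keys from the question.
    static Map<String, Object> record(String email, String id) {
        Map<String, Object> m = new HashMap<>();
        m.put("Email", email);
        m.put("Id", id);
        return m;
    }

    // inputRecords for which no dbRecord has the same Email and the same Id.
    static List<Map<String, Object>> nonMatching(List<Map<String, Object>> dbRecords,
                                                 List<Map<String, Object>> inputRecords,
                                                 boolean parallel) {
        Stream<Map<String, Object>> outer =
            parallel ? inputRecords.parallelStream() : inputRecords.stream();
        return outer
            .filter(searchData -> dbRecords.stream().noneMatch(dbMap ->
                Objects.equals(searchData.get("Email"), dbMap.get("Email"))
                    && Objects.equals(searchData.get("Id"), dbMap.get("Id"))))
            .collect(Collectors.toList());
    }

    // Average wall-clock time of a task over several runs, in nanoseconds (no warm-up).
    static long averageNanos(Runnable task, int runs) {
        long total = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            total += System.nanoTime() - start;
        }
        return total / runs;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> dbRecords = new ArrayList<>();
        for (int i = 0; i < 100; i += 2) {
            dbRecords.add(record("naveen" + i + "@gmail.com", "ID" + i));
        }
        List<Map<String, Object>> inputRecords = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            inputRecords.add(record("naveen" + i + "@gmail.com", "ID" + i));
        }

        System.out.println("Non Matching Records: "
            + nonMatching(dbRecords, inputRecords, false).size());
        System.out.println("stream() avg ns:         "
            + averageNanos(() -> nonMatching(dbRecords, inputRecords, false), 20));
        System.out.println("parallelStream() avg ns: "
            + averageNanos(() -> nonMatching(dbRecords, inputRecords, true), 20));
    }
}
```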



Answer 4:


Here it is mate...

One way to compare two Lists of Maps on every entry, rather than on named keys, is the nested noneMatch below. Note that it is a nested linear search, so it is not the fastest option for large lists:

List<Map<String,String>> unMatchedRecords = dbRecords.parallelStream().filter(searchData ->
                inputRecords.parallelStream().noneMatch( inputMap ->
                        searchData.entrySet().stream().noneMatch(value ->
                                inputMap.entrySet().stream().noneMatch(value1 ->
                                        (value1.getKey().equals(value.getKey()) &&
                                                value1.getValue().equals(value.getValue()))))
                )).collect(Collectors.toList());

Note:

  1. If the maps are declared as Map<Object,Object> rather than the Map<String,String> used above, don't forget to apply .toString() to value.getKey() and value1.getKey() before comparing them.

  2. The unmatched records obtained this way can easily be subtracted from dbRecords (the list they were filtered from) to retrieve the matching records, and that operation is quick.
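For reference, the nested noneMatch above asks whether every entry of a db record also appears in some input record, which can be written more compactly with entrySet().containsAll. The sketch below uses that shorter form and then performs the subtraction from note 2 with removeAll. The class name EntrySubsetMatch and the helper record are hypothetical. One difference from the original snippet: Map.Entry equality is null-safe, so null values do not cause a NullPointerException here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class EntrySubsetMatch {

    // Hypothetical helper: builds one record with the two keys from the question.
    static Map<String, String> record(String email, String id) {
        Map<String, String> m = new HashMap<>();
        m.put("Email", email);
        m.put("Id", id);
        return m;
    }

    // dbRecords for which no inputRecord contains all of their key/value entries.
    static <K, V> List<Map<K, V>> unmatched(List<Map<K, V>> dbRecords,
                                            List<Map<K, V>> inputRecords) {
        return dbRecords.stream()
            .filter(searchData -> inputRecords.stream()
                .noneMatch(inputMap ->
                    inputMap.entrySet().containsAll(searchData.entrySet())))
            .collect(Collectors.toList());
    }

    // Note 2: subtract the unmatched records from dbRecords to get the matched ones.
    // removeAll compares maps with Map.equals, so equal-content duplicates go together.
    static <K, V> List<Map<K, V>> matched(List<Map<K, V>> dbRecords,
                                          List<Map<K, V>> inputRecords) {
        List<Map<K, V>> result = new ArrayList<>(dbRecords);
        result.removeAll(unmatched(dbRecords, inputRecords));
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> dbRecords = Arrays.asList(
            record("naveen@domain.com", "A1"),
            record("test@domain.com", "A2"),
            record("test1@domain.com", "B1"));
        List<Map<String, String>> inputRecords = Arrays.asList(
            record("naveen@domain.com", "A1"),
            record("test@domain.com", "A2"));

        System.out.println("Non Matching Records: "
            + unmatched(dbRecords, inputRecords).size());
        System.out.println("Matching Records: "
            + matched(dbRecords, inputRecords).size());
    }
}
```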

Cheers,

Shubham Chauhan



Source: https://stackoverflow.com/questions/51280088/how-to-compare-two-list-of-map-to-identify-the-matching-and-non-matching-records
