Performance difference between Stream.map and Collectors.mapping [duplicate]

Question

This question already has answers here:

What's the difference between Stream.map(…) and Collectors.mapping(…)? (2 answers)

Why Stream operations is duplicated with Collectors? (2 answers)

Collectors.summingInt() vs mapToInt().sum() (1 answer)

Closed last month.

Recently I was exploring the functional-programming features of Java 8 and above, and I came across the static method mapping in the Collectors class.

We have a class Employee like:

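import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Getter;

// Lombok generates the all-args constructor, the builder, and the getters.
@AllArgsConstructor
@Builder
@Getter
public class Employee {
  private String name;
  private Integer age;
  private Double salary;
}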

Let's say that we have a list of Employee POJOs and we want to get a list of all the employees' names. There are two approaches:

    List<Employee> employeeList
        = Arrays.asList(new Employee("Tom Jones", 45, 15000.00),
        new Employee("Harry Andrews", 45, 7000.00),
        new Employee("Ethan Hardy", 65, 8000.00),
        new Employee("Nancy Smith", 22, 10000.00),
        new Employee("Deborah Sprightly", 29, 9000.00));

    // IntelliJ suggests replacing the first approach with map and collect

    List<String> collect =
        employeeList
        .stream()
        .collect(
            Collectors.mapping(Employee::getName, Collectors.toList()));

    List<String> collect1 =
        employeeList
            .stream()
            .map(Employee::getName)
            .collect(Collectors.toList());

I know that the first approach does the mapping inside the terminal operation (collect), while the second uses an intermediate operation (map) before collecting, but I want to know whether the first approach performs worse than the second, or vice versa. I would be grateful if you could explain any potential performance degradation in the first case as our data source (employeeList) grows significantly in size.

EDIT:

I created two simple test cases fed with records generated in a simple for loop. For small inputs, the difference between the "traditional" approach using Stream.map and Collectors.mapping is marginal. However, when the input grows to something like 30,000,000 records, Collectors.mapping surprisingly starts to perform slightly better. To give concrete numbers: for 30,000,000 records, Collectors.mapping takes 56 seconds for 10 iterations run as a @RepeatedTest, while the more familiar Stream.map followed by collect takes 5 seconds longer for the same input and the same number of iterations. I know that my provisional tests are not the best and cannot reflect reality because of JVM optimizations, but we can claim that for huge inputs Collectors.mapping can be preferable. Anyway, I think that this

Answer 1:

I doubt there is a meaningful performance difference. You'd have to benchmark it on your data to know for sure.
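As a rough illustration of what "benchmark it" could look like, here is a minimal timing sketch. It is only a sketch under stated assumptions: the class name MapVsMappingTiming, the simplified Employee stand-in, the helper methods, and the input size of 1,000,000 are all hypothetical and not taken from the question. Wall-clock timing with System.nanoTime is easily distorted by JIT warm-up and garbage collection, so a dedicated harness such as JMH would likely give more trustworthy numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;

public class MapVsMappingTiming {

    // Simplified, hypothetical stand-in for the question's Employee class (no Lombok).
    static final class Employee {
        private final String name;
        Employee(String name) { this.name = name; }
        String getName() { return name; }
    }

    // Approach from the question that uses Collectors.mapping as the top-level collector.
    static List<String> viaMapping(List<Employee> source) {
        return source.stream()
                .collect(Collectors.mapping(Employee::getName, Collectors.toList()));
    }

    // Approach from the question that uses the intermediate map operation.
    static List<String> viaMap(List<Employee> source) {
        return source.stream()
                .map(Employee::getName)
                .collect(Collectors.toList());
    }

    // Generates n employees in a simple for loop, as the question describes.
    static List<Employee> generate(int n) {
        List<Employee> employees = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            employees.add(new Employee("employee-" + i));
        }
        return employees;
    }

    // Runs the task once and returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(Supplier<List<String>> task) {
        long start = System.nanoTime();
        List<String> result = task.get();
        long elapsed = System.nanoTime() - start;
        if (result.isEmpty()) {
            throw new IllegalStateException("unexpected empty result");
        }
        return elapsed / 1_000_000;
    }

    public static void main(String[] args) {
        List<Employee> employees = generate(1_000_000); // assumed size, adjust to your data

        // Warm-up rounds so the JIT has a chance to compile both code paths.
        for (int i = 0; i < 5; i++) {
            viaMapping(employees);
            viaMap(employees);
        }

        System.out.println("Collectors.mapping: " + timeMillis(() -> viaMapping(employees)) + " ms");
        System.out.println("Stream.map:         " + timeMillis(() -> viaMap(employees)) + " ms");
    }
}
```

Both methods produce the same list, so any difference this reports reflects only how the mapping step is wired into the pipeline, and single runs like this can easily vary by more than that difference.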

Note that mapping isn't actually intended to be used directly as a collector, but rather as a downstream collector within another collector:

The mapping() collectors are most useful when used in a multi-level reduction, such as downstream of a groupingBy or partitioningBy.
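To make the quoted sentence concrete, here is a small sketch of mapping used as a downstream collector of groupingBy, collecting employee names per age. The class name MappingDownstreamExample, the method namesByAge, and the trimmed-down Employee (name and age only, no Lombok) are assumptions made for this illustration and do not appear in the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MappingDownstreamExample {

    // Simplified, hypothetical stand-in for the question's Employee class.
    static final class Employee {
        private final String name;
        private final Integer age;
        Employee(String name, Integer age) {
            this.name = name;
            this.age = age;
        }
        String getName() { return name; }
        Integer getAge() { return age; }
    }

    // Groups employees by age; mapping() turns each group's Employee
    // elements into names before toList() collects them.
    static Map<Integer, List<String>> namesByAge(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(
                        Employee::getAge,
                        Collectors.mapping(Employee::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee("Tom Jones", 45),
                new Employee("Harry Andrews", 45),
                new Employee("Ethan Hardy", 65));

        Map<Integer, List<String>> result = namesByAge(employees);
        System.out.println(result.get(45)); // prints [Tom Jones, Harry Andrews]
        System.out.println(result.get(65)); // prints [Ethan Hardy]
    }
}
```

Here a plain Stream.map before collect would not work, because groupingBy still needs the whole Employee to compute the key; the transformation to a name has to happen inside each group, which is the job mapping does.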

There is something in Effective Java 3rd Edition about this too (in Item 46, about 2/3 of the way down page 214, the paragraph starting "The collectors returned by the counting method"). Basically, it says not to use things like mapping in the first way you do here.

Source: https://stackoverflow.com/questions/58389258/performance-difference-between-stream-map-and-collectors-mapping

Tags

java

functional-programming

java-stream

collectors