Perform multiple unrelated operations on elements of a single stream in Java

暗喜 2020-12-10 17:18

How can I perform multiple unrelated operations on elements of a single stream?

Say I have a List composed from a text. Each string in the

4 answers
  • 2020-12-10 17:42

    This answer addresses the question from a different angle. First, let's look at how fast (or slow) it is to iterate over a list/collection. Here are the results on my machine from the performance test below:

    When: length of string list = 100, Thread number = 1, loops = 1000, unit = milliseconds


    OP: 0.013

    Accepted answer: 0.020

    By the counter function: 0.010


    When: length of string list = 1000_000, Thread number = 1, loops = 100, unit = milliseconds


    OP: 99.387

    Accepted answer: 89.848

    By the counter function: 59.183


    Conclusion: the performance gain from the single-pass collector is small, and for a short list it is even slower. In general, it is a mistake to try to reduce the number of iterations over an in-memory list/collection by using a more complicated collector: you won't gain much. If there is a performance problem, look for it somewhere else.

    Here is my performance test code, which uses the Profiler tool. (I'm not going to discuss how to do performance testing here. If you doubt the results, you can rerun the test with any tool you trust.)

    @Test
    public void test_46539786() {
        final int strsLength = 1000_000;
        final int threadNum = 1;
        final int loops = 100;
        final int rounds = 3;
    
        final List<String> strs = IntStream.range(0, strsLength).mapToObj(i -> i % 2 == 0 ? i + " of " + i : i + " for " + i).toList();
    
        Profiler.run(threadNum, loops, rounds, "OP", () -> {
            List<Integer> wordsInStr = strs.stream().filter(t -> t.contains("of")).map(t -> t.split(" ").length).collect(Collectors.toList());
            List<String> linePortionAfterFor = strs.stream().filter(t -> t.contains("for")).map(t -> t.substring(t.indexOf("for")))
                    .collect(Collectors.toList());
    
            assertTrue(wordsInStr.size() == linePortionAfterFor.size());
        }).printResult();
    
        Profiler.run(threadNum, loops, rounds, "Accepted answer", () -> {
            Splitter collect = strs.stream().collect(Collector.of(Splitter::new, Splitter::accept, Splitter::merge));
            assertTrue(collect.counts.size() == collect.words.size());
        }).printResult();
    
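        // Counts words by counting spaces; matches split(" ").length for text with single spaces between words.
        final Function<String, Integer> counter = s -> {
            int count = 0;
            for (int i = 0, len = s.length(); i < len; i++) {
                if (s.charAt(i) == ' ') {
                    count++;
                }
            }
            return count + 1;
        };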
    
        Profiler.run(threadNum, loops, rounds, "By the counter function", () -> {
            List<Integer> wordsInStr = strs.stream().filter(t -> t.contains("of")).map(counter).collect(Collectors.toList());
            List<String> linePortionAfterFor = strs.stream().filter(t -> t.contains("for")).map(t -> t.substring(t.indexOf("for")))
                    .collect(Collectors.toList());
    
            assertTrue(wordsInStr.size() == linePortionAfterFor.size());
        }).printResult();
    }
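
For readers who do not have the `Profiler` utility used above, the comparison can be roughly reproduced with plain JDK timing. This is only a minimal sketch, not the harness that produced the numbers in this answer: the class name `SinglePassTiming`, the helpers `timeMillis`, `sampleData` and `twoPasses`, and the warm-up count are all assumptions made for illustration. A dedicated benchmarking tool such as JMH will give more trustworthy results.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SinglePassTiming {

    // Hypothetical helper: average wall-clock milliseconds per call over `loops` runs.
    static double timeMillis(int loops, Runnable task) {
        // Warm up first so the JIT has a chance to compile the hot paths.
        for (int i = 0; i < 5; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < loops; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1_000_000.0 / loops;
    }

    // Same shape of data as in the test above: even indices contain " of ", odd ones " for ".
    static List<String> sampleData(int size) {
        return IntStream.range(0, size)
                .mapToObj(i -> i % 2 == 0 ? i + " of " + i : i + " for " + i)
                .collect(Collectors.toList());
    }

    // The two-stream approach labelled "OP" above; returns the total number of results.
    static int twoPasses(List<String> strs) {
        List<Integer> wordsInStr = strs.stream()
                .filter(t -> t.contains("of"))
                .map(t -> t.split(" ").length)
                .collect(Collectors.toList());
        List<String> linePortionAfterFor = strs.stream()
                .filter(t -> t.contains("for"))
                .map(t -> t.substring(t.indexOf("for")))
                .collect(Collectors.toList());
        return wordsInStr.size() + linePortionAfterFor.size();
    }

    public static void main(String[] args) {
        List<String> strs = sampleData(100_000);
        double ms = timeMillis(20, () -> twoPasses(strs));
        System.out.printf("two passes: %.3f ms per run%n", ms);
    }
}
```

The other variants can be timed the same way by passing a different lambda to `timeMillis`.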
    
  • 2020-12-10 17:49

    If you want to process the Stream in a single pass, you have to use a custom Collector (parallelization is still possible).

    class Splitter {
      public List<String> words = new ArrayList<>();
      public List<Integer> counts = new ArrayList<>();
    
      public void accept(String s) {
        if(s.contains("of")) {
          counts.add(s.split(" ").length);
        } else if(s.contains("for")) {
          words.add(s.substring(s.indexOf("for")));
        }
      }
    
      public Splitter merge(Splitter other) {
        words.addAll(other.words);
        counts.addAll(other.counts);
        return this;
      }
    }
    Splitter collect = strs.stream().collect(
      Collector.of(Splitter::new, Splitter::accept, Splitter::merge)
    );
    System.out.println(collect.counts);
    System.out.println(collect.words);
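
On Java 12 or later, a similar single pass can be sketched with the built-in `Collectors.teeing`, which feeds every element to two downstream collectors and then merges their two results. The class name `TeeingExample` and the `Result` record (records need Java 16+) are assumptions made for illustration. Note one behavioral difference: `Splitter.accept` uses `else if`, so a string containing both "of" and "for" lands only in `counts`, while this version adds it to both lists, as the two independent filters of the original two-stream code would.

```java
import java.util.List;
import java.util.stream.Collectors;

public class TeeingExample {

    // Hypothetical holder for the two results.
    record Result(List<Integer> counts, List<String> words) {}

    static Result split(List<String> strs) {
        return strs.stream().collect(Collectors.teeing(
                // Downstream 1: word counts of the strings containing "of".
                Collectors.filtering(s -> s.contains("of"),
                        Collectors.mapping(s -> s.split(" ").length, Collectors.toList())),
                // Downstream 2: the part of each string starting at "for".
                Collectors.filtering(s -> s.contains("for"),
                        Collectors.mapping(s -> s.substring(s.indexOf("for")), Collectors.toList())),
                // Merge the two downstream results into one object.
                Result::new));
    }

    public static void main(String[] args) {
        Result r = split(List.of("t of r m", "nice for nice nice again"));
        System.out.println(r.counts()); // prints [4]
        System.out.println(r.words());  // prints [for nice nice again]
    }
}
```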
    
  • 2020-12-10 18:00

    You could use a custom collector for that and iterate only once:

    private static Collector<String, ?, Pair<List<String>, List<Long>>> multiple() {
    
        class Acc {
    
            List<String> strings = new ArrayList<>();
    
            List<Long> longs = new ArrayList<>();
    
            void add(String elem) {
                if (elem.contains("of")) {
                    long howMany = elem.split(" ").length;
                    longs.add(howMany);
                }
                if (elem.contains("for")) {
                    String result = elem.substring(elem.indexOf("for"));
                    strings.add(result);
                }
    
            }
    
            Acc merge(Acc right) {
                longs.addAll(right.longs);
                strings.addAll(right.strings);
                return this;
            }
    
            public Pair<List<String>, List<Long>> finisher() {
                return Pair.of(strings, longs);
            }
    
        }
        return Collector.of(Acc::new, Acc::add, Acc::merge, Acc::finisher);
    }
    

    Usage would be:

    Pair<List<String>, List<Long>> pair = Stream.of("t of r m", "t of r m", "nice for nice nice again")
                .collect(multiple());
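
`Pair` is not a JDK class, so the snippet above needs a library that provides `Pair.of`. As a JDK-only sketch of the same collector, `Map.entry` (Java 9+) can stand in for it. The class name `MultipleCollectorExample`, the top-level `Acc` class and the method name `finish` are assumptions made for illustration; the accumulation logic is unchanged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Stream;

public class MultipleCollectorExample {

    // Mutable accumulator holding both result lists.
    static class Acc {
        final List<String> strings = new ArrayList<>();
        final List<Long> longs = new ArrayList<>();

        void add(String elem) {
            if (elem.contains("of")) {
                longs.add((long) elem.split(" ").length);
            }
            if (elem.contains("for")) {
                strings.add(elem.substring(elem.indexOf("for")));
            }
        }

        // Combines two partial results; only called for parallel streams.
        Acc merge(Acc right) {
            longs.addAll(right.longs);
            strings.addAll(right.strings);
            return this;
        }

        // Map.entry stands in for Pair.of here (an assumption, not the original API).
        Map.Entry<List<String>, List<Long>> finish() {
            return Map.entry(strings, longs);
        }
    }

    static Collector<String, ?, Map.Entry<List<String>, List<Long>>> multiple() {
        return Collector.of(Acc::new, Acc::add, Acc::merge, Acc::finish);
    }

    public static void main(String[] args) {
        Map.Entry<List<String>, List<Long>> result =
                Stream.of("t of r m", "t of r m", "nice for nice nice again").collect(multiple());
        System.out.println(result.getKey());   // prints [for nice nice again]
        System.out.println(result.getValue()); // prints [4, 4]
    }
}
```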
    
  • 2020-12-10 18:01

    If you want to make a single pass over the list, you need a way to manage two different states. You can do this by implementing Consumer in a new class for each state.

        class WordsInStr implements Consumer<String> {
    
          ArrayList<Integer> list = new ArrayList<>();
    
          @Override
          public void accept(String s) {
            Stream.of(s).filter(t -> t.contains("of")) //probably would be faster without stream here
                .map(t -> t.split(" ").length)
                .forEach(list::add);
          }
        }
    
        class LinePortionAfterFor implements Consumer<String> {
    
          ArrayList<String> list = new ArrayList<>();
    
          @Override
          public void accept(String s) {
            Stream.of(s) //probably would be faster without stream here
                .filter(t -> t.contains("for"))
                .map(t -> t.substring(t.indexOf("for")))
                .forEach(list::add);
          }
        }
    
        WordsInStr w = new WordsInStr();
        LinePortionAfterFor l = new LinePortionAfterFor();
    
        strs.stream()//stream not needed here
            .forEach(w.andThen(l));
        System.out.println(w.list);
        System.out.println(l.list);
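
As the comments in the code above suggest, the inner `Stream.of(s)` calls and the outer `stream()` are not needed. A minimal sketch of the same idea with a plain loop and two result lists follows; the class name `SingleLoopExample` and the method name `process` are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SingleLoopExample {

    final List<Integer> wordsInStr = new ArrayList<>();
    final List<String> linePortionAfterFor = new ArrayList<>();

    // Applies both unrelated operations to one element.
    void accept(String s) {
        if (s.contains("of")) {
            wordsInStr.add(s.split(" ").length);
        }
        if (s.contains("for")) {
            linePortionAfterFor.add(s.substring(s.indexOf("for")));
        }
    }

    // Walks the list exactly once and returns the holder with both results.
    static SingleLoopExample process(List<String> strs) {
        SingleLoopExample result = new SingleLoopExample();
        for (String s : strs) {
            result.accept(s);
        }
        return result;
    }

    public static void main(String[] args) {
        SingleLoopExample r = process(List.of("t of r m", "nice for nice nice again"));
        System.out.println(r.wordsInStr);          // prints [4]
        System.out.println(r.linePortionAfterFor); // prints [for nice nice again]
    }
}
```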
    