What is the best way to flatten a list of lists in Spring Batch?

半城伤御伤魂 submitted on 2020-04-30 07:26:07

Question


In Spring Batch, a processor maps one item of an input type to one item of an output type. However, I need to generate a list of the output type (List<O>) from a single input I.

The processor can return the List<O> just fine, but supposing I want to work with the elements of this list as individuals in subsequent processors. Am I expected to write them to the database first? In fact I have some enrichment from a remote service that needs to be done to each member of the List<O> so I do not want them written anywhere until the individual objects in the list can be processed.

This is related to a previous post of mine, in which I was told that @JobScope and in-memory transfer of objects between steps is a code smell 90% of the time. I'm curious whether I'm missing a special Spring Batch pattern for flattening the resulting list of lists, one that does not involve writing half-baked objects to a db, cache or flat file ahead of processing.

Ultimately, I want the writer to receive a chunk of O, not a chunk of List<O>. What is the recommended approach for this? So far I have come up with the following, used as a @JobScope bean:

import java.util.LinkedList;
import java.util.List;

import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;

public class FlatMapPipe<T> implements ItemWriter<List<T>>, ItemReader<T> {

    // Lists written so far; read() drains each one from the front
    private final LinkedList<LinkedList<T>> lists = new LinkedList<>();

    /**
     * Pages through the buffered lists to find the next item.
     *
     * @return the next item in the current list, the first item of the next
     *         non-empty list, or null when nothing is buffered
     */
    @Override
    public T read() {
        // Skip exhausted lists in a loop instead of recursing
        while (!lists.isEmpty() && lists.getFirst().isEmpty()) {
            lists.removeFirst();
        }
        return lists.isEmpty() ? null : lists.getFirst().removeFirst();
    }

    /**
     * Appends a copy of each written list to the buffer.
     * The signature is the Spring Batch 4.x one; 5.x passes a Chunk instead.
     *
     * @param list the chunk of lists to buffer
     */
    @Override
    public void write(List<? extends List<T>> list) {
        list.forEach(it -> lists.add(new LinkedList<>(it)));
    }
}

Answer 1:


The processor can return the List<O> just fine, but supposing I want to work with the elements of this list as individuals in subsequent processors. Am I expected to write them to the database first?

There is no need to write them to the database first; that would be inefficient. Encapsulation is your friend here: you can wrap the result of your processor in an aggregate type that is handed to subsequent processors in the chain (using a composite processor, for instance). The item writer is then responsible for the flat map operation, unwrapping the fully processed items from the aggregate type before writing them.
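As a rough illustration of the aggregate-type idea, here is a minimal sketch in plain Java with no Spring dependency. The names Aggregate and FlatMappingWriter are hypothetical and do not come from Spring Batch. In a real job, the per-element step would presumably live in an ItemProcessor<Aggregate<O>, Aggregate<O>> chained through a CompositeItemProcessor, and FlatMappingWriter would implement ItemWriter<Aggregate<O>> and delegate to an ItemWriter<O>; a Consumer stands in for that delegate here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.UnaryOperator;

/** Hypothetical aggregate type: holds every output derived from one input item. */
final class Aggregate<O> {
    private final List<O> items;

    Aggregate(List<O> items) {
        this.items = new ArrayList<>(items);
    }

    List<O> items() {
        return items;
    }

    /** Applies a per-element step (enrichment, for example) and returns a new aggregate. */
    Aggregate<O> map(UnaryOperator<O> step) {
        List<O> result = new ArrayList<>(items.size());
        for (O item : items) {
            result.add(step.apply(item));
        }
        return new Aggregate<>(result);
    }
}

/** Hypothetical writer: unwraps a chunk of aggregates into one flat chunk. */
final class FlatMappingWriter<O> {
    // Stand-in for the real downstream writer (an assumption of this sketch)
    private final Consumer<List<O>> delegate;

    FlatMappingWriter(Consumer<List<O>> delegate) {
        this.delegate = delegate;
    }

    void write(List<? extends Aggregate<O>> chunk) {
        List<O> flat = new ArrayList<>();
        for (Aggregate<O> aggregate : chunk) {
            flat.addAll(aggregate.items());
        }
        // The delegate only ever sees individual items, never lists
        delegate.accept(flat);
    }
}
```

With this shape, nothing is persisted until every element has passed through all processors, and only the writer needs to know that the items arrived grouped.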

Another technique is to use two concurrent steps with a staging area (where you would flatten the items), as described in issue #2044. I implemented a PoC with a blocking queue as the staging area. In your case, the first step would process items and write the results to the queue, and the second step would read (flat) items from the queue, enrich them as necessary and write them where appropriate.
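The staging-area idea can be sketched in plain Java as follows. This is not the PoC mentioned above, only an independent illustration; the class name StagingArea, the capacity and the timeout are assumptions. In a real job, the write side would presumably be an ItemWriter<List<T>> in the first step and the read side an ItemReader<T> in the second, with both steps running concurrently. Treating a poll timeout as "no more input" is a simplification made for this sketch.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical staging area shared by two concurrently running steps. */
final class StagingArea<T> {
    private final BlockingQueue<T> queue;
    private final long timeoutMillis;

    StagingArea(int capacity, long timeoutMillis) {
        // A bounded queue makes the producing step wait when the consumer lags
        this.queue = new LinkedBlockingQueue<>(capacity);
        this.timeoutMillis = timeoutMillis;
    }

    /** Writer side (first step): flattens a chunk of lists into the queue. */
    void write(List<? extends List<T>> chunk) {
        try {
            for (List<T> list : chunk) {
                for (T item : list) {
                    queue.put(item); // blocks while the queue is full
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while staging items", e);
        }
    }

    /**
     * Reader side (second step): returns the next flat item, or null if none
     * arrives within the timeout (used here as the end-of-input signal).
     */
    T read() {
        try {
            return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while reading items", e);
        }
    }
}
```

Because the second step reads individual items, its processors and writer work with a chunk of T rather than a chunk of List<T>, and nothing has to be persisted between the two steps.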



Source: https://stackoverflow.com/questions/60644783/what-is-the-best-way-to-flatten-a-list-of-lists-in-spring-batch
