Kafka multiple partition ordering

余生分开走 2020-12-29 14:11

I am aware that it is not possible to order multiple partitions in Kafka and that partition ordering is only guaranteed for a single consumer within a group (for a single pa

2 Answers
  •  攒了一身酷
    2020-12-29 14:26

    I'm not using Kafka Streams, but it is possible to do this with the normal Consumer.

    First sort the partitions. This assumes you've already seeked each partition to the offset you want, or used a consumer group to do it.

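    // Needs java.util.{ArrayList, Comparator, List, Set},
    // org.apache.kafka.clients.consumer.{ConsumerRecord, ConsumerRecords}
    // and org.apache.kafka.common.TopicPartition.
    // The key/value types are not shown in the original, so K and V are generic placeholders.
    private <K, V> List<List<ConsumerRecord<K, V>>> orderPartitions(ConsumerRecords<K, V> events) {

        Set<TopicPartition> pollPartitions = events.partitions();
        List<List<ConsumerRecord<K, V>>> orderEvents = new ArrayList<>();
        for (TopicPartition tp : pollPartitions) {
            // partitions() only returns partitions that have records in this batch,
            // so every list added here is non-empty
            orderEvents.add(events.records(tp));
        }
        // order the list by the first event, each list is ordered internally also
        orderEvents.sort(new PartitionEventListComparator<K, V>());
        return orderEvents;
    }

    /**
     * Used to sort the topic partition event lists so we get them in order.
     * Compares two lists by the timestamp of their first record.
     */
    private static class PartitionEventListComparator<K, V>
            implements Comparator<List<ConsumerRecord<K, V>>> {

        @Override
        public int compare(List<ConsumerRecord<K, V>> list1, List<ConsumerRecord<K, V>> list2) {
            return Long.compare(list1.get(0).timestamp(), list2.get(0).timestamp());
        }
    }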
    

    Then just round-robin over the partitions to get the events in order. In practice I've found this to work.

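    // Fragment from inside a polling loop: consumer, log, sent and max are defined
    // elsewhere, and K, V are the same placeholder types as in orderPartitions.
    // Needs java.time.Duration in addition to the imports above.
    ConsumerRecords<K, V> events = consumer.poll(Duration.ofMillis(500));
    int totalEvents = events.count();
    log.debug("Polling topic - received " + totalEvents + " events");
    if (totalEvents == 0) {
        break;  // no more events
    }

    List<List<ConsumerRecord<K, V>>> orderEvents = orderPartitions(events);

    int cnt = 0;
    // Each list is removed when it is no longer needed
    while (!orderEvents.isEmpty() && sent < max) {
        for (int j = 0; j < orderEvents.size(); j++) {
            List<ConsumerRecord<K, V>> subList = orderEvents.get(j);
            // The list contains no more events, or none in our time range, remove it
            if (subList.size() < cnt + 1) {
                orderEvents.remove(j);
                log.debug("exhausted partition - removed");
                j--;
                continue;
            }
            ConsumerRecord<K, V> event = subList.get(cnt);
            // ... handle the event here (the rest of the loop body is cut off in the source)
        }
        // Assumed placement: move to the next position only after visiting every partition
        cnt++;
    }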
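
    Round-robin only yields timestamp order when the partitions advance roughly in lockstep. If a stricter order within one polled batch is needed, one option is to merge the per-partition lists by timestamp with a priority queue. The following is a minimal sketch, not part of the original answer: Event is a hypothetical stand-in for ConsumerRecord so the example runs without the Kafka client, and it assumes each partition's list is already non-decreasing in timestamp.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class TimestampMerge {

    /** Hypothetical stand-in for ConsumerRecord: only the fields the merge needs. */
    static final class Event {
        final int partition;
        final long timestamp;
        final String value;

        Event(int partition, long timestamp, String value) {
            this.partition = partition;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** Read position inside one partition's list of events. */
    private static final class Cursor {
        final List<Event> events;
        int index;

        Cursor(List<Event> events) {
            this.events = events;
        }

        Event current() {
            return events.get(index);
        }
    }

    /**
     * K-way merge: repeatedly takes the partition whose next event has the
     * smallest timestamp. Assumes each input list is sorted by timestamp.
     */
    static List<Event> merge(List<List<Event>> partitions) {
        PriorityQueue<Cursor> heap = new PriorityQueue<>(
                Comparator.comparingLong((Cursor c) -> c.current().timestamp));
        for (List<Event> partition : partitions) {
            if (!partition.isEmpty()) {
                heap.add(new Cursor(partition));
            }
        }

        List<Event> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor cursor = heap.poll();       // partition with the earliest next event
            merged.add(cursor.current());
            cursor.index++;
            if (cursor.index < cursor.events.size()) {
                heap.add(cursor);              // re-queue with its new head event
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Event> p0 = Arrays.asList(new Event(0, 10, "a"), new Event(0, 40, "d"));
        List<Event> p1 = Arrays.asList(
                new Event(1, 20, "b"), new Event(1, 30, "c"), new Event(1, 50, "e"));
        for (Event e : merge(Arrays.asList(p0, p1))) {
            System.out.println(e.timestamp + " " + e.value);
        }
    }
}
```

    With the sample data in main, plain round-robin would emit the event at timestamp 40 before the one at 30, while the merge emits them in timestamp order. This only orders events within a single poll; events that arrive in a later poll can still carry earlier timestamps.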
    
