Enrich a Kafka Stream with data from KTables

旧城冷巷雨未停 submitted on 2021-02-11 15:01:35

Question


I currently maintain a financial application. Among the many calculations it performs, one determines 1) what percentage of the total transaction amount a new incoming transaction accounts for, and 2) what percentage of the total transaction amount for a given customer that same transaction accounts for.

For the sake of simplicity, let's assume that the transaction data is cut off at 6 am each morning, which is when the program kicks off. In other words, we are working with mostly static data for a given day.

For example :

  • Transaction 1 : Customer 1 -> 100 dollars
  • Transaction 2 : Customer 1 -> 100 dollars
  • Transaction 3 : Customer 2 -> 100 dollars

What I would like to know is that Transaction 1 accounts for 33% of the total across all transactions, and for 50% of the total for Customer 1.
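The arithmetic behind those two figures can be sketched as follows. This is only an illustration: the class name `ShareExample`, the helper `percentOf`, and the choice of two decimal places with `HALF_UP` rounding are assumptions, not part of the application.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class ShareExample {
    // Returns part / total as a percentage, rounded to 2 decimal places.
    // A scale and rounding mode are passed explicitly because 100/300
    // has no exact decimal representation.
    static BigDecimal percentOf(BigDecimal part, BigDecimal total) {
        return part.multiply(BigDecimal.valueOf(100))
                   .divide(total, 2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        BigDecimal tx1 = new BigDecimal("100");
        BigDecimal grandTotal = new BigDecimal("300");     // transactions 1 + 2 + 3
        BigDecimal customer1Total = new BigDecimal("200"); // transactions 1 + 2

        System.out.println(percentOf(tx1, grandTotal));     // 33.33
        System.out.println(percentOf(tx1, customer1Total)); // 50.00
    }
}
```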

The following is a slightly simplified version of today's code, which runs as a single Java process with all data stored on that process's heap (so there is no inter-process communication here).

The DAO class : Maintains the application data

import java.util.Map;

public class ApplicationDataDao {
    private Map<String, Transaction> transactionsByTransactionId;
    private Map<String, Transaction> transactionsByCustomerId;
    private TransactionAggregate transactionAggregate;
    private Map<String, TransactionAggregate> transactionAggregateByCustomerId;

    // constructor, getters and setters to populate these maps and to retrieve data from these maps
}

The transaction class : Represents a transaction

public class Transaction {
     private String transactionId;
     private String customerId;
     private BigDecimal transactionAmount;

     // Share of the grand total, and share of this customer's total
     private BigDecimal transactionPercentageAllocation;
     private BigDecimal customerPercentageAllocation;
}

The aggregate class : Holds the aggregated totals at transaction level and customer level.

public class TransactionAggregate {
    private BigDecimal totalTransactionAmount = BigDecimal.ZERO;

    private String transactionId;
    private String customerId;

    // BigDecimal is immutable, so the result of add() must be assigned back.
    public void aggregate(BigDecimal currentTransactionAmount) {
        totalTransactionAmount = totalTransactionAmount.add(currentTransactionAmount);
    }
}

Reading the data from the cut off file for today

    ApplicationDataDao dao = getSingletonApplicationDataDao();

    for (String line : reader.read()) {
        // Each line is: transactionId,customerId,amount
        String[] tokens = line.split(",");
        Transaction transaction = new Transaction();
        transaction.setTransactionId(tokens[0]);
        transaction.setCustomerId(tokens[1]);
        transaction.setTransactionAmount(new BigDecimal(tokens[2]));
        dao.putTransactionByTransactionId(transaction.getTransactionId(), transaction);
        dao.putTransactionByCustomerId(transaction.getCustomerId(), transaction);
        // Keep track of the total transaction amount and the total transaction amount by customer id.
        dao.getTransactionAggregate().aggregate(transaction.getTransactionAmount());
        dao.getTransactionAggregateByCustomerId(transaction.getCustomerId())
           .aggregate(transaction.getTransactionAmount());
    }

Calculating the percentage allocation for a transaction with respect to other transactions

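      for (Transaction transaction : dao.getTransactionsByTransactionId().values()) {
          // Scale and rounding mode are required: divide() throws on non-terminating results such as 100/300.
          transaction.setTransactionPercentageAllocation(
              transaction.getTransactionAmount().divide(
                  dao.getTransactionAggregate().getTotalTransactionAmount(), 4, RoundingMode.HALF_UP));
      }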

Calculating the percentage allocation for a customer's transaction with respect to other transactions for the same customer

     for (TransactionAggregate transactionAggregate : dao.getTransactionAggregateByCustomerId().values()) {
       Transaction transaction = dao.getTransactionByCustomerId(transactionAggregate.getCustomerId());
       transaction.setCustomerPercentageAllocation(
           transaction.getTransactionAmount().divide(
               transactionAggregate.getTotalTransactionAmount(), 4, RoundingMode.HALF_UP));
     }

As of today, this application runs on a dedicated UNIX box used by other teams. In other words, it is a standalone, monolithic application. I want to refactor it into a Kafka Streams based application. This would mean that the for loops above are broken down into a producer and a consumer, instead of all the work being done in a single process, as follows :

  1. Standalone program that reads one line from the file, converts it into a transaction object and writes it to a Kafka Topic.
  2. On the other side, a streaming consumer reads the Transaction object and creates two KTable instances: one holding the total transaction amount (null key) and one holding the aggregation of transaction amounts by customer id (customer id as key).
  3. Write the KTable instances to two separate Kafka topics (for example, transaction-aggregate-topic and customer-aggregate-topic).
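To make step 2 concrete, here is a plain-Java model (no Kafka dependency) of what the two aggregates would hold as records are consumed. The class `RunningAggregates`, the method `onTransaction`, and the constant key `"ALL"` are hypothetical names. The constant key stands in for the "null key": as far as I know, Kafka Streams aggregations skip records whose key is null, so the grand total would probably need some fixed non-null key instead.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

final class RunningAggregates {
    // Hypothetical constant key under which the grand total is stored.
    static final String TOTAL_KEY = "ALL";

    // Stand-ins for the two KTables: key -> latest aggregated amount.
    final Map<String, BigDecimal> grandTotal = new HashMap<>();
    final Map<String, BigDecimal> totalByCustomer = new HashMap<>();

    // Called once per consumed transaction; updates both running totals.
    void onTransaction(String customerId, BigDecimal amount) {
        grandTotal.merge(TOTAL_KEY, amount, BigDecimal::add);
        totalByCustomer.merge(customerId, amount, BigDecimal::add);
    }

    public static void main(String[] args) {
        RunningAggregates agg = new RunningAggregates();
        agg.onTransaction("customer-1", new BigDecimal("100"));
        agg.onTransaction("customer-1", new BigDecimal("100"));
        agg.onTransaction("customer-2", new BigDecimal("100"));

        System.out.println(agg.grandTotal); // {ALL=300}
        System.out.println(agg.totalByCustomer.get("customer-1")); // 200
        System.out.println(agg.totalByCustomer.get("customer-2")); // 100
    }
}
```

Each map always holds only the latest total per key, which mirrors how a table of aggregates differs from the underlying stream of individual transactions.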

I now have a stream of transaction objects. I also have two topics that essentially hold the aggregates. My question is : How do I enrich the transaction stream with the values from the aggregated KTables, such that when I look at the stream at the end of processing, each transaction object knows its percentage with respect to all other transactions and its percentage with respect to the other transactions by the same customer? (For starters, the transaction stream has no key. How does one match a message in the transaction stream with two KTables?)
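One way to picture the matching is as two key lookups per transaction: the customer id carried inside the transaction selects the per-customer total, and a fixed key selects the grand total. The plain-Java sketch below models that with maps standing in for the two KTables. `Enricher`, `EnrichedTransaction`, and the `"ALL"` key are hypothetical names. In Kafka Streams terms this would roughly correspond to re-keying the stream (for example with `KStream#selectKey`) before a stream-table join, but the sketch does not use the Kafka API and is not a tested Kafka Streams solution.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Map;

final class Enricher {
    // Hypothetical fixed key under which the grand total is stored.
    static final String TOTAL_KEY = "ALL";

    // Hypothetical result type: the transaction id plus its two shares (as fractions).
    static final class EnrichedTransaction {
        final String transactionId;
        final BigDecimal shareOfAll;
        final BigDecimal shareOfCustomer;

        EnrichedTransaction(String transactionId, BigDecimal shareOfAll, BigDecimal shareOfCustomer) {
            this.transactionId = transactionId;
            this.shareOfAll = shareOfAll;
            this.shareOfCustomer = shareOfCustomer;
        }
    }

    // Looks up both totals by key and divides the amount by each of them.
    static EnrichedTransaction enrich(String transactionId, String customerId, BigDecimal amount,
                                      Map<String, BigDecimal> grandTotal,
                                      Map<String, BigDecimal> totalByCustomer) {
        BigDecimal all = grandTotal.get(TOTAL_KEY);
        BigDecimal customer = totalByCustomer.get(customerId);
        return new EnrichedTransaction(
            transactionId,
            amount.divide(all, 4, RoundingMode.HALF_UP),
            amount.divide(customer, 4, RoundingMode.HALF_UP));
    }

    public static void main(String[] args) {
        Map<String, BigDecimal> grandTotal = Map.of(TOTAL_KEY, new BigDecimal("300"));
        Map<String, BigDecimal> totalByCustomer = Map.of(
            "customer-1", new BigDecimal("200"),
            "customer-2", new BigDecimal("100"));

        EnrichedTransaction tx1 =
            enrich("tx-1", "customer-1", new BigDecimal("100"), grandTotal, totalByCustomer);
        System.out.println(tx1.shareOfAll);      // 0.3333
        System.out.println(tx1.shareOfCustomer); // 0.5000
    }
}
```

Because each lookup reads whatever the totals are at that moment, the shares are only final once the aggregates are complete, which fits the 6 am cut-off assumption stated earlier.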

Source: https://stackoverflow.com/questions/64738353/enrich-a-kafka-stream-with-data-from-ktables
