I am working with customer transaction data from an old Kaggle competition. The dataset contains 20M records and 4M users (about 5 records per user). Example data for one user i