问题 How can I n-hot encode a column of lists with duplicates? Something like MultiLabelBinarizer from sklearn which counts the number of instances of duplicate classes instead of binarizing. Example input: x = pd.Series([['a', 'b', 'a'], ['b', 'c'], ['c','c']]) Expected output: a b c 0 2 1 0 1 0 1 1 2 0 0 2 回答1: I have written a new class MultiLabelCounter based on the MultiLabelBinarizer code. import itertools import numpy as np class MultiLabelCounter(): def __init__(self, classes=None): self