multi-column factorize in pandas
Question

The pandas factorize function assigns each unique value in a series a sequential, 0-based index, and computes which index each series entry belongs to. I'd like to accomplish the equivalent of pandas.factorize on multiple columns:

    import pandas as pd
    df = pd.DataFrame({'x': [1, 1, 2, 2, 1, 1], 'y': [1, 2, 2, 2, 2, 1]})
    pd.factorize(df)[0]  # would like [0, 1, 2, 2, 1, 0]

That is, I want to determine each unique tuple of values in several columns of a data frame, assign a sequential index
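
One possible way to get such per-row labels, offered as a minimal sketch rather than a definitive answer: group on the columns of interest with `sort=False` and number the groups with `ngroup()`. The helper name `factorize_columns` is hypothetical and not part of pandas; the sketch assumes a pandas version that provides `GroupBy.ngroup` and that the key columns contain no missing values.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 1, 2, 2, 1, 1], 'y': [1, 2, 2, 2, 2, 1]})


def factorize_columns(frame, columns):
    """Hypothetical helper: give each distinct combination of values
    in `columns` a 0-based integer label, one label per row."""
    # sort=False keeps groups in order of first appearance, which mirrors
    # how pd.factorize numbers the values of a single series.
    return frame.groupby(list(columns), sort=False).ngroup().to_numpy()


codes = factorize_columns(df, ['x', 'y'])
print(codes.tolist())
```

With the sample frame from the question, the printed labels can be compared against the desired `[0, 1, 2, 2, 1, 0]`. Grouping avoids building a Python tuple per row, which may matter on large frames.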