Say I have a categorical feature, color, which takes the values
[\'red\', \'blue\', \'green\', \'orange\'],
and I want to use it to predict something in a ra
Most implementations of random forest (and many other machine learning algorithms) that accept categorical inputs are either just automating the encoding of categorical features for you or using a method that becomes computationally intractable for large numbers of categories.
A notable exception is H2O. H2O has a very efficient method for handling categorical data directly which often gives it an edge over tree based methods that require one-hot-encoding.
This article by Will McGinnis has a very good discussion of one-hot-encoding and alternatives.
This article by Nick Dingwall and Chris Potts has a very good discussion about categorical variables and tree based learners.