How can I apply reinforcement learning to continuous action spaces?
I'm trying to get an agent to learn the mouse movements necessary to best perform some task in a reinforcement learning setting (i.e., the reward signal is the only feedback for learning). I'm hoping to use the Q-learning technique, but while I've found a way to extend this method to continuous state spaces, I can't figure out how to accommodate a problem with a continuous action space. I could just force all mouse movement to be of a certain magnitude and in only a certain number of different directions, but any reasonable way of making the actions discrete would yield a huge action