reinforcement-learning

Are Q-learning and SARSA with greedy selection equivalent?

痴心易碎 posted on 2020-05-25 07:26:30
Question: The difference between Q-learning and SARSA is that Q-learning compares the current state against the best possible next state, whereas SARSA compares the current state against the actual next state. If a greedy selection policy is used, that is, the action with the highest action value is selected 100% of the time, are SARSA and Q-learning then identical? Answer 1: Well, not actually. A key difference between SARSA and Q-learning is that SARSA is an on-policy algorithm (it follows the policy that is
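For reference, a minimal sketch of the two tabular updates (a Q-table indexed by state and action, with illustrative step size and discount; none of these names come from the question). With purely greedy selection the bootstrapped action in SARSA is exactly the argmax, so the two targets coincide step by step, but the on-policy/off-policy distinction the answer points to matters as soon as any exploration is mixed in.

    import numpy as np

    def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
        # Off-policy: bootstrap from the best next action, regardless of what is taken.
        target = r + gamma * np.max(Q[s_next])
        Q[s, a] += alpha * (target - Q[s, a])

    def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
        # On-policy: bootstrap from the action the behaviour policy actually selected.
        target = r + gamma * Q[s_next, a_next]
        Q[s, a] += alpha * (target - Q[s, a])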

TypeError: len is not well defined for symbolic Tensors. (activation_3/Identity:0) Please call `x.shape` rather than `len(x)` for shape information

和自甴很熟 posted on 2020-05-14 16:00:29
Question: I am trying to implement a DQL model on one of the OpenAI gym games, but it's giving me the following error: TypeError: len is not well defined for symbolic Tensors. (activation_3/Identity:0) Please call x.shape rather than len(x) for shape information. Creating a gym environment:

    ENV_NAME = 'CartPole-v0'
    env = gym.make(ENV_NAME)
    np.random.seed(123)
    env.seed(123)
    nb_actions = env.action_space.n

My model looks like this:

    model = Sequential()
    model.add(Flatten(input_shape=(1,) + env.observation_space
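In my experience this particular error typically shows up when keras-rl (which calls len() on the model's output tensor) is combined with TensorFlow 2's tf.keras, whose symbolic tensors no longer support len(); pinning TensorFlow 1.x or moving to the TF2-compatible keras-rl2 fork are the usual workarounds, though treat both as assumptions rather than a confirmed diagnosis. A sketch of the model setup under those assumptions, with illustrative layer sizes:

    import gym
    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Flatten

    ENV_NAME = 'CartPole-v0'
    env = gym.make(ENV_NAME)
    np.random.seed(123)
    env.seed(123)
    nb_actions = env.action_space.n

    # The leading 1 in the input shape matches keras-rl's default window_length=1.
    model = Sequential()
    model.add(Flatten(input_shape=(1,) + env.observation_space.shape))
    model.add(Dense(16, activation='relu'))
    model.add(Dense(nb_actions, activation='linear'))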

Reinforcement Learning doesn't work for this VERY EASY game, why? Q Learning

点点圈 posted on 2020-04-18 05:46:45
Question: I programmed a very easy game which works the following way: given a 4x4 field of squares, a player can move up, right, down or left. Moving onto a square the agent has never visited before gives a reward of 1. Stepping on a "dead field" is rewarded with -5, and the game is then reset. Moving onto a field that was already visited is rewarded with -1. Moving onto the "win field" (there is exactly one) gives a reward of 5, and the game is reset as well. Now I want an AI to learn to play that
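A minimal tabular Q-learning sketch for a game like this (the hyperparameters, action encoding and 16-state layout are illustrative, not taken from the question). One thing worth noting: because the reward depends on which squares have already been visited, the true state is the position plus the visited set; a Q-table indexed by position alone, as below, cannot represent that, which is a common reason such an agent appears not to learn.

    import numpy as np
    import random

    N_STATES = 16          # one entry per square; visited-flags ignored for brevity
    N_ACTIONS = 4          # up, right, down, left
    Q = np.zeros((N_STATES, N_ACTIONS))
    alpha, gamma, epsilon = 0.1, 0.95, 0.1

    def choose_action(state):
        # epsilon-greedy exploration; purely greedy play tends to loop on -1 squares
        if random.random() < epsilon:
            return random.randrange(N_ACTIONS)
        return int(np.argmax(Q[state]))

    def update(state, action, reward, next_state, done):
        target = reward if done else reward + gamma * np.max(Q[next_state])
        Q[state, action] += alpha * (target - Q[state, action])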

How to accumulate and apply gradients for Async n-step DQNetwork update in TensorFlow?

梦想与她 posted on 2020-02-27 22:22:21
Question: I am trying to implement Asynchronous Methods for Deep Reinforcement Learning, and one of the steps requires accumulating the gradient over several steps and then applying it. What is the best way to achieve this in TensorFlow? I got as far as accumulating the gradient, but I don't think this is the fastest way to achieve it (lots of transfers from TensorFlow to Python and back). Any suggestions are welcome. This is my code for a toy NN. It does not model or compute anything; it just exercises the
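One way to keep the accumulation entirely on the TensorFlow side is to hold one accumulator variable per trainable variable, add each step's gradients into it, and apply the summed gradients once. The sketch below uses TensorFlow 2.x eager mode, which postdates the question's setting, and the toy model stands in for whatever network is actually used; the same accumulate-then-apply idea carries over to graph mode with assign_add ops.

    import tensorflow as tf

    # Toy network standing in for the real policy/value net.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(8, activation='relu', input_shape=(4,)),
        tf.keras.layers.Dense(2),
    ])
    optimizer = tf.keras.optimizers.Adam(1e-3)

    # One accumulator per trainable variable, kept on the TF side.
    accumulators = [tf.Variable(tf.zeros_like(v), trainable=False)
                    for v in model.trainable_variables]

    def accumulate_step(x, y):
        with tf.GradientTape() as tape:
            loss = tf.reduce_mean(tf.square(model(x) - y))
        grads = tape.gradient(loss, model.trainable_variables)
        for acc, g in zip(accumulators, grads):
            acc.assign_add(g)

    def apply_accumulated():
        optimizer.apply_gradients(zip(accumulators, model.trainable_variables))
        for acc in accumulators:
            acc.assign(tf.zeros_like(acc))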

Is it possible to modify OpenAI environments?

你离开我真会死。 posted on 2020-02-22 08:42:28
Question: There are some things that I would like to modify in the OpenAI environments. If we use the CartPole example, then we can edit things that are in the class init function, but with environments that use Box2D it doesn't seem to be as straightforward. For example, consider the BipedalWalker environment. In this case how would I edit things like the SPEED_HIP or SPEED_KNEE variables? Thanks Answer 1: Yes, you can modify or create new environments in gym. The simplest (but not recommended) way is to
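One pattern people commonly use for constants like these, assuming they are defined at module level in gym's Box2D walker code, is to patch the module before creating the environment (or copy the file into your own environment class and register it). The module path and environment id below depend on the installed gym version, so treat them as assumptions.

    import gym
    import gym.envs.box2d.bipedal_walker as bipedal_walker

    # Patch module-level constants before the env is built; these values are
    # illustrative, not the library defaults.
    bipedal_walker.SPEED_HIP = 6.0
    bipedal_walker.SPEED_KNEE = 8.0

    env = gym.make('BipedalWalker-v3')
    obs = env.reset()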

Pygame and OpenAI implementation

佐手、 posted on 2020-01-24 14:12:47
Question: My classmate and I have decided to try to implement an AI agent in our own game. My friend has done most of the code, based on previous projects, and I was wondering how PyGame and OpenAI would work together. I have tried to do some research but can't really find any useful information about this specific topic. Some say that it is hard to implement, but others say it works. Either way, I'd like your opinion on this project and how you'd approach it if it were you. The game is very
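One common way to connect a Pygame game to OpenAI's tooling is to wrap the game loop in a custom gym.Env so that standard RL libraries can drive it. A minimal skeleton under that assumption; the action and observation spaces, and everything inside reset/step, are placeholders for whatever the actual game exposes.

    import gym
    import numpy as np
    from gym import spaces

    class PygameGameEnv(gym.Env):
        # Skeleton wrapper; the real game logic would live in the Pygame code.

        def __init__(self):
            super().__init__()
            self.action_space = spaces.Discrete(4)          # e.g. four moves
            self.observation_space = spaces.Box(low=0.0, high=1.0,
                                                shape=(8,), dtype=np.float32)

        def reset(self):
            # Restart the Pygame game and return the initial observation.
            return np.zeros(8, dtype=np.float32)

        def step(self, action):
            # Advance the game by one tick, then report (obs, reward, done, info).
            obs = np.zeros(8, dtype=np.float32)
            reward, done = 0.0, False
            return obs, reward, done, {}

        def render(self, mode='human'):
            # Pygame already draws the screen, so this can stay a no-op.
            pass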
