reinforcement-learning

Can tf.agent policy return probability vector for all actions?

Submitted by 爷,独闯天下 on 2020-12-31 07:37:32
Question: I am training a Reinforcement Learning agent by following the TF-Agents DQN Tutorial. In my application I have one action with 9 possible discrete values (labeled 0 to 8). Below is the output of env.action_spec():

    BoundedTensorSpec(shape=(), dtype=tf.int64, name='action', minimum=array(0, dtype=int64), maximum=array(8, dtype=int64))

I would like to get the probability vector over all actions computed by the trained policy, and do further processing in another application.
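
One way to recover a per-action vector is to query the Q-network behind the DQN policy and turn its Q-values into probabilities with a softmax. This is a minimal sketch, assuming the agent, q_net and train_env objects created in the TF-Agents DQN tutorial; the softmax conversion is an illustrative choice, not something the tutorial itself does:

    # Sketch: convert the Q-network's Q-values into an action-probability vector.
    # Assumes `q_net` and `train_env` exist as in the TF-Agents DQN tutorial.
    import tensorflow as tf

    time_step = train_env.reset()
    # A QNetwork call returns (q_values, network_state).
    q_values, _ = q_net(time_step.observation, time_step.step_type)
    action_probs = tf.nn.softmax(q_values, axis=-1)  # shape: (batch_size, 9)
    print(action_probs.numpy())

Note that the default trained DQN policy is greedy, so agent.policy puts all probability mass on the argmax action; reading the raw Q-values is a common way to recover a soft preference over all 9 actions.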

Slow training on CPU and GPU in a small network (tensorflow)

Submitted by 怎甘沉沦 on 2020-12-13 03:11:47
Question: Here is the original script I am trying to run on both CPU and GPU. I expected training to be much faster on the GPU, but it takes almost the same time. I made the following modification to main() (the first 4 lines) because the original script does not activate / use the GPU. Suggestions?

    def main():
        physical_devices = tf.config.experimental.list_physical_devices('GPU')
        if len(physical_devices) > 0:
            tf.config.experimental.set_memory_growth(physical_devices[0], True)
            print('GPU activated')
        ...
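
A quick way to check whether the GPU is actually doing the work is to turn on device-placement logging. This is a small stand-alone sketch, independent of the script above; the matrices and their sizes are only illustrative:

    # Sketch: verify that TensorFlow sees the GPU and places ops on it.
    import tensorflow as tf

    print(tf.config.list_physical_devices('GPU'))  # should list at least one GPU
    tf.debugging.set_log_device_placement(True)    # log the device chosen for each op

    a = tf.random.normal((1000, 1000))
    b = tf.random.normal((1000, 1000))
    c = tf.matmul(a, b)  # the log should mention .../device:GPU:0 if the GPU is used

For a very small network, the per-step overhead of launching GPU kernels and copying data between host and device can dominate the actual compute, so similar CPU and GPU timings are not unusual.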

OpenAI Gym: Understanding `action_space` notation (spaces.Box)

Submitted by 微笑、不失礼 on 2020-12-01 08:18:23
Question: I want to set up an RL agent on the OpenAI CarRacing-v0 environment, but before that I want to understand the action space. In the code on GitHub, line 119 says:

    self.action_space = spaces.Box(np.array([-1, 0, 0]), np.array([+1, +1, +1]))  # steer, gas, brake

How do I read this line? Although my problem is concrete with respect to CarRacing-v0, I would like to understand the spaces.Box() notation in general.

Answer 1: Box means that you are dealing with real-valued quantities. The first array, np.array([-1, 0, 0]), holds the lowest accepted values and the second, np.array([+1, +1, +1]), the highest, for steer, gas and brake respectively.
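
To see the general spaces.Box() behaviour without launching the simulator, here is a small stand-alone sketch using a Box with the same bounds (not the CarRacing-v0 instance itself):

    # Sketch: inspecting a gym.spaces.Box built with the same low/high bounds.
    import numpy as np
    from gym import spaces

    box = spaces.Box(np.array([-1, 0, 0], dtype=np.float32),
                     np.array([+1, +1, +1], dtype=np.float32))
    print(box.low)       # per-dimension lower bounds: [-1. 0. 0.]
    print(box.high)      # per-dimension upper bounds: [1. 1. 1.]
    print(box.shape)     # (3,)
    print(box.sample())  # a random valid action within the bounds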

'UnityEnvironment' object has no attribute 'behavior_spec'

Submitted by ╄→尐↘猪︶ㄣ on 2020-06-28 03:42:20
Question: I followed this link to the docs to create an environment of my own. But when I run this:

    from mlagents_envs.environment import UnityEnvironment
    env = UnityEnvironment(file_name="v1-ball-cube-game.x86_64")
    env.reset()
    behavior_names = env.behavior_spec.keys()
    print(behavior_names)

the game window pops up and then the terminal shows this error:

    Traceback (most recent call last):
      File "index.py", line 6, in <module>
        behavior_names = env.behavior_spec.keys()
    AttributeError: 'UnityEnvironment' object has no attribute 'behavior_spec'
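
In recent mlagents_envs releases the mapping is exposed as behavior_specs (plural) and is only populated after the first reset(); the exact attribute name depends on the installed ml-agents version. A minimal sketch under that assumption:

    # Sketch: query behavior specs via the plural attribute (newer mlagents_envs).
    from mlagents_envs.environment import UnityEnvironment

    env = UnityEnvironment(file_name="v1-ball-cube-game.x86_64")
    env.reset()                                    # populates the behavior specs
    behavior_names = list(env.behavior_specs.keys())
    print(behavior_names)
    env.close()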

Problem Using Keras Sequential Model for “reinforcelearn” Package in R

Submitted by 大城市里の小女人 on 2020-05-30 12:19:38
Question: I am trying to use a keras (version 2.2.50) neural network / sequential model to create a simple agent in a reinforcement learning setting with the reinforcelearn package (version 0.2.1), following this vignette: https://cran.r-project.org/web/packages/reinforcelearn/vignettes/agents.html. This is the code I use:

    library('reinforcelearn')
    library('keras')
    model = keras_model_sequential() %>%
      layer_dense(units = 10, input_shape = 4, activation = "linear") %>%
      compile(optimizer = optimizer...