I am trying to install pytorch on windows and there is one which is available for it but shows an error.
conda install -c peterjc123 pytorch=0.1.12
>
I was getting some kind of Rollback error on Git bash and Windows Cmd prompt so had to run Anaconda prompt as admin for:
conda install pytorch-cpu -c pytorch
and then I got another when I tried the following command on Anaconda prompt:
pip3 install torchvision
so I switched back to Windows prompt to enter it and it worked.
To test the installation, I ran this from Git Bash:
$ python reinforcement_q_learning.py
with source code that looks like (the snippet near the top anyways):
"""
import gym
import math
import random
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from collections import namedtuple
from itertools import count
from PIL import Image
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torchvision.transforms as T
env = gym.make('CartPole-v0').unwrapped
# set up matplotlib
is_ipython = 'inline' in matplotlib.get_backend()
if is_ipython:
from IPython import display
plt.ion()
# if gpu is to be used
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
######################################################################
# Replay Memory
# -------------
#
# We'll be using experience replay memory for training our DQN. It stores
# the transitions that the agent observes, allowing us to reuse this data
# later. By sampling from it randomly, the transitions that build up a
# batch are decorrelated. It has been shown that this greatly stabilizes
# and improves the DQN training procedure.
#
# For this, we're going to need two classses:
#
# - ``Transition`` - a named tuple representing a single transition in
# our environment. It maps essentially maps (state, action) pairs
# to their (next_state, reward) result, with the state being the
# screen difference image as described later on.
# - ``ReplayMemory`` - a cyclic buffer of bounded size that holds the
# transitions observed recently. It also implements a ``.sample()``
# method for selecting a random batch of transitions for training.
#
Transition = namedtuple('Transition',
('state', 'action', 'next_state', 'reward'))
class ReplayMemory(object):
def __init__(self, capacity):
self.capacity = capacity
self.memory = []
self.position = 0
def push(self, *args):
"""Saves a transition."""
if len(self.memory) < self.capacity:
self.memory.append(None)
self.memory[self.position] = Transition(*args)
self.position = (self.position + 1) % self.capacity
def sample(self, batch_size):
return random.sample(self.memory, batch_size)
def __len__(self):
return len(self.memory)
############continues to line 507...