I trained the whole network and obtained the parameters. When I use one .pth file to run inference on a test dataset (testing only) and step through the code in a debugger, each time execution reaches "tor