Why does RNN always output 1

你离开我真会死。 submitted on 2019-12-04 11:26:54

Most RNNs don't like data that don't have a constant mean. One strategy for dealing with this is differencing the data. To see how this works, let's work with the base R time series co2. It has a smooth seasonality and trend, so we should be able to forecast it.

For our model, the input matrix is the "seasonality" and "trend" of the co2 time series, created using the STL decomposition. So let's make our training and testing data as you did before and train the model (note I reduced numepochs for runtime). I will use all the data up to the last year and a half for training, and then use the last year and a half for testing:

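library(rnn)  # provides trainr() and predictr()

# Create the STL decomposition; keep the seasonal and trend columns
sdcomp <- stl(co2, s.window = 7)$time.series[, 1:2]

# Training window: everything up to June 1996
Y <- window(co2, end = c(1996, 6))
M <- window(sdcomp, end = c(1996, 6))
# Taken from OP's code: reshape to (samples, time, variables)
mt <- array(c(M), dim = c(NROW(M), 1, NCOL(M)))
yt <- array(c(Y), dim = c(NROW(M), 1, NCOL(Y)))
model <- trainr(X = mt, Y = yt, learningrate = 0.5, hidden_dim = 10, numepochs = 100)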
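To see what the array() reshaping in the OP's code does, here is a small base R check that needs no neural net package. It assumes the three dimensions are read as (samples, time, variables), which is my understanding of the trainr documentation; under that reading every month becomes its own one-step sequence.

```r
# Base R only: co2 ships with the datasets package, stl() with stats.
sdcomp <- stl(co2, s.window = 7)$time.series[, 1:2]  # seasonal and trend columns
M <- window(sdcomp, end = c(1996, 6))

# Reshape to (samples, time, variables); each month is a one-step sequence.
mt <- array(c(M), dim = c(NROW(M), 1, NCOL(M)))

dim(mt)

# Slice [, 1, k] should reproduce column k of M.
all.equal(mt[, 1, 1], as.numeric(M[, 1]))  # seasonal
all.equal(mt[, 1, 2], as.numeric(M[, 2]))  # trend
```

Because the time dimension has length 1, the network never sees more than one month at a time, so all the temporal information has to come from the seasonal and trend inputs.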

Now we can create our predictions on the last year and a half of testing data:

M2 <- window(sdcomp, start = c(1996,7))
mt2 <- array(c(M2),dim=c(NROW(M2),1,NCOL(M2)))
predictr(model,mt2)

output:
      [,1]
 [1,]    1
 [2,]    1
 [3,]    1
 [4,]    1
 [5,]    1
 [6,]    1
 [7,]    1
 [8,]    1
 [9,]    1
[10,]    1
[11,]    1
[12,]    1
[13,]    1
[14,]    1
[15,]    1
[16,]    1
[17,]    1
[18,]    1

Eww, it is all ones again, just like in your example. Now let's try this again, but this time we will difference the data. Since we are trying to make our predictions one and a half years out, we will use 18 as our differencing lag, as those are the values we would know 18 months ahead of time.

dco2 <- diff(co2, 18)
sdcomp <- stl(dco2, s.window = "periodic")$time.series[,1:2]
plot(dco2)
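As a quick sanity check on what a lag-18 difference means, the sketch below rebuilds it by hand and compares how far the STL trend component moves before and after differencing. The names manual, trend_raw and trend_diff are only for illustration.

```r
dco2 <- diff(co2, lag = 18)

# Each differenced value is an observation minus the one 18 months earlier.
n <- length(co2)
manual <- as.numeric(co2)[19:n] - as.numeric(co2)[1:(n - 18)]
all.equal(as.numeric(dco2), manual)

# Compare the spread of the STL trend component before and after differencing.
trend_raw  <- stl(co2,  s.window = "periodic")$time.series[, "trend"]
trend_diff <- stl(dco2, s.window = "periodic")$time.series[, "trend"]
diff(range(trend_raw))   # spread of the trend in the raw series
diff(range(trend_diff))  # spread of the trend after differencing
```

The second spread should come out much smaller than the first, which is what the plot shows visually.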

Great, the trend is now gone, so our neural net should be able to find the pattern better. Let's try again with the new data.

Y <- window(dco2, end = c(1996, 6))
M <- window(sdcomp, end = c(1996, 6))

mt <- array(c(M),dim=c(NROW(M),1,NCOL(M)))
yt <- array(c(Y),dim=c(NROW(M),1,NCOL(Y)))
model <- trainr(X=mt,Y=yt,learningrate=0.5,hidden_dim=10,numepochs=100)

M2 <- window(sdcomp, start = c(1996,7))
mt2 <- array(c(M2),dim=c(NROW(M2),1,NCOL(M2)))
(preds <- predictr(model,mt2))

output:
              [,1]
 [1,] 9.999408e-01
 [2,] 9.478496e-01
 [3,] 6.101828e-08
 [4,] 2.615463e-08
 [5,] 3.144719e-08
 [6,] 1.668084e-06
 [7,] 9.972314e-01
 [8,] 9.999901e-01
 [9,] 9.999916e-01
[10,] 9.999916e-01
[11,] 9.999916e-01
[12,] 9.999915e-01
[13,] 9.999646e-01
[14,] 1.299846e-02
[15,] 3.114577e-08
[16,] 2.432247e-08
[17,] 2.586075e-08
[18,] 1.101596e-07

OK, now there is something there! Let's see how it compares to what we were trying to forecast, dco2:
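One way to draw that comparison is sketched here. The predictions are copied from the printed output above rather than recomputed, since a fresh training run would give different numbers. The two series are drawn in separate panels because the network output stays between 0 and 1 while dco2 does not.

```r
# Predictions copied from the printed output above (not re-computed here).
preds <- c(9.999408e-01, 9.478496e-01, 6.101828e-08, 2.615463e-08,
           3.144719e-08, 1.668084e-06, 9.972314e-01, 9.999901e-01,
           9.999916e-01, 9.999916e-01, 9.999916e-01, 9.999915e-01,
           9.999646e-01, 1.299846e-02, 3.114577e-08, 2.432247e-08,
           2.586075e-08, 1.101596e-07)

dco2 <- diff(co2, lag = 18)
actual <- window(dco2, start = c(1996, 7))

# Put the predictions on the same monthly time axis as the test window.
preds_ts <- ts(preds, start = c(1996, 7), frequency = 12)

# Stacked panels, since the two series sit on different scales.
op <- par(mfrow = c(2, 1), mar = c(3, 4, 2, 1))
plot(actual, ylab = "dco2", main = "Actual 18-month difference")
plot(preds_ts, ylab = "prediction", main = "Network output")
par(op)
```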

Not ideal, but it is finding the general "up down" pattern of the data. Now all you have to do is tinker with your learning rates and start optimizing all those lovely hyper-parameters that make working with neural nets such a joy. When it is working how you want, you can just take your final output and add back in the last 18 months of your training data.
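The "add back" step can be sketched as follows. To keep the example checkable without a trained model, d_hat is a stand-in that holds the true differences over the test window; with a real model it would hold the forecasts, assuming they are on the same scale as dco2. The names d_hat, base and co2_hat are only for illustration.

```r
dco2 <- diff(co2, lag = 18)

# Stand-in for the model output: the true differences over the test window.
# With a real model this would be its forecasts, on the scale of dco2.
d_hat <- as.numeric(window(dco2, start = c(1996, 7)))

# The last 18 months of training data (January 1995 to June 1996).
base <- as.numeric(window(co2, start = c(1995, 1), end = c(1996, 6)))

# Undo the lag-18 difference: forecast = predicted change + value 18 months earlier.
co2_hat <- ts(d_hat + base, start = c(1996, 7), frequency = 12)

# With the true differences plugged in, this recovers the held-out co2 values.
all.equal(as.numeric(co2_hat), as.numeric(window(co2, start = c(1996, 7))))
```

The values are converted with as.numeric() before adding because the two windows cover different dates, and arithmetic on ts objects aligns by time rather than by position.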

eafpres

From my review of the examples with the package (see ?trainr), the inputs into the training function have to be binary. The package includes the functions int2bin and bin2int.

I have not been able to get them to work correctly, but it appears conversion to binary is needed.
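A possible alternative to a full binary encoding, offered only as an untested assumption: every prediction printed above lies between 0 and 1, which suggests the output is squashed into that interval, so rescaling the target to [0, 1] before training and mapping the predictions back afterwards may be enough. The helpers rescale01 and unscale01 below are hypothetical and not part of any package.

```r
# Hypothetical helpers, not part of any package.
rescale01 <- function(x, lo = min(x), hi = max(x)) (x - lo) / (hi - lo)
unscale01 <- function(z, lo, hi) z * (hi - lo) + lo

dco2 <- diff(co2, lag = 18)
Y <- window(dco2, end = c(1996, 6))

# Take the bounds from the training window only, so the test data stay unseen.
lo <- min(Y)
hi <- max(Y)
Y01 <- rescale01(Y, lo, hi)
range(Y01)

# Round trip back to the original units.
all.equal(as.numeric(unscale01(Y01, lo, hi)), as.numeric(Y))
```

The same lo and hi would then be passed to unscale01 to convert predictions back to the units of dco2.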
