h2o.deeplearning autoencoder, calculating deep features manually

Posted by 你离开我真会死。 on 2021-01-28 03:51:00

Question


I am trying to understand how deep features are made in an autoencoder.

I created an autoencoder with h2o.deeplearning and then tried to calculate the deep features manually.

The autoencoder

fit = h2o.deeplearning(
  x = names(x_train),
  training_frame = x_train,
  activation = "Tanh",
  autoencoder = TRUE,
  hidden = c(25, 10),
  epochs = 100,
  export_weights_and_biases = TRUE
)

I used Tanh as the activation function and two hidden layers with no dropout, to keep things simple.

Calculating hidden layer 1 deep features manually

Then I extracted the weights and biases that go from the input layer to hidden layer 1

w12 = as.matrix(h2o.weights(fit, 1))
b12 = as.matrix(h2o.biases(fit, 1))

I prepared the training data by normalizing each column to the interval [-0.5, 0.5], because h2o does that automatically in autoencoders.

normalize = function(x) {(((x-min(x))/(max(x)-min(x))) - 0.5)}
d.norm = apply(d, 2, normalize)

Then I manually calculated the deep features of the first hidden layer

a12 = d.norm %*% t(w12)
b12.rep = do.call(rbind, rep(list(t(b12)), nrow(d.norm)))
z12 = a12 + b12.rep
f12 = tanh(z12)
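
The layer computation above can be checked in isolation, without an H2O cluster. The following is a minimal sketch in base R on toy data; the matrices `x`, `w`, `b` and the helper `dense_tanh` are hypothetical stand-ins, and the weight layout (one row per unit of the target layer, one column per input) is assumed from the `t(w12)` used above. It shows that `sweep` adds the bias per column and gives the same result as building the row-replicated bias matrix.

```r
# Toy stand-ins for the real objects (hypothetical sizes and values).
set.seed(1)
x <- matrix(runif(5 * 3, -0.5, 0.5), nrow = 5)  # 5 rows, 3 input columns
w <- matrix(rnorm(4 * 3), nrow = 4)             # 4 hidden units x 3 inputs
b <- matrix(rnorm(4), ncol = 1)                 # one bias per hidden unit

# Forward pass of one dense Tanh layer: tanh(x %*% t(w) + b).
dense_tanh <- function(x, w, b) {
  z <- sweep(x %*% t(w), 2, as.numeric(b), "+")  # add bias j to column j
  tanh(z)
}

f_sweep <- dense_tanh(x, w, b)

# Same computation with the explicit row-replicated bias matrix.
b_rep <- do.call(rbind, rep(list(t(b)), nrow(x)))
f_rep <- tanh(x %*% t(w) + b_rep)

# Compare the two results.
print(isTRUE(all.equal(f_sweep, f_rep)))
```

If the two results agree here, the matrix algebra is not the source of a mismatch, which leaves the input preprocessing as the remaining suspect.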

When I compared those values with the hidden layer 1 deep features, they did not match

hl1.output = as.matrix(h2o.deepfeatures(fit, x_train, layer = 1))
all.equal(
as.numeric(f12[,1]),
hl1.output[, 1],
check.attributes = FALSE,
use.names = FALSE,
tolerance = 1e-04
)
[1] "Mean relative difference: 0.4854887"

Calculating hidden layer 2 deep features manually

Then I tried the same approach to manually calculate the deep features of hidden layer 2 from the deep features of hidden layer 1

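# w23 and b23 were not defined in the post; assumed by analogy with w12 and b12
w23 = as.matrix(h2o.weights(fit, 2))
b23 = as.matrix(h2o.biases(fit, 2))
a23 = hl1.output %*% t(w23)
b23.rep = do.call(rbind, rep(list(t(b23)), nrow(a23)))
z23 = a23 + b23.rep
f23 = tanh(z23)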

Comparing these values with the deep features of hidden layer 2, I saw that they match perfectly

hl2.output = as.matrix(h2o.deepfeatures(fit,x_train,layer = 2))
all.equal(
as.numeric(f23[,1]),
hl2.output[, 1],
check.attributes = FALSE,
use.names = FALSE,
tolerance = 1e-04
)
[1] TRUE

Calculating the output layer features manually

I tried the same thing for the output layer

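# w34 and b34 were not defined in the post; assumed by analogy with w12 and b12
w34 = as.matrix(h2o.weights(fit, 3))
b34 = as.matrix(h2o.biases(fit, 3))
a34 = hl2.output %*% t(w34)
b34.rep = do.call(rbind, rep(list(t(b34)), nrow(a34)))
z34 = a34 + b34.rep
f34 = tanh(z34)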

I compared the result with the output of the model and could not reproduce it

all.equal(
as.numeric(f34[1,]),
output[1,],
check.attributes = FALSE,
use.names = FALSE,
tolerance = 1e-04
)
[1] "Mean relative difference: 3.019762"

The questions

I think I am not normalizing the data correctly, because I can recreate the deep features of hidden layer 2 from the features of hidden layer 1, but not those of hidden layer 1 from the input. I do not understand what is wrong, because with autoencoder = TRUE h2o should normalize the data to [-0.5, 0.5].

I also do not understand why the manual calculation of the output layer does not work.

1) How can I manually calculate the deep features of hidden layer 1?

2) How can I manually calculate the output features?


Answer 1:


You're using:

 normalize = function(x) {(((x-min(x))/(max(x)-min(x))) - 0.5)}

H2O uses this Java code:

 normMul[idx] = (v.max() - v.min() > 0)?1.0/(v.max() - v.min()):1.0;
 normSub[idx] = v.mean();

And then it is used like this:

numVals[i] = (numVals[i] - normSub[i])*normMul[i];

I.e. subtract the mean, then divide by the range (or, equivalently, multiply by 1 over the range). So, ignoring the check for divide-by-zero, I think your R code needs to be:

 normalize = function(x) {(x-mean(x))/(max(x)-min(x))}

With the check for zero, something like:

 normalize = function(x) {mul=max(x)-min(x);if(mul==0)mul=1;return((x-mean(x))/mul)}
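
To see what the zero check buys you, here is a small sketch on made-up data. The names `normalize_guarded`, `unguarded` and `d_toy` are hypothetical and only for illustration; the formula itself is the one above. A constant column has a range of zero, so without the guard it turns into NaN.

```r
# Mean-centred, range-scaled normalization with a guard for constant columns.
normalize_guarded <- function(x) {
  mul <- max(x) - min(x)
  if (mul == 0) mul <- 1   # constant column: avoid dividing by zero
  (x - mean(x)) / mul
}

# Hypothetical toy data: column "b" is constant.
d_toy <- cbind(a = c(1, 2, 3, 10), b = c(7, 7, 7, 7))

# Apply column-wise, as in the question's apply(d, 2, normalize).
d_toy_norm <- apply(d_toy, 2, normalize_guarded)
print(d_toy_norm)

# Without the guard, the constant column becomes 0 / 0 = NaN.
unguarded <- function(x) (x - mean(x)) / (max(x) - min(x))
d_toy_bad <- apply(d_toy, 2, unguarded)
print(d_toy_bad)
```

With the guard, the constant column comes out as all zeros, which mirrors the `:1.0` fallback in the Java ternary.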

Just playing around with that, it seems to have a range of 1.0, but it is not centred around 0.0, i.e. it is not the -0.5 to +0.5 described in the H2O documentation (e.g. p.20 in deep learning booklet). Did I miss something in the Java code?
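
A quick way to reproduce that observation, as a sketch on a made-up skewed column (`minmax_half` and `mean_range` are hypothetical names for the question's formula and the one derived from the Java code):

```r
# Hypothetical skewed column: most values small, one large.
x <- c(1, 2, 3, 10)

minmax_half <- function(x) (x - min(x)) / (max(x) - min(x)) - 0.5  # question's version
mean_range  <- function(x) (x - mean(x)) / (max(x) - min(x))       # version from the Java code

r1 <- range(minmax_half(x))  # endpoints sit at -0.5 and 0.5 by construction
r2 <- range(mean_range(x))   # width 1, but shifted when the mean is not the midpoint

print(r1)
print(r2)
print(diff(r2))

# The two versions differ by one constant offset for the whole column.
offset <- minmax_half(x) - mean_range(x)
print(unique(round(offset, 12)))
```

The two only coincide when the column mean equals the midpoint of its min and max, so for skewed columns the inputs fed to the first layer differ by a per-column constant, which is enough to change the hidden layer 1 activations.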

By the way, this line is where it decides to NORMALIZE for autoencoders, rather than STANDARDIZE as for other deep learning models.



Source: https://stackoverflow.com/questions/49711455/h2o-deeplearning-autoencoder-calculating-deep-features-manually
