Artificial Neural Networks for prediction

Posted by 淺唱寂寞╮ on 2019-12-11 04:39:19

Question


I have been looking at research papers that attempt to predict stock prices. I have noticed that in these papers an activation function is applied to the output, using one of the following: unipolar sigmoid, bipolar sigmoid, hyperbolic tangent, or radial basis function.

My question: if one of these activation functions is applied to the output, how can the network predict a stock price such as $103.56? Most of these functions have output ranges of (0, 1) or (-1, 1).

Reply to bakkal: Before being fed into the ANN, the inputs were normalized with the 'zscore' function defined in MATLAB, in which the mean is subtracted and the result divided by the standard deviation of the data. The target outputs were also normalized, divided by their maximum values, keeping in mind the upper and lower limits of the respective activation functions ((0, 1) for the unipolar sigmoid; (-1, 1) for the bipolar sigmoid and the hyperbolic tangent).
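In NumPy terms, the preprocessing that paragraph describes looks roughly like this (the sample data and variable names are illustrative, not from the paper):

import numpy as np

features = np.array([1.2, 0.8, 1.5, 0.9])    # hypothetical input features
targets  = np.array([98.4, 101.2, 103.56])   # hypothetical target prices

# z-score the inputs: subtract the mean, divide by the standard deviation
featuresScaled = (features - features.mean()) / features.std()

# scale the targets by their maximum so they fit the activation's output
# range, e.g. (0, 1] for the unipolar sigmoid
targetsScaled = targets / targets.max()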

Hi, as mentioned in the answer below, the activation function is not applied to the output; if that is so, could someone explain the paragraph quoted above? Thanks.


Answer 1:


We use normalization to map the target values into the range (0, 1) or (-1, 1), or whatever range matches your activation function. Generally, we also map the input values into a range near (-1, 1). The most frequently used scaling for input values is Gaussian (z-score) normalization. If the input vector is x and you are working with NumPy arrays, the Gaussian normalization of x is:

xScaled = (x - x.mean()) / x.std()

where mean() gives the average and std() the standard deviation.

Another common normalization is min-max scaling:

xScaled = (x - x.min()) / (x.max() - x.min())

which scales the input vector's values to the range (0, 1).

So you work with normalized input and output values in order to speed up the learning process; Andrew Ng's machine learning course explains why this helps. When you want to scale the normalized values back to their actual values, apply the reverse normalization. For example, for the (0, 1) min-max normalization above, the reverse normalization would be:

xRestored = xMin + (xMax - xMin) * xScaled

where xMin and xMax are the minimum and maximum of the original, unscaled data (save them before scaling, since they are needed to undo it).

You can similarly obtain the reverse normalization for the Gaussian case.
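For instance (a minimal sketch with made-up prices), save the statistics before scaling and reuse them to undo it:

import numpy as np

x = np.array([98.4, 101.2, 103.56, 105.1])  # hypothetical prices
xMean, xStd = x.mean(), x.std()             # save these before scaling
xScaled = (x - xMean) / xStd                # Gaussian normalization

xRestored = xMean + xStd * xScaled          # reverse normalization; equals x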




Answer 2:


If you're looking for a continuous output like 103.56, then you're using the neural network for regression (as opposed to classification). In that case you wouldn't apply an activation function to the output. Your output would be the sum of the weighted inputs from the previous layer.

That said, nothing stops you from using activation functions in the hidden layers of the network (e.g. to create intermediate features that are then used for the regression).
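A minimal NumPy sketch of that architecture (random weights, purely illustrative): the hidden layer uses tanh, while the output is a plain weighted sum, so the prediction is unbounded and can be any price.

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5)                          # 5 normalized input features

W1, b1 = rng.standard_normal((8, 5)), np.zeros(8)   # hidden layer parameters
W2, b2 = rng.standard_normal(8), 0.0                # output layer parameters

hidden = np.tanh(W1 @ x + b1)   # activation applied in the hidden layer
output = W2 @ hidden + b2       # no activation: unbounded regression output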

Why doesn't the activation function act like a normalization function? Do we still need to normalize if we are using an activation function, given that the activation function already squashes its input?

Normalization

Well, not exactly. Feature normalization means, e.g., taking all your historical stock price data, finding the max, the min, the standard deviation, etc., and applying a transformation so that all that historical data fits into, e.g., [0, 1].

Why do this? Because your historical data may have AMZN prices that go up to, say, $500, while its market cap is, say, $200 billion. That's many orders of magnitude of difference between the two features, price and market cap, which is bad for some numerical algorithms. So you normalize each feature onto a standardized scale, so that all prices lie in [0, 1] and all market caps lie in [0, 1]. This helps the backpropagation algorithm, for example.
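For example (illustrative numbers), min-max scaling each feature separately puts price and market cap on the same [0, 1] scale:

import numpy as np

price     = np.array([420.0, 455.0, 500.0])      # dollars
marketCap = np.array([1.8e11, 1.9e11, 2.0e11])   # dollars

# scale each feature independently so both lie in [0, 1]
priceScaled = (price - price.min()) / (price.max() - price.min())
capScaled   = (marketCap - marketCap.min()) / (marketCap.max() - marketCap.min())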

Activation

Now the activation function does a different thing: it creates an activation effect, as in a neuron either fires or doesn't fire. The activation function takes an input anywhere in (-inf, +inf) and squashes it into, say, (-1, +1). That's different from normalizing the data.
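For instance, tanh maps any real input into (-1, 1), no matter how large:

import numpy as np

z = np.array([-1000.0, -1.0, 0.0, 1.0, 1000.0])
print(np.tanh(z))   # approximately [-1, -0.762, 0, 0.762, 1]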

Now how can the activation effect help with regression? Well, in stocks for example, predicting prices of penny stocks (say a ~4 million USD company) can be wildly different from predicting prices of blue chips (~200 billion USD companies), so you may want a feature that turns on/off based on penny vs. large cap. That feature can then be used to better do the regression of the predicted price.
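A toy sketch of such an on/off feature (the $1B threshold and the names are made up for illustration):

import numpy as np

marketCap = np.array([4e6, 2e11])                # penny stock vs. blue chip
isLargeCap = (marketCap > 1e9).astype(float)     # 0.0/1.0 gating feature

# append the indicator to the other (normalized) features
features = np.column_stack([np.log10(marketCap), isLargeCap])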



Source: https://stackoverflow.com/questions/37939936/artifcial-neural-networks-for-prediction
