In machine learning theory, to use data to recover the underlying relation between Y and X, say Y = f(X), we impose a probability structure, then by this structu