Neural Networks: What does “linearly separable” mean?

野趣味 2020-12-23 16:54

I am currently reading the Machine Learning book by Tom Mitchell. When talking about neural networks, Mitchell states:

\"Although the perceptron rule

3 Answers
  • 2020-12-23 17:26

    Look at the following two data sets:

    ^                         ^
    |   X    O                |  AA    /
    |                         |  A    /
    |                         |      /   B
    |   O    X                |  A  /   BB
    |                         |    /   B
    +----------->             +----------->
    

    The left data set is not linearly separable (without using a kernel). The right one is separable into the two classes `A` and `B` by the indicated line.

    I.e., you cannot draw a straight line in the left image such that all the X are on one side and all the O are on the other. That is why it is called "not linearly separable": there exists no linear manifold (a line, plane, or hyperplane) separating the two classes.

    Now the famous kernel trick (which will certainly be discussed in the book next) actually allows many linear methods to be used for non-linear problems, by implicitly adding extra dimensions in which a non-linear problem becomes linearly separable.
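    A minimal sketch of that idea (the coordinates are my own choice, made to match the left figure): the pattern with X at (-1, 1) and (1, -1), O at (1, 1) and (-1, -1) has no separating line in 2D, but mapping each point (x, y) to (x, y, x*y) makes the classes separable by the plane z = 0.

    ```python
    def lift(x, y):
        """Map a 2D point to 3D by appending the product feature z = x * y."""
        return (x, y, x * y)

    xs = [(-1, 1), (1, -1)]   # class X (not separable from O in 2D)
    os_ = [(1, 1), (-1, -1)]  # class O

    # In the lifted space every X has z = -1 and every O has z = +1,
    # so the hyperplane z = 0 separates the two classes.
    assert all(lift(*p)[2] < 0 for p in xs)
    assert all(lift(*p)[2] > 0 for p in os_)
    print("separable in 3D by the plane z = 0")
    ```

    A real kernel method never computes the lifted coordinates explicitly; it only evaluates inner products in the lifted space, but the geometric effect is the same.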

  • 2020-12-23 17:30

    This means that there is a hyperplane (which splits your input space into two half-spaces) such that all points of the first class are in one half-space and those of the second class are in the other half-space.

    In two dimensions, that means that there is a line which separates points of one class from points of the other class.

    EDIT: for example, in this image, if blue circles represent points from one class and red circles represent points from the other class, then these points are linearly separable.

    [image: blue points and red points separated by a straight line]

    In three dimensions, it means that there is a plane which separates points of one class from points of the other class.

    In higher dimensions, it's similar: there must exist a hyperplane which separates the two sets of points.

    You mention that you're not good at math, so I'm not writing the formal definition, but let me know (in the comments) if that would help.
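    Here is a short, informal version of that definition in code (the hyperplane and the point coordinates are made up for illustration): a hyperplane w·x + b = 0 separates two point sets if every point of one set gives w·x + b > 0 and every point of the other gives w·x + b < 0.

    ```python
    def side(w, b, p):
        """Sign of w . p + b: which half-space the point p lies in."""
        s = sum(wi * pi for wi, pi in zip(w, p)) + b
        return 1 if s > 0 else -1

    def separates(w, b, class_a, class_b):
        """True if class_a lies strictly on one side of the hyperplane
        and class_b strictly on the other."""
        return (all(side(w, b, p) == 1 for p in class_a)
                and all(side(w, b, p) == -1 for p in class_b))

    # Hypothetical 2D example: the line x + y = 3, i.e. w = (1, 1), b = -3.
    blue = [(0, 0), (1, 1), (0, 2)]   # all below the line
    red = [(3, 2), (4, 1), (2, 3)]    # all above the line
    print(separates((1, 1), -3, red, blue))  # True
    ```

    The same function works unchanged in three or more dimensions, because only the dot product depends on the coordinates.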

  • 2020-12-23 17:37

    Suppose you want to write an algorithm that decides, based on two parameters, size and price, whether a house will sell within the same year it was put on sale. So you have two inputs, size and price, and one output: will sell or will not sell. Now, when you receive your training sets, it may happen that the outputs are not clustered in a way that makes prediction easy. (Based on the first graph, can you tell whether X will be an N or an S? How about in the second graph?)

            ^
            |  N S   N
           s|  S X    N
           i|  N     N S
           z|  S  N  S  N
           e|  N S  S N
            +----------->
              price
    
    
            ^
            |  S S   N
           s|  X S    N
           i|  S     N N
           z|  S  N  N  N
           e|    N N N
            +----------->
              price
    

    Where:

    S-sold,
    N-not sold
    

    As you can see in the first graph, you can't separate the two possible outputs (sold/not sold) by a straight line: no matter how you draw it, there will always be both S and N on both sides of the line. Your algorithm can try many possible lines, but there is no single correct line that splits the two outputs (and, of course, predicts new ones, which is the goal from the very beginning). That's why linearly separable data sets (the second graph) are much easier to predict.
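    The difference matters in practice: on a linearly separable set like the second graph, the perceptron learning rule mentioned in Mitchell's book is guaranteed to converge to a separating line. A minimal sketch, with made-up (price, size) values roughly in the spirit of the second graph:

    ```python
    def perceptron(data, epochs=100):
        """Perceptron rule on 2D points with labels +1 (S) / -1 (N).
        Returns (weights, bias) on convergence, or None."""
        w = [0.0, 0.0]
        b = 0.0
        for _ in range(epochs):
            errors = 0
            for (x1, x2), label in data:
                pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1
                if pred != label:        # misclassified: nudge the boundary
                    w[0] += label * x1
                    w[1] += label * x2
                    b += label
                    errors += 1
            if errors == 0:              # every point correctly classified
                return w, b
        return None                      # no separating line found in time

    # Made-up data: sold (+1) houses cluster at low price, not sold (-1) at high.
    data = [((1, 2), 1), ((2, 3), 1), ((1, 4), 1),
            ((5, 2), -1), ((6, 3), -1), ((5, 4), -1)]
    result = perceptron(data)
    print("separating line found:", result is not None)
    ```

    On the first (non-separable) graph, the same loop would never reach an error-free pass, which is exactly the limitation of the pure perceptron rule that motivates the discussion in the book.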
