Wednesday, April 23, 2014
SVM
More formally, a support vector machine constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression, or other tasks. Intuitively, a good separation is achieved by the hyperplane that has the largest distance to the nearest training data point of any class (the so-called functional margin), since in general the larger the margin, the lower the generalization error of the classifier.
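As a quick illustration of the maximum-margin idea, here is a minimal sketch (assuming scikit-learn and NumPy are available; the toy data and the large-C "hard margin" setting are invented for illustration):

import numpy as np
from sklearn.svm import SVC

# Two linearly separable point clouds in the plane (invented toy data).
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, 0.5],
              [3.0, 3.0], [4.0, 4.0], [3.5, 2.5]])
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C approximates a hard margin: no training point may
# fall inside the margin.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]  # normal vector of the separating hyperplane
print("support vectors:", clf.support_vectors_)
print("margin width:", 2.0 / np.linalg.norm(w))

The support vectors printed here are exactly the nearest training points of each class; only they determine the hyperplane.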
Whereas the original problem may be stated in a finite-dimensional space, it often happens that the sets to discriminate are not linearly separable in that space. For this reason, it was proposed that the original finite-dimensional space be mapped into a much higher-dimensional space, presumably making the separation easier in that space. To keep the computational load reasonable, the mappings used by SVM schemes are designed to ensure that dot products may be computed easily in terms of the variables in the original space, by defining them in terms of a kernel function k(x, y) selected to suit the problem.[2]

The hyperplanes in the higher-dimensional space are defined as the set of points whose dot product with a vector in that space is constant. The vectors defining the hyperplanes can be chosen to be linear combinations with parameters α_i of images of feature vectors x_i that occur in the data base. With this choice of a hyperplane, the points x in the feature space that are mapped into the hyperplane are defined by the relation

    ∑_i α_i k(x_i, x) = constant.

Note that if k(x, y) becomes small as y grows further away from x, each term in the sum measures the degree of closeness of the test point x to the corresponding data base point x_i. In this way, the sum of kernels above can be used to measure the relative nearness of each test point to the data points originating in one or the other of the sets to be discriminated. Note that the set of points x mapped into any hyperplane can be quite convoluted as a result, allowing much more complex discrimination between sets which are not convex at all in the original space.
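To make the kernel sum concrete, the following sketch evaluates ∑_i α_i k(x_i, x) with a Gaussian (RBF) kernel in plain NumPy; the data base points x_i and the coefficients α_i are made up for illustration:

import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    # k(x, y) = exp(-gamma * ||x - y||^2): close to 1 when y is near x,
    # shrinking toward 0 as y grows further away from x.
    return np.exp(-gamma * np.sum((x - y) ** 2))

# "Data base" points x_i, with signed coefficients alpha_i: positive for
# one class, negative for the other (values invented for illustration).
X_db = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])
alpha = np.array([1.0, 1.0, -1.0, -1.0])

def kernel_sum(x):
    # sum_i alpha_i * k(x_i, x): measures the relative nearness of the
    # test point x to the points of each class.
    return sum(a * rbf_kernel(xi, x) for a, xi in zip(alpha, X_db))

print(kernel_sum(np.array([0.5, 0.0])))  # near the positive class: > 0
print(kernel_sum(np.array([5.5, 5.0])))  # near the negative class: < 0

The sign of the sum tells us which class's points the test point is nearer to, which is exactly the discrimination rule described above.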
However, not all sets of four points, no three collinear, are linearly separable in two dimensions. The classic example is the XOR configuration: the four corners of the unit square, (0, 0), (0, 1), (1, 0), and (1, 1), with the two diagonally opposite pairs assigned to opposite classes. Separating the classes would need two straight lines, so the set is not linearly separable.
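A short sketch (again assuming scikit-learn; the gamma and C values are arbitrary) contrasting a linear kernel, which cannot fit this configuration, with an RBF kernel, which can:

import numpy as np
from sklearn.svm import SVC

# The four XOR points: diagonally opposite corners share a class.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 1, 1, 0])

linear = SVC(kernel="linear", C=10.0).fit(X, y)
rbf = SVC(kernel="rbf", gamma=2.0, C=10.0).fit(X, y)

print("linear kernel accuracy:", linear.score(X, y))  # below 1.0: no single line works
print("rbf kernel accuracy:", rbf.score(X, y))        # 1.0: separable after the mapping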