1.4. Support Vector Machines
Support vector machines (SVMs) are a set of supervised learning methods used for classification, regression and outlier detection.
The advantages of support vector machines are:
Effective in high dimensional spaces.
Still effective in cases where the number of dimensions is greater than the number of samples.
Uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.
Versatile: different Kernel functions can be specified for the decision function. Common kernels are provided, but it is also possible to specify custom kernels.
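As a sketch of that last point, a Python callable can be passed as the `kernel` argument of `SVC`; scikit-learn calls it with two matrices and expects the Gram matrix between them. The toy data and the kernel function name here are made up for illustration:

```python
import numpy as np
from sklearn import svm

# A hypothetical custom kernel: the plain linear kernel written by hand.
# SVC calls it as my_kernel(X, Y) and expects the Gram matrix X @ Y.T.
def my_kernel(X, Y):
    return np.dot(X, Y.T)

# Tiny illustrative dataset: two well-separated classes.
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
y = np.array([0, 0, 1, 1])

clf = svm.SVC(kernel=my_kernel)
clf.fit(X, y)
pred = clf.predict([[0.5, 0.5], [2.5, 2.5]])
```

Built-in kernels (`'linear'`, `'poly'`, `'rbf'`, `'sigmoid'`) are selected the same way, by name.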
The disadvantages of support vector machines include:
If the number of features is much greater than the number of samples, the method is likely to give poor performance.
SVMs do not directly provide probability estimates; these are calculated using an expensive five-fold cross-validation (see Scores and probabilities, below).
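To make that concrete, probability estimates must be requested up front with `probability=True`, which triggers the internal cross-validation at fit time. The dataset below is made up for the sketch:

```python
from sklearn import svm

# Illustrative data: two classes along the diagonal.
X = [[float(i), float(i)] for i in range(10)]
y = [0] * 5 + [1] * 5

# probability=True enables the expensive internal five-fold
# cross-validation (Platt scaling); it is off by default.
clf = svm.SVC(probability=True, random_state=0)
clf.fit(X, y)

# predict_proba returns one probability per class, summing to 1.
proba = clf.predict_proba([[8.0, 8.0]])
```

Without `probability=True`, calling `predict_proba` raises an error; only `decision_function` values are available.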
The support vector machines in scikit-learn support both dense (numpy.ndarray and convertible to that by numpy.asarray) and sparse (any scipy.sparse) sample vectors as input. However, to use an SVM to make predictions for sparse data, it must have been fit on such data. For optimal performance, use C-ordered numpy.ndarray (dense) or scipy.sparse.csr_matrix (sparse) with dtype=float64.
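A small sketch of the dense/sparse point above, fitting and predicting with `scipy.sparse.csr_matrix` input (the data is made up for illustration):

```python
import numpy as np
from scipy import sparse
from sklearn import svm

# Dense, C-ordered float64 array: the recommended dense input format.
X_dense = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
                   dtype=np.float64)
y = [0, 0, 1, 1]

# CSR is the recommended sparse input format.
X_sparse = sparse.csr_matrix(X_dense)

# An SVM fit on sparse data can then predict on sparse data.
clf = svm.SVC(kernel="linear")
clf.fit(X_sparse, y)
pred = clf.predict(sparse.csr_matrix([[2.5, 2.5]]))
```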
SVC, NuSVC and LinearSVC are classes capable of performing multi-class classification on a dataset.
SVN骚腥,NuSVC 和LinearSVC都可以用來進行數(shù)據(jù)的多類分類。
SVC and NuSVC are similar methods, but accept slightly different sets of parameters and have different mathematical formulations (see section Mathematical formulation). On the other hand, LinearSVC is another implementation of Support Vector Classification for the case of a linear kernel. Note that LinearSVC does not accept keyword kernel, as this is assumed to be linear. It also lacks some of the members of SVC and NuSVC, like support_.
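The difference in attributes can be checked directly; a minimal sketch (data made up for illustration):

```python
from sklearn.svm import SVC, LinearSVC

# Tiny illustrative dataset.
X = [[0, 0], [1, 1], [2, 2], [3, 3]]
y = [0, 0, 1, 1]

svc = SVC(kernel="linear").fit(X, y)
lin = LinearSVC().fit(X, y)

# SVC exposes the indices of its support vectors; LinearSVC does not.
svc_has_support = hasattr(svc, "support_")
lin_has_support = hasattr(lin, "support_")
```

Likewise, `LinearSVC(kernel="linear")` would raise a `TypeError`, since `LinearSVC` takes no `kernel` keyword at all.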
As with other classifiers, SVC, NuSVC and LinearSVC take as input two arrays: an array X of size [n_samples, n_features] holding the training samples, and an array y of class labels (strings or integers), size [n_samples]:
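A minimal fit/predict sketch along those lines, with a two-sample toy dataset:

```python
from sklearn import svm

# X: [n_samples, n_features] training samples; y: [n_samples] class labels.
X = [[0, 0], [1, 1]]
y = [0, 1]

clf = svm.SVC()
clf.fit(X, y)

# After fitting, the model can predict labels for new samples.
pred = clf.predict([[2.0, 2.0]])
```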