Implementing Support Vector Machines (SVM)


楊學智 🐨@筆記😊 · Jan 18, 2024

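Abstract:

Support Vector Machines (SVMs) are powerful machine learning models that can build both linear and nonlinear decision boundaries. This report examines how SVMs work and highlights their use in classification and regression tasks. The tutorial focuses on implementing linear and nonlinear SVMs for classification with scikit-learn. We also explore how the parameters affect SVM performance.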

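1. Introduction:

Support Vector Machines (SVMs) have attracted wide attention in the machine learning community because they can find an optimal decision boundary. This report aims to give a thorough understanding of SVMs, with particular attention to their implementation for classification tasks. SVMs are good at linear classification, and the kernel trick extends them to nonlinear problems, which makes them useful across many domains.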

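2. SVMs for classification:

An SVM is inherently a binary classifier. It stands out because it looks for the hyperplane that separates the classes with the largest margin. We first cover the basic concepts behind the linear SVM and its ability to handle linearly separable data. We then extend the discussion to nonlinear SVMs, where the kernel trick lets the SVM work efficiently in more complex, nonlinear spaces.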
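The maximum-margin idea can be made concrete with a minimal sketch. It assumes a tiny, hand-made separable dataset (the points are hypothetical and not from the article) and a very large C to approximate a hard margin. For a linear SVM with weight vector w, the width of the margin is 2 / ||w||.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (hypothetical values chosen for illustration).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C approximates a hard-margin SVM on separable data.
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

w = clf.coef_[0]         # normal vector of the separating hyperplane
b = clf.intercept_[0]    # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)

print("w =", w, "b =", b)
print("margin width =", margin_width)
print("support vectors:")
print(clf.support_vectors_)
```

Only the support vectors, the points lying on the margin, determine the hyperplane; moving any other point without crossing the margin leaves the boundary unchanged.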

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import MinMaxScaler

    cancer = load_breast_cancer()

    # Create the training and test sets
    X_train, X_test, y_train, y_test = train_test_split(
        cancer.data, cancer.target, stratify=cancer.target, random_state=42)

    # Normalise the data
    scaler = MinMaxScaler()                    # create the scaler object
    scaler.fit(X_train)                        # compute the min and max of the training data
    X_train_norm = scaler.transform(X_train)   # apply the normalisation to the training set
    X_test_norm = scaler.transform(X_test)     # apply the normalisation to the test set

3. Implementation with sklearn:

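The scikit-learn (sklearn) library provides a solid platform for implementing SVMs. This tutorial gives a step-by-step guide to deploying SVMs with sklearn in a classification setting. Practical examples show how to implement linear and nonlinear SVMs and how easily these models fit into a machine learning workflow.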

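We first create a linear SVM. To do so we set the kernel parameter to "linear"; the default is "rbf", the radial basis function (RBF) kernel, which gives a nonlinear SVM. The printed estimator also shows the other default parameters: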

    from sklearn.svm import SVC

    lin_svm = SVC(kernel="linear")
    lin_svm.fit(X_train_norm, y_train)

Output:

    SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
        decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
        kernel='linear', max_iter=-1, probability=False, random_state=None,
        shrinking=True, tol=0.001, verbose=False)

Testing:

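    from sklearn.metrics import accuracy_score

    y_pred = lin_svm.predict(X_test_norm)
    print("Linear SVM - accuracy on test set: {:.3f}".format(accuracy_score(y_test, y_pred)))

Output:

    Linear SVM - accuracy on test set: 0.979

Now let us create two more SVM models, one with a polynomial kernel and one with an RBF kernel, and compare the results.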

    # SVM with polynomial kernel
    poly_svm = SVC(kernel="poly", degree=2)  # polynomial kernel of degree 2
    poly_svm.fit(X_train_norm, y_train)
    y_pred = poly_svm.predict(X_test_norm)
    print("SVM with polynomial kernel - accuracy on test set: {:.3f}".format(accuracy_score(y_test, y_pred)))

    # SVM with RBF kernel
    rbf_svm = SVC(kernel="rbf", gamma="auto")
    rbf_svm.fit(X_train_norm, y_train)
    y_pred = rbf_svm.predict(X_test_norm)
    print("SVM with RBF kernel - accuracy on test set: {:.3f}".format(accuracy_score(y_test, y_pred)))

Output:

    SVM with polynomial kernel - accuracy on test set: 0.839
    SVM with RBF kernel - accuracy on test set: 0.944

4. Exploring the effect of parameters:

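A key aspect of SVM performance is its sensitivity to parameter configuration. This report looks closely at how key parameters such as the choice of kernel, the regularization parameter (C), and the kernel coefficient (gamma) affect classification accuracy. We use real-world datasets to show the effect of parameter changes and offer insight into tuning the parameters for the best model performance.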
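One common way to tune the kernel, C, and gamma together is a cross-validated grid search. The following is a minimal sketch, not the article's own procedure: the grid values are hypothetical and picked only for illustration, and the scaler is placed inside a `Pipeline` so that each cross-validation fold is scaled using its own training part.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=42
)

# Scaling is part of the pipeline, so it is refit on every training fold.
pipe = Pipeline([
    ("scaler", MinMaxScaler()),
    ("svm", SVC()),
])

# Hypothetical grid: illustrative values, not tuned recommendations.
# gamma is ignored by the linear kernel, so those fits are redundant but harmless.
param_grid = {
    "svm__kernel": ["linear", "rbf"],
    "svm__C": [0.1, 1, 10, 100],
    "svm__gamma": ["scale", 0.01, 0.1, 1],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

print("best parameters:", search.best_params_)
print("mean CV accuracy: {:.3f}".format(search.best_score_))
print("test accuracy: {:.3f}".format(search.score(X_test, y_test)))
```

The test set is used only once, at the end, so the reported test accuracy is not biased by the parameter search.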

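SVM classifiers are very sensitive to the values of their parameters. To show this, we use a simpler dataset, the moons dataset. It contains data from two classes, each described by two features; the data points form two half circles (moons). Let us generate and plot the data: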

    import matplotlib.pyplot as plt
    from sklearn.datasets import make_moons

    X, y = make_moons(n_samples=100, noise=0.15, random_state=42)

    def plot_dataset(X, y, axes):
        # Blue squares for class 0, green triangles for class 1
        plt.plot(X[:, 0][y==0], X[:, 1][y==0], "bs")
        plt.plot(X[:, 0][y==1], X[:, 1][y==1], "g^")
        plt.axis(axes)
        plt.grid(True, which='both')
        plt.xlabel(r"$x_1$", fontsize=20)
        plt.ylabel(r"$x_2$", fontsize=20, rotation=0)

    plot_dataset(X, y, [-1.5, 2.5, -1, 1.5])
    plt.show()

[Figure: scatter plot of the moons dataset]

Now let us run an SVM with an RBF kernel on the moons dataset, using different values of gamma and C (the most important parameters), and observe how the decision boundary changes.

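The parameter C controls regularization, while gamma controls the width of the Gaussian kernel: a smaller gamma means a wider kernel, and vice versa. In scikit-learn the strength of the regularization is inversely proportional to C, so a small C regularizes strongly and a large C regularizes weakly.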
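The effect of gamma on the kernel width can be checked numerically. The sketch below assumes the standard Gaussian (RBF) kernel K(a, b) = exp(-gamma * ||a - b||^2) and compares a hand-written version with scikit-learn's `rbf_kernel`. The two points are hypothetical; the gamma values 0.1 and 5 are the ones used in the experiment that follows.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def gaussian_kernel(a, b, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

# Two hypothetical points at Euclidean distance 1 from each other.
a = np.array([0.0, 0.0])
b = np.array([1.0, 0.0])

for gamma in (0.1, 5):
    manual = gaussian_kernel(a, b, gamma)
    library = rbf_kernel(a.reshape(1, -1), b.reshape(1, -1), gamma=gamma)[0, 0]
    print("gamma={}: manual={:.4f}, sklearn={:.4f}".format(gamma, manual, library))
```

With gamma = 0.1 the two points still look similar to the kernel (the value is close to 1), while with gamma = 5 the similarity has almost vanished at the same distance. A large gamma therefore makes each training point influence only its immediate neighbourhood, which produces more irregular decision boundaries.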

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Helper function for plotting the decision boundary
    def plot_predictions(clf, axes):
        x0s = np.linspace(axes[0], axes[1], 100)
        x1s = np.linspace(axes[2], axes[3], 100)
        x0, x1 = np.meshgrid(x0s, x1s)
        X = np.c_[x0.ravel(), x1.ravel()]
        y_pred = clf.predict(X).reshape(x0.shape)
        y_decision = clf.decision_function(X).reshape(x0.shape)
        plt.contourf(x0, x1, y_pred, cmap=plt.cm.brg, alpha=0.2)
        plt.contourf(x0, x1, y_decision, cmap=plt.cm.brg, alpha=0.1)

    # Create SVM models with different gamma and C values
    gamma1, gamma2 = 0.1, 5
    C1, C2 = 0.001, 1000
    hyperparams = (gamma1, C1), (gamma1, C2), (gamma2, C1), (gamma2, C2)

    svm_clfs = []
    for gamma, C in hyperparams:
        rbf_kernel_svm_clf = Pipeline([
            ("scaler", StandardScaler()),
            ("svm_clf", SVC(kernel="rbf", gamma=gamma, C=C))
        ])
        rbf_kernel_svm_clf.fit(X, y)
        svm_clfs.append(rbf_kernel_svm_clf)

    fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(10.5, 7), sharex=True, sharey=True)

    # Plot the dataset and the decision boundary for each model
    for i, svm_clf in enumerate(svm_clfs):
        plt.sca(axes[i // 2, i % 2])
        plot_predictions(svm_clf, [-1.5, 2.45, -1, 1.5])
        plot_dataset(X, y, [-1.5, 2.45, -1, 1.5])
        gamma, C = hyperparams[i]
        plt.title(r"$\gamma = {}, C = {}$".format(gamma, C), fontsize=16)
        if i in (0, 1):
            plt.xlabel("")
        if i in (1, 3):
            plt.ylabel("")

    plt.savefig("moons_rbf_svc_plot.png")
    plt.show()

[Figure: decision boundaries of the RBF-kernel SVM on the moons dataset for the four (gamma, C) combinations]

5. Conclusion:

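In summary, this report is a guide to understanding and implementing Support Vector Machines for classification tasks. The practical examples and the look at parameter sensitivity give a rounded picture of SVMs and of how they adapt to different datasets. As machine learning continues to develop, the versatility and robustness of SVMs keep them an indispensable tool for data scientists.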
