DAaML/z3
From WikiZMSI
Support Vector Machines (SVM)
Classroom
- Download and unzip the file: data_for_svm.zip.
- From the unzipped archive, load the binary file (MATLAB format) containing three small data sets (named X1, X2, X3). Use the loadmat('...') function from the scipy.io package, for example:
from scipy.io import loadmat
D = loadmat('data_for_svm.mat')
X1 = D['X1']
X2 = D['X2']
X3 = D['X3']
- Using the sklearn.svm.SVC class with kernel='linear', carry out the following steps. For the first data set: (1) find the separating line with the largest margin, (2) compute the value of the margin, (3) on a 2D plot show the data set, the optimal line, the two helper side lines (one margin away from the optimal line on either side), the support vectors marked distinctly from the other data points, and the margin itself (it can be drawn as a segment from one of the support vectors to the optimal line). For the second data set, carry out the same steps but with 3D visualizations (in the R^3 space).
- Repeat the training of the classifiers from the previous point, but this time formulate the optimization task directly as a quadratic programming problem and solve it with cvxopt (a convex optimization package). Useful links: https://scaron.info/blog/quadratic-programming-in-python.html, http://cvxopt.org/applications/svm/, https://xavierbourretsicotte.github.io/SVM_implementation.html.
- For the soft-margin SVM variant (data set X3): (1) go back to the sklearn.svm.SVC class, (2) run three experiments to find the separating line for C = 0.1, 1.0, 10.0, (3) present the resulting lines on three plots, (4) compute the value of the margin for each experiment.
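The linear-kernel steps above (fit, read off the line, compute the margin, find the support vectors) can be sketched as follows. This is only a minimal sketch under assumptions: the column layout of X1, X2, X3 is not described on this page, so the example generates a synthetic two-class data set instead of loading the file, and the helper name fit_linear_svm is hypothetical. The margin is taken here as the distance from the separating line to one helper side line, 1/||w||; the full width between the two side lines is twice that.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-in for the course data (assumption: the real column layout
# of X1/X2/X3 is not given here, so points and labels are generated instead).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([-2.0, -2.0], 1.0, size=(40, 2)),
               rng.normal([2.0, 2.0], 1.0, size=(40, 2))])
y = np.hstack([-np.ones(40), np.ones(40)])


def fit_linear_svm(X, y, C):
    """Hypothetical helper: fit a linear SVC, return (w, b, margin, support vectors)."""
    clf = SVC(kernel='linear', C=C).fit(X, y)
    w = clf.coef_[0]                  # normal vector of the separating line
    b = clf.intercept_[0]             # offset: the line is w @ x + b = 0
    margin = 1.0 / np.linalg.norm(w)  # distance from the line to each side line
    return w, b, margin, clf.support_vectors_


# One experiment per value of C, as in the soft-margin task.
results = {C: fit_linear_svm(X, y, C) for C in (0.1, 1.0, 10.0)}
for C, (w, b, margin, sv) in results.items():
    print(f"C={C}: w={w}, b={b:.3f}, margin={margin:.3f}, support vectors={len(sv)}")
```

The helper side lines are the sets where w @ x + b equals +1 and -1, so they can be plotted the same way as the optimal line. SVC has no strict hard-margin mode; for linearly separable data a large C is a common approximation.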
Homework
- Experiment with the soft-margin SVM variant with a non-linear decision boundary. Switch the kernel function by setting kernel='rbf'. Find the non-linear decision boundary for the X3 data set and visualize it with a contour plot (commands: meshgrid + contour or contourf).
- Experiment with several SVM classifiers on the Olivetti data set (with PCA features). Report the accuracies obtained by the SVMs and compare them against the accuracies of CART classifiers.
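For the RBF-kernel task above, one possible way to prepare the input for contour or contourf is to evaluate the classifier's decision function on a regular grid. The sketch below is illustrative only: the data is a synthetic stand-in (an inner blob surrounded by a ring), not X3, the helper name decision_grid is hypothetical, and C=1.0 and gamma='scale' are arbitrary choices rather than recommended settings.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic, non-linearly separable stand-in for X3 (assumption): an inner
# blob labelled -1 surrounded by a ring labelled +1.
rng = np.random.default_rng(1)
inner = rng.normal(0.0, 0.5, size=(60, 2))
angles = rng.uniform(0.0, 2.0 * np.pi, size=60)
radii = rng.normal(3.0, 0.3, size=60)
ring = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
X = np.vstack([inner, ring])
y = np.hstack([-np.ones(60), np.ones(60)])

clf = SVC(kernel='rbf', C=1.0, gamma='scale').fit(X, y)


def decision_grid(clf, X, steps=200, pad=1.0):
    """Hypothetical helper: evaluate clf.decision_function on a grid covering X."""
    xs = np.linspace(X[:, 0].min() - pad, X[:, 0].max() + pad, steps)
    ys = np.linspace(X[:, 1].min() - pad, X[:, 1].max() + pad, steps)
    xx, yy = np.meshgrid(xs, ys)
    # Flatten the grid into (steps * steps, 2) points, score them, reshape back.
    zz = clf.decision_function(np.column_stack([xx.ravel(), yy.ravel()]))
    return xx, yy, zz.reshape(xx.shape)


xx, yy, zz = decision_grid(clf, X)
print("training accuracy:", clf.score(X, y))
```

With matplotlib, contourf(xx, yy, zz) shades the decision function and contour(xx, yy, zz, levels=[0]) draws the decision boundary; levels=[-1, 0, 1] adds the two margin curves.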