Implementation of SMO Based on MATLAB

Keywords: MATLAB, network

This post is my homework for the machine learning course: SMO implemented in MATLAB, together with a record of my experience.
I implemented a simplified version of the SMO code; most of the code circulating on the net follows the same idea, mainly based on the SMO section of Peter's Machine Learning in Action. The complete project is available via the portal link. Thanks, Mr. yqx.
My own simplified version of the code:

function [alpha,bias] = my_seqminoptSimple(training,groupIndex,C,maxIter,tol)
% Simplified SMO for a linear soft-margin SVM.
%   training   - sampleNum x featureNum data matrix
%   groupIndex - sampleNum x 1 label vector with entries in {+1,-1}
%   C          - box constraint; maxIter - full passes without any update
%                before stopping; tol - KKT violation tolerance

% init
sampleNum=size(training,1);
alpha=zeros(sampleNum,1);
bias=0;
iteratorTimes=0;

% cache the linear-kernel Gram matrix once
K=training*training';
while iteratorTimes<maxIter
    alphaPairsChanged=0;

    % find alpha1
    for i=1:sampleNum
        % decision value and prediction error for sample i (using cached K)
        g1=(alpha.*groupIndex)'*K(:,i)+bias;
        E1=g1-groupIndex(i,1);
        % choose i: a sample that violates the KKT conditions beyond tol
        if(((E1*groupIndex(i,1)<-tol)&&alpha(i,1)<C)||((E1*groupIndex(i,1)>tol)&&alpha(i,1)>0))
           % choose j: different from i 
           j=i;
           while j==i
                j=randi(sampleNum);
           end

            alpha1=i;
            alpha2=j;

            % update alpha1 and alpha2
            alpha1Old=alpha(alpha1,1);
            alpha2Old=alpha(alpha2,1);
            y1=groupIndex(alpha1,1);
            y2=groupIndex(alpha2,1);

            % decision value and prediction error for sample j (using cached K)
            g2=(alpha.*groupIndex)'*K(:,j)+bias;
            E2=g2-groupIndex(j,1);

            % feasible interval [L,H] for alpha2 from the box and equality constraints
            if y1~=y2
                L=max(0,alpha2Old-alpha1Old);
                H=min(C,C+alpha2Old-alpha1Old);
            else
                L=max(0,alpha2Old+alpha1Old-C);
                H=min(C,alpha2Old+alpha1Old);
            end

            if L==H
                fprintf('H==L\n');
                continue;
            end

            % eta = K11 + K22 - 2*K12: curvature of the objective along the
            % constraint line; the analytic step is valid only when eta > 0
            eta=K(alpha1,alpha1)+K(alpha2,alpha2)-2*K(alpha1,alpha2);

            if eta<=0
                fprintf('eta<=0\n');
                continue;
            end

            alpha2New=alpha2Old+y2*(E1-E2)/eta;

            % clip alpha2New into [L,H]
            alpha2New=min(max(alpha2New,L),H);

            if abs(alpha2New-alpha2Old)<=0.0001
                fprintf('change small\n');
                continue;
            end

            % alpha1 moves the opposite way to preserve sum(alpha.*groupIndex)
            alpha1New=alpha1Old+y1*y2*(alpha2Old-alpha2New);

            % update bias
            bias1=-E1-y1*K(alpha1,alpha1)*(alpha1New-alpha1Old)-y2*K(alpha2,alpha1)*(alpha2New-alpha2Old)+bias;
            bias2=-E2-y1*K(alpha1,alpha2)*(alpha1New-alpha1Old)-y2*K(alpha2,alpha2)*(alpha2New-alpha2Old)+bias;

            % prefer the bias derived from a non-bound (0 < alpha < C) multiplier
            if alpha1New>0&&alpha1New<C
                bias=bias1;
            elseif alpha2New>0&&alpha2New<C
                bias=bias2;
            else
                bias=(bias2+bias1)/2;
            end

            alpha(alpha1,1)=alpha1New;
            alpha(alpha2,1)=alpha2New;
            alphaPairsChanged=alphaPairsChanged+1;
       end  
    end

    % count an iteration only when a full pass changes no pair
    if alphaPairsChanged==0
        iteratorTimes=iteratorTimes+1;
    else
        iteratorTimes=0;
    end
    fprintf('iteratorTimes=%d\n',iteratorTimes);

end
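To try the function, here is a minimal usage sketch (my own addition; the toy data and the rng seed are made up for illustration). It uses the same hyperparameters as the Python demo below (C=0.6, tol=0.001, maxIter=40) and recovers the linear-kernel weight vector as w = X'*(alpha.*y):

% Minimal usage sketch; the toy data is illustrative only.
rng(0);
X=[randn(40,2)+2; randn(40,2)-2];    % two well-separated 2-D clusters
y=[ones(40,1); -ones(40,1)];         % labels in {+1,-1}
[alpha,bias]=my_seqminoptSimple(X,y,0.6,40,0.001);
w=X'*(alpha.*y);                     % weight vector for the linear kernel
pred=sign(X*w+bias);                 % predictions on the training set
fprintf('training accuracy = %.2f\n',mean(pred==y));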

Below is the simplified SMO code from Machine Learning in Action; thanks to Peter for the code.

'''
Created on Nov 4, 2010
Chapter 5 source file for Machine Learning in Action
@author: Peter
'''
from numpy import *

def loadDataSet(fileName):
    dataMat = []; labelMat = []
    fr = open(fileName)
    for line in fr.readlines():
        lineArr = line.strip().split('\t')
        dataMat.append([float(lineArr[0]), float(lineArr[1])])
        labelMat.append(float(lineArr[2]))
    return dataMat,labelMat

def selectJrand(i,m):
    j=i #we want to select any J not equal to i
    while (j==i):
        j = int(random.uniform(0,m))
    return j

def clipAlpha(aj,H,L):
    if aj > H: 
        aj = H
    if L > aj:
        aj = L
    return aj

def smoSimple(dataMatIn, classLabels, C, toler, maxIter):
    dataMatrix = mat(dataMatIn); labelMat = mat(classLabels).transpose()
    b = 0; m,n = shape(dataMatrix)
    alphas = mat(zeros((m,1)))
    iter = 0
    while (iter < maxIter):
        alphaPairsChanged = 0
        for i in range(m):
            fXi = float(multiply(alphas,labelMat).T*(dataMatrix*dataMatrix[i,:].T)) + b
            Ei = fXi - float(labelMat[i])#if checks if an example violates KKT conditions
            if ((labelMat[i]*Ei < -toler) and (alphas[i] < C)) or ((labelMat[i]*Ei > toler) and (alphas[i] > 0)):
                j = selectJrand(i,m)
                fXj = float(multiply(alphas,labelMat).T*(dataMatrix*dataMatrix[j,:].T)) + b
                Ej = fXj - float(labelMat[j])
                alphaIold = alphas[i].copy(); alphaJold = alphas[j].copy();
                if (labelMat[i] != labelMat[j]):
                    L = max(0, alphas[j] - alphas[i])
                    H = min(C, C + alphas[j] - alphas[i])
                else:
                    L = max(0, alphas[j] + alphas[i] - C)
                    H = min(C, alphas[j] + alphas[i])
                if L==H: print("L==H"); continue
                eta = 2.0 * dataMatrix[i,:]*dataMatrix[j,:].T - dataMatrix[i,:]*dataMatrix[i,:].T - dataMatrix[j,:]*dataMatrix[j,:].T
                if eta >= 0: print("eta>=0"); continue
                alphas[j] -= labelMat[j]*(Ei - Ej)/eta
                alphas[j] = clipAlpha(alphas[j],H,L)
                if (abs(alphas[j] - alphaJold) < 0.00001): print("j not moving enough"); continue
                alphas[i] += labelMat[j]*labelMat[i]*(alphaJold - alphas[j])#update i by the same amount as j
                                                                        #the update is in the opposite direction
                b1 = b - Ei- labelMat[i]*(alphas[i]-alphaIold)*dataMatrix[i,:]*dataMatrix[i,:].T - labelMat[j]*(alphas[j]-alphaJold)*dataMatrix[i,:]*dataMatrix[j,:].T
                b2 = b - Ej- labelMat[i]*(alphas[i]-alphaIold)*dataMatrix[i,:]*dataMatrix[j,:].T - labelMat[j]*(alphas[j]-alphaJold)*dataMatrix[j,:]*dataMatrix[j,:].T
                if (0 < alphas[i]) and (C > alphas[i]): b = b1
                elif (0 < alphas[j]) and (C > alphas[j]): b = b2
                else: b = (b1 + b2)/2.0
                alphaPairsChanged += 1
                print "iter: %d i:%d, pairs changed %d" % (iter,i,alphaPairsChanged)
        if (alphaPairsChanged == 0): iter += 1
        else: iter = 0
        print "iteration number: %d" % iter
    return b,alphas


# demo run: C=0.6, tol=0.001, stop after 40 full passes with no update
dataArr,labelArr=loadDataSet('testSet.txt')
b,alphas=smoSimple(dataArr, labelArr, 0.6, 0.001, 40)

The basic idea of the SMO algorithm is that an optimization problem with a large number of variables is hard to solve directly, so it takes a simpler route: update only two variables at a time, repeatedly, to move toward a better solution.
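Concretely, with all other multipliers frozen, the equality constraint sum(alpha.*y)=0 ties the two chosen multipliers together, so the two-variable subproblem has a closed-form solution. In the notation of the code above, the step both listings implement is:

    eta       = K11 + K22 - 2*K12
    alpha2New = alpha2Old + y2*(E1 - E2)/eta,   then clipped into [L,H]
    alpha1New = alpha1Old + y1*y2*(alpha2Old - alpha2New)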
I will not repeat the SMO theory here; it can be found via the portal link, a summary written by a student who took this course last year. Thanks to him.
Let me share my experience with SMO. The main difference between the simplified version and the original SMO lies in how the two variables to update are selected.

  • Idea 1
    Select the alpha1 that violates the KKT conditions most severely, then select the alpha2 that maximizes the step size |E1-E2|. But this simple heuristic stalls after a few updates; SMO is apparently not that simple. (A sketch of this heuristic follows the list.)
  • Idea 2
    Select the alpha1 that violates the KKT conditions most severely, then pick alpha2 at random.
  • Idea 3
    Traverse all samples to choose alpha1, and pick alpha2 at random.
    This simplified SMO uses Idea 3, and several guards must be added to avoid getting stuck on one pair of values; see the checks in the code above for the specific guards.
    For advanced readers who want to implement the full version of SMO, refer to Chapter 6 of Machine Learning in Action.
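For reference, here is a hypothetical MATLAB sketch of the second-variable heuristic from Idea 1; the function name selectSecondAlpha and the cached error vector E are my own assumptions, not part of the homework code:

% Hypothetical sketch of Idea 1's second-variable choice: given the error
% of the first variable i, pick the j that maximizes |E(i)-E(j)|.
function j = selectSecondAlpha(i,E)
    deltaE=abs(E(i)-E);    % candidate step sizes |E1-E2| for every j
    deltaE(i)=-inf;        % never pick j==i
    [~,j]=max(deltaE);
end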
