AI Sidewalk #3: Delving Into Binary SVMs (Part 2)

Shantanu Phadke

Here’s a link to the part 1 article:

Last time we saw the following optimization:
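That equation did not survive the formatting here. For reference, the standard hard-margin SVM primal (the optimization Part 1 arrived at) takes the form:

```latex
\begin{aligned}
\min_{w,\,b} \quad & \tfrac{1}{2}\lVert w \rVert^2 \\
\text{s.t.} \quad & y_i\,(w^\top x_i + b) \ge 1, \qquad i = 1, \dots, n
\end{aligned}
```

Here each training point is $x_i$ with label $y_i \in \{-1, +1\}$, and $(w, b)$ define the separating hyperplane.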

The Lagrangian

In general, the Lagrangian is a tool for solving constrained optimization problems by folding the constraints into the objective with multiplier terms. Suppose we are given an optimization in the following general form:
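The general form and its Lagrangian did not survive extraction; the standard statement is:

```latex
\begin{aligned}
\min_{x} \quad & f(x) \\
\text{s.t.} \quad & g_i(x) \le 0, \qquad i = 1, \dots, m \\
& h_j(x) = 0, \qquad j = 1, \dots, p
\end{aligned}
```

with the Lagrangian

```latex
\mathcal{L}(x, \lambda, \nu) \;=\; f(x) \;+\; \sum_{i=1}^{m} \lambda_i\, g_i(x) \;+\; \sum_{j=1}^{p} \nu_j\, h_j(x), \qquad \lambda_i \ge 0
```

Minimizing $\mathcal{L}$ over $x$ for fixed multipliers $(\lambda, \nu)$ gives the dual function, and maximizing that over $(\lambda, \nu)$ gives the dual problem.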

Duality Gap and Strong Duality

Theoretically, if the solution to the original (primal) optimization above is s1 and the solution to its dual is s2, we define the optimal duality gap as the difference between these two solutions (s1 − s2). Furthermore, strong duality is when the solutions to the primal and dual are equal (s1 = s2).
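In symbols, writing $p^*$ for the primal optimum (s1 above) and $d^*$ for the dual optimum (s2), weak duality guarantees the gap is never negative:

```latex
p^* - d^* \;\ge\; 0 \quad \text{(weak duality)}, \qquad\qquad p^* = d^* \quad \text{(strong duality)}
```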

KKT (Karush-Kuhn-Tucker) Conditions

Given that strong duality holds, we can use the Karush-Kuhn-Tucker (KKT) conditions to solve the dual problem we derive.
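The conditions themselves were lost in formatting. For a problem with inequality constraints $g_i(x) \le 0$ and equality constraints $h_j(x) = 0$, the standard KKT conditions at an optimum $(x^*, \lambda^*, \nu^*)$ are:

```latex
\begin{aligned}
&\nabla f(x^*) + \sum_{i} \lambda_i^* \nabla g_i(x^*) + \sum_{j} \nu_j^* \nabla h_j(x^*) = 0 && \text{(stationarity)} \\
&g_i(x^*) \le 0, \qquad h_j(x^*) = 0 && \text{(primal feasibility)} \\
&\lambda_i^* \ge 0 && \text{(dual feasibility)} \\
&\lambda_i^*\, g_i(x^*) = 0 && \text{(complementary slackness)}
\end{aligned}
```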

First, we start by converting all of the constraints into the general form we saw above:


What is the Stationarity KKT Condition for the above case? What about the Complementary Slackness KKT Condition?
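As a check on those two questions: applying the conditions to the hard-margin SVM, with the constraints written as $g_i(w, b) = 1 - y_i(w^\top x_i + b) \le 0$ and multipliers $\alpha_i$, the two conditions work out to:

```latex
\begin{aligned}
&\text{Stationarity:} && w = \sum_{i} \alpha_i\, y_i\, x_i, \qquad \sum_{i} \alpha_i\, y_i = 0 \\
&\text{Complementary slackness:} && \alpha_i \left( 1 - y_i\,(w^\top x_i + b) \right) = 0 \quad \forall i
\end{aligned}
```

Complementary slackness is what makes $\alpha_i$ nonzero only for points on the margin, i.e. the support vectors.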

Big Picture

The whole point of going through the calculations above was to work through the math behind transforming an optimization over multiple variables into one over a single vector of dual multipliers. This result is much easier to optimize than what we started off with!

There will be a separate post in the future covering the concept of kernels in more depth, but essentially kernels are a means of implicitly “transforming” the given data into a space where it is more linearly separable.

Polynomial Kernel

Gaussian Kernel

Radial Basis Function (RBF) Kernel
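The kernel formulas themselves did not survive here. As a rough sketch (note that the Gaussian and RBF kernels named above conventionally refer to the same function), the polynomial and Gaussian/RBF kernels can be written in NumPy as follows; the `degree`, `coef0`, and `gamma` defaults are illustrative choices, not values from the original post:

```python
import numpy as np

def polynomial_kernel(x, z, degree=3, coef0=1.0):
    """Polynomial kernel: K(x, z) = (x . z + coef0) ** degree."""
    return (np.dot(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian / RBF kernel: K(x, z) = exp(-gamma * ||x - z||^2)."""
    diff = np.asarray(x) - np.asarray(z)
    return np.exp(-gamma * np.dot(diff, diff))

x = np.array([1.0, 2.0])
z = np.array([1.0, 2.0])
# Identical points: polynomial gives (5 + 1)**3 = 216, RBF gives exp(0) = 1
print(polynomial_kernel(x, z), rbf_kernel(x, z))
```

Each function computes a similarity between two points without ever constructing the high-dimensional feature map explicitly, which is exactly the trick the dual formulation exploits, since the data only appears through inner products.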

Next time we will utilize the theory we have covered in these past two posts (including the bit about kernels above) to code up a binary SVM from scratch, train and test it on the University of Wisconsin Breast Cancer Dataset, and perform an analysis of the results!
