What are Support Vector Machines?

Support Vector Machines, or SVMs, are widely studied by researchers and students, both for their theoretical foundations and as a building block in Machine Learning. This article gives a brief summary of Machine Learning, followed by an introduction to SVMs and their different types. We will also cover the basic terminology of SVMs and how they work.

What is Machine Learning?

Machine Learning is the field of Computer Science in which machines (typically computers) learn to perform tasks from data. We provide the machine with sample data to learn from, and from these samples it gains “experience.” Based on this experience, it predicts outputs for new inputs, and those new samples can in turn add to its experience. In this way, the “machine learns.” This is the basic concept of Machine Learning.

What are Support Vector Machines?

Support Vector Machines are Supervised Learning algorithms used mainly as classification tools in Machine Learning. Given labeled data points, we determine a “best line” that separates the classes. In technical terms, this “best line” is called a “hyperplane”. The hyperplane separates the classes in n-dimensional space. But how do we construct this hyperplane? We use specific data points, called “support vectors”, that lie closest to it.

Broadly speaking, based on the type of decision boundary, we have two kinds of Support Vector Machines:

Linear SVM: Linear Support Vector Machines are used when the decision boundary is linear. In two dimensions, the border separating the classes can be expressed with a linear equation of the form “y = mx + b”.
Non-linear SVM: As the name suggests, we use Non-Linear Support Vector Machines when the decision boundary is not a straight line. The boundary can be expressed using higher powers and combinations of the input variables, but not by the equation “y = mx + b”.
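
To make the difference concrete, here is a minimal Python sketch of the two kinds of boundary. It is only an illustration: the function names, the weights `w`, the offset `bias`, and the circle `radius` are assumptions chosen for this example and are not the output of a trained SVM. The line y = mx + b is written in the equivalent form m*x - y + b = 0, so the sign of the left-hand side tells us on which side of the line a point lies.

```python
def linear_side(point, w, bias):
    """Return +1 or -1 depending on the side of the line w[0]*x + w[1]*y + bias = 0."""
    x, y = point
    score = w[0] * x + w[1] * y + bias
    return 1 if score >= 0 else -1


def circular_side(point, radius):
    """Non-linear boundary: +1 outside a circle centred at the origin, -1 inside."""
    x, y = point
    score = x * x + y * y - radius * radius
    return 1 if score >= 0 else -1


# Hypothetical line y = 1*x + 0, rewritten as 1*x - 1*y + 0 = 0.
w, bias = (1.0, -1.0), 0.0
print(linear_side((3.0, 1.0), w, bias))   # point below the line
print(linear_side((1.0, 3.0), w, bias))   # point above the line

# Hypothetical circular boundary of radius 1; no single straight line reproduces it.
print(circular_side((0.5, 0.5), 1.0))     # point inside the circle
print(circular_side((2.0, 0.0), 1.0))     # point outside the circle
```

The linear rule only needs a weighted sum of the coordinates, while the circular rule needs squared terms, which is the sense in which a non-linear boundary cannot be written as “y = mx + b”.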

Let’s have a look at a few terms next.

Terminologies

Have a look at the picture below:

Consider the blue squares as class “A” and the red circles as class “B”. Let’s look at the individual terms now:

Maximum Margin Hyperplane (A)

This is the “best line”. The Maximum Margin Hyperplane is the line that is equidistant from the closest points of the two opposite classes. For instance, in the above diagram, consider the red circle on line C and the blue square on line B. The hyperplane is equidistant from both of these points. It forms the boundary that separates class A from class B. A natural question then arises: why do we need lines B and C?

Positive Margin Hyperplane (B)

The Positive Margin Hyperplane is the line that passes through the closest point of one class and runs parallel to the Maximum Margin Hyperplane. In this case, it is class A that falls on the positive side of the hyperplane.

Negative Margin Hyperplane (C)

The Negative Margin Hyperplane is the line that passes through the closest point of the other class and runs parallel to the Maximum Margin Hyperplane. In this case, it is class B that falls on the negative side of the hyperplane.

Maximum Margin (D)

The “Maximum Margin” denotes the perpendicular distance between the Positive and the Negative Margin Hyperplanes. In other words, it is the width of the gap between the closest points of the two classes, measured perpendicular to the Maximum Margin Hyperplane.

The red circle on the Negative Margin Hyperplane and the blue square on the Positive Margin Hyperplane form the “Support Vectors”.
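
These distances can be computed directly once a hyperplane is known. The sketch below assumes the hyperplane is written as w · x + bias = 0 and is scaled so that the support vectors score exactly +1 or -1. This is a common convention, but it is an assumption here and is not stated in the figure. The weights and points are made-up values for illustration.

```python
import math


def signed_distance(point, w, bias):
    """Perpendicular distance from point to the hyperplane w . x + bias = 0.

    Positive on the side of the Positive Margin Hyperplane, negative on the other.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    score = sum(wi * xi for wi, xi in zip(w, point)) + bias
    return score / norm


def margin_width(w):
    """Distance between the two margin hyperplanes w . x + bias = +1 and -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical hyperplane: the vertical line x = 3, written as 0.5*x + 0*y - 1.5 = 0.
w, bias = (0.5, 0.0), -1.5

support_a = (5.0, 1.0)   # assumed support vector of class A (score +1)
support_b = (1.0, 5.0)   # assumed support vector of class B (score -1)

print(signed_distance(support_a, w, bias))   # distance of A's support vector
print(signed_distance(support_b, w, bias))   # same magnitude, opposite sign
print(margin_width(w))                       # total width between lines B and C
```

With these values both support vectors lie 2 units from the hyperplane, on opposite sides, so the margin is 4 units wide.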

Application of Support Vector Machines

Let’s have a look at how Support Vector Machines work. Assume we have two classes. The goal is to determine a separating boundary, or “decision boundary”, between them. We also need to classify new samples based on this decision boundary.

Given the dataset, there can be multiple lines that separate the two classes. So, how do we determine the best line? We must find the line that places a “maximum margin” between the two classes. This is the case when the line is equidistant from the points of each class that lie closest to it. This line forms the hyperplane. For two-dimensional classification, the hyperplane is a line. For three-dimensional classification, it is a plane, and so on. Once we define this hyperplane, the space on either side of it, up to the closest data points, is a “gutter”. A new point that falls in this region is classified with low confidence. Points beyond the Positive and Negative Margin Hyperplanes can be classified as their respective classes with high confidence.
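
As a rough sketch of this idea in Python, the function below labels a new point by the sign of its score and also reports whether the point falls inside the gutter. It assumes a hyperplane of the form w · x + bias = 0 that is scaled so the margin hyperplanes sit at scores +1 and -1. The function name, weights, and sample points are hypothetical.

```python
def classify_with_gutter(point, w, bias):
    """Return (label, in_gutter) for a point, given the hyperplane w . x + bias = 0.

    label is "A" on the positive side and "B" on the negative side.
    in_gutter is True when the point lies strictly between the margin hyperplanes.
    """
    score = sum(wi * xi for wi, xi in zip(w, point)) + bias
    label = "A" if score >= 0 else "B"
    in_gutter = -1.0 < score < 1.0
    return label, in_gutter


# Hypothetical hyperplane x = 3 with margin hyperplanes at x = 4 and x = 2.
w, bias = (1.0, 0.0), -3.0

for sample in [(5.0, 1.0), (3.4, 0.0), (0.0, 0.0)]:
    print(sample, classify_with_gutter(sample, w, bias))
```

The first and last samples lie beyond the margin hyperplanes, while the middle one sits inside the gutter and is the kind of point we would treat with caution.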

We use the points closest to the line, the Support Vectors, to determine the Maximum Margin Hyperplane. This is where the algorithm gets its name: Support Vector Machines.
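
To see how support vectors pin down the hyperplane, consider the simplest possible case: exactly one point per class. The maximum margin hyperplane is then the perpendicular bisector of the segment joining the two points. The sketch below covers only this toy case and is not a general SVM solver. Real datasets need an optimization procedure that this article does not cover. The function name and the points are assumptions for illustration.

```python
def hyperplane_from_two_points(positive, negative):
    """Return (w, bias) such that w . positive + bias = +1 and w . negative + bias = -1.

    Valid only for the toy case of one support vector per class.
    """
    diff = [p - n for p, n in zip(positive, negative)]
    squared_length = sum(d * d for d in diff)
    if squared_length == 0:
        raise ValueError("the two points must be different")
    # w points from the negative point towards the positive point.
    w = [2.0 * d / squared_length for d in diff]
    bias = 1.0 - sum(wi * pi for wi, pi in zip(w, positive))
    return w, bias


# Hypothetical support vectors: one from class A, one from class B.
w, bias = hyperplane_from_two_points((4.0, 2.0), (2.0, 2.0))
print(w, bias)

# Check: the midpoint (3, 2) should lie exactly on the hyperplane (score 0).
midpoint_score = w[0] * 3.0 + w[1] * 2.0 + bias
print(midpoint_score)
```

Moving either of the two points changes `w` and `bias`, which mirrors the fact that the support vectors alone determine the hyperplane.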

Conclusion

I hope this article provided a brief introduction to Support Vector Machines. We covered a short theoretical introduction to Support Vector Machines and their underlying concepts. I believe this kind of understanding is essential for every Machine Learning enthusiast who wants to truly grasp the core of every tool they work with, because that is precisely how you develop as a researcher. Grasp the concepts and then implement them in practice. This will enhance both your knowledge and your coding skills. Happy Coding!
