What are Hyper Parameter Tuning Techniques in Machine Learning?

Machine learning models learn to make predictions from the data provided to them by adjusting internal weights. Alongside these learned weights, every model also has settings that machine learning developers must explicitly define and fine-tune to improve the algorithm’s efficiency and produce more accurate results; the process of choosing those settings is called hyperparameter tuning.

Introduction

Hyperparameters are a property of the model itself and need to be specified when instantiating a new model. However, model parameters are not model hyperparameters, and vice versa. Developers often confuse the two, so the author has tried to draw a contrast between both to better understand which values the developer sets to steer the model’s learning and which values the model learns on its own.

Difference between Parameters and Hyper Parameters

Model parameters are what the machine learning model learns on its own, without external interference from the developers. For example, suppose there is a neural network model with several hidden layers. This model learns the weights applied to the inputs before they are passed through the activation function to create the next layer. The model learns these weights on its own by minimizing a loss function. In the same neural network, a bias term is also added to each neuron; it too is learned during training and shifts the neuron’s activation.

Similarly, models built with other algorithms have specific parameters that they learn on their own: centroids in k-means clustering, coefficients in support vector machines, and coefficients in regression models, to name a few. The model learns these parameters from the data and the values presented in it. They are the nucleus of the model’s working and contribute directly to the model’s accuracy. Crucially, model parameters are learned during the training process; they are not set by hand before the model starts making predictions.

On the other hand, hyperparameters are the settings that the developer or machine learning engineer defines explicitly to steer the model’s training. These values are set before training begins. There is no pre-defined technique for choosing hyperparameters: engineers set them by trial and error, repeatedly evaluating the model’s accuracy and loss each time a hyperparameter is tuned.
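To make the contrast concrete, here is a minimal sketch using scikit-learn’s LogisticRegression on a toy dataset; the dataset and the hyperparameter values (C, max_iter) are illustrative choices, not recommendations:

```python
# Minimal sketch: hyperparameters vs. model parameters in scikit-learn.
# The dataset and the values of C and max_iter are illustrative only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=4, random_state=42)

# Hyperparameters: chosen by the developer *before* training.
model = LogisticRegression(C=0.5, max_iter=200)

# Model parameters: learned from the data *during* training.
model.fit(X, y)
print("Learned coefficients:", model.coef_)    # model parameters
print("Learned intercept:", model.intercept_)  # the learned bias term
```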

Categories of Hyperparameters

Optimization Hyperparameters

These hyperparameters serve the general purpose of making the model’s training more optimized: they control the optimization process itself rather than the model’s structure. Typical examples are the learning rate, the number of training epochs, and the batch size. They are explicitly set to increase the general efficiency of the model and contribute to its improved accuracy.
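As a small, hedged illustration, scikit-learn’s SGDClassifier exposes several optimization hyperparameters directly; the specific values below are arbitrary examples:

```python
# Optimization hyperparameters exposed by scikit-learn's SGDClassifier.
# The specific values chosen here are arbitrary examples.
from sklearn.linear_model import SGDClassifier

model = SGDClassifier(
    learning_rate="constant",  # learning-rate schedule
    eta0=0.01,                 # the initial learning rate itself
    max_iter=1000,             # maximum number of passes (epochs) over the data
    tol=1e-3,                  # stopping tolerance for the optimizer
)
```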

Model-Specific Hyperparameters

Model-specific hyperparameters, as the name suggests, are specific to certain kinds of models. For example, for a neural network, the hyperparameters can be the number of hidden layers, the number of neurons in each layer, and so on. Similarly, the k-nearest neighbors algorithm uses the value of k and the distance metric (Euclidean, Manhattan, etc.) as its hyperparameters.
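A minimal sketch of setting those two k-nearest neighbors hyperparameters in scikit-learn; the chosen values are only examples:

```python
# Model-specific hyperparameters for k-nearest neighbors in scikit-learn.
# The values of n_neighbors and metric are illustrative choices.
from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier(
    n_neighbors=5,       # the value of k
    metric="manhattan",  # the distance metric ("euclidean", "manhattan", ...)
)
```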

Techniques to Perform Hyperparameter Tuning

Grid Search

It is humanly impossible to explore all permutations and combinations of hyperparameters for a model by hand; doing so would take a significant amount of time. The scikit-learn library offers an exhaustive grid search, GridSearchCV, that automates checking all possible combinations of the supplied hyperparameter values. This technique is very effective when a model has few hyperparameters. As the number of hyperparameters grows, the number of possible combinations, and with it the time to run a grid search, increases vastly.
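Here is a minimal GridSearchCV sketch over two k-nearest neighbors hyperparameters; the grid and the Iris dataset are illustrative:

```python
# Minimal grid search sketch; the grid and dataset are illustrative.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

param_grid = {
    "n_neighbors": [3, 5, 7, 9],
    "metric": ["euclidean", "manhattan"],
}

# Exhaustively evaluates every combination (4 x 2 = 8) with 5-fold CV.
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X, y)

print("Best hyperparameters:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)
```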

Randomized Search

To curb the time and computation consumed by a grid search, another algorithm was developed: randomized search. Instead of evaluating all possible combinations of hyperparameters, it evaluates a fixed number of randomly chosen combinations. Although the run time is reduced, a randomized search is not guaranteed to find the optimal hyperparameter values.
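The corresponding RandomizedSearchCV sketch; n_iter caps how many random combinations are tried, and the distributions below are illustrative:

```python
# Minimal randomized search sketch; distributions and values are illustrative.
from scipy.stats import randint
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

param_distributions = {
    "n_neighbors": randint(1, 30),         # k sampled uniformly from [1, 30)
    "metric": ["euclidean", "manhattan"],  # sampled uniformly from the list
}

# Evaluates only n_iter random combinations instead of the full grid.
search = RandomizedSearchCV(
    KNeighborsClassifier(),
    param_distributions,
    n_iter=10,
    cv=5,
    random_state=42,
)
search.fit(X, y)

print("Best hyperparameters:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)
```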

Conclusion

Learning about hyperparameter tuning is essential when working with machine learning, deep learning, and computer vision, as it enables you to get the most accurate results and predictions. The author encourages readers to use the GridSearchCV and RandomizedSearchCV classes from the scikit-learn library and to analyze and compare their results.

Read more about learning paths and Python at codedamn here

Happy Learning!
