Cross Validation
Here we discuss methods that help you choose the parameters you would otherwise have to guess at, and whose values no one can tell you in advance.
K-Fold Cross Validation:
K-Fold is popular and easy to understand, and it generally gives a less biased estimate of model performance compared to other methods, because it ensures that every observation from the original dataset has a chance of appearing in both the training and the test set. This makes it one of the best approaches when the input data is limited. The method follows these steps:
- Randomly split your entire dataset into k "folds"
- For each fold, build your model on the other k - 1 folds of the dataset, then test the model on the kth fold to check its effectiveness
- Record the error of the model's predictions on that fold
- Repeat this until each of the k-folds has served as the test set
- The average of your k recorded errors is called the cross-validation error and will serve as your performance metric for the model
This takes more time, but every part of the data is used for training and every part is used for validation.
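As a concrete illustration, here is a minimal sketch of the steps above using scikit-learn's KFold. The synthetic dataset, the linear regression model, and mean squared error as the metric are assumptions chosen purely for illustration; any model and error metric can be slotted into the same loop.

```python
# A minimal sketch of the k-fold procedure described above (assumes scikit-learn).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

# Synthetic data for illustration only.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

k = 5
kf = KFold(n_splits=k, shuffle=True, random_state=0)  # randomly split into k folds

fold_errors = []
for train_idx, test_idx in kf.split(X):
    model = LinearRegression()
    model.fit(X[train_idx], y[train_idx])        # build the model on k - 1 folds
    preds = model.predict(X[test_idx])           # test on the held-out kth fold
    fold_errors.append(mean_squared_error(y[test_idx], preds))  # record the error

cv_error = np.mean(fold_errors)  # cross-validation error: average of the k recorded errors
print(f"Per-fold MSE: {fold_errors}")
print(f"Cross-validation error (mean MSE): {cv_error:.2f}")
```

If you do not need the per-fold details, scikit-learn's `cross_val_score` wraps this whole loop in a single call; the explicit loop above just makes each step of the procedure visible.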