It's Bias and Variance
Recently my nephew got 97 marks in his mathematics test. I told the little kid, "Hey!! You have an accuracy of 97%," to which he asked what that means. I told him that when you divide the number of correctly answered questions by the total number of questions, you get the accuracy. Then I told him about error as well, which is simply the ratio of the number of wrongly answered questions (assuming he attempted every question) to the total number of questions.
Hopefully you now see the relationship between error and accuracy: the higher the accuracy, the lower the error, and vice versa.
Similarly, in machine learning, accuracy is the ratio of correctly classified data points to the total number of data points. Say we are classifying cats and dogs: if the features of a cat are given to the model and it classifies them as a cat, that is a correctly classified point; if it classifies them as a dog, that is a misclassified point. If we have 1000 data points in total and our ML model correctly classifies 970 of them, then it has 97% accuracy and 3% error.
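To make that arithmetic concrete, here is a tiny Python sketch (the `accuracy` helper is just for illustration, not from any library):

```python
# Hypothetical helper, just to show the accuracy/error relationship
def accuracy(correct, total):
    return correct / total

correct, total = 970, 1000
acc = accuracy(correct, total)
err = 1 - acc  # error is simply the complement of accuracy

print(f"Accuracy: {acc:.0%}, Error: {err:.0%}")  # Accuracy: 97%, Error: 3%
```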
In any machine learning model we have 3 types of error (they combine as shown in the formula after this list):
- Irreducible Error
- Bias Error
- Variance Error
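As an aside, these three are not just a loose taxonomy: for regression under squared loss, the expected test error decomposes exactly into them. This is the standard bias-variance decomposition, stated here without derivation:

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\mathrm{Bias}\big[\hat{f}(x)\big]^2}_{\text{bias error}}
  + \underbrace{\mathrm{Var}\big[\hat{f}(x)\big]}_{\text{variance error}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```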
Phewww a lot of terminologies...!!
Let's take a break and discuss something else. Hey, you know what, recently I watched 3 Idiots after a long time. What an evergreen movie it is!!! We can relate it to real life as well. We have friends like Farhan, whose interest lies elsewhere, so he is not that invested in studying and doesn't do well in class or in exams. Friends like Raju, who is very tense about his future, attends every class and tries to mug up every concept, but still ends up with low marks because the exam asks different types of questions than those taught in class. And there are a few like Rancho, who learns well, understands every concept and exception, and performs very well in exams because he is able to generalize.
In ML terms, learning means training and giving an exam means testing, as discussed in Split It Up - Part 1 and Part 2.
The same thing happens in machine learning. Some models do well only on the train set and not on the test set, some do poorly on both, and some are able to perform well on both.
Back to the Terminologies...!!!
Bias is caused by the simplifying assumptions a model makes so that the target function is easier to approximate. A high-bias model is like Farhan, who studies only some parts of the syllabus instead of going through the whole thing. He assumes that only a certain part is important for the exam, so he learns only that part. Likewise, a high-bias model makes naive assumptions and maps a function accordingly, without seeing the global picture.
Variance indicates how much the estimate of the target function would change if a different training set were used. A high-variance model is just like Raju, who has learnt, or rather mugged up, the training data so well that he scores high on the train set, but performs very badly during testing when he sees different/new data, because the model was not able to generalize.
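To see variance in action, here is a small numpy sketch. The noisy sine-curve data and the degree-9 polynomial are my illustrative choices, not anything prescribed by the theory. Fitting the same flexible model on different training sets drawn from the same source gives noticeably different predictions at the same point:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_train_set(n=15):
    """Draw a fresh noisy sample from the same underlying sine curve."""
    x = rng.uniform(0, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)
    return x, y

# Fit the same flexible (degree-9) polynomial on five different training
# sets and watch its prediction at one fixed point jump around.
for i in range(5):
    x, y = make_train_set()
    coeffs = np.polyfit(x, y, deg=9)
    print(f"training set {i}: prediction at x=0.5 -> {np.polyval(coeffs, 0.5):+.2f}")
```

If you swap the degree-9 polynomial for a straight line, the prediction barely moves between runs; that is what low variance looks like.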
Ideally, a model must be optimal, i.e. it must have low error on both the train set and the test set. Cinematically speaking, the model must be like Rancho, who generalizes well, finds the patterns in the data, and accordingly maps a function such that both train and test error are minimal.
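Putting the three friends side by side in code: the sketch below, again with made-up sine-curve data, fits polynomials of increasing degree and compares train and test error. The specific degrees (1, 4, 15) are illustrative picks, not canonical values:

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative data: a noisy sine curve, split into equal train/test halves
x = rng.uniform(0, 1, 60)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, 60)
x_train, y_train = x[:30], y[:30]
x_test, y_test = x[30:], y[30:]

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on the given data."""
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

# degree 1 ~ Farhan (too simple), degree 4 ~ Rancho (balanced),
# degree 15 ~ Raju (too flexible)
for name, degree in [("underfit", 1), ("good fit", 4), ("overfit", 15)]:
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    print(f"{name:9s} (degree {degree:2d}): "
          f"train MSE = {mse(coeffs, x_train, y_train):.3f}, "
          f"test MSE = {mse(coeffs, x_test, y_test):.3f}")
```

You should see the degree-1 model do badly on both sets (underfitting), the degree-15 model do great on train but poorly on test (overfitting), and the degree-4 model keep both errors low.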
- High-bias models lead to underfitting, which basically means more error on the train set as well as the test set.
"Underfitting refers to a model that can neither model the training data nor generalize to new data.An underfit machine learning model is not a suitable model and will be obvious as it will have poor performance on the training data. "by
- High-variance models lead to overfitting, which basically means the model performs very well on the train set but gives a lot of error on the test set.
"Overfitting happens when a model learns the detail and noise in the training data to the extent that it negatively impacts the performance of the model on new data. This means that the noise or random fluctuations in the training data is picked up and learned as concepts by the model. The problem is that these concepts do not apply to new data and negatively impact the models ability to generalize. "by
- A model with high bias or high variance should not be used.
- A model with low bias and low variance is a good, valid model.
- A high-bias model is an overly simple model and leads to underfitting.
- A high-variance model is an overly complex model and leads to overfitting.