Least Squares Classification
Feb 12, 2021
It is good to know why, in most cases, we don’t use a least-squares-regression-style technique for classification.
The steps involved are:
- First, convert the response variable into an indicator matrix of shape (no. of rows, K), where K is the number of possible classes for y. An entry (row, col) takes the value 1 if that data row belongs to the class for that column; otherwise it is 0.
- Then simply apply the least squares formula. The criterion is the sum of squared Euclidean distances of the fitted vectors from their targets. A new observation is classified by computing its fitted vector and assigning it to the closest target. Equivalently, we can take argmax over k of f_k(x), where f(x) = (1, xᵀ) W and W = (Xᵀ X)⁻¹ Xᵀ Y. A minimal sketch follows this list.
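Here is a minimal from-scratch NumPy sketch of the two steps above. The function names (`fit_least_squares`, `predict`) are my own for illustration; the pseudoinverse is used in place of a plain inverse so the sketch also runs when XᵀX is singular.

```python
import numpy as np

def fit_least_squares(X, y, n_classes):
    """Fit W = (X~^T X~)^{-1} X~^T Y, where X~ has a bias column
    prepended and Y is the one-hot indicator matrix of y."""
    n = X.shape[0]
    # Step 1: indicator matrix Y of shape (no. of rows, K)
    Y = np.zeros((n, n_classes))
    Y[np.arange(n), y] = 1
    # Step 2: least squares fit of all K indicator columns at once
    Xb = np.hstack([np.ones((n, 1)), X])        # prepend bias term
    W = np.linalg.pinv(Xb.T @ Xb) @ Xb.T @ Y    # (d+1, K) weight matrix
    return W

def predict(X, W):
    """Classify by argmax over k of the fitted vector f(x) = (1, x^T) W."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return np.argmax(Xb @ W, axis=1)
```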
Problems with using the above technique:
- It is very sensitive to the presence of outliers in the data, and the decision boundary can change a lot when points are added.
- It is associated with the masking problem, which can make it unsuitable for classification with three or more classes (a small demonstration follows this list).
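As a quick check of the masking claim, here is a small synthetic example of my own (not from the original post), reusing `fit_least_squares` and `predict` from the sketch above: three 1-D Gaussian classes laid out along a line. The fitted indicator for the middle class comes out nearly flat, so it is almost never the argmax.

```python
import numpy as np

rng = np.random.default_rng(0)
# Three classes centered at -4, 0, and 4 on a line, 200 points each
X = np.concatenate([rng.normal(m, 0.5, 200) for m in (-4, 0, 4)]).reshape(-1, 1)
y = np.repeat([0, 1, 2], 200)

W = fit_least_squares(X, y, n_classes=3)
pred = predict(X, W)
for k in range(3):
    print(f"class {k}: recall = {np.mean(pred[y == k] == k):.2f}")
# Typically prints recall near 1.0 for the two outer classes and
# near 0.0 for class 1 in the middle -- the masking effect.
```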
I really recommend working through the code from scratch to understand the algorithm more closely.
Consider following me if you find this informative.
See you again.