Least Square Classification

Shaily jain
1 min readFeb 12, 2021


It is worth understanding why a least-squares-regression-style technique is rarely used for classification.

The three-class problem

The steps involved are:

  • First, convert the response variable y into an indicator matrix Y of shape (number of rows, K), where K is the number of possible classes. Entry (i, k) is 1 if row i belongs to class k, and 0 otherwise.
  • Then apply the usual least-squares formula. The criterion is the sum of squared Euclidean distances of the fitted vectors from their targets. A new observation is classified by computing its fitted vector and assigning it to the closest target; equivalently, predict argmax over k of f_k(x), where f(x) = (1, x.T) W and W = (X.T X)^(-1) X.T Y.
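The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the author's original code; the function and variable names are my own.

```python
import numpy as np

def fit_ls_classifier(X, y, n_classes):
    """Least-squares classification: regress a one-hot indicator
    matrix Y on the features and keep the coefficient matrix W."""
    n = X.shape[0]
    # Indicator (one-hot) response matrix Y of shape (n, K)
    Y = np.zeros((n, n_classes))
    Y[np.arange(n), y] = 1.0
    # Prepend a column of ones so W includes an intercept row
    Xb = np.hstack([np.ones((n, 1)), X])
    # Normal equations: W = (X^T X)^{-1} X^T Y, solved without
    # forming the explicit inverse
    W = np.linalg.solve(Xb.T @ Xb, Xb.T @ Y)
    return W

def predict(W, X):
    """Classify each row to the class with the largest fitted value."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return np.argmax(Xb @ W, axis=1)

# Two well-separated 2-D clusters as a toy example
X = np.array([[0., 0.], [0., 1.], [1., 0.],
              [5., 5.], [5., 6.], [6., 5.]])
y = np.array([0, 0, 0, 1, 1, 1])
W = fit_ls_classifier(X, y, n_classes=2)
print(predict(W, X))   # [0 0 0 1 1 1]
```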

Problems with using the above technique:

  • It is very sensitive to outliers in the data, and the decision boundary can change substantially as points are added.
  • It suffers from the masking problem, which makes it unsuitable when there are three or more classes.
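The masking problem can be shown with a small synthetic example (my own construction, not from the post): three well-separated classes on a line. The fitted score for the middle class stays flat near 1/3, so it is never the largest and that class is never predicted.

```python
import numpy as np

# Three ordered 1-D classes: class 0 near x=1, class 1 near x=5,
# class 2 near x=9, 20 points each
x = np.concatenate([np.linspace(0, 2, 20),
                    np.linspace(4, 6, 20),
                    np.linspace(8, 10, 20)])[:, None]
y = np.repeat([0, 1, 2], 20)

# One-hot indicator targets and a design matrix with an intercept
Y = np.zeros((60, 3))
Y[np.arange(60), y] = 1.0
Xb = np.hstack([np.ones((60, 1)), x])

# Least-squares fit, then classify by the largest fitted value
W = np.linalg.solve(Xb.T @ Xb, Xb.T @ Y)
pred = np.argmax(Xb @ W, axis=1)

# The linear score for class 0 decreases in x, the score for class 2
# increases, and the score for class 1 is flat: class 1 is "masked"
print((pred == 1).sum())   # 0 -- the middle class is never predicted
```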

I really recommend working through a from-scratch implementation to understand the algorithm more closely.

Consider following me if you found this informative.

See you again.
