Data Science: Regression & Its Variants
03 Jun 2021
Data science is about analyzing data, finding patterns, and using those patterns to predict outcomes. The more accurately a pattern captures the underlying relationship, the more reliable the prediction, so much of the effort goes into getting the analysis right. Coding skill is essential for a data scientist, but it is not enough on its own: statistics and critical thinking matter just as much. Our online course on Data Science using Python covers the subject from the basics through to current tools and techniques.
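Regression models the relationship between a dependent variable and one or more independent variables. Let’s look at a few commonly used regression forms: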
- Linear Regression: This is the simplest and most popular form of regression. It models one dependent variable as a straight-line function of the independent variable; in simple linear regression there is exactly one independent variable.
- Multiple Regression: This is similar to linear regression, except that there is more than one independent variable. The extra variables can improve the fit, although predictions on new data improve only if those variables carry real information about the dependent variable.
- Logistic Regression: This form estimates the probability of a class or event, such as ‘Pass or Fail’ or ‘Yes or No’, and is widely used for classification problems. Logistic regression can be binary, ordinal, or multinomial.
- Stepwise Regression: This form helps with high-dimensional data sets that have many candidate independent variables. The independent variables are selected automatically: at each step, a variable is considered for addition to or removal from the model based on a specified criterion.
- Ridge Regression: Ridge regression is used when the independent variables are highly correlated (multicollinearity). It applies L2 regularization, which penalizes the sum of the squared coefficients. This shrinks the coefficients toward zero but generally not exactly to zero. Unlike ordinary least squares estimates, ridge estimates accept a small amount of bias in exchange for smaller standard errors.
- LASSO (Least Absolute Shrinkage & Selection Operator) Regression: Unlike ridge regression, LASSO can shrink some coefficients exactly to zero. It uses L1 regularization, which penalizes the sum of the absolute values of the coefficients. The result is a simple, sparse subset of predictors.
- ElasticNet Regression: This is a hybrid of ridge and LASSO regression, i.e. it combines the L1 and L2 penalties. Though it inherits advantages from both models, it might also suffer from double shrinkage, since the coefficients are shrunk by both penalties.
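To make the simplest case concrete, here is a minimal sketch of simple linear regression using the closed-form least-squares formulas. The function name `fit_simple_linear` and the hours-versus-score data are made up for illustration; in practice you would more likely reach for a library.

```python
def fit_simple_linear(xs, ys):
    """Least-squares slope and intercept for one independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied vs. exam score (invented for this example).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 61.0, 63.0, 70.0]

slope, intercept = fit_simple_linear(hours, scores)
predicted = intercept + slope * 6.0  # predicted score for 6 hours of study

print(slope, intercept)  # prints: 4.5 46.5
print(predicted)         # prints: 73.5
```

The straight line here is score = 46.5 + 4.5 × hours, which is exactly the "linear shape" described in the first bullet.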
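The idea that logistic regression outputs a probability rather than a number on a line can be sketched in a few lines. This is a rough illustration only: it fits a binary model with plain gradient descent, and the helper names, learning rate, epoch count, and pass/fail data are all assumptions chosen for the example rather than recommended settings.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, labels, learning_rate=0.1, epochs=5000):
    """Binary logistic regression with one feature, fitted by gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, labels):
            error = sigmoid(w * x + b) - y  # predicted probability minus true label
            grad_w += error * x
            grad_b += error
        # Step against the average gradient of the log-loss.
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


# Hypothetical data: hours studied and whether the student passed (1) or failed (0).
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
passed = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(hours, passed)
p_low = sigmoid(w * 1.0 + b)   # estimated probability of passing after 1 hour
p_high = sigmoid(w * 4.0 + b)  # estimated probability of passing after 4 hours

print(f"P(pass | 1 hour)  = {p_low:.2f}")
print(f"P(pass | 4 hours) = {p_high:.2f}")
```

Turning the probability into a ‘Pass or Fail’ answer is then a matter of picking a cut-off, commonly 0.5.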
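The difference between the L1 and L2 penalties is easier to see with numbers. The sketch below is a small coordinate-descent solver, written for illustration only, that minimizes (1/2n)·Σ(y − x·w)² + alpha·l1_ratio·‖w‖₁ + 0.5·alpha·(1 − l1_ratio)·‖w‖₂². Setting `l1_ratio` to 0 gives ridge, 1 gives LASSO, and anything in between gives an elastic net. The function names, the `alpha` and `l1_ratio` parameterization, and the tiny data set are assumptions made for this example, and there is no intercept because the data is already centred.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(rows, targets, alpha, l1_ratio, sweeps=100):
    """Coordinate descent for least squares with combined L1 and L2 penalties."""
    n = len(rows)
    p = len(rows[0])
    weights = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = 0.0
            for row, y in zip(rows, targets):
                others = sum(row[k] * weights[k] for k in range(p) if k != j)
                rho += row[j] * (y - others)
            rho /= n
            scale = sum(row[j] ** 2 for row in rows) / n
            # L1 part: soft-thresholding. L2 part: a larger denominator.
            weights[j] = soft_threshold(rho, alpha * l1_ratio) / (
                scale + alpha * (1.0 - l1_ratio)
            )
    return weights


# Made-up, centred data: the target depends strongly on feature 0, barely on feature 1.
rows = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
targets = [3.0 * a + 0.1 * b for a, b in rows]

ols = fit_elastic_net(rows, targets, alpha=0.0, l1_ratio=0.0)    # no penalty
ridge = fit_elastic_net(rows, targets, alpha=0.5, l1_ratio=0.0)  # L2 only
lasso = fit_elastic_net(rows, targets, alpha=0.5, l1_ratio=1.0)  # L1 only
enet = fit_elastic_net(rows, targets, alpha=0.5, l1_ratio=0.5)   # mix of both

for name, w in [("ols", ols), ("ridge", ridge), ("lasso", lasso), ("enet", enet)]:
    print(name, [round(v, 3) for v in w])
```

On this data the unpenalized fit recovers roughly [3.0, 0.1]. Ridge shrinks both coefficients (to about [2.0, 0.067]) but leaves the weak one non-zero, while LASSO and the elastic net set the weak coefficient to exactly 0.0, which is the sparsity described above.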