What is Multiple Linear Regression?
Multiple linear regression, commonly known as multiple regression, is one of the most common forms of regression analysis. It is a statistical technique that uses several explanatory (independent) variables to predict the outcome of a response (dependent) variable. In simple words, it is a predictive analysis tool that explains the relationship between one continuous dependent variable and two or more (multiple) independent variables.
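As a quick illustration, here is a minimal sketch of fitting a multiple regression in Python with scikit-learn. The two-predictor setup and all of the numbers are invented for this example; nothing in the text prescribes this library or this data.

```python
# Minimal sketch: fitting a multiple linear regression with scikit-learn.
# The data is made up for illustration; two independent variables predict
# one continuous dependent variable.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 5.0]])
y = np.array([4.1, 5.0, 10.2, 11.1, 14.9])

model = LinearRegression().fit(X, y)
print("intercept:", model.intercept_)
print("coefficients:", model.coef_)
print("prediction for x1=6, x2=4:", model.predict([[6.0, 4.0]]))
```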
Assumptions Of Multiple Linear Regression
In statistics, nearly every model rests on some underlying assumptions, and the multiple regression model is no exception. Some of its assumptions are listed below (a diagnostic sketch follows the list):
- The independent variables should not be too highly correlated with one another.
- There must be a linear relationship between the dependent variable and each independent variable.
- Regression residuals must follow a normal distribution with a mean of 0.
- The residuals must be homoscedastic, i.e. have constant variance.
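The sketch below shows one plausible way to check these assumptions in Python with statsmodels. The synthetic data, the choice of tests, and the rule-of-thumb VIF threshold in the comments are assumptions of this example, not part of the original text.

```python
# Sketch: rough diagnostics for the multiple regression assumptions.
# All data here is synthetic; thresholds are common rules of thumb.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                     # two independent variables
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=100)

X_const = sm.add_constant(X)                      # adds the intercept column
fit = sm.OLS(y, X_const).fit()

# Multicollinearity: a VIF well above ~5-10 signals highly correlated predictors.
for i in range(1, X_const.shape[1]):
    print("VIF for predictor", i, ":", variance_inflation_factor(X_const, i))

# Residuals: their mean should be close to 0.
print("residual mean:", fit.resid.mean())

# Homoscedasticity: Breusch-Pagan test; a small p-value suggests
# non-constant variance.
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(fit.resid, X_const)
print("Breusch-Pagan p-value:", lm_pvalue)
```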
What Can You Conclude From Multiple Linear Regression?
Multiple linear regression is a form of regression in which the dependent variable shows a linear relationship with two or more independent variables. Multiple regression can be non-linear as well; in that case, the dependent and independent variables do not follow a straight-line relationship.
Both kinds of multiple regression, linear and non-linear, graphically track a specified response using multiple (two or more) variables. While both trace the same thing, non-linear regression is more difficult to execute: the form of a multiple non-linear regression is usually arrived at through trial and error.
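To illustrate the non-linear case, the sketch below uses one common approach (assumed here, not prescribed by the text): expanding the predictors with quadratic terms and fitting by least squares, then comparing against a plain straight-line fit on synthetic data.

```python
# Sketch: a non-linear multiple regression via polynomial feature expansion.
# Data is synthetic; the quadratic form is an assumption of this example.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(200, 2))
# A curved (quadratic) relationship between y and the predictors.
y = 1.0 + X[:, 0] ** 2 - X[:, 1] + rng.normal(scale=0.1, size=200)

curved = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)
straight = LinearRegression().fit(X, y)
print("R^2 with quadratic terms:", curved.score(X, y))      # close to 1
print("R^2 with straight-line fit:", straight.score(X, y))  # noticeably lower
```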
Simple linear regression predicts the value of one variable from the known value of a single other variable. Unlike it, the multiple regression model draws on several explanatory variables, as the comparison below illustrates.
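This is a small hypothetical sketch: the same synthetic data is fit once with one predictor (simple regression) and once with two (multiple regression). The names ad_spend, store_count, and sales are made up for the example.

```python
# Sketch: simple regression (one predictor) vs. multiple regression (two).
# All names and data below are hypothetical.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
ad_spend = rng.uniform(0, 10, size=(50, 1))
store_count = rng.uniform(1, 20, size=(50, 1))
sales = 3.0 * ad_spend[:, 0] + 1.5 * store_count[:, 0] + rng.normal(size=50)

both = np.hstack([ad_spend, store_count])
simple = LinearRegression().fit(ad_spend, sales)     # one explanatory variable
multiple = LinearRegression().fit(both, sales)       # two explanatory variables
print("simple R^2:  ", simple.score(ad_spend, sales))
print("multiple R^2:", multiple.score(both, sales))  # higher: more information
```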
The Formula for Multiple Linear Regression
yi = β0 + β1xi1 + β2xi2 + ... + βpxip + ϵ
where:
i = 1, ..., n, indexing the n observations
yi = dependent variable
β0 = y-intercept (constant term)
β1 and β2 = regression coefficients that give the change in y for a one-unit change in xi1 and xi2, respectively
βp = slope coefficients for the explanatory (independent) variables
xi1 and xi2 = explanatory variables
ϵ = the model's error term (also known as the residuals)
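To make the formula concrete, here is a worked sketch that generates data from known coefficients and then recovers them by ordinary least squares with numpy; all numbers are synthetic.

```python
# Worked sketch of the formula:
#   y_i = beta_0 + beta_1*x_i1 + ... + beta_p*x_ip + eps_i
# Coefficients are chosen, data is simulated, then the betas are re-estimated.
import numpy as np

rng = np.random.default_rng(3)
n, p = 200, 3
X = rng.normal(size=(n, p))                    # x_i1 ... x_ip for each observation i
true_beta = np.array([2.0, 0.5, -1.0, 3.0])    # beta_0, beta_1, beta_2, beta_3
eps = rng.normal(scale=0.2, size=n)            # error term (residuals)

design = np.hstack([np.ones((n, 1)), X])       # column of 1s carries beta_0
y = design @ true_beta + eps                   # apply the formula observation-wise

beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
print("estimated coefficients:", beta_hat)     # should be close to true_beta
```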