Machine Learning | Dummy Variables
When performing regression analysis, it is often useful to model the effect of different categories within the data on the dependent variable. A common technique is to use a dummy variable for each category, where each dummy variable indicates whether a particular observation falls under that category or not (by having a value of either 1 or 0). When the model includes an intercept, one category is typically left out as the baseline so that the dummy variables are not perfectly collinear with the intercept. A multiple linear regression is then run on the independent variables, including the dummy ones.
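The encoding and fitting steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the data are made up and noise-free (wage = 10 + 2 * education - 3 * female), the helper names `encode_dummy`, `solve_linear_system`, and `fit_ols` are hypothetical, and "male" is treated as the baseline category, so a single dummy column is enough alongside the intercept.

```python
# Hypothetical toy data (an assumption for illustration only):
# wage = 10 + 2 * education - 3 * female, with no noise.
categories = ["male", "female", "female", "male", "female", "male"]
education = [10.0, 12.0, 16.0, 14.0, 11.0, 18.0]


def encode_dummy(labels, category):
    """Return 1 where the label equals `category`, else 0."""
    return [1 if label == category else 0 for label in labels]


# "male" is the baseline, so only the "female" dummy is created.
female = encode_dummy(categories, "female")
wage = [10.0 + 2.0 * e - 3.0 * f for e, f in zip(education, female)]


def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(columns, y):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y."""
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in columns]
    return solve_linear_system(xtx, xty)


# Design matrix columns: intercept, education, female dummy.
intercept_col = [1.0] * len(wage)
intercept, slope, female_shift = fit_ols([intercept_col, education, female], wage)

print("intercept:", round(intercept, 6))
print("education slope:", round(slope, 6))
print("female shift:", round(female_shift, 6))
```

Because the toy data contain no noise, the fitted coefficients should match the values used to generate them up to floating-point error. In practice a library routine would normally replace the hand-written solver; it is written out here only to keep the sketch free of dependencies.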
The result is that each dummy variable shifts the y-intercept of the regression line by the value of its coefficient, while the slope stays the same. In the following example from Wikipedia [2], the dummy variable indicating whether an observation was categorized as "male" or "female" changes the y-intercept of the prediction for the "wage" variable by 𝛿_0.
Source: [1]
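The intercept shift can also be written out as a short derivation. The form below is an assumption about the model behind the example: a single "female" dummy, one continuous regressor (here called education), and an error term U. The symbols α_0 and α_1 are introduced only for illustration; 𝛿_0 is the dummy coefficient mentioned above.

```latex
\[
\begin{aligned}
\text{wage} &= \alpha_0 + \delta_0\,\text{female} + \alpha_1\,\text{education} + U \\
\text{female} = 0:\quad \text{wage} &= \alpha_0 + \alpha_1\,\text{education} + U \\
\text{female} = 1:\quad \text{wage} &= (\alpha_0 + \delta_0) + \alpha_1\,\text{education} + U
\end{aligned}
\]
```

Under this assumption the two groups share the slope α_1 and differ only in the intercept, α_0 versus α_0 + 𝛿_0, so the two fitted lines are parallel and separated vertically by 𝛿_0.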
Resources: