Machine Learning | Dummy Variables

When performing regression analysis, it is often desirable to model the effect of different categories within the data on the dependent variable. In these cases, a common technique is to use dummy variables, where each dummy variable indicates whether a particular observation falls under a given category (value 1) or not (value 0). When the model includes an intercept, one category is left without a dummy and serves as the baseline, since a dummy for every category would be perfectly collinear with the intercept. Then, a multiple linear regression is run on the independent variables, including the dummy ones.
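As a rough illustration of this technique, the sketch below builds one dummy variable from a two-category column and fits an ordinary least squares regression with NumPy. The data, the variable names (`education`, `group`, `female`, `wage`), and the generating coefficients are all hypothetical and chosen only so the result is easy to check; this is a minimal sketch, not a recommended modeling workflow.

```python
import numpy as np

# Hypothetical toy data: years of education and a two-level category.
education = np.array([10.0, 12.0, 14.0, 16.0, 10.0, 12.0, 14.0, 16.0])
group = np.array(["male", "male", "male", "male",
                  "female", "female", "female", "female"])

# Dummy variable: 1 if the observation is in the "female" category, else 0.
# "male" acts as the baseline category, so it gets no dummy of its own.
female = (group == "female").astype(float)

# Noise-free wages generated from known (made-up) coefficients,
# so the fitted values can be compared against them:
#   wage = 5 + 2 * education - 3 * female
wage = 5.0 + 2.0 * education - 3.0 * female

# Design matrix: intercept column, continuous regressor, dummy regressor.
X = np.column_stack([np.ones_like(education), education, female])

# Ordinary least squares via NumPy's least-squares solver.
coef, *_ = np.linalg.lstsq(X, wage, rcond=None)
intercept, slope, dummy_shift = coef

print("intercept:", intercept)
print("slope on education:", slope)
print("shift for female = 1:", dummy_shift)
```

Because the toy data contain no noise, the fit recovers the generating coefficients up to floating-point error. With real data, the coefficient on the dummy is an estimate of the average difference from the baseline category, holding the other regressors fixed.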

The result is that each dummy variable shifts the y-intercept of the regression line by the value of its estimated coefficient, while the slope stays the same. In the following example from Wikipedia [2], the dummy variable indicating whether an observation was categorized as "male" or "female" shifts the y-intercept of the prediction for the "wage" variable by 𝛿_0.

image.png

Source: [1]
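The intercept shift can also be written out explicitly. The following is a sketch under assumptions: it supposes a model with one continuous regressor (education) and one dummy coded 1 for "female" and 0 for "male", and an error term u with zero conditional mean. The symbols α_0, α_1, and u and the choice of which category is coded 1 are assumptions made for illustration; only 𝛿_0 comes from the text above.

```latex
\begin{align*}
\text{wage} &= \alpha_0 + \delta_0\,\text{female} + \alpha_1\,\text{education} + u \\
\text{female} = 0:\quad E[\text{wage}] &= \alpha_0 + \alpha_1\,\text{education} \\
\text{female} = 1:\quad E[\text{wage}] &= (\alpha_0 + \delta_0) + \alpha_1\,\text{education}
\end{align*}
```

Under these assumptions the two regression lines share the slope α_1 and differ only by the vertical offset 𝛿_0.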

Resources: