Regression…

Photo by Kelli Tungay on Unsplash

Regression is not a topic that I originally spent much time thinking about, but as I got more involved with analyzing large datasets, I found myself learning more about regression so I could extract the most out of a dataset. I have noticed one thing as I have worked with data scientists from different backgrounds.

Those who come from a statistics or math background are very concerned with coefficient estimation, while people from an applied science background concentrate mainly on overall model quality. Part of this might be due to how the two groups use the model after fitting: those with more theoretical backgrounds are interested in calculating probabilities and odds from the model, while those with applied science backgrounds are interested in reproducibility and forecasting.

Here are some important skills when performing linear regression that I learned from the theoretical scientists:

  1. Incorporating data types into models
    • Continuous
    • Binary
    • Categorical
    • Non-linear
  2. Evaluate Model with Residuals
    • Normality Assumption Check - Plot Y estimated versus Y known; a Q-Q plot of the residuals checks normality more directly
    • Independence of Residuals - Plot Residuals versus Y estimated
    • Verify Constant Variance (Homoscedasticity) - Plot Residuals versus Y estimated
    • Check for Multicollinearity - Calculate VIF
  3. Feature Diagnostics
    • Forward Selection
    • Backward Selection
    • Stepwise Regression
    • Use Subject Matter Expert (SME) Input to find important factors
  4. Once each feature shows significance, perform additional analysis
    • Calculate DFBETAS, DFFITS, Cook’s D
    • R^2, Mean Absolute Error (MAE), Mean Squared Error (MSE)
    • Plot Results
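To make a few of these steps concrete, here is a minimal sketch using only NumPy. Everything in it is an assumption for illustration: the data is synthetic, the variable names (`x_cont`, `x_bin`, `x_cat`) are made up, and the VIF and Cook's D calculations are written out by hand from their textbook formulas rather than taken from any particular library. It shows one way to put continuous, binary, categorical, and non-linear terms into a single design matrix, then compute the fit metrics, VIF, and Cook's D from the list above.

```python
import numpy as np

rng = np.random.default_rng(0)  # synthetic data, purely illustrative
n = 200
x_cont = rng.normal(size=n)          # continuous feature
x_bin = rng.integers(0, 2, size=n)   # binary feature (0/1)
x_cat = rng.integers(0, 3, size=n)   # categorical feature with levels 0, 1, 2

# One-hot encode the categorical feature, dropping level 0 as the reference
cat_dummies = np.column_stack([(x_cat == k).astype(float) for k in (1, 2)])

# Design matrix: intercept, continuous, squared (non-linear) term, binary, dummies
X = np.column_stack([np.ones(n), x_cont, x_cont**2, x_bin, cat_dummies])
names = ["intercept", "x_cont", "x_cont^2", "x_bin", "cat_1", "cat_2"]

# Assumed "true" coefficients used only to generate the synthetic response
true_beta = np.array([1.0, 2.0, 0.5, -1.0, 0.8, -0.6])
y = X @ true_beta + rng.normal(scale=0.5, size=n)

# Ordinary least squares fit
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta
resid = y - y_hat

# Overall fit metrics
mse = np.mean(resid**2)
mae = np.mean(np.abs(resid))
r2 = 1.0 - np.sum(resid**2) / np.sum((y - y.mean())**2)

def vif(X, j):
    """VIF for column j: regress it on the other columns, then 1 / (1 - R^2)."""
    others = np.delete(X, j, axis=1)
    coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
    r = X[:, j] - others @ coef
    r2_j = 1.0 - np.sum(r**2) / np.sum((X[:, j] - X[:, j].mean())**2)
    return 1.0 / (1.0 - r2_j)

vifs = {names[j]: vif(X, j) for j in range(1, X.shape[1])}  # skip the intercept

# Cook's distance from leverage (the diagonal of the hat matrix)
p = X.shape[1]
leverage = np.sum(X * (X @ np.linalg.inv(X.T @ X)), axis=1)
s2 = np.sum(resid**2) / (n - p)  # residual variance estimate
cooks_d = resid**2 / (p * s2) * leverage / (1.0 - leverage)**2

print("R^2:", r2, "MAE:", mae, "MSE:", mse)
print("VIF:", vifs)
print("largest Cook's D:", cooks_d.max())
```

The arrays `y_hat` and `resid` are what I would feed into the residual plots from step 2. In practice a statistics package will report these diagnostics for you; writing them out by hand here is only meant to show what each number is measuring.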

In my engineering experience, we would also perform a sensitivity assessment. This can be done by feeding a large number of inputs through the model or by creating extreme case studies.
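Both approaches can be sketched in a few lines. The example below is only an illustration under stated assumptions: the coefficients in `coef`, the `predict` helper, and the input ranges in `ranges` are hypothetical stand-ins for a fitted model and its plausible operating window, not values from any real project.

```python
import itertools
import numpy as np

# Hypothetical fitted coefficients for y = b0 + b1*x1 + b2*x2 (illustrative only)
coef = np.array([1.0, 2.0, -0.5])

def predict(x1, x2):
    """Evaluate the assumed linear model for scalar or array inputs."""
    return coef[0] + coef[1] * np.asarray(x1) + coef[2] * np.asarray(x2)

# Assumed plausible range (min, max) for each input
ranges = {"x1": (0.0, 10.0), "x2": (-5.0, 5.0)}

# Approach 1: feed in many randomly sampled inputs and look at the output spread
rng = np.random.default_rng(0)
samples = {k: rng.uniform(lo, hi, size=10_000) for k, (lo, hi) in ranges.items()}
y_mc = predict(samples["x1"], samples["x2"])
print("sampled output range:", y_mc.min(), y_mc.max())

# Approach 2: extreme case studies, every combination of min/max inputs
extremes = {
    combo: float(predict(*combo))
    for combo in itertools.product(*ranges.values())
}
for combo, y in extremes.items():
    print(combo, "->", y)
```

For a purely linear model like this one, the extreme cases bound everything the random sampling can produce, so the second approach is the cheaper one. Once the model has non-linear or interaction terms, that guarantee goes away, and sampling many inputs becomes the more informative check.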


This site is a modified version of Hydejack v9.1.4 created by Erin Wills.