dsc-regression-model-eval-recap-online-ds-pt-051319's Introduction

Multiple Regression and Model Validation - Recap

Introduction

In this section, you extended your knowledge of regression by building models with multiple predictor variables and then validating those models. You also got a brief introduction to data ethics. Remember that throughout your data work, it is essential to consider personal privacy and the potential impact of the data you have access to.

Regression

You saw a number of techniques and concepts related to regression. These included using multiple predictors to build a stronger estimator. That said, using multiple predictors comes with caveats. For example, multicollinearity between predictors should be avoided. One option for features that are highly correlated with each other is to keep only one of them, which also improves model interpretability. In addition, linear regression is most effective when features are on a similar scale; feature scaling and normalization are typically used to achieve this. Other data preparation techniques include creating dummy variables for categorical variables and transforming non-normal distributions with functions such as logarithms. Finally, to validate a model it is essential to partition your dataset, for example with a train-test split or k-fold cross-validation, so that performance is measured on data the model was not fit on.
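As a rough illustration of three of the ideas above (dropping one of a pair of highly correlated features, scaling features, and validating with k-fold cross-validation), here is a minimal sketch using only NumPy. The synthetic data, the 0.9 correlation threshold, and the helper names `drop_correlated`, `standardize`, `fit_predict`, and `k_fold_mse` are all assumptions made for this example; they are not taken from the lessons.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (purely illustrative): two informative features on very
# different scales, plus a third that nearly duplicates the first.
n = 200
x1 = rng.normal(0, 1, n)
x2 = rng.normal(50, 10, n)
x3 = x1 + rng.normal(0, 0.05, n)  # highly correlated with x1
y = 3.0 * x1 + 0.2 * x2 + rng.normal(0, 0.5, n)


def drop_correlated(X, threshold=0.9):
    """Return indices of columns to keep: a column is kept only if its
    absolute correlation with every already-kept column is below threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep


def standardize(X_train, X_test):
    """Scale both sets using the training mean and standard deviation only,
    so no information from the held-out data leaks into the fit."""
    mean, std = X_train.mean(axis=0), X_train.std(axis=0)
    return (X_train - mean) / std, (X_test - mean) / std


def fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with an intercept, solved by np.linalg.lstsq."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef


def k_fold_mse(X, y, k=5, seed=0):
    """Mean squared error on each of k held-out folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        X_tr, X_te = standardize(X[train], X[test])
        pred = fit_predict(X_tr, y[train], X_te)
        scores.append(float(np.mean((pred - y[test]) ** 2)))
    return scores


X = np.column_stack([x1, x2, x3])
kept = drop_correlated(X)           # x3 is dropped as a near-copy of x1
scores = k_fold_mse(X[:, kept], y)  # one MSE per held-out fold
print("kept columns:", kept)
print("fold MSEs:", [round(s, 3) for s in scores])
```

In practice you would more likely reach for library helpers such as scikit-learn's `train_test_split`, `KFold`, and `StandardScaler`; writing the steps out by hand here is only meant to make the mechanics visible.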

Ethics

Aside from regression, you also took a look at data privacy and ethics. You had probably already heard some of these ideas, but may not have been familiar with GDPR or privacy advocacy groups like the Electronic Frontier Foundation. The digital age has brought a slew of political and philosophical questions to the arena, and there are always fascinating (and disturbing) conversations to be had. Be sure to keep these and other issues at the forefront of your thought process, and do not simply be dazzled by the power of machine learning algorithms. Ask yourself questions like, "What is the algorithm being used for?" or "What are the ramifications or impact of this analysis/program/algorithm?"

When Einstein released his theory of relativity, it had tremendous benefit in advancing the field of physics, yet the subsequent development of the Manhattan Project was arguably a great detriment to humanity. In a similar vein, be thoughtful about which planes of thought you are operating on, and always be sure to include an ethical and philosophical perspective on the potential ramifications of your work.

Contributors

sumedh10, mathymitchell
