B-Spline-Regression

Linear regression fits a single straight-line relationship across the whole range of a predictor, which works poorly when the true relationship is non-linear. To overcome this limitation, we can split the training data into multiple bins and fit each bin with a separate model. This technique is known as a regression spline.

Knots and Bins

The points where the data is divided are called knots, and the resulting intervals are called bins. The functions used to fit each bin are called piecewise functions. These piecewise functions can be linear or polynomial; here we have used a cubic polynomial.
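As a rough illustration of knots and bins, the sketch below assigns each value of a one-dimensional predictor to a bin with NumPy. The knot positions and the sample values are invented for the example; the project's actual knots are not stated here.

```python
import numpy as np

# Hypothetical knots (cut points), chosen only for illustration.
knots = np.array([2.0, 4.0, 6.0, 8.0])

# Made-up predictor values, not taken from the dataset.
x = np.array([0.5, 2.0, 3.7, 5.1, 7.9, 9.4])

# np.digitize returns the bin index of each value:
# bin 0 is x < knots[0], bin j is knots[j-1] <= x < knots[j],
# and the last bin is x >= knots[-1].
bin_index = np.digitize(x, knots)

# Four knots produce five bins; count how many values land in each.
n_bins = len(knots) + 1
counts = np.bincount(bin_index, minlength=n_bins)
```

With four knots this gives five bins, which matches the number of bins mentioned in this README.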

Piecewise Function

C0(X) = I(X < k0)
C1(X) = I(k0 <= X < k1)
C2(X) = I(k1 <= X < k2)
.
.
.
Cm(X) = I(k(m-1) <= X < km)
Cm+1(X) = I(X >= km)
where k0, k1, k2, ..., km are the cut points and I(condition) is the indicator function, which returns 1 when the condition is true and 0 otherwise.
Instead of fitting a single function over the whole range of X, we fitted a cubic piecewise polynomial over 5 bins.
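The indicator functions and the per-bin cubic fit described above can be sketched as follows. This is a minimal illustration on synthetic data, not the project's code: the function names, the knot positions, and the sine-shaped data are all assumptions made for the example.

```python
import numpy as np

def indicator_matrix(x, knots):
    """Return an (n, len(knots) + 1) matrix of 0/1 values.

    Column j holds the indicator for bin j, so each row has exactly one 1.
    """
    bin_index = np.digitize(x, knots)
    return np.eye(len(knots) + 1)[bin_index]

def fit_piecewise_cubic(x, y, knots):
    """Fit an independent cubic polynomial in each bin by least squares."""
    bin_index = np.digitize(x, knots)
    coefs = {}
    for j in range(len(knots) + 1):
        mask = bin_index == j
        if mask.sum() >= 4:  # a cubic has four coefficients to estimate
            coefs[j] = np.polyfit(x[mask], y[mask], deg=3)
    return coefs

def predict_piecewise_cubic(x, knots, coefs):
    """Evaluate each bin's cubic on the points that fall in that bin."""
    bin_index = np.digitize(x, knots)
    y_hat = np.full(x.shape, np.nan)
    for j, c in coefs.items():
        mask = bin_index == j
        y_hat[mask] = np.polyval(c, x[mask])
    return y_hat

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 10.0, 200))
y = np.sin(x) + rng.normal(0.0, 0.1, x.size)

knots = np.array([2.0, 4.0, 6.0, 8.0])  # 4 hypothetical knots -> 5 bins
C = indicator_matrix(x, knots)
coefs = fit_piecewise_cubic(x, y, knots)
y_hat = predict_piecewise_cubic(x, knots, coefs)
```

Because each bin is fitted independently, the resulting curve is not guaranteed to be continuous at the knots. A spline adds continuity constraints at the knots to remove those jumps.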

Basis Function

The idea behind using a low-degree polynomial is to avoid large oscillations of the curve. Rather than restricting the function fitted in each bin to be linear, it is often more effective to allow it to be non-linear. To do this, a general family of functions is applied to the variable. This family should be neither so flexible that it over-fits nor so rigid that it under-fits. The functions in such a family are called basis functions.
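To make the idea of a basis concrete, here is a minimal sketch that builds a cubic spline basis and fits it by ordinary least squares. It uses the truncated power basis rather than a B-spline basis, because it is shorter to write; for the same knots the two bases span the same set of cubic splines. The function name, the knots, and the synthetic data are assumptions for the example and are not the project's implementation.

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Truncated power basis: 1, x, x^2, x^3, plus (x - k)^3_+ per knot."""
    x = np.asarray(x, dtype=float)
    columns = [np.ones_like(x), x, x**2, x**3]
    for k in knots:
        # (x - k)^3 where x > k, and 0 elsewhere.
        columns.append(np.clip(x - k, 0.0, None) ** 3)
    return np.column_stack(columns)

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 10.0, 200))
y = np.sin(x) + rng.normal(0.0, 0.1, x.size)

knots = [2.0, 4.0, 6.0, 8.0]  # hypothetical knots
B = cubic_spline_basis(x, knots)

# Ordinary least squares on the basis columns.
beta, *_ = np.linalg.lstsq(B, y, rcond=None)
y_hat = B @ beta
```

Each truncated term is zero to the left of its knot and has zero first and second derivatives at the knot, so the fitted curve and its first two derivatives stay continuous across knots. The model has 4 + 4 = 8 coefficients here, which keeps it between a single global cubic and five unconstrained cubics in flexibility.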

Dataset

The dataset used in this project is the Boston House Price Prediction dataset (readily available on Kaggle). It contains 506 observations and 14 variables.

  1. CRIM (per capita crime rate by town)
  2. ZN (proportion of residential land zoned for lots over 25,000 sq.ft.)
  3. INDUS (proportion of non-retail business acres per town)
  4. CHAS (Charles River dummy variable (= 1 if tract bounds river; 0 otherwise))
  5. NOX (nitric oxides concentration (parts per 10 million))
  6. RM (average number of rooms per dwelling)
  7. AGE (proportion of owner-occupied units built prior to 1940)
  8. DIS (weighted distances to five Boston employment centers)
  9. RAD (index of accessibility to radial highways)
  10. TAX (full-value property-tax rate per 10,000 USD)
  11. PTRATIO (pupil-teacher ratio by town)
  12. B (1000(Bk - 0.63)^2, where Bk is the proportion of Black residents by town)
  13. LSTAT (percent lower status of the population)
  14. MEDV (median value of owner-occupied homes, in thousands of USD)
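For a spline fit on one predictor, only a single feature column and the target are needed. The sketch below reads those two columns from a CSV with a header row, using only the standard library. It assumes the header uses the 14 column names listed above, and it picks LSTAT as the predictor purely as an example, since this README does not say which variable was used. The two data rows are invented placeholders so the snippet runs without the real file.

```python
import csv
import io

# The 14 variable names listed above, assumed to match the CSV header.
COLUMNS = [
    "CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
    "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT", "MEDV",
]

def read_feature_and_target(handle, feature="LSTAT", target="MEDV"):
    """Read one predictor column and the target from an open CSV file."""
    reader = csv.DictReader(handle)
    xs, ys = [], []
    for row in reader:
        xs.append(float(row[feature]))
        ys.append(float(row[target]))
    return xs, ys

# Placeholder CSV content: these numbers are invented, not dataset rows.
sample = io.StringIO(
    ",".join(COLUMNS) + "\n"
    "0.1,0,5.0,0,0.5,6.0,50,4.0,1,300,15,390,5.0,24.0\n"
    "0.2,0,7.0,0,0.4,6.5,70,5.0,2,250,18,395,9.0,21.5\n"
)
lstat, medv = read_feature_and_target(sample)
```

With the real data, the same function could be called on a file opened with `open(path, newline="")`, where `path` points to wherever the downloaded CSV was saved.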

Contributors

aps19
