This is the exercise I started writing...
Let's consider the Wine Quality dataset (winequality-all.csv)
that you can download from SIT114's CloudDeakin site
(Resources → Datasets). It is assumed that the file is stored in
the current working directory (e.g., the directory where the Rmd file is).
wines <- read.csv("winequality-all.csv", comment.char="#")
head(wines)
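The comment-character option passed to read.csv() above makes it skip everything from a # to the end of the line. A minimal sketch of that behaviour, assuming a tiny inline stand-in for the file (the values below are made up and are not taken from the dataset):

```r
# Hypothetical miniature stand-in for winequality-all.csv (made-up values).
csv_text <- "# a comment line that read.csv() should skip
alcohol,density,color
9.4,0.9978,red
8.8,1.0010,white
9.5,0.9951,white"

# text= reads from a character string instead of a file;
# comment.char = "#" discards the leading comment line.
toy <- read.csv(text = csv_text, comment.char = "#")
str(toy)
```

Reading from a string keeps the example independent of the working directory; with the real file, only the first argument changes.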
These are Vinho Verde red and white wine samples from the north of Portugal,
see https://www.vinhoverde.pt/en/homepage.
There are 11 physicochemical features reported.
Moreover, there is a wine rating on the scale 0 (bad) to 10 (excellent)
given by wine experts, read more at
https://archive.ics.uci.edu/ml/datasets/Wine+Quality.
- Remove all the red wines from wines (replacing the old data frame
with the new one). Then, get rid of the color and response columns.
Solution:
First let's handle the red wines.
wines <- wines[wines$color != "red", ]
Omitting the aforementioned columns can be done in a few ways. Here is one:
wines <- wines[, is.na(match(names(wines), c("color", "response")))]
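To see what the row filter and the match()-based column selection do, here is a small sketch on a toy data frame (made-up values, with the same column roles as in the exercise). The %in% and setdiff() variants are offered only as equivalent alternatives, not as the intended solution:

```r
# Toy data frame (made-up values) mimicking the relevant columns.
toy <- data.frame(
    alcohol  = c(9.4, 8.8, 9.5, 10.1),
    density  = c(0.9978, 1.0010, 0.9951, 0.9940),
    response = c(5, 6, 6, 7),
    color    = c("red", "white", "white", "red")
)

# Keep only the rows that are not red wines.
toy <- toy[toy$color != "red", ]

# match() gives, for each column name, its position in to_drop (or NA).
to_drop <- c("color", "response")
m <- match(names(toy), to_drop)
print(m)  # NA NA 2 1

# Three equivalent ways to keep the columns that are not in to_drop.
a <- toy[, is.na(m)]
b <- toy[, !(names(toy) %in% to_drop)]
d <- toy[, setdiff(names(toy), to_drop)]
print(names(a))
```

All three selections keep alcohol and density; which one reads best is largely a matter of taste.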
Recall that names() applied to a data frame gives the vector of its column names.
The match() function, in turn, returns, for each value in its first argument,
the position of its first match in the second argument.
If there is no match (in our case, if a column name is not amongst
the two names we wish to remove), NA is generated.
- Compute the Pearson correlation coefficient between alcohol and
every other variable in the dataset.
Solution:
This can be done via a call to cor(). A quick glimpse at the manual page
(?cor) reveals that this task can be solved as follows:
cor(wines$alcohol, wines)
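A minimal sketch of the same call on a toy data frame, assuming all columns are numeric (the values are made up and only illustrate the shape of the result):

```r
# Toy numeric data frame (made-up values) standing in for the wine data.
toy <- data.frame(
    alcohol = c(9.4, 8.8, 9.5, 10.1, 11.0),
    density = c(0.9978, 1.0010, 0.9951, 0.9940, 0.9921),
    pH      = c(3.51, 3.00, 3.26, 3.19, 3.30)
)

# With a vector x and a data frame y, cor() returns a 1-row matrix:
# one Pearson coefficient per column of y.
r <- cor(toy$alcohol, toy)
print(r)

# The alcohol-vs-alcohol entry can be left out by dropping that column first.
r_others <- cor(toy$alcohol, toy[, names(toy) != "alcohol"])
print(r_others)
```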
We could have got rid of the alcohol vs. alcohol comparison,
but wanted to show that any (non-constant) column is always perfectly linearly correlated
with itself, hence the 1.0 coefficient obtained.
- Fit simple regression models for alcohol as functions
of the three most correlated variables (three individual models).
round(cor(wines), 2) # correlation matrix, rounded for readability
f_density <- lm(alcohol~density, data=wines) # simple model: alcohol vs. density
plot(alcohol~density, data=wines)
summary(f_density)
abline(f_density, col="red") # add the fitted line to the scatter plot
step(lm(alcohol~1, data=wines), # empty model
    scope=formula(lm(alcohol~., data=wines)), # full model
    direction="forward")
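One possible way to pick the three most correlated variables programmatically and fit the three individual models is sketched below. It runs on simulated stand-in data, not on the wine dataset, and the predictor names x1 to x4 are hypothetical; ranking by absolute correlation is an assumption about what "most correlated" means here.

```r
# Simulated stand-in data, NOT the wine dataset; x1..x4 are hypothetical.
set.seed(123)
n <- 500
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
toy$alcohol <- 10 - 3 * toy$x1 + 2 * toy$x2 + toy$x3 + rnorm(n)

# Pearson correlation of alcohol with every other column (a named vector).
others <- setdiff(names(toy), "alcohol")
r <- sapply(toy[others], function(col) cor(toy$alcohol, col))

# The three predictors with the largest absolute correlation.
top3 <- names(sort(abs(r), decreasing = TRUE))[1:3]

# One simple (single-predictor) linear model per selected variable.
models <- lapply(top3, function(v) {
    lm(reformulate(v, response = "alcohol"), data = toy)
})
names(models) <- top3

# Compare the fits, e.g., by their coefficients of determination.
print(sapply(models, function(m) summary(m)$r.squared))
```

With the real data, toy would be replaced by wines (after removing the non-numeric columns), and each model in the list can be inspected with summary() just like f_density.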
(the to-do list has moved)