Create one R script called run_analysis.R that does the following:
- Merges the training and the test sets to create one data set.
- Extracts only the measurements on the mean and standard deviation for each measurement.
- Uses descriptive activity names to name the activities in the data set.
- Appropriately labels the data set with descriptive activity names.
- Creates a second, independent tidy data set with the average of each variable for each activity and each subject.

The script works as follows:
- use read.table() to read the training and test data, then use rbind() to combine them into one data set with the columns ordered as: subject, label, then the measurements;
- use read.table() to read the feature list and keep only the mean and standard deviation features; select the matching columns from the data, offsetting each feature index by 2 because the first two columns of the data hold the subject and label;
- read the activity labels and replace the numeric labels in the data with the activity names;
- build a list of column names from the existing subject and label columns plus the feature names, tidy it by removing every non-alphabetic character and converting to lowercase, then assign it as the column names of the data;
- use aggregate() to compute the mean of each variable for every combination of subject and label;
- write the result to a new file, tinydata.txt, in the working directory.

To run the script:
- download the zip file from https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip
- unzip the file into the R working directory
- save run_analysis.R to the R working directory
- run source("run_analysis.R"); the script generates a new file, tinydata.txt, in your working directory.
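
The processing steps described above (combine, select, relabel, rename, aggregate) can be sketched on a tiny made-up data set. This is only an illustration under stated assumptions, not the contents of run_analysis.R: the toy values, the object names (features, activities, train, test, keep, tidy) and the grep() pattern used to pick mean and standard deviation features are all hypothetical.

```r
# Toy stand-ins for the UCI HAR files; every value below is made up.
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-energy()-X")
activities <- data.frame(id = 1:2, name = c("WALKING", "SITTING"),
                         stringsAsFactors = FALSE)

# Columns are ordered as: subject, label, then one column per feature.
train <- data.frame(subject = c(1, 1), label = c(1, 2),
                    V1 = c(0.2, 0.4), V2 = c(0.1, 0.3), V3 = c(9, 9))
test  <- data.frame(subject = c(1, 2), label = c(1, 2),
                    V1 = c(0.4, 0.6), V2 = c(0.3, 0.5), V3 = c(9, 9))

# Stack the training and test rows into one data set.
data <- rbind(train, test)

# Keep only mean/std features (assumed pattern); the +2 skips the
# subject and label columns at the start of the data.
keep <- grep("mean\\(\\)|std\\(\\)", features)
data <- data[, c(1, 2, keep + 2)]

# Replace the numeric labels with the activity names.
data$label <- activities$name[match(data$label, activities$id)]

# Tidy the column names: drop non-alphabetic characters, then lowercase.
names(data) <- tolower(gsub("[^[:alpha:]]", "",
                            c("subject", "label", features[keep])))

# Mean of every remaining variable per subject/label combination.
tidy <- aggregate(. ~ subject + label, data = data, FUN = mean)
print(tidy)
```

In this sketch the result could then be saved with a call such as write.table(tidy, "tinydata.txt"); whether the actual script writes its output this way is an assumption.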