getting_and_cleaning_data_assignment's Introduction

Assignment Description

Create one R script called run_analysis.R that does the following:

Merges the training and the test sets to create one data set.
Extracts only the measurements on the mean and standard deviation for each measurement.
Uses descriptive activity names to name the activities in the data set.
Appropriately labels the data set with descriptive activity names.
Creates a second, independent tidy data set with the average of each variable for each activity and each subject.

Steps

Use read.table() to read the training and test data, then use rbind() to combine them with the columns ordered as: subject, label, then all measurements.
Use read.table() to read the feature list and keep only the mean and standard deviation features; select the matching columns from the data, adding 2 to each feature index because the first two columns of the data hold the subject and label.
Read the activity labels and replace the label codes in the data with the activity names.
Build a list from the current column names and the feature names, tidy it by removing every non-alphabetic character and converting to lowercase, then use it as the column names of the data.
Use aggregate() to compute the mean of each variable for each combination of subject and label.
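
The steps above can be sketched on a small made-up data set. This is not the actual run_analysis.R: the toy values, the variable names (features, activities, train, test, keep, tidy) and the regular expression used to pick mean and standard deviation features are all assumptions for illustration, and the result is written to a temporary directory rather than the working directory.

```r
# Toy stand-ins for the feature list and activity labels (values are made up).
features <- data.frame(id = 1:4,
                       name = c("tBodyAcc-mean()-X", "tBodyAcc-std()-X",
                                "tBodyAcc-energy()-X", "tBodyGyro-mean()-Y"),
                       stringsAsFactors = FALSE)
activities <- data.frame(id = 1:2, name = c("WALKING", "SITTING"),
                         stringsAsFactors = FALSE)

# Toy training and test sets laid out as: subject, label, measurements.
train <- data.frame(subject = c(1, 1, 2), label = c(1, 1, 2),
                    matrix(c(1, 3, 5,  2, 4, 6,  9, 9, 9,  10, 20, 30), nrow = 3))
test  <- data.frame(subject = c(2, 3), label = c(2, 1),
                    matrix(c(7, 9,  8, 10,  9, 9,  40, 50), nrow = 2))

# Step 1: stack the two sets.
data <- rbind(train, test)

# Step 2: keep mean/std features (assumed pattern); shift indices by 2
# because columns 1 and 2 hold subject and label.
keep <- grep("mean\\(\\)|std\\(\\)", features$name)
data <- data[, c(1, 2, keep + 2)]

# Step 3: replace label codes with activity names.
data$label <- activities$name[match(data$label, activities$id)]

# Step 4: strip non-alphabetic characters, lowercase, and apply as column names.
new_names <- c("subject", "label", features$name[keep])
names(data) <- tolower(gsub("[^[:alpha:]]", "", new_names))

# Step 5: mean of every measurement per subject/label combination.
tidy <- aggregate(data[, -(1:2)],
                  by = list(subject = data$subject, label = data$label),
                  FUN = mean)

# Write the result (to a temporary directory here, to avoid touching real files).
write.table(tidy, file.path(tempdir(), "tinydata.txt"), row.names = FALSE)
```

With real data, the toy data frames would be replaced by read.table() calls on the unzipped files.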

To run the analysis, download the zip file from https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip
Unzip the file into the R working directory.
Save run_analysis.R to the R working directory.
Run source("run_analysis.R"); it generates a new file, tinydata.txt, in your working directory.
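
The same sequence can also be driven from within R. The helper below is a hypothetical sketch: the function name fetch_and_run and the local file name dataset.zip are not part of this project, and it assumes run_analysis.R finds the data wherever unzip() places it in the working directory.

```r
# Hypothetical helper: download the archive, unzip it, and run the script.
fetch_and_run <- function(url, zip_file = "dataset.zip", script = "run_analysis.R") {
  if (!file.exists(zip_file)) {
    # mode = "wb" keeps the archive intact on Windows
    download.file(url, destfile = zip_file, mode = "wb")
  }
  unzip(zip_file)                 # extracts into the current working directory
  source(script)                  # the script is expected to write tinydata.txt
  file.exists("tinydata.txt")     # TRUE if the output file was produced
}

# Usage (requires network access and run_analysis.R in the working directory):
# fetch_and_run("https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip")
```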
