ahmshahparan

followers: 3 following: 1 repos: 31 gists: 0

Name: A H M Shahparan

Type: User

A H M Shahparan's Projects

casestudy.fullstack_webapplication

Some define Statistics as the field that focuses on turning information into knowledge. The first step in that process is to summarize and describe the raw information - the data. In this lab, you will gain insight into public health by generating simple graphical and numerical summaries of a data set collected by the Centers for Disease Control and Prevention (CDC). As this is a large data set, along the way you’ll also learn the indispensable skills of data processing and subsetting.

data606_lab02

data606_lab03

data607_project01

In this project, you’re given a text file of chess tournament results in which the information has some structure. Your job is to create an R Markdown file that generates a .CSV file (which could, for example, be imported into a SQL database) with the following information for all of the players: Player’s Name, Player’s State, Total Number of Points, Player’s Pre-Rating, and Average Pre Chess Rating of Opponents. For the first player, the information would be: Gary Hua, ON, 6.0, 1794, 1605. The value 1605 was calculated by summing the pre-tournament opponents’ ratings of 1436, 1563, 1600, 1610, 1649, 1663, and 1716, and dividing by the total number of games played. If you have questions about the meaning of the data or the results, please post them on the discussion forum. Data science, like chess, is a game of back and forth. The chess rating system (invented by Arpad Elo, a Hungarian-American physics professor) has been used in many other contexts, including assessing the relative strength of employment candidates by human resource departments. You may substitute another text file (or set of text files, or data scraped from web pages) of similar or greater complexity, and create your own assignment and solution. You may work in a small team. All of your code should be in an R Markdown file (and published to rpubs.com), with your data accessible to the person running the script.
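The opponent average in the Gary Hua example can be checked by hand. The sketch below is written in Python rather than the R the assignment calls for, and it only reproduces the arithmetic from the seven ratings quoted in the description. The variable names and the CSV header labels are illustrative assumptions, not part of the original project.

```python
import csv
import io

# Pre-tournament ratings of Gary Hua's seven opponents, as quoted in the description.
opponent_pre_ratings = [1436, 1563, 1600, 1610, 1649, 1663, 1716]

# Average = sum of the opponents' pre-ratings / number of games played.
games_played = len(opponent_pre_ratings)
average_opponent_rating = sum(opponent_pre_ratings) / games_played

# Round to the nearest whole rating point for the output file.
rounded_average = round(average_opponent_rating)

# Write one record in the column order the description lists.
# The header labels are assumptions chosen for this sketch.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["name", "state", "total_points", "pre_rating", "avg_opponent_pre_rating"])
writer.writerow(["Gary Hua", "ON", 6.0, 1794, rounded_average])

print(buffer.getvalue())
```

The ratings sum to 11237, and dividing by 7 games gives roughly 1605.29, which rounds to the 1605 stated in the description.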

data607_project02

Read the information from your .CSV file into R, and use tidyr and dplyr as needed to tidy and transform your data.
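One common tidying step is reshaping a wide table (one column per measurement) into a long one (one row per measurement), which is the kind of task tidyr handles in R. The description does not say which transformations the project used, so the following is only a minimal Python sketch of that one step. The data, column names, and the `wide_to_long` helper are all made up for illustration.

```python
import csv
import io

# Made-up wide-format data: one column per round (hypothetical column names).
wide_csv = """player,round_1,round_2
A,1.0,0.5
B,0.0,1.0
"""

def wide_to_long(rows, id_column, key_name, value_name):
    """Turn every non-id column of each row into its own (key, value) record."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue
            long_rows.append(
                {id_column: row[id_column], key_name: column, value_name: value}
            )
    return long_rows

rows = list(csv.DictReader(io.StringIO(wide_csv)))
long_rows = wide_to_long(rows, id_column="player", key_name="round", value_name="points")

for record in long_rows:
    print(record)
```

Values stay as strings here because `csv.DictReader` does not infer types; a real script would convert them before computing summaries.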

data607_week01

Our task is to study the famous Mushrooms Dataset and the associated description of the data (i.e. “data dictionary”). We should take the data, and create a data frame with a subset of the columns in the dataset. We should include the column that indicates edible or poisonous and three or four other columns. We should also add meaningful column names and replace the abbreviations used in the data—for example, in the appropriate column, “e” might become “edible.” Our deliverable is the R code to perform these transformation tasks.
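The three steps described (subset the columns, give them meaningful names, expand the abbreviations) can be sketched as follows. This is a Python illustration rather than the R deliverable the task asks for. The sample rows are illustrative, the column positions assume the file starts with class, cap-shape, cap-surface, cap-color, bruises, odor, and the code tables are partial excerpts of the data dictionary.

```python
import csv
import io

# Illustrative rows in the dataset's abbreviated style. Assumed column order:
# class, cap-shape, cap-surface, cap-color, bruises, odor (first six fields only).
raw = "p,x,s,n,t,p\ne,x,s,y,t,a\ne,b,s,w,t,l\n"

# Subset: meaningful column name -> position in the raw row (assumed positions).
SELECTED = {"edibility": 0, "cap_shape": 1, "odor": 5}

# Partial code tables; extend them to cover any other columns you keep.
CODES = {
    "edibility": {"e": "edible", "p": "poisonous"},
    "cap_shape": {"b": "bell", "c": "conical", "x": "convex",
                  "f": "flat", "k": "knobbed", "s": "sunken"},
    "odor": {"a": "almond", "l": "anise", "c": "creosote", "y": "fishy",
             "f": "foul", "m": "musty", "n": "none", "p": "pungent", "s": "spicy"},
}

def decode_row(values):
    """Keep only the selected columns and replace each code with its label.
    Unknown codes are passed through unchanged."""
    return {
        name: CODES[name].get(values[index], values[index])
        for name, index in SELECTED.items()
    }

mushrooms = [decode_row(row) for row in csv.reader(io.StringIO(raw))]

for record in mushrooms:
    print(record)
```

Keeping the code tables in one dictionary keyed by column name makes it easy to add a fourth or fifth column without touching `decode_row`.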