Duke University :: Department of Statistical Science Computing Bootcamp 2019

This is a five-hour computing bootcamp for incoming Ph.D. and M.S. students in the Department of Statistical Science at Duke University. These materials are adapted from the 2018 bootcamp by Mine Çetinkaya-Rundel and Colin Rundel.

The workshop will cover the following topics:

Introduction to the DSS and Duke computing ecosystems

  • Account activation and access to departmental servers
  • Discussion of how to responsibly use distributed computing resources
  • Docker containers and Duke VM
    • RStudio
    • Jupyter Notebook

Introduction to reproducible research

  • Recognize the problems that reproducible research helps address, with a brief discussion of case studies that went wrong and how reproducible research might have helped
  • Identify pain points in getting your analysis to be reproducible
  • The role of documentation, sharing, automation, and organization in making your research more reproducible
  • Introduce some tools to solve these problems, specifically R / RStudio / R Markdown
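One small, concrete piece of the automation and documentation points above is making random results repeatable and recording the environment they were produced in. The bootcamp names R / RStudio / R Markdown as its tools here; the sketch below uses Python only for illustration, and the function names `run_simulation` and `environment_record` are hypothetical.

```python
import platform
import random
import sys


def run_simulation(seed, n=5):
    """Draw `n` pseudo-random numbers from a generator seeded with `seed`."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    return [rng.random() for _ in range(n)]


def environment_record():
    """Collect basic facts about the environment to store alongside results."""
    return {"python": sys.version.split()[0], "platform": platform.platform()}


first = run_simulation(seed=2019)
second = run_simulation(seed=2019)
print(first == second)  # True: the same seed reproduces the same draws
print(environment_record())
```

Saving the seed and the environment record next to the output is one simple way to let someone else rerun the analysis and check that they get the same numbers.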

Organizing your project to facilitate reproducible research

  • Organize projects and folders to enable reproducibility and reusability
  • Understand the structure of data files and the importance of documenting all changes made
  • Create a reproducible project workflow using R / RStudio / R Markdown
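As a rough illustration of the folder organization described above, the following sketch creates an empty project skeleton. The folder names (`data/raw`, `data/processed`, `scripts`, `output`, `docs`) and the function name `create_project_skeleton` are assumptions chosen for the example, not a layout prescribed by the bootcamp, and Python is used here only for illustration.

```python
from pathlib import Path
import tempfile

# Hypothetical layout; the folder names are illustrative only.
PROJECT_FOLDERS = ["data/raw", "data/processed", "scripts", "output", "docs"]


def create_project_skeleton(root):
    """Create the project folders under `root` and return the created paths."""
    root = Path(root)
    created = []
    for folder in PROJECT_FOLDERS:
        path = root / folder
        path.mkdir(parents=True, exist_ok=True)  # safe to rerun on an existing project
        created.append(path)
    # A README is a natural place to document data sources and changes made to them.
    readme = root / "README.md"
    if not readme.exists():
        readme.write_text("# Project title\n\nDescribe data sources and every change made to them.\n")
    return created


if __name__ == "__main__":
    # Build the skeleton in a temporary directory so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        for p in create_project_skeleton(tmp):
            print(p.relative_to(tmp))
```

Keeping raw data in its own folder and writing derived files elsewhere makes it easier to see which files were changed by the analysis and which were not.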

Version control

  • Introduce Git and GitHub
  • Initialize a project directory, understand the Git workflow, and create a pull request to a remote repository
  • Discuss the role of version control in reproducibility
  • Discuss version control best practices

R / RStudio and R Markdown

  • Navigate R Markdown and RStudio
  • Analyze data and create graphics with the tidyverse package
  • Discuss workflow

Python and Jupyter Notebook

  • Navigate Jupyter notebooks
  • Introduce Python basics, control flow, and functions
  • Discuss popular Python packages, including NumPy, SciPy, pandas, matplotlib, seaborn, scikit-learn, and TensorFlow
  • Discuss similarities and differences between Python and R
  • Discuss how to leverage the best of R and Python
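To give a flavor of the "Python basics, control flow, and functions" item above, here is a minimal sketch that uses only the standard library. The function name `summarize` and the sample data are made up for the example and are not taken from the bootcamp materials.

```python
def summarize(values):
    """Return the count, mean, and number of negative entries in `values`."""
    if not values:
        raise ValueError("values must be non-empty")
    total = 0.0
    negatives = 0
    for v in values:      # loop over the data once
        total += v
        if v < 0:         # conditional inside the loop
            negatives += 1
    return {"n": len(values), "mean": total / len(values), "negatives": negatives}


result = summarize([2.0, -1.0, 4.0, 3.0])
print(result)
```

In practice a package such as NumPy or pandas would compute these summaries directly; the explicit loop is written out only to show the control-flow constructs.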

Acknowledgments

Git / GitHub

Other

Python

R
