mlb-update's Introduction

gmailr MLB Injury Example

This project is a demo to show how to use gmailr and cronR.

Project Details

There are six R scripts.

  1. vars.R - Constants like HTML headers, email details, URLs, etc.
  2. packages.R - Libraries to load.
  3. scraper_functions.R - Functions to access, scrape, and reshape MLB injury data.
  4. email_functions.R - Functions to create an email using the data from scraper_functions.R. The email pipeline uses the gmailr package.
  5. plan.R - Pipeline that puts everything together. The web scraper and the email steps are wrapped inside a drake pipeline.
  6. cron_job.R - Schedules and runs a cron job that calls the plan.R pipeline script using the cronR package.
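
To illustrate how a scraper step and an email step can be wrapped in a drake pipeline, here is a minimal sketch. The function names, the URL, and the return values are hypothetical stand-ins, not the actual contents of scraper_functions.R, email_functions.R, or plan.R.

```r
# Hypothetical stand-ins for the real scraper and email functions;
# the names, URL, and return values are assumptions for illustration.
scrape_injuries <- function(url) {
  data.frame(player = "Example Player", status = "10-day IL")
}
build_email <- function(injuries) {
  paste0(nrow(injuries), " injury update(s)")
}

# drake_plan() only records the targets and their commands; nothing runs yet.
plan <- drake::drake_plan(
  injuries = scrape_injuries("https://example.com/injuries"),
  email    = build_email(injuries)
)

# Uncommenting the next line would build each target in dependency order
# and skip targets whose inputs have not changed since the last run.
# drake::make(plan)
```

Because `email` refers to `injuries`, drake works out the build order from the commands themselves, so the plan does not need to list the steps in sequence.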

Notes

  • This set of scripts will not work out of the box. You need to get header and cookie data from the site that provides the injury data; its URL is at the top of the vars.R script. Open that site in a web browser and follow the instructions on this page to extract the header and cookie info. Paste what you extracted into the text input field on the left and change the language to R. Then copy the transformed header and cookie output into the proper locations: the cookie data goes in the mk_cookie function in the email_functions.R file, and the header info replaces the existing headers in the vars.R file. You could assign the cookie to a variable and use it directly, but if you plan to use the scripts for more than a couple of days, the dates in the cookie need to be updated, and mk_cookie does that.
  • Don't forget to add your emails (to and from) to the vars.R script.
  • Don't write your own cron settings; use the RStudio addin that comes with the package.
  • drake is overkill for a project this small but I use it often and it's worth checking out. The drake book has lots of info on use cases and examples.
  • I try to always call a function from its package instead of loading it into the namespace, for example dplyr::select vs select. But I wanted to reduce clutter for anyone looking at how the pipeline works. If I decided to use this in production I'd prefix all third-party functions with their package.
  • data_for_debugging.csv is a demo dataset to use if you're troubleshooting gmailr.
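
As an illustration of why the cookie is built by a function rather than stored as a fixed string, here is a minimal base R sketch of refreshing a date inside a cookie. The template, the `{date}` placeholder, and the function name `mk_cookie_sketch` are assumptions; the real cookie copied from the site will have its own fields and date format.

```r
# Hypothetical sketch: rebuild a cookie string with today's date on each run.
# The cookie layout and the "{date}" placeholder are assumptions.
mk_cookie_sketch <- function(template, today = Sys.Date()) {
  gsub("{date}", format(today, "%Y-%m-%d"), template, fixed = TRUE)
}

# A stored string would keep the old date; calling the function on every
# run keeps it current.
cookie <- mk_cookie_sketch("session=abc; last_visit={date}")
```

Passing `today` as an argument makes the function easy to check against a known date while troubleshooting.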

Important Packages

Setting up gmailr
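
A minimal sketch of composing a message with gmailr, assuming gmailr 1.0 or later (the `gm_`-prefixed functions). The addresses, subject, body, and credentials path are placeholders, and this is not the project's actual email_functions.R code.

```r
# Build the message step by step; no authentication is needed to compose it.
# Addresses, subject, and body are placeholders (assumptions).
msg <- gmailr::gm_mime()
msg <- gmailr::gm_to(msg, "to@example.com")
msg <- gmailr::gm_from(msg, "from@example.com")
msg <- gmailr::gm_subject(msg, "MLB injury update")
msg <- gmailr::gm_html_body(msg, "<p>No new injuries today.</p>")

# Sending requires Gmail API OAuth credentials. The file name below is an
# assumption; uncomment once your own credentials are in place.
# gmailr::gm_auth_configure(path = "credentials.json")
# gmailr::gm_auth(email = "from@example.com")
# gmailr::gm_send_message(msg)
```

Composing the message separately from sending it lets you inspect the email before any credentials are involved.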
