The scholardaemon from lmmx

🔎 🐥 📃 Google Scholar Alerts Twitter bot 📃 🐥 🔎

Google Scholar lacks an API but, unlike PubMed, links directly to papers. The stream of a PubMed-sourced bot is often filled with papers deposited without direct links. Occasionally they will have a DOI, but MEDLINE's indexing of these is inconsistent (the XML for the articles themselves can be pretty inconsistent, as I found out on a previous excursion under PubMed's bonnet).

Even when a paper is deposited with a DOI, the minting process means the link is not guaranteed to work straight away. I have felt, and regularly see other scientists online express, frustration at having the basic line of scientific enquiry rudely interrupted by technical issues. Preprints are another consideration.

Preprints are undeniably coming into the fold of bioscience research, a practice originating in the physics/mathematical sciences that crept in through common ground at arXiv's q-bio section. There are various dedicated sites/accounts monitoring particular subfields (e.g. Haldane's sieve/@haldanessieve for population/evolutionary genetics).

Google Scholar indexes all fields, and in my own experience this leads to casual interdisciplinary reading in a way not possible from PubMed's purely biomedical library - a facet of research which the BBSRC, MRC and the Society of Biology feel is lacking amongst bioscientists.

Creating a feed of interest through Google Scholar

- Google Scholar Alerts can provide up to 20 results in an e-mail, and posting/archiving these somewhere other than a busy inbox makes new research more accessible
- Gmail, for instance, has various APIs and libraries, including an official Python 2.6-2.7 package and gmailr for R
- Twitter likewise has python-twitter and twitteR

This script checks a Gmail account for Google Scholar Alerts, parses each message for paper titles and links, and sends the list of new articles to Twitter

- this could perhaps be automated with a cron job, like the one Lynn Root used for her IfMeetThenTweet IFTTT alternative
- it could perhaps also be hosted on a free micro instance of Amazon Web Services EC2 (I've not tried this yet)
- sending the papers to Buffer doesn't make much sense, since there seems to be at most 1 email a day, though other queries may vary
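The parse-and-tweet step described above can be sketched as follows. This is an illustration in Python rather than the project's R, and it rests on assumptions that are not taken from the scholaRdaemon source: that each paper title in an alert e-mail is an anchor with the class gse_alrt_title, that the link is a redirect carrying the real address in a url query parameter, and that a tweet allows 140 characters with every link counted as 23. All of these are parameters or easy to change.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlparse


def unwrap(href):
    """Return the target of a redirect-style link (?url=...), else the link itself."""
    target = parse_qs(urlparse(href).query).get("url")
    return target[0] if target else href


class AlertParser(HTMLParser):
    """Collect (title, url) pairs from anchors carrying the assumed title class."""

    def __init__(self, title_class="gse_alrt_title"):  # class name is an assumption
        super().__init__()
        self.title_class = title_class
        self.papers = []    # finished (title, url) pairs
        self._href = None   # href of the title link currently being read
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and self.title_class in classes:
            self._href = attrs.get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse line breaks and repeated spaces inside the title
            title = " ".join("".join(self._text).split())
            self.papers.append((title, unwrap(self._href)))
            self._href = None


def extract_papers(html):
    """Parse the HTML body of one alert e-mail into (title, url) pairs."""
    parser = AlertParser()
    parser.feed(html)
    return parser.papers


def format_tweet(title, url, limit=140, url_cost=23):
    """Shorten the title so that title + space + link fits within limit."""
    room = limit - url_cost - 1
    if len(title) > room:
        title = title[:room - 1].rstrip() + "…"
    return f"{title} {url}"
```

Each pair returned by extract_papers can be passed to format_tweet and the result handed to whichever Twitter client is in use; fetching the message from Gmail and posting the tweet are left out here.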

Installation and usage

For a walkthrough on installation see the Wiki homepage. Briefly:

- Install gmailr and twitteR, and set up apps on the Google Developers Console and on Twitter's equivalent
- Authorise gmailr (gmail_auth) with the JSON obtained by setting up the Google app
- Run Rscript run_daemon with --help to show the available flags and bots
  - Bots can be passed as arguments to run_daemon to indicate which of the available account configurations to use; if none are specified, the default is to check and tweet for all of them sequentially.
  - These arguments are defined in config/bot_registry.json, where each is stored alongside the sub-directory from which to retrieve its authentication information. See the Wiki for more info.
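The bot selection behaviour can be pictured with a small sketch, again in Python rather than R. The registry layout used here (a flat JSON object mapping each bot name to its authentication sub-directory) and the bot names are hypothetical; the real format of config/bot_registry.json is described in the Wiki and may well differ.

```python
import json

# Hypothetical registry contents: bot name -> sub-directory holding its credentials
EXAMPLE_REGISTRY = json.loads("""
{
  "examplebot_one": "auth/examplebot_one",
  "examplebot_two": "auth/examplebot_two"
}
""")


def select_bots(registry, requested=()):
    """Return (bot, auth_dir) pairs to run, in order.

    With no names requested, every registered bot is returned in registry
    order, mirroring the 'all bots sequentially' default.
    """
    names = list(requested) or list(registry)
    unknown = [name for name in names if name not in registry]
    if unknown:
        raise KeyError("unknown bot(s): " + ", ".join(unknown))
    return [(name, registry[name]) for name in names]
```

Rejecting unknown names up front means a typo on the command line fails before any account is contacted, rather than partway through a run.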

Automation

Dave Tang seems to have beaten me to the idea of using R for a paper bot by just a couple of weeks. He has a working example of a cron script, timed for PubMed's release, as he worked with eUtils (i.e. PubMed, like all the other existing bots in Casey Bergman's list, with the exception of eQTLpapers, which has Scholar Alerts added manually by Sarah Brown).

crontab -l
#minute hour dom month dow user cmd
0 15-23 * * * cd /Users/davetang/Dropbox/transcriptomes && ./feed.R &> /dev/null

Cron automation makes sense for daily MEDLINE (PubMed) updates, but not for emails, which do not arrive on a fixed schedule. IFTTT-like 'triggering' would be ideal, and can be achieved with custom 'events' through AWS Lambda [free tier], reacting to changes in AWS S3 file storage, which may be modified with dat pull --live.

Wiki: Proposed workflow with AWS and dat
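To make the event-driven idea a little more concrete, here is a rough Python sketch of a Lambda handler reacting to an S3 notification. It only reads the standard S3 event fields (Records, eventName, s3.bucket.name, s3.object.key); the function names and what happens to each new file are assumptions, not part of the proposed workflow on the Wiki.

```python
from urllib.parse import unquote_plus


def new_objects(event):
    """Yield (bucket, key) for each object-created record in an S3 event."""
    for record in event.get("Records", []):
        if record.get("eventName", "").startswith("ObjectCreated"):
            s3 = record["s3"]
            # Object keys arrive URL-encoded in S3 notifications
            yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def handler(event, context):
    """Lambda entry point: report newly created files and count them."""
    pending = list(new_objects(event))
    for bucket, key in pending:
        # Hypothetical hand-off point: the parse-and-tweet step would go here
        print(f"new alert file: s3://{bucket}/{key}")
    return {"processed": len(pending)}
```

Unlike an hourly cron entry, this runs only when a file actually changes, so nothing is polled when no alert has arrived.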

For now I'm using cron (hourly entry added with crontab -e) to:

- source my .bashrc, which
  - exports the location of the scholaRdaemon directory to an eponymous variable
  - sets an alias runsdaemon as Rscript "$scholaRdaemon/run_daemon"
- record the date/time in the sd.log file there
- run the daemon for all bots (default behaviour, for all bots listed in config/bot_registry.json)

0 * * * * source /home/louis/.bashrc; date >> "$scholaRdaemon"sd.log; runsdaemon >> "$scholaRdaemon"sd.log
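Note that >> in the entry above appends only standard output, so anything the daemon writes to standard error does not reach sd.log. One possible alternative, sketched in Python under the assumption that a small wrapper script is acceptable, writes the timestamp and then both output streams to the same log. The function name and the paths in the usage comment are placeholders.

```python
import subprocess
from datetime import datetime


def run_and_log(command, log_path):
    """Run command, appending a timestamp plus its stdout and stderr to log_path."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(datetime.now().isoformat(timespec="seconds") + "\n")
        # Flush so the timestamp lands in the file before the child's output
        log.flush()
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Example call (placeholder paths):
# run_and_log(["Rscript", "/path/to/scholaRdaemon/run_daemon"],
#             "/path/to/scholaRdaemon/sd.log")
```

The return code is passed back so that a calling script can exit non-zero when the daemon fails.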
