
db.rstudio.com's Introduction

db.rstudio.com

This repo (and resulting website) is licensed as CC BY-SA.

This is a blogdown site. To make changes, you can add a new .Rmd under /content.

blogdown::serve_site() will automatically re-render any pages that need it, from .Rmd to HTML, and will let you preview the site.


db.rstudio.com's Issues

Managing credentials section

I think you need a short section on managing credentials in between "run queries safely" and "deploying content". This would basically say that you should never store a password in your R script and never type it into the console.

For now, this section would discuss three options to avoid this:

  • Use rstudioapi::askForPassword()
  • Use keyring
  • Use an environment variable
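A minimal sketch of the environment-variable option follows. The helper name get_db_password and the variable name DB_PASSWORD are assumptions made for illustration; they do not come from the site. The other two options appear as commented-out calls because they need an interactive session or a configured keyring.

```r
# Hypothetical helper: read a database password from an environment variable.
# The variable name DB_PASSWORD is an assumption for illustration.
get_db_password <- function(var = "DB_PASSWORD") {
  value <- Sys.getenv(var, unset = NA_character_)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  value
}

# The other two options, shown commented out because they need an
# interactive RStudio session or a previously stored keyring entry
# ("my-database" and "my-username" are placeholder names):
# password <- rstudioapi::askForPassword("Database password")
# password <- keyring::key_get("my-database", "my-username")
```

The variable itself would typically be set outside the script, for example in a user-level .Renviron file, so that the password never appears in code or in the console history.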

Typo in dbplyr GitHub repo name on the Redshift page

Under the Availability header the code block for installing dbplyr reads:

devtools::install_github("tidyvers/dbplyr")

It should instead read:

devtools::install_github("tidyverse/dbplyr")

Fix google analytics tracking

Add:

{{ template "_internal/google_analytics.html" . }}

to a new footer-custom.html file inside the layouts/partials folder.

Have an article explaining when (or how to decide) to use DBI vs dplyr

This is something that comes up whenever I talk about databases, and it would be good to have "official" guidelines that we can direct people to. Since DBI and dplyr overlap a lot in their database functionality, users (especially newbies) are often at a loss on the topic of which one to choose.

In the past, when dplyr only supported SELECT-style statements, the rule was easy. Now, not so much. However, that may still be a good rule of thumb, as non-SELECT statements in dplyr have not been time-tested yet.

Whatever the official recommendation is, we should have one, so that the answer to "Which package to use?" is not completely the opinion of whoever happens to be answering it.

cc @hadley

Best Practices Article - Improving Shiny apps

Put together an article that discusses steps to take during and after building a Shiny app that uses a database as its source. Using an example, the plan is to show how to use the following tools to improve the app's performance:

  • The pool package
  • Use profvis, and possibly shinyloadtest, after the app is complete
  • How to use show_query() and explain() to troubleshoot long running queries
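As a rough sketch of the last item, the example below builds a lazy dplyr query and inspects it with show_query() and explain(). It assumes that dplyr, dbplyr, DBI and RSQLite are installed, and it uses an in-memory SQLite database only as a stand-in for a real backend.

```r
library(dplyr)
library(dbplyr)

# Assumption: in-memory SQLite stands in for the production database
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
copy_to(con, mtcars, "mtcars")

# Build a lazy query; nothing is executed on the database yet
avg_mpg <- tbl(con, "mtcars") %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg, na.rm = TRUE))

# Print the SQL that dplyr generated for the query
show_query(avg_mpg)

# Ask the database for its execution plan for the same query
explain(avg_mpg)

# Run the query and bring the result into R
result <- collect(avg_mpg)

DBI::dbDisconnect(con)
```

The plan printed by explain() is backend-specific, so the output on SQLite will differ from what a production database reports.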

The article should also include links to these pages:

@bborgesr
@jcheng5

Add an example of "Invalidate Metadata" for Impala

Add a section to Impala about cleaning up your metadata. If you create a table in Impala and then drop the Hive metadata, you will need to invalidate the Impala metadata.

library(DBI)

# Create a table through Impala, then drop it through Hive
impala_con <- dbConnect(odbc::odbc(), "Impala")
dbWriteTable(impala_con, "mtcars", mtcars)
hive_con <- dbConnect(odbc::odbc(), "Hive")
dbRemoveTable(hive_con, "mtcars")

# Impala's metadata is now out of sync with Hive
dbReadTable(impala_con, "mtcars") # succeeds
dbExistsTable(impala_con, "mtcars") # fails

# Invalidate Impala's metadata for the table
dbGetQuery(impala_con, "INVALIDATE METADATA mtcars")
dbExistsTable(impala_con, "mtcars") # succeeds

This happens because dropping the Hive metadata does not drop the Impala metadata. More information can be found here.

Add an article about adding and updating existing records in a database through R

I think it would be really helpful if there were an article about adding and, more importantly, updating existing records in a database through R, along with the best practices to use. There are many excellent resources on downloading, cleaning and transforming data in R, but data warehousing is a crucial piece missing from the data pipeline.

Add documentation for supported data types

Add a table that lists supported data types for each data source. For example:

Supported Teradata Data Types

R type       Teradata type
factor       VARCHAR(255)
time         VARCHAR(255)
date         VARCHAR(255)
binary       BLOB
integer      INTEGER
double       BINARY_DOUBLE
character    VARCHAR(255)
logical      DECIMAL
list         VARCHAR(255)

Unsupported types will throw an error. See r-dbi/odbc#238.

MySQL does not support boolean

MySQL does not have a boolean type and uses TINYINT instead. Therefore, if you write logical values they will get returned as integers. Please make a note in the documentation.
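A small sketch of the workaround on the R side: convert the returned integer column back to logical after reading. The data frame below only simulates what a query might return; the column names are made up for illustration.

```r
# Simulated result of reading back a logical column that MySQL stored
# as TINYINT: the values arrive in R as integers (hypothetical data).
returned <- data.frame(id = 1:3, is_active = c(1L, 0L, NA))

# Convert the integer column back to logical; NA stays NA
returned$is_active <- as.logical(returned$is_active)
```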

Troubleshooting Connections page

Provide some FAQ level answers to common issues users have reported when connecting and using a database. It should consist mostly of links to other articles, such as:

And links to the Known Issues sections of those databases we have documented issues for.

The main idea is to have a very visible "Troubleshooting" link on the page that users can follow. We can update it as users report to us that they couldn't find what they were looking for.

Another idea is to include the link to community.rstudio.com to also cover non-customers who are still having issues.

@jimhester
@nwstephens

Consistent Section Headers

Right now we have "Best Practices" as a section header, but no other section headers, so the sidebar feels inconsistent. We should change the structure to something like this:

[Screenshot of the proposed sidebar structure, 2017-06-20]

I added "Examples" and "Packages" as section headers

This should be a relatively simple change to config.toml

Expand on Pool and Shiny best practices and include custom SQL example

The current pool page doesn't go into much detail about where and when to use different pool functions.

For example:

  • where/when in your shiny app to create a pool object
  • where/when to checkout connections from the pool with poolCheckout()
  • where/when to return connections to the pool with poolReturn()
  • where/when to close the pool using poolClose()

The current page also only has guidance when using dplyr to query, but not using custom SQL.
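One possible shape for such guidance, as a sketch rather than an official recommendation: create the pool once, send one-off custom SQL directly to the pool, and check out a connection only when several statements must share it. The example assumes the pool, DBI and RSQLite packages are installed and uses a temporary SQLite file as a stand-in for the real database.

```r
library(DBI)
library(pool)

# Assumption: a file-backed SQLite database stands in for the real database
db_file <- tempfile(fileext = ".sqlite")

# Create the pool once, outside the Shiny server function, so that all
# sessions share it
pool <- dbPool(RSQLite::SQLite(), dbname = db_file)
dbWriteTable(pool, "mtcars", mtcars)

# Custom SQL can be sent straight to the pool; a connection is fetched
# and released for each call
four_cyl <- dbGetQuery(pool, "SELECT mpg, cyl FROM mtcars WHERE cyl = 4")

# Check out a connection only when several statements must share one
# connection (for example, a transaction), and return it when done
con <- poolCheckout(pool)
dbBegin(con)
dbExecute(con, "UPDATE mtcars SET mpg = mpg + 1 WHERE cyl = 4")
dbCommit(con)
poolReturn(con)

updated <- dbGetQuery(pool, "SELECT mpg FROM mtcars WHERE cyl = 4")

# Close the pool when the app stops; in a Shiny app this would typically
# go inside shiny::onStop(function() poolClose(pool))
poolClose(pool)
```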

Addresses: https://community.rstudio.com/t/pool-and-shiny-best-practices-for-getting-custom-queries-from-database/3707/4

Database switching

Add page to explain how to switch between databases using dbplyr and DBI/odbc.

It should include how to use our functions and function arguments, such as in_schema() and dbListTables(...schema_name=""), and some DB specific commands like Oracle's SET SCHEMA. Also, cross reference those DB specific commands to their respective page in the site.

cc.
@jimhester
@nwstephens
@hadley

Visualization

  • Add reference to dbplot in the best practice visualization article
  • Add dbplot to the packages page

Link to Drivers section leads to "Page Not found" on the Redshift and PostgreSQL pages

Under the "Connection Settings" header, then the "Driver" bullet.

In "See the Drivers section for setup information", the word Driver is a hyperlink to https://db.rstudio.com/redshift/drivers which loads a "Page Not found" page.

It is not entirely clear to me if the expected behavior is to lead to the anchor tag above https://db.rstudio.com/redshift/#driver-options or if there was supposed to be a page in the location it links to.

The PostgreSQL page has identical behavior, except that the link that isn't found is https://db.rstudio.com/postgresql/drivers, and I do not see an anchor tag on the page that it would make sense to jump to.
