public_ks_jonesact's Introduction

Replication Package for "Impacts of the Jones Act on U.S. Petroleum Markets" by Ryan Kellogg and Richard L. Sweeney

Overview


The code in this replication package takes as inputs a mixture of publicly available data and proprietary data, and outputs the figures, tables, and LaTeX input files used in the paper. A replicator in possession of all of the raw data (including proprietary data from Bloomberg and Argus Media) can run all of the code by executing the jones_bash_file.sh shell script from the root level of the replication package. This script executes a series of Stata and R scripts that clean the raw data, run all analyses, and produce all of the figures, numeric results, and tables in the paper. The script also generates a PDF file of the paper, using LaTeX and the files generated by the aforementioned Stata and R programs.

Data Availability and Provenance Statements


This paper uses several publicly accessible data sources, exact copies of which are included in the replication package, and two commercially accessible data sources, from Bloomberg and Argus Media, that are not included.

To access the publicly-accessible data, replicators should download to their machines this publicly-accessible (zipped) folder. This folder contains three sub-folders:

  • rawdata holds the raw data used in the project
    • The orig subfolder holds directories for all data that we downloaded from publicly-available sources or purchased from commercial providers (Bloomberg and Argus Media). Each subfolder within rawdata/orig contains a README file that describes the data source. The Bloomberg and Argus folders do not contain the proprietary data but do include README files describing the Bloomberg and Argus data that we used.
    • Replicators must obtain the Bloomberg data through a Bloomberg terminal, and must acquire the Argus Media data by executing a data use agreement with Argus Media.
    • The data subfolder holds data obtained through the EIA's API. The R script that executes the API call is JonesAct/code/build/EIA_API_Output/run_eia_api_v2.R. This script is deliberately not called by jones_bash_file.sh, so that replicators use the same raw data that were used to create the paper.
  • intdata holds cleaned versions of the raw data files as well as intermediate data files that facilitate the paper's analysis. We have included all intermediate files that do not contain any proprietary data (thus, some subfolders within intdata are empty).
  • images holds two figures that appear in the paper but are not produced by the Stata and R scripts.
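
After unzipping, it can help to confirm that the folder has the layout described above before pointing any scripts at it. The following is a minimal sketch, not part of the replication package: check_data_layout is a hypothetical helper, and the list of sub-folders it checks is taken only from the paths named in this README.

```shell
# Hypothetical helper (not part of the replication package).
# Reports which of the sub-folders named in this README exist under the data folder.
# Usage: check_data_layout /path/to/public_ks_jonesact_data
check_data_layout() {
    data_dir=$1
    rc=0
    for sub in rawdata/orig rawdata/orig/Bloomberg rawdata/orig/Argus intdata images; do
        if [ -d "$data_dir/$sub" ]; then
            echo "ok       $sub"
        else
            echo "MISSING  $sub"
            rc=1
        fi
    done
    # Non-zero return status if anything was missing.
    return $rc
}
```

For example, check_data_layout "C:/Users/kelloggr/Dropbox/public_ks_jonesact_data" prints one line per sub-folder. The helper checks only that the directories exist; it does not check whether the Bloomberg and Argus folders contain the proprietary files.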

Instructions to Replicators


Initial downloads:

To replicate the paper, you should first clone this repository to your machine, e.g. to a directory such as /Users/kelloggr/JonesAct or C:/Work/JonesAct. You should next download the data folder to your machine, e.g. to a directory such as /Users/kelloggr/public_ks_jonesact_data or C:/Users/kelloggr/Dropbox/public_ks_jonesact_data.

Proprietary data:

  • If you obtain the proprietary data from Bloomberg and Argus Media, these files should be saved to rawdata/orig/Bloomberg and rawdata/orig/Argus, respectively. It is possible that the data formats have changed since we obtained our datasets; please contact us regarding any questions about data formatting.
  • If you do not obtain the proprietary data, you cannot execute the full replication. However, you will be able to execute the following scripts:
    • JonesAct/code/build/ArmyCorps/LoadArmyCorpsData.do
    • JonesAct/code/build/EIACompanyLevelImports/CleanCompanyImports.do
    • JonesAct/code/build/EIACompanyLevelImports/CleanCompanyImports_rename.do
    • JonesAct/code/build/EIATerritories/CleanEITerritories.do
    • JonesAct/code/build/EIARefineryInputs/CleanRefineryInputs.R
    • JonesAct/code/analysis/padd1c_portshares.do
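
One way to work through the public-data scripts listed above is to generate a command for each one and review the list before running anything. The sketch below only prints commands; it does not execute them. It assumes that Stata can be called in batch mode as `stata-mp -b do <file>` (the usual Unix form; STATA_CMD is a hypothetical placeholder for your local executable) and that R scripts are run with Rscript. The working directory each script expects is not specified here, so treat the output as a starting point rather than the package's own procedure.

```shell
# Placeholder for the local Stata executable; override it if yours differs.
STATA_CMD=${STATA_CMD:-stata-mp}

# Print (do not run) one command per public-data script, chosen by file extension.
public_script_commands() {
    for script in \
        JonesAct/code/build/ArmyCorps/LoadArmyCorpsData.do \
        JonesAct/code/build/EIACompanyLevelImports/CleanCompanyImports.do \
        JonesAct/code/build/EIACompanyLevelImports/CleanCompanyImports_rename.do \
        JonesAct/code/build/EIATerritories/CleanEITerritories.do \
        JonesAct/code/build/EIARefineryInputs/CleanRefineryInputs.R \
        JonesAct/code/analysis/padd1c_portshares.do
    do
        case "$script" in
            *.do) echo "$STATA_CMD -b do $script" ;;   # Stata batch mode
            *.R)  echo "Rscript $script" ;;            # R script runner
        esac
    done
}

public_script_commands
```

The script paths are copied verbatim from the list above and are relative to the directory that contains your JonesAct clone.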

Directory specification:

  • So that the Stata scripts locate your local files, create a file in your root JonesAct directory called globals.do. The contents of globals.do should look like the following, substituting in your directory paths to the root and data directories:
global repodir = "C:/Work/JonesAct"
global dropbox = "C:/Users/kelloggr/Dropbox/public_ks_jonesact_data"
  • So that the R scripts locate your local files, create a file in your root JonesAct/code directory called paths.R. The contents of paths.R should look like the following, substituting in your directory paths to the root and data directories:
repo <-
  file.path("C:/Work/JonesAct")
dropbox <-
  file.path("C:/Users/kelloggr/Dropbox/public_ks_jonesact_data")
  • So that the shell scripts locate your local files, you must specify the REPODIR, DBDIR, OS, and STATA variables in jones_bash_file.sh. These point to your local root repo directory, dropbox directory, operating system, and Stata version.
    • REPODIR should look something like REPODIR=C:/Work/JonesAct
    • DBDIR should look something like DBDIR="C:/Users/kelloggr/Dropbox/public_ks_jonesact_data"
    • OS should be OS="Windows" or OS="Unix" (MacOS users should use the Unix version)
    • STATA should be STATA="MP" or STATA="SE"
      • Note: do NOT include a white space on either side of the equal sign in any of these expressions
    • To identify your $HOME variable, you can type echo $HOME into your bash shell command line
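
The same two directory paths appear in three places (globals.do, paths.R, and the shell script), so it is easy for them to drift apart. The sketch below shows the four shell assignments in the form described above, plus a hypothetical helper, write_path_files, that writes globals.do and code/paths.R from a single pair of paths. The helper is not part of the replication package; the file contents it writes mirror the examples in this section.

```shell
# Example settings, using the example values from this README; substitute your own paths.
# No spaces around "=": the shell would read `REPODIR = ...` as a command named REPODIR.
REPODIR="C:/Work/JonesAct"
DBDIR="C:/Users/kelloggr/Dropbox/public_ks_jonesact_data"
OS="Windows"    # or "Unix" (macOS users should also use "Unix")
STATA="MP"      # or "SE"

# Hypothetical helper: write globals.do and code/paths.R from two directory paths.
# Usage: write_path_files REPO_DIRECTORY DATA_DIRECTORY
# Assumes REPO_DIRECTORY already contains a code/ subdirectory.
write_path_files() {
    repodir=$1
    dbdir=$2
    printf 'global repodir = "%s"\nglobal dropbox = "%s"\n' \
        "$repodir" "$dbdir" > "$repodir/globals.do"
    printf 'repo <-\n  file.path("%s")\ndropbox <-\n  file.path("%s")\n' \
        "$repodir" "$dbdir" > "$repodir/code/paths.R"
}
```

Calling write_path_files "$REPODIR" "$DBDIR" from a bash shell would then create both files. The function overwrites any existing globals.do and paths.R, so check the paths first.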

Software installs

  1. If you haven't already installed R, install it.
  2. If you haven't already installed Stata, install it.
  3. If you haven't already installed LaTeX, install it.
  4. Packages:
  • To run the scripts, you must have first installed one Stata package ("pathutil") and one R package ("here"). These packages help Stata and R scripts find the local path in which they are located. All other R package installs are handled, if necessary, by the included code/basic_setup.R script.
  • These packages can be automatically installed as part of the jones_bash_file.sh shell script by uncommenting the line bash -x $REPODIR/jones_stata_r_installs.sh |& tee jones_stata_r_installs_out.txt. Alternatively, you can install the packages manually within your Stata and R interfaces (see the commands included within jones_stata_r_installs.sh).
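
Before running the full pipeline, a quick check that the required programs are reachable from your bash shell can save a failed run. This is a minimal sketch rather than part of the package: missing_tools is a hypothetical helper, and the executable names passed to it (Rscript, pdflatex, stata-mp) are assumptions that may differ on your system, particularly for Stata and on Windows.

```shell
# Hypothetical helper: print the name of each listed program that is not on the PATH.
# Prints nothing if every program is found.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Executable names are assumptions; adjust them to match your installation.
missing_tools Rscript pdflatex stata-mp
```

This checks only that the programs can be found; it does not check the pathutil or here packages, which still need to be installed as described above.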

Running the scripts

  • The jones_bash_file.sh shell script is located in the root repo directory. For users with access to the full set of public and proprietary data, executing this script will: (1) delete all intermediate data and results, leaving only the raw data files; (2) copy the files in images into the root's output/figures subfolder; (3) conduct all of the data work and analysis starting from the raw data; and (4) compile the paper and appendix.
    • The deletion of the intermediate data and results files ensures that the paper's results are fully replicated from the raw data and that there are no hidden, improper file dependencies. Users with access to the confidential data who wish to replicate the entire data cleaning and analysis should proceed with this deletion (the raw data will not be deleted).
    • Users who only wish to run part of the code and users who only have access to the public data should NOT run the script that deletes the intermediate data and results files.
  • To execute jones_bash_file.sh, we recommend opening your bash shell, changing the directory to your local repository, and then using the following command:
bash -x jones_bash_file.sh |& tee jones_bash_file_out.txt
  • This command will log output and any error messages to jones_bash_file_out.txt.
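
The `|&` operator in the command above is bash shorthand for `2>&1 |` and was introduced in bash 4.0. Older shells, including the bash 3.2 that ships with macOS, do not support it. The sketch below is an equivalent written with the longer form; run_logged is a hypothetical helper name, not something defined in the replication package.

```shell
# Run a command, sending both its stdout and stderr to the terminal and to a log file.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    logfile=$1
    shift
    # 2>&1 merges stderr into stdout before the pipe; tee writes the log and echoes it.
    "$@" 2>&1 | tee "$logfile"
}
```

With this helper, run_logged jones_bash_file_out.txt bash -x jones_bash_file.sh should behave like the recommended command. As with the original, the pipeline's exit status is that of tee, so check the log for errors rather than relying on the return code.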
