Replication Package for "Impacts of the Jones Act on U.S. Petroleum Markets" by Ryan Kellogg and Richard L. Sweeney

Overview

The code in this replication package takes as inputs a mixture of publicly available data and proprietary data, and outputs the figures, tables, and LaTeX input files used in the paper. A replicator in possession of all of the raw data (including proprietary data from Bloomberg and Argus Media) can run all of the code by executing the jones_bash_file.sh shell script from the root level of the replication package. This script executes a series of Stata and R scripts that clean the raw data, run all analyses, and produce all of the figures, numeric results, and tables in the paper. The script also generates a PDF file of the paper, using LaTeX and the files generated by the aforementioned Stata and R programs.

Data Availability and Provenance Statements

This paper uses several publicly accessible data sources, exact copies of which are included in the replication package, and two commercially accessible data sources, from Bloomberg and Argus Media, that are not included.

To access the publicly-accessible data, replicators should download to their machines this publicly-accessible (zipped) folder. This folder contains three sub-folders:

rawdata holds the raw data used in the project
- The orig subfolder holds directories for all data that we downloaded from publicly-available sources or purchased from commercial providers (Bloomberg and Argus Media). Each subfolder within rawdata/orig contains a README file that describes the data source. The Bloomberg and Argus folders do not contain the proprietary data but do include README files describing the Bloomberg and Argus data that we used.
- Replicators must obtain the Bloomberg data through a Bloomberg terminal, and must acquire the Argus Media data by executing a data use agreement with Argus Media.
- The data subfolder holds data obtained through the EIA's API. The R script that executes the API call is JonesAct/code/build/EIA_API_Output/run_eia_api_v2.R. This script is intentionally not called by jones_bash_file.sh, so that replicators use the same raw data that were used to create the paper.
intdata holds cleaned versions of the raw data files as well as intermediate data files that facilitate the paper's analysis. We have included all intermediate files that do not contain any proprietary data (thus, some subfolders within intdata are empty).
images holds two figures that appear in the paper but are not produced by the Stata and R scripts.

Instructions to Replicators

Initial downloads:

To replicate the paper, you should first clone this repository to your machine, e.g. to a directory such as /Users/kelloggr/JonesAct or C:/Work/JonesAct. You should next download the data folder to your machine, e.g. to a directory such as /Users/kelloggr/public_ks_jonesact_data or C:/Users/kelloggr/Dropbox/public_ks_jonesact_data.

Proprietary data:

If you obtain the proprietary data from Bloomberg and Argus Media, these files should be saved to rawdata/orig/Bloomberg and rawdata/orig/Argus, respectively. It is possible that the data formats have changed since we obtained our datasets; please contact us regarding any questions about data formatting.
If you do not obtain the proprietary data, you cannot execute the full replication. However, you will be able to execute the following scripts:
- JonesAct/code/build/ArmyCorps/LoadArmyCorpsData.do
- JonesAct/code/build/EIACompanyLevelImports/CleanCompanyImports.do
- JonesAct/code/build/EIACompanyLevelImports/CleanCompanyImports_rename.do
- JonesAct/code/build/EIATerritories/CleanEITerritories.do
- JonesAct/code/build/EIARefineryInputs/CleanRefineryInputs.R
- JonesAct/code/analysis/padd1c_portshares.do
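Before choosing between the full replication and the public-data scripts listed above, it can help to confirm what is actually in the two proprietary folders. The following is a minimal sketch, not part of the replication package: the function names are hypothetical, and it assumes that the README files in rawdata/orig/Bloomberg and rawdata/orig/Argus have names beginning with "README".

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical, not part of the package.

# Succeeds if the folder holds at least one file other than a README.
# Assumption: README files are named README* (any case, any extension).
has_proprietary_data() {
  local dir="$1"
  [ -d "$dir" ] || return 1
  local found
  # Ignore README files and hidden files such as .DS_Store.
  found=$(find "$dir" -type f ! -iname 'README*' ! -name '.*' | head -n 1)
  [ -n "$found" ]
}

# Prints one status line per proprietary source.
# $1 = root of the downloaded data folder (the one containing rawdata).
check_proprietary() {
  local root="$1" src
  for src in Bloomberg Argus; do
    if has_proprietary_data "$root/rawdata/orig/$src"; then
      echo "$src: data files found"
    else
      echo "$src: README only"
    fi
  done
}
```

For example, `check_proprietary "C:/Users/kelloggr/Dropbox/public_ks_jonesact_data"` prints one line for each source. If either line reads "README only", only the public-data scripts listed above can be run.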

Directory specification:

So that the Stata scripts locate your local files, create a file in your root JonesAct directory called globals.do. The contents of globals.do should look like the following, substituting in your directory paths to the root and data directories:

global repodir = "C:/Work/JonesAct"
global dropbox = "C:/Users/kelloggr/Dropbox/public_ks_jonesact_data"

So that the R scripts locate your local files, create a file in your root JonesAct/code directory called paths.R. The contents of paths.R should look like the following, substituting in your directory paths to the root and data directories:

repo <-
  file.path("C:/Work/JonesAct")
dropbox <-
  file.path("C:/Users/kelloggr/Dropbox/public_ks_jonesact_data")
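Because globals.do and paths.R must point to the same two directories, one option is to generate both files from a single pair of values. The sketch below is a convenience under stated assumptions, not something the package provides: write_path_files is a hypothetical helper, it overwrites any existing globals.do and code/paths.R, and it expects paths written with forward slashes and without double quotes, as in the examples above.

```shell
#!/usr/bin/env bash
# Sketch only: write_path_files is a hypothetical helper, not part of the package.
# $1 = root of the cloned repository, $2 = root of the downloaded data folder.
write_path_files() {
  local repodir="$1" dbdir="$2"

  # Stata globals, written to the repository root.
  cat > "$repodir/globals.do" <<EOF
global repodir = "$repodir"
global dropbox = "$dbdir"
EOF

  # R paths, written to the repository's code directory.
  cat > "$repodir/code/paths.R" <<EOF
repo <-
  file.path("$repodir")
dropbox <-
  file.path("$dbdir")
EOF
}
```

Calling `write_path_files "C:/Work/JonesAct" "C:/Users/kelloggr/Dropbox/public_ks_jonesact_data"` would produce the two files shown above.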

So that the shell scripts locate your local files, you must specify the REPODIR, DBDIR, OS, and STATA variables in jones_bash_file.sh. These identify your local root repo directory, your data (dropbox) directory, your operating system, and your Stata edition, respectively:
- REPODIR should look something like REPODIR=C:/Work/JonesAct
- DBDIR should look something like DBDIR="C:/Users/kelloggr/Dropbox/public_ks_jonesact_data"
- OS should be OS="Windows" or OS="Unix" (MacOS users should use the Unix version)
- STATA should be STATA="MP" or STATA="SE"
  - Note: do NOT include a white space on either side of the equal sign in any of these expressions
- To identify your $HOME variable, you can type echo $HOME into your bash shell command line
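The four assignments can be sketched together as follows. The values are the example paths used above, and validate_settings is a hypothetical helper that is not part of the package; it only checks the settings against the allowed values listed here. The rule about white space matters because bash reads `REPODIR = x` as a command named REPODIR with two arguments rather than as an assignment.

```shell
#!/usr/bin/env bash
# Example values only; substitute your own directories.
REPODIR="C:/Work/JonesAct"
DBDIR="C:/Users/kelloggr/Dropbox/public_ks_jonesact_data"
OS="Windows"   # or "Unix" (also for MacOS)
STATA="MP"     # or "SE"

# Hypothetical helper (not in the package): checks the four settings
# and prints the first problem it finds.
validate_settings() {
  case "$OS" in
    Windows|Unix) ;;
    *) echo "OS must be Windows or Unix, got: $OS"; return 1 ;;
  esac
  case "$STATA" in
    MP|SE) ;;
    *) echo "STATA must be MP or SE, got: $STATA"; return 1 ;;
  esac
  [ -d "$REPODIR" ] || { echo "REPODIR not found: $REPODIR"; return 1; }
  [ -d "$DBDIR" ] || { echo "DBDIR not found: $DBDIR"; return 1; }
  echo "settings OK"
}
```

Running `validate_settings` after the assignments prints "settings OK" when both directories exist and OS and STATA hold one of the allowed values.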

Software installs

If you haven't already installed R, install it.
If you haven't already installed Stata, install it.
If you haven't already installed LaTeX, install it.
Packages:

To run the scripts, you must have first installed one Stata package ("pathutil") and one R package ("here"). These packages help Stata and R scripts find the local path in which they are located. All other R package installs are handled, if necessary, by the included code/basic_setup.R script.
These packages can be automatically installed as part of the jones_bash_file.sh shell script by uncommenting the line bash -x $REPODIR/jones_stata_r_installs.sh |& tee jones_stata_r_installs_out.txt. Alternatively, you can install the packages manually within your Stata and R interfaces (see the commands included within jones_stata_r_installs.sh).

Running the scripts

The jones_bash_file.sh shell script is located in the root repo directory. For users with access to the full set of public and proprietary data, executing this script will: (1) delete all intermediate data and results, leaving only the raw data files; (2) copy the files in images into the root's output/figures subfolder; (3) conduct all of the data work and analysis starting from the raw data; and (4) compile the paper and appendix.
- The deletion of the intermediate data and results files ensures that the paper's results are fully replicated from the raw data and that there are no hidden, improper file dependencies. Users with access to the confidential data who wish to replicate the entire data cleaning and analysis should proceed with this deletion (the raw data will not be deleted).
- Users who only wish to run part of the code and users who only have access to the public data should NOT run the script that deletes the intermediate data and results files.
To execute jones_bash_file.sh, we recommend opening your bash shell, changing the directory to your local repository, and then using the following command:

bash -x jones_bash_file.sh |& tee jones_bash_file_out.txt

This command will log output and any error messages to jones_bash_file_out.txt.
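The `|&` operator is bash 4 shorthand for `2>&1 |`, so it is not available in older shells such as the bash 3.2 that ships with MacOS. The sketch below shows the portable form wrapped in two hypothetical helpers that are not part of the package. run_logged also returns the script's own exit status rather than that of tee, and scan_log lists log lines containing the word "error"; treating those lines as the ones worth inspecting is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: run_logged and scan_log are hypothetical helpers.

# Runs a script with tracing, copying stdout and stderr to a log file.
# $1 = script to run, $2 = log file to write.
run_logged() {
  bash -x "$1" 2>&1 | tee "$2"
  # PIPESTATUS[0] is the exit status of the script, not of tee.
  return "${PIPESTATUS[0]}"
}

# Prints numbered log lines that mention "error" (case-insensitive).
# $1 = log file to scan.
scan_log() {
  grep -n -i 'error' "$1" || echo "no lines mentioning 'error' in $1"
}
```

From the root of the local repository, `run_logged jones_bash_file.sh jones_bash_file_out.txt` followed by `scan_log jones_bash_file_out.txt` gives a quick first pass over the log.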
