PGQR

This is code to implement the Penalized Generative Quantile Regression (PGQR) model in "Generative Quantile Regression with Variability Penalty" by Shijie Wang, Minsuk Shin, and Ray Bai. https://arxiv.org/abs/2301.03661

Abstract

We introduce a deep learning generative model for joint quantile estimation called Penalized Generative Quantile Regression (PGQR). Our approach simultaneously generates samples from many random quantile levels, allowing us to infer the conditional distribution of a response variable given a set of covariates. Our method employs a novel variability penalty to avoid the problem of vanishing variability, or memorization, in deep generative models. Further, we introduce a new family of partial monotonic neural networks (PMNN) to circumvent the problem of crossing quantile curves. A major benefit of PGQR is that it can be fit using a single optimization, thus bypassing the need to repeatedly train the model at multiple quantile levels or use computationally expensive cross-validation to tune the penalty parameter. We illustrate the efficacy of PGQR through extensive simulation studies and analysis of real datasets.

Prerequisites for PGQR

To run the PGQR model successfully, install and verify the following environments on your local machine. Several R packages must also be installed beforehand.

A. Python, PyTorch and CUDA environment

The main code for implementing PGQR is in Python, while simulation and data generation are coded in R. The partial monotonic neural network (PMNN) is built with the PyTorch library. We strongly recommend using CUDA (GPU-based tool) to train PGQR, since training on a GPU is much faster than training on a CPU.
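Before training, it can help to confirm that PyTorch is installed and can see a GPU. The following is a minimal sketch, not part of the PGQR scripts; the function name pick_device is hypothetical, and the check falls back to the CPU when PyTorch is missing or no CUDA device is visible.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and sees a GPU, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this interpreter.
        return "cpu"
    import torch

    # torch.cuda.is_available() is True only when a usable CUDA device exists.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Training device:", pick_device())
```

If this reports "cpu" on a machine with a GPU, the installed PyTorch build or the CUDA driver is the usual place to look.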

B. Required R packages

In R, we need the reticulate package to run the Python implementation of PGQR from within R. For comparison, we also considered other traditional conditional density estimation (CDE) methods, including

  • Random Forest CDE (RFCDE)
  • Nearest Neighbor Conditional Density Estimation (NNKCDE)
  • FlexCoDE. The specifics of FlexCoDE installation can be found at FlexCoDE.

The motorcycle dataset is included in the adlift package, and nonparametric quantile regression is implemented using the R package quantreg.

install.packages("reticulate")
install.packages("RFCDE")
install.packages("NNKCDE")
install.packages("HDInterval")
install.packages("adlift")
install.packages("quantreg")

Implementation of PGQR

To implement PGQR, we provide the Python code for PGQR in the Python code folder and the R code for simulation in the R code folder. More detailed explanations are provided below.

Working directory

To run the PGQR model, save the results, and produce the plots, the working directory must be set beforehand. Every R code file requires the working directory to be set manually, for example to "/yourlocalmachine/", in order to run the PGQR model successfully.

  • Create a Python code folder: "/yourlocalmachine/Python_code/", which should include the same Python files as the 'Github/Python_code/' folder.
  • Create an R code folder: "/yourlocalmachine/R_code/", which should include the same R files as the 'Github/R_code/' folder.
  • Create a result folder: "/yourlocalmachine/result/"
  • Create two subfolders in the result folder: "/yourlocalmachine/result/2000/" for simulation studies and "/yourlocalmachine/result/real/" for real data analysis.
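The folder layout above can also be created programmatically. This is a small sketch using only the Python standard library; the function name create_pgqr_layout is hypothetical, and the root path is whatever you use for "/yourlocalmachine/". It only creates empty folders, so the Python and R files still need to be copied in by hand.

```python
import tempfile
from pathlib import Path
from typing import List


def create_pgqr_layout(root: str) -> List[Path]:
    """Create the code and result folders described above under `root`."""
    base = Path(root)
    folders = [
        base / "Python_code",
        base / "R_code",
        base / "result" / "2000",  # simulation studies
        base / "result" / "real",  # real data analysis
    ]
    for folder in folders:
        # parents=True also creates "result"; exist_ok avoids errors on reruns.
        folder.mkdir(parents=True, exist_ok=True)
    return folders


if __name__ == "__main__":
    # Demonstrate on a throwaway directory instead of a real working directory.
    with tempfile.TemporaryDirectory() as tmp:
        for path in create_pgqr_layout(tmp):
            print(path)
```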

Python code folder

Under the Python_code folder, we have the following scripts:

  • QR_pen_m.py constructs the main body of Penalized Generative Quantile Regression (PGQR).
  • QR_nopen_m.py constructs Generative Quantile Regression (GQR) without the regularization term.
  • Cond_WGAN.py constructs the Wasserstein generative conditional sampler (WGCS).
  • CondGAN_MS.py constructs the generative conditional distribution sampler (GCDS).
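As background for the quantile regression scripts above, the following is a minimal sketch of the standard check (pinball) loss averaged over randomly drawn quantile levels. It is not taken from QR_pen_m.py or QR_nopen_m.py: it leaves out the variability penalty and the PMNN architecture, and the names check_loss, mean_check_loss_random_tau and predict are hypothetical.

```python
import random


def check_loss(y: float, q_hat: float, tau: float) -> float:
    """Check (pinball) loss rho_tau(y - q_hat) for a quantile level tau."""
    u = y - q_hat
    # Under-prediction (u >= 0) is weighted by tau, over-prediction by 1 - tau.
    return u * (tau - (1.0 if u < 0 else 0.0))


def mean_check_loss_random_tau(data, predict, seed=0):
    """Average check loss with one tau ~ Uniform(0, 1) drawn per observation.

    `data` is a list of (x, y) pairs and `predict(x, tau)` is a hypothetical
    stand-in for a fitted conditional quantile function.
    """
    rng = random.Random(seed)
    total = 0.0
    for x, y in data:
        tau = rng.random()
        total += check_loss(y, predict(x, tau), tau)
    return total / len(data)
```

Drawing a fresh quantile level for each observation is one simple way to target many quantiles in a single optimization rather than refitting at each level separately.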

R_code folder

As noted above, the working directory, such as "/yourlocalmachine/", must be set manually in every R code file for the PGQR model to run successfully. Under the R_code folder, we provide code for training PGQR, saving the results in .RData form, plotting the graphs presented in the paper, and constructing the results tables for the simulation studies and real data analyses in Section 6 of the manuscript.

A. PGQR Simulation Train

  • PGQR.R trains the PGQR model under different simulation settings (see the code annotations and the descriptions in the paper). The results should be saved under the path "/result/2000", where "/2000/" is the corresponding sample size.
  • data_gen.R generates the simulation dataset. It should be placed in the "/R_code/" directory.
  • model_fit.R runs the Python code for the deep generative models such as PGQR, GCDS or WGCS.

B. PGQR Simulation Graph

  • graph.R plots the graphs from the saved results (in .RData form). The resulting graphs are saved in "/yourlocalmachine/result/2000/".

C. PGQR Simulation Table

  • table_compute.R computes the simulation table in the paper and saves the corresponding results.
  • table_eval.R evaluates the performance measures described in the paper from the results saved by table_compute.R and summarizes them in table form.
  • quantile_plot.R plots the predicted mean squared error (PMSE) at different quantile levels, evaluated from the results of table_compute.R.
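To make the PMSE measure concrete, here is a small sketch under the assumption that the PMSE at a quantile level is the average squared gap between the estimated and the true conditional quantiles over the test points. The exact definition used by table_compute.R is given in the paper; the function names and the dictionary layout below are hypothetical.

```python
def pmse(predicted, truth):
    """Mean squared difference between predicted and true quantiles."""
    if len(predicted) != len(truth) or len(predicted) == 0:
        raise ValueError("predicted and truth must be non-empty and equal length")
    return sum((p - t) ** 2 for p, t in zip(predicted, truth)) / len(predicted)


def pmse_by_quantile(pred_by_tau, true_by_tau):
    """Compute PMSE separately for each quantile level.

    Both arguments map a quantile level tau to a list with one value per
    test point (assumed layout, for illustration only).
    """
    return {tau: pmse(pred_by_tau[tau], true_by_tau[tau]) for tau in pred_by_tau}
```

For example, pmse_by_quantile({0.5: [0.0, 0.0]}, {0.5: [1.0, 1.0]}) gives {0.5: 1.0}.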

D. Real data analysis

  • real_fit.R fits PGQR on three real datasets, as well as two classic crossing-quantile benchmark datasets, and saves the results under "/yourlocalmachine/real/".
  • real.table.R evaluates the out-of-sample prediction interval width and coverage rate from the results produced by real_fit.R.
  • cross_quantile.R plots the crossing quantile phenomenon in the motorcycle and bone mass density datasets from the results of real_fit.R.
  • APL_plot.R produces the APL/F ratio plot from the results of real_fit.R.
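The two out-of-sample measures mentioned for real.table.R can be sketched as follows. This is an illustration in Python rather than the repository's R code: the function name interval_metrics is hypothetical, and the lower and upper bounds are assumed to be given, one pair per test observation.

```python
def interval_metrics(y_test, lower, upper):
    """Return (coverage rate, average width) of prediction intervals.

    Coverage is the fraction of test responses that fall inside their
    interval [lower, upper]; width is the mean of upper - lower.
    """
    n = len(y_test)
    if n == 0 or len(lower) != n or len(upper) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    covered = sum(1 for y, lo, hi in zip(y_test, lower, upper) if lo <= y <= hi)
    avg_width = sum(hi - lo for lo, hi in zip(lower, upper)) / n
    return covered / n, avg_width
```

A good interval method keeps the coverage rate close to the nominal level while keeping the average width small, so the two numbers are best read together.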
