
Bayesian Approach to Predicting Well Production

This repo contains a Bayesian analysis for predicting well production from completion parameters in a shale reservoir. The dataset covers 178 wells and 25 features and is not made available. The model is a Bayesian hierarchical linear regression implemented with the PyMC3 library in Python.

Motivation

Completion parameters are the engineering parameters of a well, such as well length, number of stages, and amount of fluid. Geologic parameters are not a direct input to the model, but are handled indirectly through the hierarchical (multi-level) implementation of the linear regression. The grouping in this example is by reservoir zone (3 zones) but can be expanded to include field boundaries (8 fields).

Choice of Model

An algorithm that handles non-linear effects, such as a random forest or a neural net, would likely give a more accurate prediction. In this case, the advantages of linear regression are the ability to interpret the features and to apply multi-level modeling. It also makes a nice entry point to Bayesian modeling.

Target and Features

Choosing a metric for production of a well is always a challenge. The data included the daily output of oil, gas and water for each well. Oil was chosen as the fluid to predict, though gas or some combination could also be chosen. The code includes application of a Savitzky-Golay filter to smooth the oil curve on a 31-day window and return the peak daily production as the target. This is a common industry practice because cumulative numbers can be affected by mechanical or storage issues not related to the reservoir.
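
The target extraction described above can be sketched as follows. This is a minimal illustration and not the repo's actual code: the function name `peak_smoothed_rate`, the polynomial order of 2, and the synthetic rate curve are all assumptions. Only the Savitzky-Golay filter and the 31-day window come from the description.

```python
import numpy as np
from scipy.signal import savgol_filter


def peak_smoothed_rate(daily_oil, window=31, polyorder=2):
    """Return the peak of a Savitzky-Golay-smoothed daily rate series.

    polyorder=2 is an assumed setting; the README only states the window.
    """
    rates = np.asarray(daily_oil, dtype=float)
    if rates.size < window:
        raise ValueError("need at least `window` days of production")
    smoothed = savgol_filter(rates, window_length=window, polyorder=polyorder)
    return float(smoothed.max())


# Synthetic well (made-up numbers): ramp-up, then decline, plus one outlier
days = np.arange(365)
rate = 800.0 * np.exp(-days / 200.0) * (1.0 - np.exp(-days / 10.0))
rate[120] += 2000.0  # single-day spike, e.g. a metering glitch

print("raw peak:     ", rate.max())
print("smoothed peak:", peak_smoothed_rate(rate))
```

Smoothing before taking the maximum keeps a single anomalous day from defining the target, which is the point of using a filtered peak rather than the raw one.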

The features were reduced from 25 to 13 using a two-step process. First, a correlation matrix was generated and one feature from each pair with a correlation greater than 0.9 was removed. Then Lasso regularization was applied to remove additional features that were not important to the regression.
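
A rough sketch of this two-step reduction, using NumPy and scikit-learn. The helper names, the use of absolute correlation, the rule of keeping the first feature in each correlated pair, and the fixed `alpha=0.1` are assumptions made for illustration; the README does not state how these choices were made. The data is synthetic.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler


def drop_correlated(X, names, threshold=0.9):
    """Keep the first feature of every pair whose |correlation| exceeds threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return X[:, keep], [names[j] for j in keep]


def lasso_select(X, y, names, alpha=0.1):
    """Keep features whose Lasso coefficient on standardized inputs is non-zero."""
    Xs = StandardScaler().fit_transform(X)
    coef = Lasso(alpha=alpha).fit(Xs, y).coef_
    return [n for n, c in zip(names, coef) if c != 0.0]


# Synthetic data: "stages" nearly duplicates "length"; "noise" is irrelevant
rng = np.random.default_rng(0)
n = 200
length = rng.normal(size=n)
stages = length + 0.01 * rng.normal(size=n)
fluid = rng.normal(size=n)
noise = rng.normal(size=n)
X = np.column_stack([length, stages, fluid, noise])
names = ["length", "stages", "fluid", "noise"]
y = 3.0 * length + 2.0 * fluid + 0.1 * rng.normal(size=n)

X_reduced, kept = drop_correlated(X, names)
selected = lasso_select(X_reduced, y, kept)
print("after correlation filter:", kept)
print("after Lasso:             ", selected)
```

In practice the Lasso penalty would more likely be tuned by cross-validation (for example with `LassoCV`) than fixed by hand.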

Pooled Model

What makes Bayesian modeling different is that instead of estimating single values for the model parameters and resulting predictions, we estimate distributions and carry the probabilities through the modeling process. "Pooled" here means that a single regression is run on all the data, without regard to reservoir zone or field. If the inputs are standardized, we can interpret the importance of each feature from where its coefficient distribution sits relative to the zero line and how wide it is.
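
One way to read coefficients "relative to the zero line" is to summarize the posterior draws of each coefficient with an interval and check whether it excludes zero. The sketch below assumes the draws are available as a NumPy array of shape (n_draws, n_features), for example extracted from a sampler's trace. The function name, the 95% level, and the stand-in draws are illustrative assumptions, not the repo's code.

```python
import numpy as np


def summarize_coefficients(draws, names, prob=0.95):
    """Summarize posterior draws of shape (n_draws, n_features)."""
    draws = np.asarray(draws, dtype=float)
    tail = 100.0 * (1.0 - prob) / 2.0
    lower, upper = np.percentile(draws, [tail, 100.0 - tail], axis=0)
    summary = {}
    for j, name in enumerate(names):
        summary[name] = {
            "mean": float(draws[:, j].mean()),
            "lower": float(lower[j]),
            "upper": float(upper[j]),
            # Interval entirely on one side of zero: clear direction of effect
            "excludes_zero": bool(lower[j] > 0.0 or upper[j] < 0.0),
        }
    return summary


# Stand-in draws (made up); real ones would come from the fitted model
rng = np.random.default_rng(1)
draws = np.column_stack([
    rng.normal(2.0, 0.1, size=4000),  # narrow and far from zero
    rng.normal(0.0, 1.0, size=4000),  # wide and centered on zero
])
for name, stats in summarize_coefficients(draws, ["stages", "fluid"]).items():
    print(name, stats)
```

A narrow interval far from zero suggests an important feature with a well-determined effect; a wide interval straddling zero suggests the data say little about that feature.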

Besides the expected value, we also get unique probabilities and ranges for each prediction.

A histogram of the uncertainties for each prediction shows a range of +/- 90 to +/- 240 bbl, based on two standard deviations. In a non-Bayesian regression, the single model RMSE of 188 bbl would probably be applied as the uncertainty of every prediction.
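
The contrast between per-prediction uncertainty and a single RMSE can be sketched as below. It assumes posterior predictive draws are available as an array of shape (n_draws, n_wells); the function names and all numbers are made up for illustration and are not results from this analysis.

```python
import numpy as np


def prediction_bands(pred_draws, n_sd=2.0):
    """Per-well mean and +/- n_sd half-width from draws of shape (n_draws, n_wells)."""
    pred_draws = np.asarray(pred_draws, dtype=float)
    mean = pred_draws.mean(axis=0)
    half_width = n_sd * pred_draws.std(axis=0, ddof=1)
    return mean, half_width


def rmse(y_true, y_pred):
    """Single error figure shared by every prediction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Stand-in posterior predictive draws for three hypothetical wells
rng = np.random.default_rng(2)
true_sd = np.array([45.0, 80.0, 120.0])
pred_draws = rng.normal(loc=[600.0, 450.0, 900.0], scale=true_sd, size=(5000, 3))

mean, half_width = prediction_bands(pred_draws)
observed = np.array([640.0, 430.0, 820.0])  # made-up observations
print("per-well +/- 2 SD:", half_width)
print("one RMSE for all: ", rmse(observed, mean))
```

Each well gets its own band, so a well whose inputs are poorly represented in the training data can carry a wider range than a typical well.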

Hierarchical Model

The hierarchical model involves either part-pooling or un-pooling the data so that each reservoir zone is allowed to have its own intercept. This makes sense because at least one of the reservoir zones shows higher average production.

The part-pooled model assumes that the intercepts themselves come from a common distribution, whereas the un-pooled model assumes they have independent distributions. The pooled model is likely to under-fit the data, whereas the un-pooled model is likely to over-fit. The part-pooled model represents a compromise between the two extremes. This is especially helpful when one of the zones has fewer wells, because the common distribution of all zones can be used to "fill in" missing information. This borrowing of information leads to "shrinkage" of the uncertainty of the part-pooled intercepts relative to the un-pooled intercepts. In this case, the part-pooled model reduces the test RMSE by about 20 bbl and is the preferred model.
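
The three intercept structures can be written side by side. The notation is an assumption for illustration: y_i is the target for well i, x_i its standardized features, z[i] its reservoir zone, and the Normal likelihood and Normal common distribution are assumed forms that the README does not spell out.

```latex
\begin{aligned}
\text{Pooled:} \quad
  & y_i \sim \mathcal{N}\!\left(\alpha + \mathbf{x}_i^{\top}\boldsymbol{\beta},\ \sigma^2\right) \\
\text{Un-pooled:} \quad
  & y_i \sim \mathcal{N}\!\left(\alpha_{z[i]} + \mathbf{x}_i^{\top}\boldsymbol{\beta},\ \sigma^2\right),
  \quad \alpha_z \text{ given independent priors} \\
\text{Part-pooled:} \quad
  & y_i \sim \mathcal{N}\!\left(\alpha_{z[i]} + \mathbf{x}_i^{\top}\boldsymbol{\beta},\ \sigma^2\right),
  \quad \alpha_z \sim \mathcal{N}\!\left(\mu_\alpha,\ \tau_\alpha^2\right)
\end{aligned}
```

Under this formulation, the spread of the common distribution controls the compromise: as it goes to zero the zone intercepts collapse to a single value (the pooled model), and as it grows large the zones stop informing each other (the un-pooled model).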

References

This was a helpful example of the PyMC3 workflow by Jonathan Sedar:

http://blog.applied.ai/bayesian-inference-with-pymc3-part-3/
