Resources

Howdy Folks, I'm Michael Pyrcz, an Associate Professor at The University of Texas. I teach and conduct research on data analytics, geostatistics and machine learning. I'm appointed in the Hildebrand Department of Petroleum and Geosystems Engineering, the Jackson School of Geosciences and the Bureau of Economic Geology. I'm also a principal investigator in the College of Natural Sciences Energy Analytics Freshmen Research Initiative and Inventors' Program and a core faculty member in the Machine Learning Laboratory in Computer Sciences.

I feel that the role of professor is a role of service to our community, so I post all my lectures and supporting content online for anyone in the world to learn. I hope this supports:

  • my students with evergreen content long after they finish my courses
  • working professionals facing the digital transformation and interested in learning new skills
  • potential students by breaking down barriers and making our university a welcoming place for everyone interested in learning

Here's an inventory of the online resources I have made to help people learn about spatial data analytics, geostatistics and machine learning. I produced them to support my students during my classes and as an evergreen resource after they finish, and I hope they are also useful to other students and working professionals interested in these topics.

I hear every day from students, working professionals and potential students who benefit from these products!

Michael Pyrcz, Associate Professor, University of Texas at Austin

Novel Data Analytics, Geostatistics and Machine Learning Subsurface Solutions

With over 17 years of experience in subsurface consulting, research and development, Michael has returned to academia driven by his passion for teaching and enthusiasm for enhancing engineers' and geoscientists' impact in subsurface resource development.

One of my various roles is principal investigator at the Texas Center for Geostatistics. For more about Michael, check out these links:

About Michael

Want to learn more about my story, my publications and other contributions to open source? Check these out:

  1. My story of how I got started in engineering and ended up as a professor at The University of Texas at Austin: My Story

  2. My research, approach to research and views on building an inclusive and diverse team: My Research

  3. Nothing is possible without awesome graduate students: My Students

  4. I've written a bit; here are the books: My Books

  5. My peer-reviewed publications: My Papers

  6. My other contributions: My Other Contributions

  7. I wrote an open source Python package for spatial data analytics and geostatistics. Much of it is a translation of GSLIB (Deutsch and Journel, 1998) from the original Fortran to Python for 2D geostatistical methods. I did this to support my students in my Spatial Data Analytics and Geostatistics courses. Check it out, and consider contributing and becoming a coauthor, at GeostatsPy on the PyPI repository and on GitHub.

  8. I do quite a bit on social media; here's why I do it: My Social Media Efforts.

  9. Check out my TEDx talk on 'A Professor's Secret Weapon': TED Talk

  10. Check out my Twitter feed, where I'm the GeostatsGuy, for resources, ideas and positivity most days: Twitter.

  11. I post a lot of code, demonstration workflows and course material to support anyone who wants to learn: My GitHub

  12. I partnered with Prof. John Foster (UT Austin) and Bazean, a technology-enabled energy investment firm, to start the energy-focussed data science education company, daytum. We are currently offering short courses in Energy Data Science.

Online Resources on Spatial Data Analytics, Geostatistics and Machine Learning

Recorded Lectures

I record all my university lectures and post them on YouTube. You are welcome to join my classes!

  1. Introduction - Howdy, I'm Michael

  2. YouTube Channel GeostatsGuy Lectures

  3. Introduction to Data Analytics, Geostatistics and Machine Learning Undergraduate Lectures (Lec00-Lec21)

  4. Subsurface Modeling Graduate Course (Lec00 - Lec22)

  5. Subsurface Machine Learning Graduate Course (Lec00 - Lec18)

  6. Open Source Spatial Data Analytics in Python with GeostatsPy

  7. Introduction to Spatial Continuity

  8. Geostatistical Workflows for Unconventional Reservoirs

  9. Geostatistical Workflows for Unconventional Reservoirs at BEG

  10. What Does a Geoscientist Need to Know About Geostatistics? And Why It Would Be Helpful?

  11. Center for Petroleum and Geosystems Engineering Webinar - Big Data Analytics for Petroleum Engineering: Hype or Panacea?

  12. Michael's Unsolicited Advice and Ideas for a Successful and Happy Career in Our Industry

  13. My interview on AAPG's Digging Deeper podcast with the awesome host Vern Stefanic.

GeostatsPy Python Package Workflows

I wrote a Python package called GeostatsPy for spatial data analytics and geostatistics. Here's a set of demonstration workflows in Python Jupyter Notebooks for many of the fundamental workflow steps, from data preparation and statistical inference to spatial prediction with uncertainty. They go along with the recorded lectures from my courses on my YouTube channel.

Here are the workflows:

  1. GeostatsPy: Reimplementation of GSLIB in Python
  2. Confidence Intervals and Hypothesis Testing with GeostatsPy
  3. Monte Carlo Simulation with GeostatsPy
  4. Bootstrap with GeostatsPy
  5. Data Distributions
  6. Declustering with GeostatsPy
  7. Indicator Kriging with GeostatsPy
  8. Kriging with GeostatsPy
  9. Multivariate Analysis with GeostatsPy
  10. Overfitting Models with GeostatsPy
  11. Plotting Spatial Data with GeostatsPy
  12. Directional Spatial Continuity with GeostatsPy
  13. Spatial Updating with GeostatsPy
  14. Data Transformation with GeostatsPy
  15. Spatial Trend Modeling with GeostatsPy
  16. Multivariate Feature Ranking with GeostatsPy
  17. Variogram Calculation with GeostatsPy
  18. Variogram Modeling with GeostatsPy
  19. Spatial Bootstrap

Interactive Python Workflows to Support Education

I think interactive workflows are excellent tools to support education. For data analytics and machine learning, turning a dial and watching a system or machine change is a great method to gain intuition and experience. I started to put together interactive workflows with ipywidgets and matplotlib. Check them out here:

  1. General Bootstrap
  2. Bootstrap Colored Balls in a Cowboy Hat
  3. DIY Central Limit Theorem
  4. Confidence Interval by Bootstrap and Analytical
  5. Sivia's Bayesian Coin
  6. Spurious Correlation
  7. Correlation Coefficient
  8. Neural Networks
  9. LASSO Regression
  10. Principal Components Analysis
  11. Ridge Regression
  12. Simple Kriging
  13. Stochastic Simulation
  14. Uncertainty with Spatial Aggregation
  15. Kriging String Effect
  16. Uncertainty Model Checking
  17. Variogram Calculation
  18. Variogram Modeling
  19. Spectral Clustering

Resources on Statistics and Probability

  1. Probability Theory – my undergraduate lecture
  2. Statistics – undergraduate lecture
  3. Marginal, Joint & Conditional Probability – slides

Parametric Distributions

Parametric distributions are fundamental to inferential and predictive workflows in statistics and data analytics. Sometimes they are required by theory and often they result from nature. Many students struggle with them, so I made simple demonstrations in Microsoft Excel that cover how to make them from scratch and how to work with them:

  1. How to make them in Excel
  2. Poisson distribution in Excel
  3. Gaussian transform in Excel and Python
  4. Log normal distribution in Excel
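
To give a flavor of building a parametric distribution from scratch, here is a minimal Python sketch of the Poisson distribution. It is not taken from the Excel demonstrations listed above; the function names and the rate value are assumptions chosen only for illustration.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean number of events is lam."""
    return lam**k * math.exp(-lam) / math.factorial(k)

def poisson_cdf(k, lam):
    """Probability of k or fewer events, found by summing the PMF."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

# Hypothetical rate: an average of 2 events per interval (illustration only).
lam = 2.0
for k in range(6):
    print(f"k={k}  P(X=k)={poisson_pmf(k, lam):.4f}  P(X<=k)={poisson_cdf(k, lam):.4f}")
```

Changing lam and rerunning shows how the single parameter controls both the center and the spread of the distribution.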

Hypothesis Testing

Hypothesis testing is all about recognizing the difference that makes a difference. These tests protect us from our belief in small numbers and our bias toward seeing patterns in random phenomena.

  1. Difference in means in Excel and in Python
  2. Difference in variances in Excel and in Python
  3. Difference in distributions in Excel
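
As one way to see a difference-in-means test in action, here is a minimal Python sketch. It uses a permutation test, which may not be the test used in the linked Excel and Python demonstrations; the function name, sample values and number of permutations are all assumptions made for illustration.

```python
import random
from statistics import mean

def permutation_test_mean_diff(a, b, n_perm=2000, seed=42):
    """Two-sided permutation test for a difference in means.

    Returns the estimated p-value: the fraction of random relabelings
    whose absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)          # fixed seed so the result is repeatable
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)            # randomly reassign samples to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            count += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (count + 1) / (n_perm + 1)

# Hypothetical porosity samples (%) from two wells, made up for illustration.
well_a = [12.1, 13.4, 11.8, 12.9, 13.1, 12.5]
well_b = [15.2, 14.8, 16.1, 15.5, 14.9, 15.8]
p_value = permutation_test_mean_diff(well_a, well_b)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value suggests the observed difference would be unusual if the two samples came from the same population. With only a handful of samples per group, the conclusion should be treated with care.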

Demos of Bayesian Statistics

Bayesian approaches are powerful. They integrate prior belief with new observations, and provide explicit uncertainty models and more intuitive credible intervals for uncertainty in model parameters. Here are some accessible demonstrations to get you started thinking like a Bayesian statistician.

  1. The Coin Problem from Sivia (1996) in Excel
  2. Bayesian updating with Gaussian in Excel
  3. Probability given a positive test in Excel
  4. Sivia's Bayesian Coin in Interactive Python
  5. Bayesian Regression in Python
  6. Naive Bayes Regression and Classification in Python
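
The idea of integrating prior belief with new observations can be sketched with a coin of unknown bias. The Python below is a minimal illustration, not the code from the linked demonstrations; it assumes a uniform prior, a discrete grid of candidate bias values, and made-up toss counts, and the function name is hypothetical.

```python
def coin_posterior(heads, tosses, prior=None, n_grid=101):
    """Grid-based Bayesian update for the coin bias H = P(heads).

    Returns the grid of candidate H values and the posterior probability
    mass assigned to each grid point (the masses sum to one).
    """
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    if prior is None:
        prior = [1.0] * n_grid  # assumption: uniform prior over H
    tails = tosses - heads
    # Posterior is proportional to prior times binomial likelihood.
    unnormalized = [p * h**heads * (1.0 - h)**tails for h, p in zip(grid, prior)]
    total = sum(unnormalized)
    return grid, [u / total for u in unnormalized]

# Hypothetical data: 7 heads in the first 10 tosses, then 5 heads in 10 more.
grid, post_first = coin_posterior(heads=7, tosses=10)
_, post_second = coin_posterior(heads=5, tosses=10, prior=post_first)

# Updating in two steps should match a single update with all 20 tosses.
_, post_all = coin_posterior(heads=12, tosses=20)

mode_first = grid[post_first.index(max(post_first))]
mode_second = grid[post_second.index(max(post_second))]
print(f"most probable bias after 10 tosses: {mode_first:.2f}")
print(f"most probable bias after 20 tosses: {mode_second:.2f}")
```

Passing the first posterior back in as the prior for the second batch shows the sequential nature of Bayesian updating: yesterday's posterior is today's prior.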

Other

  1. Bootstrap in Excel, in Python and in R
  2. Spatial Bootstrap in Python
  3. Linear regression in Excel and in R
  4. Loss functions in Excel
  5. Multivariate Analysis

Heterogeneity

Our subsurface systems are heterogeneous and heterogeneity matters in many subsurface prediction problems. Here are some accessible demonstrations to help you get started quantifying heterogeneity.

  1. Making an example well in Excel
  2. Lorenz coefficient in Excel
  3. Hurst coefficient in R
  4. Ripley Cross K in R
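
As an example of quantifying heterogeneity, here is a minimal Python sketch of the Lorenz coefficient. It assumes the common definition based on cumulative flow capacity versus cumulative storage capacity, which may differ in detail from the linked Excel demonstration; the function name and the layer values are made up for illustration.

```python
def lorenz_coefficient(perm, por, thick):
    """Lorenz coefficient from layer permeability, porosity and thickness.

    Layers are sorted by decreasing permeability-to-porosity ratio, then
    cumulative flow capacity (k*h) is integrated against cumulative storage
    capacity (phi*h). Returns 0 for a homogeneous system and approaches 1
    as heterogeneity increases.
    """
    layers = sorted(zip(perm, por, thick), key=lambda l: l[0] / l[1], reverse=True)
    total_flow = sum(k * h for k, _, h in layers)
    total_storage = sum(p * h for _, p, h in layers)
    area = 0.0
    f_prev = c_prev = 0.0
    for k, p, h in layers:
        f = f_prev + k * h / total_flow       # cumulative flow capacity
        c = c_prev + p * h / total_storage    # cumulative storage capacity
        area += 0.5 * (f + f_prev) * (c - c_prev)  # trapezoid rule
        f_prev, c_prev = f, c
    # Twice the area between the Lorenz curve and the 45 degree line.
    return 2.0 * (area - 0.5)

# Hypothetical well: four layers of equal thickness (illustration only).
perm = [1000.0, 100.0, 10.0, 1.0]   # permeability, mD
por = [0.25, 0.20, 0.15, 0.10]      # porosity, fraction
thick = [1.0, 1.0, 1.0, 1.0]        # thickness, m
lc = lorenz_coefficient(perm, por, thick)
lc_uniform = lorenz_coefficient([50.0] * 4, [0.2] * 4, [1.0] * 4)
print(f"Lorenz coefficient: {lc:.3f} (heterogeneous), {lc_uniform:.3f} (uniform)")
```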

Machine Learning

I have a new Subsurface Machine Learning course that builds from fundamental probability to artificial neural networks. The recorded lectures are available here:

You are welcome to follow along! The demonstration workflows from the lectures are here:

  1. Feature Ranking in Python
  2. Feature Transformations in Python
  3. Feature Uncertainty in Python
  4. Dimensional Reduction in Python and in R
  5. Clustering in Python
  6. Principal Components Analysis in Python
  7. Multidimensional Scaling and Random Projection in Python
  8. Linear Regression in Python
  9. Ridge Regression in Python
  10. LASSO Regression in Python
  11. Isotonic Regression in Python
  12. Bayesian Regression in Python
  13. Polynomial Regression in Python
  14. Naive Bayes Regression and Classification in Python
  15. Time Series Analysis
  16. k Nearest Neighbour
  17. Decision tree in Python, Python Advanced and in R
  18. Gradient Boosting in Python and Advanced Gradient Boosting in Python
  19. Support Vector Machines
  20. Neural Networks
  21. Course Conclusion
  22. scikit learn Overview

Geostatistics

  1. GeostatsPy: Reimplementation of GSLIB in Python
  2. Introduction to Data Analytics, Geostatistics and Machine Learning Undergraduate Lectures (Lec00-Lec21)
  3. What Does a Geoscientist Need to Know About Geostatistics? And Why It Would Be Helpful? and PPT
  4. Exercises, hands-on and demonstrations PPT Inventory
  5. Functions that reimplement or call GSLIB exes in Python
  6. Demo of the functions in Python
  7. Declustering in Python and with PyGSLIB Package
  8. Declustering and Debiasing in Excel
  9. Variogram calculation in Excel and in R
  10. Full variogram Calculation and Modeling in Excel and in PyGSLIB Package

Supplemental Slides

  1. Facies criteria in PPT
  2. Value of quantification in PPT
  3. Stationarity in PPT
  4. Uncertainty in PPT
  5. Suggested books in PPT
  6. Simple kriging in Excel and in R
  7. Uncertainty Away from Data in Excel
  8. Convolution methods in Python
  9. LU Simulation in Python
  10. Sequential Gaussian simulation in Excel and in R
  11. Truncated Gaussian simulation in Excel
  12. Spatial uncertainty in Excel
  13. Volume-variance relations in Excel
  14. Working with realizations in R
  15. Lecture on value in industry in PPT

I hope these resources are useful.

Want to Work Together?

I hope that this is helpful to those that want to learn more about subsurface modeling, data analytics and machine learning. Students and working professionals are welcome to participate.

  • Want to invite me to visit your company for training, mentoring, project review, workflow design and consulting? I'd be happy to drop by and work with you!

  • Interested in partnering, supporting my graduate student research or my Subsurface Data Analytics and Machine Learning consortium (co-PIs including Profs. Foster, Torres-Verdin and van Oort)? My research combines data analytics, stochastic modeling and machine learning theory with practice to develop novel methods and workflows to add value. We are solving challenging subsurface problems!

  • I can be reached at [email protected].

I'm always happy to discuss,

Michael

Michael Pyrcz, Ph.D., P.Eng., Associate Professor, The Hildebrand Department of Petroleum and Geosystems Engineering, Bureau of Economic Geology, The Jackson School of Geosciences, The University of Texas at Austin

More Resources Available at: Twitter | GitHub | Website | GoogleScholar | Book | YouTube | LinkedIn
