causal-inference-and-discovery-in-python's Introduction

Causal Inference and Discovery in Python

This is the code repository for Causal Inference and Discovery in Python, published by Packt.

Unlock the secrets of modern causal machine learning with DoWhy, EconML, PyTorch and more

What is this book about?

Causal methods present unique challenges compared to traditional machine learning and statistics. Learning causality can be challenging, but it offers distinct advantages that elude a purely statistical mindset. Causal Inference and Discovery in Python helps you unlock the potential of causality.

You’ll start with basic motivations behind causal thinking and a comprehensive introduction to Pearlian causal concepts, such as structural causal models, interventions, counterfactuals, and more. Each concept is accompanied by a theoretical explanation and a set of practical exercises with Python code.

Next, you’ll dive into the world of causal effect estimation, progressing steadily towards modern machine learning methods. Step by step, you’ll discover the Python causal ecosystem and harness the power of cutting-edge algorithms. You’ll further explore the mechanics of how “causes leave traces” and compare the main families of causal discovery algorithms.

The final chapter gives you a broad outlook into the future of causal AI where we examine challenges and opportunities and provide you with a comprehensive list of resources to learn more.

This book covers the following exciting features:

  • Master the fundamental concepts of causal inference
  • Decipher the mysteries of structural causal models
  • Unleash the power of the 4-step causal inference process in Python
  • Explore advanced uplift modeling techniques
  • Unlock the secrets of modern causal discovery using Python
  • Use causal inference for social impact and community benefit

If you feel this book is for you, get your copy today!

Instructions and Navigations

All of the code is organized into folders.

The code will look like the following:

preds = causal_bert.inference(
    texts=df['text'],
    confounds=df['has_photo'],
)[0]

Following is what you need for this book:

This book is for machine learning engineers, data scientists, and machine learning researchers looking to extend their data science toolkit and explore causal machine learning. It will also help developers familiar with causality who have worked in another technology and want to switch to Python, and data scientists with a history of working with traditional causality who want to learn causal machine learning. It’s also a must-read for tech-savvy entrepreneurs looking to build a competitive edge for their products and go beyond the limitations of traditional machine learning.

With the following software and hardware list you can run all code files present in the book (Chapters 1-15).

Software and Hardware List

Chapter | Software required | OS required
1-15 | Python 3.9 | Windows, macOS, or Linux
1-15 | DoWhy 0.8 | Windows, macOS, or Linux
1-15 | EconML 0.12.0 | Windows, macOS, or Linux
1-15 | CATENets 0.2.3 | Windows, macOS, or Linux
1-15 | gCastle 1.0.3 | Windows, macOS, or Linux
1-15 | Causica 0.2.0 | Windows, macOS, or Linux
1-15 | Causal-learn 0.1.3.3 | Windows, macOS, or Linux
1-15 | Transformers 4.24.0 | Windows, macOS, or Linux
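
To compare an existing environment against the versions listed above, a small check like the following may help. It is a minimal sketch: the version strings are copied from the list, while the distribution names passed to `importlib.metadata` (for example `gcastle` and `causal-learn`) are assumptions about how each package is published, and `check_versions` is a hypothetical helper, not part of the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the software list; the distribution names are
# assumptions and may need adjusting for your package index.
EXPECTED = {
    "dowhy": "0.8",
    "econml": "0.12.0",
    "catenets": "0.2.3",
    "gcastle": "1.0.3",
    "causica": "0.2.0",
    "causal-learn": "0.1.3.3",
    "transformers": "4.24.0",
}


def check_versions(expected):
    """Return {name: (expected_version, installed_version_or_None)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in check_versions(EXPECTED).items():
        status = "OK" if installed == wanted else "MISMATCH"
        print(f"{name}: expected {wanted}, found {installed} [{status}]")
```

A mismatch is not necessarily a problem, but several of the issues reported for this repository involve library versions newer than the ones listed.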

Join our Discord server

Join our Discord community to meet like-minded people and learn alongside more than 2000 members.

Get to Know the Author

Aleksander Molak is a Machine Learning Researcher and Consultant who gained experience working with Fortune 100, Fortune 500, and Inc. 5000 companies across Europe, the USA, and Israel, designing and building large-scale machine learning systems. On a mission to democratize causality for businesses and machine learning practitioners, Aleksander is a prolific writer, creator, and international speaker. As a co-founder of Lespire, an innovative provider of AI and machine learning training for corporate teams, Aleksander is committed to empowering businesses to harness the full potential of cutting-edge technologies that allow them to stay ahead of the curve. He's the host of the Causal AI-centered Causal Bandits Podcast.

Note from the Author:

Environment installation

  1. See the section Using graphviz and GPU below

  2. To install the basic environment run: conda env create -f causal_book_py39_cuda117.yml

  3. To install the environment for notebook Chapter_11.2.ipynb run: conda env create -f causal-pymc.yml

Selecting the kernel

After a successful installation of the environment, open your notebook and select the kernel causal_book_py39_cuda117

For notebook Chapter_11.2.ipynb change kernel to causal-pymc

Using graphviz and GPU

Note: Depending on your system settings, you might need to install graphviz manually in order to recreate the graph plots in the code. Check https://pypi.org/project/graphviz/ for instructions specific to your operating system.

Note 2: To use GPU you'll need to install CUDA 11.7 drivers. This can be done here: https://developer.nvidia.com/cuda-11-7-0-download-archive

Citation

BibTeX

@book{Molak2023,
    title={Causal Inference and Discovery in Python: Unlock the secrets of modern causal machine learning with DoWhy, EconML, PyTorch and more},
    author={Molak, Aleksander},
    publisher={Packt Publishing},
    address={Birmingham},
    edition={1.},
    year={2023},
    isbn={1804612987},
    note={\url{https://amzn.to/3RebWzn}}
}

APA

Molak, A. (2023). Causal Inference and Discovery in Python: Unlock the secrets of modern causal machine learning with DoWhy, EconML, PyTorch and more. Packt Publishing.

‼️ Known mistakes // errata

For known errors and corrections check:

If you spotted a mistake, let us know at book(at)causalpython.io or just open an issue in this repo. Thank you 🙏🏼

causal-inference-and-discovery-in-python's Issues

[Chapter9] Causal_estimator.effect for prediction

Hello, I am Jake Lee from Korea, a passionate reader of your book.

I have found that some of the code in the prediction part of Ch09 does not work.
Your code flow is as follows:

  1. Instantiate the causal model
  2. Estimand
  3. Estimate
  4. Predict on test data using .causal_estimator.effect

However, step 4 does not work on my side (the error says there is no causal_estimator object).
I would appreciate your help with this, especially for code running on the latest DoWhy version (11.0).

Thanks in advance!
Jake

Chapter 7 notebook array shape error messages

Possibly related to the numba deprecation warning, the following code raises an array shape error, which then propagates errors to the remaining cells in the chapter 7 notebook. I tried upgrading shap to 0.42.0, which resolved some but not all of the errors.

estimate = model.estimate_effect(
    identified_estimand=estimand,
    method_name='backdoor.econml.dml.DML',
    method_params={
        'init_params': {
            'model_y': GradientBoostingRegressor(),
            'model_t': GradientBoostingRegressor(),
            'model_final': LassoCV(fit_intercept=False),
        },
        'fit_params': {}}
)

print(f'Estimate of causal effect (DML): {estimate.value}')

A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
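
The message above refers to the difference between a column vector of shape (n_samples, 1) and a 1-D array of shape (n_samples,). The sketch below only illustrates that difference with NumPy on made-up values; whether reshaping the outcome resolves the notebook error is not confirmed in this thread.

```python
import numpy as np

# Made-up outcome values stored as a column vector, shape (3, 1).
y_column = np.array([[1.0], [2.0], [3.0]])

# ravel() flattens the array to one dimension, shape (3,), which is the
# layout the message asks for.
y_flat = y_column.ravel()

print(y_column.shape, y_flat.shape)
```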

Environment for M1 Silicon

Hi Alex,

Please find below an environment file that successfully runs the GPU-related code in Chapters 11.1 and 14 (attached as .txt because .yml uploads are not accepted; you should be able to just change the extension back):
causal_book_py39_for_m1.txt. The changes from the yml provided in your repo are:

  • remove - nvidia from channels; remove - pytorch and - pytorch-cuda=11.7 from dependencies
  • add - notebook=6.5 to dependencies

Then replace the set device cell with

# Set device
device = "mps" if torch.backends.mps.is_available() else "cpu"

I still then had to pip install CausalPy once the env was activated.

The full yml as exported by conda is
causal_book_py39_applem1.txt

Notes:

  • This has only been tested to run on notebooks 11.1 and 14 but I did not closely monitor whether the results were the same. I'm only assuming at this point it should run fine on the other chapters
  • In notebook 14, "Expert knowledge" section, in the cell after the one with augmented Lagrangian loss objects (first line assert len(dataset_train.batch_size) == 1, "Only 1D batch size is supported"), an error occurs with the message "NotImplementedError: The operator 'aten::triu_indices' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable PYTORCH_ENABLE_MPS_FALLBACK=1 to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS."
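
The temporary fix named in that error message can be applied from inside a notebook. This is a minimal sketch that assumes the variable has to be set before torch is first imported (so it belongs in the first cell, or requires a kernel restart); it falls back to the CPU if torch or the MPS backend is unavailable.

```python
import os

# Assumption: the variable must be set before torch is imported to take effect.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

try:
    import torch
except ImportError:
    torch = None  # lets the sketch run even where torch is not installed

# Older torch builds have no MPS backend, hence the defensive getattr.
mps_backend = getattr(torch.backends, "mps", None) if torch is not None else None

# Set device
device = "mps" if mps_backend is not None and mps_backend.is_available() else "cpu"
print(device)
```

As the error message warns, operators that fall back to the CPU will run more slowly than they would natively on MPS.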

Dowhy CausalModel does not have 'causal_estimator' attribute

Both chapter 9 and chapter 10 notebooks have code like effect_pred = model.causal_estimator.effect().
I got an error running them: AttributeError: 'CausalModel' object has no attribute 'causal_estimator'

The book states that it uses DoWhy 0.8, but I am currently using DoWhy 0.10.1 (I want to keep my learning experience up to date), and I cannot determine whether that is the cause. If it is, how should the model be applied to the test dataset with the current version of DoWhy? If not, what have I missed?

Thanks!
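
One way to make the notebooks tolerant of this difference is to look the fitted estimator up defensively instead of hard-coding `model.causal_estimator`. The sketch below is an assumption-laden illustration, not a confirmed fix: `find_fitted_estimator` is a hypothetical helper, and the idea that newer DoWhy releases expose the fitted estimator on the estimate object as `estimator` should be checked against your installed version. The stand-in objects exist only so the example runs without DoWhy.

```python
from types import SimpleNamespace


def find_fitted_estimator(model, estimate):
    """Return the first fitted-estimator object found, or None."""
    candidates = (
        (model, "causal_estimator"),  # attribute used in the book's notebooks
        (estimate, "estimator"),      # ASSUMED location in newer releases
    )
    for obj, attr in candidates:
        found = getattr(obj, attr, None)
        if found is not None:
            return found
    return None


# Stand-in objects for illustration only; real code would pass the
# CausalModel instance and the result of model.estimate_effect(...).
old_style = find_fitted_estimator(
    SimpleNamespace(causal_estimator="fitted (old layout)"), SimpleNamespace()
)
new_style = find_fitted_estimator(
    SimpleNamespace(), SimpleNamespace(estimator="fitted (assumed new layout)")
)
print(old_style, "|", new_style)
```

If the helper returns None, neither attribute exists and the release notes for the installed DoWhy version are the place to look.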

Small errata for p27 of book (post-June 2023)

Loving that these resources are on GitHub - thank you @AlxndrMlk!

Quick notational suggestion: on p27, the line $X_{sample} = 1.9 < X < -1.9$ doesn't make sense at the moment.
The right-side reads as a boolean, and the condition can never be satisfied, as written.

Would be clearer what's meant if it was "...sampled according to the condition $X < - 1.9$ or $X > 1.9$".
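
The suggested condition can be written directly as a filter. This is a minimal sketch using only the standard library; drawing X from a standard normal distribution is an assumption made for illustration, and `sample_tails` is a hypothetical helper rather than code from the book.

```python
import random


def sample_tails(n, threshold=1.9, seed=0):
    """Draw n values satisfying X < -threshold or X > threshold."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        x = rng.gauss(0.0, 1.0)  # ASSUMPTION: X is standard normal
        if x < -threshold or x > threshold:
            samples.append(x)
    return samples


x_sample = sample_tails(5)
print(all(x < -1.9 or x > 1.9 for x in x_sample))
```

Written this way, the condition is a disjunction of two tails, which is what the chained inequality on p27 cannot express.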

Notebook 13 np.np typo

Notebook 13 seems to have a typo of np.np.tril, which causes an error. Changing to np.tril resolves this error.

Solving environment not completing

The command conda env create -f causal_book_py39_cuda117.yml never finishes. Tested on Windows and WSL. Anyone else encountering this issue? Found a solution?
