google / trimmed_match

This Python library implements Trimmed Match for analyzing randomized paired geo experiments and also implements Trimmed Match Design for designing randomized paired geo experiments.

License: Apache License 2.0

Starlark 1.98% Python 68.24% C++ 15.05% Jupyter Notebook 14.73%
geo-experiments experimental-design robust-statistics iroas causal-inference abtest

Trimmed Match: a robust statistical technique for measuring ad effectiveness through the design and analysis of randomized geo experiments

Copyright (C) 2020 Google LLC. License: Apache 2.0

Disclaimer

This is not an officially supported Google product. For research purposes only.

Description

Measuring the effectiveness of online advertising (e.g. search, display, video) properly is a fundamental problem, not only for advertisers but also for Google. Randomized geo experiments (Vaver & Koehler, 2011) are recognized as the gold standard for such measurements, but designing and analyzing them properly is a non-trivial statistical problem. Unlike the usual A/B tests, a geo experiment (GeoX) usually has a small number of geos. Moreover, there is often severe heterogeneity across geos, which makes traditional regression adjustment less reliable. Furthermore, due to temporal dynamics, the treatment and control geos may become less comparable during the test period even if they were comparable during the design phase; this is often obvious from the period after the design was completed but before the experiment started. Trimmed Match (Chen & Au, 2019) was developed to address these issues in analyzing randomized paired geo experiments. We also apply Trimmed Match, optimal pairing and cross validation to improve the traditional design of matched pairs (Chen, Longfils & Remy, 2021).

This version contains:

  • a C++ core library and Python wrapper for Trimmed Match,
  • a Python package for geo experimental design (preliminary version) using Trimmed Match and cross validation, and
  • corresponding Colab demos, one for post analysis and one for experimental design.

Installation

Our current version has been tested with Python 3.7 on Linux. The code may be incompatible with Python 3.6 or earlier.
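A minimal sketch for checking the interpreter version before attempting the build. The 3.7 threshold comes from the statement above; the names MIN_VERSION and is_supported are hypothetical helpers for illustration and are not part of the package.

```python
import sys

# Assumption: 3.7 is treated as the minimum, based on the tested version above.
MIN_VERSION = (3, 7)


def is_supported(version_info=None, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = "%d.%d" % tuple(sys.version_info[:2])
    if is_supported():
        print("Python %s meets the tested minimum" % current)
    else:
        print("Python %s is older than the tested version; the build may fail" % current)
```

Newer Python versions pass this check, but only 3.7 is stated as tested, so a passing check does not guarantee a successful build.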

Prerequisites

  • A Python 3 development environment with setuptools and git:
sudo apt-get install python3-dev python3-setuptools git

Trimmed Match can be installed using setuptools and pip.

First clone from github:

git clone https://github.com/google/trimmed_match

Then build and install the extension using the supplied setup.py file with setuptools and pip:

python3 -m pip install ./trimmed_match

This will automatically build the extension using Bazel.

You can run the unit tests using:

cd trimmed_match
PYTHON_BIN_PATH=`which python3` bazel test //...:all

Note that unit tests require package dependencies to be installed. This automatically happens when the package is installed as above using pip. Otherwise, the dependencies can be installed manually using the following command:

python3 -m pip install absl-py matplotlib numpy pandas seaborn scipy

Usage

If you prefer not to write code, the best way to learn the package is to follow one of the notebooks; the recommended way to open them is Google Colab.

If you prefer to write Python, here is a toy example.

import trimmed_match
from trimmed_match.estimator import TrimmedMatch


#####################################################################
# Reports the estimate of incremental return on ad spend (iROAS)
# using geo experimental data from a matched pairs design (5 geo pairs)
#####################################################################

delta_response = [1, 10, 3, 8, 5]  # response difference between treatment and control in each geo pair
delta_spend = [1, 5, 2, 5, 3]      # spend difference between treatment and control in each geo pair
confidence_level = 0.80            # for the two-sided confidence interval
tm = TrimmedMatch(delta_response, delta_spend)
report = tm.Report(confidence=confidence_level)
print('iroas=%.2f, ci=(%.2f, %.2f)' % (
      report.estimate, report.conf_interval_low, report.conf_interval_up))

# iroas=1.60, ci=(1.52, 1.66)
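To give some intuition for the reported point estimate, the sketch below recomputes it by hand in plain Python, without calling the library. It follows the idea in Chen & Au (2019) that the estimate makes the trimmed mean of the per-pair residuals (response difference minus iROAS times spend difference) equal to zero. Trimming exactly one pair on each side is an assumption made for this illustration; the library chooses the trim rate itself, and this sketch does not reproduce that choice or the confidence interval.

```python
delta_response = [1, 10, 3, 8, 5]
delta_spend = [1, 5, 2, 5, 3]
theta = 1.60  # point estimate printed by the toy example above

# Residual of each geo pair at the candidate iROAS value theta.
residuals = [r - theta * s for r, s in zip(delta_response, delta_spend)]

# Assumption for illustration: trim the single smallest and single largest residual.
order = sorted(range(len(residuals)), key=lambda i: residuals[i])
kept = order[1:-1]

# With the extreme pairs removed, the remaining residuals sum to (numerically) zero.
trimmed_sum = sum(residuals[i] for i in kept)

# Equivalent view: ratio of totals over the untrimmed pairs only.
trimmed_ratio = sum(delta_response[i] for i in kept) / sum(delta_spend[i] for i in kept)

# For comparison: the untrimmed ratio of totals over all five pairs.
naive_ratio = sum(delta_response) / sum(delta_spend)

print('kept pairs (0-based): %s' % kept)
print('trimmed ratio=%.2f, untrimmed ratio=%.2f' % (trimmed_ratio, naive_ratio))
# trimmed ratio=1.60, untrimmed ratio=1.69
```

The two trimmed pairs are the first and second ones. Dropping them moves the ratio from 1.69 to 1.60, which is how a few atypical geo pairs lose their influence on the estimate.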

References

Aiyou Chen and Tim Au (2019). Robust Causal Inference for Incremental Return on Ad Spend with Randomized Geo Experiments. (Accepted by Annals of Applied Statistics) (https://research.google/pubs/pub48448/)

Aiyou Chen, Marco Longfils, and Nicolas Remy (2021). Trimmed Match Design for Randomized Paired Geo Experiments. (https://research.google/pubs/pub50322/)

Jon Vaver and Jim Koehler (2011). Measuring Ad Effectiveness Using Geo Experiments. (https://research.google/pubs/pub38355/)

Contact and mailing list

If you want to contribute, please read CONTRIBUTING and send us pull requests. You can also report bugs or file feature requests.

If you'd like to talk to the developers or get notified about major updates, you may want to subscribe to our mailing list.

Developers

  • Aiyou Chen
  • Marco Longfils
  • Christoph Best
  • Emeric Thibaud
  • Matteo Courthoud
  • Yang Li
  • Bicheng Ying

trimmed_match's Issues

Installation Problem - error: could not create 'build': File exists

Hello there.

I have made a python 3.7 environment, and then I have also installed Bazel as instructed via Homebrew.

I have then cloned the repository onto the machine - so far so good.

When I then try to install the package with this command:

python3 -m pip install ./trimmed_match

It begins installing dependencies (numpy etc.) just fine, and then when we get to this block, things go south:

Building wheels for collected packages: trimmed-match
  Building wheel for trimmed-match (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [5 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      error: could not create 'build': File exists
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for trimmed-match
  Running setup.py clean for trimmed-match
Failed to build trimmed-match
Installing collected packages: pytz, typing_extensions, six, pyparsing, pillow, packaging, numpy, MarkupSafe, fonttools, cycler, absl-py, scipy, python-dateutil, kiwisolver, jinja2, pandas, matplotlib, seaborn, trimmed-match
  Running setup.py install for trimmed-match ... error
  error: subprocess-exited-with-error
  
  × Running setup.py install for trimmed-match did not run successfully.
  │ exit code: 1
  ╰─> [7 lines of output]
      running install
      /Users/xx.xx/anaconda3/envs/xx/lib/python3.7/site-packages/setuptools/command/install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
        setuptools.SetuptoolsDeprecationWarning,
      running build
      running build_py
      creating build
      error: could not create 'build': File exists
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> trimmed-match

Please help!
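One possible cause, offered as an assumption and not a confirmed diagnosis: the log shows setuptools failing to create a directory named build, which is what happens when an entry with that name already exists as a regular file. On a case-insensitive file system (the macOS default), a file named BUILD would count as the same name. The sketch below only checks whether such a file is present in a directory; the function name find_name_clashes is hypothetical, and the path in the usage line is the clone directory from the report above.

```python
import os


def find_name_clashes(directory, name="build"):
    """List regular files in `directory` whose name equals `name`, ignoring case."""
    clashes = []
    for entry in os.listdir(directory):
        full_path = os.path.join(directory, entry)
        if entry.lower() == name.lower() and os.path.isfile(full_path):
            clashes.append(entry)
    return sorted(clashes)


if __name__ == "__main__":
    repo_dir = "./trimmed_match"  # clone location used in the install command
    if os.path.isdir(repo_dir):
        print(find_name_clashes(repo_dir))
```

An empty list means this explanation does not apply, and the cause lies elsewhere.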

Error in installing the trimmed match package

I was following the installation guide and got an error when installing the extension using the supplied setup.py file with setuptools and pip, by running "python3 -m pip install ./trimmed_match" in the terminal.

Please see the error output below:

(base) user:~ username$ python3 -m pip install ./trimmed_match

Processing ./trimmed_match

Requirement already satisfied: absl-py in ./opt/anaconda3/lib/python3.7/site-packages (from trimmed-match==1.0.0) (0.11.0)

Requirement already satisfied: numpy>=1.8.0rc1 in ./opt/anaconda3/lib/python3.7/site-packages (from trimmed-match==1.0.0) (1.17.2)

Requirement already satisfied: pandas in ./opt/anaconda3/lib/python3.7/site-packages (from trimmed-match==1.0.0) (0.25.1)

Requirement already satisfied: matplotlib in ./opt/anaconda3/lib/python3.7/site-packages (from trimmed-match==1.0.0) (3.3.0)

Requirement already satisfied: scipy in ./opt/anaconda3/lib/python3.7/site-packages (from trimmed-match==1.0.0) (1.3.1)

Requirement already satisfied: seaborn in ./opt/anaconda3/lib/python3.7/site-packages (from trimmed-match==1.0.0) (0.9.0)

Requirement already satisfied: six in ./opt/anaconda3/lib/python3.7/site-packages (from absl-py->trimmed-match==1.0.0) (1.15.0)

Requirement already satisfied: pillow>=6.2.0 in ./opt/anaconda3/lib/python3.7/site-packages (from matplotlib->trimmed-match==1.0.0) (6.2.0)

Requirement already satisfied: cycler>=0.10 in ./opt/anaconda3/lib/python3.7/site-packages (from matplotlib->trimmed-match==1.0.0) (0.10.0)

Requirement already satisfied: python-dateutil>=2.1 in ./opt/anaconda3/lib/python3.7/site-packages (from matplotlib->trimmed-match==1.0.0) (2.8.0)

Requirement already satisfied: kiwisolver>=1.0.1 in ./opt/anaconda3/lib/python3.7/site-packages (from matplotlib->trimmed-match==1.0.0) (1.1.0)

Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.3 in ./opt/anaconda3/lib/python3.7/site-packages (from matplotlib->trimmed-match==1.0.0) (2.4.2)

Requirement already satisfied: setuptools in ./opt/anaconda3/lib/python3.7/site-packages (from kiwisolver>=1.0.1->matplotlib->trimmed-match==1.0.0) (51.3.3)

Requirement already satisfied: pytz>=2017.2 in ./opt/anaconda3/lib/python3.7/site-packages (from pandas->trimmed-match==1.0.0) (2019.3)

Building wheels for collected packages: trimmed-match

Building wheel for trimmed-match (setup.py) ... error

ERROR: Command errored out with exit status 1:

command: /Users/username/opt/anaconda3/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/4l/g4glrsp94337rjpdfdjhq9gw0000gn/T/pip-req-build-y26s5sr9/setup.py'"'"'; __file__='"'"'/private/var/folders/4l/g4glrsp94337rjpdfdjhq9gw0000gn/T/pip-req-build-y26s5sr9/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /private/var/folders/4l/g4glrsp94337rjpdfdjhq9gw0000gn/T/pip-wheel-ewzag689

   cwd: /private/var/folders/4l/g4glrsp94337rjpdfdjhq9gw0000gn/T/pip-req-build-y26s5sr9/

Complete output (5 lines):

running bdist_wheel

running build

running build_py

creating build

error: could not create 'build': File exists


ERROR: Failed building wheel for trimmed-match

Running setup.py clean for trimmed-match

Failed to build trimmed-match

Installing collected packages: trimmed-match

Running setup.py install for trimmed-match ... error

ERROR: Command errored out with exit status 1:

 command: /Users/username/opt/anaconda3/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/4l/g4glrsp94337rjpdfdjhq9gw0000gn/T/pip-req-build-y26s5sr9/setup.py'"'"'; __file__='"'"'/private/var/folders/4l/g4glrsp94337rjpdfdjhq9gw0000gn/T/pip-req-build-y26s5sr9/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/4l/g4glrsp94337rjpdfdjhq9gw0000gn/T/pip-record-u_3c5b_3/install-record.txt --single-version-externally-managed --compile --install-headers /Users/username/opt/anaconda3/include/python3.7m/trimmed-match

     cwd: /private/var/folders/4l/g4glrsp94337rjpdfdjhq9gw0000gn/T/pip-req-build-y26s5sr9/

Complete output (5 lines):

running install

running build

running build_py

creating build

error: could not create 'build': File exists

----------------------------------------

ERROR: Command errored out with exit status 1: /Users/username/opt/anaconda3/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/4l/g4glrsp94337rjpdfdjhq9gw0000gn/T/pip-req-build-y26s5sr9/setup.py'"'"'; __file__='"'"'/private/var/folders/4l/g4glrsp94337rjpdfdjhq9gw0000gn/T/pip-req-build-y26s5sr9/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/4l/g4glrsp94337rjpdfdjhq9gw0000gn/T/pip-record-u_3c5b_3/install-record.txt --single-version-externally-managed --compile --install-headers /Users/username/opt/anaconda3/include/python3.7m/trimmed-match Check the logs for full command output.

Thank you.

KeyError when uploading Data for Design spreadsheet

Hello.

I was able to get the upload to work when I used the sample spreadsheet but when I use my own spreadsheet I get the following error. Do you know why I am getting it? Thanks.

KeyError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2897 try:
-> 2898 return self._engine.get_loc(casted_key)
2899 except KeyError as err:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'response'

The above exception was the direct cause of the following exception:

KeyError Traceback (most recent call last)
2 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2898 return self._engine.get_loc(casted_key)
2899 except KeyError as err:
-> 2900 raise KeyError(key) from err
2901
2902 if tolerance is not None:

KeyError: 'response'
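The traceback shows pandas failing to find a column named 'response', so the uploaded sheet most likely lacks a header with exactly that spelling. The sketch below compares a sheet's header row against a list of expected names and points out near-matches that differ only in case or surrounding whitespace. Only 'response' is confirmed by the traceback; the other names in the required list and the helper name diagnose_columns are assumptions for illustration.

```python
def diagnose_columns(columns, required):
    """Map each missing required name to a near-match in `columns`, or None."""
    present = set(columns)
    # Index the existing headers by a normalized form (trimmed, lower-case).
    normalized = {str(column).strip().lower(): column for column in columns}
    report = {}
    for name in required:
        if name not in present:
            report[name] = normalized.get(name.strip().lower())
    return report


# Hypothetical header row; with pandas this would be list(df.columns).
header = ["date", "geo", "Response ", "cost"]

# Assumption: apart from 'response', these names are illustrative only.
required = ["date", "geo", "response", "cost"]

print(diagnose_columns(header, required))
# {'response': 'Response '}
```

An empty result means every expected header is present with the exact spelling. A value of None means no similarly named column exists at all.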
