google / megalista

First Party data integration solution built for marketing teams to enable audience and conversion onboarding into Google Marketing products (Google Ads, Campaign Manager, Google Analytics).

License: Apache License 2.0

Python 94.45% Shell 4.05% HCL 1.23% Dockerfile 0.27%
google googleads data-integration googleanalytics dataflow python bigquery customermatch conversions audience-targeting

megalista's Introduction

Megalista

Sample integration code for onboarding offline/CRM data from BigQuery as custom audiences or offline conversions in Google Ads, Google Analytics 360, Google Display & Video 360, and Google Campaign Manager.

Disclaimer: This is not an officially supported Google product.

Supported integrations

  • Google Ads

    • Contact Info Customer Match (email, phone, address) [details]
    • Id Based Customer Match (device Id, user id)
    • Offline Conversions through gclid [details]
    • Enhanced Conversions for Leads or Offline Conversions through user_identifiers [details]
    • Store Sales Direct (SSD) conversions [details]
  • Google Analytics (Universal analytics)

  • Campaign Manager

    • Offline Conversions API (user id, device id, match id, gclid, dclid, value, quantity, and customVariables) [details]
  • Google Analytics 4

  • Display & Video

    • Contact Info Customer Match (email, phone, address) [details]
    • Id Based Customer Match (device Id)
  • Appsflyer

    • S2S Offline events API (conversion upload), to be used for audience creation and in-app events with Google Ads and DV360 [details]

How does it work

Megalista was designed to separate the configuration of conversion/audience upload rules from the engine, giving non-technical teams (e.g. Media and Business Intelligence) more freedom to set up multiple upload rules on their own.

The solution consists of two parts: #1, a configuration environment (a Google Sheet, a JSON file, or a Google Cloud Firestore collection) in which all rules are defined by mapping a data source (BigQuery table) to a destination (data upload endpoint), and #2, an Apache Beam workflow running on Google Dataflow, scheduled to upload the data in batch mode.
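The source-to-destination mapping described above can be pictured with a minimal sketch. The class names, field names, and the build_rules helper below are hypothetical and only illustrate the idea; they are not Megalista's actual data model. The upload type string is one that appears in the project's issues.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Source:
    # Hypothetical: a named BigQuery table in "dataset.table" form.
    name: str
    bigquery_table: str


@dataclass(frozen=True)
class Destination:
    # Hypothetical: a named upload endpoint and its upload type.
    name: str
    upload_type: str


@dataclass(frozen=True)
class UploadRule:
    source: Source
    destination: Destination


def build_rules(
    sources: List[Source],
    destinations: List[Destination],
    connections: List[Tuple[str, str]],
) -> List[UploadRule]:
    """Resolve (source name, destination name) pairs into upload rules."""
    by_source = {s.name: s for s in sources}
    by_destination = {d.name: d for d in destinations}
    rules = []
    for source_name, destination_name in connections:
        if source_name not in by_source:
            raise KeyError(f"unknown source: {source_name}")
        if destination_name not in by_destination:
            raise KeyError(f"unknown destination: {destination_name}")
        rules.append(
            UploadRule(by_source[source_name], by_destination[destination_name])
        )
    return rules


# Illustrative data only.
rules = build_rules(
    sources=[Source("crm_buyers", "crm.buyers")],
    destinations=[Destination("ga_import", "GA_DATA_IMPORT")],
    connections=[("crm_buyers", "ga_import")],
)
```

Keeping the rules as plain data is what lets the configuration live in a Sheet, a JSON file, or Firestore while the engine stays unchanged.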

Prerequisites

Google Cloud Services

  • Google Cloud Platform account
    • Billing enabled
    • BigQuery enabled
    • Dataflow enabled
    • Cloud storage enabled
    • Cloud scheduler enabled
    • App Engine enabled
  • At least one of:
    • Google Ads API Access
    • Campaign Manager API Access
    • Google Analytics API Access
    • Display & Video API Access
  • Python3
  • Google Cloud SDK

Access Requirements

These are the minimum roles required to deploy Megalista:

  • OAuth Config Editor
  • BigQuery User
  • BigQuery Job User
  • BigQuery Data Viewer
  • Cloud Scheduler Admin
  • Storage Admin
  • Dataflow Admin
  • Service Account Admin
  • Logs Viewer
  • Service Consumer

APIs

Required APIs will depend on upload endpoints in use.

  • Google Sheets (required if using Sheets configuration) [link]
  • Google Analytics [link]
  • Google Analytics Reporting [link]
  • Google Ads [link]
  • Campaign Manager [link]
  • Google Cloud Firestore [link]
  • Display & Video [link]

Configure Megalista

Megalista can be configured via Google Sheets, a JSON file, or a Google Cloud Firestore collection. Expected data schemas (Sources) and metadata (Destinations) for each use case are defined in the Megalista Wiki.

Instructions for each configuration method can be found in the Megalista Wiki.

Deployment

This guide assumes you are following it inside the Google Cloud Platform Console.

Creating required access tokens

To access campaigns and user lists on Google's platforms, this dataflow will need OAuth tokens for an account that can authenticate in those systems.

To create them, follow these steps:

  • Access the GCP console
  • Go to the APIs & Services section on the top-left menu.
  • Open the OAuth Consent Screen and configure an Internal Consent Screen
  • Then, go to Credentials and create an OAuth client Id with Application type set as Desktop App
  • This will generate a Client Id and a Client secret. Save these values as they are required during the deployment
  • Run the generate_megalista_token.sh script in this folder providing these two values and follow the instructions
    • Sample: ./generate_megalista_token.sh client_id client_secret
  • This will generate the Access Token and the Refresh token
    • The user who opened the generated link and clicked on Allow must have access to the platforms that Megalista will integrate, including the configuration Sheet, if this is the chosen method for configuration.
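For context on why both tokens are kept: an access token expires, and a refresh token is what is exchanged for a new one. The sketch below only builds that exchange request against Google's OAuth 2.0 token endpoint and does not send it. The function name build_refresh_request is hypothetical, and this is not necessarily how Megalista itself refreshes tokens.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

# Google's OAuth 2.0 token endpoint.
TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"


def build_refresh_request(
    client_id: str, client_secret: str, refresh_token: str
) -> Request:
    """Build (but do not send) a refresh-token exchange request."""
    body = urlencode(
        {
            "client_id": client_id,
            "client_secret": client_secret,
            "refresh_token": refresh_token,
            "grant_type": "refresh_token",
        }
    ).encode("ascii")
    return Request(
        TOKEN_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Placeholder values; real ones come from the steps above.
request = build_refresh_request("client_id", "client_secret", "refresh_token")
fields = parse_qs(request.data.decode("ascii"))
```

Sending the request (for example with urllib.request.urlopen) would return a JSON document containing a fresh access token, provided the credentials are valid.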

Deploying Pipeline

  • Download the latest Megalista code. To deploy the full Megalista pipeline, run the following command from the deployment folder: ./deploy.sh. The script requires the parameters listed below; add them to the config.json file. Some parameters have default values, which can be changed.

  • Auxiliary BigQuery dataset for Megalista operations to create

    • This dataset will be used for storing operational data and will be created by the deployment script
  • Google Cloud Storage Bucket to create

    • This Cloud Storage Bucket will be used to store Megalista compiled binary, metadata, and temp files and will be created by the deployment script.
  • Setup Firestore collection, URL for JSON configuration and Setup Sheet Id

    • Only one of these three should be filled in; leave the other two blank, according to the chosen configuration method.
  • Client ID, Client Secret, Access Token and Refresh Token from the previous step.

    Disclaimer: Please store your config.json file in a secure place or delete it after the deployment.
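The "only one of these three" rule can be checked before running the deployment. This is a minimal sketch: the key names setup_sheet_id, setup_json_url, and setup_firestore_collection are assumptions made for illustration and may not match the keys actually used in config.json.

```python
# Hypothetical key names; check config.json for the real ones.
CONFIG_SOURCE_KEYS = (
    "setup_sheet_id",
    "setup_json_url",
    "setup_firestore_collection",
)


def chosen_config_source(config: dict) -> str:
    """Return the single configuration source that is filled in.

    Raises ValueError when none or more than one is set.
    """
    filled = [
        key for key in CONFIG_SOURCE_KEYS if str(config.get(key) or "").strip()
    ]
    if len(filled) != 1:
        raise ValueError(
            f"expected exactly one of {CONFIG_SOURCE_KEYS}, got {filled}"
        )
    return filled[0]


# Illustrative config: only the Sheet Id is filled in.
example = {"setup_sheet_id": "sheet-id", "setup_json_url": ""}
source = chosen_config_source(example)
```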

Updating the Binary

To update the binary without redoing the whole deployment process, run:

  • ./deployment/deploy_cloud.sh gcp_project_id bucket_name region service_account_email

Usage

Every upload method expects a BigQuery table with specific fields as its source, in addition to specific configuration metadata. For details on how to set up your upload routines, refer to the Megalista Wiki.

Error notifications by email

To have uploader errors captured and sent by email, open Cloud Scheduler and, in the parameters section of the request body, add the notify_errors_by_email parameter set to true and the errors_destination_emails parameter set to a comma-separated list of email addresses. Add these parameters to the same list as the pre-configured ones, such as client_id and client_secret.

If the access tokens being used were generated prior to version v4.4, new access and refresh tokens must be generated to activate this feature. This is necessary because old tokens don't have the gmail.send scope.
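As a rough illustration, the two parameters can be merged into an existing parameter set as follows. The parameter names come from the text above; the helper name, the use of the string "true", and the example addresses are assumptions.

```python
from typing import Dict, List


def with_error_notifications(
    parameters: Dict[str, str], emails: List[str]
) -> Dict[str, str]:
    """Return a copy of the parameters with error notifications enabled."""
    cleaned = [email.strip() for email in emails if email.strip()]
    if not cleaned:
        raise ValueError("at least one destination email is required")
    updated = dict(parameters)
    # Assumption: the flag is passed as the string "true".
    updated["notify_errors_by_email"] = "true"
    # Addresses are joined with commas, as the text describes.
    updated["errors_destination_emails"] = ",".join(cleaned)
    return updated


# Placeholder values for the pre-configured parameters.
params = with_error_notifications(
    {"client_id": "client_id", "client_secret": "client_secret"},
    ["first@example.com", " second@example.com "],
)
```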

Note about Google Ads API access

Calls to the Google Ads API will fail if the user who generated the OAuth2 credentials (Access Token and Refresh Token) doesn't have direct access to the Google Ads account to which the calls are directed. It is not enough for the user to have access to an MCC above this account and to be able to reach the account through the interface; the user must have permissions on the account itself.

megalista's People

Contributors

aleflavio-g, anaesqueda, antoniolmm, astivi, blevitan516, caiotomazelli, caiotramontina, charlie6, cmorgeson, cymbaum, dependabot[bot], diogoaihara, donaldseaton, guadix, hatuna, joaoaleixogoogle, joaquimsn, jraucci, jsraucci, lucasrsant, mohabfekry, mr-lopes, nivaldoh, nmuchon, prasadjivane, rickygodoy, roberto-goncalves, slowy07, tdsymonds


megalista's Issues

Trouble uploading audience to GoogleAds

I'm trying to send audience lists to Google Ads, but having the following error in all forms of Customer Match:

ERROR:megalista.GoogleAdsCustomerMatchAbstractUploader:'int' object has no attribute 'name'
Traceback (most recent call last):
  File "/home/bianca_santos/google-marketing-data-sync/megalista-v2/megalista/megalista_dataflow/uploaders/utils.py", line 72, in inner
    return func(*args, **kwargs)
  File "/home/bianca_santos/google-marketing-data-sync/megalista-v2/megalista/megalista_dataflow/uploaders/google_ads/customer_match/abstract_uploader.py", line 209, in process
    execution.destination.destination_metadata))
  File "/home/bianca_santos/google-marketing-data-sync/megalista-v2/megalista/megalista_dataflow/uploaders/google_ads/customer_match/abstract_uploader.py", line 60, in _create_list_if_it_does_not_exist
    customer_id, list_name, list_definition)
  File "/home/bianca_santos/google-marketing-data-sync/megalista-v2/megalista/megalista_dataflow/uploaders/google_ads/customer_match/abstract_uploader.py", line 70, in _do_create_list_if_it_does_not_exist
    resource_name = self._get_user_list_resource_name(customer_id, list_name)
  File "/home/bianca_santos/google-marketing-data-sync/megalista-v2/megalista/megalista_dataflow/uploaders/google_ads/customer_match/abstract_uploader.py", line 109, in _get_user_list_resource_name
    query_aux = f"AND user_list.access_reason={ads_client.enums.AccessReasonEnum.OWNED.name}"
AttributeError: 'int' object has no attribute 'name'
ERROR:megalista.GoogleAdsCustomerMatchAbstractUploader:Error uploading data.
Traceback (most recent call last):
  File "/home/bianca_santos/google-marketing-data-sync/megalista-v2/megalista/megalista_dataflow/uploaders/utils.py", line 72, in inner
    return func(*args, **kwargs)
  File "/home/bianca_santos/google-marketing-data-sync/megalista-v2/megalista/megalista_dataflow/uploaders/google_ads/customer_match/abstract_uploader.py", line 209, in process
    execution.destination.destination_metadata))
  File "/home/bianca_santos/google-marketing-data-sync/megalista-v2/megalista/megalista_dataflow/uploaders/google_ads/customer_match/abstract_uploader.py", line 60, in _create_list_if_it_does_not_exist
    customer_id, list_name, list_definition)
  File "/home/bianca_santos/google-marketing-data-sync/megalista-v2/megalista/megalista_dataflow/uploaders/google_ads/customer_match/abstract_uploader.py", line 70, in _do_create_list_if_it_does_not_exist
    resource_name = self._get_user_list_resource_name(customer_id, list_name)
  File "/home/bianca_santos/google-marketing-data-sync/megalista-v2/megalista/megalista_dataflow/uploaders/google_ads/customer_match/abstract_uploader.py", line 109, in _get_user_list_resource_name
    query_aux = f"AND user_list.access_reason={ads_client.enums.AccessReasonEnum.OWNED.name}"
AttributeError: 'int' object has no attribute 'name'
INFO:megalista.GoogleAdsOfflineUploader:Uploading 1000 rows...
INFO:megalista.GoogleAdsOfflineConversionsUploader:Uploading 1000 offline conversions on customers/5852184472/conversionActions/792598949 to Google Ads.
ERROR:megalista.GoogleAdsOfflineConversionsUploader:Error on uploading offline conversions: Multiple errors in ‘details’. First error: The click or call is owned by a customer account that the uploading customer does not manage., at conversions[0].gclid.
INFO:megalista:Completed successfully!

The account is a MCC Google Ads Account.
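The AttributeError in the traceback comes from reading .name on a value that turned out to be a plain int. One defensive pattern, sketched here with a stand-in enum, is to map the value back through the enum class before reading the name. AccessReason and enum_name are hypothetical and for illustration only; this is not the google-ads library's enum, its member values are illustrative, and this is not a confirmed fix for the issue.

```python
from enum import IntEnum


class AccessReason(IntEnum):
    # Stand-in for illustration; values are not taken from any library.
    UNSPECIFIED = 0
    UNKNOWN = 1
    OWNED = 2


def enum_name(enum_cls, value) -> str:
    """Return the member name whether value is a member or a raw int."""
    if isinstance(value, enum_cls):
        return value.name
    # A raw int is looked up by value; unknown values raise ValueError.
    return enum_cls(value).name


# Both forms resolve to the same name.
from_member = enum_name(AccessReason, AccessReason.OWNED)
from_int = enum_name(AccessReason, 2)
query_aux = f"AND user_list.access_reason={from_int}"
```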

BigQuery column names schema should be "dimension" not "cd" for Data Import Destination.

Hi Guys, I'm using megalista to upload some audiences from bigquery to google analytics data import.

Right now you're checking the column names in the BigQuery data source against the pattern 'cd\d+', but that doesn't work when we upload the data into Google Analytics, since Data Import only accepts the 'dimension\d+' pattern.

So my recommendation is to change the following entry in the script megalista_dataflow/data_sources/data_source.py:

'GA_DATA_IMPORT': {
    'columns': [
        {'name': 'cd\\d+', 'required': True, 'data_type': 'string'},
        {'name': 'cd\\d+', 'required': True, 'data_type': 'string'},
        {'name': 'cd\\d+', 'required': False, 'data_type': 'string'},
    ],
    'groups': []
},

to:

'GA_DATA_IMPORT': {
    'columns': [
        {'name': 'dimension\\d+', 'required': True, 'data_type': 'string'},
        {'name': 'dimension\\d+', 'required': True, 'data_type': 'string'},
        {'name': 'dimension\\d+', 'required': False, 'data_type': 'string'},
    ],
    'groups': []
},

Best,

Gibran
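The mismatch described in this issue can be reproduced with a small regex check. This is a sketch under the assumption that column names are compared against the 'name' patterns with a full match; the matching_columns helper is hypothetical and is not Megalista's actual validation code.

```python
import re
from typing import List


def matching_columns(column_names: List[str], pattern: str) -> List[str]:
    """Return the column names that fully match the given regex pattern."""
    return [name for name in column_names if re.fullmatch(pattern, name)]


# Columns named the way Google Analytics Data Import expects them.
columns = ["dimension1", "dimension2", "user_id"]

# The current pattern finds nothing in such a table.
with_cd = matching_columns(columns, "cd\\d+")

# The proposed pattern finds both custom dimension columns.
with_dimension = matching_columns(columns, "dimension\\d+")

# A pattern that tolerates either naming convention.
with_either = matching_columns(columns + ["cd3"], "(cd|dimension)\\d+")
```

Accepting both prefixes, as in the last line, is one possible way to stay compatible with existing tables, though the issue itself only proposes switching to 'dimension\d+'.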

Allow auth via manager access

I noticed that the README states:

Calls to the Google Ads API will fail if the user that generated the OAuth2 credentials (Access Token and Refresh Token) doesn't have direct access to the Google Ads account to which the calls are being directed. It's not enough for the user to have access to a MCC above this account and being able to access the account through the interface, it's required that the user has permissions on the account itself.

However, the Google Ads API client library for Java supports auth via manager access by specifying login-customer-id as described in https://developers.google.com/google-ads/api/docs/client-libs/java/config-file and https://developers.google.com/google-ads/api/docs/concepts/call-structure#cid.

Error Google Ads integration - DEVELOPER_TOKEN_PARAMETER_MISSING

Hi there,

I am not able to enable Google Ads (ADS_CUSTOMER_MATCH_MOBILE_DEVICE_ID_UPLOAD) and keep getting the following issue.

[Action Required] Megalista error detected - ADS_CUSTOMER_MATCH_MOBILE_DEVICE_ID_UPLOAD
......
......
......
"file":"src/core/lib/surface/call.cc","file_line":1074,"grpc_message":"Request contains an invalid argument.","grpc_status":3}" >, errors { error_code { request_error: DEVELOPER_TOKEN_PARAMETER_MISSING } message: "developer-token parameter is missing." } request_id: "ciDlQz5F_Uv9FWBewaSZuw" , 'ciDlQz5F_Uv9FWBewaSZuw')

Thanks,
-Askar

Error while installing hashicorp/null v3.1.1: unexpected EOF

Initializing the backend...

Initializing provider plugins...

  • Finding latest version of hashicorp/google...
  • Finding hashicorp/null versions matching "3.1.1"...
  • Installing hashicorp/google v4.29.0...
  • Installing hashicorp/null v3.1.1...

    │ Error: Failed to install provider

    │ Error while installing hashicorp/google v4.29.0: unexpected EOF


│ Error: Failed to install provider

│ Error while installing hashicorp/null v3.1.1: unexpected EOF

Dataflow --subnetwork flag missing

The deployment fails on a project where a subnetwork is defined. Please could you add a subnetwork key in the cloud configuration json and append its value on the dataflow call.
Most of our clients have subnets and need this feature. Thanks
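The request can be sketched as follows: read an optional key from the configuration and append the flag only when it is set. The --subnetwork, --project, --region, and --runner flags are standard Apache Beam pipeline options for Dataflow; the configuration key names and the dataflow_args helper are hypothetical.

```python
from typing import Dict, List


def dataflow_args(config: Dict[str, str]) -> List[str]:
    """Build Dataflow pipeline arguments, adding --subnetwork when configured."""
    args = [
        "--runner=DataflowRunner",
        f"--project={config['project_id']}",
        f"--region={config['region']}",
    ]
    # Hypothetical optional key; omitted or empty means no flag is added.
    subnetwork = (config.get("subnetwork") or "").strip()
    if subnetwork:
        args.append(f"--subnetwork={subnetwork}")
    return args


# Illustrative values only.
with_subnet = dataflow_args(
    {
        "project_id": "my-project",
        "region": "us-central1",
        "subnetwork": "regions/us-central1/subnetworks/my-subnet",
    }
)
without_subnet = dataflow_args({"project_id": "my-project", "region": "us-central1"})
```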

Not being able to upload Custom Variables as a part of CM offline conversion data

Hi There,

We are running Megalista implementation for a client for quite a long time. It's been working wonderfully so far.
However, in one of the scenarios, where we use the CM offline conversion upload functionality, it uploads all the data except Custom Variables. Below are the steps that we have performed and confirmed at our end.

  1. All these custom variables are available and enabled on the floodlight level.
  2. We also used regular API calls to push these same variables into the CM platform and that worked fine. So the problem is not with the data or setup on the CM part.

We value your time and effort. However, it would be of huge help if it would be possible to look at this from Megalista's end.
Any help or guidance would be highly appreciated, as we are not sure how to proceed beyond this point.

Additionally, if any input is needed from our end, we would be happy to contribute.

Thanks & Regards,
Sandhya

[Docs] Possible outdated documentation

We have identified 1 possible instance of outdated documentation:

About

This is part of a research project that aims to automatically detect outdated documentation in GitHub repositories. We are evaluating the validity of our approach by identifying instances of outdated documentation in real-world projects.

We hope that this research will be a step towards keeping documentation up-to-date. If this has been helpful, consider updating the documentation to keep it in sync with the source code. If this has not been helpful, consider updating this issue with an explanation, so that we can improve our approach. Thanks!

Question: project name is megalist or megalista?

I have a doubt, @astivi and @caiotomazelli

The name of the repository and documentation is Megalista,
but the folder structure and some parameters use the name megalist.

We understand that the name of the solution is Megalista, and that all code must use the megalist nomenclature. Is that correct?
