
Mozart - Business logic for Search Ads 360

Mozart is a framework for automating tasks on Search Ads 360 (SA360). Mozart lets advertisers and agencies apply their own business logic to SA360 campaigns by leveraging well-known technologies such as Apache Beam.

Mozart is designed to be deployed in an Airflow+Beam platform. The rest of this documentation assumes Google Cloud Platform (GCP) is used for deployment. Composer is the name of GCP's managed version of Airflow, whereas DataFlow is the name of the managed version of Beam.

How it works

Mozart uses the SA360 Reporting API and SA360 Bulksheet uploads to perform its automation tasks.

The sequence of high-level operations is as follows (a minimal sketch of this flow appears after the list):

  1. Reports are downloaded from the SA360 API. These reports must include all entities to be processed (e.g. Keywords, Ads).
  2. The downloaded reports are analyzed, applying the custom business logic. The output of this logic is a CSV file containing updated information about the entities (Keywords, Ads), for example a new Max CPC value for certain keywords.
  3. The CSV files with updated values are uploaded to SA360 using sFTP Bulksheet upload.
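To make the flow concrete, here is a minimal sketch of these three steps as an Airflow DAG. This is purely illustrative and not Mozart's actual code: it assumes Airflow 2.x, and the task callables are empty placeholders.

```python
# Illustrative sketch only -- not Mozart's actual DAG.
# Assumes Airflow 2.x; import paths differ in Airflow 1.x.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def download_reports():
    """Placeholder: request and download the SA360 reports."""


def apply_business_logic():
    """Placeholder: apply the custom logic, e.g. by running the Beam pipeline."""


def upload_bulksheet():
    """Placeholder: upload the resulting CSV files to SA360 via sFTP."""


with DAG(
    dag_id='mozart_flow_sketch',
    start_date=datetime(2018, 10, 30),
    schedule_interval='@daily',
    catchup=False,
) as dag:
    download = PythonOperator(task_id='download_reports', python_callable=download_reports)
    process = PythonOperator(task_id='apply_business_logic', python_callable=apply_business_logic)
    upload = PythonOperator(task_id='upload_bulksheet', python_callable=upload_bulksheet)

    # Mirror the three high-level operations described above.
    download >> process >> upload
```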

Architecture

Mozart consists of two main modules:

  1. An Airflow DAG
  2. A Beam Pipeline
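The Airflow DAG orchestrates the end-to-end flow, while the Beam pipeline applies the business logic to the report rows. As a rough illustration only, a minimal Beam pipeline in Python might look like the sketch below; the real MozartProcessElements template may use a different language, schema, and set of transforms, and the GCS paths are examples.

```python
# Illustrative sketch only -- not the actual MozartProcessElements pipeline.
# Requires the apache-beam package; the GCS paths are examples.
import apache_beam as beam


def apply_business_logic(row):
    """Hypothetical rule: take a CSV report row and return an updated
    row (e.g. with a new Max CPC) ready for the bulksheet upload."""
    fields = row.split(',')
    # ... custom business logic would modify `fields` here ...
    return ','.join(fields)


with beam.Pipeline() as pipeline:
    (
        pipeline
        | 'ReadReport' >> beam.io.ReadFromText('gs://mozart-data/sa360_reports/keywords.csv')
        | 'ApplyLogic' >> beam.Map(apply_business_logic)
        | 'WriteBulksheet' >> beam.io.WriteToText('gs://mozart-data/sa360_upload/keywords')
    )
```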

Set-up

This guide describes how to set up Mozart on Google Cloud Platform.

For the rest of the guide, it is assumed that you have created a Google Cloud project, enabled billing, and have Admin access to the project via console.cloud.google.com.

For instructions on how to create a Google Cloud project and enable billing please refer to Google Cloud Platform documentation.

You must also enable certain Google Cloud Platform APIs to use Mozart. To enable all of them in a single step, click on 'enable APIs' and follow the instructions.

Note: The link to enable APIs might take a while to load. Please be patient.

Pre-requisites

You must have the following software installed on your computer:

Composer set-up

The first step is to set up Google Cloud Composer. To do so, follow these steps:

  1. Create a new Composer environment

    1. Go to console.cloud.google.com/composer

    2. Check 'Enable beta features'

    3. Click on 'Create'

    4. Type in the following options:

    5. Click on 'Create' at the bottom of the page

    Tip: Creating the environment takes a while. We suggest you continue with the other sections and come back later to finish the environment configuration.

  2. Once the environment is created, go to the Composer page and open the 'Airflow webserver' for the newly created environment.

  3. Click on 'Admin' > 'Variables'

  4. Create the following variables:

  • mozart/sa360_agency_id: SA360 agency ID. Example: 123456789
  • mozart/start_date: Enter today's date. Example: 2018-10-30
  • mozart/lookback_days: Number of days back to pull reports for. For example, if you enter '7', you will work with data (clicks, impressions) from the last 7 days. Example: 7
  • mozart/gcp_project: Your Google Cloud project ID. Example: mozart-123456
  • mozart/gcp_zone: The zone of your Composer instance. Example: europe-west1-b
  • mozart/gcs_bucket: Name of the GCS bucket you created (without the 'gs://' prefix). Example: mozart-data
  • mozart/dataflow_staging: GCS URI of the DataFlow staging folder. Example: gs://mozart/staging
  • mozart/dataflow_template: GCS URI of the DataFlow template. Example: gs://mozart/templates/MozartProcessElements
  • mozart/advertisers: JSON describing the advertisers to work with. Each advertiser entry contains the advertiserId and information about the sFTP endpoint for that advertiser. The sFTP endpoint must specify either a sftpConnId or the sFTP connection parameters: sftpHost, sftpPort, sftpUsername, sftpPassword. Any of these individual fields overrides the configuration provided in the connection ID. Example: [{"advertiserId": "123", "sftpConnId": "sa360_sftp", "sftpUsername": "username1", "sftpPassword": "password1"}, {"advertiserId": "456", "sftpConnId": "sa360_sftp", "sftpUsername": "username2", "sftpPassword": "password2"}]
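For reference, a DAG can read these variables with Airflow's Variable API. The snippet below is only an illustration: the variable names match the list above, but the surrounding code is not Mozart's.

```python
# Illustrative only: reading the variables above from inside a DAG.
from airflow.models import Variable

agency_id = Variable.get('mozart/sa360_agency_id')
lookback_days = int(Variable.get('mozart/lookback_days'))

# 'mozart/advertisers' holds a JSON list, so deserialize it.
advertisers = Variable.get('mozart/advertisers', deserialize_json=True)
for advertiser in advertisers:
    print(advertiser['advertiserId'], advertiser.get('sftpConnId'))
```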

DataFlow set-up

  1. Create a Service account
    1. In the GCP Console, go to the Create service account key page.
    2. From the Service account drop-down list, select 'New service account'.
    3. In the Service account name field, enter a name.
    4. From the Role drop-down list, select 'Project > Owner'
      • Use 'Project > Owner' for testing purposes. For a production system, you should select a more restrictive role.
    5. Click 'Create'. A JSON file that contains your key downloads to your computer.
  2. Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the absolute path of the JSON file downloaded in the previous step.
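As an optional sanity check (not part of Mozart itself), you can verify that the key is picked up through GOOGLE_APPLICATION_CREDENTIALS using the google-auth library:

```python
# Optional check that Application Default Credentials resolve correctly.
# Requires the google-auth package and GOOGLE_APPLICATION_CREDENTIALS set
# to the path of the service account key downloaded above.
import google.auth

credentials, project_id = google.auth.default()
print('Authenticated against project:', project_id)
```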

Cloud Storage set-up

Mozart's DataFlow pipeline works with files on Google Cloud Storage. You need to create a bucket where these files will be stored:

  1. Go to console.cloud.google.com/storage

  2. Create a bucket

    1. Click on 'Create bucket'

    2. Choose a name for the bucket

    3. Choose a location that matches the location you used for the Composer configuration

      Note: We suggest using the 'Regional' storage class and the same location as the one used for Composer. However, you may want to choose other options if you plan to use the bucket for storing custom data. Check the Cloud Storage docs for more info on all the options.

  3. Create a lifecycle rule for the bucket

    1. In the bucket list view, click on the Lifecycle column value

    2. Click on 'Add rule'

    3. Select 'Age' condition, and set it to 30 days

    4. As an action, select 'Delete'

    5. Save the rule

    Note: Lifecycle rules help you reduce Cloud Storage costs by deleting old objects. We suggest this 30-day policy, but adjust it if you wish to keep items longer or plan to store other data in the same bucket.

  4. Create the following folders:

    1. staging
    2. templates
    3. sa360_reports
    4. sa360_upload
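If you prefer to script this section instead of using the console, the sketch below does the same with the google-cloud-storage Python client. The project ID, bucket name, and location are examples; replace them with your own values.

```python
# Illustrative alternative to the console steps above; names are examples.
# Requires the google-cloud-storage package.
from google.cloud import storage

client = storage.Client(project='mozart-123456')

# Create the bucket in the same location as the Composer environment.
bucket = client.create_bucket('mozart-data', location='europe-west1')

# Add the 30-day delete lifecycle rule and save it on the bucket.
bucket.add_lifecycle_delete_rule(age=30)
bucket.patch()

# GCS 'folders' are just name prefixes; zero-byte placeholder objects
# make them visible in the console.
for folder in ('staging/', 'templates/', 'sa360_reports/', 'sa360_upload/'):
    bucket.blob(folder).upload_from_string('')
```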
