oculi's Introduction

Note: This is not an officially supported Google product. It is a reference implementation.

Oculi

Oculi is a Google Cloud-based pipeline for tagging large sets of images or videos with labels based on their content, generating a BigQuery dataset for further analysis. Content tagging is done through Cloud's pre-trained computer vision models (Vision API and Video Intelligence API).

The primary use case is for analyzing creatives (images and videos) in digital advertising. Combined with creative performance data, the output from this pipeline can be used to explore correlations between advertising content and performance (e.g. creatives with a human model tend to perform better).

Creative Sources

This pipeline supports three sources of creatives [1]:

  • A Google Campaign Manager (CM) account. Oculi will attempt to extract all creatives on the account that have an image or video asset in a suitable format [2], then download the asset and save a copy to Cloud Storage. Users of DV360 may be able to use this option (see FAQ).
  • A BigQuery table of URLs. URLs must point to images or videos, and be accessible without login. Oculi will download the asset and save a copy to Cloud Storage. The required table columns are:
    • Creative_ID, a unique integer for each image or video
    • Advertiser_ID, an integer identifying a parent entity
    • Creative_Name, a text field
    • Full_URL, the URL to the image or video file
  • A Google Cloud Storage (GCS) bucket of creative files. Files must be image or video files (JPG and MP4 preferred) at the top level of the bucket.
    • If the filenames follow the convention {numeric_id}_{other_stuff}.jpg, then the numeric_id will be used as the creative_id.
    • Otherwise, a creative_id is generated by calling the Python hash() function on the entire filename.
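
The filename rule for the GCS source can be sketched roughly as follows. This is a minimal illustration, not Oculi's actual code: the helper name creative_id_from_filename and the regular expression are assumptions, and the sketch accepts any file extension rather than only .jpg. Note that on Python 3 the built-in hash() of a string is salted per process unless PYTHONHASHSEED is set, so a fallback ID computed this way may differ between runs.

```python
import re

# Hypothetical pattern: one or more digits at the start of the name,
# followed by an underscore (the {numeric_id}_{other_stuff} convention).
_NUMERIC_PREFIX = re.compile(r"^(\d+)_")


def creative_id_from_filename(filename):
    """Return the numeric prefix as an int, or hash() of the whole filename."""
    match = _NUMERIC_PREFIX.match(filename)
    if match:
        return int(match.group(1))
    # Fallback described in the README: hash the entire filename.
    # On Python 3 this value is only stable within a single process
    # unless PYTHONHASHSEED is fixed.
    return hash(filename)


print(creative_id_from_filename("12345_summer_banner.jpg"))  # 12345
print(creative_id_from_filename("summer_banner.jpg"))        # an integer that may vary between runs
```

If stable IDs across runs matter, naming files with a numeric prefix avoids the hash() fallback entirely.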

You can deploy Oculi in a couple of ways:

  • Colab - Files located under colab subdirectory
  • Dataflow - Files located under Dataflow subdirectory

Footnotes

  1. The term "creative" is used to indicate just the image or video content in an ad, excluding other components like targeting preferences.

  2. This excludes dynamic creatives. The goal of this pipeline is to break images and videos down into their content components; dynamic creatives are already broken into content components. Rather, analysis of the data from this pipeline can be used to inform a dynamic creative strategy.

oculi's People

Contributors

kalpanasuresh, sisrikshab, xerebus

oculi's Issues

Tables not found on Feature extraction

Some cells in the third notebook (flattening, words, and performance) are failing and throwing not found errors.

E.g.
NotFound: 404 POST https://bigquery.googleapis.com/bigquery/v2/projects/[project_name]/datasets/test_ds_out/tables: Not found: Table [project_name]:test_ds_out.creative_sizes

It seems to be looking in the dataset created by the previous notebooks, so I'm unsure whether those notebooks are failing to create the tables or my original datasets are incomplete. The first two notebooks threw no errors, though.

Upgrade to Python3

Hi!

I see that Dataflow has deprecated Python2.7. Will Oculi still work on there? If not, is there a plan to upgrade this to Python3?

Thanks! Pelayo

Issue for Images with Faces - Face recognition not working

Hello,

I came across this code and have been trying to implement it for creative analysis on a set of images.

When I run the pipeline, it works fine for a dataset of images that don't have faces in them. However, when the dataset has images with faces, the pipeline fails in Dataflow and the BQ output has all tables except face_annotations.

I'm not sure whether this happens because Dataflow has deprecated Python 2 or because of an issue within the code itself. Has anyone else faced this issue?

Can somebody who has implemented this code help please? @xerebus @cdibona @dberlin @kalpanasuresh @sisrikshab

Thanks
-Abhi

Dataflow Screenshot -
dataflow ss

Dataflow Errorlog Screenshot -
error log

BQ Screenshot when pipeline ran successfully on set of images without faces
dataset without faces

BQ Screenshot when pipeline failed on set of images with faces
dataset with faces

Incompatibility errors found when running setup script

When I ran the setup_environment.sh script most of it ran fine, but I did see a handful of errors:

ERROR: google-cloud-spanner 1.12.0 has requirement grpc-google-iam-v1<0.13dev,>=0.12.3, but you'll have grpc-google-iam-v1 0.11.4 which is incompatible.
ERROR: google-cloud-language 1.3.0 has requirement google-api-core[grpc]<2.0.0dev,>=1.14.0, but you'll have google-api-core 1.11.1 which is incompatible.
ERROR: google-cloud-translate 2.0.0 has requirement google-api-core[grpc]<2.0.0dev,>=1.14.0, but you'll have google-api-core 1.11.1 which is incompatible.
ERROR: google-cloud-translate 2.0.0 has requirement google-cloud-core<2.0dev,>=1.0.3, but you'll have google-cloud-core 1.0.1 which is incompatible.
ERROR: google-cloud-bigquery 1.6.2 has requirement google-cloud-core<0.30dev,>=0.28.0, but you'll have google-cloud-core 1.0.1 which is incompatible.
ERROR: tensorboard 2.0.1 has requirement grpcio>=1.24.3, but you'll have grpcio 1.21.1 which is incompatible.
ERROR: mssql-cli 0.17.0 has requirement prompt-toolkit<2.1.0,>=2.0.0, but you'll have prompt-toolkit 1.0.18 which is incompatible.
ERROR: mssql-cli 0.17.0 has requirement prompt-toolkit<2.1.0,>=2.0.0, but you'll have prompt-toolkit 1.0.18 which is incompatible.
ERROR: google-cloud-datastore 1.10.0 has requirement google-api-core[grpc]<2.0.0dev,>=1.14.0, but you'll have google-api-core 1.11.1 which is incompatible.
ERROR: google-cloud-datastore 1.10.0 has requirement google-cloud-core<2.0dev,>=1.0.3, but you'll have google-cloud-core 1.0.1 which is incompatible.
ERROR: google-cloud-logging 1.14.0 has requirement google-api-core[grpc]<2.0.0dev,>=1.14.0, but you'll have google-api-core 1.11.1 which is incompatible.
ERROR: google-cloud-logging 1.14.0 has requirement google-cloud-core<2.0dev,>=1.0.3, but you'll have google-cloud-core 1.0.1 which is incompatible.
ERROR: google-cloud-bigtable 0.31.1 has requirement google-cloud-core<0.29dev,>=0.28.0, but you'll have google-cloud-core 1.0.1 which is incompatible.

Some installs completed successfully, and then there was one final error:

ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/pbr-5.4.3.dist-info'
Consider using the --user option or check the permissions.

I do have a couple of new directories (jobs and pipeline), but I'm not sure whether these errors are showstoppers or just inconvenient. Also if there's any guidance on how to modify code or config docs to point to the correct libs, that would also be great.

Invalid Grant : Account not found issue.

Hi,
I tried to clone this repo and run the code with source=cm only in the sample.yaml file. I have given full admin rights to my email id for a Campaign Manager account but while running the script I am getting this error -

google.auth.exceptions.RefreshError: ('invalid_grant: Invalid grant: account not found', u'{"error":"invalid_grant","error_description":"Invalid grant: account not found"}'

It might be a dumb question, but can you help me understand what I might be doing wrong? I also searched on Google and found that it might be due to an expired token, but I am not sure where to fix that. Can you help? @xerebus @cdibona @dberlin
