At Criteo, we use the Aggregation Service to test the end-to-end pipeline of ARA reports. We have been running it for months and have faced several issues when trying to run aggregation jobs. While the setup documentation is very clear, most of our effort on the Aggregation Service has gone not into deploying or maintaining it, but into debugging it. Below we suggest features that we think would greatly improve visibility when debugging aggregation jobs, along with information we think should be part of the Aggregation Service documentation.
1. More details on PRIVACY_BUDGET_EXHAUSTED errors
Root causes for aggregation jobs failing to execute are currently very obscure, and it’s hard to know where the error lies.
This is specifically the case for PRIVACY_BUDGET_EXHAUSTED errors. It would be a lot easier for us to locate and fix errors if an aggregation service failure could give information on either:
- The report(s) causing the error, or at least information on the sharedID(s) related to the issue
- The jobId of each aggregation related to the error: the aggregation that failed, but also any previous aggregation that could have consumed the privacy budget for the faulty sharedIDs
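To illustrate the kind of information we are asking for, here is a minimal Python sketch of the bookkeeping an adtech currently has to keep on its own side to guess which earlier job consumed the budget. `BudgetLedger`, the job IDs, and the key strings are all hypothetical; the keys stand in for whatever identifier the service uses internally for privacy budget accounting, which this sketch does not attempt to reproduce.

```python
from dataclasses import dataclass, field


@dataclass
class BudgetLedger:
    """Local record of which job first used each budget key (hypothetical helper)."""

    used_by: dict = field(default_factory=dict)  # budget key -> job ID

    def conflicts(self, job_id, keys):
        """Return {key: earlier_job_id} for keys already used by a different job."""
        return {
            k: self.used_by[k]
            for k in set(keys)
            if k in self.used_by and self.used_by[k] != job_id
        }

    def record(self, job_id, keys):
        """Mark keys as consumed by job_id, keeping the first consumer."""
        for k in keys:
            self.used_by.setdefault(k, job_id)


ledger = BudgetLedger()
ledger.record("job-1", ["key-a", "key-b"])
# Before submitting job-2, check which of its keys an earlier job already used.
print(ledger.conflicts("job-2", ["key-b", "key-c"]))  # {'key-b': 'job-1'}
```

If the error response carried the equivalent of this `conflicts` result, this kind of client-side ledger would not be needed.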
2. Additional documentation on the AWS internal architecture
To simplify the understanding of the AS structure in AWS, it would be helpful to have a document explaining the various components of the aggregation service (job queue on SQS, job status table in DynamoDB, workers on EC2, access through API Gateway, etc.). Knowing what type of information is exposed via AWS tools, its format, and where to look for it would all be useful.
Additionally, once adtechs have made changes to a running AS deployment, a new deployment from Google’s cloned repositories will probably override those specific settings (although we haven’t tried this ourselves). It would be useful to expose more options in the <filename>.auto.tfvars files so that the setup is more reproducible.
3. Additional information on optimization of the AS within and without the AWS infrastructure
The sizing guidance provides useful guidelines for choosing EC2 instance types depending on batch sizes. However, in our tests we observed that splitting the aggregation load into thousands of small batches (which is necessary to batch the data per client) leads to long end-to-end execution times, at least when done naively, even if the processing time of each individual batch is short. To help adtechs tune this process, it would be useful to have:
- A description of how processing is parallelized within the Aggregation Service (within a single EC2 instance or across several)
- Recommendations on sending parallel batch processing requests (e.g. how many batches of a given size an EC2 instance of a given type can process simultaneously)
- Sizing recommendations for AWS components other than EC2, notably DynamoDB.
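A rough way to see why many small batches add up is a back-of-the-envelope model in which each job pays a fixed overhead on top of its processing time, and jobs run in waves limited by how many can be processed concurrently. The Python sketch below is only that model: the function name, the overhead term, and all numbers are illustrative assumptions, not measurements of the Aggregation Service.

```python
import math


def end_to_end_seconds(num_batches, per_batch_seconds, overhead_seconds, concurrent_jobs):
    """Estimate wall-clock time when batches run in waves of `concurrent_jobs`."""
    waves = math.ceil(num_batches / concurrent_jobs)
    return waves * (per_batch_seconds + overhead_seconds)


# Illustrative numbers only: 5000 batches, 10 s of processing and
# 50 s of assumed fixed per-job overhead each.
for jobs in (1, 10, 100):
    print(jobs, end_to_end_seconds(5000, 10, 50, jobs))
```

Under these assumptions the fixed overhead dominates the total, which is why guidance on the achievable level of concurrency matters more to us than the per-batch processing time.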
from trusted-execution-aggregation-service.