
aggregation-service's Issues

Invalid value for member: issue when trying to deploy Aggregation Service to GCP

Hello!

I am following the guide outlined here: https://github.com/privacysandbox/aggregation-service/blob/main/docs/gcp-aggregation-service.md#adtech-setup-terraform

And I am now at the stage where I am trying to deploy the individual environments:

GOOGLE_IMPERSONATE_SERVICE_ACCOUNT="aggregation-service-deploy-sa@ag-edgekit-prod.iam.gserviceaccount.com" terraform plan

However I am faced with this error:

╷
│ Error: invalid value for member (IAM members must have one of the values outlined here: https://cloud.google.com/billing/docs/reference/rest/v1/Policy#Binding)
│
│   with module.job_service.module.autoscaling.google_cloud_run_service_iam_member.worker_scale_in_sched_iam,
│   on ../../coordinator-services-and-shared-libraries/operator/terraform/gcp/modules/autoscaling/workerscalein.tf line 104, in resource "google_cloud_run_service_iam_member" "worker_scale_in_sched_iam":
│  104:   member   = "serviceAccount:${var.worker_service_account}"
│
╵
╷
│ Error: invalid value for member (IAM members must have one of the values outlined here: https://cloud.google.com/billing/docs/reference/rest/v1/Policy#Binding)
│
│   with module.job_service.module.worker.google_spanner_database_iam_member.worker_jobmetadatadb_iam,
│   on ../../coordinator-services-and-shared-libraries/operator/terraform/gcp/modules/worker/main.tf line 98, in resource "google_spanner_database_iam_member" "worker_jobmetadatadb_iam":
│   98:   member   = "serviceAccount:${local.worker_service_account_email}"
│
╵
╷
│ Error: invalid value for member (IAM members must have one of the values outlined here: https://cloud.google.com/billing/docs/reference/rest/v1/Policy#Binding)
│
│   with module.job_service.module.worker.google_pubsub_subscription_iam_member.worker_jobqueue_iam,
│   on ../../coordinator-services-and-shared-libraries/operator/terraform/gcp/modules/worker/main.tf line 104, in resource "google_pubsub_subscription_iam_member" "worker_jobqueue_iam":
│  104:   member       = "serviceAccount:${local.worker_service_account_email}"
│
╵

I am new to Terraform and have not been able to find a way to log the values of serviceAccount:${var.worker_service_account} and serviceAccount:${local.worker_service_account_email}.

Any help here would be greatly appreciated!

EDIT: The below seems to show that TF state does correctly store the two service accounts created in the adtech_setup step.

terraform state show 'module.adtech_setup.google_service_account.deploy_service_account[0]'

# module.adtech_setup.google_service_account.deploy_service_account[0]:
resource "google_service_account" "deploy_service_account" {
    account_id   = "aggregation-service-deploy-sa"
    disabled     = false
    display_name = "Deploy Service Account"
    email        = "aggregation-service-deploy-sa@ag-edgekit-prod.iam.gserviceaccount.com"
    id           = "projects/ag-edgekit-prod/serviceAccounts/aggregation-service-deploy-sa@ag-edgekit-prod.iam.gserviceaccount.com"
    member       = "serviceAccount:aggregation-service-deploy-sa@ag-edgekit-prod.iam.gserviceaccount.com"
    name         = "projects/ag-edgekit-prod/serviceAccounts/aggregation-service-deploy-sa@ag-edgekit-prod.iam.gserviceaccount.com"
    project      = "ag-edgekit-prod"
    unique_id    = "106307936135287037408"
}
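
Note that terraform state show can only surface resources already recorded in state; the member values above are interpolated inside nested modules before anything is applied. A quick, unofficial way to check what is being passed down (a sketch; the variable name below is a placeholder for whatever your environment's module block actually passes in) is terraform console, which evaluates expressions against your tfvars without needing a successful plan. An "invalid value for member" error on a serviceAccount:${...} interpolation very often means the interpolated value is empty:

# Sketch: run in the same directory as the failing `terraform plan`.
# `user_provided_worker_sa_email` is a placeholder variable name.
$ terraform console
> var.user_provided_worker_sa_email
""
> "serviceAccount:${var.user_provided_worker_sa_email}"
"serviceAccount:"    # empty email -> invalid IAM member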

Support for aggregation over a set of keys

Hello,

Currently, the aggregation service sums the values over the set of keys declared in the output domain files. This explicit declaration of keys means that the encoding must be finalized at report creation time (e.g. on the source and trigger side for ARA, or in Shared Storage for the Private Aggregation API). This is quite inflexible in its use.

To bring in some flexibility, I propose adding a system to the aggregation service where a predeclared set of keys would be summed by the aggregation service. These sets of keys would constitute a partition of the key space so that the service does not violate the DP limit. A simple check by the aggregation service could reject the query if a key appears in two sets.

Here is what the output domain file would look like. I am not sure "super bucket" is a great name, but it is the only one I could think of right now.

Super bucket   Bucket
0x123          0x456
0x123          0x789
0x124          0xaef
0x125          0x12e

The aggregation service would provide the output only on the "super buckets".

The operational benefits of this added flexibility would be huge. Currently, one has to decide on an encoding before knowing what one can measure. For ARA or PAA for Fledge, this means having a very good idea beforehand of the size and the performance of the campaign. When the campaign is running, adjustments have to be made if the volume estimate was not good (or if the settings of the campaign are changed). Encoding changes can be difficult to track, especially in ARA where sources and triggers both contribute to the keys, but at different points in time. This proposal lets adtechs keep a fixed encoding and adjust the encoding actually used after the fact (using the volume of reports as a proxy).

Update Docs

The sample provided here uses an out-of-date shared_info, which also doesn't contain a version.

It would be better to use the one from the sampledata dir; here is the plaintext:

"{\"api\":\"attribution-reporting\",\"version\":\"0.1\",\"scheduled_report_time\":1698872400.000000000,\"reporting_origin\":\"http://adtech.localhost:3000\",\"source_registration_time\":1698796800.000000000,\"attribution_destination\":\"dest.com\",\"debug_mode\":\"enabled\",\"report_id\":\"b360383a-108d-4ae3-96bd-aecde1c3c30b\"}"

This has an allowed version, an actual 'api' key, and attribution_destination moved inside shared_info.

Mismatch between API response and specification for `debug_privacy_epsilon` field

The response I get from the getJob API doesn't return debug_privacy_epsilon as a double but as a string, e.g.:

{
    ...
    "job_parameters": {
        "debug_privacy_epsilon": "64.0",
        ...
    }
    ...
}

The API specifications in https://github.com/privacysandbox/aggregation-service/blob/main/docs/api.md state that we should expect a double value. It would be helpful if either the specifications or the API response were changed to match the other.
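
Until this is reconciled, a tolerant client-side parse is a practical workaround (a sketch only, not an official recommendation; the variable names are illustrative):

// Accept debug_privacy_epsilon whether the API returns it as a double or as a
// numeric string like "64.0" (variable names here are illustrative).
const job = JSON.parse(getJobResponseBody);
const epsilon = Number(job.job_parameters.debug_privacy_epsilon);
if (Number.isNaN(epsilon)) {
  throw new Error("debug_privacy_epsilon is neither a number nor a numeric string");
}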

Build Feature: GCP Build to upload zips to GCS

Hello,

I'm trying to build and deploy images based on the steps here:
https://github.com/privacysandbox/aggregation-service/blob/2b3d5c450d0be4e2ce0f4cb49444f3f049508917/build-scripts/gcp/cloudbuild.yaml

This uploads the compiled JAR files to the bucket; however, I cannot use these directly in Cloud Functions and have to download them, zip them, and re-upload them (this is done automatically for users by Terraform). Ideally I'd like to skip this step and was hoping to be able to directly upload those JAR files zipped.

Debugging support in the aggregation service: Feedback Requested

Hi,

The Aggregation service team is looking for your feedback to improve debugging support in the service.

Adtechs can already get metrics for their jobs (status, errors, execution time, etc.) from the Cloud metadata (DynamoDB on AWS and Spanner on GCP).

We are exploring other metrics, traces and logs that can provide a better understanding of the job processing within the Trusted Execution Environment without impacting privacy. We are considering providing CPU and memory metrics and total execution time traces for the adtech deployment, and would benefit from your feedback on other metrics that adtechs may find useful.

We are also considering adding useful logs which can give information about the job processing for debugging purposes, such as ‘Job at data reading stage’, etc. This is subject to review and approval, considering user privacy.

Your inputs will be reviewed by the Privacy Sandbox team. We welcome any feedback on debugging Aggregation Service jobs.

Thank you!

Aggregation service deployment using user provided vpc fails

When specifying "enable_user_provided_vpc = true", creation of the environment following the instructions at https://github.com/privacysandbox/aggregation-service/tree/main#set-up-your-deployment-environment
fails with error:
Out of index vpc[0], 182: dynamodb_vpc_endpoint_id = module.vpc[0].dynamodb_vpc_endpoint_id

At file: terraform/aws/applications/operator-service/main.tf
Lines 182 & 183 refers to module.vpc[0]
While module.vpc is not set when "enable_user_provided_vpc = true"
module "vpc" {
count = var.enable_user_provided_vpc ? 0 : 1

Unable to copy AMI image to my AWS account

The documentation states that:

Note: The prebuilt Amazon Machine Image (AMI) for the aggregation service is only available in the us-east-1 region. If you like to deploy the aggregation service in a different region you need to copy the released AMI to your account or build it using our provided scripts.

Ref. https://github.com/privacysandbox/aggregation-service#download-terraform-scripts-and-prebuilt-dependencies

When I try to copy the AMI to my account I'm getting the following error:

Failed to copy ami-036942f537f7a7c2b
You do not have permission to access the storage of this ami

Can you give me some guidance or tell me if it's a configuration error?

Context:

[screenshot]

Clarifications on aggregation service batches + enhanced debugging possibilities

Hello aggregation service team,

We (Criteo) would like to seek clarification on a couple of points to ensure we have a comprehensive understanding of certain features.
Your insights will greatly assist us in optimizing our utilization of the platform:

  1. Batch Size Limit (30k reports):
    Could you kindly provide more details about the batch size limit of 30,000 reports?
    We are a little unsure how this limit behaves: it is our understanding that the aggregation service is expected to handle loads of up to tens (even hundreds) of thousands of reports, yet when we provide it with batches of 50k+ reports, our aggregations fail.
    Is the 30k limit enforced per Avro file within the batch, or per batch overall?
    If it is per overall batch, do you have any suggestions for aggregating batches of more than 30k reports?
    If we need to split these larger aggregations over several smaller requests, that will greatly increase the noise levels we see in our final results and would work against the idea of the aggregation service, which encourages adtechs to aggregate as many reports as possible to increase privacy.
    Understanding the specifics of this limit would greatly help us tailor our processes more effectively.

  2. Debug Information on Privacy Budget Exhaustion:
    We've been considering ways to enhance our debugging capabilities, especially in situations where the privacy budget is exhausted. Would it be possible to obtain more detailed debug information in such cases, specifically regarding the occurrence of duplicates? We believe that having, for instance, the report_ids of the duplicates wouldn't compromise privacy and would significantly contribute to our troubleshooting efforts.

Could you provide encrypted sample report for testing?

I have the aggregation service set up, but our system to produce encrypted reports is not ready yet. This repo's sampledata directory has a sample report, but it is unencrypted, so it only works with the local testing tool, not with AWS Nitro Enclaves.

Could you provide, either in the repo or in a zip file in this thread, an encrypted sample output.avro and an accompanying domain.avro that we can use to test our AWS aggregation service and make sure everything is running properly?

aggregation-service-artifacts-build CodeBuild build AMI fails on yum lock

Running the step "Building artifacts" from https://github.com/privacysandbox/aggregation-service/blob/main/build-scripts/aws/README.md#building-artifacts
To build the artifacts on region: eu-west-1

The CodeBuild failed with the below error:

amazon-ebs.sample-ami: Loaded plugins: extras_suggestions, langpacks, priorities, update-motd

754 | ==> amazon-ebs.sample-ami: Existing lock /var/run/yum.pid: another copy is running as pid 3465.
755 | ==> amazon-ebs.sample-ami: Another app is currently holding the yum lock; waiting for it to exit...

Decode the final `output.json` from the `LocalTestingTool`

Hi,

I have managed to get the full flow running to aggregate debug reports in the browser and process them locally with the provided tool.

The final file output I have is:

[{"bucket": "d0ZHnRzgTJMAAAAAAAAAAA==", "metric": 195000}]

This looks correct in that there should be a single key, and the metric value is correct.

The issue I have now is decoding this bucket to get my original input data back. I assumed the steps would be:

  • base64 decode
  • CBOR decode

But this causes the following error:

_cbor2.CBORDecodeEOF: premature end of stream (expected to read 23 bytes, got 15 instead)

Would really appreciate any help on how to get the input data back out of this bucket.

Best,
D
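
For what it's worth, the bucket in the local tool's JSON output looks like base64 of the raw 16-byte bucket key (a 128-bit integer) rather than CBOR, which would be consistent with the premature-end-of-stream error. A minimal Node.js sketch under that assumption:

// Assumption: the bucket string is base64 of the 16-byte big-endian 128-bit key.
// This recovers the numeric key; mapping it back to the original dimensions still
// requires whatever scheme was used to construct the key when the report was made.
const raw = Buffer.from("d0ZHnRzgTJMAAAAAAAAAAA==", "base64"); // 16 bytes
let bucket = 0n;
for (const byte of raw) bucket = (bucket << 8n) | BigInt(byte);
console.log(bucket.toString(16)); // hex form of the 128-bit bucket key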

Error using LocalTestingTool_2.0.0.jar with sampledata

I am trying to follow the instructions in Testing locally using Local Testing Tool, but when I run the following command with the sample data:

java -jar LocalTestingTool_2.0.0.jar \
--input_data_avro_file sampledata/output_debug_reports.avro \
--domain_avro_file sampledata/output_domain.avro \
--output_directory .

I get the error below:

2023-10-31 12:21:57:506 -0700 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.WorkerPullWorkService - Aggregation worker started
2023-10-31 12:21:57:545 -0700 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.WorkerPullWorkService - Item pulled
2023-10-31 12:21:57:555 -0700 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.aggregation.concurrent.ConcurrentAggregationProcessor - Reports shards detected by blob storage client: [output_debug_reports.avro]
2023-10-31 12:21:57:566 -0700 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.aggregation.concurrent.ConcurrentAggregationProcessor - Reports shards to be used: [DataLocation{blobStoreDataLocation=BlobStoreDataLocation{bucket=/Users/jonaquino/projects/aggregation-service/sampledata, key=output_debug_reports.avro}}]
2023-10-31 12:21:57:566 -0700 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.aggregation.domain.OutputDomainProcessor - Output domain shards detected by blob storage client: [output_domain.avro]
2023-10-31 12:21:57:567 -0700 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.aggregation.domain.OutputDomainProcessor - Output domain shards to be used: [DataLocation{blobStoreDataLocation=BlobStoreDataLocation{bucket=/Users/jonaquino/projects/aggregation-service/sampledata, key=output_domain.avro}}]
2023-10-31 12:21:57:575 -0700 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.aggregation.concurrent.ConcurrentAggregationProcessor - Job parameters didn't have a report error threshold configured. Taking the default percentage value 10.000000
return_code: "REPORTS_WITH_ERRORS_EXCEEDED_THRESHOLD"
return_message: "Aggregation job failed early because the number of reports excluded from aggregation exceeded threshold."
error_summary {
  error_counts {
    category: "REQUIRED_SHAREDINFO_FIELD_INVALID"
    count: 1
    description: "One or more required SharedInfo fields are empty or invalid."
  }
  error_counts {
    category: "NUM_REPORTS_WITH_ERRORS"
    count: 1
    description: "Total number of reports that had an error. These reports were not considered in aggregation. See additional error messages for details on specific reasons."
  }
}
finished_at {
  seconds: 1698780117
  nanos: 679576000
}

CustomMetric{nameSpace=scp/worker, name=WorkerJobCompletion, value=1.0, unit=Count, labels={Type=Success}}
2023-10-31 12:21:57:732 -0700 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.WorkerPullWorkService - No job pulled.

aggregation-service-artifacts-build CodeBuild built the AMI at the wrong region

Running the step "Building artifacts" from https://github.com/privacysandbox/aggregation-service/blob/main/build-scripts/aws/README.md#building-artifacts
To build the artifacts on region: eu-west-1

The CodeBuild failed with the below error:

836 | --> amazon-ebs.sample-ami: AMIs were created:
837 | us-east-1: ami-069b14bccedc04571
....
[Container] 2023/05/09 15:34:31 Running command bash build-scripts/aws/set_ami_to_public.sh set_ami_to_public_by_prefix aggregation-service-enclave_$(cat VERSION) $AWS_DEFAULT_REGION $AWS_ACCOUNT_ID
841 |  
842 | An error occurred (InvalidAMIID.Malformed) when calling the ModifyImageAttribute operation: Invalid id: "" (expecting "ami-...")
843 |  
844 | An error occurred (MissingParameter) when calling the ModifySnapshotAttribute operation: Value () for parameter snapshotId is invalid. Parameter may not be null or empty.
845

The reason is that it created the AMI in us-east-1 instead of eu-west-1.

Aggregation service, ARA browser retries and duplicate reports

The way the browser and an adtech's servers interact over the network makes it inherently unavoidable that some reports will be received by the adtech but not be considered delivered by the browser (e.g. when a timeout happens), and hence be retried and received several times by the adtech, as is mentioned in your documentation:

The browser is free to utilize techniques like retries to minimize data loss.

Sometimes these duplicate reports reach upwards of hundreds of reports each day, for several days (sometimes several months) in a row, all having the same report_id.
The aggregation service enforces the no-duplicates rule based on a combination of information:

Instead, each aggregatable report will be assigned a shared ID. This ID is generated from the combined data points: API version, reporting origin, destination site, source registration time and scheduled report time. These data points come from the report's shared_info field.
The aggregation service will enforce that all aggregatable reports with the same ID must be included in the same batch. Conversely, if more than one batch is submitted with the same ID, only one batch will be accepted for aggregation and the others will be rejected.

As an adtech company trying to provide timely reporting to clients, it is paramount to use all of the available information (in this case, reports) so that our reporting is as precise as possible.
In this scenario, however, if we try to batch together all of our reports for a chosen client on a chosen day, even after deduplicating that day's reports by report_id (or the overall shared_info field), we may have a batch accepted on day 1 and then all subsequent batches for the next month rejected because they all contain that same shared_info-based ID.
This means we have to check further back in the data for possible duplicate reports. To implement this check efficiently, we would benefit from a more precise description of the retry policy, namely for how long retries can happen.
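
For adtech-side bookkeeping while waiting for clarification, one option is to track which shared IDs have already been batched by deriving a key from the same shared_info fields the documentation lists (a sketch; this is not the service's actual internal derivation, just a stable dedup key built from the same data points):

// Sketch: build a stable dedup key from the shared_info fields the documentation
// says feed into the shared ID (API version, reporting origin, destination site,
// source registration time, scheduled report time). Not the service's internal
// algorithm — only a bookkeeping key for tracking which batches were submitted.
const crypto = require("crypto");

function dedupKey(sharedInfoJson) {
  const s = JSON.parse(sharedInfoJson);
  return crypto
    .createHash("sha256")
    .update([
      s.api,
      s.version,
      s.reporting_origin,
      s.attribution_destination,
      s.source_registration_time,
      s.scheduled_report_time,
    ].join("|"))
    .digest("hex");
}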

I guess the questions this issue raises are as follows:

  • In what scenarios does a browser go for the aforementioned retries?
  • Is there a time limit for those retries (i.e. a date after the original report when the browser no longer retries sending a report)?
  • If there is not, could you please advise on a way for adtech companies to efficiently filter out duplicate reports without having to process all of their available reports for duplicate shared info values?
  • Also, the described problem of "duplicate retried" reports (among others) makes us believe that adtechs would benefit from a change to the way the AS handles duplicates. Indeed, if the AS gracefully dropped the duplicates from the aggregation instead of failing the batch altogether, we wouldn't necessarily need to filter such reports out of a batch. Could this possibility be considered on your side?

Job status is always RECEIVED

Hi team,

Our aggregation service is deployed successfully, but after creating a job, the job status is always RECEIVED. Do you have any clues about that? Our projectId is ecs-1709881683838.
[screenshots]

Thanks a lot~~

Configure CodeBuild Setup - getting error

When running the Terraform code in the step https://github.com/privacysandbox/aggregation-service/blob/main/build-scripts/aws/README.md#configure-codebuild-setup,
I got the following error:

│ Error: error creating S3 bucket ACL for aggregation-service-artifacts: AccessControlListNotSupported: The bucket does not allow ACLs

To resolve this error, I had to add the following resource to build-scripts/aws/terraform/codebuild.tf:

resource "aws_s3_bucket_ownership_controls" "artifacts_output_ownership_controls" {
  bucket = aws_s3_bucket.artifacts_output.id

  rule {
    object_ownership = "BucketOwnerEnforced"
  }
}

Clarification on aggregated report batching and privacy budget exhaustion

Hi aggregation service team, we (Adform) are facing a privacy budget exhaustion issue due to duplicate reports. We are following the batching criteria mentioned at

and

Based on the above rules, we tried to reverse engineer the batch data to check whether we have any duplicate reports across all our batch data, but we couldn't find any.

We also looked at #35 and cross verified our assumption with the code as well.

Is there any other way we can get more debug information about which batches contain these duplicate reports with the same key?

Can you please provide any information on how to proceed with debugging this issue?

Aggregation service setup notes, snags & suggestions.

Hello All!

Having spent the past few days trying to get the AS live, I have been jotting down various questions, suggestions & bugs which I think could be a great addition to the documentation and workflow.

Full architecture diagram

Maybe for those who use Terraform in their projects this is not required, but we do not use Terraform and essentially followed the instructions to get all the resources built. I have since had to traverse the GCP console to try and understand what the scripts created. A high-level overview diagram with the main data flows, table names, etc. would be extremely useful.

Resource naming

Similar to the point above, the Terraform scripts are spread over many files, so it is not clear exactly what will be created. I think it would be great to have a single-file config showing all the names of the resources, as they are very obscure in the context of our overall infra; for example, prod-jobmd is the name of a newly created Cloud Spanner instance, which is a pretty unhelpful name. At the very least everything should be prefixed with aggregation-service, or even better, allow users to transparently set this as a first step.

Resource costs

It would be good to have an understanding of the cost of the full setup at idle, and maybe have some suggestions for development and staging setups which can minimise costs, for example by using more serverless infra.

Cloud Function / Run

I would suggest dropping the use of Cloud Functions and migrating fully to Cloud Run; the docs seem to use these interchangeably, and although they sort of are interchangeable (gen2 functions are powered by Cloud Run), I think this can cause extra confusion. There is also a small typo on the endpoint:

This is the value in the docs

https://<environment>-<region>-frontend-service-<cloud-funtion-id>-uc.a.run.app/v1alpha/createJob

But -uc. was -ew. in my case, so this does not seem to be a value which can be hardcoded in the docs in this manner.

Known errors and solutions

Running the jobs stores a nice error in the DB, which is awesome! But even with this nice error, it would be great to have a document showing common errors and their solutions. For example, my latest error is:

{"errorSummary":{"errorCounts":[{"category":"DECRYPTION_KEY_NOT_FOUND","count":"445","description":"Could not find decryption key on private key endpoint."},{"category":"NUM_REPORTS_WITH_ERRORS","count":"445","description":"Total number of reports that had an error. These reports were not considered in aggregation. See additional error messages for details on specific reasons."}]},"finishedAt":"2024-04-30T13:17:24.233681575Z","returnCode":"REPORTS_WITH_ERRORS_EXCEEDED_THRESHOLD","returnMessage":"Aggregation job failed early because the number of reports excluded from aggregation exceeded threshold."}

Which is very clear - but still does not leave me any paths open to try and rectify the issue apart from troubling people over email or in this repo :)

Some missing configuration

This was addressed in #48 but needs to be added to the repo.

Show data conversion flows

There are quite a few flows in which data must be converted from one format to another, for example some hashed string into a byte array. Whilst it is possible to figure this out from the disparate pieces of information available in the repository, it would be very useful to have a few examples for various platforms, e.g.:

-- Convert hashes to domain avro for processing.
CAST(FROM_HEX(SUBSTR(reports.hashed_key, 3)) AS BYTES) AS bucket
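
A Node.js flavour of the same conversion, for anyone not working in SQL (a sketch; the function name is made up):

// Sketch: strip the "0x" prefix from a hashed key string and produce the raw
// bytes expected in the output-domain avro "bucket" field (equivalent to the
// SQL CAST(FROM_HEX(SUBSTR(..., 3)) AS BYTES) above).
const hexKeyToBucketBytes = (hashedKey) => Buffer.from(hashedKey.slice(2), "hex");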

I hope you do not mind if I keep updating this issue as I hopefully near completion of getting the service up!

All the best!
D

403 errors when deploying aggregation-service

Hi team,

I'm trying to set up our deployment environment, but I encountered this error. Could you please help look into it? Thanks a lot!!!

These are the roles of our service accounts. Do I need to add some additional role permissions?
our projectId: ecs-1709881683838
[screenshot of service account roles]

Error: Error creating function: googleapi: Error 403: Could not create Cloud Run service dev-us-west2-worker-scale-in. Permission 'iam.serviceaccounts.actAs' denied on service account worker-sa-aggregation-service@microsites-sa.iam.gserviceaccount.com (or it may not exist).
│
│   with module.job_service.module.autoscaling.google_cloudfunctions2_function.worker_scale_in_cloudfunction,
│   on ../../coordinator-services-and-shared-libraries/operator/terraform/gcp/modules/autoscaling/workerscalein.tf line 35, in resource "google_cloudfunctions2_function" "worker_scale_in_cloudfunction":
│   35: resource "google_cloudfunctions2_function" "worker_scale_in_cloudfunction" {
│
╵
╷
│ Error: Error creating function: googleapi: Error 403: Could not create Cloud Run service dev-us-west2-frontend-service. Permission 'iam.serviceaccounts.actAs' denied on service account [email protected] (or it may not exist).
│
│   with module.job_service.module.frontend.google_cloudfunctions2_function.frontend_service_cloudfunction,
│   on ../../coordinator-services-and-shared-libraries/operator/terraform/gcp/modules/frontend/main.tf line 43, in resource "google_cloudfunctions2_function" "frontend_service_cloudfunction":
│   43: resource "google_cloudfunctions2_function" "frontend_service_cloudfunction" {
│
╵
╷
│ Error: Error creating instance template: googleapi: Error 409: The resource 'projects/ecs-1709881683838/global/instanceTemplates/dev-collector' already exists, alreadyExists
│
│   with module.job_service.module.worker.google_compute_instance_template.collector,
│   on ../../coordinator-services-and-shared-libraries/operator/terraform/gcp/modules/worker/collector.tf line 49, in resource "google_compute_instance_template" "collector":
│   49: resource "google_compute_instance_template" "collector" {

Missing required properties: jobKey

Hello,

While executing a /createJob request with the following payload:

{
  "job_request_id": "Job-1010",
  "input_data_blob_prefix": "reports/inputs/input.avro",
  "input_data_bucket_name": "test-android-sandbox",
  "output_data_blob_prefix": "reports/output/result_1.avro",
  "output_data_bucket_name": "test-android-sandbox",
  "job_parameters": {
    "output_domain_blob_prefix": "reports/domains/domain.avro",
    "output_domain_bucket_name": "test-android-sandbox",
    "debug_privacy_epsilon": 30
  }
}

the response to this request is 202.

When executing /getJob?job_request_id=Job-1010

{ "job_status": "IN_PROGRESS", "request_received_at": "2023-06-12T15:14:17.891601Z", "request_updated_at": "2023-06-12T15:14:23.222830Z", "job_request_id": "Job-1010", "input_data_blob_prefix": "reports/inputs/input.avro", "input_data_bucket_name": "test-android-sandbox", "output_data_blob_prefix": "reports/output/result_1.avro", "output_data_bucket_name": "test-android-sandbox", "postback_url": "", "result_info": { "return_code": "", "return_message": "", "error_summary": { "error_counts": [], "error_messages": [ "Missing required properties: jobKey" ] }, "finished_at": "1970-01-01T00:00:00Z" }, "job_parameters": { "debug_privacy_epsilon": "30", "output_domain_bucket_name": "test-android-sandbox", "output_domain_blob_prefix": "reports/domains/domain.avro" }, "request_processing_started_at": "2023-06-12T15:14:23.133071Z" }

The error is Missing required properties: jobKey
The job stays in status IN_PROGRESS

When running the same /createJob request without the job_request_id property, the response from /createJob is:

{ "code": 3, "message": "Missing required properties: jobRequestId\r\n in: {\n \"input_data_blob_prefix\": \"reports/inputs/input.avro\",\n \"input_data_bucket_name\": \"test-android-sandbox\",\n \"output_data_blob_prefix\": \"reports/output/result_1.avro\",\n \"output_data_bucket_name\": \"test-android-sandbox\",\n \"job_parameters\": {\n \"output_domain_blob_prefix\": \"reports/domains/domain.avro\",\n \"output_domain_bucket_name\": \"test-android-sandbox\"\n }\n}", "details": [ { "reason": "JSON_ERROR", "domain": "", "metadata": {} } ] }

Confused about the output_domain.avro

Hi aggregation-service team,

I'm really confused about the file "output_domain.avro" used for producing a summary report locally. In your Node.js example (code), how can I generate an "output_domain.avro" for the aggregation report?

Here is your sample doc: https://github.com/privacysandbox/aggregation-service/blob/main/docs/collecting.md#collecting-and-batching-aggregatable-reports

{
    "bucket": "\u0005Y"
}

Will this "output_domain.avro" work for your nodejs example ?

If convenient, could you explain what this domain file is generated according to ? Thanks a lot !!
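
For what it's worth, the "\u0005Y" in the doc snippet is just the two bytes 0x05 0x59, i.e. the big-endian form of the number 1369 used as the bucket in the Node.js example. A minimal sketch for writing such a file with the avsc package (assumptions: the single-field bucket/bytes record schema shown in collecting.md, and that the bucket bytes are the big-endian encoding of the bucket number):

// Sketch: write an output_domain.avro containing the single bucket 1369.
// Assumes the domain schema from the docs (one "bucket" bytes field) and that
// the bucket bytes are the big-endian encoding of the bucket number.
const avro = require("avsc");

const DOMAIN_SCHEMA = avro.Type.forSchema({
  type: "record",
  name: "AggregationBucket",
  fields: [{ name: "bucket", type: "bytes" }],
});

const encoder = avro.createFileEncoder("output_domain.avro", DOMAIN_SCHEMA);
encoder.write({ bucket: Buffer.from([0x05, 0x59]) }); // 0x0559 == 1369, matching "\u0005Y"
encoder.end();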

GCP Build container fails to build due to hanging apt-get install

Tried to kick off a build of the build container using the git hash for v2.4.2 and got the error below

I believe it's due to a missing "-y" on apt-get install here:
https://github.com/privacysandbox/aggregation-service/blame/22c2a42ea98b88e5dd3451446db2b7a152760274/build-scripts/gcp/build-container/Dockerfile#L63

Google Ldap: evgenyy@ if you want to reach out internally

Step 9/12 : RUN     echo "deb [signed-by=/usr/share/keyrings/cloud.google.asc] https://packages.cloud.google.com/apt cloud-sdk main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list &&     curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | tee /usr/share/keyrings/cloud.google.asc &&     apt-get update && apt-get install google-cloud-cli &&     apt-get -y autoclean && apt-get -y autoremove
 ---> Running in e691327d6e48
deb [signed-by=/usr/share/keyrings/cloud.google.asc] https://packages.cloud.google.com/apt cloud-sdk main
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  2659  100  2659    0     0  42686      0 --:--:-- --:--:-- --:--:-- 42887
-----BEGIN PGP PUBLIC KEY BLOCK-----
...
-----END PGP PUBLIC KEY BLOCK-----
Hit:1 https://download.docker.com/linux/debian bookworm InRelease
Hit:2 http://deb.debian.org/debian bookworm InRelease
Hit:3 http://deb.debian.org/debian bookworm-updates InRelease
Get:4 https://packages.cloud.google.com/apt cloud-sdk InRelease [6361 B]
Hit:5 http://deb.debian.org/debian-security bookworm-security InRelease
Get:6 https://packages.cloud.google.com/apt cloud-sdk/main amd64 Packages [629 kB]
Fetched 636 kB in 1s (1239 kB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
  google-cloud-cli-anthoscli
Suggested packages:
  google-cloud-cli-app-engine-java google-cloud-cli-app-engine-python
  google-cloud-cli-pubsub-emulator google-cloud-cli-bigtable-emulator
  google-cloud-cli-datastore-emulator kubectl
The following NEW packages will be installed:
  google-cloud-cli google-cloud-cli-anthoscli
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 106 MB of archives.
After this operation, 609 MB of additional disk space will be used.
Do you want to continue? [Y/n] Abort.
The command '/bin/sh -c echo "deb [signed-by=/usr/share/keyrings/cloud.google.asc] https://packages.cloud.google.com/apt cloud-sdk main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list &&     curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | tee /usr/share/keyrings/cloud.google.asc &&     apt-get update && apt-get install google-cloud-cli &&     apt-get -y autoclean && apt-get -y autoremove' returned a non-zero code: 1
ERROR
ERROR: build step 0 "gcr.io/cloud-builders/docker" failed: step exited with non-zero status: 1

feedback on deploying aggregation service

I got an API Gateway error on the AWS console after deploying the aggregation service using Terraform: The API with ID my-api-id doesn't include a route with path /* having an integration arn:aws:lambda:us-east-1:my-aws-account-id:function:stg-create-job.

I changed the Source ARN of the Lambda's permission from arn:aws:execute-api:us-east-1:my-aws-account-id:my-api-id/*/** to arn:aws:execute-api:us-east-1:my-aws-account-id:my-api-id/*/*/v1alpha/getJob, and it solved the error.
https://github.com/privacysandbox/control-plane-shared-libraries/blob/9efe5591acc18e46263399d9785432a146d9675c/operator/terraform/aws/modules/frontend/api_gateway.tf#L62

Private Aggregation API - no metrics in summarised report

Hello,

I'm currently experimenting with the Private Aggregation API, and I'm struggling to validate that my final output is correct.

From my worklet, I perform the following histogram contribution:

privateAggregation.contributeToHistogram({ bucket: BigInt(1369), value: 128 });

Which is correctly triggering a POST request with the following body:

 {
  aggregation_service_payloads: [
    {
      debug_cleartext_payload: 'omRkYXRhgaJldmFsdWVEAAAAgGZidWNrZXRQAAAAAAAAAAAAAAAAAAAFWWlvcGVyYXRpb25paGlzdG9ncmFt',
      key_id: 'bca09245-2ef0-4fdf-a4fa-226306fc2a09',
      payload: 'RVd7QRTTUmPp0i1zBev+4W8lJK8gLIIod6LUjPkfbxCOHsQLBW/jRn642YZ2HYpYkiMK9+PprU5CUi9W7TwJToQ4UXiUbJUgYwliqBFC+aAcwsKJ3Hg46joHZXV5E0ZheeFTqqvLtiJxlVpzFcWd'
    }
  ],
  debug_key: '777',
  shared_info: '{"api":"shared-storage","debug_mode":"enabled","report_id":"aaa889f1-2adc-4796-9e46-c652a08e18ca","reporting_origin":"http://adtech.localhost:3000","scheduled_report_time":"1698074105","version":"0.1"}'
}

I've set up a small Node.js server handling requests on /.well-known/private-aggregation/debug/report-shared-storage, basically doing this:

  const encoder = avro.createFileEncoder(
    `${REPORT_UPLOAD_PATH}/debug/aggregation_report_${Date.now()}.avro`,
    reportType
  );

  reportContent.aggregation_service_payloads.forEach((payload) => {
    console.log(
      "Decoded data from debug_cleartext_payload:",
      readDataFromCleartextPayload(payload.debug_cleartext_payload)
    );

    encoder.write({
      payload: convertPayloadToBytes(payload.debug_cleartext_payload),
      key_id: payload.key_id,
      shared_info: reportContent.shared_info,
    });
  });

  encoder.end();

As you can see, at this point I'm printing the decoded data to the console, and I see as expected:
Decoded data from debug_cleartext_payload: { value: 128, bucket: 1369 }

However, now I'm trying to generate a summary report with the local test tool by running the following command:

java -jar LocalTestingTool_2.0.0.jar --input_data_avro_file aggregation_report_1698071597075.avro --domain_avro_file output_domain.avro --no_noising --json_output --output_directory ./results

No matter what value I pass to the contributeToHistogram method, I always get 0 in the metric field:

[ {
  "bucket" : "MTM2OQ==", // 1369 base64 encoded
  "metric" : 0
} ]

Am I doing something wrong?
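
One observation (not a confirmed diagnosis): "MTM2OQ==" decodes to the ASCII text "1369" (bytes 0x31 0x33 0x36 0x39), whereas the report's debug_cleartext_payload encodes the bucket as the integer 1369 (big-endian bytes ending in 0x05 0x59). If the bucket bytes in output_domain.avro don't match the bucket bytes in the reports, the domain key never receives any contributions, which with --no_noising shows up as metric 0. A quick check:

// The bucket in the summary output is base64 of the text "1369", not of the
// big-endian bytes of the number 1369 that the payload uses.
console.log(Buffer.from("MTM2OQ==", "base64")); // <Buffer 31 33 36 39>  -> "1369" as text
console.log(Buffer.from([0x05, 0x59]));         // big-endian bytes of the number 1369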

Apart from this issue, I wonder how this would work in a real-life application. Currently this example handles one report at a time, which is sent instantly because of debug_mode, but in a real situation, how are we supposed to process a large number of reports at once? Can we pass a list of files to --input_data_avro_file? Should we batch the reports based on the shared_info data prior to converting them to Avro? If yes, based on which field?

Thank you in advance!

How to copy AMI to another region?

In the AWS instructions, there are two options for using the AMI in a region other than us-east-1:

If you like to deploy the aggregation service in a different region you need to copy the released AMI to your account or build it using our provided scripts.

I have been having a lot of trouble building the AMI using the provided scripts, so I would like to try simply copying the AMI (the first option), but I don't see instructions for this. What is the AMI name and where do I get it from? Do I need to change any parameters to point to the new region? What step should I move on to after copying the AMI?
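
For reference, the generic AWS mechanism for copying an AMI across regions is aws ec2 copy-image (a sketch with placeholder values; note the earlier issue in this thread suggests the released AMI's backing snapshot may not be copyable across accounts, in which case building the AMI may be the only route):

# Sketch: copy an AMI from us-east-1 into another region of your own account.
# The AMI ID, target region and name below are placeholders.
aws ec2 copy-image \
  --source-region us-east-1 \
  --source-image-id ami-0123456789abcdef0 \
  --region eu-west-1 \
  --name aggregation-service-enclave-copy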

Could you add instructions for copying the AMI and subsequent steps?

Error building AMI: VPCIdNotSpecified: No default VPC for this user

I am trying to follow the instructions to build the AMI because I want it in a different region than us-east-1.

But when I run

aws codebuild start-build --project-name aggregation-service-artifacts-build --region us-west-2

I get this error:

Build 'amazon-ebs.sample-ami' errored after 936 milliseconds 511 microseconds: VPCIdNotSpecified: No default VPC for this user
status code: 400, request id: fffa8013-121f-4855-a665-70e36030a4e7x

Questions

  1. Is having a default VPC a prerequisite for this build to work?
  2. Is it possible to get the build to work using another VPC, or is using the default VPC the recommended way?

A Cloud Migration Tool for Aggregation Service: Feedback Requested

Hi all!

The Aggregation service team is currently exploring options for adtechs who may want to migrate from one cloud provider to another. This gives adtechs flexibility in using a cloud provider of their choice to optimize for cost or other business needs. Our proposed migration solution would enable adtechs to re-encrypt their reports from a source cloud provider (let’s call this Cloud A) to a destination cloud provider (let’s call this Cloud B) and enable them to use Cloud B to process reports originally encrypted for Cloud A as part of the migration. After migration is completed, use of Cloud A for processing reports will be disabled and the adtech will only be able to use Cloud B to process their reports.

In the short-term, this solution will support migration of aggregation service jobs from AWS to GCP and vice versa. As we support more cloud options in the future, this solution would be extensible to moving from any supported cloud provider to another.

Depiction of the re-encryption flow:

[diagram of the re-encryption flow]

For any adtechs considering a migration, we encourage completing this migration before third-party cookie deprecation to take advantage of feature benefits such as:

  • Apples to apples comparison using additional budget: We will allow adtechs to process the same report on both Cloud A and Cloud B during migration.
  • Flexible migration windows: We will not enforce a timeline by which adtechs need to complete migration.

After third-party cookie deprecation, we plan to continue to support cloud migration with the re-encryption feature, but may not be able to give the additional benefits outlined above to preserve privacy.

We welcome any feedback on this proposal.

Thank you!

Why must we supply a github_personal_access_token when building the AMI?

In the instructions for building the AMI (Building aggregation service artifacts), part of the instructions is to put a github_personal_access_token in codebuild.auto.tfvars.

Can you provide more information on this token?

  1. What scopes are required? All of them?
  2. Why do we need to supply a GitHub Personal Access Token? Is it to read something from GitHub?
  3. I feel uncomfortable putting this sensitive token in AWS where anyone in my company can access it.

Aggregation Service: AWS worker build issue and workaround

Hi Aggregation Service testers,

We have discovered an issue that broke the AWS worker build, caused by an incompatible Docker engine version upgrade. We are planning to release a new patch next week. Meanwhile, if you encounter issues building AWS worker, you can use the following workaround:

  • Create a new patch at <repo_root>/build_defs/shared_libraries/pin_pkr_docker.patch with the following content:
diff --git a/operator/worker/aws/setup_enclave.sh b/operator/worker/aws/setup_enclave.sh
index e4bd30371..8bf2e0fb1 100644
--- a/operator/worker/aws/setup_enclave.sh
+++ b/operator/worker/aws/setup_enclave.sh
@@ -19,7 +19,7 @@ sudo yum install -y https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/late
 #
 # Builds enclave image inside the /home/ec2-user directory as part of automatic
 # AMI generation.
-sudo yum install docker -y
+sudo yum install docker-24.0.5-1.amzn2023.0.3 -y
 sudo systemctl enable docker
 sudo systemctl start docker
 
  • Add the new patch to the list of patches under the shared_libraries rule in the WORKSPACE file. The shared_libraries rule should now become:
git_repository(
    name = "shared_libraries",
    patch_args = [
        "-p1",
    ],
    remote = "https://github.com/privacysandbox/coordinator-services-and-shared-libraries",
    patches = [
        "//build_defs/shared_libraries:coordinator.patch",
        "//build_defs/shared_libraries:gcs_storage_client.patch",
        "//build_defs/shared_libraries:dependency_update.patch",
        "//build_defs/shared_libraries:key_cache_ttl.patch",
        "//build_defs/shared_libraries:pin_pkr_docker.patch",
    ],
    tag = COORDINATOR_VERSION,
    workspace_file = "@shared_libraries_workspace//file",
)

Thank you!

Aggregation Service with Private Aggregation API

When an aggregatable report is created by sendHistogramReport() (i.e. called inside the reportWin function), it contains shared info without attribution_destination or source_registration_time. This seems logical, as these keys are strictly related to attribution logic. Example:

"shared_info": "{\"api\":\"fledge\",\"debug_mode\":\"enabled\",\"report_id\":\"9ae1a0d0-8cf5-4951-b752-e932bf0f7705\",\"reporting_origin\":\"https://fledge-eu.creativecdn.com\",\"scheduled_report_time\":\"1668771714\",\"version\":\"0.1\"}"

More readable form:

{
 "api": "fledge",
 "debug_mode": "enabled",
 "report_id": "9ae1a0d0-8cf5-4951-b752-e932bf0f7705",
 "reporting_origin": "https://fledge-eu.creativecdn.com",
 "scheduled_report_time": "1668771714",
 "version": "0.1"
}

(note: version is 0.1; values for privacy_budget_key, attribution_destination, and source_registration_time are missing)

At the same time, the Aggregation Service expects both attribution_destination and source_registration_time for shared_info.version == 0.1 (since aggregation service version 0.4):
see SharedInfo.getPrivacyBudgetKey()

Tested on chrome:

  • 108.0.5359.48
  • 110.0.5433.0
  • 107.0.5304.110
    and local aggregation service 0.4.

The following exception was printed:
CustomMetric{nameSpace=scp/worker, name=WorkerJobError, value=1.0, unit=Count, labels={Type=JobHandlingError}}
2022-11-22 09:10:54:120 +0100 [WorkerPullWorkService] ERROR com.google.aggregate.adtech.worker.WorkerPullWorkService - Exception occurred in worker
com.google.aggregate.adtech.worker.JobProcessor$AggregationJobProcessException: java.util.concurrent.ExecutionException: java.util.NoSuchElementException: No value present
at com.google.aggregate.adtech.worker.aggregation.concurrent.ConcurrentAggregationProcessor.process(ConcurrentAggregationProcessor.java:400)
at com.google.aggregate.adtech.worker.WorkerPullWorkService.run(WorkerPullWorkService.java:145)
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$2.run(AbstractExecutionThreadService.java:67)
at com.google.common.util.concurrent.Callables.lambda$threadRenaming$3(Callables.java:103)
at java.base/java.lang.Thread.run(Thread.java:1589)
Caused by: java.util.concurrent.ExecutionException: java.util.NoSuchElementException: No value present
at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:588)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:567)
at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:113)
at com.google.aggregate.adtech.worker.aggregation.concurrent.ConcurrentAggregationProcessor.process(ConcurrentAggregationProcessor.java:295)
... 4 more
Caused by: java.util.NoSuchElementException: No value present
at java.base/java.util.Optional.get(Optional.java:143)
at com.google.aggregate.adtech.worker.model.SharedInfo.getPrivacyBudgetKey(SharedInfo.java:161)
at com.google.aggregate.adtech.worker.aggregation.engine.AggregationEngine.accept(AggregationEngine.java:88)
at com.google.aggregate.adtech.worker.aggregation.engine.AggregationEngine.accept(AggregationEngine.java:49)

Discussion - debugging support for summary reports

We are working on adding the ability to generate debug summary reports from encrypted aggregatable reports with the AWS-based aggregation service. This capability will be time-limited and phased out at a later time.

We would like to hear from you what capabilities you'd like to see in these debug summary reports.

Some ideas we are considering:

  • return the unnoised metric and the noise that would have been applied to the metric with the given epsilon
  • return metrics that have not been listed in the output domain, with an annotation hinting at the omission

Questions:

  • What helps you to understand the inner workings of the system better and helps you to develop experiments using the attribution reporting APIs?
  • How are you currently generating your output domains and what tools would help you to simplify this process?

Ability to query key aggregate

Hello,

One interesting evolution of the aggregation service would be to enable querying aggregates of keys. I think this was mentioned in the aggregate attribution API at a time when the aggregation was supposed to be performed by MPC rather than TEEs.
In other words, I would love to be able to query a bit mask (e.g. for an 8-bit key, 01100*01 would match 01100101 and 01100001).
This would enable greater flexibility for decoding (i.e. choosing which encoded variables to retrieve depending on the number of reports) and remove the need to adapt the encoding depending on the expected traffic to the destination website.
Thanks!
P.S. I can cross-post on https://github.com/WICG/attribution-reporting-api if needed.

Aggregation job failing in AWS with error DECRYPTION_KEY_NOT_FOUND

I am able to trigger the aggregation job with the /createJob endpoint deployed via Terraform in AWS. When running /getJob with the request ID, I get the error below:

"result_info": { "return_code": "REPORTS_WITH_ERRORS_EXCEEDED_THRESHOLD", "return_message": "Aggregation job failed early because the number of reports excluded from aggregation exceeded threshold.", "error_summary": { "error_counts": [ { "category": "DECRYPTION_KEY_NOT_FOUND", "count": 1, "description": "Could not find decryption key on private key endpoint." }, { "category": "NUM_REPORTS_WITH_ERRORS", "count": 1, "description": "Total number of reports that had an error. These reports were not considered in aggregation. See additional error messages for details on specific reasons." } ], "error_messages": [] }, "finished_at": "2024-05-0

I could see @ydennisy also had a similar issue but could not find the solution for it.

Unable to test noise addition using the local aggregation service tool

Hi, I work in the Google Ad Traffic Quality Team. I am using the local aggregation service tool to simulate noise on locally generated aggregatable reports. However, due to the contribution budget limits, I am unable to create multiple aggregatable reports that correctly represent my data. What is the best way for me to test this locally? Can I manually create an aggregatable report with very high values (corresponding to a raw summary report) for testing?

Could someone help me validate if I am collecting the reports correctly (attribution-report NODE JS version)

Hello everyone, I'm currently trying to create a version of attribution-reporting in Node.js. So far so good: I managed to complete the entire journey (trigger interactions with creatives, convert on the final website, and generate event and aggregatable reports).

But I got to the part where I must store the aggregatable reports before sending them to the aggregation service, and I wanted to know if anyone else has done this step of collecting the reports in Node.js.

Below is the code responsible for collecting and storing the reports (I took the documentation code written in Go as a reference).

*Spoiler: Each report record I receive generates an .avro file

const avro = require('avsc');

const REPORTS_AVRO_SCHEMA = {
    "name": "AvroAggregatableReport",
    "type": "record",
    "fields": [
        { "name": "payload", "type": "bytes" },
        { "name": "key_id", "type": "string" },
        { "name": "shared_info", "type": "string" }
    ]
};
const RECORD_SCHEMA = avro.Type.forSchema(REPORTS_AVRO_SCHEMA);

const registerAggregateReport = (req, res) => {
    try {
        // const report = req.body;
        // Example to illustrate what the request body would be
        const report = {
            "aggregation_coordinator_origin": "https://publickeyservice.msmt.aws.privacysandboxservices.com",
            "aggregation_service_payloads": [
                {
                    "key_id": "bbe6351f-5619-4c98-84b2-4a74fa1ae254",
                    "payload": "7K9SQLdROKqITmnrkgIDulfEXDAR76XUP4vc6uzxPwDycQql3AhR3dxeXdEw2gbUaIAldnu33RSN4SAFcFFKgDQkvnhFzPoxJjO2Yfw4osJ1S0Odp0smu0rC5k5GuG4oIu9YQofCPNmSD7KRVJ9Y6Lucz3BXoI3RQhpQkO31RDyxVJdBbJ8JiS2KBtu8naUf5Z+/mNNKp39ObsNbo7kQKI0TwyRJDSJKqv42Yi3ctoAhOT0eaaUtMfho67i9XaEtVnh8wB4Mi+nzlAfVsGIavP6aXWDe44IgKZvTS/zEKjI68+nzWkyfdRNOf7jtb2XnoB7k5iM+Yu9Ayk5ic/aT1eA1iPEzLvW/tNLcohne3UL2DefZoTLb5l9aludA7Qlf0g+kW9nuvUSmHBuTjE/fTY5s9uRExHH+b2Hjm2sL9DyrFZUFqcl/KLS+McgOT8I0ZTpPRmr+njW8+4b01Hsc2MpY3KKAn1jUDUE45pGbhj/Gqlb1ikJO9nNKS/nnWJgR7+3P8JEpHC2fkfEase4+vrNxZujWolYfTUxswJpiEZs1+fCOroEyyEY6Zjvx5qLbk+7wMNqCeCltDPA6c8WtAPtMreIUvKbco6XUUzaGSnvWLz6/WJqCxG4hjPOfcYAWXIwSboqvNyBHrRr4H5V7C0unSkIjd0j/GeB3ywgnKEqiihuvZ5PPw+O5aYqJdaR3QEFZtpLj+3Uv4OGn2+CvU1thV0A0H1XViP846Tfmb0jVejN1+ih+VO5cf/7T2TPz6oGO9sa6qitWtll5vhwxVyG3vniCo3xghGnUcHSP5ogfp6qgDGSgsGFqSvdiuOpQU+MG/HrCDUjvce0GoXJP6674UcurGxR9UKAnVwZyKRIj/q9qzUgxhWEFC3ssADMmxhZBs3X+rrAxKfhXD12MfuUluRTCzpCKZ9/YapnJQYjngGx7GIkfW6tw8eSCC8yO41vWyHGRz4nKlgNeQkwYafGPzXqUXjyEyiupMUlmSsU/zT52wdCQYLJbQg7xhNuLebb8qh9LW07jMho4Vo9DBP9l463uqA8hcZnJ"
                }
            ],
            "shared_info": "{\"api\":\"attribution-reporting\",\"attribution_destination\":\"https://cliente.com\",\"report_id\":\"4d82121f-7d62-4fa4-bda4-a70c9e850089\",\"reporting_origin\":\"https://attribution.ads.uol.com.br\",\"scheduled_report_time\":\"1714764978\",\"source_registration_time\":\"0\",\"version\":\"0.1\"}"
        }
          

        report.aggregation_service_payloads.map(payload => {
            const payloadBytes = Buffer.from(payload.payload, 'base64');
            const record = {
                payload: payloadBytes,
                key_id: payload.key_id,
                shared_info: report.shared_info,
            };

            const outputFilename = `./reports/output_reports_${Date.now()}.avro`;
            const encoder = avro.createFileEncoder(outputFilename, RECORD_SCHEMA);
            encoder.write(record);
            encoder.end()
        });
        res.status(200).send('Report received successfully.');
    } catch (e) {
        console.error('Error processing report:', e);
        res.status(400).send('Failed to process report.');
    }
};

module.exports = {
    registerAggregateReport
}

*English is not my native language so take it easy

As the death of third-party cookies is something that will affect everyone, it would be nice to have references in more commonly used languages such as Node.js, Java, etc. I hope this post can contribute in some way to that.

Consider migration from origin to site enrollment for Aggregation service: Feedback requested

Hi all!

We are currently exploring migration from origin enrollment to site enrollment for the Aggregation Service (current form using origin here) for the following reasons:

  • Consistency with the client API enrollment that uses site
  • Ease of onboarding for adtechs, so they don't have to enroll each origin individually

As a follow up to this proposal, we would like to support multiple origins in a batch of aggregatable reports. Do adtechs have a preference or blocking concern with either specifying a list of origins or the site in the createJob request?
