Comments (9)
Assuming based on the screenshot that this is a GCP environment, correct? Does this happen on every job or intermittent jobs?
Jobs can get stuck with the "RECEIVED" status when the instances within the managed instance group (MIG) are not running or have crashed. If service account onboarding has not been completed, the MIG could be in an unhealthy state. Can you confirm that the service account has been onboarded?
To check the MIG's health in the cloud console:
- Navigate to Compute Engine > Instance Groups
- Select your instance group then select the errors tab
To check the MIG's health using gcloud CLI:
gcloud compute instance-groups managed list-errors <MIG_NAME> --region=<REGION>
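If you want to run this check from a script, here is a minimal Python sketch that wraps the gcloud call above. The function names and the example MIG name and region are hypothetical placeholders, and it assumes gcloud is installed and authenticated on the machine.

```python
import shlex
import subprocess


def build_list_errors_command(mig_name: str, region: str) -> list[str]:
    """Build the argv for the `list-errors` call quoted above."""
    if not mig_name or not region:
        raise ValueError("mig_name and region are required")
    return [
        "gcloud", "compute", "instance-groups", "managed", "list-errors",
        mig_name, f"--region={region}",
    ]


def list_mig_errors(mig_name: str, region: str) -> str:
    """Run the command and return its stdout (requires gcloud on PATH)."""
    cmd = build_list_errors_command(mig_name, region)
    done = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return done.stdout


if __name__ == "__main__":
    # Hypothetical MIG name and region; substitute your own values.
    print(shlex.join(build_list_errors_command("my-worker-mig", "us-central1")))
```

Building the argument list separately from running it makes it easy to print and verify the command before it touches your project.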
from trusted-execution-aggregation-service.
Hi @chasinandrew,
Yes, it is a GCP environment, and the issue happens on every job.
I checked the MIG's health: there are no errors, only one warning.
BTW, we have 4 worker VM instances, and I found the 403 error in three of them. Some of them seem unstable and keep restarting. Could this be the key point?
Thank you for the quick reply, it really helps!
from trusted-execution-aggregation-service.
No problem! This could be happening because of the unstable VMs. To help us replicate this could you provide the following info:
- Which aggregation service version do you have deployed?
- Can you please provide the terraform deployment parameters if they're available?
- Can you send the JSON in plaintext or file form with the request and response?
- If available, can you send the avro report and output_domain.avro?
from trusted-execution-aggregation-service.
Hi @chasinandrew,
1. We deployed from the latest repo (https://github.com/privacysandbox/aggregation-service). So is the version v2.4.2?
2. The deployment parameters: dev.auto.tfvars.txt
3. Request & response: request&response.txt
4. Since comments don't support attaching Avro files, I uploaded them to my GitHub repo.
avro report
output_domain.avro
Our Google Cloud link is https://console.cloud.google.com/home/dashboard?project=ecs-1709881683838, but I don't know whether you have permission to access it.
Thank you for helping to dig into the issue.
from trusted-execution-aggregation-service.
Thanks @yanghuang1028! This 403 error can happen when onboarding is incomplete. Can you please fill out this onboarding form to register your domain and service account?
from trusted-execution-aggregation-service.
@chasinandrew We filled out the form a few weeks ago, and your team sent us an email.
![image](https://private-user-images.githubusercontent.com/48471702/329747776-4a75a8af-d550-4f2e-a345-d114830d6afa.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTgxMDgzNjMsIm5iZiI6MTcxODEwODA2MywicGF0aCI6Ii80ODQ3MTcwMi8zMjk3NDc3NzYtNGE3NWE4YWYtZDU1MC00ZjJlLWEzNDUtZDExNDgzMGQ2YWZhLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MTElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjExVDEyMTQyM1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTk4N2RiN2Y1M2VkNjM1OGU3MmMyNzNhYzY5NWU3OTk1M2I5ZGNlN2FkOTYwZDZlYTg5YWVjYThmMDU1MGE2ZjgmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.-6MXctIpmGcz1poirnRuzGE_1mcMhWy1dnvBfOheM58)
Oh, I see. We used a different service account for this deployment. Could you help us update the worker service account?
Our new worker service account is sa-worker-aggregation-service@ecs-1709881683838.iam.gserviceaccount.com
BTW, we only registered the domain of the production environment. If we do not register the domain of the staging environment, can the aggregation service still correctly handle reports from staging? (We can currently change Chrome's settings manually to receive reports from the staging env.)
Our staging reporting site is https://adservice-1.stratus.qa.ebay.com/
Thanks again!
from trusted-execution-aggregation-service.
Hi @yanghuang1028, I recommend communicating this information through our support email alias. I'll be hiding your previous comment to keep that information out of the public thread.
@chasinandrew please move support conversations around onboarding to email.
Re your question on prod vs staging: your service account is connected to the site that is onboarded. If the same service account (in the same GCP project) is used to process your reports, you'll be able to process them in both staging and prod. If a different account is used, a separate onboarding request will be required.
from trusted-execution-aggregation-service.
Thanks for protecting our private information!
The separate onboarding request is completed, and the job can be processed now. However, the job threw a TRANSACTION_MANAGER_RETRIES_EXCEEDED error when processing.
{
  "job_status": "FINISHED",
  "request_received_at": "2024-05-16T01:19:59.234435Z",
  "request_updated_at": "2024-05-16T01:29:35.184066241Z",
  "job_request_id": "test05",
  "input_data_blob_prefix": "output/output_regular_reports_2024-04-24T02:38:04-07:00.avro",
  "input_data_bucket_name": "tracking_tf_state_bucket",
  "output_data_blob_prefix": "output/summary_report.avro",
  "output_data_bucket_name": "tracking_tf_state_bucket",
  "postback_url": "",
  "result_info": {
    "return_code": "PRIVACY_BUDGET_ERROR",
    "return_message": "com.google.aggregate.adtech.worker.exceptions.AggregationJobProcessException: Exception while consuming privacy budget. Exception message: TRANSACTION_MANAGER_RETRIES_EXCEEDED \n com.google.aggregate.adtech.worker.aggregation.concurrent.ConcurrentAggregationProcessor.consumePrivacyBudgetUnits(ConcurrentAggregationProcessor.java:466) \n com.google.aggregate.adtech.worker.aggregation.concurrent.ConcurrentAggregationProcessor.process(ConcurrentAggregationProcessor.java:329) \n com.google.aggregate.adtech.worker.WorkerPullWorkService.run(WorkerPullWorkService.java:142)\nThe root cause is: com.google.scp.operator.cpio.distributedprivacybudgetclient.TransactionEngine$TransactionEngineException: TRANSACTION_MANAGER_RETRIES_EXCEEDED \n com.google.scp.operator.cpio.distributedprivacybudgetclient.TransactionEngineImpl.proceedToNextPhase(TransactionEngineImpl.java:100) \n com.google.scp.operator.cpio.distributedprivacybudgetclient.TransactionEngineImpl.executeDistributedPhase(TransactionEngineImpl.java:196) \n com.google.scp.operator.cpio.distributedprivacybudgetclient.TransactionEngineImpl.executeCurrentPhase(TransactionEngineImpl.java:138)",
    "error_summary": {
      "error_counts": [],
      "error_messages": []
    },
    "finished_at": "2024-05-16T01:29:35.113618072Z"
  },
  "job_parameters": {
    "output_domain_blob_prefix": "domain/output_local_domain.avro",
    "output_domain_bucket_name": "tracking_tf_state_bucket",
    "attribution_report_to": "https://adservice-1.stratus.qa.ebay.com"
  },
  "request_processing_started_at": "2024-05-16T01:20:00.743721759Z"
}
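Note that job_status is FINISHED even though the job failed; the outcome is carried in result_info.return_code. As a rough sketch of how such a response could be triaged, the Python below pulls out the return code and the first line after "The root cause is:" in return_message. The helper name summarize_job is hypothetical, and the parsing assumes the message layout shown in the response above, which may differ in other versions.

```python
import json

ROOT_CAUSE_MARKER = "The root cause is:"


def summarize_job(response_text: str) -> dict:
    """Extract status, return code, and root-cause line from a job response."""
    job = json.loads(response_text)
    result = job.get("result_info") or {}
    message = result.get("return_message") or ""

    root_cause = None
    if ROOT_CAUSE_MARKER in message:
        # Keep only the first line after the marker; the rest is stack frames.
        tail = message.split(ROOT_CAUSE_MARKER, 1)[1].strip()
        lines = tail.splitlines()
        root_cause = lines[0].strip() if lines else None

    return {
        "job_request_id": job.get("job_request_id"),
        "job_status": job.get("job_status"),
        "return_code": result.get("return_code"),
        "root_cause": root_cause,
    }


if __name__ == "__main__":
    import sys

    # Usage: python summarize_job.py < response.json
    print(json.dumps(summarize_job(sys.stdin.read()), indent=2))
```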
The reports and domain.avro files are as follows:
avro report
output_domain.avro
BTW, where can I see the detailed logs for each job in the Google Cloud console? I can't find them anywhere. Thanks a lot!
from trusted-execution-aggregation-service.
The job can be processed now, thanks a lot!
from trusted-execution-aggregation-service.