Comments (5)
Thanks for reporting this. It may be a bug on our side. To unblock you, could you temporarily set base_image to an image name without "/" locally? Thanks.
from python-aiplatform.
@abcdefgs0324 Thank you for giving me the idea of renaming the image locally! It turned out that "/" was not the root cause. After I gave a short name without "/" to nvidia/cuda:11.1.1-devel-ubuntu20.04, the error still happened in my environment. I'll get back to you once I can narrow down the cases where the error happens. Thanks
from python-aiplatform.
Hi @abcdefgs0324,
Is there any suggestion for how a user can set up base_image with NVIDIA/PyTorch on board?
Both options below produce the error described in the original post:
- base_image="us-docker.pkg.dev/vertex-ai/prediction/pytorch-gpu.2-0:latest"
- base_image="pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime"
from python-aiplatform.
What is the purpose of this image? Is it meant to be a base for CPR or just an example?
from python-aiplatform.
@abcdefgs0324 Thank you for giving me the idea of renaming the image locally! It turned out that "/" was not the root cause. After I gave a short name without "/" to nvidia/cuda:11.1.1-devel-ubuntu20.04, the error still happened in my environment. I'll get back to you once I can narrow down the cases where the error happens. Thanks
Did you end up figuring out what caused this? I am having the same issue. I wish there was a way to view the internal output from Docker to be able to debug this.
EDIT--
I was able to grab the debug info locally by editing site-packages/google/cloud/aiplatform/docker_utils/local_util.py, line 59, changing _logger.info(line) to print(line):

    for line in out:
        print(line)
        # _logger.info(line)
There is probably a way to instantiate a logger locally to catch these lines? I am not too familiar with logging, but I did notice that the logging does work when this runs in the cloud.
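On the question of catching these lines with a logger instead of editing the installed package: a minimal sketch, assuming local_util.py logs through a standard library logger named after its import path (the logger name below is an assumption, not something confirmed in this thread). The helper enable_info_logging is hypothetical and only uses the standard logging module.

```python
import logging

# Assumption: the module's _logger is a standard logging.Logger whose name
# matches its import path. Adjust the name if the SDK uses a different one.
ASSUMED_LOGGER_NAME = "google.cloud.aiplatform.docker_utils.local_util"


def enable_info_logging(logger_name=ASSUMED_LOGGER_NAME, stream=None):
    """Attach a stream handler so INFO records from `logger_name` are shown.

    `stream` defaults to sys.stderr (the StreamHandler default) when None.
    """
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.INFO)  # let INFO records through this logger

    handler = logging.StreamHandler(stream)
    handler.setLevel(logging.INFO)
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger
```

Calling enable_info_logging() once before building the image locally should then surface the Docker output, provided the assumption about the logger name holds. If it does not, logging.basicConfig(level=logging.INFO) on the root logger is a broader alternative that shows INFO records from any logger that propagates to the root.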
Anyway, my problem turned out to be just an issue with the requirements.txt file.
from python-aiplatform.