mlops-with-vertex-ai's People

Contributors

kardiff18 · ksalama · lvaylet


mlops-with-vertex-ai's Issues

02-Experimentation: Create classifier fails w/layer requires matching shapes

ValueError: A Concatenate layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 2), (None, 4), (7,), (None, 3), (None, 1), (None, 1), (6,), (None, 3), (None, 3), (None, 1), (None, 10)]

Input Features
dropoff_grid_xf <dtype: 'int64'>: [0, 0, 0]
euclidean_xf <dtype: 'float32'>: [0.669279932975769, -0.8318284749984741, -0.8318284749984741]
loc_cross_xf <dtype: 'int64'>: [0, 0, 0]
payment_type_xf <dtype: 'int64'>: [2, 0, 0]
pickup_grid_xf <dtype: 'int64'>: [0, 0, 0]
trip_day_of_week_xf <dtype: 'int64'>: [0, 6, 2]
trip_day_xf <dtype: 'int64'>: [8, 30, 1]
trip_hour_xf <dtype: 'int64'>: [1, 13, 6]
trip_miles_xf <dtype: 'float32'>: [2.3255326747894287, -0.22459185123443604, -0.4029441475868225]
trip_month_xf <dtype: 'int64'>: [3, 1, 0]
trip_seconds_xf <dtype: 'float32'>: [0.9550504088401794, -0.2630620300769806, -0.24356801807880402]
target: [0, 0, 0]
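The shapes (7,) and (6,) in the error above are missing the batch dimension, and Keras's Concatenate requires every input to have the same rank. A minimal sketch reproducing the failure and the shape fix (illustrative sizes, not the repo's actual model code):

import tensorflow as tf
from tensorflow import keras

x1 = keras.Input(shape=(2,))        # shape (None, 2): batch dimension present
x2 = keras.Input(batch_shape=(7,))  # shape (7,): batch dimension missing

try:
    keras.layers.Concatenate()([x1, x2])
except ValueError as err:
    print(err)  # A `Concatenate` layer requires inputs with matching shapes...

# Giving every input the batch dimension (rank 2) lets the concatenation work:
x2_ok = keras.Input(shape=(7,))                   # shape (None, 7)
joined = keras.layers.Concatenate()([x1, x2_ok])  # shape (None, 9)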

Access Denied in BQ

Hi,

When I run the BigQuery statement in the first notebook example (01-dataset-management), I get the following error:

Forbidden: 403 POST https://bigquery.googleapis.com/bigquery/v2/projects/<project_id>/jobs?prettyPrint=false: Access Denied: Project <project_id>: User does not have bigquery.jobs.create permission in project <project_id>.

The bq command works from the notebook terminal.

I followed all the required steps from README.md.

I also got the following errors, but I don't think they are related:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tfx 1.2.0 requires google-cloud-aiplatform<0.8,>=0.5.0, but you have google-cloud-aiplatform 1.4.2 which is incompatible.
tfx 1.2.0 requires google-cloud-bigquery<2.21,>=1.28.0, but you have google-cloud-bigquery 2.26.0 which is incompatible.
tfx-bsl 1.2.0 requires google-cloud-bigquery<2.21,>=1.28.0, but you have google-cloud-bigquery 2.26.0 which is incompatible.
tensorflow 2.5.1 requires grpcio~=1.34.0, but you have grpcio 1.41.1 which is incompatible.
tensorflow-transform 1.2.0 requires google-cloud-bigquery<2.21,>=1.28.0, but you have google-cloud-bigquery 2.26.0 which is incompatible.
tensorflow-model-analysis 0.33.0 requires google-cloud-bigquery<2.21,>=1.28.0, but you have google-cloud-bigquery 2.26.0 which is incompatible.
tensorflow-data-validation 1.2.0 requires google-cloud-bigquery<2.21,>=1.28.0, but you have google-cloud-bigquery 2.26.0 which is incompatible.
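Since the bq CLI works from the terminal while the Python client is denied, the two may be running as different principals. A hedged way to check which identity the notebook's client libraries pick up, and to reproduce the failing call (the project id is a placeholder):

import google.auth
from google.cloud import bigquery

# Which identity do the client libraries use in this notebook runtime?
credentials, default_project = google.auth.default()
print(getattr(credentials, "service_account_email", credentials))

# Running a query requires bigquery.jobs.create on the project; granting the
# identity printed above the roles/bigquery.jobUser role should clear the 403.
client = bigquery.Client(project="<project_id>")
print(list(client.query("SELECT 1").result()))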

`Concatenate` layer requires inputs with matching shapes

Problem

03-training-formalization.ipynb

ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concatenation axis. Received: input_shape=[(None, 2), (None, 4), (7,), (None, 3), (None, 1), (None, 1), (6,), (None, 3), (None, 3), (None, 1), (None, 10)]

At:

_train_module_file = 'src/model_training/runner.py'

trainer = tfx.components.Trainer(
    module_file=_train_module_file,
    examples=transform.outputs['transformed_examples'],
    schema=schema_importer.outputs['result'],
    base_model=latest_model_resolver.outputs['latest_model'],
    transform_graph=transform.outputs['transform_graph'],
    hyperparameters=hyperparams_gen.outputs['hyperparameters'],
)

context.run(trainer, enable_cache=False)

Output:

running bdist_wheel
running build
running build_py
creating build
creating build/lib
copying trainer.py -> build/lib
copying runner.py -> build/lib
copying defaults.py -> build/lib
copying task.py -> build/lib
copying model.py -> build/lib
copying exporter.py -> build/lib
copying data.py -> build/lib
installing to /tmp/tmpaet8vft8
running install
running install_lib
copying build/lib/task.py -> /tmp/tmpaet8vft8
copying build/lib/model.py -> /tmp/tmpaet8vft8
copying build/lib/data.py -> /tmp/tmpaet8vft8
copying build/lib/runner.py -> /tmp/tmpaet8vft8
copying build/lib/defaults.py -> /tmp/tmpaet8vft8
copying build/lib/trainer.py -> /tmp/tmpaet8vft8
copying build/lib/exporter.py -> /tmp/tmpaet8vft8
running install_egg_info
running egg_info
creating tfx_user_code_Trainer.egg-info
writing tfx_user_code_Trainer.egg-info/PKG-INFO
writing dependency_links to tfx_user_code_Trainer.egg-info/dependency_links.txt
writing top-level names to tfx_user_code_Trainer.egg-info/top_level.txt
writing manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt'
reading manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt'
writing manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt'
Copying tfx_user_code_Trainer.egg-info to /tmp/tmpaet8vft8/tfx_user_code_Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12-py3.7.egg-info
running install_scripts
creating /tmp/tmpaet8vft8/tfx_user_code_Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12.dist-info/WHEEL
creating '/tmp/tmpv9boslhs/tfx_user_code_Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12-py3-none-any.whl' and adding '/tmp/tmpaet8vft8' to it
adding 'data.py'
adding 'defaults.py'
adding 'exporter.py'
adding 'model.py'
adding 'runner.py'
adding 'task.py'
adding 'trainer.py'
adding 'tfx_user_code_Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12.dist-info/METADATA'
adding 'tfx_user_code_Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12.dist-info/WHEEL'
adding 'tfx_user_code_Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12.dist-info/top_level.txt'
adding 'tfx_user_code_Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12.dist-info/RECORD'
removing /tmp/tmpaet8vft8
/opt/conda/lib/python3.7/site-packages/setuptools/command/install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  setuptools.SetuptoolsDeprecationWarning,
WARNING:absl:Examples artifact does not have payload_format custom property. Falling back to FORMAT_TF_EXAMPLE
WARNING:absl:Examples artifact does not have payload_format custom property. Falling back to FORMAT_TF_EXAMPLE
WARNING:absl:Examples artifact does not have payload_format custom property. Falling back to FORMAT_TF_EXAMPLE
WARNING: Ignoring invalid distribution -umpy (/opt/conda/lib/python3.7/site-packages)
Processing /tmp/tmp31cqmx79/tfx_user_code_Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12-py3-none-any.whl
WARNING: Ignoring invalid distribution -umpy (/opt/conda/lib/python3.7/site-packages)
Installing collected packages: tfx-user-code-Trainer
Successfully installed tfx-user-code-Trainer-0.0+5bd9ac13044cc46c88b21ccdcff2b74d85a1c15be76527e33008964fa7da7d12
WARNING: Ignoring invalid distribution -umpy (/opt/conda/lib/python3.7/site-packages)
WARNING: Ignoring invalid distribution -umpy (/opt/conda/lib/python3.7/site-packages)
WARNING: Ignoring invalid distribution -umpy (/opt/conda/lib/python3.7/site-packages)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_10107/3433963696.py in <module>
     10 )
     11 
---> 12 context.run(trainer, enable_cache=False)

~/.local/lib/python3.7/site-packages/tfx/orchestration/experimental/interactive/notebook_utils.py in run_if_ipython(*args, **kwargs)
     29       # __IPYTHON__ variable is set by IPython, see
     30       # https://ipython.org/ipython-doc/rel-0.10.2/html/interactive/reference.html#embedding-ipython.
---> 31       return fn(*args, **kwargs)
     32     else:
     33       logging.warning(

~/.local/lib/python3.7/site-packages/tfx/orchestration/experimental/interactive/interactive_context.py in run(self, component, enable_cache, beam_pipeline_args)
    162         telemetry_utils.LABEL_TFX_RUNNER: runner_label,
    163     }):
--> 164       execution_id = launcher.launch().execution_id
    165 
    166     return execution_result.ExecutionResult(

~/.local/lib/python3.7/site-packages/tfx/orchestration/launcher/base_component_launcher.py in launch(self)
    201                          copy.deepcopy(execution_decision.input_dict),
    202                          execution_decision.output_dict,
--> 203                          copy.deepcopy(execution_decision.exec_properties))
    204 
    205     absl.logging.info('Running publisher for %s',

~/.local/lib/python3.7/site-packages/tfx/orchestration/launcher/in_process_component_launcher.py in _run_executor(self, execution_id, input_dict, output_dict, exec_properties)
     72     # output_dict can still be changed, specifically properties.
     73     executor.Do(
---> 74         copy.deepcopy(input_dict), output_dict, copy.deepcopy(exec_properties))

~/.local/lib/python3.7/site-packages/tfx/components/trainer/executor.py in Do(self, input_dict, output_dict, exec_properties)
    176     # Train the model
    177     absl.logging.info('Training model.')
--> 178     run_fn(fn_args)
    179 
    180     # Note: If trained with multi-node distribution workers, it is the user

/tmp/tmp7ilh_lit/runner.py in run_fn(fn_args)
     51         hyperparams=hyperparams,
     52         log_dir=log_dir,
---> 53         base_model_dir=fn_args.base_model,
     54     )
     55 

~/home/repositories/git/oonisim/python-programs/courses/mlops-with-vertex-ai/src/model_training/trainer.py in train(train_data_dir, eval_data_dir, tft_output_dir, hyperparams, log_dir, base_model_dir)
     57     tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir)
     58 
---> 59     classifier = model.create_binary_classifier(tft_output, hyperparams)
     60     if base_model_dir:
     61         try:

~/home/repositories/git/oonisim/python-programs/courses/mlops-with-vertex-ai/src/model_training/model.py in create_binary_classifier(tft_output, hyperparams)
     83         )
     84 
---> 85     return _create_binary_classifier(feature_vocab_sizes, hyperparams)

~/home/repositories/git/oonisim/python-programs/courses/mlops-with-vertex-ai/src/model_training/model.py in _create_binary_classifier(feature_vocab_sizes, hyperparams)
     62             pass
     63 
---> 64     joined = keras.layers.Concatenate(name="combines_inputs")(layers)
     65     feedforward_output = keras.Sequential(
     66         [

/opt/conda/lib/python3.7/site-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

/opt/conda/lib/python3.7/site-packages/keras/layers/merge.py in build(self, input_shape)
    517       ranks = set(len(shape) for shape in shape_set)
    518       if len(ranks) != 1:
--> 519         raise ValueError(err_msg)
    520       # Get the only rank for the set.
    521       (rank,) = ranks

ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concatenation axis. Received: input_shape=[(None, 2), (None, 4), (7,), (None, 3), (None, 1), (None, 1), (6,), (None, 3), (None, 3), (None, 1), (None, 10)]

Environment

tfx                                   1.8.0

Solution

import tfx.v1

Add Hyperparameter tuning

Show how to submit a hyperparameter tuning job to Vertex AI Training in the experimentation notebook.
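A minimal sketch of what this could look like with the google-cloud-aiplatform SDK (the container image, metric name, and parameter names are placeholders; the training code is assumed to report the metric via the hypertune library):

from google.cloud import aiplatform
from google.cloud.aiplatform import hyperparameter_tuning as hpt

aiplatform.init(project="<project_id>", location="us-central1",
                staging_bucket="gs://<bucket>")

custom_job = aiplatform.CustomJob(
    display_name="taxi-tips-training",
    worker_pool_specs=[{
        "machine_spec": {"machine_type": "n1-standard-4"},
        "replica_count": 1,
        "container_spec": {"image_uri": "gcr.io/<project_id>/trainer:latest"},
    }],
)

hp_job = aiplatform.HyperparameterTuningJob(
    display_name="taxi-tips-hp-tuning",
    custom_job=custom_job,
    metric_spec={"val_accuracy": "maximize"},
    parameter_spec={
        "learning-rate": hpt.DoubleParameterSpec(min=1e-4, max=1e-1, scale="log"),
        "hidden-units": hpt.DiscreteParameterSpec(values=[64, 128, 256], scale=None),
    },
    max_trial_count=12,
    parallel_trial_count=3,
)
hp_job.run()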

module 'tfx.dsl.components' has no attribute 'common'

03-training-formalization.ipynb

schema_importer = tfx.dsl.components.common.importer.Importer(
    source_uri=RAW_SCHEMA_DIR,
    artifact_type=tfx.types.standard_artifacts.Schema,
    reimport=False
)

context.run(schema_importer)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_8454/2100786518.py in <module>
      1 #import tfx.v1
      2 
----> 3 schema_importer = tfx.dsl.components.common.importer.Importer(
      4 #schema_importer = tfx.v1.dsl.Importer(
      5     source_uri=RAW_SCHEMA_DIR,

AttributeError: module 'tfx.dsl.components' has no attribute 'common'

Environment

tfx                                   1.8.0

Solution

import tfx.v1
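For reference, a hedged sketch of the same importer against the public TFX 1.x API surface (RAW_SCHEMA_DIR and context as defined earlier in the notebook):

from tfx import v1 as tfx  # public TFX 1.x API

schema_importer = tfx.dsl.Importer(
    source_uri=RAW_SCHEMA_DIR,
    artifact_type=tfx.types.standard_artifacts.Schema,
    reimport=False,
)

context.run(schema_importer)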

Model Versioning using Vertex AI Model

How do I register a model and load different versions of it using Vertex AI Models?

For instance, I have trained a scikit-learn model and I want to register it (automatically as v1). After some time, I retrain the model with new features and want to register it as a new version (automatically as v2).

Now I want to load my model with something like aiplatform.Model.load_model("model_name", "model_version") (not a current feature).

How can I do this using Vertex AI Models?
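A hedged sketch of how this can work with Vertex AI Model Registry (requires a google-cloud-aiplatform version with Model Registry support; bucket paths and the display name are placeholders):

from google.cloud import aiplatform

aiplatform.init(project="<project_id>", location="us-central1")

# First upload creates the registry entry and becomes version 1:
v1 = aiplatform.Model.upload(
    display_name="sklearn-model",
    artifact_uri="gs://<bucket>/model/",  # directory containing model.pkl
    serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest"
    ),
)

# Re-uploading with parent_model registers version 2 of the same model:
v2 = aiplatform.Model.upload(
    display_name="sklearn-model",
    artifact_uri="gs://<bucket>/model_v2/",
    serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest"
    ),
    parent_model=v1.resource_name,
    is_default_version=True,
)

# A specific version can then be loaded with the "@<version>" suffix:
model_v1 = aiplatform.Model(f"{v1.resource_name}@1")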

Getting started instructions fail at "sudo apt-get install google-cloud-sdk"

Step 6 in the getting started instructions fails because google-cloud-sdk tries to overwrite a LICENSE file installed by google-cloud-cli. This seems like it probably isn't a problem with this repo; I think you'll have to chase the SDK folks to get it fixed.

$ sudo apt-get install google-cloud-sdk
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Suggested packages:
  google-cloud-sdk-app-engine-java google-cloud-sdk-app-engine-python google-cloud-sdk-pubsub-emulator google-cloud-sdk-bigtable-emulator google-cloud-sdk-datastore-emulator
The following NEW packages will be installed:
  google-cloud-sdk
0 upgraded, 1 newly installed, 0 to remove and 21 not upgraded.
Need to get 0 B/153 MB of archives.
After this operation, 755 MB of additional disk space will be used.
(Reading database ... 138034 files and directories currently installed.)
Preparing to unpack .../google-cloud-sdk_444.0.0-0_all.deb ...
Unpacking google-cloud-sdk (444.0.0-0) ...
dpkg: error processing archive /var/cache/apt/archives/google-cloud-sdk_444.0.0-0_all.deb (--unpack):
 trying to overwrite '/usr/share/google-cloud-sdk/LICENSE', which is also in package google-cloud-cli 438.0.0-0
dpkg-deb: error: paste subprocess was killed by signal (Broken pipe)
Errors were encountered while processing:
 /var/cache/apt/archives/google-cloud-sdk_444.0.0-0_all.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)

I tried apt-get update and apt-get upgrade, which got me to CLI version 444, but that had the same problem:

$ sudo apt-get install google-cloud-sdk
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Suggested packages:
  google-cloud-sdk-app-engine-java google-cloud-sdk-app-engine-python google-cloud-sdk-pubsub-emulator google-cloud-sdk-bigtable-emulator google-cloud-sdk-datastore-emulator
The following NEW packages will be installed:
  google-cloud-sdk
0 upgraded, 1 newly installed, 0 to remove and 1 not upgraded.
Need to get 0 B/153 MB of archives.
After this operation, 755 MB of additional disk space will be used.
(Reading database ... 138797 files and directories currently installed.)
Preparing to unpack .../google-cloud-sdk_444.0.0-0_all.deb ...
Unpacking google-cloud-sdk (444.0.0-0) ...
dpkg: error processing archive /var/cache/apt/archives/google-cloud-sdk_444.0.0-0_all.deb (--unpack):
 trying to overwrite '/usr/share/google-cloud-sdk/LICENSE', which is also in package google-cloud-cli 444.0.0-0
dpkg-deb: error: paste subprocess was killed by signal (Broken pipe)
Errors were encountered while processing:
 /var/cache/apt/archives/google-cloud-sdk_444.0.0-0_all.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)

DataTransformer ModuleNotFoundError: No module named 'user_module_0' error

ModuleNotFoundError: No module named 'user_module_0'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 649, in do_work
    work_executor.execute()
  File "/usr/local/lib/python3.7/site-packages/dataflow_worker/executor.py", line 179, in execute
    op.start()
  File "apache_beam/runners/worker/operations.py", line 710, in apache_beam.runners.worker.operations.DoOperation.start
  File "apache_beam/runners/worker/operations.py", line 712, in apache_beam.runners.worker.operations.DoOperation.start
  File "apache_beam/runners/worker/operations.py", line 713, in apache_beam.runners.worker.operations.DoOperation.start
  File "apache_beam/runners/worker/operations.py", line 311, in apache_beam.runners.worker.operations.Operation.start
  File "apache_beam/runners/worker/operations.py", line 317, in apache_beam.runners.worker.operations.Operation.start
  File "apache_beam/runners/worker/operations.py", line 659, in apache_beam.runners.worker.operations.DoOperation.setup
  File "apache_beam/runners/worker/operations.py", line 660, in apache_beam.runners.worker.operations.DoOperation.setup
  File "apache_beam/runners/worker/operations.py", line 292, in apache_beam.runners.worker.operations.Operation.setup
  File "apache_beam/runners/worker/operations.py", line 306, in apache_beam.runners.worker.operations.Operation.setup
  File "apache_beam/runners/worker/operations.py", line 799, in apache_beam.runners.worker.operations.DoOperation._get_runtime_performance_hints
  File "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", line 294, in loads
    return dill.loads(s)
  File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 275, in loads
    return load(file, ignore, **kwds)
  File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 270, in load
    return Unpickler(file, ignore=ignore, **kwds).load()
  File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 472, in load
    obj = StockUnpickler.load(self)
  File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 826, in _import_module
    return __import__(import_name)
ModuleNotFoundError: No module named 'user_module_0'

Running the transform component of this repository on Vertex AI hits the above error (in the 04-pipeline-deployment.ipynb notebook). Does anyone have a quick fix? I have tried specifying setup.py and "save_main_session": True so far, with no luck.
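Two workarounds commonly suggested for this class of Dataflow import error (both assumptions, not verified fixes for this repo): run the Dataflow workers from the same custom TFX image as the pipeline, or ship the generated module as an extra package. A sketch of the first, expressed as Beam pipeline args (PROJECT, GCS_LOCATION, and TFX_IMAGE_URI as defined in the notebook):

# Hypothetical beam_pipeline_args; TFX_IMAGE_URI is the image the pipeline
# itself was built with, so the packaged user module is importable on workers.
BEAM_PIPELINE_ARGS = [
    "--runner=DataflowRunner",
    f"--project={PROJECT}",
    f"--temp_location={GCS_LOCATION}/temp",
    f"--sdk_container_image={TFX_IMAGE_URI}",  # requires Dataflow Runner v2
    "--experiments=use_runner_v2",
]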

Error: 02 - ML Experimentation with Custom Model

Hi, I'm running the tutorial with TFX 1.4.0 and TF 2.7.0.

When I run the cell

classifier = model.create_binary_classifier(tft_output, hyperparams)
classifier.summary()

I get the error:


ValueError                                Traceback (most recent call last)
in <module>
----> 1 classifier = model.create_binary_classifier(tft_output, hyperparams)
      2 classifier.summary()

~/mlops-with-vertex-ai/src/model_training/model.py in create_binary_classifier(tft_output, hyperparams)
     83         )
     84 
---> 85     return _create_binary_classifier(feature_vocab_sizes, hyperparams)

~/mlops-with-vertex-ai/src/model_training/model.py in _create_binary_classifier(feature_vocab_sizes, hyperparams)
     62             pass
     63 
---> 64     joined = keras.layers.Concatenate(name="combines_inputs")(layers)
     65     feedforward_output = keras.Sequential(
     66         [

/opt/conda/lib/python3.7/site-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

/opt/conda/lib/python3.7/site-packages/keras/layers/merge.py in build(self, input_shape)
    514       ranks = set(len(shape) for shape in shape_set)
    515       if len(ranks) != 1:
--> 516         raise ValueError(err_msg)
    517       # Get the only rank for the set.
    518       (rank,) = ranks

ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concatenation axis. Received: input_shape=[(None, 2), (None, 4), (7,), (None, 3), (None, 1), (None, 1), (6,), (None, 3), (None, 3), (None, 1), (None, 10)]

Python markupsafe dependency error in 03-training-formalization.ipynb

The latest release of the markupsafe Python 3 package is not compatible with `from tfx.orchestration.experimental.interactive.interactive_context import InteractiveContext`:

import ml_metadata as mlmd
from ml_metadata.proto import metadata_store_pb2
from tfx.orchestration.experimental.interactive.interactive_context import InteractiveContext

Error:

ImportError                               Traceback (most recent call last)
/tmp/ipykernel_1/3645963042.py in <module>
      1 import ml_metadata as mlmd
      2 from ml_metadata.proto import metadata_store_pb2
----> 3 from tfx.orchestration.experimental.interactive.interactive_context import InteractiveContext
      4 
      5 connection_config = metadata_store_pb2.ConnectionConfig()

~/.local/lib/python3.7/site-packages/tfx/orchestration/experimental/interactive/interactive_context.py in <module>
     35 
     36 import absl
---> 37 import jinja2
     38 import nbformat
     39 from tfx import types

~/.local/lib/python3.7/site-packages/jinja2/__init__.py in <module>
     10 from .bccache import FileSystemBytecodeCache
     11 from .bccache import MemcachedBytecodeCache
---> 12 from .environment import Environment
     13 from .environment import Template
     14 from .exceptions import TemplateAssertionError

~/.local/lib/python3.7/site-packages/jinja2/environment.py in <module>
     23 from .compiler import CodeGenerator
     24 from .compiler import generate
---> 25 from .defaults import BLOCK_END_STRING
     26 from .defaults import BLOCK_START_STRING
     27 from .defaults import COMMENT_END_STRING

~/.local/lib/python3.7/site-packages/jinja2/defaults.py in <module>
      1 # -*- coding: utf-8 -*-
      2 from ._compat import range_type
----> 3 from .filters import FILTERS as DEFAULT_FILTERS  # noqa: F401
      4 from .tests import TESTS as DEFAULT_TESTS  # noqa: F401
      5 from .utils import Cycler

~/.local/lib/python3.7/site-packages/jinja2/filters.py in <module>
     11 from markupsafe import escape
     12 from markupsafe import Markup
---> 13 from markupsafe import soft_unicode
     14 
     15 from ._compat import abc

ImportError: cannot import name 'soft_unicode' from 'markupsafe' (/home/jupyter/.local/lib/python3.7/site-packages/markupsafe/__init__.py)

The workaround is to install an older version with pip install markupsafe==2.0.1 and restart the kernel.

01-dataset-management.ipynb fails at "Create Vertex Dataset resource" with "no attribute 'SUPPORTED_REGIONS'"

In 01-dataset-management.ipynb at "Create Vertex Dataset resource" I get the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/var/tmp/ipykernel_23085/3064911949.py in <module>
      1 vertex_ai.init(
      2     project=PROJECT,
----> 3     location=REGION
      4 )

/opt/conda/lib/python3.7/site-packages/google/cloud/aiplatform/initializer.py in init(self, project, location, experiment, experiment_description, staging_bucket, credentials, encryption_spec_key_name)
     97             self._project = project
     98         if location:
---> 99             utils.validate_region(location)
    100             self._location = location
    101         if staging_bucket:

/opt/conda/lib/python3.7/site-packages/google/cloud/aiplatform/utils/__init__.py in validate_region(region)
    272 
    273     region = region.lower()
--> 274     if region not in constants.SUPPORTED_REGIONS:
    275         raise ValueError(
    276             f"Unsupported region for Vertex AI, select from {constants.SUPPORTED_REGIONS}"

AttributeError: module 'google.cloud.aiplatform.constants' has no attribute 'SUPPORTED_REGIONS'

It's mentioned here: googleapis/python-aiplatform#1106

Sounds like it used to be in constants.SUPPORTED_REGIONS and now it's in constants.base.SUPPORTED_REGIONS.

A doubt regarding the TFX pipeline associated with Continuous Training

Hi @ksalama.

Thank you very much for this amazing resource. It's a mini-book in itself.

I am referring to the statement written just below the continuous training section:

The end-to-end TFX training pipeline implementation is in the src/pipelines directory, which covers the following steps:

Are these pipelines demonstrated in any of the notebooks?

google.api_core.exceptions.PermissionDenied: 403 Permission denied on resource project {PROJECT}.

Hi, I followed your tutorials and everything was fine until notebook 6, when I tried to run the create-endpoint section:

!python build/utils.py \
    --mode=create-endpoint \
    --project={PROJECT} \
    --region=us-central1 \
    --endpoint-display-name={ENDPOINT_DISPLAY_NAME}

I got this error:

google.api_core.exceptions.PermissionDenied: 403 Permission denied on resource project {PROJECT}.

Any suggestions?

Version an already trained custom model on Model Registry

Hi, everyone!

I trained a model using scikit-learn and saved it as a pickle file stored in a GCS bucket. I was wondering: how can I version this model using Model Registry, since it is already trained?

In my case, I have the option of using Cloud Run for this (a Cloud Run container that runs a retraining task every week), but I want to start doing it on Vertex AI.

After reading some articles about Vertex AI Model Registry, I concluded that the model must first be trained using a custom training job, and only after that can we begin versioning it in Model Registry. Is this correct?
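Not necessarily: a registry entry can be created directly from a pre-trained artifact with Model.upload; no custom training job is required. A minimal hedged sketch (paths are placeholders, and the prebuilt sklearn serving container expects the artifact directory to contain model.pkl or model.joblib):

from google.cloud import aiplatform

aiplatform.init(project="<project_id>", location="us-central1")

# Registers the already-trained pickle from GCS; no training job involved.
model = aiplatform.Model.upload(
    display_name="sklearn-weekly-model",
    artifact_uri="gs://<bucket>/weekly-model/",
    serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest"
    ),
)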

Error while creating model

In notebook 02, cell:

classifier = model.create_binary_classifier(tft_output, hyperparams)
classifier.summary()

The following error appears:

A `Concatenate` layer requires inputs with matching shapes except for the concatenation axis. Received: input_shape=[(None, 2), (None, 4), (7,), (None, 3), (None, 1), (None, 1), (6,), (None, 3), (None, 3), (None, 1), (None, 10)]

Python IndexError execution error in 03-training-formalization.ipynb

Python error encountered executing the following lines at [Extract train and eval splits]:

sql_query = datasource_utils.get_training_source_query(
    PROJECT, REGION, DATASET_DISPLAY_NAME, ml_use='UNASSIGNED', limit=5000)

Observed error:

IndexError                                Traceback (most recent call last)
/tmp/ipykernel_1/1584844956.py in <module>
      1 print(DATASET_DISPLAY_NAME)
      2 sql_query = datasource_utils.get_training_source_query(
----> 3     PROJECT, REGION, DATASET_DISPLAY_NAME, ml_use='UNASSIGNED', limit=5000)
      4 
      5 output_config = example_gen_pb2.Output(

~/mlops-with-vertex-ai/src/common/datasource_utils.py in get_training_source_query(project, region, dataset_display_name, ml_use, limit)
     55     dataset = vertex_ai.TabularDataset.list(
     56         filter=f"display_name={dataset_display_name}", order_by="update_time"
---> 57     )[-1]
     58     bq_source_uri = dataset.gca_resource.metadata["inputConfig"]["bigquerySource"][
     59         "uri"

IndexError: list index out of range

I also can't find the .list method that datasource_utils.py calls on google.cloud.aiplatform's TabularDataset.
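TabularDataset.list (the classmethod the repo's datasource_utils.py relies on) returns an empty list when no dataset matches the display name, or when vertex_ai.init() points at the wrong project or region, so the trailing [-1] raises IndexError. A hedged guard that makes that failure mode explicit (DATASET_DISPLAY_NAME as defined in the notebook):

from google.cloud import aiplatform as vertex_ai

datasets = vertex_ai.TabularDataset.list(
    filter=f"display_name={DATASET_DISPLAY_NAME}", order_by="update_time"
)
if not datasets:
    raise ValueError(
        f"No Vertex AI dataset named {DATASET_DISPLAY_NAME!r} found; "
        "run 01-dataset-management.ipynb first."
    )
dataset = datasets[-1]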

Give CloudBuild access to BQ and Vertex AI

Update the Terraform scripts to give the Cloud Build service account access to BigQuery and Vertex AI.
This is required for the CI/CD step that runs an e2e pipeline test and deploys the model to Vertex AI.

Missing gcloud command or terraform code to create cloud build triggers

Users would like a way to create Cloud Build triggers, either in the Terraform folder or with gcloud commands. I found that TensorFlow Transform uses the Apache Beam runner in the following Cloud Build file:

# Compile the pipeline.
- name: '$_CICD_IMAGE_URI'
  entrypoint: 'python'
  args: ['build/utils.py',
          '--mode', 'compile-pipeline',
          '--pipeline-name', '$_PIPELINE_NAME'
          ]
  dir: 'mlops-with-vertex-ai'
  env: 
  - 'PROJECT=$_PROJECT'  
  - 'REGION=$_REGION'
  - 'MODEL_DISPLAY_NAME=$_MODEL_DISPLAY_NAME'
  - 'DATASET_DISPLAY_NAME=$_DATASET_DISPLAY_NAME'  
  - 'GCS_LOCATION=$_GCS_LOCATION' 
  - 'TFX_IMAGE_URI=$_TFX_IMAGE_URI' 
  - 'BEAM_RUNNER=$_BEAM_RUNNER'
  - 'TRAINING_RUNNER=$_TRAINING_RUNNER'
  id: 'Compile Pipeline'
  waitFor: ['Local Test E2E Pipeline']

So far I have gathered an example from the notebook section "Run the training pipeline using Vertex Pipelines", which sets the pipeline configuration for the Vertex AI run, but I can't find how to create the Cloud Build trigger itself (see the sketch after this snippet):

os.environ["DATASET_DISPLAY_NAME"] = DATASET_DISPLAY_NAME
os.environ["MODEL_DISPLAY_NAME"] = MODEL_DISPLAY_NAME
os.environ["PIPELINE_NAME"] = PIPELINE_NAME
os.environ["PROJECT"] = PROJECT
os.environ["REGION"] = REGION
os.environ["GCS_LOCATION"] = f"gs://{BUCKET}/{DATASET_DISPLAY_NAME}"
os.environ["TRAIN_LIMIT"] = "85000"
os.environ["TEST_LIMIT"] = "15000"
os.environ["BEAM_RUNNER"] = "DataflowRunner"
os.environ["TRAINING_RUNNER"] = "vertex"
os.environ["TFX_IMAGE_URI"] = f"gcr.io/{PROJECT}/{DATASET_DISPLAY_NAME}:{VERSION}"
os.environ["ENABLE_CACHE"] = "1"

Error: googleapi: Error 409: The requested bucket name is not available. The bucket namespace is shared by all users of the system. Please select a different name and try again.

I literally named the bucket after a random BTC wallet address to make it unique, as the GCP guide told me to, and I am still hitting the same error while applying gcs-bucket.tf:

google_storage_bucket.bc1qxy2kgdygjrsqtzq2n0yrf2493p83kkfjhx0wlh: Creating...

Error: googleapi: Error 409: The requested bucket name is not available. The bucket namespace is shared by all users of the system. Please select a different name and try again., conflict

  with google_storage_bucket.bc1qxy2kgdygjrsqtzq2n0yrf2493p83kkfjhx0wlh,
  on gcs-bucket.tf line 17, in resource "google_storage_bucket" "bc1qxy2kgdygjrsqtzq2n0yrf2493p83kkfjhx0wlh":
  17: resource "google_storage_bucket" "bc1qxy2kgdygjrsqtzq2n0yrf2493p83kkfjhx0wlh" {
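One possible explanation (an assumption, not verifiable from the log above): if an earlier terraform apply created the bucket but failed before recording it in state, the name is now taken by your own project, and every re-apply will report 409. In that case, importing the existing bucket into state with terraform import google_storage_bucket.<resource_name> <bucket-name>, or deleting the stray bucket, clears the conflict; otherwise the 409 means another GCP user already owns that globally unique name.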
