microsoft / mcw-machine-learning

MCW Machine Learning

License: MIT License

mcw-machine-learning's Introduction

This workshop is archived and is no longer being maintained. Content is read-only.

Machine Learning

Trey Research is looking to provide the next generation experience for connected car manufacturers by enabling them to utilize AI to decide when to proactively reach out to the customer through alerts delivered directly to the car's in-dash information and entertainment head unit. For their proof-of-concept (PoC), they would like to focus on two maintenance-related scenarios.

In the first scenario, Trey Research recently instituted new regulations defining what parts are compliant or out of compliance. Rather than rely on their technicians to assess compliance, they would like to automatically assess compliance based on component notes already entered by authorized technicians. Specifically, they are looking to leverage Deep Learning technologies with Natural Language Processing techniques to scan through vehicle specification documents to find compliance issues with new regulations. Each car is then evaluated for out-of-compliance components.

In the second scenario, Trey Research would like to predict the likelihood of battery failure based on the time series-based telemetry data that the car provides. The data contains details about how the battery performs when the vehicle is started, how it is charging while running, and how well it is holding its charge, among other factors. If they detect a battery failure is imminent within the next 30 days, they would like to send an alert.

February 2021

Target audience

Data Engineers
Data Scientists
AI Engineers
Software Engineers

Abstracts

Workshop

In this workshop, you will gain a better understanding of how to combine Azure Databricks with Azure Machine Learning to build, train, and deploy machine learning and deep learning models. You will learn how to train a forecasting model against time-series data, without any code, by using automated machine learning, and how to interpret trained machine learning models. You will also learn how to use MLflow for managing experiments run directly on the Azure Databricks cluster and how MLflow can seamlessly log metrics and training artifacts in your Azure Machine Learning workspace. You will create a recurrent neural network (RNN) model using PyTorch in Azure Databricks that can be used to forecast against time-series data and train a Natural Language Processing (NLP) text classification model based on a Long Short-Term Memory (LSTM) recurrent neural network and Keras.

At the end of this workshop, you will be better able to design a solution, with an understanding of the capabilities of the Azure Machine Learning service and Azure Databricks.

Whiteboard design session

In this whiteboard design session, you will work with a group to design and implement a solution that combines Azure Databricks with Azure Machine Learning to build, train, and deploy machine learning and deep learning models. You will learn how to prepare data for training and use automated machine learning and model lifecycle management from training to deployment (in batch and real-time inferencing scenarios). You will also learn to build deep learning models for Natural Language Processing (NLP) in text classification and forecasting against time-series data and address the model interpretability problem. Finally, you will learn how to use MLflow for managing experiments run directly on the Azure Databricks cluster and how MLflow can seamlessly log metrics and training artifacts in your Azure Machine Learning workspace. In the process, you will also get to compare data with PyTorch and Keras for deep learning.

At the end of this workshop, you will have a deeper understanding of the capabilities and implementation solutions when leveraging Azure Machine Learning and Azure Databricks.

Hands-on lab

In this lab, you will use Azure Databricks in combination with Azure Machine Learning to build, train, and deploy the desired models. You will learn how to train a forecasting model against time-series data, without any code, by using automated machine learning, and how to interpret trained machine learning models. You will also learn how to use MLflow for managing experiments run directly on the Azure Databricks cluster and how MLflow can seamlessly log metrics and training artifacts in your Azure Machine Learning workspace. You will create a recurrent neural network (RNN) model using PyTorch in Azure Databricks that can be used to forecast against time-series data and train a Natural Language Processing (NLP) text classification model based on a Long Short-Term Memory (LSTM) recurrent neural network and Keras.

At the end of this lab, you will be better able to build solutions leveraging Azure Machine Learning and Azure Databricks.

Azure services and related products

Azure Databricks
Azure Machine Learning
Azure Machine Learning Automated Machine Learning
Azure Storage
IoT Hub
PyTorch
Power BI

Azure solutions

Machine Learning

Related references

Microsoft Cloud Workshop

mcw-machine-learning's People

Contributors

Stargazers

Watchers

Forkers

shirolkar nclaudiuf daovo cloudops-cherif lucianoteixeiras ujjwalmsft reinharderino jag-chichria charoen3 satonaoki ceteongvanness lidagh ram-msft magaudef nenves dexter9209 iamseacat outifaout kokohar sammydeprez datarelish ignaciofls marcelocauduro karen-lopes sidneyocirqueira-zz mcauduro qs2ag alodha100 iamjiawei1030 belsys vmaruthi robinaggarwal liqsword gutenbergalmeida spektrasystems btho733 bhaskers-blu-org2 ankitshah009 ritaab taffywrinkle claudiusgonzo maheshadba vladiliescu hitesh2462 jeffresh wongamanda sahibarneja xctpro saimachi cheahengteong niraj5aug cdiadhiou muhammadmoizulhaq terrychang1015 turretin sundayayandele element824 jasonhorner cloudlabs-mcw wbdatafocus karndeepsingh codess-aus asener1 msworkshop wiiki0807 nag9s whoiscnu lokeshwarvangala solliancenet bpkapkar tonee84 2mileslab didacloud dipankar98228 dazmost azureandsecurityotaku ricauduro rommelnatano

mcw-machine-learning's Issues

cannot import name 'TabularExplainer' from 'interpret.ext.blackbox'

I tried to run the project from the official documentation, and it does not work because of this issue.
https://github.com/microsoft/MLOps/tree/616368d8afbeeed7a9191ffedc56eaf5aed81b43

I saw you closed the same issue 7 days ago, but it still does not work.

AutoMLRun in the Databricks notebook generates error

In the notebook "Scoring streaming data", when I run cell 13's automl_run = AutoMLRun(existing_experiment, run_id), I get the following error:

ErrorResponseException Traceback (most recent call last)
/local_disk0/pythonVirtualEnvDirs/virtualEnv-920d03a1-cd34-4e96-8a87-639440323338/lib/python3.7/site-packages/azureml/_restclient/workspace_client.py in _execute_with_arguments(self, func, args_list, *args, **kwargs)
86 else:
---> 87 return self._call_api(func, *args_list, **kwargs)
88 except ErrorResponseException as e:

/local_disk0/pythonVirtualEnvDirs/virtualEnv-920d03a1-cd34-4e96-8a87-639440323338/lib/python3.7/site-packages/azureml/_restclient/clientbase.py in _call_api(self, func, *args, **kwargs)
225 else:
--> 226 return self._execute_with_base_arguments(func, *args, **kwargs)
227

/local_disk0/pythonVirtualEnvDirs/virtualEnv-920d03a1-cd34-4e96-8a87-639440323338/lib/python3.7/site-packages/azureml/_restclient/clientbase.py in _execute_with_base_arguments(self, func, *args, **kwargs)
278 return ClientBase._execute_func_internal(
--> 279 back_off, total_retry, self._logger, func, _noop_reset, *args, **kwargs)
280

machine learning sdk version issue

In Task 1, when we create an experiment, the SDK version is reported as 1.1.0rc0, whereas the screenshots show SDK version 1.0.74.1: https://github.com/microsoft/MCW-Machine-Learning/blob/master/Hands-on%20lab/HOL%20step-by%20step%20-%20Machine%20Learning.md#task-2-review-the-experiment-run-results. After this run, there is a step in the Stream Scoring notebook.

We tried replacing SDK version 1.0.83, used in cmd 1 of the Stream Scoring notebook, with the version used by the machine learning experiment, but we still get the issue.

the name of workspace is 'ws' not 'workspace' in notebook of Data Preparation

AnalysisException: Incompatible format detected

When I run the 3.0 Deep Learning with Time Series notebook and reach cmd 44:

reloaded_df = spark.read.format("delta").load(output_folder)
display(reloaded_df)

it shows the following error: AnalysisException: Incompatible format detected.

AnalysisException                         Traceback (most recent call last)
<command-1129672552436094> in <module>
----> 1 reloaded_df = spark.read.format("delta").load(output_folder)
      2 display(reloaded_df)

/databricks/spark/python/pyspark/sql/readwriter.py in load(self, path, format, schema, **options)
    176         self.options(**options)
    177         if isinstance(path, basestring):
--> 178             return self._df(self._jreader.load(path))
    179         elif path is not None:
    180             if type(path) != list:

/databricks/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1303         answer = self.gateway_client.send_command(command)
   1304         return_value = get_return_value(
-> 1305             answer, self.gateway_client, self.target_id, self.name)
   1306 
   1307         for temp_arg in temp_args:

/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
    131                 # Hide where the exception came from that shows a non-Pythonic
    132                 # JVM exception message.
--> 133                 raise_from(converted)
    134             else:
    135                 raise

/databricks/spark/python/pyspark/sql/utils.py in raise_from(e)

AnalysisException: Incompatible format detected.

You are trying to read from `/tmp/streaming/4a964217-549b-4473-bc5b-62aa656d0ada/output` using Databricks Delta, but there is no
transaction log present. Check the upstream job to make sure that it is writing
using format("delta") and that you are trying to read from the table base path.

To disable this check, SET spark.databricks.delta.formatCheck.enabled=false
To learn more about Delta, see https://docs.microsoft.com/azure/databricks/delta/index
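
The message above says the path has no transaction log. Delta Lake keeps that log in a `_delta_log` subdirectory of the table base path, so one way to check before calling `spark.read.format("delta")` is to look for that directory. A minimal sketch, assuming the path is reachable through the local filesystem (on Databricks, DBFS paths are typically exposed under `/dbfs/`); the helper name `has_delta_log` is hypothetical and not part of the workshop notebooks:

```python
from pathlib import Path


def has_delta_log(table_path: str) -> bool:
    """Return True if table_path contains a Delta transaction log directory."""
    return (Path(table_path) / "_delta_log").is_dir()


# Illustrative usage (the path is a placeholder, not taken from the workshop):
# if not has_delta_log("/dbfs/tmp/streaming/<run-id>/output"):
#     print("No _delta_log found; was the upstream write done with format('delta')?")
```

If the check fails, the upstream cell most likely wrote the data in another format or has not finished writing yet, which matches the hint in the error text.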

Training data is saved as xls instead of csv

FYI, the training data is saved as .xls instead of .csv when one chooses "Save link as" in the step below:
https://github.com/microsoft/MCW-Machine-Learning/blob/master/Hands-on%20lab/Before%20the%20HOL%20-%20Machine%20Learning.md#task-6-download-training-data-to-your-local-machine

Cheers.

screenshot need to be updated

In Exercise 1, Task 1, Step 12, the screenshot and instructions need to be updated, as the "Group by column" option has been replaced with "Time series identifier".

Databricks AML pre-req library failure

Hello,
I received this via email from Clement Le Roux. Please review and advise.

As I walked through the MCW-Machine Learning workshop, I hit an issue during the Before the Hands-on Lab activity.
https://github.com/microsoft/MCW-Machine-Learning
When I tried to install the Databricks AML prerequisite library, it failed.
After several attempts, I finally succeeded by installing azureml-sdk[automl_databricks] rather than azureml-sdk[automl_databricks,explain].
Then I was able to run the notebooks successfully with the appropriate imports.
According to the Azure ML documentation, it appears that azureml-sdk[automl_databricks] has to be installed alone:
https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-environment#azure-databricks
Warning
No other SDK extras can be installed. Choose only one of the preceding options [databricks] or [automl_databricks].
Maybe the documentation needs an update.
Anyway, this is really great material that I am going to use in the field, so thank you to everyone working on it 😉

Kind regards
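
The warning quoted above amounts to a simple rule: the requirement should name exactly one extra. As a sketch of how a setup script might check that before installing, the following parses the bracketed extras from a requirement string. The function names are hypothetical, and the regex only handles the simple `name[extra1,extra2]` form, not the full requirement syntax:

```python
import re


def parse_extras(requirement: str) -> list:
    """Return the extras listed in square brackets, e.g. 'pkg[a, b]' -> ['a', 'b']."""
    match = re.search(r"\[([^\]]*)\]", requirement)
    if not match:
        return []
    # Split on commas and drop surrounding whitespace and empty entries.
    return [part.strip() for part in match.group(1).split(",") if part.strip()]


def has_single_extra(requirement: str) -> bool:
    """True when exactly one extra is requested."""
    return len(parse_extras(requirement)) == 1


print(has_single_extra("azureml-sdk[automl_databricks]"))          # True
print(has_single_extra("azureml-sdk[automl_databricks,explain]"))  # False
```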

March 2020 – content update

Suggestion content goes here

5.0 Model Interpretability failing

Hi, while running cmd 21 we are getting the following error:

Can you please take a look at this, as we have a workshop scheduled for this lab?

Issue in 4.0 Deep Learning with Text, cmd-4

While running cmd-4, we get the following error: module 'keras.utils.generic_utils' has no attribute 'populate_dict_with_module_objects'.

Can you please take a look at this?

December 2019 - content update

This workshop is scheduled for a content update. Please review the current workshop, any open issues, and provide update suggestions for review. Thanks!

cannot import name 'get_db_profile_from_uri' from 'mlflow.utils.uri'

While running cmd 25 in Exercise 4, we get the following error: cannot import name 'get_db_profile_from_uri' from 'mlflow.utils.uri'

Exercise 1 : Creating a forecast model using automated machine learning

In Exercise 1 > Task 3, once we click on Deploy Best Model, it goes directly to the tab for deploying the model, not just registering it. The Azure portal UI has been updated. Can you please make the required changes in the guide as soon as possible?
We have a Machine Learning workshop scheduled for 23rd August 2019.

September 2020 Updates

SDK version update from 1.0.74 to 1.0.83

In cmd 2 of the Stream Scoring notebook, the SDK version needs to be changed from 1.0.74 to 1.0.83, because the experiment uses version 1.0.83 and running 1.0.74 in the notebook throws an error.
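
A mismatch like this can be caught early by comparing the installed SDK version with the one the experiment used, before the notebook tries to load the model. A minimal sketch using only the standard library (Python 3.8 or later); the distribution name `azureml-sdk` and the helper names are assumptions to adjust for your cluster:

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def check_sdk_version(expected: str, distribution: str = "azureml-sdk") -> bool:
    """Report whether the installed version equals the expected one."""
    found = installed_version(distribution)
    if found != expected:
        print(f"{distribution}: expected {expected}, found {found}")
        return False
    return True
```

Calling `check_sdk_version("1.0.83")` in the first notebook cell would flag the situation described here before any later cell fails.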

In Exercise 1, Task 1, Step 13, we have to add Battery_ID in the "Group by column" field.
The screenshots need to be updated, as this option is not available in the Additional configuration window; it appears after that window.

Stream Scoring notebook error

In cmd 14, when we run the cell, we get the following error:

Can you please take a look at this?

ReadMe page

ReadMe page needs to be updated; everything else in the test/fix was good.

Guidance: The intro at the top of the page is supposed to be a teaser, just a couple of short paragraphs, NOT a copy/paste of the full intro.

stream scoring notebook

While running cmd 23 in the notebook, I am getting this error:

Stream Scoring: ImportError: cannot import name 'easter' from 'holidays'

In Exercise 3 (Stream Scoring notebook), the above error appears in cell 13 (experiment run id obfuscated afterwards):

The error is caused by this line:
best_run, best_model = automl_run.get_output()

The same notebook ran successfully a few weeks ago, but now it crashes. I taught this workshop on Friday 31 January 2020, and all the participants were able to reproduce the issue. Here is the full stack trace from the notebook output:

ImportError                               Traceback (most recent call last)
<command-708075521530493> in <module>
      5 
      6 automl_run = AutoMLRun(existing_experiment, run_id)
----> 7 best_run, best_model = automl_run.get_output()

/local_disk0/pythonVirtualEnvDirs/virtualEnv-ce55e929-5c7f-400c-bcae-bb5fde908b92/lib/python3.7/site-packages/azureml/train/automl/run.py in get_output(self, iteration, metric, return_onnx_model, return_split_onnx_model, **kwargs)
    493             )
    494         else:
--> 495             fitted_model = self._download_automl_model(curr_run)
    496 
    497         return curr_run, fitted_model

/local_disk0/pythonVirtualEnvDirs/virtualEnv-ce55e929-5c7f-400c-bcae-bb5fde908b92/lib/python3.7/site-packages/azureml/train/automl/run.py in _download_automl_model(self, curr_run)
    605             import azureml.train.automl.runtime
    606             with open(self.local_model_path, "rb") as model_file:
--> 607                 fitted_model = pickle.load(model_file)
    608         except ImportError as e:
    609             # Check to see if importing azureml.train.automl.runtime specifically failed

/local_disk0/pythonVirtualEnvDirs/virtualEnv-ce55e929-5c7f-400c-bcae-bb5fde908b92/lib/python3.7/site-packages/fbprophet/__init__.py in <module>
      6 # of patent rights can be found in the PATENTS file in the same directory.
      7 
----> 8 from fbprophet.forecaster import Prophet
      9 
     10 __version__ = '0.5'

/local_disk0/pythonVirtualEnvDirs/virtualEnv-ce55e929-5c7f-400c-bcae-bb5fde908b92/lib/python3.7/site-packages/fbprophet/forecaster.py in <module>
     18 
     19 from fbprophet.diagnostics import prophet_copy
---> 20 from fbprophet.make_holidays import get_holiday_names, make_holidays_df
     21 from fbprophet.models import prophet_stan_model
     22 from fbprophet.plot import (plot, plot_components, plot_forecast_component,

/local_disk0/pythonVirtualEnvDirs/virtualEnv-ce55e929-5c7f-400c-bcae-bb5fde908b92/lib/python3.7/site-packages/fbprophet/make_holidays.py in <module>
     14 import pandas as pd
     15 
---> 16 import fbprophet.hdays as hdays_part2
     17 import holidays as hdays_part1
     18 

/local_disk0/pythonVirtualEnvDirs/virtualEnv-ce55e929-5c7f-400c-bcae-bb5fde908b92/lib/python3.7/site-packages/fbprophet/hdays.py in <module>
     14 
     15 from convertdate.islamic import from_gregorian, to_gregorian
---> 16 from holidays import WEEKEND, HolidayBase, easter, rd
     17 from lunardate import LunarDate
     18 

ImportError: cannot import name 'easter' from 'holidays' (/local_disk0/pythonVirtualEnvDirs/virtualEnv-ce55e929-5c7f-400c-bcae-bb5fde908b92/lib/python3.7/site-packages/holidays/__init__.py)

Error on notebook

Hello, I have an error in the 'Model Explainability' notebook. When I execute this step, I get an error that says 'need torch' ->

So I installed torch on my cluster, but I get another error ->

Facing issue in Stream Scoring Notebook

In exercise 3, while running the cell Create and Run Experiment in the Stream Scoring notebook, I am getting the following error.

Exercise 2: Understanding the automated ML generated forecast model using model explainability

/databricks/python/lib/python3.5/site-packages/azureml/explain/model/dataset/dataset_wrapper.py:177: FutureWarning: in the future, boolean array-likes will be handled as a boolean array index
categorical_col_names = list(np.array(list(tmp_dataset))[(tmp_dataset.applymap(type) == str).all(0)])
AttributeError: 'SystemException' object has no attribute 'from_exception'

AttributeError Traceback (most recent call last)
in ()
1 from azureml.train.automl.automlexplainer import explain_model
2
----> 3 shap_values, expected_values, sorted_global_importance_values, sorted_global_importance_names, _ , _ = explain_model(best_model, X_train, X_test, best_run=best_run, y_train=y_train)
4
5 #Overall feature importance

/databricks/python/lib/python3.5/site-packages/azureml/train/automl/automlexplainer.py in explain_model(fitted_model, X_train, X_test, best_run, features, y_train, **kwargs)
166 message = "[RunId:{}]Explain model function met import error. Error message:{}".format(run_id, e)
167 logger.warning(message)
--> 168 raise SystemException(message).from_exception(e)

AttributeError: 'SystemException' object has no attribute 'from_exception'

UserErrorException: UserErrorException

I have uncommented the first and second lines of the code:

from azureml.core.authentication import InteractiveLoginAuthentication
interactive_auth = InteractiveLoginAuthentication(tenant_id="xxxx")

# Connect to the Azure ML Workspace
ws = Workspace(subscription_id, resource_group, workspace_name)

# Get default datastore to upload prepared data
datastore = ws.get_default_datastore()

but when I run it, I get a UserErrorException, as follows:

---------------------------------------------------------------------------
UserErrorException                        Traceback (most recent call last)
<command-1129672552436141> in <module>
      3 
      4 # Connect to the Azure ML Workspace
----> 5 ws = Workspace(subscription_id, resource_group, workspace_name)
      6 
      7 # Get default datastore to upload prepared data

/databricks/python/lib/python3.7/site-packages/azureml/core/workspace.py in __init__(self, subscription_id, resource_group, workspace_name, auth, _location, _disable_service_check, _workspace_id, sku, tags)
    203         if not _disable_service_check:
    204             auto_rest_workspace = _commands.get_workspace(
--> 205                 auth, subscription_id, resource_group, workspace_name)
    206             self._workspace_autorest_object = auto_rest_workspace
    207 

/databricks/python/lib/python3.7/site-packages/azureml/_project/_commands.py in get_workspace(auth, subscription_id, resource_group_name, workspace_name)
    381     """
    382     try:
--> 383         workspaces = auth._get_service_client(AzureMachineLearningWorkspaces, subscription_id).workspaces
    384         return WorkspacesOperations.get(
    385             workspaces,

/databricks/python/lib/python3.7/site-packages/azureml/core/authentication.py in _get_service_client(self, client_class, subscription_id, subscription_bound, base_url)
    174         if subscription_id:
    175             all_subscription_list, tenant_id = self._get_all_subscription_ids()
--> 176             self._check_if_subscription_exists(subscription_id, all_subscription_list, tenant_id)
    177 
    178         if not base_url:

/databricks/python/lib/python3.7/site-packages/azureml/core/authentication.py in _check_if_subscription_exists(self, subscription_id, subscription_id_list, tenant_id)
    566     def _check_if_subscription_exists(self, subscription_id, subscription_id_list, tenant_id):
    567         super(InteractiveLoginAuthentication, self)._check_if_subscription_exists(subscription_id,
--> 568                                                                                   subscription_id_list, tenant_id)
    569 
    570     def _get_ambient(self, cloud):

/databricks/python/lib/python3.7/site-packages/azureml/core/authentication.py in _check_if_subscription_exists(self, subscription_id, subscription_id_list, tenant_id)
    259                                      "authentication mechanisms in azureml-sdk.".format(tenant_id,
    260                                                                                         subscription_id,
--> 261                                                                                         subscription_id_list))
    262 
    263 def _login_on_failure_decorator(lock_to_use):
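
In the snippet reported above, `interactive_auth` is created but never passed to `Workspace`, so the constructor may fall back to default credentials that resolve to a different tenant. The traceback shows that `Workspace.__init__` accepts an `auth` parameter, so one thing to try is passing the object explicitly. This is a sketch of that idea, not a confirmed fix; the wrapper function `connect_workspace` is hypothetical, and the imports are deferred so the function can be defined even where the Azure ML SDK is not installed:

```python
def connect_workspace(subscription_id, resource_group, workspace_name, tenant_id):
    """Connect to an Azure ML workspace using interactive login for a specific tenant."""
    # Deferred imports: only needed when the function is actually called.
    from azureml.core import Workspace
    from azureml.core.authentication import InteractiveLoginAuthentication

    # Force authentication against the tenant that owns the subscription.
    interactive_auth = InteractiveLoginAuthentication(tenant_id=tenant_id)

    # Pass the auth object explicitly instead of relying on default credentials.
    return Workspace(
        subscription_id,
        resource_group,
        workspace_name,
        auth=interactive_auth,
    )
```

If the error persists with `auth` supplied, the subscription ID or tenant ID itself is worth double-checking, since the exception is raised while looking up the subscription in the tenant.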
