mcw-machine-learning's Introduction

This workshop is archived and is no longer being maintained. Content is read-only.

Machine Learning

Trey Research is looking to provide the next-generation experience for connected car manufacturers by enabling them to use AI to decide when to proactively reach out to the customer through alerts delivered directly to the car's in-dash information and entertainment head unit. For their proof-of-concept (PoC), they would like to focus on two maintenance-related scenarios.

In the first scenario, Trey Research recently instituted new regulations defining which parts are compliant and which are out of compliance. Rather than rely on their technicians to assess compliance, they would like to assess it automatically based on component notes already entered by authorized technicians. Specifically, they are looking to leverage Deep Learning technologies with Natural Language Processing techniques to scan through vehicle specification documents and find compliance issues with the new regulations. Each car is then evaluated for out-of-compliance components.

In the second scenario, Trey Research would like to predict the likelihood of battery failure based on the time series-based telemetry data that the car provides. The data contains details about how the battery performs when the vehicle is started, how it is charging while running, and how well it is holding its charge, among other factors. If they detect a battery failure is imminent within the next 30 days, they would like to send an alert.

February 2021

Target audience

  • Data Engineers
  • Data Scientists
  • AI Engineers
  • Software Engineers

Abstracts

Workshop

In this workshop, you will gain a better understanding of how to combine Azure Databricks with Azure Machine Learning to build, train, and deploy machine learning and deep learning models. You will learn how to train a forecasting model against time-series data, without any code, by using automated machine learning, and how to interpret trained machine learning models. You will also learn how to use MLflow for managing experiments run directly on the Azure Databricks cluster and how MLflow can seamlessly log metrics and training artifacts in your Azure Machine Learning workspace. You will create a recurrent neural network (RNN) model using PyTorch in Azure Databricks that can be used to forecast against time-series data, and train a Natural Language Processing (NLP) text classification model based on a Long Short-Term Memory (LSTM) recurrent neural network and Keras.

At the end of this workshop, you will be better able to design a solution that leverages the capabilities of the Azure Machine Learning service and Azure Databricks.

Whiteboard design session

In this whiteboard design session, you will work with a group to design and implement a solution that combines Azure Databricks with Azure Machine Learning to build, train, and deploy machine learning and deep learning models. You will learn how to prepare data for training and use automated machine learning and model lifecycle management from training to deployment (in batch and real-time inferencing scenarios). You will also learn to build deep learning models for Natural Language Processing (NLP) in text classification and forecasting against time-series data and address the model interpretability problem. Finally, you will learn how to use MLflow for managing experiments run directly on the Azure Databricks cluster and how MLflow can seamlessly log metrics and training artifacts in your Azure Machine Learning workspace. In the process, you will also get to compare data with PyTorch and Keras for deep learning.

At the end of this workshop, you will have a deeper understanding of the capabilities and implementation solutions when leveraging Azure Machine Learning and Azure Databricks.

Hands-on lab

In this lab, you will use Azure Databricks in combination with Azure Machine Learning to build, train, and deploy the desired models. You will learn how to train a forecasting model against time-series data, without any code, by using automated machine learning, and how to interpret trained machine learning models. You will also learn how to use MLflow for managing experiments run directly on the Azure Databricks cluster and how MLflow can seamlessly log metrics and training artifacts in your Azure Machine Learning workspace. You will create a recurrent neural network (RNN) model using PyTorch in Azure Databricks that can be used to forecast against time-series data, and train a Natural Language Processing (NLP) text classification model based on a Long Short-Term Memory (LSTM) recurrent neural network and Keras.

At the end of this lab, you will be better able to build solutions leveraging Azure Machine Learning and Azure Databricks.

Azure services and related products

  • Azure Databricks
  • Azure Machine Learning
  • Azure Machine Learning Automated Machine Learning
  • Azure Storage
  • IoT Hub
  • PyTorch
  • Power BI

Azure solutions

Machine Learning

mcw-machine-learning's People

Contributors

ciprianjichici, codingbandit, dawnmariedesjardins, emilysaeli, microsoftopensource, msftgits, roxanagoidaci, saimachi, sammydeprez, satonaoki, shirolkar, timahenning, vladiliescu


mcw-machine-learning's Issues

AutoMLRun in the Databricks notebook generates error

In the notebook "Scoring streaming data", when I run automl_run = AutoMLRun(existing_experiment, run_id) in cell 13, I get the following error:

ErrorResponseException Traceback (most recent call last)
/local_disk0/pythonVirtualEnvDirs/virtualEnv-920d03a1-cd34-4e96-8a87-639440323338/lib/python3.7/site-packages/azureml/_restclient/workspace_client.py in _execute_with_arguments(self, func, args_list, *args, **kwargs)
86 else:
---> 87 return self._call_api(func, *args_list, **kwargs)
88 except ErrorResponseException as e:

/local_disk0/pythonVirtualEnvDirs/virtualEnv-920d03a1-cd34-4e96-8a87-639440323338/lib/python3.7/site-packages/azureml/_restclient/clientbase.py in _call_api(self, func, *args, **kwargs)
225 else:
--> 226 return self._execute_with_base_arguments(func, *args, **kwargs)
227

/local_disk0/pythonVirtualEnvDirs/virtualEnv-920d03a1-cd34-4e96-8a87-639440323338/lib/python3.7/site-packages/azureml/_restclient/clientbase.py in _execute_with_base_arguments(self, func, *args, **kwargs)
278 return ClientBase._execute_func_internal(
--> 279 back_off, total_retry, self._logger, func, _noop_reset, *args, **kwargs)
280

Machine learning SDK version issue

In Task 1, when we create an experiment, we get SDK version 1.1.0rc0 [image], whereas the screenshots show SDK version 1.0.74.1: https://github.com/microsoft/MCW-Machine-Learning/blob/master/Hands-on%20lab/HOL%20step-by%20step%20-%20Machine%20Learning.md#task-2-review-the-experiment-run-results. After this run, there is a step in the stream scoring notebook [image]. We tried replacing SDK version 1.0.83, used in cmd 1 of the stream scoring notebook, with the version used by the machine learning experiment, but we still get the issue.

AnalysisException: Incompatible format detected

When I run the 3.0 Deep Learning with Time Series notebook and reach cmd44:

reloaded_df = spark.read.format("delta").load(output_folder)
display(reloaded_df)

it shows the following error: AnalysisException: Incompatible format detected.

AnalysisException                         Traceback (most recent call last)
<command-1129672552436094> in <module>
----> 1 reloaded_df = spark.read.format("delta").load(output_folder)
      2 display(reloaded_df)

/databricks/spark/python/pyspark/sql/readwriter.py in load(self, path, format, schema, **options)
    176         self.options(**options)
    177         if isinstance(path, basestring):
--> 178             return self._df(self._jreader.load(path))
    179         elif path is not None:
    180             if type(path) != list:

/databricks/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1303         answer = self.gateway_client.send_command(command)
   1304         return_value = get_return_value(
-> 1305             answer, self.gateway_client, self.target_id, self.name)
   1306 
   1307         for temp_arg in temp_args:

/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
    131                 # Hide where the exception came from that shows a non-Pythonic
    132                 # JVM exception message.
--> 133                 raise_from(converted)
    134             else:
    135                 raise

/databricks/spark/python/pyspark/sql/utils.py in raise_from(e)

AnalysisException: Incompatible format detected.

You are trying to read from `/tmp/streaming/4a964217-549b-4473-bc5b-62aa656d0ada/output` using Databricks Delta, but there is no
transaction log present. Check the upstream job to make sure that it is writing
using format("delta") and that you are trying to read from the table base path.

To disable this check, SET spark.databricks.delta.formatCheck.enabled=false
To learn more about Delta, see https://docs.microsoft.com/azure/databricks/delta/index
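
The message above says that no transaction log was found at the path. A Delta table keeps that log in a `_delta_log` subdirectory of the table base path, so one way to tell whether the upstream cell actually wrote Delta output is to look for that directory before reading. The sketch below is only an illustration: `has_delta_log` is a hypothetical helper, and it assumes the path is reachable through the local filesystem (on Databricks, DBFS paths are usually exposed under a `/dbfs` prefix).

```python
from pathlib import Path


def has_delta_log(table_path: str) -> bool:
    """Return True if table_path contains a non-empty _delta_log directory.

    Hypothetical helper: it only inspects the directory layout and does not
    validate the contents of the transaction log.
    """
    log_dir = Path(table_path) / "_delta_log"
    return log_dir.is_dir() and any(log_dir.iterdir())
```

If the check returns False for the output folder, the data was probably written in another format (for example plain Parquet), and the read would need to use that format, or the write would need to be repeated with format("delta").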

Screenshot needs to be updated

In Exercise 1, Task 1, Step 12, the screenshot and instructions need to be updated, as the Group by column option has been replaced with Time series identifier.

Databricks AML pre-req library failure

Hello,
I received this via email from Clement Le Roux. Please review and advise.

While walking through the MCW-Machine Learning workshop, I hit an issue during the Before Hands-on-Lab activity.
https://github.com/microsoft/MCW-Machine-Learning
When I tried to install the Databricks AML prerequisite library, the installation failed.
After several attempts, I finally succeeded by installing azureml-sdk[automl_databricks] instead of azureml-sdk[automl_databricks,explain].
I was then able to run the notebooks successfully with the appropriate imports.
According to the Azure ML documentation, azureml-sdk[automl_databricks] has to be installed alone:
https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-environment#azure-databricks
Warning
No other SDK extras can be installed. Choose only one of the preceding options [databricks] or [automl_databricks].
The workshop documentation may need an update.
Anyway, this is really great material that I am going to use in the field, so thank you to everyone working on it 😉

Kind regards

Issue in 4.0 Deep Learning with Text, cmd-4

While running cmd-4, we get the error: module 'keras.utils.generic_utils' has no attribute 'populate_dict_with_module_objects'.

Can you please take a look at this?
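
The issue does not say which package versions were installed on the cluster. An AttributeError on an internal module attribute like this one can come from mismatched library versions, so listing the installed versions is a reasonable first diagnostic step. The sketch below is an assumption-laden illustration: `installed_versions` is a hypothetical helper, and `keras` and `tensorflow` are assumed to be the relevant distribution names.

```python
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:  # Python 3.7 needs the importlib_metadata backport
    from importlib_metadata import PackageNotFoundError, version


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in package_names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Assumed package names; adjust to whatever the notebook actually imports.
print(installed_versions(["keras", "tensorflow"]))
```

Including this output in the issue would make it easier to tell whether the two libraries are compatible with each other.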

December 2019 - content update

This workshop is scheduled for a content update. Please review the current workshop, any open issues, and provide update suggestions for review. Thanks!

Exercise 1: Creating a forecast model using automated machine learning

In Exercise 1, Task 3, once we click Deploy Best Model, the UI goes directly to the tab for deploying the model instead of just registering it. The Azure portal UI has been updated. Can you please make the required changes in the guide as soon as possible?
We have a Machine Learning workshop scheduled for 23 August 2019.

SDK version update from 1.0.74 to 1.0.83

In the Stream Scoring notebook, cmd 2, the SDK version needs to be changed from 1.0.74 to 1.0.83, because the experiment uses version 1.0.83 and running 1.0.74 in the notebook throws an error.
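
Since the error comes from the notebook and the experiment using different SDK releases, a quick comparison before loading the model can surface the mismatch early. The sketch below is a minimal illustration: `same_release` is a hypothetical helper that compares only the first three dot-separated components, so that 1.0.74.1 and 1.0.74 count as the same release. The installed version would typically be read from `azureml.core.VERSION`; the expected version has to come from the experiment run.

```python
def same_release(installed: str, expected: str, parts: int = 3) -> bool:
    """Compare the first `parts` dot-separated components of two version strings."""
    return installed.split(".")[:parts] == expected.split(".")[:parts]


# Version strings taken from the issues on this page.
print(same_release("1.0.74.1", "1.0.74"))
print(same_release("1.0.74", "1.0.83"))
```

This is a plain string comparison, not a full version parser, so pre-release suffixes such as rc0 are compared literally.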

In Exercise 1, Task 1, Step 13, we have to add Battery_ID in the Group by column
Screenshots need to be updated: this option is not available in the Additional configuration window; it is available after that window.

ReadMe page

ReadMe page needs to be updated; everything else in the test/fix was good.

Guidance: The intro at the top of the page is supposed to be a teaser, just a couple of short paragraphs, NOT a copy/paste of the full intro.

Stream Scoring: ImportError: cannot import name 'easter' from 'holidays'

In Exercise 3 (Stream Scoring notebook), the above error appears in cell 13 (the experiment run ID was obfuscated afterwards).

The error is caused by this line:
best_run, best_model = automl_run.get_output()

The same notebook ran successfully a few weeks ago, but now it crashes. I taught this workshop on Friday 31 January 2020, and all the participants were able to reproduce the issue. Here is the full stack trace from the notebook output:

ImportError                               Traceback (most recent call last)
<command-708075521530493> in <module>
      5 
      6 automl_run = AutoMLRun(existing_experiment, run_id)
----> 7 best_run, best_model = automl_run.get_output()

/local_disk0/pythonVirtualEnvDirs/virtualEnv-ce55e929-5c7f-400c-bcae-bb5fde908b92/lib/python3.7/site-packages/azureml/train/automl/run.py in get_output(self, iteration, metric, return_onnx_model, return_split_onnx_model, **kwargs)
    493             )
    494         else:
--> 495             fitted_model = self._download_automl_model(curr_run)
    496 
    497         return curr_run, fitted_model

/local_disk0/pythonVirtualEnvDirs/virtualEnv-ce55e929-5c7f-400c-bcae-bb5fde908b92/lib/python3.7/site-packages/azureml/train/automl/run.py in _download_automl_model(self, curr_run)
    605             import azureml.train.automl.runtime
    606             with open(self.local_model_path, "rb") as model_file:
--> 607                 fitted_model = pickle.load(model_file)
    608         except ImportError as e:
    609             # Check to see if importing azureml.train.automl.runtime specifically failed

/local_disk0/pythonVirtualEnvDirs/virtualEnv-ce55e929-5c7f-400c-bcae-bb5fde908b92/lib/python3.7/site-packages/fbprophet/__init__.py in <module>
      6 # of patent rights can be found in the PATENTS file in the same directory.
      7 
----> 8 from fbprophet.forecaster import Prophet
      9 
     10 __version__ = '0.5'

/local_disk0/pythonVirtualEnvDirs/virtualEnv-ce55e929-5c7f-400c-bcae-bb5fde908b92/lib/python3.7/site-packages/fbprophet/forecaster.py in <module>
     18 
     19 from fbprophet.diagnostics import prophet_copy
---> 20 from fbprophet.make_holidays import get_holiday_names, make_holidays_df
     21 from fbprophet.models import prophet_stan_model
     22 from fbprophet.plot import (plot, plot_components, plot_forecast_component,

/local_disk0/pythonVirtualEnvDirs/virtualEnv-ce55e929-5c7f-400c-bcae-bb5fde908b92/lib/python3.7/site-packages/fbprophet/make_holidays.py in <module>
     14 import pandas as pd
     15 
---> 16 import fbprophet.hdays as hdays_part2
     17 import holidays as hdays_part1
     18 

/local_disk0/pythonVirtualEnvDirs/virtualEnv-ce55e929-5c7f-400c-bcae-bb5fde908b92/lib/python3.7/site-packages/fbprophet/hdays.py in <module>
     14 
     15 from convertdate.islamic import from_gregorian, to_gregorian
---> 16 from holidays import WEEKEND, HolidayBase, easter, rd
     17 from lunardate import LunarDate
     18 

ImportError: cannot import name 'easter' from 'holidays' (/local_disk0/pythonVirtualEnvDirs/virtualEnv-ce55e929-5c7f-400c-bcae-bb5fde908b92/lib/python3.7/site-packages/holidays/__init__.py)
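
The trace ends in fbprophet importing a name that the installed `holidays` package no longer exposes, which suggests a dependency changed underneath an otherwise unchanged notebook. A small check like the one below can confirm that on a given cluster before running the cell that unpickles the model. It is only a sketch: `can_import_name` is a hypothetical helper, and it tests top-level attributes only, not submodules.

```python
import importlib


def can_import_name(module_name: str, attribute: str) -> bool:
    """Return True if the module imports and exposes `attribute` at top level."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attribute)


# On an affected cluster this would be expected to print False.
print(can_import_name("holidays", "easter"))
```

If the check fails, installing a `holidays` release that still provides the name is one possible workaround; the issue does not state which release that is.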

Error on notebook

Hello, I have an error in the notebook 'Model Explainability'. When I execute this step, I get an error that says 'need torch':

[image: notebookerror1]

So I installed torch on my cluster, but I get another error:

[image: notebookerror2]

Exercise 2: Understanding the automated ML generated forecast model using model explainability

/databricks/python/lib/python3.5/site-packages/azureml/explain/model/dataset/dataset_wrapper.py:177: FutureWarning: in the future, boolean array-likes will be handled as a boolean array index
categorical_col_names = list(np.array(list(tmp_dataset))[(tmp_dataset.applymap(type) == str).all(0)])
AttributeError: 'SystemException' object has no attribute 'from_exception'

AttributeError Traceback (most recent call last)
in ()
1 from azureml.train.automl.automlexplainer import explain_model
2
----> 3 shap_values, expected_values, sorted_global_importance_values, sorted_global_importance_names, _ , _ = explain_model(best_model, X_train, X_test, best_run=best_run, y_train=y_train)
4
5 #Overall feature importance

/databricks/python/lib/python3.5/site-packages/azureml/train/automl/automlexplainer.py in explain_model(fitted_model, X_train, X_test, best_run, features, y_train, **kwargs)
166 message = "[RunId:{}]Explain model function met import error. Error message:{}".format(run_id, e)
167 logger.warning(message)
--> 168 raise SystemException(message).from_exception(e)

AttributeError: 'SystemException' object has no attribute 'from_exception'

UserErrorException: UserErrorException

I have uncommented the first and second lines of the code:

from azureml.core.authentication import InteractiveLoginAuthentication
interactive_auth = InteractiveLoginAuthentication(tenant_id="xxxx")

# Connect to the Azure ML Workspace
ws = Workspace(subscription_id, resource_group, workspace_name)

# Get default datastore to upload prepared data
datastore = ws.get_default_datastore()

but when I run it, I get UserErrorException: UserErrorException, as follows:

---------------------------------------------------------------------------
UserErrorException                        Traceback (most recent call last)
<command-1129672552436141> in <module>
      3 
      4 # Connect to the Azure ML Workspace
----> 5 ws = Workspace(subscription_id, resource_group, workspace_name)
      6 
      7 # Get default datastore to upload prepared data

/databricks/python/lib/python3.7/site-packages/azureml/core/workspace.py in __init__(self, subscription_id, resource_group, workspace_name, auth, _location, _disable_service_check, _workspace_id, sku, tags)
    203         if not _disable_service_check:
    204             auto_rest_workspace = _commands.get_workspace(
--> 205                 auth, subscription_id, resource_group, workspace_name)
    206             self._workspace_autorest_object = auto_rest_workspace
    207 

/databricks/python/lib/python3.7/site-packages/azureml/_project/_commands.py in get_workspace(auth, subscription_id, resource_group_name, workspace_name)
    381     """
    382     try:
--> 383         workspaces = auth._get_service_client(AzureMachineLearningWorkspaces, subscription_id).workspaces
    384         return WorkspacesOperations.get(
    385             workspaces,

/databricks/python/lib/python3.7/site-packages/azureml/core/authentication.py in _get_service_client(self, client_class, subscription_id, subscription_bound, base_url)
    174         if subscription_id:
    175             all_subscription_list, tenant_id = self._get_all_subscription_ids()
--> 176             self._check_if_subscription_exists(subscription_id, all_subscription_list, tenant_id)
    177 
    178         if not base_url:

/databricks/python/lib/python3.7/site-packages/azureml/core/authentication.py in _check_if_subscription_exists(self, subscription_id, subscription_id_list, tenant_id)
    566     def _check_if_subscription_exists(self, subscription_id, subscription_id_list, tenant_id):
    567         super(InteractiveLoginAuthentication, self)._check_if_subscription_exists(subscription_id,
--> 568                                                                                   subscription_id_list, tenant_id)
    569 
    570     def _get_ambient(self, cloud):

/databricks/python/lib/python3.7/site-packages/azureml/core/authentication.py in _check_if_subscription_exists(self, subscription_id, subscription_id_list, tenant_id)
    259                                      "authentication mechanisms in azureml-sdk.".format(tenant_id,
    260                                                                                         subscription_id,
--> 261                                                                                         subscription_id_list))
    262 
    263 def _login_on_failure_decorator(lock_to_use):
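
In the code quoted in this issue, interactive_auth is created with a tenant ID but never passed to Workspace, and the traceback shows that the Workspace constructor accepts an auth argument. One possible fix, assuming the subscription really belongs to the tenant given to InteractiveLoginAuthentication, is to hand the auth object over explicitly. The sketch below wraps this in a hypothetical helper, `connect_workspace`; the azureml imports are done lazily inside the function so the snippet can be defined even where the SDK is not installed.

```python
def connect_workspace(subscription_id, resource_group, workspace_name, tenant_id):
    """Create a Workspace handle that authenticates against an explicit tenant."""
    # Imported lazily so this sketch can be defined without azureml-core installed.
    from azureml.core import Workspace
    from azureml.core.authentication import InteractiveLoginAuthentication

    interactive_auth = InteractiveLoginAuthentication(tenant_id=tenant_id)
    # Pass the auth object to Workspace; otherwise it is created but never used.
    return Workspace(
        subscription_id=subscription_id,
        resource_group=resource_group,
        workspace_name=workspace_name,
        auth=interactive_auth,
    )
```

In the notebook, the returned object would take the place of ws, so that ws.get_default_datastore() continues to work as before.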
