azureml-greenai-txtsum's Introduction

Microsoft GreenAI: NLP Text Summarization (preview)

License: MIT

This repo currently contains samples for fine-tuning HuggingFace models for text summarization using Microsoft's Azure Machine Learning service. The samples can be adapted to fine-tune models for other NLP tasks or product scenarios.

What's available now?

  • AzureML v2 CLI examples for fine-tuning HuggingFace models
  • Quickstart ARM Templates for fine-tuning HuggingFace models
  • Fine-tuned HuggingFace models & results: https://huggingface.co/linydub

What's coming next?

  • Benchmarking and carbon accounting with MLflow and Azure Monitor Metrics (performance + resource metrics)
  • Interactive data visualization example with Azure Monitor Workbook
  • AML v2 CLI inference samples with ONNX Runtime and NVIDIA Triton (AML endpoint & deployment)
  • AML v2 CLI end-to-end pipeline samples
  • Repository documentation and detailed guide for the samples
  • More fine-tuned models and benchmark results

*More details about the project and future plans can be found here.

Contents

Directory   Description
cloud       Cloud-specific configuration code
docs        Project docs & images
examples    AzureML examples for sample tasks

Fine-tuning Samples

These samples showcase various methods to fine-tune HuggingFace models using AzureML. All of the samples include DeepSpeed, FairScale, CodeCarbon, and MLflow integrations with no additional setup or code.

All logged training metrics are automatically reported to AzureML and MLflow. By default, CodeCarbon also generates an emissions.csv file inside the outputs folder of the submitted run. To disable a package, omit it from the environment's conda file.
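
Since an integration is disabled by leaving its package out of the conda file, it can help to check which of them are importable in a given environment. The following is a minimal sketch using only the Python standard library. It is not the detection logic used by the samples, and the tuple of import names and the helper `available_integrations` are assumptions made for illustration.

```python
from importlib.util import find_spec

# Import names of the optional integrations mentioned above. These are
# assumed to match the top-level module names of the installed packages.
OPTIONAL_INTEGRATIONS = ("deepspeed", "fairscale", "codecarbon", "mlflow")


def available_integrations(names=OPTIONAL_INTEGRATIONS):
    """Return a dict mapping each package name to True if it is importable."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Print one line per integration for the current environment.
    for name, present in available_integrations().items():
        print(f"{name}: {'installed' if present else 'not installed'}")
```

Running this inside the job's environment shows which packages the conda file actually provided.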

*A sample script for retrieving and aggregating MLflow and resource usage data will be available in the next update.
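
In the meantime, the emissions.csv file written by CodeCarbon can be summarized with a few lines of standard-library Python. This is a rough sketch, not the project's forthcoming script. It assumes the file contains the `project_name`, `duration` (seconds), `energy_consumed` (kWh), and `emissions` (kg CO2eq) columns. The function name `summarize_emissions` is hypothetical, and missing or empty values are treated as zero.

```python
import csv
from collections import defaultdict


def summarize_emissions(csv_path):
    """Sum duration, energy, and emissions per project from a CodeCarbon CSV."""
    totals = defaultdict(
        lambda: {"runs": 0, "duration_s": 0.0, "energy_kwh": 0.0, "emissions_kg": 0.0}
    )
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            # Group rows by project; fall back to "unknown" if the column is absent.
            entry = totals[row.get("project_name") or "unknown"]
            entry["runs"] += 1
            # Column names are assumptions about the CSV layout; absent or
            # empty cells count as zero.
            entry["duration_s"] += float(row.get("duration") or 0)
            entry["energy_kwh"] += float(row.get("energy_consumed") or 0)
            entry["emissions_kg"] += float(row.get("emissions") or 0)
    return dict(totals)
```

For a downloaded run, `summarize_emissions("outputs/emissions.csv")` returns one entry per project name with the run count and the summed totals.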

Quickstart

Fine-tune a HuggingFace Model

Fine-tune with DeepSpeed ZeRO Optimizations

Hyperparameter Sweep with HyperDrive

More advanced ARM Templates will be available here.

AzureML v2 CLI Examples

Fine-tuning samples using the AzureML v2 CLI can be found here.

Inference Samples

Jupyter Notebooks

Notebook Description

Support/Feedback

Please file an issue through the repo or email me at [email protected]. Feedback is greatly appreciated 🤗
