Git Product home page Git Product logo

abiramisukumaran / llmevaluationauto Goto Github PK

View Code? Open in Web Editor NEW
2.0 1.0 0.0 774 KB

This lab evaluates a foundation model and fine tuned model based on Automatic Metrics of model results on the evaluation data and you will use the following Google Cloud products: Vertex AI Pipelines Vertex AI Evaluation Services Vertex AI Model Registry Vertex AI Endpoints

License: Apache License 2.0

Jupyter Notebook 100.00%

llmevaluationauto's Introduction

LLMEvaluationAuto

This lab evaluates a foundation model and fine tuned model based on Automatic Metrics of model results on the evaluation data and you will use the following Google Cloud products: Vertex AI Pipelines Vertex AI Evaluation Services Vertex AI Model Registry Vertex AI Endpoints

Model Evaluation with Vertex AI

LLMOps, or Large Language Model Operations, is an important methodology as organizations increasingly adopt large language models (LLMs) for a wide range of applications. LLMOps is the set of tools, processes, and best practices for managing the lifecycle of LLMs, from development and deployment to monitoring and maintenance. Vertex AI offers services to manage LLMOps pipelines as also mechanisms to evaluate the new models quality after every pipeline execution that you run.

In the realm of Generative AI, evaluation is a critical aspect of assessing the quality and relevance of the generated text. It involves examining the output from a generative language model to determine its coherence, accuracy, and alignment with the provided prompt. Model evaluation helps identify areas for improvement, optimize model performance, and ensure that the generated text meets the desired standards for quality and usefulness. Read more about it in the official documentation.

Objective

This lab teaches you how to evaluate a foundation model or a fine tuned model based on Automatic Metrics of model results on the evaluation data and you will use the following Google Cloud products:

Vertex AI Pipelines Vertex AI Evaluation Services Vertex AI Model Registry Vertex AI Endpoints

Use Case & Dataset

Using Generative AI we will evaluate a model that generates a suitable TITLE for a news BODY from BBC FULLTEXT DATA (Sourced from BigQuery Public Dataset bigquery-public-data.bbc_news.fulltext). The evaluation will cover both foundation (text-bison@002) and fine tuned (called "bbc-news-summary-tuned") models with the automatic metrics method. The sample evaluation dataset is published in this repo as a JSONL file.

Steps

Prepare data

Load Fine Tuned Model

Evaluate Fine Tuned Model

Load Base Model

Evaluate Base Model

Compare

Visualize in Google Cloud Console and in Cloud Storage Bucket

llmevaluationauto's People

Contributors

abiramisukumaran avatar

Stargazers

Arvind Bheri avatar KARUNAKARAN  VIJAYARAGHAVAN avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.