openhackathons-org / end-to-end-llm
This repository is AI Bootcamp material that consists of a workflow for LLMs.
License: Apache License 2.0
The current TRT-LLM materials discuss the hands-on aspects of getting from a model to deployment on a Triton server.
Given that TRT-LLM focuses on performance, a section discussing the performance aspects of TRT-LLM and the various optimisations available to the end user would be a useful addition.
cmd += ' -n 1 {} --model-repository={} --disable-auto-complete-config --backend-config=python,shm-region-prefix-name=prefix{} : '.format(tritonserver, model_repo, i)
to
cmd += ' -n 1 {} --model-repository={} --disable-auto-complete-config --backend-config=python,shm-region-prefix-name=prefix{} : '.format(tritonserver, model_repo, str(i) + os.environ['USER'])
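The rationale for the suggested change is that the Python-backend shared-memory region prefix must be unique per user on a shared system; appending the username avoids collisions when several users launch servers on the same node. A minimal sketch of the fixed command construction (the paths and the loop bound are illustrative assumptions, not values from the repo):

```python
import os

# Illustrative values -- the real script derives these from its environment.
tritonserver = "/opt/tritonserver/bin/tritonserver"
model_repo = "/workspace/model_repo"

cmd = ""
for i in range(2):  # e.g. one server instance per GPU
    # Appending the username to the shm-region prefix makes the
    # Python-backend shared-memory regions unique per user.
    cmd += (' -n 1 {} --model-repository={} --disable-auto-complete-config'
            ' --backend-config=python,shm-region-prefix-name=prefix{} : '
            .format(tritonserver, model_repo,
                    str(i) + os.environ.get('USER', 'user')))
```

Without the username suffix, two users running the lab on the same host would both create regions named `prefix0`, `prefix1`, … and the second launch would fail.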
An important aspect of deployment is that the model needs to be served to a wide range of users. Understanding throughput and latency, and comparing optimised deployments against the vanilla deployment, would give a better picture of the deployment requirements.
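To make such a comparison concrete, the raw numbers from a load test can be reduced to throughput and latency percentiles. A self-contained sketch (the latency values are made-up examples, not measurements from this repo):

```python
# Summarise per-request latencies collected from a load test window.
def summarize(latencies_s, window_s):
    """Return throughput (req/s) and p50/p95 latency from a list of latencies."""
    xs = sorted(latencies_s)

    def pct(p):
        # Nearest-rank percentile over the sorted samples.
        idx = min(len(xs) - 1, max(0, int(round(p / 100 * (len(xs) - 1)))))
        return xs[idx]

    return {
        "throughput_rps": len(xs) / window_s,
        "p50_s": pct(50),
        "p95_s": pct(95),
    }

# Ten hypothetical request latencies observed in a 2-second window.
stats = summarize([0.12, 0.15, 0.11, 0.30, 0.14, 0.13, 0.50, 0.12, 0.16, 0.13],
                  window_s=2.0)
```

Reporting p50 alongside p95 matters because batching optimisations often improve throughput while lengthening the latency tail, which is exactly the vanilla-vs-optimised trade-off the comparison should expose.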
TRT-LLM does a great job of optimising the supported set of models, but a notebook or section discussing the workflow and steps to integrate a custom model would be very helpful for custom integrations.
The deployment guide states the following:
When you are inside the container, launch JupyterLab: jupyter-lab --no-browser --allow-root --ip=0.0.0.0 --port=8888 --NotebookApp.token="" --notebook-dir=/workspace.
Open the browser at http://localhost:8888 and click on the Start_here.ipynb notebook.
But when building the container there is no actual Start_here.ipynb (unless you go to archived/workspace, which suggests to me that it is either deprecated, or it is not well defined where I should look for the notebook).
In Nemo_primer.ipynb, when running import nemo.collections.asr as nemo_asr, import nemo.collections.nlp as nemo_nlp, and import nemo.collections.tts as nemo_tts, I get the following error:
ImportError: tokenizers>=0.11.1,!=0.11.3,<0.14 is required for a normal functioning of this module, but found tokenizers==0.15.2.
If I try to solve it with pip install tokenizers==0.13.1, I get this other error:
File /usr/local/lib/python3.10/dist-packages/pytorch_lightning/_graveyard/utilities.py:25
     17 def _get_gpu_memory_map() -> None:
     18     # TODO: Remove in v2.0.0
     19     raise RuntimeError(
     20         "pytorch_lightning.utilities.memory.get_gpu_memory_map was deprecated in v1.5 and is no longer supported"
     21         " as of v1.9. Use pytorch_lightning.accelerators.cuda.get_nvidia_gpu_stats instead."
     22     )
---> 25 pl.utilities.memory.get_gpu_memory_map = _get_gpu_memory_map

AttributeError: partially initialized module 'pytorch_lightning' has no attribute 'utilities' (most likely due to a circular import)
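The root cause of the first error is just a failed version-specifier check: the preinstalled tokenizers 0.15.2 falls outside the required range. A self-contained sketch of the check (real tools use the `packaging` library; this simplified version handles plain X.Y.Z versions only):

```python
# Check a version against the constraint from the error message:
#   tokenizers>=0.11.1,!=0.11.3,<0.14
def parse(v):
    """Turn 'X.Y.Z' into a comparable tuple of ints."""
    return tuple(int(part) for part in v.split("."))

def satisfies(installed):
    return (parse("0.11.1") <= parse(installed) < parse("0.14")
            and installed != "0.11.3")

satisfies("0.15.2")  # False -- the preinstalled version, hence the ImportError
satisfies("0.13.1")  # True  -- but downgrading then trips the lightning error
```

This is why downgrading tokenizers alone does not help: it fixes this check but surfaces the pytorch_lightning circular-import failure above, so the whole dependency set needs to be pinned consistently.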
It might be helpful to pin the desired package versions in the pip installs inside Dockerfile_nemo, because
RUN pip install lightning
RUN pip install megatron.core
RUN pip install --upgrade nemoguardrails
RUN pip install openai
RUN pip install ujson
RUN pip install --upgrade --no-cache-dir gdown
may install new, incompatible versions of the libraries (incompatible, that is, with the tutorials shown in the notebooks).
This feature request is about creating content that demonstrates how to connect NeMo Guardrails to a Llama-2-7b-chat TensorRT engine deployed on Triton Inference Server. This approach avoids the need for an OpenAI key and bypasses the NeMo-LLM Service when using NeMo Guardrails to guard user prompts to/from the deployed model. The LangChain framework can be used to achieve this.
This feature is required to complete the end-to-end LLM pipeline. The process should include:
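One piece of such content would be the request a custom LangChain LLM wrapper sends to the Triton endpoint. A minimal sketch of building that request with only the standard library (the endpoint path and the field names `text_input`/`max_tokens` follow common TRT-LLM backend conventions but are assumptions here, and should be checked against the actual deployment):

```python
import json

def build_generate_request(prompt, model="ensemble", max_tokens=128):
    """Build the URL path and JSON body for a Triton generate call.

    A custom LangChain LLM wrapper could POST this body to the Triton
    server instead of calling OpenAI or the NeMo-LLM Service.
    """
    url = "/v2/models/{}/generate".format(model)
    body = json.dumps({"text_input": prompt, "max_tokens": max_tokens})
    return url, body

url, body = build_generate_request("Hello")
```

Guardrails would then wrap this call on both sides: checking the user prompt before it is sent, and the `text_output` of the response before it reaches the user.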
In workspace/jupyter_notebook/nemo/Multitask_Prompt_and_PTuning.ipynb, the code references megatron_gpt_prompt_learning_config.yaml, but I couldn't locate this file.
Is there a source where I can find the megatron_gpt_prompt_learning_config.yaml file?
Many unnecessary files and folders are included in the NeMo Guardrails lab, making navigation difficult. The lab should not contain the entire cloned repository, only a folder with the needed files, folders, and notebooks. The Deployment_Guide.md file should explicitly state the services and requirements (OpenAI and NeMo LLM Service) needed to run the lab.
NeMo container issues:
Start_Here.ipynb link conflicts for different containers.
No TRT-LLM or Triton version is pinned, so there are version conflicts.
Solved by #23.
The README file requires an update to match the copyedited version.