  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py", line 124, in serve_inner
    model = get_model(model_id, revision, sharded, quantize, trust_remote_code)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/__init__.py", line 208, in get_model
    raise NotImplementedError("sharded is not supported for this model")
NotImplementedError: sharded is not supported for this model rank=0
2023-07-12T14:57:37.591548Z ERROR text_generation_launcher: Shard 1 failed to start:
Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 67, in serve
    server.serve(model_id, revision, sharded, quantize, trust_remote_code, uds_path)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py", line 155, in serve
    asyncio.run(serve_inner(model_id, revision, sharded, quantize, trust_remote_code))
  File "/opt/conda/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py", line 124, in serve_inner
    model = get_model(model_id, revision, sharded, quantize, trust_remote_code)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/__init__.py", line 208, in get_model
    raise NotImplementedError("sharded is not supported for this model")
NotImplementedError: sharded is not supported for this model
2023-07-12T14:57:37.591613Z  INFO text_generation_launcher: Shutting down shards
2023-07-12T14:57:37.656495Z  INFO text_generation_launcher: Shard 0 terminated
Error: ShardCannotStart
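
Every shard dies with the same NotImplementedError in get_model: the server was started with sharded=True, but the model implementation in this container refuses to run sharded, so the launcher shuts the remaining shards down and exits with ShardCannotStart. TGI starts one shard per requested GPU, and in the SageMaker TGI container the SM_NUM_GPUS environment variable sets that shard count, so any value above 1 implies sharded (tensor-parallel) mode. The CDK stack that produced the failing endpoint is shown below; note SM_NUM_GPUS: '4' on an ml.g5.2xlarge, an instance type that has a single A10G GPU.
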
import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import { Construct } from 'constructs';
import { ChatBotBackendStack } from './chatbot-backend/chatbot-backend-stack';
import {
  LargeLanguageModel,
  ModelKind,
  ContainerImages,
} from './large-language-model';

export interface ChatBotStackProps extends cdk.StackProps {
  vpc: ec2.Vpc;
  semanticSearchApi: lambda.Function | null;
  maxParallelLLMQueries: number;
}

export class ChatBotStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props: ChatBotStackProps) {
    super(scope, id, {
      description: 'AWS LLM CHATBOT (uksb-1tupboc16)',
      ...props,
    });

    const { vpc, semanticSearchApi, maxParallelLLMQueries } = props;
    const largeLanguageModels = this.createLLMs({ vpc });

    new ChatBotBackendStack(this, 'ChatBotBackendStack', {
      vpc,
      semanticSearchApi,
      largeLanguageModels,
      maxParallelLLMQueries,
    });
  }

  createLLMs({ vpc }: { vpc: ec2.Vpc }) {
    const falcon7bInstruct = new LargeLanguageModel(
      this,
      'tiiuae-falcon7b-instruct',
      {
        vpc,
        region: this.region,
        model: {
          kind: ModelKind.Container,
          modelId: 'tiiuae/falcon-7b-instruct',
          container: ContainerImages.HF_PYTORCH_LLM_TGI_INFERENCE_LATEST,
          instanceType: 'ml.g5.2xlarge',
          env: {
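            // SM_NUM_GPUS is passed through to TGI as the shard count (NUM_SHARD);
            // any value greater than 1 makes the launcher try to shard the model,
            // and ml.g5.2xlarge only has a single A10G GPU.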
            SM_NUM_GPUS: '4',
          },
        },
      }
    );
    // ...

    return [falcon7bInstruct];
  }
}
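
A minimal sketch of a corrected definition follows, assuming the only change needed is to stop requesting sharding: SM_NUM_GPUS is set to '1' to match the single GPU on ml.g5.2xlarge, and everything else stays as in the stack above.

// Sketch only: identical to the definition above except for SM_NUM_GPUS.
// With a single shard the launcher never takes the sharded code path,
// so falcon-7b-instruct starts without the NotImplementedError.
const falcon7bInstruct = new LargeLanguageModel(
  this,
  'tiiuae-falcon7b-instruct',
  {
    vpc,
    region: this.region,
    model: {
      kind: ModelKind.Container,
      modelId: 'tiiuae/falcon-7b-instruct',
      container: ContainerImages.HF_PYTORCH_LLM_TGI_INFERENCE_LATEST,
      instanceType: 'ml.g5.2xlarge',
      env: {
        SM_NUM_GPUS: '1', // one shard, no tensor parallelism
      },
    },
  }
);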