
langroid's Introduction


Langroid is an intuitive, lightweight, extensible and principled Python framework to easily build LLM-powered applications, from ex-CMU and UW-Madison researchers. You set up Agents, equip them with optional components (LLM, vector-store and tools/functions), assign them tasks, and have them collaboratively solve a problem by exchanging messages. This Multi-Agent paradigm is inspired by the Actor Framework (but you do not need to know anything about this!).

Langroid is a fresh take on LLM app-development, where considerable thought has gone into simplifying the developer experience; it does not use Langchain.

🔥 See this Intro to Langroid blog post from the LanceDB team

We welcome contributions -- See the contributions document for ideas on what to contribute.

Are you building LLM Applications, or want help with Langroid for your company, or want to prioritize Langroid features for your company use-cases? Prasad Chalasani is available for consulting (advisory/development): pchalasani at gmail dot com.

Sponsorship is also accepted via GitHub Sponsors

Questions, Feedback, Ideas? Join us on Discord!

Quick glimpse of coding with Langroid

This is just a teaser; there's much more, like function-calling/tools, Multi-Agent Collaboration, Structured Information Extraction, DocChatAgent (RAG), SQLChatAgent, non-OpenAI local/remote LLMs, etc. Scroll down or see docs for more. See the Langroid Quick-Start Colab that builds up to a 2-agent information-extraction example using the OpenAI ChatCompletion API. See also this version that uses the OpenAI Assistants API instead.

🔥 just released! Example script showing how you can use Langroid multi-agents and tools to extract structured information from a document using only a local LLM (Mistral-7b-instruct-v0.2).

import langroid as lr
import langroid.language_models as lm

# set up LLM
llm_cfg = lm.OpenAIGPTConfig( # or OpenAIAssistant to use Assistant API 
  # any model served via an OpenAI-compatible API
  chat_model=lm.OpenAIChatModel.GPT4_TURBO, # or, e.g., "ollama/mistral"
)
# use LLM directly
mdl = lm.OpenAIGPT(llm_cfg)
response = mdl.chat("What is the capital of Ontario?", max_tokens=10)

# use LLM in an Agent
agent_cfg = lr.ChatAgentConfig(llm=llm_cfg)
agent = lr.ChatAgent(agent_cfg)
agent.llm_response("What is the capital of China?") 
response = agent.llm_response("And India?") # maintains conversation state 

# wrap Agent in a Task to run interactive loop with user (or other agents)
task = lr.Task(agent, name="Bot", system_message="You are a helpful assistant")
task.run("Hello") # kick off with user saying "Hello"

# 2-Agent chat loop: Teacher Agent asks questions to Student Agent
teacher_agent = lr.ChatAgent(agent_cfg)
teacher_task = lr.Task(
  teacher_agent, name="Teacher",
  system_message="""
    Ask your student concise numbers questions, and give feedback. 
    Start with a question.
    """
)
student_agent = lr.ChatAgent(agent_cfg)
student_task = lr.Task(
  student_agent, name="Student",
  system_message="Concisely answer the teacher's questions.",
  single_round=True,
)

teacher_task.add_sub_task(student_task)
teacher_task.run()

🔥 Updates/Releases

Click to expand
  • Apr 2024:

    • 0.1.236: Support for open LLMs hosted on Groq, e.g. specify chat_model="groq/llama3-8b-8192". See tutorial.
    • 0.1.235: Task.run(), Task.run_async() and run_batch_tasks have max_cost and max_tokens params, to exit when tokens or cost exceed a limit. The result ChatDocument.metadata now includes a status field, a code indicating the reason the task completed. Also, task.run() etc. can be invoked with an explicit session_id field, which is used as a key to look up various settings in the Redis cache. Currently this is only used to look up "kill status" -- this allows killing a running task, either via task.kill() or via the classmethod Task.kill_session(session_id). For example usage, see test_task_kill in tests/main/test_task.py.
  • Mar 2024:

    • 0.1.216: Improvements to allow concurrent runs of DocChatAgent; see test_doc_chat_agent.py, in particular test_doc_chat_batch(). New task-running utility run_batch_task_gen, where a task generator can be specified to generate one task per input.
    • 0.1.212: ImagePdfParser: support for extracting text from image-based PDFs. (this means DocChatAgent will now work with image-pdfs).
    • 0.1.194 - 0.1.211: Misc fixes, improvements, and features:
      • Big enhancement in RAG performance (mainly, recall) due to a fix in Relevance Extractor
      • DocChatAgent context-window fixes
      • Anthropic/Claude3 support via Litellm
      • URLLoader: detect file type from the header when the URL doesn't end with a recognizable suffix like .pdf, .docx, etc.
      • Misc lancedb integration fixes
      • Auto-select embedding config based on whether sentence_transformer module is available.
      • Slim down dependencies, make some heavy ones optional, e.g. unstructured, haystack, chromadb, mkdocs, huggingface-hub, sentence-transformers.
      • Easier top-level imports from import langroid as lr
      • Improve JSON detection, esp from weak LLMs
  • Feb 2024:

    • 0.1.193: Support local LLMs using Ollama's new OpenAI-Compatible server: simply specify chat_model="ollama/mistral". See release notes.
    • 0.1.183: Added Chainlit support via callbacks. See examples.
  • Jan 2024:

    • 0.1.175
      • Neo4jChatAgent to chat with a neo4j knowledge-graph. (Thanks to Mohannad!). The agent uses tools to query the Neo4j schema and translate user queries to Cypher queries, and the tool handler executes these queries, returning the results to the LLM to compose a natural language response (analogous to how SQLChatAgent works). See example script using this Agent to answer questions about Python pkg dependencies.
      • Support for .doc file parsing (in addition to .docx)
      • Specify optional formatter param in OpenAIGPTConfig to ensure accurate chat formatting for local LLMs.
    • 0.1.157: DocChatAgentConfig has a new param: add_fields_to_content, to specify additional document fields to insert into the main content field, to help improve retrieval.
    • 0.1.156: New Task control signals PASS_TO, SEND_TO; VectorStore: Compute Pandas expression on documents; LanceRAGTaskCreator creates 3-agent RAG system with Query Planner, Critic and RAG Agent.
  • Dec 2023:

    • 0.1.154: (For details see release notes of 0.1.149 and 0.1.154).
      • DocChatAgent: Ingest Pandas dataframes and filtering.
      • LanceDocChatAgent leverages LanceDB vector-db for efficient vector search and full-text search and filtering.
      • Improved task and multi-agent control mechanisms
      • LanceRAGTaskCreator to create a 2-agent system consisting of a LanceFilterAgent that decides a filter and rephrase query to send to a RAG agent.
    • 0.1.141: API Simplifications to reduce boilerplate: auto-select an available OpenAI model (preferring gpt-4-turbo), simplifies defaults. Simpler Task initialization with default ChatAgent.
  • Nov 2023:

  • Oct 2023:

    • 0.1.107: DocChatAgent re-rankers: rank_with_diversity, rank_to_periphery (lost in middle).
    • 0.1.102: DocChatAgentConfig.n_neighbor_chunks > 0 allows returning context chunks around match.
    • 0.1.101: DocChatAgent uses RelevanceExtractorAgent to have the LLM extract relevant portions of a chunk using sentence-numbering, resulting in a huge speed-up and cost reduction compared to the naive "sentence-parroting" approach (writing out relevant sentences in full) which LangChain uses in their LLMChainExtractor.
    • 0.1.100: API update: all of Langroid is accessible with a single import, i.e. import langroid as lr. See the documentation for usage.
    • 0.1.99: Convenience batch functions to run tasks, agent methods on a list of inputs concurrently in async mode. See examples in test_batch.py.
    • 0.1.95: Added support for Momento Serverless Vector Index
    • 0.1.94: Added support for LanceDB vector-store -- allows vector, Full-text, SQL search.
    • 0.1.84: Added LiteLLM, so now Langroid can be used with over 100 LLM providers (remote or local)! See guide here.
  • Sep 2023:

    • 0.1.78: Async versions of several Task, Agent and LLM methods; Nested Pydantic classes are now supported for LLM Function-calling, Tools, Structured Output.
    • 0.1.76: DocChatAgent: support for loading docx files (preliminary).
    • 0.1.72: Many improvements to DocChatAgent: better embedding model, hybrid search to improve retrieval, better pdf parsing, re-ranking retrieved results with cross-encoders.
    • Use with local LLama Models: see tutorial here
    • Langroid Blog/Newsletter Launched!: First post is here -- Please subscribe to stay updated.
    • 0.1.56: Support Azure OpenAI.
    • 0.1.55: Improved SQLChatAgent that efficiently retrieves relevant schema info when translating natural language to SQL.
  • Aug 2023:

  • July 2023:

🚀 Demo

Suppose you want to extract structured information about the key terms of a commercial lease document. You can easily do this with Langroid using a two-agent system, as we show in the langroid-examples repo. (See this script for a version with the same functionality using a local Mistral-7b model.) The demo showcases just a few of the many features of Langroid, such as:

  • Multi-agent collaboration: LeaseExtractor is in charge of the task, and its LLM (GPT4) generates questions to be answered by the DocAgent.
  • Retrieval augmented question-answering, with source-citation: DocAgent LLM (GPT4) uses retrieval from a vector-store to answer the LeaseExtractor's questions, cites the specific excerpt supporting the answer.
  • Function-calling (also known as tool/plugin): When it has all the information it needs, the LeaseExtractor LLM presents the information in a structured format using a Function-call.

Here is what it looks like in action (a pausable mp4 video is here).

Demo

⚡ Highlights

(For a more up-to-date list see the release section above)

  • Agents as first-class citizens: The Agent class encapsulates LLM conversation state, and optionally a vector-store and tools. Agents are a core abstraction in Langroid; Agents act as message transformers, and by default provide 3 responder methods, one corresponding to each entity: LLM, Agent, User.
  • Tasks: A Task class wraps an Agent, and gives the agent instructions (or roles, or goals), manages iteration over an Agent's responder methods, and orchestrates multi-agent interactions via hierarchical, recursive task-delegation. The Task.run() method has the same type-signature as an Agent's responder's methods, and this is key to how a task of an agent can delegate to other sub-tasks: from the point of view of a Task, sub-tasks are simply additional responders, to be used in a round-robin fashion after the agent's own responders.
  • Modularity, Reusability, Loose coupling: The Agent and Task abstractions allow users to design Agents with specific skills, wrap them in Tasks, and combine tasks in a flexible way.
  • LLM Support: Langroid supports OpenAI LLMs as well as LLMs from hundreds of providers (local/open or remote/commercial) via proxy libraries and local model servers such as LiteLLM that in effect mimic the OpenAI API.
  • Caching of LLM responses: Langroid supports Redis and Momento to cache LLM responses.
  • Vector-stores: LanceDB, Qdrant, Chroma are currently supported. Vector stores allow for Retrieval-Augmented-Generation (RAG).
  • Grounding and source-citation: Access to external documents via vector-stores allows for grounding and source-citation.
  • Observability, Logging, Lineage: Langroid generates detailed logs of multi-agent interactions and maintains provenance/lineage of messages, so that you can trace back the origin of a message.
  • Tools/Plugins/Function-calling: Langroid supports OpenAI's recently released function calling feature. In addition, Langroid has its own native equivalent, which we call tools (also known as "plugins" in other contexts). Function calling and tools have the same developer-facing interface, implemented using Pydantic, which makes it very easy to define tools/functions and enable agents to use them. Benefits of using Pydantic are that you never have to write complex JSON specs for function calling, and when the LLM hallucinates malformed JSON, the Pydantic error message is sent back to the LLM so it can fix it!

⚙️ Installation and Setup

Install langroid

Langroid requires Python 3.11+. We recommend using a virtual environment. Use pip to install langroid (from PyPI) to your virtual environment:

pip install langroid

The core Langroid package lets you use OpenAI Embeddings models via their API. If you instead want to use the sentence-transformers embedding models from HuggingFace, install Langroid like this:

pip install langroid[hf-embeddings]

If using zsh (or similar shells), you may need to escape the square brackets, e.g.:

pip install langroid\[hf-embeddings\]

or use quotes:

pip install "langroid[hf-embeddings]"
Optional Installs for using SQL Chat with a PostgreSQL DB

If you are using SQLChatAgent (e.g. the script examples/data-qa/sql-chat/sql_chat.py), with a postgres db, you will need to:

  • Install PostgreSQL dev libraries for your platform, e.g.
    • sudo apt-get install libpq-dev on Ubuntu,
    • brew install postgresql on Mac, etc.
  • Install langroid with the postgres extra, e.g. pip install langroid[postgres] or poetry add langroid[postgres] or poetry install -E postgres. If this gives you an error, try pip install psycopg2-binary in your virtualenv.

Set up environment variables (API keys, etc)

To get started, all you need is an OpenAI API Key. If you don't have one, see this OpenAI Page. Currently only OpenAI models are supported. Others will be added later (Pull Requests welcome!).

In the root of the repo, copy the .env-template file to a new file .env:

cp .env-template .env

Then insert your OpenAI API Key. Your .env file should look like this (the organization is optional but may be required in some scenarios).

OPENAI_API_KEY=your-key-here-without-quotes
OPENAI_ORGANIZATION=optionally-your-organization-id

Alternatively, you can set this as an environment variable in your shell (you will need to do this every time you open a new shell):

export OPENAI_API_KEY=your-key-here-without-quotes
Optional Setup Instructions (click to expand)

All of the following environment variable settings are optional, and some are only needed to use specific features (as noted below).

  • Qdrant Vector Store API Key, URL. This is only required if you want to use Qdrant cloud. The default vector store in our RAG agent (DocChatAgent) is LanceDB which uses file storage, and you do not need to set up any environment variables for that. Alternatively Chroma is also currently supported. We use the local-storage version of Chroma, so there is no need for an API key.
  • Redis Password, host, port: This is optional, and only needed to cache LLM API responses using Redis Cloud. Redis offers a free 30MB Redis account which is more than sufficient to try out Langroid and even beyond. If you don't set up these, Langroid will use a pure-python Redis in-memory cache via the Fakeredis library.
  • Momento Serverless Caching of LLM API responses (as an alternative to Redis). To use Momento instead of Redis:
    • enter your Momento Token in the .env file, as the value of MOMENTO_AUTH_TOKEN (see example file below),
    • in the .env file set CACHE_TYPE=momento (instead of CACHE_TYPE=redis which is the default).
  • GitHub Personal Access Token (required for apps that need to analyze git repos; token-based API calls are less rate-limited). See this GitHub page.
  • Google Custom Search API Credentials: Only needed to enable an Agent to use the GoogleSearchTool. To use Google Search as an LLM Tool/Plugin/function-call, you'll need to set up a Google API key, then set up a Google Custom Search Engine (CSE) and get the CSE ID. (Documentation for these can be challenging; we suggest asking GPT4 for a step-by-step guide.) After obtaining these credentials, store them as values of GOOGLE_API_KEY and GOOGLE_CSE_ID in your .env file. Full documentation on using this (and other such "stateless" tools) is coming soon, but in the meantime take a peek at this chat example, which shows how you can easily equip an Agent with a GoogleSearchTool; a minimal sketch is also shown below.
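For reference, here is a minimal sketch of enabling the tool on an agent (this assumes GOOGLE_API_KEY and GOOGLE_CSE_ID are set in your .env, and that the import path below matches your Langroid version):

import langroid as lr
from langroid.agent.stateless_tools.google_search_tool import GoogleSearchTool

agent = lr.ChatAgent(lr.ChatAgentConfig(name="Searcher"))
agent.enable_message(GoogleSearchTool)  # the LLM can now emit web-search tool calls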

If you add all of these optional variables, your .env file should look like this:

OPENAI_API_KEY=your-key-here-without-quotes
GITHUB_ACCESS_TOKEN=your-personal-access-token-no-quotes
CACHE_TYPE=redis # or momento
REDIS_PASSWORD=your-redis-password-no-quotes
REDIS_HOST=your-redis-hostname-no-quotes
REDIS_PORT=your-redis-port-no-quotes
MOMENTO_AUTH_TOKEN=your-momento-token-no-quotes # instead of REDIS* variables
QDRANT_API_KEY=your-key
QDRANT_API_URL=https://your.url.here:6333 # note port number must be included
GOOGLE_API_KEY=your-key
GOOGLE_CSE_ID=your-cse-id
Optional setup instructions for Microsoft Azure OpenAI (click to expand)

When using Azure OpenAI, additional environment variables are required in the .env file. This page Microsoft Azure OpenAI provides more information, and you can set each environment variable as follows:

  • AZURE_OPENAI_API_KEY, from the value of API_KEY
  • AZURE_OPENAI_API_BASE from the value of ENDPOINT, typically looks like https://your.domain.azure.com.
  • For AZURE_OPENAI_API_VERSION, you can use the default value in .env-template; the latest version can be found here
  • AZURE_OPENAI_DEPLOYMENT_NAME is the name of the deployed model, which is defined by the user during the model setup
  • AZURE_OPENAI_MODEL_NAME: Azure OpenAI allows specific model names when you select the model for your deployment. You need to enter precisely the model name that was selected. For example, GPT-3.5 should be gpt-35-turbo-16k or gpt-35-turbo, and GPT-4 should be gpt-4-32k or gpt-4.
  • AZURE_OPENAI_MODEL_VERSION is required if AZURE_OPENAI_MODEL_NAME = gpt-4; it helps Langroid determine the cost of the model.

🐳 Docker Instructions

We provide a containerized version of the langroid-examples repository via this Docker Image. All you need to do is set up environment variables in the .env file. Please follow these steps to set up the container:

# get the .env file template from `langroid` repo
wget -O .env https://raw.githubusercontent.com/langroid/langroid/main/.env-template

# Edit the .env file with your favorite editor (here nano), and remove any un-used settings. E.g. there are "dummy" values like "your-redis-port" etc -- if you are not using them, you MUST remove them.
nano .env

# launch the container
docker run -it --rm  -v ./.env:/langroid/.env langroid/langroid

# Use this command to run any of the scripts in the `examples` directory
python examples/<Path/To/Example.py> 

🎉 Usage Examples

These are quick teasers to give a glimpse of what you can do with Langroid and how your code would look.

⚠️ The code snippets below are intended to give a flavor of the code and they are not complete runnable examples! For that we encourage you to consult the langroid-examples repository.

ℹ️ The various LLM prompts and instructions in Langroid have been tested to work well with GPT4. Switching to GPT3.5-Turbo is easy via a config flag (e.g., cfg = OpenAIGPTConfig(chat_model=OpenAIChatModel.GPT3_5_TURBO)), and may suffice for some applications, but in general you may see inferior results.

📖 Also see the Getting Started Guide for a detailed tutorial.

Click to expand any of the code examples below. All of these can be run in a Colab notebook: Open in Colab

Direct interaction with OpenAI LLM
import langroid.language_models as lm

mdl = lm.OpenAIGPT()

messages = [
  lm.LLMMessage(content="You are a helpful assistant",  role=lm.Role.SYSTEM), 
  lm.LLMMessage(content="What is the capital of Ontario?",  role=lm.Role.USER),
]

response = mdl.chat(messages, max_tokens=200)
print(response.message)
Interaction with non-OpenAI LLM (local or remote)

Local model: if the model is served at `http://localhost:8000`:
cfg = lm.OpenAIGPTConfig(
  chat_model="local/localhost:8000", 
  chat_context_length=4096
)
mdl = lm.OpenAIGPT(cfg)
# now interact with it as above, or create an Agent + Task as shown below.

If the model is supported by liteLLM, there is no need to launch a proxy server: just set the chat_model param above to litellm/[provider]/[model], e.g. litellm/anthropic/claude-instant-1, and use the config object as above. Note that to use litellm you need to install langroid with the litellm extra, i.e. either pip install langroid[litellm] in your virtual env, or, if you are developing within the langroid repo, poetry install -E litellm. For remote models, you will typically need to set API keys etc. as environment variables, based on the LiteLLM docs. If any required environment variables are missing, Langroid gives a helpful error message indicating which ones are needed.

pip install langroid[litellm]
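For example, here is a minimal sketch of a config pointing at a LiteLLM-proxied model (this assumes the litellm extra is installed and the provider's API key, e.g. ANTHROPIC_API_KEY, is set in your environment):

import langroid.language_models as lm

cfg = lm.OpenAIGPTConfig(
  chat_model="litellm/anthropic/claude-instant-1",
  chat_context_length=8000,  # set this to the actual context length of the model
)
mdl = lm.OpenAIGPT(cfg)
response = mdl.chat("What is the capital of Ontario?", max_tokens=10)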
Define an agent, set up a task, and run it
import langroid as lr

agent = lr.ChatAgent()

# get response from agent's LLM, and put this in an interactive loop...
# answer = agent.llm_response("What is the capital of Ontario?")
# ... OR instead, set up a task (which has a built-in loop) and run it
task = lr.Task(agent, name="Bot") 
task.run() # ... a loop seeking response from LLM or User at each turn
Three communicating agents

A toy numbers game where, given a number n:

  • repeater_task's LLM simply returns n,
  • even_task's LLM returns n/2 if n is even, else says "DO-NOT-KNOW"
  • odd_task's LLM returns 3*n+1 if n is odd, else says "DO-NOT-KNOW"

Each of these Tasks automatically configures a default ChatAgent.

import langroid as lr
from langroid.utils.constants import NO_ANSWER

repeater_task = lr.Task(
    name = "Repeater",
    system_message="""
    Your job is to repeat whatever number you receive.
    """,
    llm_delegate=True, # LLM takes charge of task
    single_round=False, 
)

even_task = lr.Task(
    name = "EvenHandler",
    system_message=f"""
    You will be given a number. 
    If it is even, divide by 2 and say the result, nothing else.
    If it is odd, say {NO_ANSWER}
    """,
    single_round=True,  # task done after 1 step() with valid response
)

odd_task = lr.Task(
    name = "OddHandler",
    system_message=f"""
    You will be given a number n. 
    If it is odd, return (n*3+1), say nothing else. 
    If it is even, say {NO_ANSWER}
    """,
    single_round=True,  # task done after 1 step() with valid response
)

Then add the even_task and odd_task as sub-tasks of repeater_task, and run the repeater_task, kicking it off with a number as input:

repeater_task.add_sub_task([even_task, odd_task])
repeater_task.run("3")
Simple Tool/Function-calling example

Langroid leverages Pydantic to support OpenAI's Function-calling API as well as its own native tools. The benefits are that you don't have to write any JSON to specify the schema, and also if the LLM hallucinates a malformed tool syntax, Langroid sends the Pydantic validation error (suitably sanitized) to the LLM so it can fix it!

Simple example: Say the agent has a secret list of numbers, and we want the LLM to find the smallest number in the list. We want to give the LLM a probe tool/function which takes a single number n as argument. The tool handler method in the agent returns how many numbers in its list are at most n.

First define the tool using Langroid's ToolMessage class:

import langroid as lr

class ProbeTool(lr.agent.ToolMessage):
  request: str = "probe" # specifies which agent method handles this tool
  purpose: str = """
        To find how many numbers in my list are less than or equal to  
        the <number> you specify.
        """ # description used to instruct the LLM on when/how to use the tool
  number: int  # required argument to the tool

Then define a SpyGameAgent as a subclass of ChatAgent, with a method probe that handles this tool:

class SpyGameAgent(lr.ChatAgent):
  def __init__(self, config: lr.ChatAgentConfig):
    super().__init__(config)
    self.numbers = [3, 4, 8, 11, 15, 25, 40, 80, 90]

  def probe(self, msg: ProbeTool) -> str:
    # return how many numbers in self.numbers are less or equal to msg.number
    return str(len([n for n in self.numbers if n <= msg.number]))

We then instantiate the agent and enable it to use and respond to the tool:

spy_game_agent = SpyGameAgent(
    lr.ChatAgentConfig(
        name="Spy",
        vecdb=None,
        use_tools=False, #  don't use Langroid native tool
        use_functions_api=True, # use OpenAI function-call API
    )
)
spy_game_agent.enable_message(ProbeTool)
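
To actually have the LLM play the game, you would typically wrap the agent in a Task and run it; a rough sketch (the system_message wording here is illustrative, not taken from the script):

task = lr.Task(
    spy_game_agent,
    system_message="""
    I have a list of numbers. Ask me about a number using the `probe`
    tool/function, and use my answers to find the smallest number in my list.
    """,
)
# task.run()  # interactive loop in which the LLM makes `probe` tool-calls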

For a full working example see the chat-agent-tool.py script in the langroid-examples repo.

Tool/Function-calling to extract structured information from text

Suppose you want an agent to extract the key terms of a lease, from a lease document, as a nested JSON structure. First define the desired structure via Pydantic models:

from pydantic import BaseModel
class LeasePeriod(BaseModel):
    start_date: str
    end_date: str


class LeaseFinancials(BaseModel):
    monthly_rent: str
    deposit: str

class Lease(BaseModel):
    period: LeasePeriod
    financials: LeaseFinancials
    address: str

Then define the LeaseMessage tool as a subclass of Langroid's ToolMessage. Note the tool has a required argument terms of type Lease:

import langroid as lr

class LeaseMessage(lr.agent.ToolMessage):
    request: str = "lease_info"
    purpose: str = """
        Collect information about a Commercial Lease.
        """
    terms: Lease

Then define a LeaseExtractorAgent with a method lease_info that handles this tool, instantiate the agent, and enable it to use and respond to this tool:

import json  # needed for json.dumps below

class LeaseExtractorAgent(lr.ChatAgent):
    def lease_info(self, message: LeaseMessage) -> str:
        print(
            f"""
        DONE! Successfully extracted Lease Info:
        {message.terms}
        """
        )
        return json.dumps(message.terms.dict())
    
lease_extractor_agent = LeaseExtractorAgent()
lease_extractor_agent.enable_message(LeaseMessage)
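
A hypothetical continuation (the system_message and variable names here are illustrative; see the script below for the full two-agent version):

task = lr.Task(
    lease_extractor_agent,
    name="LeaseExtractor",
    system_message="""
    Extract the terms of the commercial lease provided by the user, and
    present them using the `lease_info` tool/function.
    """,
)
# task.run(lease_text)  # lease_text: str containing the lease document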

See the chat_multi_extract.py script in the langroid-examples repo for a full working example.

Chat with documents (file paths, URLs, etc)

Langroid provides a specialized agent class DocChatAgent for this purpose. It incorporates document sharding, embedding, storage in a vector-DB, and retrieval-augmented query-answer generation. Using this class to chat with a collection of documents is easy. First create a DocChatAgentConfig instance, with a doc_paths field that specifies the documents to chat with.

import langroid as lr
from langroid.agent.special import DocChatAgentConfig, DocChatAgent

config = DocChatAgentConfig(
  doc_paths = [
    "https://en.wikipedia.org/wiki/Language_model",
    "https://en.wikipedia.org/wiki/N-gram_language_model",
    "/path/to/my/notes-on-language-models.txt",
  ],
  vecdb=lr.vector_store.LanceDBConfig(),
)

Then instantiate the DocChatAgent (this ingests the docs into the vector-store):

agent = DocChatAgent(config)

Then we can either ask the agent one-off questions,

agent.llm_response("What is a language model?")

or wrap it in a Task and run an interactive loop with the user:

task = lr.Task(agent)
task.run()

See full working scripts in the docqa folder of the langroid-examples repo.

🔥 Chat with tabular data (file paths, URLs, dataframes)

Using Langroid you can set up a TableChatAgent with a dataset (file path, URL or dataframe), and query it. The Agent's LLM generates Pandas code to answer the query, via function-calling (or tool/plugin), and the Agent's function-handling method executes the code and returns the answer.

Here is how you can do this:

import langroid as lr
from langroid.agent.special import TableChatAgent, TableChatAgentConfig

Set up a TableChatAgent for a data file, URL or dataframe (Ensure the data table has a header row; the delimiter/separator is auto-detected):

dataset =  "https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
# or dataset = "/path/to/my/data.csv"
# or dataset = pd.read_csv("/path/to/my/data.csv")
agent = TableChatAgent(
    config=TableChatAgentConfig(
        data=dataset,
    )
)

Set up a task, and ask one-off questions like this:

task = lr.Task(
  agent, 
  name = "DataAssistant",
  default_human_response="", # to avoid waiting for user input
)
result = task.run(
  "What is the average alcohol content of wines with a quality rating above 7?",
  turns=2 # return after user question, LLM function-call/tool response, Agent code-exec result
) 
print(result.content)

Or alternatively, set up a task and run it in an interactive loop with the user:

task = lr.Task(agent, name="DataAssistant")
task.run()

For a full working example see the table_chat.py script in the langroid-examples repo.


❤️ Thank you to our supporters

If you like this project, please give it a star ⭐ and 📢 spread the word in your network or social media:

Share on Twitter Share on LinkedIn Share on Hacker News Share on Reddit

Your support will help build Langroid's momentum and community.

Langroid Co-Founders

  • Prasad Chalasani (IIT BTech/CS, CMU PhD/ML; Independent ML Consultant)
  • Somesh Jha (IIT BTech/CS, CMU PhD/CS; Professor of CS, U Wisc at Madison)

langroid's People

Contributors

alfr3dok1ng, ameerarsala, ashishhoodaiitd, derekparks, mohannadcse, mturk24, nasonz, nilspalumbo, pchalasani, pixelkaiser, rithwikbabu, sanders41, zappaboy


langroid's Issues

Improve human prompt in chats

Is your feature request related to a problem? Please describe.

Currently in chats, the human prompt is:

respond or q, x to exit current level, or hit enter to continue

and it is unclear what happens when the human actually "responds".
In most contexts, the response will overwrite the current pending message.

Describe the solution you'd like
We can improve the prompt by succinctly clarifying what the effect
of responding would be, E.g. Overwrite <agent_name> response...

Add ToolValidatorAgent

LLM sometimes does not use the expected syntax for a tool.
We can have a ToolValidatorAgent detect this and tell the LLM to try again.
Similar to how ValidatorAgent handles missing TO: Address

Rationalize auto-gen of json, non-json examples

Auto-generation of non-json message examples can be done in the base agent class,
since we don't need a specific one for each message-type.
Also need to decide whether to include in Agent or ChatAgent

reorg dependencies into core and extras

Organize pyproject.toml so that non-core dependencies are marked as optional,
and installing those requires special flags on poetry install.
See best practices on how.
Currently poetry install installs 200+ dependencies!
A large part are from sentence-transformer.
The github workflow actions tests should not include anything involving this lib,
which depends on torch, which can cause endless headaches as their versions change
e.g. I spent several days dealing with a test failure happening only on gh actions,
and not locally, because in the gh env torch 2.0.1's nvidia deps were not being installed.
It took a while to figure that this was a problem with torch 2.0.1 but not 2.0.0
We should not be dealing with these kinds of issues extraneous to our lib.
See also issue #56 on splitting tests into unit and integration.

Compute the cost of chat completion tasks

Describe the solution you'd like
Reporting cost + the total number of tokens + potentially saving for chat completion tasks.
Currently, OpenAI doesn't report these aspects like the completion API.

Additional context
The main logic is as follows (a rough sketch is shown after this list):

  1. Compute the total tokens for each LLM response
    1.1. Leverage the attribute usage under the class ChatDocMetaData. Set this attribute response.metadata.usage = self.num_tokens(response.content) + self.chat_num_tokens(hist)
    1.2. Add usage attribute under the class LLMMessage. Then, for every LLM response, set this attribute.
    1.3. make sure to set the usage attribute when the response is converted to LLMMessage
  2. Aggregate the total tokens for ALL LLM responses
    2.1. iterate over the list message_history and get the value of the usage attribute, and sum them up. Should I exclude cached responses??
    2.2. I may need to add an attribute total_tokens to the class ChatAgent to report the total number of tokens
  3. Compute the cost
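
A rough sketch of steps 1-2 (the attribute names follow the plan above and are hypothetical; this is not existing Langroid code):

# hypothetical: assumes a `usage` (token count) attribute has been set on each
# message in the agent's message_history, per step 1 above
def total_chat_tokens(agent) -> int:
    return sum(getattr(m, "usage", 0) or 0 for m in agent.message_history)

# step 3 would multiply the token totals by the per-token prices for the model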

Extract python version

Here is my plan:
1- add a file identify_python_version.py under the directory llmagent/llmagent/parsing. This file contains several methods to identify the version based on various Python implementations, for example, a func to extract the version from pyproject.toml.
2- add test cases to verify each extraction func
3- Update the dummy function python_version member of the class DockerChatAgent.
@pchalasani: plz let me know when I can work on step 3 to avoid conflict

allow users to install with/without optional dependencies

Ideal workflow when releasing a new version of the package: first build and publish to GitHub using:

make all-patch 

Then poetry publish to publish to PyPI.
When a user wants to install just core langroid without "extras", they do:

pip install langroid

And if they want the hf-embeddings extras, they would do

pip install langroid[hf-embeddings]

which will install torch and sentence-transformer.

The idea is that those who only want to use OpenAI embedding models (via their API) should not have to install these additional dependencies (and installing torch often brings headaches with CUDA versions etc.).

Currently the above workflow does not work, i.e. I am able to install the core, but unable to install the extras with the second command. The pyproject.toml dependencies may need to be fixed. I've tried various combinations of extras, optional=true etc.

Parsing issue after calling handle_message

I'm running the following test case


For some reason this method handle_message returns None, which leads to test case failure

My hunch is that adding the dockerfile definition messes up the parsing logic, which I suspect assumes the value will be a single line, not multiple lines.

The content of the message looks like this

"Ok, thank you. \n                {\n                'request': 'validate_dockerfile',\n                'proposed_dockerfile':'\n                FROM ubuntu:latest\n                LABEL maintainer=blah\n                '\n                } \n                this is the Dockerfile!\n                "

In the meantime, I provided only one line string (not dockerfile definition) and the test case succeeded.

add mypy, ruff checks

look at best practices for these and incorporate them into Makefile,
and github wkfl actions:

  • ruff for "better" flake8 checks and auto-fixes
  • mypy for type-related and other checks.

Add example of google search + `DocChatAgent`

Showcase a use of the GoogleSearchTool

The current DocChatAgent example in examples/docqa/chat.py takes a URL from the user and lets the user ask questions about this. We can extend this by adding an Agent whose goal is to collect information on some topic, and it can use up to 3 links via a google search, and it passes those links (or each link one at a time) to a DocChatAgent, and asks it questions,...

Invalid Discord link in README.md

Both of the Discord links provided in the README.md file appear to be invalid or expired. I tried to access it, but was unsuccessful. Could you please check this and update the link if necessary?

Missing API key while using GoogleSearchTool

Describe the bug
I receive this error when I try the [GoogleSearchTool example](https://github.com/langroid/langroid/blob/main/langroid/agent/stateless_tools/google_search_tool.py).

2023-08-07 06:37:32 - WARNING - Encountered 403 Forbidden with reason "forbidden"
Agent: Error in tool/function-call web_search usage: <class 'googleapiclient.errors.HttpError'>: <HttpError 403 when requesting 
https://customsearch.googleapis.com/customsearch/v1?q=current+LLVM+version&num=5&alt=json returned "The request is missing a valid
API key.". Details: "[{'message': 'The request is missing a valid API key.', 'domain': 'global', 'reason': 'forbidden'}]">

To Reproduce
Run the above example python3 examples/basic/chat-search.py

Expected behavior
The [GoogleSearchTool example](https://github.com/langroid/langroid/blob/main/langroid/agent/stateless_tools/google_search_tool.py) expects some API keys. It's not clear where to get these keys, and there is no advance warning about this requirement; a warning would let users know the problem isn't with the app itself.

Screenshots

Langroid doc links to the API are broken

Describe the bug
If you click any link in https://langroid.github.io/langroid/ and that link leads to the API Documentation, then this link is broken.

To Reproduce
Go to https://langroid.github.io/langroid/quick-start/chat-agent/ click the link [ChatAgent](https://langroid.github.io/reference/agent/chat_agent).

Screenshots

Desktop (please complete the following information):

  • Browser [chrome, safari]

make docs error in Ubuntu

I've tried the command make docs from Mac and Ubuntu machines. This command works without any problem from the Mac machine, but I receive this error from the Ubuntu machine


I tried to debug and strace the make process; removing this line @pkill -f "mkdocs serve" 2>/dev/null || true solves the problem.

Simpler linking from doc pages to API reference page

We need the ability to link from a doc page in say docs/intro.md to an API doc page or sub-page (e.g. the "anchor" link to a class definition in the API page). For example in the Pydantic docs here: https://docs.pydantic.dev/latest/usage/models/

Currently we are having to provide the full path to the API doc page/anchor, e.g.

/reference/language_models/base/#langroid.language_models.base.LLMMessage

which is not fun to do. In the Pydantic docs they manage to achieve the linkage using just the portion after the #.
We need to figure out what settings in mkdocs.yml and/or other places need to be adjusted to achieve this.
Note that we are using auto-generated API doc pages via the gen-files plugin, and docs/auto_docstring.py script.
Pydantic's source code has some other plugins like this https://github.com/pydantic/pydantic/blob/main/docs/plugins/griffe_doclinks.py
Maybe we can use these as well.

task.py block should be disabled after first try

A sub-task may return a result ChatDocument where the metadata.block specifies an entity that should
be blocked from responding to the message (in the parent task).
E.g. message_validator may add a recipient when LLM msg omitted a recipient, and then set block = LLM.
However the block should only apply for the first attempt by the entity.

Solve logging issue in Colab

Is your feature request related to a problem? Please describe.
Running agents from Google Colab is messy because the logging info is displayed mixed together with the output, as shown below.


Describe the solution you'd like
Can we create a flag to disable logging? For example, by making the variable tsv_logger a member variable, which can then be set to None when a task is created. I already tried to disable it like this: task.tsv_logger = None, but it didn't work.

Describe alternatives you've considered
Any other option to disable logging?

fix occasional pytest failure on `test_vector_stores`

This failure happens once in a while. Investigate and fix.

=================================== FAILURES ===================================
__________________________ test_vector_stores[vecdb1] __________________________

vecdb = <langroid.vector_store.qdrantdb.QdrantDB object at 0x7fe976d89d10>

    @pytest.mark.parametrize("vecdb", generate_vecdbs(openai_cfg))
    def test_vector_stores(vecdb: Union[ChromaDB, QdrantDB]):
        docs = [
            Document(content="hello", metadata=DocMetaData(id=1)),
            Document(content="world", metadata=DocMetaData(id=2)),
            Document(content="hi there", metadata=DocMetaData(id=3)),
        ]
>       vecdb.add_documents(docs)

tests/main/test_vector_stores.py:62: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
langroid/vector_store/qdrantdb.py:148: in add_documents
    self.client.upsert(
.venv/lib/python3.11/site-packages/qdrant_client/qdrant_client.py:748: in upsert
    return self._client.upsert(
.venv/lib/python3.11/site-packages/qdrant_client/qdrant_remote.py:1085: in upsert
    http_result = self.openapi_client.points_api.upsert_points(
.venv/lib/python3.11/site-packages/qdrant_client/http/api/points_api.py:1242: in upsert_points
    return self._build_for_upsert_points(
.venv/lib/python3.11/site-packages/qdrant_client/http/api/points_api.py:668: in _build_for_upsert_points
    return self.api_client.request(
.venv/lib/python3.11/site-packages/qdrant_client/http/api_client.py:68: in request
    return self.send(request, type_)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <qdrant_client.http.api_client.ApiClient object at 0x7fe975abaa50>
request = <Request('PUT', 'https://644cabc3-4141-4734-91f2-0cc3176514d4.us-east-1-0.aws.cloud.qdrant.io:6333/collections/test-openai/points?wait=true')>
type_ = <class 'qdrant_client.http.models.models.InlineResponse2006'>

    def send(self, request: Request, type_: Type[T]) -> T:
        response = self.middleware(request, self.send_inner)
        if response.status_code in [200, 201]:
            try:
                return parse_as_type(response.json(), type_)
            except ValidationError as e:
                raise ResponseHandlingException(e)
>       raise UnexpectedResponse.for_response(response)
E       qdrant_client.http.exceptions.UnexpectedResponse: Unexpected Response: 404 (Not Found)
E       Raw response content:
E       b'{"status":{"error":"Not found: Collection `test-openai` doesn\'t exist!"},"time":0.000010257}'

.venv/lib/python3.11/site-packages/qdrant_client/http/api_client.py:91: UnexpectedResponse
=============================== warnings summary ===============================
.venv/lib/python3.11/site-packages/onnxruntime/capi/_pybind_state.py:28
  /home/runner/work/langroid/langroid/.venv/lib/python3.11/site-packages/onnxruntime/capi/_pybind_state.py:28: DeprecationWarning: invalid escape sequence '\S'
    "(other than %SystemRoot%\System32), "

.venv/lib/python3.11/site-packages/fire/core.py:59
  /home/runner/work/langroid/langroid/.venv/lib/python3.11/site-packages/fire/core.py:59: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
    import pipes

tests/main/test_repo_chunking.py::test_repo_chunking
tests/main/test_repo_loader.py::test_repo_loader
  /home/runner/work/langroid/langroid/.venv/lib/python3.11/site-packages/github/MainClass.py:145: DeprecationWarning: Argument login_or_token is deprecated, please use auth=github.Auth.Token(...) instead
    warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/main/test_vector_stores.py::test_vector_stores[vecdb1] - qdrant_client.http.exceptions.UnexpectedResponse: Unexpected Response: 404 (Not Found)
Raw response content:
b'{"status":{"error":"Not found: Collection `test-openai` doesn\'t exist!"},"time":0.000010257}'
================== 1 failed, 79 passed, 4 warnings in 46.01s ===================
Error: Process completed with exit code 1.

Error randomly occurs when a task executed more than one time in Colab

Describe the bug
This error happens randomly after running the task. Seems LLM doesn't provide the required function/format for the tool that will be used.
To Reproduce
This error happens when I run the same task more than one time.

Expected behavior
Just wondering do we need to do validation, or if this is just an accepted behavior.

Screenshots

Desktop (please complete the following information):
From Google Colab

Integration with Azure OpenAI

Azure OpenAI is actually using OpenAI APIs for chatting and completion (doesn't use any specific SDK). However, its configuration uses additional settings like DEPLOYMENT_NAME, OPENAI_API_BASE, and OPENAI_API_VERSION https://github.com/Azure-Samples/openai/blob/main/Basic_Samples/Completions/config.json.

So the idea here is to create (1) AzureOpenAIGPTConfig subclass of OpenAIGPTConfig , where these additional config are incorporated (2) AzureOpenAIGPT subclass of OpenAIGPT to call the completion and chat APIs

fix pytest.yml workflow -- failing due to missing CUDA lib files

platform linux -- Python 3.11.3, pytest-7.3.1, pluggy-1.0.0
rootdir: /home/runner/work/llmagent/llmagent
configfile: pyproject.toml
plugins: anyio-3.6.2
collected 20 items / 5 errors

==================================== ERRORS ====================================
_________________ ERROR collecting tests/test_custom_agent.py __________________
.venv/lib/python3.11/site-packages/torch/__init__.py:168: in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/ctypes/__init__.py:376: in __init__
    self._handle = _dlopen(self._name, mode)
E   OSError: libcurand.so.10: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:
tests/test_custom_agent.py:2: in <module>
    from tests.configs import CustomAgentConfig
tests/configs.py:2: in <module>
    from llmagent.embedding_models.models import OpenAIEmbeddingsConfig
llmagent/embedding_models/models.py:3: in <module>
    from sentence_transformers import SentenceTransformer
.venv/lib/python3.11/site-packages/sentence_transformers/__init__.py:3: in <module>
    from .datasets import SentencesDataset, ParallelSentencesDataset
.venv/lib/python3.11/site-packages/sentence_transformers/datasets/__init__.py:1: in <module>
    from .DenoisingAutoEncoderDataset import DenoisingAutoEncoderDataset
.venv/lib/python3.11/site-packages/sentence_transformers/datasets/DenoisingAutoEncoderDataset.py:1: in <module>
    from torch.utils.data import Dataset
.venv/lib/python3.11/site-packages/torch/__init__.py:228: in <module>
    _load_global_deps()
.venv/lib/python3.11/site-packages/torch/__init__.py:189: in _load_global_deps
    _preload_cuda_deps(lib_folder, lib_name)
.venv/lib/python3.11/site-packages/torch/__init__.py:154: in _preload_cuda_deps
    raise ValueError(f"{lib_name} not found in the system path {sys.path}")
E   ValueError: libcublas.so.*[0-9] not found in the system path ['/home/runner/work/llmagent/llmagent', '/opt/hostedtoolcache/Python/3.11.3/x64/lib/python311.zip', '/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11', '/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/lib-dynload', '/home/runner/work/llmagent/llmagent/.venv/lib/python3.11/site-packages', '/home/runner/work/llmagent/llmagent']
______________ ERROR collecting tests/test_embedding_clusters.py _______________
.venv/lib/python3.11/site-packages/torch/__init__.py:168: in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/ctypes/__init__.py:376: in __init__
    self._handle = _dlopen(self._name, mode)
E   OSError: libcurand.so.10: cannot open shared object file: No such file or directory

Ask human to confirm dockerfile verification

This function is automatically triggered by the LLM on every proposed dockerfile, but the user might want to modify some things before the verification (given that for some repos the build time can be fairly long).
So it's better to ask the user to confirm before starting the process.

Expose QdrantDBConfig in examples

Similar to issue #170 -- when people use their own Qdrant key, the QdrantDBConfig.url setting will need to be set to their address, otherwise the examples will fail.

Agent: unify various methods handling str messages

The Agent class has a few methods that are meant to "handle" string inputs:

  • handle_message: currently handles LLM msg, detects json (i.e. structured message), forwards to method if possible;
    in theory could be message from another Agent, or from human.
  • respond: forwards msg to LLM; in theory could be forwarded to another agent

We could consider a more unified way of looking at these methods, possibly treating all entities (humans, agents, LLMs)
as essentially message-forwarders.

task.py - unify pending_sender, pending_message

The task.py step() loop uses two variables to track the current message:

  • self.pending_message
  • self.pending_sender, which is the Responder that just sent the self.pending_message

We should find a good way to fold the latter into the former and just have
self.pending_message to keep track of.

Google Colab notebook: error on `pip install langroid`

× Building wheel for hnswlib (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  Building wheel for hnswlib (pyproject.toml) ... error
  ERROR: Failed building wheel for hnswlib
Successfully built bs4 fire halo wget
Failed to build hnswlib
ERROR: Could not build wheels for hnswlib, which is required to install pyproject.toml-based projects

improve validation agent flow

The MessageValidatorAgent uses parent's parent's content, when correcting the LLM message. This is a bit awkward. It's better to use a new "attachment"
field in ChatDocument. Also, instead of using the metadata.block to prevent LLM from responding to its own message, it's better to explicitly control
the pending_sender in the parent task via a new parent_responder field in ChatDocument.metadata

expose RedisCacheConfig in examples

The default config has hostname set to our own azure address, but others who create their own account might have a different one. So in the various examples (or instructions) we need to expose this so they can set it to their hostname.
If they run the examples using the default config, they will fail.

split tests into unit (or core) and integration

See details in issue #55.
We want our unit tests to run smoothly in github actions,
and not be affected by extraneous dependencies like torch.
Specifically, tests related to sentence-transformer should be in integration,
and maybe also those related to specific vector-dbs.

Support parsing CSV

Is your feature request related to a problem? Please describe.
We need to support parsing CSV files whether through URL or Path.

Describe the solution you'd like
Similar to the PDF parser, but would use the retriever instead of DocChatAgent. This means:

  1. create csv_parser.py under the path langroid/parsing,
  2. modify the classes repo_loader.py and url_loader.py

Exceeding max context token length

I'm developing a PDF chat app. The pdf_agent is a subclass of DocChatAgent. Here is its config:

def chat(config: DocChatAgent) -> None:
    config.vecdb = VectorStoreConfig(
        type="qdrant",
        collection_name="pdf-chat",
        replace_collection=True,
    )

    config.llm = OpenAIGPTConfig(
        chat_model=OpenAIChatModel.GPT3_5_TURBO,
    )
    config.max_context_tokens = 1000
    config.parsing = ParsingConfig(
        splitter=Splitter.TOKENS,
        chunk_size=100,
        max_chunks=10_000,
        min_chunk_chars=350,
        discard_chunk_chars=5,
        n_similar_docs=4,
    )
    pdf_agent = DocChatAgent(config)

    print("[blue]Welcome to the document chatbot!")
    print("[cyan]Enter x or q to quit, or ? for evidence")
    print(
        """
        [blue]Enter some PDFs:
        """.strip()
    )
    inputs = get_list_from_user()
    pdf_agent.config.doc_paths = inputs
    ingest_pdf(pdf_agent, inputs)
    topics_doc = pdf_agent.summarize_docs(
        instruction="""
        Give me a list of up to 3 main topics from the following text,
        in the form of short sentences.
        """,
    )

Here is the pdf parsing logic

# imports implied by the snippet (assumption: PdfReader is from pypdf, and
# Document/DocMetaData are Langroid's document types)
from typing import List
from pypdf import PdfReader
from langroid.mytypes import Document, DocMetaData

def get_pdf_doc(pdf):
    reader = PdfReader(pdf)
    text = ""
    for page_num in range(len(reader.pages)):
        current_page = reader.pages[page_num]
        text += current_page.extract_text()

    return Document(content=text, metadata=DocMetaData(source=str(pdf)))

def ingest_pdf(doc_agent, paths):
    docs: List[Document] = []
    if len(paths) > 0:
        for p in paths:
            path_docs = get_pdf_doc(p)
            docs.append(path_docs)
    n_docs = len(docs)
    n_splits = doc_agent.ingest_docs(docs)
    if n_docs == 0:
        return
    n_paths = len(paths)
    print(
        f"""
    [green]I have processed the following {n_paths} paths 
    into {n_splits} parts:
    """.strip()
    )
    print("\n".join(paths))

But I receive this error

InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 
7111 tokens. Please reduce the length of the messages.

Implement clustering of embedding vectors

When doing "doc-chat" or "code-chat", we may first want to give the
user an overview of the "top k themes" in the doc (or code), so they have a
sort of "map" of the documents before even asking any questions.

We can do this by (a rough sketch follows this list):

  • finding the top k densest clusters of the embedding vectors,
  • picking a representative vector from each cluster,
  • get the content corresponding to each representative vector,
  • pass this content to the LLM as part of the prompt, and ask it to summarize these,
  • present summary to user.
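
A rough sketch of the clustering step (illustrative only, using scikit-learn; assumes embeddings is an (n, d) numpy array and contents is the parallel list of texts):

import numpy as np
from sklearn.cluster import KMeans

def top_k_theme_snippets(embeddings: np.ndarray, contents: list[str], k: int = 5) -> list[str]:
    km = KMeans(n_clusters=k, n_init=10).fit(embeddings)
    snippets = []
    for c in range(k):
        idxs = np.where(km.labels_ == c)[0]
        # representative = cluster member closest to the centroid
        dists = np.linalg.norm(embeddings[idxs] - km.cluster_centers_[c], axis=1)
        snippets.append(contents[idxs[np.argmin(dists)]])
    return snippets  # pass these snippets to the LLM to summarize as "themes"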

Specifying storage_location for local qdrant

Describe the bug
There are two issues:

  1. storage_location can be changed from different perspectives/objects. This can lead to confusion. For example, I can change it via doc_agent.config.vecdb.storage_path = ".mystorage/data" or via doc_agent.vecdb.storage_path = ".mystorage/data". However, the latter shouldn't be allowed, while the former leads to the 2nd issue:
  2. At this stage, the config of the doc_agent has already been created, which means the qdrant client has already been initialized based on the default values (e.g., storage_path=".qdrant/data/"). Therefore, the problem here is that when storage_path is changed later on, there is no mechanism to overwrite it. This leads to inconsistency, because the user assumes the new location, while it hasn't actually been changed.

To Reproduce
Try the above approaches to modify storage_path immediately before this line, and then debug the code.

Expected behavior

  • Changing storage_path should be allowed only from one location at the beginning before creating the client
