daethyra / freestream

Template repository for building Streamlit chatbots

Home Page: https://freestream.streamlit.app

License: Other

Python 100.00%
Topics: chatbot langchain large-language-models openai python question-answering rag retrieval-augmented-generation streamlit google-ai langgraph vertex-ai corrective-retrieval-augmented-generation agent-based claude claude-api

freestream's Introduction

FreeStream

Build your own personal chatbot interface with Streamlit!

TLDR:

  • A repository to help you get started building your own chatbots using LangChain
  • All conversation content is traced via LangSmith for developer evaluation
  • Hostable for free through Streamlit Community Cloud
  • API keys required
  • Pay-per-use, ChatGPT-style interface

Quickstart

This app is hosted on Streamlit Community Cloud at https://freestream.streamlit.app

Installation

This project uses Poetry for dependency management, consistent with Streamlit Community Cloud's deployment process.

Install Poetry with:

pip install -U pip && pip install -U poetry

Then, install the project's dependencies in a virtual environment using Poetry.

Run:

poetry install

You will need to set all required secrets, each of which requires an account with the respective provider. Make a copy of "template.secrets.toml", rename it to "secrets.toml" in the root of the project, and fill out each field in the file.

Need API Keys?

API Platform   Link
Claude         https://console.anthropic.com/
Google         https://aistudio.google.com/app/apikey
LangChain      https://smith.langchain.com/
OpenAI         https://platform.openai.com/api-keys
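
Once secrets.toml is filled out, the app can read values through Streamlit's st.secrets API. A minimal sketch, assuming a key named OPENAI_API_KEY; the actual field names are whatever "template.secrets.toml" defines:

import streamlit as st

# "OPENAI_API_KEY" is an assumed field name for illustration; match it to
# the fields in your secrets.toml
openai_api_key = st.secrets["OPENAI_API_KEY"]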

You can then start the development server with hot reloading by running:

poetry run streamlit run ./freestream/🏡_Home.py

Description

I originally created this project as a chatbot for law and medical professionals, but I quickly realized a more flexible system would benefit everyone.

Key Concepts

Related to extending the capabilities of generative AI.

Concept                Definition
Large Language Model   A model that can generate text.
RAG                    Retrieval-Augmented Generation
C-RAG                  Corrective Retrieval-Augmented Generation
Self-RAG               Self-Reflective Retrieval-Augmented Generation
ColBERT                Efficient BERT-based document search
RAPTOR                 Recursive Abstractive Processing for Tree-Organized Retrieval

What can I do with FreeStream?

FreeStream has two chatbots that let you interact with an LLM of your choosing, for example GPT-4o or Claude Opus. You can easily add more LLMs to the chatbot dictionary, such as Llama 3 via Ollama or Gemini-Pro through LangChain's ChatGoogleGenerativeAI (see the sketch below). The original chatbot for this project was "RAGbot," which lets you ask questions about your uploaded file(s). Curie is geared more toward programming and self-learning purposes.
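
A hedged sketch of such a model dictionary; the names, defaults, and structure are illustrative, not the app's actual code:

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_openai import ChatOpenAI

# Hypothetical model selector: keys are display names shown in the UI,
# values are LangChain chat model instances the user can "drop in"
model_options = {
    "GPT-4o": ChatOpenAI(model="gpt-4o", temperature=0.7),
    "Gemini-Pro": ChatGoogleGenerativeAI(model="gemini-pro", temperature=0.7),
}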

Functional Requirements

The application MUST...

  1. Provide a user interface for chatting with large language models.
  2. Have a retrieval augmented generative chatbot.
  3. Provide a range of chatbot pages, differentiated by their prompt engineering.
  4. Let the user "drop-in" their choice of LLM at any time during a conversation.
  5. Allow users to perform image upscaling (PDF, JPEG, PNG) without limits.

Non-Functional Requirements

The application SHOULD...

  1. Aim for 24/7 availability.
  2. Prioritize ease of navigation.
  3. Feature a visually appealing, seamless interface.

LLM Providers' Privacy Policies

freestream's People

Contributors

daethyra, dependabot[bot]


freestream's Issues

`configure_retriever` cache_resource -> cache_data

From https://docs.streamlit.io/library/advanced-features/caching

st.cache_data is the recommended way to cache computations that return data: loading a DataFrame from CSV, transforming a NumPy array, querying an API, or any other function that returns a serializable data object (str, int, float, DataFrame, array, list, …). It creates a new copy of the data at each function call, making it safe against mutations and race conditions. The behavior of st.cache_data is what you want in most cases – so if you're unsure, start with st.cache_data and see if it works!
st.cache_resource is the recommended way to cache global resources like ML models or database connections – unserializable objects that you don't want to load multiple times. Using it, you can share these resources across all reruns and sessions of an app without copying or duplication. Note that any mutations to the cached return value directly mutate the object in the cache (more details below).
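
For illustration, a minimal sketch contrasting the two decorators (the functions are hypothetical, not from this repo):

import pandas as pd
import streamlit as st

@st.cache_data  # serializable return value; each call gets a safe copy
def load_csv(path: str) -> pd.DataFrame:
    return pd.read_csv(path)

@st.cache_resource  # unserializable resource; shared across all sessions and reruns
def load_embedder(model_name: str):
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(model_name)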

Turn `configure_retriever` into a Tool for an Agent | Refactor `configure_retriever` to a class (?) | `configure_retriever` is restrictive and provides no access to inner functionality

Refactor configure_retriever to a class

Increase Modularity

Idea:

import logging
import os
import tempfile

import torch
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import UnstructuredFileLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

logger = logging.getLogger(__name__)


class RetrieverConfigurator:
    """
    A class for configuring a retriever object based on uploaded files.

    This class encapsulates the process of reading documents from uploaded files,
    splitting them into smaller chunks, creating embeddings for each chunk, and defining
    a retriever object that uses the FAISS vector database to search for similar documents.
    """

    def __init__(self, uploaded_files):
        """
        Initialize the RetrieverConfigurator object.

        Args:
            uploaded_files (list): A list of Streamlit uploaded file objects.
        """
        self.uploaded_files = uploaded_files
        self.docs = []
        self.chunks = []
        self.vectordb = None
        self.retriever = None

    def read_documents(self):
        """
        Reads the documents from the uploaded files.

        This method iterates over the uploaded files, writes them to temporary files,
        and loads the documents using the UnstructuredFileLoader.
        """
        temp_dir = tempfile.TemporaryDirectory()
        for file in self.uploaded_files:
            temp_filepath = os.path.join(temp_dir.name, file.name)
            with open(temp_filepath, "wb") as f:
                f.write(file.getvalue())
            loader = UnstructuredFileLoader(temp_filepath)
            self.docs.extend(loader.load())
            logger.info("Loaded document: %s", file.name)

    def split_documents(self):
        """
        Splits the loaded documents into smaller chunks.

        This method uses the RecursiveCharacterTextSplitter to split the documents
        into chunks based on the specified chunk size and overlap.
        """
        text_splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=75)
        self.chunks = text_splitter.split_documents(self.docs)

    def create_embeddings(self):
        """
        Creates embeddings for each chunk using the HuggingFace's MiniLM model.

        This method initializes the HuggingFaceEmbeddings with the specified model
        and generates embeddings for the chunks. The embeddings are then stored
        in a FAISS vector database.
        """
        model_kwargs = {"device": "cuda" if torch.cuda.is_available() else "cpu"}
        embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2", model_kwargs=model_kwargs)
        self.vectordb = FAISS.from_documents(self.chunks, embeddings)

    def define_retriever(self):
        """
        Defines a retriever object that uses the FAISS vector database.

        This method creates a retriever object from the FAISS vector database,
        configuring it with the specified search type and parameters.
        """
        self.retriever = self.vectordb.as_retriever(search_type="mmr", search_kwargs={"k": 3, "fetch_k": 7})

    def configure_retriever(self):
        """
        Configures and returns a retriever object for a given list of uploaded files.

        This method orchestrates the process of reading documents, splitting them into chunks,
        creating embeddings, and defining a retriever object. The configured retriever
        object is then returned.

        Returns:
            retriever (Retriever): A retriever object that can be used to search for similar documents.
        """
        self.read_documents()
        self.split_documents()
        self.create_embeddings()
        self.define_retriever()
        return self.retriever

# Example usage: `uploaded_files` comes from st.file_uploader(accept_multiple_files=True)
configurator = RetrieverConfigurator(uploaded_files)
retriever = configurator.configure_retriever()
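
If the refactor keeps Streamlit caching (the subject of the first issue above), a hedged sketch of a cached entry point; `build_retriever` is a hypothetical name:

import streamlit as st

# A retriever wraps an unserializable FAISS index, which points at
# cache_resource; whether cache_data fits better is the open question above
@st.cache_resource
def build_retriever(uploaded_files):
    return RetrieverConfigurator(uploaded_files).configure_retriever()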

LLM Reflectivity | RAG -> CRAG | Update `qa_chain` for use beyond OpenAI models

History Aware Retriever (idea):

from langchain.chains import create_history_aware_retriever, create_retrieval_chain

# Create a history-aware retriever that rephrases the user's question using
# chat history (`llm`, a `rephrase_prompt`, and a `combine_docs_chain` are
# assumed to be defined elsewhere)
history_aware_retriever = create_history_aware_retriever(
    llm, retriever, rephrase_prompt
)

# Create a RAG chain: history-aware retrieval feeds the answer-generation step
qa_chain = create_retrieval_chain(history_aware_retriever, combine_docs_chain)

# If there are no messages, or the user clicks the clear button,
# reset the conversation history
if len(msgs.messages) == 0 or st.sidebar.button("Clear message history"):
    msgs.clear()
    # show a default message from the AI
    msgs.add_ai_message("How can I help you?")

# Display conversation history window
avatars = {"human": "user", "ai": "assistant"}
for msg in msgs.messages:
    st.chat_message(avatars[msg.type]).write(msg.content)

# Display user input field and enter button
if user_query := st.chat_input(placeholder="Ask me anything!"):
    st.chat_message("user").write(user_query)

    # Display assistant response
    with st.chat_message("assistant"):
        # Check for the presence of the "messages" key in session state
        if "messages" not in st.session_state:
            st.session_state.messages = []

        retrieval_handler = PrintRetrievalHandler(st.container())
        stream_handler = StreamHandler(st.empty())
        response = qa_chain.invoke(
            {"input": user_query},
            config={"callbacks": [retrieval_handler, stream_handler]},
        )

Image Upscaler HALTED | Improper import statement for dependency: `basicsr` | ModuleNotFoundError: No module named 'torchvision.transforms.functional_tensor'

Details

Reference Issues:

  1. BasicSR
  2. Real-ESRGAN
  3. Automatic1111

My Traceback (PII Scrubbed):

(.venv) PS C:\Users\Software\testground\Real-ESRGAN> python .\inference_realesrgan.py      
Traceback (most recent call last):
 File "C:\Users\Software\testground\Real-ESRGAN\inference_realesrgan.py", line 5, in <module>
    from basicsr.archs.rrdbnet_arch import RRDBNet
 File "C:\Users\Software\.venv\lib\site-packages\basicsr\__init__.py", line 4, in <module>
    from .data import *
 File "C:\Users\Software\.venv\lib\site-packages\basicsr\data\__init__.py", line 22, in <module>
    _dataset_modules = [importlib.import_module(f'basicsr.data.{file_name}') for file_name in dataset_filenames]
 File "C:\Users\Software\.venv\lib\site-packages\basicsr\data\__init__.py", line 22, in <listcomp>
    _dataset_modules = [importlib.import_module(f'basicsr.data.{file_name}') for file_name in dataset_filenames]
 File "C:\Users\Software\.venv\lib\site-packages\basicsr\data\realesrgan_dataset.py", line 11, in <module>
    from basicsr.data.degradations import circular_lowpass_kernel, random_mixed_kernels
 File "C:\Users\Software\.venv\lib\site-packages\basicsr\data\degradations.py", line 8, in <module>
    from torchvision.transforms.functional_tensor import rgb_to_grayscale
ModuleNotFoundError: No module named 'torchvision.transforms.functional_tensor'

Suggested Fix
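
Likely root cause, inferred from the traceback rather than confirmed in this repo: torchvision 0.17 removed the private torchvision.transforms.functional_tensor module, and rgb_to_grayscale now lives in torchvision.transforms.functional. Pinning torchvision below 0.17, or patching basicsr/data/degradations.py to import rgb_to_grayscale from torchvision.transforms.functional, should resolve the ModuleNotFoundError.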

Create Agent Executor w/ Tools + Memory

Idea inspired by the following code from here:

    from langchain.agents import AgentExecutor, ConversationalChatAgent

    # `llm`, `tools`, and `memory` are assumed to be defined elsewhere
    chat_agent = ConversationalChatAgent.from_llm_and_tools(llm=llm, tools=tools)
    executor = AgentExecutor.from_agent_and_tools(
        agent=chat_agent,
        tools=tools,
        memory=memory,
        return_intermediate_steps=True,
        handle_parsing_errors=True,
    )

Note: By passing memory into the Agent Executor, we no longer have to worry about passing in the right key values to an invoke dictionary.
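
A minimal sketch of the memory object assumed above, using LangChain's ConversationBufferMemory; "chat_history" is the conventional memory key for chat agents, and the app's existing `msgs` (a StreamlitChatMessageHistory) can back it:

from langchain.memory import ConversationBufferMemory

# Store turns in the Streamlit-backed chat history under the "chat_history" key
memory = ConversationBufferMemory(
    chat_memory=msgs, memory_key="chat_history", return_messages=True
)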

README inaccurate description

  • Needs updates:
    • Short description
    • TLDR
    • Rename "Vocabulary" to "Key Concepts"
    • Remove Real-ESRGAN reference
    • Turn repo into Repository Template
    • Move "Additional References" up in document, closer to Key Concepts

replace `ConversationalRetrievalChain` with own chain

The following causes non-ChatGPT models to write standalone questions in non-English text:

from langchain.chains import ConversationalRetrievalChain

# Create a chain that ties everything together
# (candidate replacement: create_history_aware_retriever)
qa_chain = ConversationalRetrievalChain.from_llm(
    llm, retriever=retriever, memory=memory, verbose=True
)

migrate from `qa_chain.run` to `qa_chain.invoke`

Issue is based on the following existing snippet:

        response = qa_chain.run(
            user_query, callbacks=[retrieval_handler, stream_handler]
        )
        # When switching to `qa_chain.invoke`, pass callbacks via a RunnableConfig:
        #   cfg = RunnableConfig(callbacks=[retrieval_handler, stream_handler])
        #   response = qa_chain.invoke({"question": user_query}, config=cfg)

Gemini-Pro generates non-English text

During ConversationalRetrievalChain, when Gemini creates a retrieval question, it seems to pick a random language.

Potential fixes:

  • AgentExecutor implementation
  • Custom prompt chaining (sketched below)
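
A hedged sketch of the custom-prompt fix: pin the standalone-question rewrite to English by overriding the condense-question prompt (the wording is illustrative):

from langchain.prompts import PromptTemplate

# Force the standalone-question rewrite to stay in English
condense_prompt = PromptTemplate.from_template(
    "Given the following conversation, rephrase the follow-up question "
    "as a standalone question, written in English.\n\n"
    "Chat History:\n{chat_history}\n"
    "Follow-Up Question: {question}\n"
    "Standalone Question:"
)

qa_chain = ConversationalRetrievalChain.from_llm(
    llm,
    retriever=retriever,
    memory=memory,
    condense_question_prompt=condense_prompt,
)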

AgentExecutor | Add tools: Tavily, Calculator, `configure_retriever`

Edit: This issue serves as a reference and isn't to be worked on directly. Instead, focus on learning and implementing LangGraph for a smooth transition that resolves multiple issues in one fell swoop.


Using this prompt: https://smith.langchain.com/hub/hwchase17/react?organizationId=0f7461cf-206f-5c85-aa8d-48c6c48bafc5
I could add LLMMathChain:

from langchain.agents import Tool
from langchain.chains import LLMMathChain
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper

# Init different built-in tools (`llm` is assumed to be defined)
search = DuckDuckGoSearchAPIWrapper()
llm_math_chain = LLMMathChain.from_llm(llm)

# create a toolset list
tools = [
    Tool(
        name="Search",
        func=search.run,
        description="useful for when you need to answer questions about current events. You should ask targeted questions",
    ),
    Tool(
        name="Calculator",
        func=llm_math_chain.run,
        description="useful for when you need to answer questions about math",
    ),
]
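
A minimal sketch of wiring these tools into a ReAct agent with the hub prompt referenced above (assumes `llm` is defined and the langchainhub package is installed):

from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent

# Pull the ReAct prompt linked above from the LangChain Hub
prompt = hub.pull("hwchase17/react")

agent = create_react_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, handle_parsing_errors=True)

result = executor.invoke({"input": "What is 15% of 3,300?"})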
