A simple web application for an OpenAI-enabled document search. This repo uses Azure OpenAI Service to create embedding vectors from documents. To answer a user's question, it retrieves the most relevant document and then uses GPT-3, GPT-3.5 or GPT-4 to extract the matching answer.

Home Page: https://azure.microsoft.com/en-us/products/cognitive-services/openai-service

License: MIT License

azure-open-ai-embeddings-qna's Introduction

Azure OpenAI Embeddings QnA

A simple web application for an OpenAI-enabled document search. This repo uses Azure OpenAI Service to create embedding vectors from documents. To answer a user's question, it retrieves the most relevant document and then uses GPT-3 to extract the matching answer.

Architecture

Learning More about Enterprise QnA

Enterprise QnA is built on a pattern the AI community calls "Retrieval-Augmented Generation" (RAG). In addition to the reference architecture in this repository showing how to implement this pattern on Azure, here are resources to familiarize yourself with the concepts behind RAG, along with samples to learn each underlying product's APIs:

  • Reference Architecture (GitHub, this repo): Starter template for enterprise development.
    - Easily deployable reference architecture following best practices.
    - Frontend is Azure OpenAI chat orchestrated with Langchain.
    - Composes Form Recognizer, Azure Search, and Redis in an end-to-end design.
    - Supports working with Azure Search and Redis.
  • Educational Blog Post (Microsoft Blog, GitHub): Learn about the building blocks in a RAG solution.
    - Introduction to the key elements in a RAG architecture.
    - Understand the role of vector search in RAG scenarios.
    - See how Azure Search supports this pattern.
    - Understand the role of prompts and orchestrators like Langchain.
  • Azure OpenAI API Sample (GitHub): Get started with Azure OpenAI features.
    - Sample code to make an interactive chat client as a web page.
    - Helps you get started with the latest Azure OpenAI APIs.
  • Business Process Automation Samples (GitHub): Showcase multiple BPA scenarios implemented with Form Recognizer and other Azure services.
    - Consolidates in one repository multiple samples related to BPA and document understanding.
    - Includes an end-to-end app and a GUI to create and customize a pipeline integrating multiple Azure Cognitive Services.
    - Samples include document intelligence and search.

IMPORTANT NOTE (OpenAI generated)

We have made some changes to the data format in the latest update of this repo.
The new format is more efficient and compatible with the latest standards and libraries. However, we understand that some of you may have existing applications that rely on the previous format and may not be able to migrate to the new one immediately.

Therefore, we have provided a way for you to continue using the previous format in a running application. All you need to do is set your web application tag to fruocco/oai-embeddings:2023-03-27_25. This will ensure that your application uses the data format that was available on March 27, 2023. We strongly recommend that you update your applications to use the new format as soon as possible.
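If your web application runs as an Azure Web App for Containers, one way to pin that tag is with the Azure CLI. The command below is only a sketch: the web app and resource group names are placeholders you need to replace.

az webapp config container set \
  --name YOUR_WEBAPP_NAME \
  --resource-group YOUR_RESOURCE_GROUP \
  --docker-custom-image-name fruocco/oai-embeddings:2023-03-27_25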

If you want to move to the new format, please go to:

  • "Add Document" -> "Add documents in Batch" and click on "Convert all files and add embeddings" to reprocess your documents.

Use the Repo with Chat based deployment (gpt-35-turbo or gpt-4-32k or gpt-4)

By default, the repo uses an Instruction based model (like text-davinci-003) for the QnA and Chat experience.
If you want to use a Chat based deployment (gpt-35-turbo or gpt-4-32k or gpt-4), please change the environment variables as described here.
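For example, assuming a Chat deployment named gpt-35-turbo exists in your Azure OpenAI resource, the relevant settings from the Environment variables section would look roughly like this:

OPENAI_DEPLOYMENT_TYPE=Chat
OPENAI_ENGINE=gpt-35-turbo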

Running this repo

You have multiple options to run the code:

Deploy on Azure (WebApp + Batch Processing) with Azure Cognitive Search

Deploy to Azure

Click on the Deploy to Azure button and configure your settings in the Azure Portal as described in the Environment variables section.

Architecture

Please be aware that you need:

  • an existing Azure OpenAI resource with model deployments (an instruction model, e.g. text-davinci-003, and an embeddings model, e.g. text-embedding-ada-002)

Signing up for Vector Search Private Preview in Azure Cognitive Search

Azure Cognitive Search supports searching using pure vectors, pure text, or in hybrid mode where both are combined. For the vector-based cases, you'll need to sign up for Vector Search Private Preview. To sign up, please fill in this form: https://aka.ms/VectorSearchSignUp.

Preview functionality is provided under Supplemental Terms of Use, without a service level agreement, and isn't recommended for production workloads.

Deploy on Azure (WebApp + Azure Cache for Redis Enterprise + Batch Processing)

Deploy to Azure

Click on the Deploy to Azure button to automatically deploy a template on Azure with the resources needed to run this example. This option will provision an instance of Azure Cache for Redis with RediSearch installed to store vectors and perform the similarity search.

Architecture

Please be aware that you still need:

  • an existing Azure OpenAI resource with model deployments (an instruction model, e.g. text-davinci-003, and an embeddings model, e.g. text-embedding-ada-002)
  • an existing Form Recognizer Resource
  • an existing Translator Resource
  • Azure marketplace access. (Azure Cache for Redis Enterprise uses the marketplace for IP billing)

You will add the endpoint and access key information for these resources when deploying the template.

Deploy on Azure/Azure China (WebApp + Redis Stack + Batch Processing)

Deploy to Azure | Deploy to Azure China

Click on the Deploy to Azure button and configure your settings in the Azure Portal as described in the Environment variables section.

Architecture

Please be aware that you need:

  • an existing Azure OpenAI resource with model deployments (an instruction model, e.g. text-davinci-003, and an embeddings model, e.g. text-embedding-ada-002)
  • an existing Form Recognizer Resource (OPTIONAL - if you want to extract text out of documents)
  • an existing Translator Resource (OPTIONAL - if you want to translate documents)

Deploy on Azure/Azure China (WebApp + Azure PostgreSQL + Batch Processing)

Deploy to Azure | Deploy to Azure China

Click on the Deploy to Azure button and configure your settings in the Azure Portal as described in the Environment variables section.

Architecture

Run everything locally in Docker (WebApp + Redis Stack + Batch Processing)

First, clone the repo:

git clone https://github.com/ruoccofabrizio/azure-open-ai-embeddings-qna
cd azure-open-ai-embeddings-qna

Next, configure your .env as described in Environment variables:

cp .env.template .env
vi .env # or use whatever you feel comfortable with
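As a rough illustration, a minimal .env for the docker compose setup could contain values along these lines (the keys come from the Environment variables section below; every value shown is either a placeholder or the docker-compose default):

OPENAI_API_BASE=https://YOUR_AZURE_OPENAI_RESOURCE.openai.azure.com/
OPENAI_API_KEY=YOUR_AZURE_OPENAI_KEY
OPENAI_ENGINE=text-davinci-003
OPENAI_EMBEDDINGS_ENGINE_DOC=text-embedding-ada-002
OPENAI_EMBEDDINGS_ENGINE_QUERY=text-embedding-ada-002
REDIS_ADDRESS=api
REDIS_PORT=6379
CONVERT_ADD_EMBEDDINGS_URL=http://batch/api/BatchStartProcessing
AzureWebJobsStorage=YOUR_AZURE_BLOB_STORAGE_CONNECTION_STRING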

Finally run the application:

docker compose up

Open your browser at http://localhost:8080

This will spin up three Docker containers:

  • The WebApp itself
  • Redis Stack for storing the embeddings
  • Batch Processing Azure Function

NOTE: The Batch Processing Azure Function uses an Azure Storage Account for queuing the documents to process. Please create a Queue named "doc-processing" in the account used for the "AzureWebJobsStorage" env setting.
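If you prefer the CLI to the Portal, a sketch of creating that queue with the Azure CLI (using the same storage account as your AzureWebJobsStorage connection string) looks like this:

az storage queue create \
  --name doc-processing \
  --connection-string "YOUR_AZURE_BLOB_STORAGE_CONNECTION_STRING"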

Run everything locally in Python with Conda (WebApp only)

This requires Redis running somewhere and expects that you've set up your .env as described above. In this case, point REDIS_ADDRESS to your Redis deployment.

You can run a local Redis instance via:

 docker run -p 6379:6379 redis/redis-stack-server:latest

You can run a local Batch Processing Azure Function:

 docker run -p 7071:80 fruocco/oai-batch:latest

Create conda environment for Python:

conda env create -f code/environment.yml
conda activate openai-qna-env

Configure your .env as described in Environment variables

Run WebApp:

cd code
streamlit run OpenAI_Queries.py

Run everything locally in Python with venv

This requires Redis running somewhere and expects that you've set up your .env as described above. In this case, point REDIS_ADDRESS to your Redis deployment.

You can run a local Redis instance via:

 docker run -p 6379:6379 redis/redis-stack-server:latest

You can run a local Batch Processing Azure Function:

 docker run -p 7071:80 fruocco/oai-batch:latest

Please ensure you have Python 3.9+ installed.

Create venv environment for Python:

python -m venv .venv
.venv\Scripts\activate
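The activation command above is for Windows; on Linux or macOS the equivalent is:

source .venv/bin/activate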

Install PIP Requirements

pip install -r code\requirements.txt

Configure your .env as described in Environment variables

Run the WebApp

cd code
streamlit run OpenAI_Queries.py

Run WebApp locally in Docker against an existing Redis deployment

Option 1 - Run the prebuilt Docker image

Configure your .env as described in Environment variables

Then run:

docker run --env-file .env -p 8080:80 fruocco/oai-embeddings:latest

Option 2 - Build the Docker image yourself

Configure your .env as described in Environment variables

docker build . -f Dockerfile -t your_docker_registry/your_docker_image:your_tag
docker run --env-file .env -p 8080:80 your_docker_registry/your_docker_image:your_tag

Note: You can use

  • WebApp.Dockerfile to build the Web Application
  • BatchProcess.Dockerfile to build the Azure Function for Batch Processing
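For example, building both images yourself looks roughly like this (registry, image names and tags are placeholders of your choice):

docker build . -f WebApp.Dockerfile -t your_docker_registry/your_webapp_image:your_tag
docker build . -f BatchProcess.Dockerfile -t your_docker_registry/your_batch_image:your_tag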

Use the QnA API from the backend

You can use a QnA API on your data exposed by the Azure Function for Batch Processing.

POST https://YOUR_BATCH_PROCESS_AZURE_FUNCTION_URL/api/apiQnA
Body:
    question: str
    history: list of (str, str) pairs -- OPTIONAL
    custom_prompt: str -- OPTIONAL
    custom_temperature: float -- OPTIONAL

Return:
{'context': 'Introduction to Azure Cognitive Search - Azure Cognitive Search '
            '(formerly known as "Azure Search") is a cloud search service that '
            'gives developers infrastructure, APIs, and tools for building a '
            'rich search experience over private, heterogeneous content in '
            'web, mobile, and enterprise applications...'
            '...'
            '...',

 'question': 'What is ACS?',

 'response': 'ACS stands for Azure Cognitive Search, which is a cloud search service'
             'that provides infrastructure, APIs, and tools for building a rich search experience'
             'over private, heterogeneous content in web, mobile, and enterprise applications...'
             '...'
             '...',
             
 'sources': '[https://learn.microsoft.com/en-us/azure/search/search-what-is-azure-search](https://learn.microsoft.com/en-us/azure/search/search-what-is-azure-search)'}

Call the API with no history for QnA mode

import requests

r = requests.post('http://YOUR_BATCH_PROCESS_AZURE_FUNCTION_URL/api/apiQnA', json={
    'question': 'What is the capital of Italy?'
    })

Call the API with history for Chat mode

r = requests.post('http://YOUR_BATCH_PROCESS_AZURE_FUNCTION_URL/api/apiQnA', json={
    'question': 'can I use python SDK?',
    'history': [
        ("what's ACS?", 
        'ACS stands for Azure Cognitive Search, which is a cloud search service that provides infrastructure, APIs, and tools for building a rich search experience over private, heterogeneous content in web, mobile, and enterprise applications. It includes a search engine for full-text search, rich indexing with lexical analysis and AI enrichment for content extraction and transformation, rich query syntax for text search, fuzzy search, autocomplete, geo-search, and more. ACS can be created, loaded, and queried using the portal, REST API, .NET SDK, or another SDK. It also includes data integration at the indexing layer, AI and machine learning integration with Azure Cognitive Services, and security integration with Azure Active Directory and Azure Private Link integration.'
        )
        ]
    })
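In both modes the API returns JSON with the fields shown above, so you can read the generated answer and its sources from the parsed body, for example:

answer = r.json()
print(answer['response'])
print(answer['sources'])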

Environment variables

Here is the explanation of the parameters:

  • OPENAI_ENGINE (e.g. text-davinci-003): Engine deployed in your Azure OpenAI resource, e.g. an Instruction based model (text-davinci-003) or a Chat based model (gpt-35-turbo, gpt-4-32k or gpt-4). Please use the deployment name and not the model name.
  • OPENAI_DEPLOYMENT_TYPE (Text or Chat): Text for Instruction engines (text-davinci-003), Chat for Chat based deployments (gpt-35-turbo, gpt-4-32k or gpt-4).
  • OPENAI_EMBEDDINGS_ENGINE_DOC (e.g. text-embedding-ada-002): Embedding engine for documents deployed in your Azure OpenAI resource.
  • OPENAI_EMBEDDINGS_ENGINE_QUERY (e.g. text-embedding-ada-002): Embedding engine for queries deployed in your Azure OpenAI resource.
  • OPENAI_API_BASE (https://YOUR_AZURE_OPENAI_RESOURCE.openai.azure.com/): Your Azure OpenAI resource endpoint. Get it in the Azure Portal.
  • OPENAI_API_KEY (YOUR_AZURE_OPENAI_KEY): Your Azure OpenAI API key. Get it in the Azure Portal.
  • OPENAI_TEMPERATURE (e.g. 0.1): Azure OpenAI temperature.
  • OPENAI_MAX_TOKENS (e.g. -1): Azure OpenAI max tokens.
  • AZURE_CLOUD (AzureCloud): Azure cloud to use. AzureCloud for Azure Global, AzureChinaCloud for Azure China.
  • VECTOR_STORE_TYPE (e.g. PGVector): Vector store type. Use AzureSearch for Azure Cognitive Search, PGVector for Azure PostgreSQL; leave it blank for Redis or Azure Cache for Redis Enterprise.
  • AZURE_SEARCH_SERVICE_NAME (YOUR_AZURE_SEARCH_SERVICE_URL): Your Azure Cognitive Search service name. Get it in the Azure Portal.
  • AZURE_SEARCH_ADMIN_KEY (AZURE_SEARCH_ADMIN_KEY): Your Azure Cognitive Search admin key. Get it in the Azure Portal.
  • PGVECTOR_HOST: Your_PG_NAME.postgres.database.azure.com or Your_PG_NAME.postgres.database.chinacloudapi.cn.
  • PGVECTOR_PORT: 5432.
  • PGVECTOR_DATABASE: YOUR_PG_DATABASE.
  • PGVECTOR_USER: YOUR_PG_USER.
  • PGVECTOR_PASSWORD: YOUR_PG_PASSWORD.
  • REDIS_ADDRESS (api): URL for Redis Stack; use "api" for docker compose.
  • REDIS_PORT (6379): Port for Redis.
  • REDIS_PASSWORD (redis-stack-password): OPTIONAL - Password for your Redis Stack.
  • REDIS_ARGS (--requirepass redis-stack-password): OPTIONAL - Password for your Redis Stack.
  • REDIS_PROTOCOL (redis://).
  • CHUNK_SIZE (500): OPTIONAL - Chunk size for splitting long documents into multiple subdocuments. Default: 500.
  • CHUNK_OVERLAP (100): OPTIONAL - Overlap between chunks for document splitting. Default: 100.
  • CONVERT_ADD_EMBEDDINGS_URL (http://batch/api/BatchStartProcessing): URL for the Batch Processing Function; use "http://batch/api/BatchStartProcessing" for docker compose.
  • AzureWebJobsStorage (AZURE_BLOB_STORAGE_CONNECTION_STRING): Azure Blob Storage connection string for the Azure Function used for Batch Processing.

Optional parameters for additional features (e.g. document text extraction with OCR):

  • BLOB_ACCOUNT_NAME (YOUR_AZURE_BLOB_STORAGE_ACCOUNT_NAME): OPTIONAL - Get it in the Azure Portal if you want to use the document extraction feature.
  • BLOB_ACCOUNT_KEY (YOUR_AZURE_BLOB_STORAGE_ACCOUNT_KEY): OPTIONAL - Get it in the Azure Portal if you want to use the document extraction feature.
  • BLOB_CONTAINER_NAME (YOUR_AZURE_BLOB_STORAGE_CONTAINER_NAME): OPTIONAL - Get it in the Azure Portal if you want to use the document extraction feature.
  • FORM_RECOGNIZER_ENDPOINT (YOUR_AZURE_FORM_RECOGNIZER_ENDPOINT): OPTIONAL - Get it in the Azure Portal if you want to use the document extraction feature.
  • FORM_RECOGNIZER_KEY (YOUR_AZURE_FORM_RECOGNIZER_KEY): OPTIONAL - Get it in the Azure Portal if you want to use the document extraction feature.
  • PAGES_PER_EMBEDDINGS: Number of pages per embedding. Keep in mind you should have fewer than 3K tokens for each embedding. Default: a new embedding is created every 2 pages.
  • TRANSLATE_ENDPOINT (YOUR_AZURE_TRANSLATE_ENDPOINT): OPTIONAL - Get it in the Azure Portal if you want to use the translation feature.
  • TRANSLATE_KEY (YOUR_TRANSLATE_KEY): OPTIONAL - Get it in the Azure Portal if you want to use the translation feature.
  • TRANSLATE_REGION (YOUR_TRANSLATE_REGION): OPTIONAL - Get it in the Azure Portal if you want to use the translation feature.
  • VNET_DEPLOYMENT (false): Boolean variable; set it to "true" if you want to deploy the solution in a VNET. Please check your Azure Form Recognizer and Azure Translator endpoints as well.

DISCLAIMER

This presentation, demonstration, and demonstration model are for informational purposes only and (1) are not subject to SOC 1 and SOC 2 compliance audits, and (2) are not designed, intended or made available as a medical device(s) or as a substitute for professional medical advice, diagnosis, treatment or judgment. Microsoft makes no warranties, express or implied, in this presentation, demonstration, and demonstration model. Nothing in this presentation, demonstration, or demonstration model modifies any of the terms and conditions of Microsoft’s written and signed agreements. This is not an offer and applicable terms and the information provided are subject to revision and may be changed at any time by Microsoft.

This presentation, demonstration, and demonstration model do not give you or your organization any license to any patents, trademarks, copyrights, or other intellectual property covering the subject matter in this presentation, demonstration, and demonstration model.

The information contained in this presentation, demonstration and demonstration model represents the current view of Microsoft on the issues discussed as of the date of presentation and/or demonstration, for the duration of your access to the demonstration model. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of presentation and/or demonstration and for the duration of your access to the demonstration model.

No Microsoft technology, nor any of its component technologies, including the demonstration model, is intended or made available as a substitute for the professional advice, opinion, or judgment of (1) a certified financial services professional, or (2) a certified medical professional. Partners or customers are responsible for ensuring the regulatory compliance of any solution they build using Microsoft technologies.

azure-open-ai-embeddings-qna's People

Contributors

csiebler, cyberflying, edjez, ignaciofls, msfteegarden, plimantour, ruoccofabrizio


azure-open-ai-embeddings-qna's Issues

Streamlit/redis connection error

Although the solution works, I keep seeing this error being output, which looks like either a redis connection problem, or a Streamlit one. Any idea what might be the cause?

File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script exec(code, module.__dict__) File "/usr/local/src/myscripts/pages/04_Index_Management.py", line 24, in <module> data = redisembeddings.get_documents() File "/usr/local/src/myscripts/utilities/redisembeddings.py", line 57, in get_documents results = redis_conn.ft(index_name).search(query) File "/usr/local/lib/python3.9/site-packages/redis/commands/search/commands.py", line 420, in search res = self.execute_command(SEARCH_CMD, *args) File "/usr/local/lib/python3.9/site-packages/redis/client.py", line 1258, in execute_command return conn.retry.call_with_retry( File "/usr/local/lib/python3.9/site-packages/redis/retry.py", line 49, in call_with_retry fail(error) File "/usr/local/lib/python3.9/site-packages/redis/client.py", line 1262, in <lambda> lambda error: self._disconnect_raise(conn, error), File "/usr/local/lib/python3.9/site-packages/redis/client.py", line 1248, in _disconnect_raise raise error File "/usr/local/lib/python3.9/site-packages/redis/retry.py", line 46, in call_with_retry return do() File "/usr/local/lib/python3.9/site-packages/redis/client.py", line 1259, in <lambda> lambda: self._send_command_parse_response( File "/usr/local/lib/python3.9/site-packages/redis/client.py", line 1235, in _send_command_parse_response return self.parse_response(conn, command_name, **options) File "/usr/local/lib/python3.9/site-packages/redis/client.py", line 1275, in parse_response response = connection.read_response() File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 848, in read_response response = self._parser.read_response(disable_decoding=disable_decoding) File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 335, in read_response result = self._read_response(disable_decoding=disable_decoding) File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 344, in _read_response raw = self._buffer.readline() File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 243, in readline self._read_from_socket() File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 195, in _read_from_socket raise ConnectionError(SERVER_CLOSED_CONNECTION_ERROR)

OpenAI Semantic Answer error

Attached the error, hope you can tell me what it could be?
image

RetryError: RetryError[<Future at 0x7c5cdb2d1460 state=finished raised InvalidRequestError>]
Traceback:
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
exec(code, module.__dict__)
File "/usr/local/src/myscripts/OpenAI_Queries.py", line 70, in
st.session_state['prompt'], st.session_state['response'] = utils.get_semantic_answer(df, question, model=model, engine='davinci', limit_response=st.session_state['limit_response'], tokens_response=st.tokens_response, temperature=st.temperature)
File "/usr/local/src/myscripts/utilities/utils.py", line 50, in get_semantic_answer
res = search_semantic_redis(df, question, n=3, pprint=False, engine=engine)
File "/usr/local/src/myscripts/utilities/utils.py", line 34, in search_semantic_redis
embedding = get_embedding(search_query, engine= os.getenv('OPENAI_EMBEDDINGS_ENGINE_QUERY', f'text-search-{engine}-query-001'))
File "/usr/local/lib/python3.9/site-packages/tenacity/init.py", line 326, in wrapped_f
return self(f, *args, **kw)
File "/usr/local/lib/python3.9/site-packages/tenacity/init.py", line 406, in call
do = self.iter(retry_state=retry_state)
File "/usr/local/lib/python3.9/site-packages/tenacity/init.py", line 363, in iter
raise retry_exc from fut.exception()

ValueError: not enough values to unpack (expected 3, got 2)

Hello Ruocco,

I managed to add a document, but when I start asking questions I keep getting this error:

ValueError: not enough values to unpack (expected 3, got 2)
Traceback:
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
File "/usr/local/src/myscripts/OpenAI_Queries.py", line 74, in <module>
    st.session_state['full_prompt'], st.session_state['response'], st.session_state['source_file'] = utils.get_semantic_answer(df, question, st.session_state['prompt'] ,model=model, engine='davinci', tokens_response=st.tokens_response, temperature=st.temperature

Could you please help me with this? Thanks a lot!

Out of Context answers

I am getting out of context answers when I enter the prompt in the "OpenAI Semantic Answer" text box.

Screenshot 2023-02-17 at 2 16 43 PM

Whereas if I make the question part of the prompt in Settings, I get the right one.

Screenshot 2023-02-17 at 2 24 17 PM

The Port for Redis is using the Password from the config

Hi,

I get the following error after successful deployment:

Traceback (most recent call last): File "/usr/local/lib/python3.9/site-packages/langchain/vectorstores/redis.py", line 46, in init redis_client = redis.from_url(redis_url, **kwargs) File "/usr/local/lib/python3.9/site-packages/redis/utils.py", line 32, in from_url return Redis.from_url(url, **kwargs) File "/usr/local/lib/python3.9/site-packages/redis/client.py", line 902, in from_url connection_pool = ConnectionPool.from_url(url, **kwargs) File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 1329, in from_url url_options = parse_url(url) File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 1257, in parse_url if url.port: File "/usr/local/lib/python3.9/urllib/parse.py", line 178, in port raise ValueError(message) from None ValueError: Port could not be cast to integer value as 'MY REDIS PASSWORD'

-> I have already checked the WebApp configuration Parameters. They are all correct, the port is ok and also the password is in the correct parameter as value.

Maybe there is a wrong reference in the code of the webapp?

Delete documents

Anyone using the web link can view the documents from Document Viewer.
Is it possible to hide/remove the documents from the portal itself (instead of deleting them from Blob Storage)? It would be super if you could implement that. Please let me know. Thanks.

long error when running

Traceback (most recent call last): File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 698, in connect sock = self.retry.call_with_retry( File "/usr/local/lib/python3.9/site-packages/redis/retry.py", line 46, in call_with_retry return do() File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 699, in lambda: self._connect(), lambda error: self.disconnect(error) File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 955, in _connect for res in socket.getaddrinfo( File "/usr/local/lib/python3.9/socket.py", line 954, in getaddrinfo for res in _socket.getaddrinfo(host, port, family, type, proto, flags): socket.gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/usr/local/src/myscripts/OpenAI_Queries.py", line 120, in llm_helper = LLMHelper(custom_prompt=st.session_state.custom_prompt, temperature=st.session_state.custom_temperature) File "/usr/local/src/myscripts/utilities/helper.py", line 90, in init self.vector_store: RedisExtended = RedisExtended(redis_url=self.vector_store_full_address, index_name=self.index_name, embedding_function=self.embeddings.embed_query) if vector_store is None else vector_store File "/usr/local/src/myscripts/utilities/redis.py", line 26, in init super().init(redis_url, index_name, embedding_function) File "/usr/local/lib/python3.9/site-packages/langchain/vectorstores/redis.py", line 90, in init _check_redis_module_exist(redis_client, REDIS_REQUIRED_MODULES) File "/usr/local/lib/python3.9/site-packages/langchain/vectorstores/redis.py", line 30, in _check_redis_module_exist installed_modules = client.module_list() File "/usr/local/lib/python3.9/site-packages/redis/commands/core.py", line 5761, in module_list return self.execute_command("MODULE LIST") File "/usr/local/lib/python3.9/site-packages/redis/client.py", line 1255, in execute_command conn = self.connection or pool.get_connection(command_name, **options) File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 1442, in get_connection connection.connect() File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 704, in connect raise ConnectionError(self._error_message(e)) redis.exceptions.ConnectionError: Error -2 connecting to api:6379. Name or service not known.

How to increase answer length limit in azure-open-ai-embeddings-qna?

I am using the azure-open-ai-embeddings-qna project and have noticed that the answers are truncated after a certain number of characters. I would like to know where I can find the answer length limit and how to increase it. Can someone please tell me how to modify the limit and increase the maximum length of the answers? Thank you

Error uploading files

Uploading a 39.9 MB PDF file raises this error: azure.core.exceptions.ServiceResponseError: ('Connection aborted.', timeout('The write operation timed out'))

Unable to use "history" when calling the ApiQna function

First, thanks for sharing this great work.
For some reason, I keep getting a 500 when executing the below on ApiQnA. Is there any obvious reason?
image
Here is the detailed error message I’m getting. Note that it works perfectly fine if I remove the “history” attribute from the Body.
image


Batch upload is not working as expected

I tried uploading a single 5-page PDF document via "Add documents in Batch". I get the prompt that 1 document is being uploaded, that this will happen asynchronously, and that it will take time. However, even after waiting for more than an hour I cannot see the converted file in Index Management. What am I doing wrong? All necessary components have been provisioned in Azure.

Below are logs of the error -
azure-open-ai-embeddings-qna-batch-1  | info: Function.BatchStartProcessing[1]
azure-open-ai-embeddings-qna-batch-1  |       Executing 'Functions.BatchStartProcessing' (Reason='This function was programmatically called via the host APIs.', Id=9f86e0ca-325c-461c-bf10-9391d9cce231)
azure-open-ai-embeddings-qna-batch-1  | fail: Function.BatchStartProcessing[3]
azure-open-ai-embeddings-qna-batch-1  |       Executed 'Functions.BatchStartProcessing' (Failed, Id=9f86e0ca-325c-461c-bf10-9391d9cce231, Duration=1ms)
azure-open-ai-embeddings-qna-batch-1  |       Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Exception while executing function: Functions.BatchStartProcessing
azure-open-ai-embeddings-qna-batch-1  |        ---> Microsoft.Azure.WebJobs.Script.Workers.Rpc.RpcException: Result: Failure
azure-open-ai-embeddings-qna-batch-1  |       Exception: KeyError: 'QUEUE_NAME'
azure-open-ai-embeddings-qna-batch-1  |       Stack:   File "/azure-functions-host/workers/python/3.9/LINUX/X64/azure_functions_worker/dispatcher.py", line 357, in _handle__function_load_request
azure-open-ai-embeddings-qna-batch-1  |           func = loader.load_function(
azure-open-ai-embeddings-qna-batch-1  |         File "/azure-functions-host/workers/python/3.9/LINUX/X64/azure_functions_worker/utils/wrappers.py", line 44, in call
azure-open-ai-embeddings-qna-batch-1  |           return func(*args, **kwargs)
azure-open-ai-embeddings-qna-batch-1  |         File "/azure-functions-host/workers/python/3.9/LINUX/X64/azure_functions_worker/loader.py", line 132, in load_function
azure-open-ai-embeddings-qna-batch-1  |           mod = importlib.import_module(fullmodname)
azure-open-ai-embeddings-qna-batch-1  |         File "/usr/local/lib/python3.9/importlib/init.py", line 127, in import_module
azure-open-ai-embeddings-qna-batch-1  |           return _bootstrap._gcd_import(name[level:], package, level)
azure-open-ai-embeddings-qna-batch-1  |         File "", line 1030, in _gcd_import
azure-open-ai-embeddings-qna-batch-1  |         File "", line 1007, in _find_and_load

Adding Documents - Issue

Hello,

I'm trying to add a .pdf file to be embedded. It has only 7 pages, but only the first 2 pages are embedded into a single chunk. The rest of the pages (5) appear to be skipped.
Could you help me with an answer about this? Am I missing something?

Thanks,
Mihai

change port

It seems to be hardcoded to port 8080; how can I change it to run on another port, such as the common port 80?

Batch Processing: QUEUE_NAME variable not found

Hi,

I used the Run everything locally in Docker approach and tried to batch upload a bunch of PDFs.
I got an error about missing environment variable QUEUE_NAME.

Looking at this file, it seems like a storage queue is being used in batch processing.

I created a queue, added the key to .env and everything worked fine.

So I guess the docs might need an update.

Thanks

Anything like this in C#?

I want to do this in C#. I am currently using Cognitive Services QnA. Is there anything that exists that is close to this solution but in C#? Thank you.

Deployment process went successful but portal has InvalidRequestError

Please help to solve a problem that appears after a successful deployment without any errors.
The last deployment completed without any errors, but in an earlier attempt the wrong Azure OpenAI service name was passed to the script.
Everything in this resource group was deleted and the whole procedure was initiated again.
Please help to solve the problem, because the portal is not working :(

InvalidRequestError: The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again.
Traceback:
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
exec(code, module.__dict__)
File "/usr/local/src/myscripts/pages/00_Chat.py", line 25, in
question, result, _, sources = llm_helper.get_semantic_answer_lang_chain(question, st.session_state['chat_history'])
File "/usr/local/src/myscripts/utilities/helper.py", line 168, in get_semantic_answer_lang_chain
result = chain({"question": question, "chat_history": chat_history})
File "/usr/local/lib/python3.9/site-packages/langchain/chains/base.py", line 116, in call
raise e
File "/usr/local/lib/python3.9/site-packages/langchain/chains/base.py", line 113, in call
outputs = self._call(inputs)
File "/usr/local/lib/python3.9/site-packages/langchain/chains/conversational_retrieval/base.py", line 79, in _call
docs = self._get_docs(new_question, inputs)
File "/usr/local/lib/python3.9/site-packages/langchain/chains/conversational_retrieval/base.py", line 146, in _get_docs
docs = self.retriever.get_relevant_documents(question)
File "/usr/local/lib/python3.9/site-packages/langchain/vectorstores/redis.py", line 424, in get_relevant_documents
docs = self.vectorstore.similarity_search(query, k=self.k)
File "/usr/local/lib/python3.9/site-packages/langchain/vectorstores/redis.py", line 139, in similarity_search
docs_and_scores = self.similarity_search_with_score(query, k=k)
File "/usr/local/lib/python3.9/site-packages/langchain/vectorstores/redis.py", line 191, in similarity_search_with_score
embedding = self.embedding_function(query)
File "/usr/local/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 280, in embed_query
embedding = self._embedding_func(text, engine=self.query_model_name)
File "/usr/local/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 236, in _embedding_func
return self._get_len_safe_embeddings([text], engine=engine)[0]
File "/usr/local/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 206, in _get_len_safe_embeddings
response = embed_with_retry(
File "/usr/local/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 53, in embed_with_retry
return _completion_with_retry(**kwargs)
File "/usr/local/lib/python3.9/site-packages/tenacity/init.py", line 289, in wrapped_f
return self(f, *args, **kw)
File "/usr/local/lib/python3.9/site-packages/tenacity/init.py", line 379, in call
do = self.iter(retry_state=retry_state)
File "/usr/local/lib/python3.9/site-packages/tenacity/init.py", line 314, in iter
return fut.result()
File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 439, in result
return self.__get_result()
File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
raise self._exception
File "/usr/local/lib/python3.9/site-packages/tenacity/init.py", line 382, in call
result = fn(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 51, in _completion_with_retry
return embeddings.client.create(**kwargs)
File "/usr/local/lib/python3.9/site-packages/openai/api_resources/embedding.py", line 33, in create
response = super().create(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
response, _, api_key = requestor.request(
File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 226, in request
resp, got_stream = self._interpret_response(result, stream)
File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 619, in _interpret_response
self._interpret_response_line(
File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 682, in _interpret_response_line
raise self.handle_error_response(

Getting a Retry error -> RetryError: RetryError[<Future at 0x7561df248280 state=finished raised InvalidRequestError>]

Hi,

I am running an instance on our internal Azure services, and I continually get this error whenever we try to compute the embeddings for any of the documents or text.

Here is the traceback

File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 561, in _run_script
self._session_state.on_script_will_rerun(rerun_data.widget_states)
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/safe_session_state.py", line 72, in on_script_will_rerun
self._state.on_script_will_rerun(latest_widget_states)
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 542, in on_script_will_rerun
self._call_callbacks()
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 555, in _call_callbacks
self._new_widget_state.call_callback(wid)
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 276, in call_callback
callback(*args, **kwargs)
File "/usr/local/src/myscripts/pages/01_Add_Document.py", line 16, in embeddings
embeddings = utils.chunk_and_embed(st.session_state['doc_text'])
File "/usr/local/src/myscripts/utilities/utils.py", line 104, in chunk_and_embed
full_data['search_embeddings'] = get_embedding(text, engine)
File "/usr/local/lib/python3.9/site-packages/tenacity/init.py", line 289, in wrapped_f
return self(f, *args, **kw)
File "/usr/local/lib/python3.9/site-packages/tenacity/init.py", line 379, in call
do = self.iter(retry_state=retry_state)
File "/usr/local/lib/python3.9/site-packages/tenacity/init.py", line 326, in iter
raise retry_exc from fut.exception()

An option to delete document is not available

Hi there,

I don't see an option to delete the documents uploaded on the portal. Would you be able to add an option to delete from Document Viewer? Please advise.
I can see an option to delete from Index Management, but it is not working.

Thanks,
Arun

No new embeddings generated - no error

Hello,

I've been playing with this tool since the first release; however, recently, after having rebased my repository on the latest commits here, I can't seem to add any embeddings.

When I click on "Compute Embeddings", nothing happens. Or rather: the first time I restart everything from scratch the index gets created, and after that every other interaction finishes immediately without any error.

This is the output of the docker terminal (latest images) for trying to add 3 embeddings

image

I see no errors anywhere, do you have any suggestions on where to search for the issue?
Everything worked perfectly before the new pages were added, and all the models are deployed on Azure OpenAI

Translator module not working

Traceback (most recent call last): File "/usr/local/src/myscripts/OpenAI_Queries.py", line 39, in check_deployment llm_helper.translator.translate("This is a test", "it") File "/usr/local/src/myscripts/utilities/translator.py", line 26, in translate if (response[0]['language'] != language): KeyError: 0

We tested with the key provided in a test script; it's working.

InvalidRequestError: The API deployment for the resource does not exist.

InvalidRequestError: The API deployment for the resource does not exist.

I am getting the same error as the previous person. I do have all the deployments except text-babbage, but I am still receiving this error... Can you please help me?
Traceback:
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 561, in _run_script
self._session_state.on_script_will_rerun(rerun_data.widget_states)
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/safe_session_state.py", line 72, in on_script_will_rerun
self._state.on_script_will_rerun(latest_widget_states)
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 542, in on_script_will_rerun
self._call_callbacks()
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 555, in _call_callbacks
self._new_widget_state.call_callback(wid)
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 276, in call_callback
callback(*args, **kwargs)
File "/usr/local/src/myscripts/pages/10_Utils - Document_Summary.py", line 8, in summarize
_, response = utils.get_completion(get_prompt(), max_tokens=500, model=os.getenv('OPENAI_ENGINES', 'text-davinci-003'))
File "/usr/local/src/myscripts/utilities/utils.py", line 110, in get_completion
response = openai.Completion.create(
File "/usr/local/lib/python3.9/site-packages/openai/api_resources/completion.py", line 25, in create
return super().create(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
response, _, api_key = requestor.request(
File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 226, in request
resp, got_stream = self._interpret_response(result, stream)
File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 619, in _interpret_response
self._interpret_response_line(
File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 679, in _interpret_response_line
raise self.handle_error_response(

Question

Is Redis used as the vector database here?

embeddings persistence

I'm attempting to have the embeddings in the Redis (api) container persist across a restart.

Having mounted /data to a dir on localhost, I only ever see two dirs (/data/redis and /data/redisinsight). Neither of these seems to contain any data...

I've played around with adding --save config in docker-compose, but I am no Docker wizard and it looks like any config passed at compose time nukes the default config configured in the container.

Very possible I am misunderstanding how this should all hang together... but any advice would be welcome!

Translation model is not working

Please help me solve the translation model problem; are there any settings that need to be checked?
Before the deployment process I added KEY 1 and the text translation endpoint (https://api.cognitive.microsofttranslator.com/).
How can I fix this?

Message:

Translation model is not working.
Please check your Azure Translator key in the App Settings.
Then restart your application.

Traceback (most recent call last): File "/usr/local/src/myscripts/OpenAI_Queries.py", line 41, in check_deployment llm_helper.translator.translate("This is a test", "it") File "/usr/local/src/myscripts/utilities/translator.py", line 36, in translate if (response[0]['language'] != language): KeyError: 0

image

RetryError: RetryError[<Future at 0x724f009a5670 state=finished raised InvalidRequestError>]

Deployed to Azure. The deployment is successful and there are no exceptions in the deployment process.
OpenAI Queries
Traceback:
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
exec(code, module.__dict__)
File "/usr/local/src/myscripts/OpenAI_Queries.py", line 73, in
st.session_state['full_prompt'], st.session_state['response'] = utils.get_semantic_answer(df, question, st.session_state['prompt'] ,model=model, engine='davinci', limit_response=st.session_state['limit_response'], tokens_response=st.tokens_response, temperature=st.temperature)
File "/usr/local/src/myscripts/utilities/utils.py", line 49, in get_semantic_answer
res = search_semantic_redis(df, question, n=3, pprint=False, engine=engine)
File "/usr/local/src/myscripts/utilities/utils.py", line 34, in search_semantic_redis
embedding = get_embedding(search_query, engine= get_embeddings_model()['query'])
File "/usr/local/lib/python3.9/site-packages/tenacity/init.py", line 289, in wrapped_f
return self(f, *args, **kw)
File "/usr/local/lib/python3.9/site-packages/tenacity/init.py", line 379, in call
do = self.iter(retry_state=retry_state)
File "/usr/local/lib/python3.9/site-packages/tenacity/init.py", line 326, in iter
raise retry_exc from fut.exception()

Add Document
Add text to the knowledge base
RetryError: RetryError[<Future at 0x724f0269d700 state=finished raised InvalidRequestError>]
Traceback:
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 561, in _run_script
self._session_state.on_script_will_rerun(rerun_data.widget_states)
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/safe_session_state.py", line 72, in on_script_will_rerun
self._state.on_script_will_rerun(latest_widget_states)
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 542, in on_script_will_rerun
self._call_callbacks()
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 555, in _call_callbacks
self._new_widget_state.call_callback(wid)
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 276, in call_callback
callback(*args, **kwargs)
File "/usr/local/src/myscripts/pages/01_Add_Document.py", line 16, in embeddings
embeddings = utils.chunk_and_embed(st.session_state['doc_text'])
File "/usr/local/src/myscripts/utilities/utils.py", line 104, in chunk_and_embed
full_data['search_embeddings'] = get_embedding(text, engine)
File "/usr/local/lib/python3.9/site-packages/tenacity/init.py", line 289, in wrapped_f
return self(f, *args, **kw)
File "/usr/local/lib/python3.9/site-packages/tenacity/init.py", line 379, in call
do = self.iter(retry_state=retry_state)
File "/usr/local/lib/python3.9/site-packages/tenacity/init.py", line 326, in iter
raise retry_exc from fut.exception()

redis error

Hi, I'm receiving this error:

| redis.exceptions.ConnectionError: Error -2 connecting to your_redis_instance.your_region.redisenterprise.cache.azure.net:10000. Name or service not known.

Based on my understanding, this error suggests that a Redis setup is required in Azure. However, I couldn't find any information about this infrastructure in the project's readme. I'm also curious about the presence of a Redis Docker container in the setup, and how it relates to the Azure setup.

Regards.

loading doc/docx/ppt/pptx formats doesn't work

If I try to upload a document in "ppt"/"pptx" or "doc"/"docx" format, the system crashes (the error message is "application/vnd.openxmlformats-officedocument.wordprocessingml.document files are not allowed").
On the architecture diagram I see that "docx" is mentioned in the KB; is it a deployment error?
How can I load documents of these formats?

InvalidRequestError: The API deployment for the resource does not exist.

InvalidRequestError: The API deployment for the resource does not exist.
Traceback:
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 561, in _run_script
self._session_state.on_script_will_rerun(rerun_data.widget_states)
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/safe_session_state.py", line 72, in on_script_will_rerun
self._state.on_script_will_rerun(latest_widget_states)
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 542, in on_script_will_rerun
self._call_callbacks()
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 555, in _call_callbacks
self._new_widget_state.call_callback(wid)
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 276, in call_callback
callback(*args, **kwargs)
File "/usr/local/src/myscripts/pages/10_Utils - Document_Summary.py", line 8, in summarize
_, response = utils.get_completion(get_prompt(), max_tokens=500, model=os.getenv('OPENAI_ENGINES', 'text-davinci-003'))
File "/usr/local/src/myscripts/utilities/utils.py", line 109, in get_completion
response = openai.Completion.create(
File "/usr/local/lib/python3.9/site-packages/openai/api_resources/completion.py", line 25, in create
return super().create(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
response, _, api_key = requestor.request(
File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 227, in request
resp, got_stream = self._interpret_response(result, stream)
File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 620, in _interpret_response
self._interpret_response_line(
File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 680, in _interpret_response_line
raise self.handle_error_response(
streamlit-OpenAI_Queries-2023-02-06-18-02-08.webm

not support model 3.5 and 4

I have tested the gpt-35-turbo and gpt-4 models; are both of these models not supported? Or is only text-davinci-003 supported? Is there a need to change models in actual use in the future?

image

My Deployment Models

image

Check deployment after "Deploy to Azure"

The embedding check errors even though a text-embedding-ada-002 deployment exists on Azure. The test errors with both the V1 and V2 model.

Embedding model is not working. Please check you have a deployment name text-embedding-ada-002 in your Azure OpenAI resource https://xxx.openai.azure.com/. Then restart your application.

Traceback (most recent call last): File "/usr/local/src/myscripts/OpenAI_Queries.py", line 28, in check_deployment llm_helper.embeddings.embed_documents(texts=["This is a test"]) File "/usr/local/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 258, in embed_documents return self._get_len_safe_embeddings(texts, engine=self.document_model_name) File "/usr/local/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 194, in _get_len_safe_embeddings encoding = tiktoken.model.encoding_for_model(self.document_model_name) File "/usr/local/lib/python3.9/site-packages/tiktoken/model.py", line 51, in encoding_for_model raise KeyError( KeyError: 'Could not automatically map text-embedding-ada-002 to a tokeniser. Please use tiktok.get_encoding to explicitly get the tokeniser you expect.'
