Comments (1)
Hey @shenhai-ran! I'm here to assist you with any bugs, questions, or contributions while you wait for a human maintainer. Let's tackle this issue together!
To configure a self-signed certificate for the IngestionPipeline with the OpenAI API in LlamaIndex, you need to ensure that the HTTP client settings are applied to all components that call the API, including the embedding model inside the IngestionPipeline. Here's how you can do it:
- Configure the HTTP clients with the self-signed certificate: ensure that http_client and async_http_client are both configured with the path to your self-signed certificate.
- Pass the configured HTTP clients to the OpenAIEmbedding used in the IngestionPipeline: you need to explicitly pass http_client and async_http_client to the OpenAIEmbedding instance listed in the pipeline's transformations.
Here is an updated version of your code:
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding
import httpx
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.extractors import TitleExtractor
from llama_index.core.ingestion import IngestionPipeline
from llama_index.vector_stores.chroma import ChromaVectorStore
import chromadb
from llama_index.readers.file import PyMuPDFReader
# Configure HTTP clients with self-signed certificate
http_client = httpx.Client(verify="<path/to/certificate>")
async_http_client = httpx.AsyncClient(verify="<path/to/certificate>")
# Set the default LLM and embedding model with the configured HTTP clients
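Settings.llm = OpenAI(
    http_client=http_client,
    async_http_client=async_http_client,
)
# Pass both clients so that sync and async embedding calls use the certificate
Settings.embed_model = OpenAIEmbedding(
    http_client=http_client,
    async_http_client=async_http_client,
)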
# Load documents and create an index
documents = SimpleDirectoryReader("path/to/folder").load_data()
index = VectorStoreIndex.from_documents(documents, show_progress=True)
# Query the index
query_engine = index.as_query_engine()
response = query_engine.query("<my query>")
print(response)
# Configure Chroma vector store
chroma_client = chromadb.EphemeralClient()
chroma_collection = chroma_client.create_collection("ingestion_demo")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
# Create an IngestionPipeline with the configured OpenAIEmbedding
pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=300, chunk_overlap=100),
        TitleExtractor(),
        # Pass the configured HTTP clients
        OpenAIEmbedding(
            http_client=http_client,
            async_http_client=async_http_client,
        ),
    ],
    vector_store=vector_store,
)
# Load documents and run the pipeline
documents = SimpleDirectoryReader(
    "<path/to/folder>",
    required_exts=[".pdf"],
    file_extractor={".pdf": PyMuPDFReader()},
).load_data()
pipeline.run(documents=documents, show_progress=True)
index = VectorStoreIndex.from_vector_store(vector_store, show_progress=True)
vector_retriever = index.as_retriever(similarity_top_k=2)
By explicitly passing the configured http_client and async_http_client to the OpenAIEmbedding instance used in the IngestionPipeline, you ensure that the self-signed certificate is applied to the embedding requests, which should resolve the APIConnectionError you are encountering.
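If you would rather not repeat the certificate path for each client, one option is to build a single ssl.SSLContext and hand it to both httpx clients, since httpx's verify argument also accepts an SSLContext. The sketch below is only an illustration: build_ssl_context and make_http_clients are hypothetical helper names, not part of LlamaIndex or httpx. Note that, unlike passing only the certificate path, this context trusts the system CA store in addition to your certificate.

```python
import ssl
from pathlib import Path
from typing import Optional


def build_ssl_context(ca_path: Optional[str] = None) -> ssl.SSLContext:
    """Return an SSL context trusting the system CAs plus an optional extra CA file.

    Hypothetical helper for illustration only.
    """
    # Starts from the platform's default trusted CAs, with hostname
    # checking and certificate verification enabled.
    context = ssl.create_default_context()
    if ca_path is not None:
        # Fail early with a clear message instead of a low-level SSL error.
        if not Path(ca_path).is_file():
            raise FileNotFoundError(f"CA certificate not found: {ca_path}")
        # Additionally trust the self-signed certificate (PEM format).
        context.load_verify_locations(cafile=ca_path)
    return context


def make_http_clients(ca_path: Optional[str] = None):
    """Build a sync and an async httpx client that share one SSL context.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so build_ssl_context can be used without httpx installed.
    import httpx

    context = build_ssl_context(ca_path)
    return httpx.Client(verify=context), httpx.AsyncClient(verify=context)
```

The two returned clients can then be passed as http_client and async_http_client to OpenAI and OpenAIEmbedding exactly as in the code above, for example `http_client, async_http_client = make_http_clients("<path/to/certificate>")`.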