
Pinecone Python Client


The official Pinecone Python client.

For more information, see the docs at https://www.pinecone.io/docs/

Documentation

Upgrading your client

Upgrading from 4.x to 5.x

As part of an overall move to stop exposing generated code in the package's public interface, an obscure configuration property (openapi_config) was removed in favor of individual configuration options such as proxy_url, proxy_headers, and ssl_ca_certs. All of these properties were available in v3 and v4 releases of the SDK, with deprecation notices shown to affected users.

It is no longer necessary to install a separate plugin, pinecone-plugin-inference, to try out the Inference API; that plugin is now installed by default in the v5 SDK. See usage instructions below.

Older releases

  • Upgrading to 4.x: For this upgrade you are unlikely to be impacted by breaking changes unless you are using the grpc extras (see install steps below). Read full details in these v4 Release Notes.

  • Upgrading to 3.x: Many things were changed in the v3 client to pave the way for Pinecone's new Serverless index offering. These changes are covered in detail in the v3 Migration Guide. Serverless indexes are only available in 3.x release versions or greater.

Example code

Many of the brief examples in this README use very small vectors to keep the documentation concise, but most real-world usage involves much larger embedding vectors. To see more realistic examples of how this client can be used, explore our many Jupyter notebooks in the examples repository.

Prerequisites

The Pinecone Python client is compatible with Python 3.8 and greater.

Installation

There are two flavors of the Pinecone python client. The default client installed from PyPI as pinecone-client has a minimal set of dependencies and interacts with Pinecone via HTTP requests.

If you are aiming to maximize performance, you can install additional gRPC dependencies to access an alternate client implementation that relies on gRPC for data operations. See the guide on tuning performance.

Installing with pip

# Install the latest version
pip3 install pinecone-client

# Install the latest version, with extra grpc dependencies
pip3 install "pinecone-client[grpc]"

# Install a specific version
pip3 install pinecone-client==5.0.0

# Install a specific version, with grpc extras
pip3 install "pinecone-client[grpc]==5.0.0"

Installing with poetry

# Install the latest version
poetry add pinecone-client

# Install the latest version, with grpc extras
poetry add pinecone-client --extras grpc

# Install a specific version
poetry add pinecone-client==5.0.0

# Install a specific version, with grpc extras
poetry add pinecone-client==5.0.0 --extras grpc

Usage

Initializing the client

Before you can use the Pinecone SDK, you must sign up for an account and find your API key in the Pinecone console dashboard at https://app.pinecone.io.

Using environment variables

The Pinecone class is your main entry point into the Pinecone python SDK. If you have set your API Key in the PINECONE_API_KEY environment variable, you can instantiate the client with no other arguments.

from pinecone import Pinecone

pc = Pinecone() # This reads the PINECONE_API_KEY env var

Using configuration keyword params

If you prefer to pass configuration in code, for example if you have a complex application that needs to interact with multiple different Pinecone projects, the constructor accepts a keyword argument for api_key.

If you pass configuration this way, you have full control over which environment variable names to use, sidestepping conflicts that would arise if two different client instances both had to read the same PINECONE_API_KEY variable that the client implicitly checks for.

Configuration passed with keyword arguments takes precedence over environment variables.

import os
from pinecone import Pinecone

pc = Pinecone(api_key=os.environ.get('CUSTOM_VAR'))
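To make the multi-project pattern concrete, here is a small runnable sketch of resolving per-project API keys from custom environment variable names. The naming scheme (`PINECONE_API_KEY_<PROJECT>`) is an illustrative assumption, not an SDK convention.

```python
import os

# A minimal sketch: resolve a per-project API key from a custom env var name,
# so two client instances don't contend for one shared PINECONE_API_KEY.
# The PINECONE_API_KEY_<PROJECT> naming scheme is an assumption for this example.
def api_key_for_project(project: str) -> str:
    var_name = f"PINECONE_API_KEY_{project.upper()}"
    key = os.environ.get(var_name)
    if key is None:
        raise KeyError(f"Set {var_name} before constructing the client")
    return key

os.environ["PINECONE_API_KEY_STAGING"] = "key-123"  # for demonstration only
print(api_key_for_project("staging"))  # key-123
```

Each resolved key would then be passed to its own `Pinecone(api_key=...)` instance.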

Proxy configuration

If your network setup requires you to interact with Pinecone via a proxy, you will need to pass additional configuration using optional keyword parameters. These optional parameters are forwarded to urllib3, which is the underlying library currently used by the Pinecone client to make HTTP requests. You may find it helpful to refer to the urllib3 documentation on working with proxies while troubleshooting these settings.

Here is a basic example:

from pinecone import Pinecone

pc = Pinecone(
    api_key='YOUR_API_KEY',
    proxy_url='https://your-proxy.com'
)

pc.list_indexes()

If your proxy requires authentication, you can pass those values in a header dictionary using the proxy_headers parameter.

from pinecone import Pinecone
from urllib3 import make_headers

pc = Pinecone(
    api_key='YOUR_API_KEY',
    proxy_url='https://your-proxy.com',
    proxy_headers=make_headers(proxy_basic_auth='username:password')
)

pc.list_indexes()

Using proxies with self-signed certificates

By default the Pinecone Python client will perform SSL certificate verification using the CA bundle maintained by Mozilla in the certifi package.

If your proxy server is using a self-signed certificate, you will need to pass the path to the certificate in PEM format using the ssl_ca_certs parameter.

from pinecone import Pinecone
from urllib3 import make_headers

pc = Pinecone(
    api_key="YOUR_API_KEY",
    proxy_url='https://your-proxy.com',
    proxy_headers=make_headers(proxy_basic_auth='username:password'),
    ssl_ca_certs='path/to/cert-bundle.pem'
)

pc.list_indexes()

Disabling SSL verification

If you would like to disable SSL verification, you can pass the ssl_verify parameter with a value of False. We do not recommend going to production with SSL verification disabled.

from pinecone import Pinecone
from urllib3 import make_headers

pc = Pinecone(
    api_key='YOUR_API_KEY',
    proxy_url='https://your-proxy.com',
    proxy_headers=make_headers(proxy_basic_auth='username:password'),
    ssl_ca_certs='path/to/cert-bundle.pem',
    ssl_verify=False
)

pc.list_indexes()

Working with GRPC (for improved performance)

If you've followed instructions above to install with optional grpc extras, you can unlock some performance improvements by working with an alternative version of the client imported from the pinecone.grpc subpackage.

import os
from pinecone.grpc import PineconeGRPC

pc = PineconeGRPC(api_key=os.environ.get('PINECONE_API_KEY'))

# From here on, everything is identical to the REST-based client.
index = pc.Index(host='my-index-8833ca1.svc.us-east1-gcp.pinecone.io')

index.upsert(vectors=[])
index.query(vector=[...], top_k=10)
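When the goal is throughput, large upserts are usually split into batches on the client side before being sent. The following is a minimal, offline-runnable sketch of that batching; the batch size of 100 is an illustrative choice, not an SDK requirement.

```python
from typing import Iterator, List, Tuple

Vector = Tuple[str, List[float]]

# A minimal sketch of client-side batching for large upserts.
# The batch size of 100 is illustrative, not a requirement of the SDK.
def chunked(vectors: List[Vector], batch_size: int = 100) -> Iterator[List[Vector]]:
    for start in range(0, len(vectors), batch_size):
        yield vectors[start:start + batch_size]

vectors = [(f"vec{i}", [0.1 * i]) for i in range(250)]
batches = list(chunked(vectors))
print([len(b) for b in batches])  # [100, 100, 50]
```

Each batch would then be passed to `index.upsert(vectors=batch)` in turn.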

Indexes

Create Index

Create a serverless index

The following example creates a serverless index in the us-west-2 region of AWS. For more information on serverless and regional availability, see Understanding indexes.

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')
pc.create_index(
    name='my-index',
    dimension=1536,
    metric='euclidean',
    deletion_protection='enabled',
    spec=ServerlessSpec(
        cloud='aws',
        region='us-west-2'
    )
)

Create a pod index

The following example creates an index without a metadata configuration. By default, Pinecone indexes all metadata.

from pinecone import Pinecone, PodSpec

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')
pc.create_index(
    name="example-index",
    dimension=1536,
    metric="cosine",
    deletion_protection='enabled',
    spec=PodSpec(
        environment='us-west-2',
        pod_type='p1.x1'
    )
)

Pod indexes support many optional configuration fields. For example, the following creates an index that only indexes the "color" metadata field. Queries against this index cannot filter based on any other metadata field.

from pinecone import Pinecone, PodSpec

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

metadata_config = {
    "indexed": ["color"]
}

pc.create_index(
    "example-index-2",
    dimension=1536,
    spec=PodSpec(
        environment='us-west-2',
        pod_type='p1.x1',
        metadata_config=metadata_config
    )
)

List indexes

The following example returns all indexes in your project.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')
for index in pc.list_indexes():
    print(index['name'])

Describe index

The following example returns information about the index example-index.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

index_description = pc.describe_index("example-index")

Delete an index

The following example deletes the index named example-index. Only indexes which are not protected by deletion protection may be deleted.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

pc.delete_index("example-index")

Scale replicas

The following example changes the number of replicas for example-index.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

new_number_of_replicas = 4
pc.configure_index("example-index", replicas=new_number_of_replicas)

Configuring deletion protection

If you would like to enable deletion protection, which prevents an index from being deleted, the configure_index method also handles that via an optional deletion_protection keyword argument.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

# To enable deletion protection
pc.configure_index("example-index", deletion_protection='enabled')

# Disable deletion protection
pc.configure_index("example-index", deletion_protection='disabled')

# Call describe index to verify the configuration change has been applied
desc = pc.describe_index("example-index")
print(desc.deletion_protection)

Describe index statistics

The following example returns statistics about the index example-index.

import os
from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')
index = pc.Index(host=os.environ.get('INDEX_HOST'))

index_stats_response = index.describe_index_stats()
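The stats response is a nested mapping of namespaces to vector counts. As an offline illustration of reading it, the sketch below uses a hand-built dict mirroring the fields returned by describe_index_stats; the sample values are made up.

```python
# Sample data shaped like a describe_index_stats response
# (namespaces, vector_count, total_vector_count); values are made up.
sample_stats = {
    "dimension": 1536,
    "namespaces": {
        "example-namespace": {"vector_count": 5},
        "other-namespace": {"vector_count": 0},
    },
    "total_vector_count": 5,
}

# List the namespaces that actually contain vectors.
def non_empty_namespaces(stats: dict) -> list:
    return [ns for ns, info in stats["namespaces"].items() if info["vector_count"] > 0]

print(non_empty_namespaces(sample_stats))  # ['example-namespace']
```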

Upsert vectors

The following example upserts vectors to example-index.

import os
from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')
index = pc.Index(host=os.environ.get('INDEX_HOST'))

upsert_response = index.upsert(
    vectors=[
        ("vec1", [0.1, 0.2, 0.3, 0.4], {"genre": "drama"}),
        ("vec2", [0.2, 0.3, 0.4, 0.5], {"genre": "action"}),
    ],
    namespace="example-namespace"
)
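Vectors can also be expressed as dictionaries rather than tuples. The helper below converts the tuple form shown above into that shape; treat the exact dict keys (id, values, metadata) as something to confirm against the SDK documentation for your installed version.

```python
# Convert the (id, values, metadata) tuple form into a dict form.
# The key names here are an assumption to verify against the SDK docs.
def tuple_to_dict(vec):
    vec_id, values, metadata = vec
    return {"id": vec_id, "values": values, "metadata": metadata}

vectors = [
    ("vec1", [0.1, 0.2, 0.3, 0.4], {"genre": "drama"}),
    ("vec2", [0.2, 0.3, 0.4, 0.5], {"genre": "action"}),
]
print([tuple_to_dict(v) for v in vectors][0]["id"])  # vec1
```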

Query an index

The following example queries the index example-index with metadata filtering.

import os
from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

# Find your index host by calling describe_index
# or through the Pinecone web console
index = pc.Index(host=os.environ.get('INDEX_HOST'))

query_response = index.query(
    namespace="example-namespace",
    vector=[0.1, 0.2, 0.3, 0.4],
    top_k=10,
    include_values=True,
    include_metadata=True,
    filter={
        "genre": {"$in": ["comedy", "documentary", "drama"]}
    }
)

Delete vectors

The following example deletes vectors by ID.

import os
from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

# Find your index host by calling describe_index
# or through the Pinecone web console
index = pc.Index(host=os.environ.get('INDEX_HOST'))

delete_response = index.delete(ids=["vec1", "vec2"], namespace="example-namespace")

Fetch vectors

The following example fetches vectors by ID.

import os
from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

# Find your index host by calling describe_index
# or through the Pinecone web console
index = pc.Index(host=os.environ.get('INDEX_HOST'))

fetch_response = index.fetch(ids=["vec1", "vec2"], namespace="example-namespace")

Update vectors

The following example updates vectors by ID.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

# Find your index host by calling describe_index
# or through the Pinecone web console
index = pc.Index(host=os.environ.get('INDEX_HOST'))

update_response = index.update(
    id="vec1",
    values=[0.1, 0.2, 0.3, 0.4],
    set_metadata={"genre": "drama"},
    namespace="example-namespace"
)

List vectors

The list and list_paginated methods can be used to list vector IDs matching a particular ID prefix. With clever assignment of vector IDs, this can help model hierarchical relationships between vectors, such as when there are embeddings for multiple chunks or fragments of the same document.

The list method returns a generator that handles pagination on your behalf.

from pinecone import Pinecone

pc = Pinecone(api_key='xxx')
index = pc.Index(host='hosturl')

# To iterate over all result pages using a generator function
namespace = 'foo-namespace'
for ids in index.list(prefix='pref', limit=3, namespace=namespace):
    print(ids) # ['pref1', 'pref2', 'pref3']

    # Now you can pass this id array to other methods, such as fetch or delete.
    vectors = index.fetch(ids=ids, namespace=namespace)

There is also an option to fetch each page of results yourself with list_paginated.

from pinecone import Pinecone

pc = Pinecone(api_key='xxx')
index = pc.Index(host='hosturl')

# For manual control over pagination
results = index.list_paginated(
    prefix='pref',
    limit=3,
    namespace='foo',
    pagination_token='eyJza2lwX3Bhc3QiOiI5IiwicHJlZml4IjpudWxsfQ=='
)
print(results.namespace) # 'foo'
print([v.id for v in results.vectors]) # ['pref1', 'pref2', 'pref3']
print(results.pagination.next) # 'eyJza2lwX3Bhc3QiOiI5IiwicHJlZml4IjpudWxsfQ=='
print(results.usage) # { 'read_units': 1 }
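The manual-pagination control flow looks like the sketch below, which substitutes a stubbed page fetcher for index.list_paginated so it runs offline; the real method takes prefix, limit, namespace, and pagination_token as shown above.

```python
# Stubbed pages standing in for list_paginated responses: each entry maps a
# pagination token to (ids on that page, next token). Tokens are made up.
PAGES = {
    None: (["pref1", "pref2", "pref3"], "token-2"),
    "token-2": (["pref4", "pref5"], None),
}

def fetch_page(pagination_token=None):
    ids, next_token = PAGES[pagination_token]
    return {"ids": ids, "next": next_token}

# Loop until the service stops returning a next-page token.
all_ids, token = [], None
while True:
    page = fetch_page(pagination_token=token)
    all_ids.extend(page["ids"])
    token = page["next"]
    if token is None:
        break

print(all_ids)  # ['pref1', 'pref2', 'pref3', 'pref4', 'pref5']
```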

Collections

Create collection

The following example creates the collection example-collection from example-index.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

pc.create_collection(
    name="example-collection",
    source="example-index"
)

List collections

The following example returns a list of the collections in the current project.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

active_collections = pc.list_collections()

Describe a collection

The following example returns a description of the collection example-collection.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

collection_description = pc.describe_collection("example-collection")

Delete a collection

The following example deletes the collection example-collection.

from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

pc.delete_collection("example-collection")

Inference API

The Pinecone SDK now supports creating embeddings via the Inference API.

from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
model = "multilingual-e5-large"

# Embed documents
text = [
    "Turkey is a classic meat to eat at American Thanksgiving.",
    "Many people enjoy the beautiful mosques in Turkey.",
]
text_embeddings = pc.inference.embed(
    model=model,
    inputs=text,
    parameters={"input_type": "passage", "truncate": "END"},
)

# Upsert documents into Pinecone index

# Embed a query
query = ["How should I prepare my turkey?"]
query_embeddings = pc.inference.embed(
    model=model,
    inputs=query,
    parameters={"input_type": "query", "truncate": "END"},
)

# Send query to Pinecone index to retrieve similar documents
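To build intuition for what the index computes when retrieving similar documents, here is an offline sketch of cosine-similarity ranking; the tiny two-dimensional vectors are illustrative stand-ins for real model embeddings.

```python
import math

# Cosine similarity between two vectors, as used when an index is created
# with metric='cosine'. The vectors below are toy stand-ins for embeddings.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

passages = {"doc1": [1.0, 0.0], "doc2": [0.6, 0.8]}
query = [1.0, 0.0]

# Rank passages by similarity to the query, most similar first.
ranked = sorted(passages, key=lambda d: cosine(query, passages[d]), reverse=True)
print(ranked)  # ['doc1', 'doc2']
```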

Contributing

If you'd like to make a contribution, or get set up locally to develop the Pinecone Python client, please see our contributing guide.

pinecone-python-client's People

Contributors

abhinav-upadhyay, acatav, adamgs, austin-denoble, benjaminran, byronnlandry, chelseatroy, cherryleaf, daverigby, dependabot[bot], efung, fsxfreak, gdj0nes, haruska, hiradp, igiloh-pinecone, izeye, jackpertschuk, jhamon, loisaidasam, markmcd, miararoy, mjvankampen, mutayroei, pineconemachine, rajat08, tdonia, tomcsojn, yaakovs, yarden-slon


pinecone-python-client's Issues

[Bug] index.delete(delete_all=True) does not delete everything

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

After issuing index.delete(delete_all=True), I still see vectors in index.describe_index_stats(). I am new to pinecone and very well could just be doing something wrong.

Expected Behavior

I expect there to be 0 vectors after issuing a delete_all command

Steps To Reproduce

index = pinecone.GRPCIndex(index_name)

(Pdb) index.delete(delete_all=True)

(Pdb) index.describe_index_stats()
{'dimension': 1536,
'index_fullness': 0.1,
'namespaces': {'no-vectors': {'vector_count': 0},
'vectors': {'vector_count': 5},
'vectors-copy': {'vector_count': 0}},
'total_vector_count': 5}

Relevant log output

No response

Environment

- OS: OS X
- Python: 3.10.12
- pinecone:

pinecone-client==2.2.2

Additional Context

No response

Queries consistently getting "Connection Reset by Peer" after 30 mins of inactivity

I initialize a Pinecone index on process start. After 30 mins (probably less) of inactivity, I am consistently seeing the next request throw an error for "Connection reset by peer".

urllib3.exceptions.ProtocolError
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

Traceback (most recent call last)
File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 440, in _make_request
httplib_response = conn.getresponse(buffering=True)
During handling of the above exception, another exception occurred:
File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 449, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 444, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/lib/python3.8/http/client.py", line 1348, in getresponse
response.begin()
File "/usr/local/lib/python3.8/http/client.py", line 316, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python3.8/http/client.py", line 277, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/local/lib/python3.8/socket.py", line 669, in readinto
return self._sock.recv_into(b)
File "/usr/local/lib/python3.8/ssl.py", line 1241, in recv_into
return self.read(nbytes, buffer)
File "/usr/local/lib/python3.8/ssl.py", line 1099, in read
return self._sslobj.read(len, buffer)
During handling of the above exception, another exception occurred:
File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2091, in __call__
return self.wsgi_app(environ, start_response)
File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2076, in wsgi_app
response = self.handle_exception(e)
File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2073, in wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1518, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1516, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1502, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
File "/app/service.py", line 28, in search
response = faiss.query(embedding.tolist(), content['top_k'])
File "/app/faiss.py", line 32, in query
results = index.query(
File "/usr/local/lib/python3.8/site-packages/pinecone/core/utils/error_handling.py", line 17, in inner_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/pinecone/index.py", line 100, in query
return self._vector_api.query(
File "/usr/local/lib/python3.8/site-packages/pinecone/core/client/api_client.py", line 776, in __call__
return self.callable(self, *args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/pinecone/core/client/api/vector_operations_api.py", line 595, in __query
return self.call_with_http_info(**kwargs)
File "/usr/local/lib/python3.8/site-packages/pinecone/core/client/api_client.py", line 838, in call_with_http_info
return self.api_client.call_api(
File "/usr/local/lib/python3.8/site-packages/pinecone/core/client/api_client.py", line 413, in call_api
return self.__call_api(resource_path, method,
File "/usr/local/lib/python3.8/site-packages/pinecone/core/client/api_client.py", line 200, in __call_api
response_data = self.request(
File "/usr/local/lib/python3.8/site-packages/pinecone/core/client/api_client.py", line 459, in request
return self.rest_client.POST(url,
File "/usr/local/lib/python3.8/site-packages/pinecone/core/client/rest.py", line 271, in POST
return self.request("POST", url,
File "/usr/local/lib/python3.8/site-packages/pinecone/core/client/rest.py", line 157, in request
r = self.pool_manager.request(
File "/usr/local/lib/python3.8/site-packages/urllib3/request.py", line 78, in request
return self.request_encode_body(
File "/usr/local/lib/python3.8/site-packages/urllib3/request.py", line 170, in request_encode_body
return self.urlopen(method, url, **extra_kw)
File "/usr/local/lib/python3.8/site-packages/urllib3/poolmanager.py", line 376, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 785, in urlopen
retries = retries.increment(
File "/usr/local/lib/python3.8/site-packages/urllib3/util/retry.py", line 550, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/usr/local/lib/python3.8/site-packages/urllib3/packages/six.py", line 769, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 449, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 444, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/lib/python3.8/http/client.py", line 1348, in getresponse
response.begin()
File "/usr/local/lib/python3.8/http/client.py", line 316, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python3.8/http/client.py", line 277, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/local/lib/python3.8/socket.py", line 669, in readinto
return self._sock.recv_into(b)
File "/usr/local/lib/python3.8/ssl.py", line 1241, in recv_into
return self.read(nbytes, buffer)
File "/usr/local/lib/python3.8/ssl.py", line 1099, in read
return self._sslobj.read(len, buffer)
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

index.query() missing 1 required positional argument: 'queries'

index.query appears to require an attribute ['queries'] that is not documented.

index = pinecone.Index(index_name='my-index')
index.query(vector=vector, top_k=1, include_values=True, include_metadata=True)

TypeError                                 Traceback (most recent call last)
/var/folders/l0/ktlckjc908l0dxxgj_ctm15r0000gn/T/ipykernel_78177/4103042064.py in <module>
----> 1 index.query(vector=vector, top_k=1, include_values=True, include_metadata=True)

/usr/local/lib/python3.9/site-packages/pinecone/core/utils/sentry.py in inner_func(*args, **kwargs)
     21     def inner_func(*args, **kwargs):
     22         try:
---> 23             return func(*args, **kwargs)
     24         except Exception as e:
     25             init_sentry()

/usr/local/lib/python3.9/site-packages/pinecone/core/utils/error_handling.py in inner_func(*args, **kwargs)
     15         Config.validate()  # raises exceptions in case of invalid config
     16         try:
---> 17             return func(*args, **kwargs)
     18         except MaxRetryError as e:
     19             if isinstance(e.reason, ProtocolError):

TypeError: query() missing 1 required positional argument: 'queries'

Is there any schema in the Pinecone index? How to access the schema of the Pinecone index

Is this your first time submitting a feature request?

  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing client functionality

Describe the feature

Is there any schema there in pinecone index. How to access the schema of the pinecone index

Describe alternatives you've considered

Is there any schema there in pinecone index. How to access the schema of the pinecone index

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

No response

[Bug] Error loading documents to Pinecone

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Throws an error when loading document vectors into Pinecone.

Expected Behavior

I was expecting document vectors to be added to the index.

Steps To Reproduce

  1. Clone this repo https://github.com/ricardopinto/pinecone-test
  2. Update the code in qa_docs.py and add your API key, region, and index name (5120 dimensions)
  3. Follow the instructions in the readme to download the model and start the app

Relevant log output

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 398, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.10/dist-packages/urllib3/connection.py", line 244, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/usr/lib/python3.10/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1076, in _send_output
    self.send(chunk)
  File "/usr/lib/python3.10/http/client.py", line 998, in send
    self.sock.sendall(data)
  File "/usr/lib/python3.10/ssl.py", line 1237, in sendall
    v = self.send(byte_view[count:])
  File "/usr/lib/python3.10/ssl.py", line 1206, in send
    return self._sslobj.write(data)
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2396)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/qa_docs.py", line 27, in <module>
    docsearch = Pinecone.from_documents(texts, embeddings, index_name=index_name)
  File "/usr/local/lib/python3.10/dist-packages/langchain/vectorstores/base.py", line 218, in from_documents
    return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/langchain/vectorstores/pinecone.py", line 246, in from_texts
    index.upsert(vectors=list(to_upsert), namespace=namespace)
  File "/usr/local/lib/python3.10/dist-packages/pinecone/core/utils/error_handling.py", line 17, in inner_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pinecone/index.py", line 147, in upsert
    return self._upsert_batch(vectors, namespace, _check_type, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pinecone/index.py", line 231, in _upsert_batch
    return self._vector_api.upsert(
  File "/usr/local/lib/python3.10/dist-packages/pinecone/core/client/api_client.py", line 776, in __call__
    return self.callable(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pinecone/core/client/api/vector_operations_api.py", line 956, in __upsert
    return self.call_with_http_info(**kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pinecone/core/client/api_client.py", line 838, in call_with_http_info
    return self.api_client.call_api(
  File "/usr/local/lib/python3.10/dist-packages/pinecone/core/client/api_client.py", line 413, in call_api
    return self.__call_api(resource_path, method,
  File "/usr/local/lib/python3.10/dist-packages/pinecone/core/client/api_client.py", line 200, in __call_api
    response_data = self.request(
  File "/usr/local/lib/python3.10/dist-packages/pinecone/core/client/api_client.py", line 459, in request
    return self.rest_client.POST(url,
  File "/usr/local/lib/python3.10/dist-packages/pinecone/core/client/rest.py", line 271, in POST
    return self.request("POST", url,
  File "/usr/local/lib/python3.10/dist-packages/pinecone/core/client/rest.py", line 157, in request
    r = self.pool_manager.request(
  File "/usr/local/lib/python3.10/dist-packages/urllib3/request.py", line 78, in request
    return self.request_encode_body(
  File "/usr/local/lib/python3.10/dist-packages/urllib3/request.py", line 170, in request_encode_body
    return self.urlopen(method, url, **extra_kw)
  File "/usr/local/lib/python3.10/dist-packages/urllib3/poolmanager.py", line 376, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 815, in urlopen
    return self.urlopen(
  File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 815, in urlopen
    return self.urlopen(
  File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 815, in urlopen
    return self.urlopen(
  File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.10/dist-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='test-7ae06e9.svc.asia-northeast1-gcp.pinecone.io', port=443): Max retries exceeded with url: /vectors/upsert (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:2396)')))

Environment

- OS: Ubuntu 22.04 host, docker image from ubuntu 22.04
- Python: 3.10.6
- pinecone: 2.2.1

Additional Context

reproducible code: https://github.com/ricardopinto/pinecone-test

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior
I upgraded Pinecone to v3.0.0 and also installed the gRPC client.

When I run the following command:

pinecone.init(api_key="API_KEY", environment="ENV")

I get the following error:

AttributeError: module 'pinecone' has no attribute 'init'

Expected Behavior
The same code previously ran in Google Colab with no issue, but today it suddenly started failing.

Methods Tried:
Researching further, some search results suggested re-installing pinecone-client and installing v3.0.0, but neither solution worked.

Environment

- Google Colab:
- Python: 3.10.12
- pinecone: 3.0.0
- openai: 0.27.7
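For context, pinecone.init() was removed in the 3.0.0 release, which Colab now picks up by default; the v3 SDK exposes a Pinecone class instead. A small sketch (using stand-in modules, not the real package) of how to tell which API surface an installed pinecone module exposes:

```python
import types

def detect_api(module):
    """Return which Pinecone SDK surface the given module exposes."""
    if hasattr(module, "init"):
        return "v2"   # v2-style: pinecone.init(api_key=..., environment=...)
    if hasattr(module, "Pinecone"):
        return "v3"   # v3-style: pc = pinecone.Pinecone(api_key=...)
    return "unknown"

# Simulated modules for illustration (stand-ins for the real package):
v2_like = types.SimpleNamespace(init=lambda **kw: None)
v3_like = types.SimpleNamespace(Pinecone=object)
print(detect_api(v2_like))  # v2
print(detect_api(v3_like))  # v3
```

If detect_api reports "v3", the fix is to follow the v3 Migration Guide linked above rather than calling init().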

Deprecate Numpy

I'm requesting that numpy be removed as a dependency. It is barely used within the repository, yet it adds a surprisingly large dependency for what is essentially an API wrapper. Anyone deploying an AWS Lambda function that uses the Pinecone client has to create a custom numpy layer just to support the client.

PineconeProtocolError: Failed to connect; did you specify the correct index name?

After creating an index, running index.describe_index_stats() raises PineconeProtocolError: Failed to connect; did you specify the correct index name?
pinecone.list_indexes() also returns an empty list []. But index creation appeared successful using:

pinecone.init(api_key="key", environment="env") 
index = pinecone.Index("TAPAS2")
index

The above code returned <pinecone.index.Index at 04e2e0>
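One likely culprit, noted here as an assumption rather than a confirmed diagnosis: Pinecone index names are restricted to lowercase alphanumeric characters and hyphens, so a name like "TAPAS2" cannot match an existing index, and pinecone.Index(...) does not validate the name up front, which is why the object constructs fine and only the first data-plane call fails. A quick client-side check:

```python
import re

# Assumed naming rule: lowercase alphanumerics and hyphens, no leading/trailing hyphen.
VALID_INDEX_NAME = re.compile(r"^[a-z0-9]([a-z0-9-]*[a-z0-9])?$")

def check_index_name(name):
    """Fail fast on an index name the service would never accept."""
    if not VALID_INDEX_NAME.match(name):
        raise ValueError(f"invalid index name: {name!r}")
    return name

print(bool(VALID_INDEX_NAME.match("tapas2")))  # True
print(bool(VALID_INDEX_NAME.match("TAPAS2")))  # False
```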

DeprecationWarning: HTTPResponse.getheader() is deprecated

I constantly get the following error message:

/path-to-environment/lib/python3.8/site-packages/pinecone/core/client/rest.py:45: DeprecationWarning: HTTPResponse.getheader() is deprecated and will be removed in urllib3 v2.1.0. Instead use HTTPResponse.headers.get(name, default).

I can see in the source code the following:

class RESTResponse(io.IOBase):
    # ...
    def getheader(self, name, default=None):
        """Returns a given response header."""
        return self.urllib3_response.getheader(name, default)

The warnings are very frequent, and the fix at first glance seems rather easy.

Lib Version
2.1.0

OS/Hardware
Mac OS Monterey 12.0.1 / Apple M1 Macbook Pro

[Bug] Upsert fails with object of type 'int' has no len()

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I am calling upsert with a batch_size of 100. Each batch has tuples of (id, embedding_values, metadata_dict).
This fails with the error:

    len(input_values) > current_validations['max_length'])
TypeError: object of type 'int' has no len()

Expected Behavior

Upsert should work correctly and store my embeddings.

Steps To Reproduce

index.upsert(vectors=[(0, (1.0, 0.2), {'text': 'test text'})])

Relevant log output

Traceback (most recent call last):
  File "my-project/embed.py", line 140, in <module>
    index.upsert(vectors=to_upsert, batch_size=len(meta_batch), show_progress=True)
  File "my-project/venv/lib/python3.10/site-packages/pinecone/core/utils/error_handling.py", line 17, in inner_func
    return func(*args, **kwargs)
  File "my-project/venv/lib/python3.10/site-packages/pinecone/index.py", line 155, in upsert
    batch_result = self._upsert_batch(vectors[i:i + batch_size], namespace, _check_type, **kwargs)
  File "my-project/venv/lib/python3.10/site-packages/pinecone/index.py", line 233, in _upsert_batch
    vectors=list(map(_vector_transform, vectors)),
  File "my-project/venv/lib/python3.10/site-packages/pinecone/index.py", line 226, in _vector_transform
    return Vector(id=id, values=values, metadata=metadata or {}, _check_type=_check_type)
  File "my-project/venv/lib/python3.10/site-packages/pinecone/core/client/model_utils.py", line 49, in wrapped_init
    return fn(_self, *args, **kwargs)
  File "my-project/venv/lib/python3.10/site-packages/pinecone/core/client/model/vector.py", line 280, in __init__
    self.id = id
  File "my-project/venv/lib/python3.10/site-packages/pinecone/core/client/model_utils.py", line 188, in __setattr__
    self[attr] = value
  File "my-project/venv/lib/python3.10/site-packages/pinecone/core/client/model_utils.py", line 488, in __setitem__
    self.set_attribute(name, value)
  File "my-project/venv/lib/python3.10/site-packages/pinecone/core/client/model_utils.py", line 170, in set_attribute
    check_validations(
  File "my-project/venv/lib/python3.10/site-packages/pinecone/core/client/model_utils.py", line 908, in check_validations
    len(input_values) > current_validations['max_length']):
TypeError: object of type 'int' has no len()
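The traceback ends in a length validation being applied to the vector id, which suggests the int id 0 in the tuple above is the trigger: vector ids are strings, and calling len() on an int raises exactly this TypeError. A stand-in illustrating the failure and the workaround (the 512 limit is a placeholder, not the real validation value):

```python
def validate_id(vector_id):
    """Mimics a generated-model length check that assumes a string id."""
    max_length = 512  # hypothetical limit, for illustration only
    if len(vector_id) > max_length:  # raises TypeError when vector_id is an int
        raise ValueError("id too long")
    return vector_id

# An int id reproduces the confusing TypeError from the report:
try:
    validate_id(0)
except TypeError as exc:
    print(exc)  # object of type 'int' has no len()

# Casting the id to str avoids it:
validate_id(str(0))
```

So index.upsert(vectors=[("0", (1.0, 0.2), {'text': 'test text'})]) would sidestep the crash, though a clearer error message from the client would still be warranted.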


Environment

- OS: Mac 13.2.1
- Python: 3.10
- pinecone: 2.2.1

Additional Context

No response

[Bug] Importing Pinecone is Making Network Calls that Don't Timeout

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When importing the pinecone client in a docker container lambda on aws in a public subnet (not able to connect to the internet), importing the pinecone client stalls the entire initialization of the container.

Expected Behavior

Import without stalling the entire container.

Steps To Reproduce

Steps assume you have aws cdk cli tool installed and account configured

  1. Download zip and unzip
  2. cd into directory
  3. in app.py replace the ACCOUNT_NUMBER placeholders (account and region) with appropriate values
  4. run cdk deploy --require-approval never
  5. navigate to the AWS console
  6. Invoke the public lambda with URL in the console (without internet connectivity) and observe that the container stalls
  7. Invoke the private lambda with URL in the console (with internet access in the private subnet) and observe that the import works

Full Test Zip:
test-pinecone.zip

Relevant log output

Public Subnet Lambda (without internet connectivity)

2023-07-08T06:49:52.852-06:00	2023-07-08 12:49:52.852 | INFO | main:<module>:2 - Importing libraries

2023-07-08T06:50:05.122-06:00	2023-07-08 12:50:05.122 | INFO | main:<module>:2 - Importing libraries

2023-07-08T06:51:32.647-06:00	EXTENSION Name: lambda-adapter State: Ready Events: []

2023-07-08T06:51:32.647-06:00	START RequestId: 4bd8c7b9-70ff-4ab1-b98c-cfbd98b7e80f Version: $LATEST

2023-07-08T06:51:32.663-06:00	2023-07-08T12:51:32.663Z 4bd8c7b9-70ff-4ab1-b98c-cfbd98b7e80f Task timed out after 90.11 seconds

2023-07-08T06:51:32.663-06:00	END RequestId: 4bd8c7b9-70ff-4ab1-b98c-cfbd98b7e80f

2023-07-08T06:51:32.663-06:00	REPORT RequestId: 4bd8c7b9-70ff-4ab1-b98c-cfbd98b7e80f Duration: 90106.73 ms Billed Duration: 90000 ms Memory Size: 128 MB Max Memory Used: 24 MB XRAY TraceId: 1-64a95b6f-41ce4c1d62aaa3d621c5dbd2 SegmentId: 60fe0233117e66e3 Sampled: true

2023-07-08T06:51:32.878-06:00	2023-07-08 12:51:32.878 | INFO | main:<module>:2 - Importing libraries

Private subnet lambda with internet access:

2023-07-08T06:49:36.040-06:00	2023-07-08 12:49:36.040 | INFO | main:<module>:2 - Importing libraries

2023-07-08T06:49:43.210-06:00	2023-07-08 12:49:43.210 | INFO | main:create_app:8 - Creating app

2023-07-08T06:49:43.211-06:00	2023-07-08 12:49:43.210 | INFO | main:create_app:10 - App created

2023-07-08T06:49:43.211-06:00	INFO: Started server process [12]

2023-07-08T06:49:43.211-06:00	INFO: Waiting for application startup.

2023-07-08T06:49:43.212-06:00	INFO: Application startup complete.

2023-07-08T06:49:43.212-06:00	INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)

2023-07-08T06:49:43.221-06:00	INFO: 127.0.0.1:39946 - "GET / HTTP/1.1" 404 Not Found

2023-07-08T06:49:43.222-06:00	EXTENSION Name: lambda-adapter State: Ready Events: []

2023-07-08T06:49:43.225-06:00	START RequestId: c356ce53-8eff-4e6f-9414-a4ff7d601300 Version: $LATEST

2023-07-08T06:49:43.232-06:00	INFO: 71.195.207.116:0 - "GET / HTTP/1.1" 404 Not Found

2023-07-08T06:49:43.234-06:00	END RequestId: c356ce53-8eff-4e6f-9414-a4ff7d601300

2023-07-08T06:49:43.234-06:00	REPORT RequestId: c356ce53-8eff-4e6f-9414-a4ff7d601300 Duration: 9.26 ms Billed Duration: 7756 ms Memory Size: 128

Environment

- OS: linux
- Python: 3.10 (also tested with 3.8)
- pinecone: 2.2.2 (also tested with 2.2.0 & 2.0.3)

Additional Context

I'm pretty sure the issue is coming from this code:

action_api = ActionAPI(host=config.controller_host, api_key=config.api_key)
try:
    whoami_response = action_api.whoami()
except requests.exceptions.RequestException:
    # proceed with default values; reset() may be called later w/ correct values
    whoami_response = WhoAmIResponse()

Which ultimately calls this guy:


import requests
from requests.exceptions import HTTPError


class BaseAPI:
    """Base class for HTTP API calls."""

    def __init__(self, host: str, api_key: str = None):
        self.host = host
        self.api_key = api_key

    @property
    def headers(self):
        return {"api-key": self.api_key}

    def _send_request(self, request_handler, url, **kwargs):
        response = request_handler('{0}{1}'.format(self.host, url), headers=self.headers, **kwargs)
        try:
            response.raise_for_status()
        except HTTPError as e:
            e.args = e.args + (response.text,)
            raise e
        return response.json()

    def get(self, url: str, params: dict = None):
        return self._send_request(requests.get, url, params=params)

    def post(self, url: str, json: dict = None):
        return self._send_request(requests.post, url, json=json)

    def patch(self, url: str, json: dict = None):
        return self._send_request(requests.patch, url, json=json)

    def delete(self, url: str):
        return self._send_request(requests.delete, url)

I tested this on normal docker containers (not with the web adapter that AWS offers, and I still got the same stall). However, I haven't tested with a normal python lambda, but that should be easy to do with a couple modifications to the code I provided.

To be completely honest, this bug was pretty frustrating for four reasons:

  1. Why is the client initiating network connections during import? That is completely unexpected behavior and should happen in the init method instead. It also complicates unit testing, since you should be able to unit test without external connections.
     This atypical behavior made the problem really hard to debug and led me to suspect a compatibility issue when loading the pinecone dependency, so I spent most of my time checking compatibility (Python version, base container image, pinecone version, other imports not playing well) instead of the root cause in the network call. This is why making network calls in imports is a bad idea.
  2. Even if this call were needed at import time (which it shouldn't be), why is there no timeout on the BaseAPI? That is a basic safeguard against issues like this.
  3. This was hard to debug because other libraries (such as langchain) import pinecone in their code, which made it appear that langchain was also incompatible.
  4. For applications where speed is important, making network calls in an import is not ideal (especially when there is a second network call in the init method).
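The standard remedy for point 1 is to defer any network work until first use. A minimal sketch of that pattern (hypothetical class and method names, not the client's actual API; the placeholder _fetch_whoami stands in for the real HTTP call):

```python
class LazyAPI:
    """Defers network work until first use instead of doing it at import time."""

    def __init__(self, host, api_key=None, timeout=5.0):
        self.host = host
        self.api_key = api_key
        self.timeout = timeout   # a real implementation would pass this to the HTTP library
        self._whoami = None      # nothing is fetched at construction time

    def whoami(self):
        if self._whoami is None:
            self._whoami = self._fetch_whoami()  # first call pays the cost
        return self._whoami

    def _fetch_whoami(self):
        # Placeholder for the real HTTP call; returns canned data so the
        # sketch runs offline.
        return {"user": "example"}

# Constructing the client makes no network calls, so importing a module that
# builds one cannot stall a container with no internet access:
api = LazyAPI("https://controller.example.pinecone.io")
print(api.whoami())
```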

I know the above may sound harsh, but please know that I really really love what you guys make and how you are empowering tons of people to get into search. I'm excited to continue using the products you all have developed. 🥳

Preview of files in zip:

app.py

from aws_cdk import (
    aws_lambda as _lambda,
    App,
    aws_ec2 as ec2,
    Stack,
    Environment,
    Duration,
)
from constructs import Construct

app: App = App()

env = Environment(account="ACCOUNT_NUMBER", region="ACCOUNT_NUMBER")

class TestStack(Stack):

    def __init__(self, scope: Construct, id_: str, **kwargs) -> None:
        super().__init__(scope, id_, **kwargs)
        vpc = ec2.Vpc(
            scope=self,
            id="pinecone-test-vpc",
            vpc_name="pinecone-test-vpc",
            max_azs=1,
            nat_gateways=1,
            subnet_configuration=[
                ec2.SubnetConfiguration(
                    name="public",
                    subnet_type=ec2.SubnetType.PUBLIC,
                    cidr_mask=24,
                ),
                ec2.SubnetConfiguration(
                    name="private",
                    subnet_type=ec2.SubnetType.PRIVATE_WITH_EGRESS,
                    cidr_mask=24,
                )
            ]

        )
        

        func: _lambda.DockerImageFunction = _lambda.DockerImageFunction(
            scope=self,
            id="pinecone-test-public-subnet",
            function_name="pinecone-test-public-subnet",
            tracing=_lambda.Tracing.ACTIVE,
            code=_lambda.DockerImageCode.from_image_asset("."),
            allow_public_subnet=True,
            vpc=vpc,
            vpc_subnets=ec2.SubnetSelection(subnet_type=ec2.SubnetType.PUBLIC),
            timeout=Duration.seconds(90),
        )
        func.add_function_url(
            auth_type=_lambda.FunctionUrlAuthType.NONE,
            cors=_lambda.FunctionUrlCorsOptions(
                allowed_headers=["*"],
                allowed_origins=["*"],
            ),
            invoke_mode=_lambda.InvokeMode.BUFFERED,
        )
        func_2: _lambda.DockerImageFunction = _lambda.DockerImageFunction(
            scope=self,
            id="pinecone-test-private-internet",
            function_name="pinecone-test-private-internet",
            tracing=_lambda.Tracing.ACTIVE,
            code=_lambda.DockerImageCode.from_image_asset("."),
            allow_public_subnet=True,
            vpc=vpc,
            vpc_subnets=ec2.SubnetSelection(subnet_type=ec2.SubnetType.PRIVATE_WITH_EGRESS),
            timeout=Duration.seconds(90),
        )
        func_2.add_function_url(
            auth_type=_lambda.FunctionUrlAuthType.NONE,
            cors=_lambda.FunctionUrlCorsOptions(
                allowed_headers=["*"],
                allowed_origins=["*"],
            ),
            invoke_mode=_lambda.InvokeMode.BUFFERED,
        )
            

stack = TestStack(app, "pinecone-test", env=env)

app.synth()

Dockerfile

FROM public.ecr.aws/docker/library/python:3.10-slim-buster AS build
COPY --from=public.ecr.aws/awsguru/aws-lambda-adapter:0.7.0 /lambda-adapter /opt/extensions/lambda-adapter
WORKDIR /var/task
COPY requirements.txt .
RUN pip install --no-cache-dir pip && pip install -r requirements.txt && find . -name '*.pyc' -delete
RUN pip install uvicorn
FROM build AS add-build-context
ENV PORT=8000
WORKDIR /var/task
COPY . .
CMD exec uvicorn --port $PORT --factory main:create_app

main.py

from loguru import logger
logger.info("Importing libraries")
import pinecone
from fastapi import FastAPI


def create_app():
    logger.info("Creating app")
    app = FastAPI()
    logger.info("App created")
    return app

requirements.txt

fastapi
pinecone-client

[Bug] I can't create an exe using pinecone

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I am trying to extract an exe from the code I wrote with Pinecone using pyinstaller.
pyinstaller --onefile main.py -n pinecone.exe

Expected Behavior

I get this error while creating exe

185920 INFO: Fixing EXE headers
186784 INFO: Building EXE from EXE-00.toc completed successfully.
Generating spec file...
Traceback (most recent call last):
  File "main.py", line 2, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "__init__.py", line 2, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "nodes\CreateIndex.py", line 7, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "pinecone\__init__.py", line 4, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 385, in exec_module
  File "pinecone\core\utils\constants.py", line 33, in <module>
  File "pinecone\core\utils\__init__.py", line 53, in get_environment
  File "pathlib.py", line 1134, in read_text
  File "pathlib.py", line 1119, in open
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\IKRAMM~1\\AppData\\Local\\Temp\\_MEI260042\\pinecone\\__environment__'
[25284] Failed to execute script 'main' due to unhandled exception!
2023/09/22 14:29:48 exit status 1

Steps To Reproduce

  • I wrote the following simple code and tried to create an exe with pyinstaller
  1. The code:

import pinecone

apiKey = "my_key"
environment = "my"


pinecone.init(api_key=apiKey, environment=environment)
a = pinecone.list_indexes()
print(a)
params = {}
params['name'] = "helloworld"
params['dimension'] = 8
#pinecone.create_index(name="helloworld", dimension=8, metric="euclidean")
pinecone.create_index(**params)
pinecone.describe_index("helloworld")
a = pinecone.list_indexes()
print(a)

  2. pyinstaller --onefile a.py -n test-pinecone.exe

Relevant log output

No response

Environment

- OS: windows 11
- Python: 3.10.11
- pinecone: 2.2.4
- pyinstaller: 5.13.2

Additional Context

No response

Add support for Poetry

I think we should add support for managing the Pinecone Python client locally via Poetry.

For what it's worth, the Python LangChain library does this and it makes it easier to develop against LangChain locally while:

  • Seeing your edits to the library code reflected immediately
  • Tracking all your changes in git so they can be easily contributed back via your fork.

[Feature] Being able to create multiple Client instances with different api_key

Is this your first time submitting a feature request?

  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing client functionality

Describe the feature

Hi, in my app I need to connect to two different Pinecone accounts with different api_keys. I can't find a way to do this with pinecone.init. Did I miss something? If not, can this be added as a feature?
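The underlying limitation is that pinecone.init stores configuration in module-level state, so there can only be one active api_key per process. A class-based client solves this by scoping configuration to the instance, which is the direction the v3 SDK later took with its Pinecone class. A minimal stand-in (not the real SDK) showing the pattern:

```python
class Client:
    """Instance-scoped client: each instance carries its own credentials."""

    def __init__(self, api_key):
        self.api_key = api_key

    def request_headers(self):
        # Each instance authenticates independently, so two accounts can
        # coexist in one process.
        return {"Api-Key": self.api_key}

a = Client("key-for-account-a")
b = Client("key-for-account-b")
print(a.request_headers() != b.request_headers())  # True
```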

Describe alternatives you've considered

No response

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

No response

[Bug] TypeError: __init__() missing 1 required positional argument: 'top_k'

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When trying to query my existing index, I get an error telling me that I am missing the "top_k" parameter, even though I am not missing it.

Expected Behavior

Query the Pinecone index

Steps To Reproduce

  1. In Python 3.8 using PyCharm, attempt to use basic query example.
def __init__(self, documents, embedding, model_name: str = "text-davinci-003", namespace: str = "example"):
        self.index = pinecone.Index(namespace)
        ...

def query(self):
        query_response = self.index.query(
            namespace=self.namespace,
            top_k=10,
            include_values=True,
            include_metadata=True,
        )
        return query_response

Relevant log output

Traceback (most recent call last):
  File "C:/Users/retai/PycharmProjects/earnings-ai/main.py", line 43, in <module>
    query = Index.index.query()
  File "C:\Users\retai\PycharmProjects\earnings-ai\venv\lib\site-packages\pinecone\core\utils\error_handling.py", line 17, in inner_func
    return func(*args, **kwargs)
  File "C:\Users\retai\PycharmProjects\earnings-ai\venv\lib\site-packages\pinecone\index.py", line 450, in query
    QueryRequest(
  File "C:\Users\retai\PycharmProjects\earnings-ai\venv\lib\site-packages\pinecone\core\client\model_utils.py", line 49, in wrapped_init
    return fn(_self, *args, **kwargs)
TypeError: __init__() missing 1 required positional argument: 'top_k'
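Note that the traceback shows the call site main.py line 43 invoking Index.index.query() with no arguments at all, which would explain the error even though the class method above passes top_k=10. A stand-in with the same required parameter (hypothetical signature, for illustration):

```python
def query(top_k, namespace=None, include_values=False, include_metadata=False):
    """top_k has no default, so omitting it raises a TypeError like the report's."""
    return {"top_k": top_k}

try:
    query()  # reproduces: missing 1 required positional argument: 'top_k'
except TypeError as exc:
    print(exc)

print(query(top_k=10)["top_k"])  # 10
```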

Environment

- OS: Windows 10
- Python: 3.8
- pinecone: 2.2.1

Additional Context

No response

[Bug] Connection Leaks in manage.py, IndexOperationsApi, VectorOperationsApi

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Your Python client has resource leaks that proved problematic...

Top issue: you leak 5 file descriptors per API call, never to be reclaimed, until the Python process inevitably crashes once it exceeds the maximum number of open file descriptors. Why does this happen? My analysis is below...

Sample output from lsof to illustrate the problem; this is one of 32,000 file descriptor leaks created in relatively short order using your product:

.... 10-20 thousand more above...
Python 60130 robert.buck 7221u IPv4 0xd946e14d9f16b61b 0t0 TCP 192.168.0.103:60745->ec2-18-213-200-10.compute-1.amazonaws.com:https (ESTABLISHED)
Python 60130 robert.buck 7222 PIPE 0x2727afc4fb51e58a 16384 ->0xd97beed562c859f2
Python 60130 robert.buck 7223 PIPE 0xd97beed562c859f2 16384 ->0x2727afc4fb51e58a
Python 60130 robert.buck 7224r PSXSEM 0t0 /mp-r4wqhiyk
Python 60130 robert.buck 7225r PSXSEM 0t0 /mp-vpwtcej6
.... 10-20 thousand more below...

I didn't know if it was your product or another product. So...

I instrumented Python's socket.socket API to find out what was going on, and created a command-line tool whose sole purpose is to hammer your API and see how it breaks. It took only seconds. This is (I suspect) related to other support events that were side-stepped by Pinecone.io support and attributed to service issues in AWS. I traced the issues back to your product via some socket instrumentation:

import logging
import socket
import traceback

def generate_stack_trace():
    stack = traceback.extract_stack()
    print(f"\nStack Trace:\n")
    for frame in stack:
        filename, line_number, function_name, code = frame
        print(f"File: {filename}, Line: {line_number}, Function: {function_name}, Code: {code}")
    print(f"\n-------\n\n")

class TraceSocket(socket.socket):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        logging.info(f"socket created: {self.getsockname()}")
        generate_stack_trace()

    def close(self) -> None:
        logging.info(f"socket closed: {self.getsockname()}")
        super().close()


socket.socket = TraceSocket

Analysis...

In a nutshell, your code in rest.py and api_client.py goes to great and appropriate lengths to ensure that connections can be reused and cleaned up when a client is disposed. But that's where the goodness ends. Other parts leak: the rest of the code neither takes advantage of Python's with-idiom nor takes responsibility for cleaning up after itself.

Let's discuss the problems in your code step by step...

First up: [manage.py]

def _get_api_instance():
    ...
    api_client = ApiClient(configuration=client_config)
    ...
    api_instance = IndexOperationsApi(api_client)
    return api_instance

...

# example call site, all call sites are completely broken...

def delete_index(name: str, timeout: int = None):
    ...
    api_instance = _get_api_instance()
    ...

Here, no cleanup of the connection pool is possible, and there is no opportunity for connection reuse, so it's slow as sludge because the connection setup is repeated again and again. Every call to any management API creates a new ApiClient, and by implication a new connection pool, and by implication a single solitary connection, which is never cleaned up and results in a leaked file descriptor.

At the OS level, this means you have two leaked semaphores, AND two leaked pipes, AND one leaked socket. Five file descriptor leaks per API call. That's a lot of file descriptors leaked.

The proper choice would have been to change manage.py into a proper management API that takes an ApiClient as a constructor argument to the ManagementApi object, so users can reuse the underlying pools. This is not much different from IndexOperationsApi, which does:

    def __init__(self, api_client=None):
        if api_client is None:
            api_client = ApiClient() # note this is a leak in all your existing APIs too!!!!
        self.api_client = api_client

Second up: [IndexOperationsApi, VectorOperationsApi]

Both APIs suffer the same issue; the line numbers where the leaks occur are listed at the end of this report. In a nutshell, the APIs permissively take an api_client as a constructor argument (good) but fall back to creating one if it wasn't provided (bad).

Why is that bad? Because the constructor defines OWNERSHIP of the objects it creates, in this case ApiClient, yet the OperationsApi takes no steps to clean up an ApiClient it did create, hence the leaks.

My suggestion would be to remove the is None block that creates an ApiClient on behalf of users, and require all users to create the client themselves, so it can be reused. As an alternative, introduce a new class property named owns_client, and add __enter__, __exit__, and close methods. In close, if owns_client is True, call close on the client. What to do in __enter__ and __exit__ should be plainly obvious (implement the with-idiom).

This will fix the leaks in these two APIs.
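The owns_client alternative described above can be sketched as follows, with a minimal stand-in ApiClient that just records whether close() was called:

```python
class ApiClient:
    """Stand-in for the generated ApiClient; tracks whether it has been closed."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

class IndexOperationsApi:
    """Ownership-aware API: only closes the client it created itself."""
    def __init__(self, api_client=None):
        self.owns_client = api_client is None   # we own it only if we made it
        self.api_client = api_client or ApiClient()

    def close(self):
        if self.owns_client:
            self.api_client.close()  # reclaim the descriptors we allocated

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()

# An internally created client is closed by the with-idiom:
with IndexOperationsApi() as api:
    pass
print(api.api_client.closed)  # True

# A caller-supplied client stays open for reuse across API objects:
shared = ApiClient()
with IndexOperationsApi(shared) as api2:
    pass
print(shared.closed)  # False
```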

Detailed List of Leaks

  • manage.py (5 leaks at line 56)
  • IndexOperationsApi (5 leaks at line 43)
  • VectorOperationsApi (5 leaks at line 50)

Expected Behavior

The product should NOT leak file descriptors.

Steps To Reproduce

Create a directory of 64,000 single-word documents. Ingest them concurrently (in parallel if possible, to speed things up), and watch the Python client fail (die/crash) because of leaked file descriptors.

You can monitor file descriptors in real time:

#!/bin/bash

# Get the PID of the process
PID=$(pgrep -f "/Library/Frameworks/Python.framework/Versions/3.8/Resources/Python.app/Contents/MacOS/Python /Users/robert.buck/ws/workspaces/library-import/src/testing/venv/bin/driver crush -c 1000")

# Continuously monitor the open file count
while true; do
    echo "Open files for PID $PID :"
    lsof -p $PID
    open_file_count=$(lsof -p $PID | wc -l | awk '{print $1}')
    echo "Open file count for PID $PID: $open_file_count"
    sleep 5  # Adjust the interval as needed
done

Relevant log output

Similar to:


.... 10-20 thousand more above...
Python 60130 robert.buck 7221u IPv4 0xd946e14d9f16b61b 0t0 TCP 192.168.0.103:60745->ec2-18-213-200-10.compute-1.amazonaws.com:https (ESTABLISHED)
Python 60130 robert.buck 7222 PIPE 0x2727afc4fb51e58a 16384 ->0xd97beed562c859f2
Python 60130 robert.buck 7223 PIPE 0xd97beed562c859f2 16384 ->0x2727afc4fb51e58a
Python 60130 robert.buck 7224r PSXSEM 0t0 /mp-r4wqhiyk
Python 60130 robert.buck 7225r PSXSEM 0t0 /mp-vpwtcej6
.... 10-20 thousand more below...

And...

WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa98a427af0>: Failed to establish a new connection: [Errno 16] Device or resource busy')': /vectors/upsert
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(OSError(24, 'Too many open files'))': /vectors/upsert
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa98a427310>: Failed to establish a new connection: [Errno 16] Device or resource busy')': /vectors/upsert
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa98a427e20>: Failed to establish a new connection: [Errno 24] Too many open files')': /vectors/upsert
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa98a4270a0>: Failed to establish a new connection: [Errno 24] Too many open files')': /vectors/upsert
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa98a4276a0>: Failed to establish a new connection: [Errno 24] Too many open files')': /vectors/upsert
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa97fc98880>: Failed to establish a new connection: [Errno 16] Device or resource busy')': /vectors/upsert
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa98dc96ac0>: Failed to establish a new connection: [Errno 16] Device or resource busy')': /vectors/upsert
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(OSError(24, 'Too many open files'))': /vectors/upsertINFO:root:Successfully shutdown the libtool import.Traceback (most recent call last):ย  File "/usr/local/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 446, in ssl_wrap_socketย  ย  context.load_verify_locations(ca_certs, ca_cert_dir, ca_cert_data)OSError: [Errno 24] Too many open files
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 791, in urlopen
    response = self._make_request(
  File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 492, in _make_request
    raise new_e
  File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 468, in _make_request
    self._validate_conn(conn)
  File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 1097, in _validate_conn
    conn.connect()
  File "/usr/local/lib/python3.8/site-packages/urllib3/connection.py", line 642, in connect
    sock_and_verified = _ssl_wrap_socket_and_match_hostname(
  File "/usr/local/lib/python3.8/site-packages/urllib3/connection.py", line 783, in _ssl_wrap_socket_and_match_hostname
    ssl_sock = ssl_wrap_socket(
  File "/usr/local/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 448, in ssl_wrap_socket
    raise SSLError(e) from e
urllib3.exceptions.SSLError: [Errno 24] Too many open files

The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/local/bin/libtool", line 11, in <module>
    load_entry_point('libtool==1.0.0', 'console_scripts', 'libtool')()
  File "/usr/local/lib/python3.8/site-packages/libtool/__main__.py", line 71, in main
    args.func(args, verbose)
  File "/usr/local/lib/python3.8/site-packages/libtool/command/load.py", line 8, in load_action
    loader.run()
  File "/usr/local/lib/python3.8/site-packages/libtool/core/loader.py", line 119, in run
    loop.run_until_complete(asyncio.gather(*writer_tasks, ))
  File "/usr/lib64/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.8/site-packages/libtool/core/loader.py", line 74, in writer
    await plugin_writer(self.config, record, metadata)
  File "/usr/local/lib/python3.8/site-packages/libtool/core/backend/pinecone/backend.py", line 48, in write
    Pinecone.from_texts(
  File "/usr/local/lib/python3.8/site-packages/langchain/vectorstores/pinecone.py", line 416, in from_texts
    pinecone.add_texts(
  File "/usr/local/lib/python3.8/site-packages/langchain/vectorstores/pinecone.py", line 149, in add_texts
    [res.get() for res in async_res]
  File "/usr/local/lib/python3.8/site-packages/langchain/vectorstores/pinecone.py", line 149, in <listcomp>
    [res.get() for res in async_res]
  File "/usr/lib64/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
  File "/usr/lib64/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/lib/python3.8/site-packages/pinecone/core/client/api_client.py", line 200, in __call_api
    response_data = self.request(
  File "/usr/local/lib/python3.8/site-packages/pinecone/core/client/api_client.py", line 459, in request
    return self.rest_client.POST(url,
  File "/usr/local/lib/python3.8/site-packages/pinecone/core/client/rest.py", line 271, in POST
    return self.request("POST", url,
  File "/usr/local/lib/python3.8/site-packages/pinecone/core/client/rest.py", line 157, in request
    r = self.pool_manager.request(
  File "/usr/local/lib/python3.8/site-packages/urllib3/_request_methods.py", line 118, in request
    return self.request_encode_body(
  File "/usr/local/lib/python3.8/site-packages/urllib3/_request_methods.py", line 217, in request_encode_body
    return self.urlopen(method, url, **extra_kw)
  File "/usr/local/lib/python3.8/site-packages/urllib3/poolmanager.py", line 443, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 875, in urlopen
    return self.urlopen(
  File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 875, in urlopen
    return self.urlopen(
  File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 875, in urlopen
    return self.urlopen(
  File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 845, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.8/site-packages/urllib3/util/retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='m01-8113f38.svc.us-east-1-aws.pinecone.io', port=443): Max retries exceeded with url: /vectors/upsert (Caused by SSLError(OSError(24, 'Too many open files')))


Environment

- OS: Mac, Red Hat Enterprise Linux, Any
- Python: 3.8.2
- pinecone: 2.2.4

Additional Context

Work-around: avoid the APIs that leak.
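A mitigation sketch (an assumption about where the descriptors leak, not a confirmed fix): construct the `Index` once and reuse its connection pool instead of re-creating clients inside the ingest loop. `get_index` is a hypothetical helper name.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def get_index(index_name: str):
    """Return one shared Index per name so its connection pool is reused."""
    import pinecone  # imported lazily; assumes pinecone.init() was already called
    return pinecone.Index(index_name)

# The hot loop now shares sockets instead of opening new ones per call:
# get_index("my-index").upsert(vectors=batch)
```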

[Feature] Support for Compatible dnspython Version for pinecone-client

Is this your first time submitting a feature request?

  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing client functionality

Describe the feature

I'm encountering a compatibility issue when using pinecone-client in conjunction with eventlet. The problem arises due to conflicting dependencies with dnspython versions.

I need to use eventlet==0.30.2 with gunicorn==20.1.0 because of breaking changes that were fixed upstream but never released.

eventlet requires dnspython version 1.16.0, while pinecone-client needs a version of dnspython greater than 2.0.0. As a result, it's currently not possible to use both libraries simultaneously.

I kindly request support for a compatible version of dnspython that can work harmoniously with both pinecone-client and eventlet. This would enable users to benefit from the features provided by both libraries without facing conflicts or incompatibilities.

By supporting a compatible dnspython version, it would provide developers with the flexibility to leverage the power of pinecone-client alongside eventlet or other libraries with similar dependencies.

I appreciate your attention to this matter and look forward to a resolution that allows for seamless integration of pinecone-client in various environments.

Describe alternatives you've considered

  • Deploy both services individually on different servers
  • Fork pinecone-client and strip the features that need dnspython > 1.16.0

I want to see if it can be done in the project before considering these alternatives.

Who will this benefit?

Users of eventlet, gunicorn and pinecone will benefit from this change. This kind of use should be standard on Azure environments since they use gunicorn servers by default.

Are you interested in contributing this feature?

Yes

Anything else?

No response

[Bug] Upsert fails with Invalid vector value passed: cannot interpret type <class 'float'>

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I am running into an issue where I cannot upsert vectors, seemingly regardless of how I create the embedding due to the following error:
ValueError: Invalid vector value passed: cannot interpret type <class 'float'>

Expected Behavior

I expect Pinecone to upsert the vector embeddings, along with any associated metadata.

Steps To Reproduce

I have tried using the code below (following code from the Pinecone-examples repo) to create the embeddings and upsert:

import pinecone
from sentence_transformers import SentenceTransformer

pinecone.init(api_key, environment='eu-west1-gcp')

if index_name not in pinecone.list_indexes():
    pinecone.create_index(index_name, dimension=384, metric='cosine')
index = pinecone.Index(index_name)

doc = {
    'text': 'Hello world this is a test',
    'source': 'user',
    'id': '1234'
}

model = SentenceTransformer('msmarco-MiniLM-L6-cos-v5', device='cpu')
vector = model.encode(doc['text'], show_progress_bar=True).tolist()
doc.pop('text')
index.upsert(name=index_name, id=doc['id'], vectors=vector, metadata=doc)
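For context (hedged, based on the 2.x client's documented usage rather than this repo's source): `upsert` takes a `vectors` list of `(id, values, metadata)` tuples or equivalent dicts, and has no `name`, `id`, or `metadata` keyword arguments, which is why a bare list of floats trips the type check. A sketch of reshaping the single document above, with `to_upsert_tuple` a hypothetical helper:

```python
def to_upsert_tuple(doc_id, values, metadata):
    """Wrap one embedding as the (id, values, metadata) triple upsert expects."""
    return (str(doc_id), list(values), dict(metadata))

# index.upsert(vectors=[to_upsert_tuple(doc['id'], vector, doc)])
```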

Relevant log output

$ python insert-test.py
Batches: 100%|██████████| 1/1 [00:00<00:00, 18.05it/s]
Traceback (most recent call last):
  File "/home/adam/Research/machine-learning/fable/insert-test.py", line 26, in <module>
    index.upsert(name=index_name, id=doc['id'], vectors=vector, metadata=doc)
  File "/home/adam/.local/lib/python3.10/site-packages/pinecone/core/utils/error_handling.py", line 17, in inner_func
    return func(*args, **kwargs)
  File "/home/adam/.local/lib/python3.10/site-packages/pinecone/index.py", line 147, in upsert
    return self._upsert_batch(vectors, namespace, _check_type, **kwargs)
  File "/home/adam/.local/lib/python3.10/site-packages/pinecone/index.py", line 233, in _upsert_batch
    vectors=list(map(_vector_transform, vectors)),
  File "/home/adam/.local/lib/python3.10/site-packages/pinecone/index.py", line 229, in _vector_transform
    raise ValueError(f"Invalid vector value passed: cannot interpret type {type(item)}")
ValueError: Invalid vector value passed: cannot interpret type <class 'float'>


Environment

- OS: Ubuntu 22.04
- Python: 3.10.6
- pinecone: 2.2.1

Additional Context

No response

Support returning the number of deleted vectors after a delete operation

Is this your first time submitting a feature request?

  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing client functionality

Describe the feature

Delete operation returns the number of deleted vectors.

Describe alternatives you've considered

No response

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

No response

Compatibility with Jina DocArray

I'm looking to use Pinecone as the vector database backend with Jina, but it doesn't seem to be supported natively in the DocArray repo yet (they support Milvus and others). Is this something Pinecone could help implement and merge into Jina? I opened an issue with them here.

[Bug] AttributeError: module 'pinecone' has no attribute 'GRPCIndex'

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I have pinecone upgraded to v2.2.4, and also installed the GRPC client.

When I run the following command:

index = pinecone.GRPCIndex(index_name)

I get the following error:

AttributeError: module 'pinecone' has no attribute 'GRPCIndex'

Expected Behavior

Previously when using v2.2.2, using the same pipeline worked.

Steps To Reproduce

Run Pinecone v2.2.4 with GRPC client.

Relevant log output

No response

Environment

- OS:
- Python: 3.10.11
- pinecone: 2.2.4

Additional Context

No response

How to insert text and image embeddings for Pinecone's multimodal feature

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I am aiming to build a multimodal app with a simple dataset that has images and text. I tried to implement it following this blog post, but it didn't work, because it seems the blog inserts only image embeddings but searches with text embeddings:

image_data_df["vector_id"] = image_data_df.index
image_data_df["vector_id"] = image_data_df["vector_id"].apply(str)
# Get all the metadata
final_metadata = []
for index in range(len(image_data_df)):
    final_metadata.append({
        'ID': index,
        'caption': image_data_df.iloc[index].caption,
        'image': image_data_df.iloc[index].image_url
    })
image_IDs = image_data_df.vector_id.tolist()
image_embeddings = [arr.tolist() for arr in image_data_df.img_embeddings.tolist()]
# Create the single list of dictionary format to insert
data_to_upsert = list(zip(image_IDs, image_embeddings, final_metadata))
# Upload the final data
my_index.upsert(vectors = data_to_upsert)
# Check index size for each namespace
my_index.describe_index_stats()

When I tried the code in the blog and queried as follows:

# Get the query text
text_query = image_data_df.iloc[10].caption
 
# Get the caption embedding
query_embedding = get_single_text_embedding(text_query).tolist()
 
# Run the query
my_index.query(query_embedding, top_k=4, include_metadata=True)

it returns nothing

Expected Behavior

Assume I have a dataframe with a caption as text, an image embedding, and a text embedding: how can I insert both into an index and query based on image or text?
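One possible layout (an assumption, not official guidance): if both embedding types share the index's dimension (as with CLIP-style models that embed images and text into one space), keep them in separate namespaces of the same index so a query can target either modality; if the dimensions differ, two indexes are needed. A small zipping helper (`build_upserts` is a hypothetical name):

```python
def build_upserts(ids, embeddings, metadatas):
    """Zip parallel lists into the (id, values, metadata) tuples upsert expects."""
    return [(str(i), list(e), m) for i, e, m in zip(ids, embeddings, metadatas)]

# my_index.upsert(vectors=build_upserts(image_IDs, image_embeddings, final_metadata),
#                 namespace="image")
# my_index.upsert(vectors=build_upserts(image_IDs, text_embeddings, final_metadata),
#                 namespace="text")
# my_index.query(query_embedding, top_k=4, include_metadata=True, namespace="text")
```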

Steps To Reproduce

Follow the blog post's Colab notebook and run all cells; you will notice the query part doesn't work.

Relevant log output

.

Environment

- OS: on google colab notebook
- Python: 
- pinecone: 2.2.4

Additional Context

.

init

AttributeError: module 'pinecone' has no attribute 'init'

glitch in the readme

The sentence "The following example deletes example-index" is inside the code blob rather than in the markdown text.

[Bug] api client multiprocessing usage incompatible with AWS lambda

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Using the python api client ultimately results in an invocation of _multiprocessing.SemLock, which is not implemented on AWS lambda using a python3.11 runtime. Related stackoverflow: https://stackoverflow.com/questions/34005930/multiprocessing-semlock-is-not-implemented-when-running-on-aws-lambda

Expected Behavior

Using the python API client should work without issue on AWS lambda.

index = pinecone.Index("xyz")
index.insert(...)

Steps To Reproduce

  1. create a docker container with pinecone-client[grpc]==2.2.4 installed, and a python script that uses the pinecone python client
  2. upload container to AWS ECR
  3. create AWS lambda using that container and invoke the lambda

Relevant log output

Here is roughly my stack trace (extracted from cloudwatch)

File "/var/lang/lib/python3.11/site-packages/langchain/schema/vectorstore.py", line 122, in add_documents
return self.add_texts(texts, metadatas, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lang/lib/python3.11/site-packages/langchain/vectorstores/pinecone.py", line 138, in add_texts
async_res = [
^
File "/var/lang/lib/python3.11/site-packages/langchain/vectorstores/pinecone.py", line 139, in <listcomp>
self._index.upsert(
File "/var/lang/lib/python3.11/site-packages/pinecone/core/utils/error_handling.py", line 17, in inner_func
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/var/lang/lib/python3.11/site-packages/pinecone/index.py", line 150, in upsert
return self._upsert_batch(vectors, namespace, _check_type, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lang/lib/python3.11/site-packages/pinecone/index.py", line 237, in _upsert_batch
return self._vector_api.upsert(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lang/lib/python3.11/site-packages/pinecone/core/client/api_client.py", line 776, in __call__
return self.callable(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lang/lib/python3.11/site-packages/pinecone/core/client/api/vector_operations_api.py", line 956, in __upsert
return self.call_with_http_info(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lang/lib/python3.11/site-packages/pinecone/core/client/api_client.py", line 838, in call_with_http_info
return self.api_client.call_api(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lang/lib/python3.11/site-packages/pinecone/core/client/api_client.py", line 421, in call_api
return self.pool.apply_async(self.__call_api, (resource_path,
^^^^^^^^^
File "/var/lang/lib/python3.11/site-packages/pinecone/core/client/api_client.py", line 107, in pool
self._pool = ThreadPool(self.pool_threads)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lang/lib/python3.11/multiprocessing/pool.py", line 930, in __init__
Pool.__init__(self, processes, initializer, initargs)
File "/var/lang/lib/python3.11/multiprocessing/pool.py", line 196, in __init__
self._change_notifier = self._ctx.SimpleQueue()
^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lang/lib/python3.11/multiprocessing/context.py", line 113, in SimpleQueue
return SimpleQueue(ctx=self.get_context())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lang/lib/python3.11/multiprocessing/queues.py", line 341, in __init__
self._rlock = ctx.Lock()
^^^^^^^^^^
File "/var/lang/lib/python3.11/multiprocessing/context.py", line 68, in Lock
return Lock(ctx=self.get_context())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lang/lib/python3.11/multiprocessing/synchronize.py", line 169, in __init__
SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)
File "/var/lang/lib/python3.11/multiprocessing/synchronize.py", line 57, in __init__
sl = self._semlock = _multiprocessing.SemLock(
^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: [Errno 38] Function not implemented


Environment

- OS: Linux
- Python: 3.11
- pinecone: 2.2.4

Additional Context

https://stackoverflow.com/questions/34005930/multiprocessing-semlock-is-not-implemented-when-running-on-aws-lambda

async_req not well documented, types incorrect, async/await support?

We are trying to run multiple api calls concurrently, we are using async_req for that.
Unfortunately the docs and types for this mode of operation are rather lacking.

Also an async/await variant would be very nice since ThreadPools do not play very nicely with async/await in general.
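For what it's worth, the pattern that seems intended (hedged; pieced together from examples rather than docs): with `async_req=True` each call returns a `multiprocessing.pool.ApplyResult`-style handle whose `.get()` blocks until the request completes, so concurrency comes from firing many calls before resolving any. A minimal collector sketch:

```python
def collect(async_results):
    """Resolve a list of async handles in order, re-raising the first error."""
    return [r.get() for r in async_results]

# handles = [index.upsert(vectors=batch, async_req=True) for batch in batches]
# responses = collect(handles)
```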

Is there a request rate limit for Pinecone?

When I try to send multiple upsert requests to Pinecone simultaneously using multithreading, I occasionally encounter connection timeouts, especially when dealing with large amounts of data. P.S. I'm using a free testing account; my program runs on Azure.

Env:

  • Python 3.10
  • OS: Ubuntu 20.04.5 LTS
  • pinecone-client: 2.2.1

When I executed the netstat -antp command, I found that a connection sending upsert requests to Pinecone was consistently blocked (the number of requests in the send queue never changed), and after waiting for a while, the console printed a connection timeout error.

(screenshot of the netstat output omitted)

After waiting for a while, I saw a connection timeout error printed to stdout (screenshots omitted).

Since this issue does not occur frequently, I would like to confirm whether I should implement rate limiting in my application.
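Since I couldn't find documented limits either, a client-side mitigation sketch (jittered exponential backoff around each upsert; nothing Pinecone-specific, and the retry/delay numbers are arbitrary assumptions):

```python
import random
import time

def with_backoff(fn, retries=5, base=0.5, cap=30.0):
    """Call fn(), retrying with jittered exponential backoff on any exception."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries; surface the error
            time.sleep(min(cap, base * 2 ** attempt) * random.random())

# with_backoff(lambda: index.upsert(vectors=batch))
```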

[Bug] Nodes with metadata not found for a short time after persisting in Pinecone

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Hello,

I'm having an issue with fetching vectors from an index with metadata filtering. Here's the flow:

  • I'm parsing some nodes and labelling them with metadata
  • I'm storing the nodes in Pinecone
  • Immediately after persisting the nodes I query the index for them; however, the search turns up empty

I know that the nodes are persisted correctly, but they're not available for retrieval for some period of time after persisting. I've fallen back to using sleep(30) after calling persist to work around this problem. 10 seconds seems not to be enough, but 30 gets the job done.

It seems plausible that the Pinecone backend takes a while to build proper indexes for new nodes, but I need a reliable way to tell when I can run my query. Some kind of API that lets me check whether the nodes are indexed yet would be great.
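A workaround I've seen suggested (a heuristic, not a guarantee of query visibility): poll `describe_index_stats()` until the reported vector count reflects the upsert, instead of a fixed sleep. A sketch with a hypothetical `wait_for_count` helper:

```python
import time

def wait_for_count(get_count, expected, timeout=60.0, interval=1.0):
    """Poll get_count() until it reaches `expected` or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_count() >= expected:
            return True
        time.sleep(interval)
    return False

# wait_for_count(lambda: index.describe_index_stats().total_vector_count, n_vectors)
```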

Expected Behavior

I can reliably fetch nodes after persisting them in Pinecone

Steps To Reproduce

Happy to submit a fully reproducible example if needed, just want to confirm that it's an actual bug first

Relevant log output

No response

Environment

- OS: MacOS
- Python: 3.11.6
- pinecone: 2.2.4

Additional Context

No response

`list_indexes()` takes `us-west1-gcp` as env by default

Hi all, thanks for the Python client.

I found a weird behavior using the package. It seems that the list_indexes() method doesn't use the environment specified in the pinecone.init() statement.

For this purpose, I used the default pinecone project (created at my account creation) which is us-east1-gcp by default.

Here is my code:

import pinecone

from .config import config


class PineconeHandler:
    def __init__(self):
        pinecone.init(api_key=config.pinecone_api_key, env=config.pinecone_env)
        self.index_name = config.pinecone_index
        if self.index_name not in pinecone.list_indexes():
            pinecone.create_index(self.index_name, config.embedding_dim)
        self.index = pinecone.Index(self.index_name)

This is the config class I use:

from os import getenv
from dotenv import load_dotenv

from pydantic import BaseSettings


class Config(BaseSettings):
    """Configuration for the application."""
    # Pinecone
    pinecone_api_key: str
    pinecone_env: str
    pinecone_index: str
    # LLM
    embedding_dim: int


load_dotenv()
config = Config(
    pinecone_api_key=getenv("PINECONE_API_KEY"),
    pinecone_env=getenv("PINECONE_ENV"),
    pinecone_index=getenv("PINECONE_INDEX"),
    embedding_dim=int(getenv("EMBEDDING_DIM")),
)

It could probably be avoided by changing PINECONE_ENV to PINECONE_ENVIRONMENT.

It seems to come from this line in the Python client code:
https://github.com/pinecone-io/pinecone-python-client/blob/main/pinecone/config.py#L64-71

Any hints on this would be much appreciated.
As a workaround, I have created a project based on us-west1-gcp :) .
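Worth noting (hedged, based on the behavior above rather than a documented contract): the keyword `init` recognizes is `environment=`, not `env=`, so the unrecognized keyword appears to be ignored and the client falls back to its default region. `init_kwargs` below is a hypothetical helper just to make the point concrete:

```python
import os

def init_kwargs():
    """Collect the kwargs pinecone.init recognizes for region selection."""
    return {
        "api_key": os.getenv("PINECONE_API_KEY"),
        "environment": os.getenv("PINECONE_ENVIRONMENT"),  # environment=, not env=
    }

# pinecone.init(**init_kwargs())
```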

[Feature] Add support for fetch using query filters

Is this your first time submitting a feature request?

  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing client functionality

Describe the feature

The fetch() method is very limited in the Python client: it only accepts the ids parameter, which means the user must maintain a map of ids and their metadata. The fetch method should be able to fetch data entities using a query.

Describe alternatives you've considered

Maintain a map of ids and metadata for the embeddings stored in Pinecone. However, this is not scalable and makes it almost impossible to build a production-ready app.
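A stopgap that is sometimes used until a real fetch-by-filter exists (an assumption, not an official API): issue a `query` with a throwaway vector, the metadata filter, and a large `top_k`, then read ids and metadata from the matches. `fetch_by_filter` is a hypothetical helper; note a zero vector may be rejected under the cosine metric, in which case a random unit vector would be needed.

```python
def fetch_by_filter(index, metadata_filter, dimension, top_k=1000):
    """Approximate fetch-by-filter via a filtered query on a dummy vector."""
    return index.query(
        vector=[0.0] * dimension,
        filter=metadata_filter,
        top_k=top_k,
        include_metadata=True,
    )
```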

Who will this benefit?

This will help in building a database with the following benefits:

  • Helps in deduping data entities, which is important for building RAG applications.
  • The user will not have to maintain a map of ids and embedding metadata, which can grow quickly as they add more data and slow down the application.

Are you interested in contributing this feature?

Yes, but I don't have enough time.

Anything else?

This is a basic functionality for fetching data from any db. Pinecone has much cleaner interface and empowering it with such functionality will help grow the users.

Normalizing SPLADE embeddings - a bad idea?

Hi!

Not sure if this repo is the right place to ask -- if not feel free to close and redirect me to the appropriate channels :)

I'm using SPLADE together with sentence-transformers/multi-qa-mpnet-base-cos-v1 SentenceTransformer to create hybrid embeddings for use in Pinecone's sparse-dense indexes.

The sparse-dense indexes can only use dotproduct similarity, which is why I chose a dense model trained with cosine similarity. This means I get back dense embeddings with L2 norm of 1 and dot product similarity in range [-1, 1] which I can easily rescale to the unit interval. Based on my somewhat limited understanding, this seems like a relatively sound approach to getting scores which our users can understand as % similarity (assuming in distribution).

After transitioning to sparse-dense vectors, I noticed that SPLADE does not produce normalized embeddings, which means this approach no longer works. I thought about normalizing the SPLADE embeddings, but I'm not sure how this would affect performance.

On a separate note, I'm using the convex combination logic:

# alpha in range [0, 1]
embedding.sparse.values = [
    value * (1 - alpha) for value in embedding.sparse.values
]
embedding.dense = [value * alpha for value in embedding.dense]

I am struggling to reason about how all of this interacts and what effect it has on ranking. See here for info on how score is calculated and here for more details about convex combination logic.
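The snippet above, packaged as a single helper for clarity (a sketch of the convex-combination approach as I understand it; `alpha=1` is pure dense, `alpha=0` pure sparse, and whether normalizing SPLADE first helps is exactly the open question):

```python
def hybrid_scale(dense, sparse, alpha):
    """Scale the dense vector by alpha and the sparse values by (1 - alpha)."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be in [0, 1]")
    scaled_sparse = {
        "indices": sparse["indices"],
        "values": [v * (1 - alpha) for v in sparse["values"]],
    }
    scaled_dense = [v * alpha for v in dense]
    return scaled_dense, scaled_sparse
```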

Any help understanding this stuff would be hugely appreciated!

Cheers!

Batch Query Options?

I'm currently multithreading requests to pinecone similar to this:

embeddings = await self.embedding_function(queries)
with ThreadPoolExecutor(max_workers=30) as executor:
    results = executor.map(
        self._similarity_search,
        embeddings,
        [k] * len(queries),
        [filter] * len(queries),
    )

Preferably I'm able to do something like this:

query1 = Query(vector=vector1, topk=k, metadata_filter=filter)
...
query10 = Query(vector=vector10, topk=k, metadata_filter=filter)
results = index.batch_query(queries=[query1...query10])

Is such a thing possible?
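As far as I can tell the HTTP API accepts one query vector per call, so batching has to happen client-side; a fan-out sketch close to the snippet above (`batch_query` is a hypothetical helper, not a client method):

```python
from concurrent.futures import ThreadPoolExecutor

def batch_query(run_query, vectors, max_workers=10):
    """Run one query per vector concurrently, returning results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as ex:
        return list(ex.map(run_query, vectors))

# results = batch_query(
#     lambda v: index.query(vector=v, top_k=k, filter=filter, include_metadata=True),
#     queries,
# )
```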

[Bug] Pinecone defaults to us-west-gcp when used in VSCode terminal

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

import pinecone

api_key="MY KEY"
env="asia-southeast1-gcp"

pinecone.init(api_key=api_key, enviroment="asia-southeast1-gcp")
pinecone.whoami()

results in HTTPError: ('401 Client Error: Unauthorized for url: https://controller.us-west1-gcp.pinecone.io/actions/whoami', 'API key is missing or invalid for the environment "us-west1-gcp". Check that the correct environment is specified.')

Expected Behavior

Should not try to address us-west1-gcp when asia-southeast1-gcp is specified

Steps To Reproduce

Running on Python 3.10.0 on Windows. Happens in Visual Studio Code (clean project, no virtual environment), in a Jupyter notebook, and when entering Python through PowerShell.

Relevant log output

HTTPError: ('401 Client Error: Unauthorized for url: https://controller.us-west1-gcp.pinecone.io/actions/whoami', 'API key is missing or invalid for the environment "us-west1-gcp". Check that the correct environment is specified.')

Environment

- OS:Windows 10
- Python:3.10.0
- pinecone:2.2.1

Additional Context

No response

[Feature] better way to disable show_progress

Is this your first time submitting a feature request?

  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing client functionality

Describe the feature

When using Pinecone from langchain's Pinecone.from_existing_index there is no way to disable the progress bar. Defaulting it to True is odd, because it is essentially verbose output.

Please make this False by default and add an option to turn it on if needed.

Describe alternatives you've considered

Patch the code

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

No response

pinecone.core.exceptions.PineconeException with simple examples

I have created a simple script from the readme to demonstrate that even a simple example is not working on my end, for reasons unknown. I have tested several examples with the same result. However, the JavaScript version worked perfectly on my end, but I need to work with Python.
I'm using Python 3.10.10 with poetry on a Windows machine. I have spent two days trying to figure out what happened, but I have had no luck.

import logging
import os

logging.basicConfig(level=logging.DEBUG)
from dotenv import load_dotenv

load_dotenv()

import pinecone


def main():

    pinecone.init(
        api_key=os.getenv("PINECONE_API_KEY"),
        environment=os.getenv("PINECONE_ENVIRONMENT")
    )

    index_name = "langchainjsfundamentals"
    print(pinecone.list_indexes())

    # ensure that index exists
    assert index_name in pinecone.list_indexes()

    index = pinecone.Index(index_name)  # or pinecone.GRPCIndex

    ########## ERROR IS HERE ########## 
    upsert_response = index.upsert(
        vectors=[
            ("vec1", [0.1, 0.2, 0.3, 0.4], {"genre": "drama"}),
            ("vec2", [0.2, 0.3, 0.4, 0.5], {"genre": "action"}),
        ],
        namespace="example-namespace"
    )
   ###################################### 

    print(upsert_response)


if __name__ == '__main__':
    main()

[tool.poetry.dependencies]
python = "^3.10"
click = "^8.1.3"
langchain = "^0.0.123"
python-dotenv = "^1.0.0"
pinecone-client = {extras = ["grpc"], version = "2.2.1"}
openai = "^0.27.2"
pypdf = "^3.7.0"
chromadb = "^0.3.13"
datasets = "^2.10.1"
....
    self._sslobj.do_handshake()
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "\example.py", line 39, in <module>
    main()
  File "example.py", line 27, in main
    upsert_response = index.upsert(
  File "\site-packages\pinecone\core\utils\error_handling.py", line 25, in inner_func
    raise PineconeProtocolError(f'Failed to connect; did you specify the correct index name?') from e
pinecone.core.exceptions.PineconeProtocolError: Failed to connect; did you specify the correct index name?
....

[Bug] Client initialisation unreliable with FastAPI

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Context:
I am building an LLM application using fastapi, langchain and pinecone.

Scenario:
I run the FastAPI application inside a docker container using gunicorn as a process manager.

Current Behavior:
The first 10-20 requests fail with the following error:

  File "/usr/local/lib/python3.11/site-packages/pinecone/index.py", line 57, in __init__
    openapi_client_config.api_key = openapi_client_config.api_key or {}
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'api_key'

Even though I run pinecone.init as a lifespan event, I believe the config is not updated due to the async nature of the application.
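As a workaround until the client guards against this itself, a call-once wrapper (stdlib only; a generic sketch, not part of the Pinecone API) ensures `pinecone.init` has completed before any request path touches the index:

```python
import threading

_init_lock = threading.Lock()
_init_done = False

def ensure_init(init_fn):
    # Run init_fn (e.g. a closure over pinecone.init) exactly once, even
    # when called concurrently from several request handlers or threads.
    global _init_done
    if _init_done:
        return
    with _init_lock:
        if not _init_done:
            init_fn()
            _init_done = True
```

Call `ensure_init(lambda: pinecone.init(...))` at the top of any handler that uses the index. Note that with gunicorn's pre-fork workers each process still needs its own init; this guard only covers threads within one worker.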

Expected Behavior

pinecone.init sets openapi_client_config upon first execution. The application is not expected to fail from the start.

Steps To Reproduce

NA

Relevant log output

No response

Environment

- OS: macos
- Python: 3.11
- pinecone: 2.2.2

Additional Context

No response

[Bug] Metadata string value returned as datetime.date

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

If you upload a metadata field with a certain format, for example the string "1002.6.3" , pinecone will return the datetime object: datetime.date(1002,6,2) instead of the string "1002.6.3"

I'm not certain of all the formats that lead to this behavior.

Expected Behavior

The returned metadata value should be the string "1002.6.3"

Steps To Reproduce

  1. upload a vector with metadata: {"exampleField": "1002.6.3"}
  2. query the database for that vector. You'll see a datetime.date object instead of a string

Relevant log output

No response

Environment

- OS: macOS Monterey (12.2.1)
- Python: 3.11.2
- pinecone: 2.2.1

Additional Context

Note that the value is stored correctly in Pinecone, but the SDK returns it as a datetime object.
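Until the deserializer stops coercing such strings, a defensive post-query normalization is possible. A sketch (it recovers an ISO string, not necessarily the exact original formatting):

```python
import datetime

def stringify_date_metadata(metadata):
    # Convert any datetime.date values the SDK produced back to strings.
    # Note: an original like "1002.6.3" cannot be recovered exactly,
    # only its ISO form.
    return {
        key: value.isoformat() if isinstance(value, datetime.date) else value
        for key, value in metadata.items()
    }
```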

Object of type QueryResponse is not JSON serializable

I'm unable to serialize the query response

response = index.query(vector=embedding, top_k=5, include_metadata=True)
json.dumps(response)

And I get Object of type QueryResponse is not JSON serializable.
Is there a recommendation on how to serialize this object?
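The generated response models expose a `to_dict()` method (the OpenAPI-generated accessor; verify it on your client version), so one common approach is a `default=` hook for `json.dumps`:

```python
import json

def pinecone_json_default(obj):
    # Fall back to to_dict() for generated models such as QueryResponse;
    # anything else is rendered as a string rather than raising.
    if hasattr(obj, "to_dict"):
        return obj.to_dict()
    return str(obj)

# usage: json.dumps(response, default=pinecone_json_default)
```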

[Bug] Python client connects to wrong environment

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Python client automatically tries to connect to "us-west1-gcp" even though the env variable is set to "asia-southeast1-gcp-free".

PINECONE_ENV=asia-southeast1-gcp-free

pinecone.core.client.exceptions.UnauthorizedException: (401)
Reason: Unauthorized
HTTP response headers: HTTPHeaderDict({'www-authenticate': 'API key is missing or invalid for the environment "us-west1-gcp". Check that the correct environment is specified.', 'content-length': '114', 'date': 'Sun, 20 Aug 2023 17:10:54 GMT', 'server': 'envoy'})
HTTP response body: API key is missing or invalid for the environment "us-west1-gcp". Check that the correct environment is specified.

Expected Behavior

It should connect to the asia-southeast1 server instead of trying to connect to us-west1.

Steps To Reproduce

Using poetry with openai, pinecone, langchain.
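One contributing factor worth checking: the 2.x client falls back to a default environment when it is not told otherwise, and it does not read a variable named PINECONE_ENV. Passing the value explicitly removes the ambiguity. A sketch (variable names follow this report):

```python
import os

def pinecone_init_kwargs():
    # Resolve the environment explicitly instead of relying on client
    # defaults; PINECONE_ENV is the variable this report sets, while the
    # client is generally documented to read PINECONE_ENVIRONMENT.
    env = os.environ.get("PINECONE_ENV") or os.environ.get("PINECONE_ENVIRONMENT")
    if not env:
        raise RuntimeError("No Pinecone environment configured")
    return {"api_key": os.environ["PINECONE_API_KEY"], "environment": env}

# usage: pinecone.init(**pinecone_init_kwargs())
```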

Relevant log output

No response

Environment

- OS: Ubuntu
- Python: 3.11.4
- pinecone: n/a

Additional Context

No response

[Feature] Make client available through conda-forge

Is this your first time submitting a feature request?

  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing client functionality

Describe the feature

Currently, the client can only be installed through pip. It would be great if the client could also be downloaded via conda.
The most popular way to do that is through conda-forge, an open-source platform that builds and distributes packages for conda. The process is easy and straightforward with tools such as grayskull.

Currently, it is not possible to do that, as the license does not allow redistribution.

Ref:

  • https://conda-forge.org
  • https://github.com/conda-forge
  • 3. Restrictions. The rights granted hereunder are subject to the following restrictions: (a) you shall not license, sell, rent, lease, transfer, assign, distribute, host, outsource, disclose or otherwise commercially exploit the Licensed Software or make the Licensed Software available to any third party (other than the entity on whose behalf you enter into this EULA); (b) you shall not modify, make derivative works of, disassemble, reverse compile or reverse engineer any part of the Licensed Software; (c) you shall not access the Licensed Software in order to build a similar or competitive product or service; (d) except as expressly stated herein, no part of the Licensed Software may be copied, reproduced, distributed, republished, downloaded, displayed, posted or transmitted in any form or by any means, including but not limited to electronic, mechanical, photocopying, recording or other means; and (e) any future release, update, or other addition to the functionality of the Licensed Software provided by Pinecone (if any) shall be subject to the terms of this EULA unless Pinecone expressly states otherwise. You shall preserve all copyright and other proprietary rights notices on the Licensed Software and all copies thereof.

Describe alternatives you've considered

Install with pip

Who will this benefit?

Every user that uses conda. Conda is heavily used in the scientific community, and this would open the tool up to it.

Are you interested in contributing this feature?

I can submit the tool, I just need a different license that allows redistribution.

Anything else?

I am a reviewer at conda-forge and can try to answer any questions that might occur.

[Bug] PineconeProtocolError after v2.2.4

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I use the following command:

index = pinecone.Index(index_name)
pinecone.init(
    api_key=pinecone_key,  # find at app.pinecone.io
    environment=pinecone_env  # next to api key in console
)
index.delete(delete_all=True, namespace=namespace)

And I get:
PineconeProtocolError: Failed to connect; did you specify the correct index name?

Even though my index name is correct. Running the exact same pipeline with v2.2.2 works correctly.

Expected Behavior

No error, index/namespace deleted from Pinecone.

Steps To Reproduce

In Python 3.10, with Pinecone 2.2.4, delete index with command above.

Relevant log output

---------------------------------------------------------------------------
ConnectionResetError                      Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py:790, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
    789 # Make the request on the HTTPConnection object
--> 790 response = self._make_request(
    791     conn,
    792     method,
    793     url,
    794     timeout=timeout_obj,
    795     body=body,
    796     headers=headers,
    797     chunked=chunked,
    798     retries=retries,
    799     response_conn=response_conn,
    800     preload_content=preload_content,
    801     decode_content=decode_content,
    802     **response_kw,
    803 )
    805 # Everything went great!

File /opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py:491, in HTTPConnectionPool._make_request(self, conn, method, url, body, headers, retries, timeout, chunked, response_conn, preload_content, decode_content, enforce_content_length)
    490         new_e = _wrap_proxy_error(new_e, conn.proxy.scheme)
--> 491     raise new_e
    493 # conn.request() calls http.client.*.request, not the method in
    494 # urllib3.request. It also calls makefile (recv) on the socket.

File /opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py:467, in HTTPConnectionPool._make_request(self, conn, method, url, body, headers, retries, timeout, chunked, response_conn, preload_content, decode_content, enforce_content_length)
    466 try:
--> 467     self._validate_conn(conn)
    468 except (SocketTimeout, BaseSSLError) as e:

File /opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py:1092, in HTTPSConnectionPool._validate_conn(self, conn)
   1091 if conn.is_closed:
-> 1092     conn.connect()
   1094 if not conn.is_verified:

File /opt/conda/lib/python3.10/site-packages/urllib3/connection.py:635, in HTTPSConnection.connect(self)
    627     warnings.warn(
    628         (
    629             f"System time is way off (before {RECENT_DATE}). This will probably "
   (...)
    632         SystemTimeWarning,
    633     )
--> 635 sock_and_verified = _ssl_wrap_socket_and_match_hostname(
    636     sock=sock,
    637     cert_reqs=self.cert_reqs,
    638     ssl_version=self.ssl_version,
    639     ssl_minimum_version=self.ssl_minimum_version,
    640     ssl_maximum_version=self.ssl_maximum_version,
    641     ca_certs=self.ca_certs,
    642     ca_cert_dir=self.ca_cert_dir,
    643     ca_cert_data=self.ca_cert_data,
    644     cert_file=self.cert_file,
    645     key_file=self.key_file,
    646     key_password=self.key_password,
    647     server_hostname=server_hostname,
    648     ssl_context=self.ssl_context,
    649     tls_in_tls=tls_in_tls,
    650     assert_hostname=self.assert_hostname,
    651     assert_fingerprint=self.assert_fingerprint,
    652 )
    653 self.sock = sock_and_verified.socket

File /opt/conda/lib/python3.10/site-packages/urllib3/connection.py:776, in _ssl_wrap_socket_and_match_hostname(sock, cert_reqs, ssl_version, ssl_minimum_version, ssl_maximum_version, cert_file, key_file, key_password, ca_certs, ca_cert_dir, ca_cert_data, assert_hostname, assert_fingerprint, server_hostname, ssl_context, tls_in_tls)
    774         server_hostname = normalized
--> 776 ssl_sock = ssl_wrap_socket(
    777     sock=sock,
    778     keyfile=key_file,
    779     certfile=cert_file,
    780     key_password=key_password,
    781     ca_certs=ca_certs,
    782     ca_cert_dir=ca_cert_dir,
    783     ca_cert_data=ca_cert_data,
    784     server_hostname=server_hostname,
    785     ssl_context=context,
    786     tls_in_tls=tls_in_tls,
    787 )
    789 try:

File /opt/conda/lib/python3.10/site-packages/urllib3/util/ssl_.py:466, in ssl_wrap_socket(sock, keyfile, certfile, cert_reqs, ca_certs, server_hostname, ssl_version, ciphers, ssl_context, ca_cert_dir, key_password, ca_cert_data, tls_in_tls)
    464     pass
--> 466 ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls, server_hostname)
    467 return ssl_sock

File /opt/conda/lib/python3.10/site-packages/urllib3/util/ssl_.py:510, in _ssl_wrap_socket_impl(sock, ssl_context, tls_in_tls, server_hostname)
    508     return SSLTransport(sock, ssl_context, server_hostname)
--> 510 return ssl_context.wrap_socket(sock, server_hostname=server_hostname)

File /opt/conda/lib/python3.10/ssl.py:513, in SSLContext.wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, session)
    507 def wrap_socket(self, sock, server_side=False,
    508                 do_handshake_on_connect=True,
    509                 suppress_ragged_eofs=True,
    510                 server_hostname=None, session=None):
    511     # SSLSocket class handles server_hostname encoding before it calls
    512     # ctx._wrap_socket()
--> 513     return self.sslsocket_class._create(
    514         sock=sock,
    515         server_side=server_side,
    516         do_handshake_on_connect=do_handshake_on_connect,
    517         suppress_ragged_eofs=suppress_ragged_eofs,
    518         server_hostname=server_hostname,
    519         context=self,
    520         session=session
    521     )

File /opt/conda/lib/python3.10/ssl.py:1071, in SSLSocket._create(cls, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, context, session)
   1070             raise ValueError("do_handshake_on_connect should not be specified for non-blocking sockets")
-> 1071         self.do_handshake()
   1072 except (OSError, ValueError):

File /opt/conda/lib/python3.10/ssl.py:1342, in SSLSocket.do_handshake(self, block)
   1341         self.settimeout(None)
-> 1342     self._sslobj.do_handshake()
   1343 finally:

ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

ProtocolError                             Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/pinecone/core/utils/error_handling.py:17, in validate_and_convert_errors.<locals>.inner_func(*args, **kwargs)
     16 try:
---> 17     return func(*args, **kwargs)
     18 except MaxRetryError as e:

File /opt/conda/lib/python3.10/site-packages/pinecone/index.py:335, in Index.delete(self, ids, delete_all, namespace, filter, **kwargs)
    330 args_dict = self._parse_non_empty_args([('ids', ids),
    331                                         ('delete_all', delete_all),
    332                                         ('namespace', namespace),
    333                                         ('filter', filter)])
--> 335 return self._vector_api.delete(
    336     DeleteRequest(
    337         **args_dict,
    338         **{k: v for k, v in kwargs.items() if k not in _OPENAPI_ENDPOINT_PARAMS and v is not None},
    339         _check_type=_check_type
    340     ),
    341     **{k: v for k, v in kwargs.items() if k in _OPENAPI_ENDPOINT_PARAMS}
    342 )

File /opt/conda/lib/python3.10/site-packages/pinecone/core/client/api_client.py:776, in Endpoint.__call__(self, *args, **kwargs)
    766 """ This method is invoked when endpoints are called
    767 Example:
    768 
   (...)
    774 
    775 """
--> 776 return self.callable(self, *args, **kwargs)

File /opt/conda/lib/python3.10/site-packages/pinecone/core/client/api/vector_operations_api.py:117, in VectorOperationsApi.__init__.<locals>.__delete(self, delete_request, **kwargs)
    115 kwargs['delete_request'] = \
    116     delete_request
--> 117 return self.call_with_http_info(**kwargs)

File /opt/conda/lib/python3.10/site-packages/pinecone/core/client/api_client.py:838, in Endpoint.call_with_http_info(self, **kwargs)
    836     params['header']['Content-Type'] = header_list
--> 838 return self.api_client.call_api(
    839     self.settings['endpoint_path'], self.settings['http_method'],
    840     params['path'],
    841     params['query'],
    842     params['header'],
    843     body=params['body'],
    844     post_params=params['form'],
    845     files=params['file'],
    846     response_type=self.settings['response_type'],
    847     auth_settings=self.settings['auth'],
    848     async_req=kwargs['async_req'],
    849     _check_type=kwargs['_check_return_type'],
    850     _return_http_data_only=kwargs['_return_http_data_only'],
    851     _preload_content=kwargs['_preload_content'],
    852     _request_timeout=kwargs['_request_timeout'],
    853     _host=_host,
    854     collection_formats=params['collection_format'])

File /opt/conda/lib/python3.10/site-packages/pinecone/core/client/api_client.py:413, in ApiClient.call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_type, auth_settings, async_req, _return_http_data_only, collection_formats, _preload_content, _request_timeout, _host, _check_type)
    412 if not async_req:
--> 413     return self.__call_api(resource_path, method,
    414                            path_params, query_params, header_params,
    415                            body, post_params, files,
    416                            response_type, auth_settings,
    417                            _return_http_data_only, collection_formats,
    418                            _preload_content, _request_timeout, _host,
    419                            _check_type)
    421 return self.pool.apply_async(self.__call_api, (resource_path,
    422                                                method, path_params,
    423                                                query_params,
   (...)
    431                                                _request_timeout,
    432                                                _host, _check_type))

File /opt/conda/lib/python3.10/site-packages/pinecone/core/client/api_client.py:200, in ApiClient.__call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_type, auth_settings, _return_http_data_only, collection_formats, _preload_content, _request_timeout, _host, _check_type)
    198 try:
    199     # perform request and return response
--> 200     response_data = self.request(
    201         method, url, query_params=query_params, headers=header_params,
    202         post_params=post_params, body=body,
    203         _preload_content=_preload_content,
    204         _request_timeout=_request_timeout)
    205 except ApiException as e:

File /opt/conda/lib/python3.10/site-packages/pinecone/core/client/api_client.py:459, in ApiClient.request(self, method, url, query_params, headers, post_params, body, _preload_content, _request_timeout)
    458 elif method == "POST":
--> 459     return self.rest_client.POST(url,
    460                                  query_params=query_params,
    461                                  headers=headers,
    462                                  post_params=post_params,
    463                                  _preload_content=_preload_content,
    464                                  _request_timeout=_request_timeout,
    465                                  body=body)
    466 elif method == "PUT":

File /opt/conda/lib/python3.10/site-packages/pinecone/core/client/rest.py:271, in RESTClientObject.POST(self, url, headers, query_params, post_params, body, _preload_content, _request_timeout)
    269 def POST(self, url, headers=None, query_params=None, post_params=None,
    270          body=None, _preload_content=True, _request_timeout=None):
--> 271     return self.request("POST", url,
    272                         headers=headers,
    273                         query_params=query_params,
    274                         post_params=post_params,
    275                         _preload_content=_preload_content,
    276                         _request_timeout=_request_timeout,
    277                         body=body)

File /opt/conda/lib/python3.10/site-packages/pinecone/core/client/rest.py:157, in RESTClientObject.request(self, method, url, query_params, headers, body, post_params, _preload_content, _request_timeout)
    156         request_body = json.dumps(body)
--> 157     r = self.pool_manager.request(
    158         method, url,
    159         body=request_body,
    160         preload_content=_preload_content,
    161         timeout=timeout,
    162         headers=headers)
    163 elif headers['Content-Type'] == 'application/x-www-form-urlencoded':  # noqa: E501

File /opt/conda/lib/python3.10/site-packages/urllib3/_request_methods.py:118, in RequestMethods.request(self, method, url, body, fields, headers, json, **urlopen_kw)
    117 else:
--> 118     return self.request_encode_body(
    119         method, url, fields=fields, headers=headers, **urlopen_kw
    120     )

File /opt/conda/lib/python3.10/site-packages/urllib3/_request_methods.py:217, in RequestMethods.request_encode_body(self, method, url, fields, headers, encode_multipart, multipart_boundary, **urlopen_kw)
    215 extra_kw.update(urlopen_kw)
--> 217 return self.urlopen(method, url, **extra_kw)

File /opt/conda/lib/python3.10/site-packages/urllib3/poolmanager.py:443, in PoolManager.urlopen(self, method, url, redirect, **kw)
    442 else:
--> 443     response = conn.urlopen(method, u.request_uri, **kw)
    445 redirect_location = redirect and response.get_redirect_location()

File /opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py:844, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
    842     new_e = ProtocolError("Connection aborted.", new_e)
--> 844 retries = retries.increment(
    845     method, url, error=new_e, _pool=self, _stacktrace=sys.exc_info()[2]
    846 )
    847 retries.sleep()

File /opt/conda/lib/python3.10/site-packages/urllib3/util/retry.py:470, in Retry.increment(self, method, url, response, error, _pool, _stacktrace)
    469 if read is False or method is None or not self._is_method_retryable(method):
--> 470     raise reraise(type(error), error, _stacktrace)
    471 elif read is not None:

File /opt/conda/lib/python3.10/site-packages/urllib3/util/util.py:38, in reraise(tp, value, tb)
     37 if value.__traceback__ is not tb:
---> 38     raise value.with_traceback(tb)
     39 raise value

File /opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py:790, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
    789 # Make the request on the HTTPConnection object
--> 790 response = self._make_request(
    791     conn,
    792     method,
    793     url,
    794     timeout=timeout_obj,
    795     body=body,
    796     headers=headers,
    797     chunked=chunked,
    798     retries=retries,
    799     response_conn=response_conn,
    800     preload_content=preload_content,
    801     decode_content=decode_content,
    802     **response_kw,
    803 )
    805 # Everything went great!

File /opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py:491, in HTTPConnectionPool._make_request(self, conn, method, url, body, headers, retries, timeout, chunked, response_conn, preload_content, decode_content, enforce_content_length)
    490         new_e = _wrap_proxy_error(new_e, conn.proxy.scheme)
--> 491     raise new_e
    493 # conn.request() calls http.client.*.request, not the method in
    494 # urllib3.request. It also calls makefile (recv) on the socket.

File /opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py:467, in HTTPConnectionPool._make_request(self, conn, method, url, body, headers, retries, timeout, chunked, response_conn, preload_content, decode_content, enforce_content_length)
    466 try:
--> 467     self._validate_conn(conn)
    468 except (SocketTimeout, BaseSSLError) as e:

File /opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py:1092, in HTTPSConnectionPool._validate_conn(self, conn)
   1091 if conn.is_closed:
-> 1092     conn.connect()
   1094 if not conn.is_verified:

File /opt/conda/lib/python3.10/site-packages/urllib3/connection.py:635, in HTTPSConnection.connect(self)
    627     warnings.warn(
    628         (
    629             f"System time is way off (before {RECENT_DATE}). This will probably "
   (...)
    632         SystemTimeWarning,
    633     )
--> 635 sock_and_verified = _ssl_wrap_socket_and_match_hostname(
    636     sock=sock,
    637     cert_reqs=self.cert_reqs,
    638     ssl_version=self.ssl_version,
    639     ssl_minimum_version=self.ssl_minimum_version,
    640     ssl_maximum_version=self.ssl_maximum_version,
    641     ca_certs=self.ca_certs,
    642     ca_cert_dir=self.ca_cert_dir,
    643     ca_cert_data=self.ca_cert_data,
    644     cert_file=self.cert_file,
    645     key_file=self.key_file,
    646     key_password=self.key_password,
    647     server_hostname=server_hostname,
    648     ssl_context=self.ssl_context,
    649     tls_in_tls=tls_in_tls,
    650     assert_hostname=self.assert_hostname,
    651     assert_fingerprint=self.assert_fingerprint,
    652 )
    653 self.sock = sock_and_verified.socket

File /opt/conda/lib/python3.10/site-packages/urllib3/connection.py:776, in _ssl_wrap_socket_and_match_hostname(sock, cert_reqs, ssl_version, ssl_minimum_version, ssl_maximum_version, cert_file, key_file, key_password, ca_certs, ca_cert_dir, ca_cert_data, assert_hostname, assert_fingerprint, server_hostname, ssl_context, tls_in_tls)
    774         server_hostname = normalized
--> 776 ssl_sock = ssl_wrap_socket(
    777     sock=sock,
    778     keyfile=key_file,
    779     certfile=cert_file,
    780     key_password=key_password,
    781     ca_certs=ca_certs,
    782     ca_cert_dir=ca_cert_dir,
    783     ca_cert_data=ca_cert_data,
    784     server_hostname=server_hostname,
    785     ssl_context=context,
    786     tls_in_tls=tls_in_tls,
    787 )
    789 try:

File /opt/conda/lib/python3.10/site-packages/urllib3/util/ssl_.py:466, in ssl_wrap_socket(sock, keyfile, certfile, cert_reqs, ca_certs, server_hostname, ssl_version, ciphers, ssl_context, ca_cert_dir, key_password, ca_cert_data, tls_in_tls)
    464     pass
--> 466 ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls, server_hostname)
    467 return ssl_sock

File /opt/conda/lib/python3.10/site-packages/urllib3/util/ssl_.py:510, in _ssl_wrap_socket_impl(sock, ssl_context, tls_in_tls, server_hostname)
    508     return SSLTransport(sock, ssl_context, server_hostname)
--> 510 return ssl_context.wrap_socket(sock, server_hostname=server_hostname)

File /opt/conda/lib/python3.10/ssl.py:513, in SSLContext.wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, session)
    507 def wrap_socket(self, sock, server_side=False,
    508                 do_handshake_on_connect=True,
    509                 suppress_ragged_eofs=True,
    510                 server_hostname=None, session=None):
    511     # SSLSocket class handles server_hostname encoding before it calls
    512     # ctx._wrap_socket()
--> 513     return self.sslsocket_class._create(
    514         sock=sock,
    515         server_side=server_side,
    516         do_handshake_on_connect=do_handshake_on_connect,
    517         suppress_ragged_eofs=suppress_ragged_eofs,
    518         server_hostname=server_hostname,
    519         context=self,
    520         session=session
    521     )

File /opt/conda/lib/python3.10/ssl.py:1071, in SSLSocket._create(cls, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, context, session)
   1070             raise ValueError("do_handshake_on_connect should not be specified for non-blocking sockets")
-> 1071         self.do_handshake()
   1072 except (OSError, ValueError):

File /opt/conda/lib/python3.10/ssl.py:1342, in SSLSocket.do_handshake(self, block)
   1341         self.settimeout(None)
-> 1342     self._sslobj.do_handshake()
   1343 finally:

ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

The above exception was the direct cause of the following exception:

PineconeProtocolError                     Traceback (most recent call last)
Cell In[8], line 1
----> 1 index.delete(delete_all=True, namespace=namespace)

File /opt/conda/lib/python3.10/site-packages/pinecone/core/utils/error_handling.py:25, in validate_and_convert_errors.<locals>.inner_func(*args, **kwargs)
     23         raise
     24 except ProtocolError as e:
---> 25     raise PineconeProtocolError(f'Failed to connect; did you specify the correct index name?') from e

PineconeProtocolError: Failed to connect; did you specify the correct index name?

Environment

- OS:
- Python: 3.10.11
- pinecone: 2.2.4

Additional Context

No response

[Bug] Failed to connect to all addresses

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I am simply trying to delete a specific namespace within my index using the following code:

import os
import sys
import json
import pinecone

# pinecone key
pinecone_env = xxxx
pinecone_key =  xxxx

# Initializing index name
index_name = "index_name"
# index = pinecone.Index(index_name)
index = pinecone.GRPCIndex(index_name)
namespace='namespace'

# Initialize pinecone
pinecone.init(
    api_key=pinecone_key,  # find at app.pinecone.io
    environment=pinecone_env  # next to api key in console
)

# Delete namespace
index.delete(delete_all=True, namespace=namespace)

print('Namespace deleted')

Doing this, I get the following error:

PineconeException: UNKNOWN:failed to connect to all addresses; last error: UNAVAILABLE: ipv4:34.127.5.128:443: recvmsg:Connection reset by peer {created_time:"2023-09-28T20:38:58.196686658+00:00", grpc_status:14}

Would appreciate any insights into this.

Expected Behavior

Expected behavior was that the code would delete the specific namespace as requested.

Steps To Reproduce

Use grpc-gateway-protoc-gen-openapiv2-0.1.0

Relevant log output

No response

Environment

No response

Additional Context

No response

[Feature] Pinecone GUI

Is this your first time submitting a feature request?

  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing client functionality

Describe the feature

A web-based GUI, similar to MongoDB Compass, for browsing records and performing all CRUD operations, with export and import functionality.

Describe alternatives you've considered

No response

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

No response

[Bug] SSL Error when trying to `.list_indexes()`

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When trying to .list_indexes(), I'm getting a SSL error (see logs below). I do have certifi installed in the Python environment.

Expected Behavior

Successfully able to list existing indexes.

Steps To Reproduce

  1. Import pinecone.
  2. Initialize pinecone with an api_key and environment.
  3. Run pinecone.list_indexes().

Relevant log output

Python 3.11.3 | packaged by conda-forge | (main, Apr  6 2023, 08:50:54) [MSC v.1934 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.12.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import pinecone

In [2]: pinecone.init(api_key="<redacted>", environment="us-east1-gcp")

In [3]: pinecone.list_indexes()
---------------------------------------------------------------------------
SSLError                                  Traceback (most recent call last)
File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\urllib3\util\ssl_.py:402, in ssl_wrap_socket(sock, keyfile, certfile, cert_reqs, ca_certs, server_hostname, ssl_version, ciphers, ssl_context, ca_cert_dir, key_password, ca_cert_data, tls_in_tls)
    401 try:
--> 402     context.load_verify_locations(ca_certs, ca_cert_dir, ca_cert_data)
    403 except (IOError, OSError) as e:

SSLError: [X509] PEM lib (_ssl.c:4149)

During handling of the above exception, another exception occurred:

SSLError                                  Traceback (most recent call last)
File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\urllib3\connectionpool.py:703, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    702 # Make the request on the httplib connection object.
--> 703 httplib_response = self._make_request(
    704     conn,
    705     method,
    706     url,
    707     timeout=timeout_obj,
    708     body=body,
    709     headers=headers,
    710     chunked=chunked,
    711 )
    713 # If we're going to release the connection in ``finally:``, then
    714 # the response doesn't need to know about the connection. Otherwise
    715 # it will also try to release it and we'll have a double-release
    716 # mess.

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\urllib3\connectionpool.py:386, in HTTPConnectionPool._make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    385 try:
--> 386     self._validate_conn(conn)
    387 except (SocketTimeout, BaseSSLError) as e:
    388     # Py2 raises this as a BaseSSLError, Py3 raises it as socket timeout.

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\urllib3\connectionpool.py:1042, in HTTPSConnectionPool._validate_conn(self, conn)
   1041 if not getattr(conn, "sock", None):  # AppEngine might not have  `.sock`
-> 1042     conn.connect()
   1044 if not conn.is_verified:

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\urllib3\connection.py:419, in HTTPSConnection.connect(self)
    417     context.load_default_certs()
--> 419 self.sock = ssl_wrap_socket(
    420     sock=conn,
    421     keyfile=self.key_file,
    422     certfile=self.cert_file,
    423     key_password=self.key_password,
    424     ca_certs=self.ca_certs,
    425     ca_cert_dir=self.ca_cert_dir,
    426     ca_cert_data=self.ca_cert_data,
    427     server_hostname=server_hostname,
    428     ssl_context=context,
    429     tls_in_tls=tls_in_tls,
    430 )
    432 # If we're using all defaults and the connection
    433 # is TLSv1 or TLSv1.1 we throw a DeprecationWarning
    434 # for the host.

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\urllib3\util\ssl_.py:404, in ssl_wrap_socket(sock, keyfile, certfile, cert_reqs, ca_certs, server_hostname, ssl_version, ciphers, ssl_context, ca_cert_dir, key_password, ca_cert_data, tls_in_tls)
    403     except (IOError, OSError) as e:
--> 404         raise SSLError(e)
    406 elif ssl_context is None and hasattr(context, "load_default_certs"):
    407     # try to load OS default certs; works well on Windows (require Python3.4+)

SSLError: [X509] PEM lib (_ssl.c:4149)

During handling of the above exception, another exception occurred:

MaxRetryError                             Traceback (most recent call last)
Cell In[3], line 1
----> 1 pinecone.list_indexes()

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\pinecone\manage.py:185, in list_indexes()
    183 """Lists all indexes."""
    184 api_instance = _get_api_instance()
--> 185 response = api_instance.list_indexes()
    186 return response

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\pinecone\core\client\api_client.py:776, in Endpoint.__call__(self, *args, **kwargs)
    765 def __call__(self, *args, **kwargs):
    766     """ This method is invoked when endpoints are called
    767     Example:
    768
   (...)
    774
    775     """
--> 776     return self.callable(self, *args, **kwargs)

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\pinecone\core\client\api\index_operations_api.py:1132, in IndexOperationsApi.__init__.<locals>.__list_indexes(self, **kwargs)
   1128 kwargs['_check_return_type'] = kwargs.get(
   1129     '_check_return_type', True
   1130 )
   1131 kwargs['_host_index'] = kwargs.get('_host_index')
-> 1132 return self.call_with_http_info(**kwargs)

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\pinecone\core\client\api_client.py:838, in Endpoint.call_with_http_info(self, **kwargs)
    834     header_list = self.api_client.select_header_content_type(
    835         content_type_headers_list)
    836     params['header']['Content-Type'] = header_list
--> 838 return self.api_client.call_api(
    839     self.settings['endpoint_path'], self.settings['http_method'],
    840     params['path'],
    841     params['query'],
    842     params['header'],
    843     body=params['body'],
    844     post_params=params['form'],
    845     files=params['file'],
    846     response_type=self.settings['response_type'],
    847     auth_settings=self.settings['auth'],
    848     async_req=kwargs['async_req'],
    849     _check_type=kwargs['_check_return_type'],
    850     _return_http_data_only=kwargs['_return_http_data_only'],
    851     _preload_content=kwargs['_preload_content'],
    852     _request_timeout=kwargs['_request_timeout'],
    853     _host=_host,
    854     collection_formats=params['collection_format'])

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\pinecone\core\client\api_client.py:413, in ApiClient.call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_type, auth_settings, async_req, _return_http_data_only, collection_formats, _preload_content, _request_timeout, _host, _check_type)
    359 """Makes the HTTP request (synchronous) and returns deserialized data.
    360
    361 To make an async_req request, set the async_req parameter.
   (...)
    410     then the method will return the response directly.
    411 """
    412 if not async_req:
--> 413     return self.__call_api(resource_path, method,
    414                            path_params, query_params, header_params,
    415                            body, post_params, files,
    416                            response_type, auth_settings,
    417                            _return_http_data_only, collection_formats,
    418                            _preload_content, _request_timeout, _host,
    419                            _check_type)
    421 return self.pool.apply_async(self.__call_api, (resource_path,
    422                                                method, path_params,
    423                                                query_params,
   (...)
    431                                                _request_timeout,
    432                                                _host, _check_type))

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\pinecone\core\client\api_client.py:200, in ApiClient.__call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_type, auth_settings, _return_http_data_only, collection_formats, _preload_content, _request_timeout, _host, _check_type)
    196     url = _host + resource_path
    198 try:
    199     # perform request and return response
--> 200     response_data = self.request(
    201         method, url, query_params=query_params, headers=header_params,
    202         post_params=post_params, body=body,
    203         _preload_content=_preload_content,
    204         _request_timeout=_request_timeout)
    205 except ApiException as e:
    206     e.body = e.body.decode('utf-8')

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\pinecone\core\client\api_client.py:439, in ApiClient.request(self, method, url, query_params, headers, post_params, body, _preload_content, _request_timeout)
    437 """Makes the HTTP request using RESTClient."""
    438 if method == "GET":
--> 439     return self.rest_client.GET(url,
    440                                 query_params=query_params,
    441                                 _preload_content=_preload_content,
    442                                 _request_timeout=_request_timeout,
    443                                 headers=headers)
    444 elif method == "HEAD":
    445     return self.rest_client.HEAD(url,
    446                                  query_params=query_params,
    447                                  _preload_content=_preload_content,
    448                                  _request_timeout=_request_timeout,
    449                                  headers=headers)

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\pinecone\core\client\rest.py:236, in RESTClientObject.GET(self, url, headers, query_params, _preload_content, _request_timeout)
    234 def GET(self, url, headers=None, query_params=None, _preload_content=True,
    235         _request_timeout=None):
--> 236     return self.request("GET", url,
    237                         headers=headers,
    238                         _preload_content=_preload_content,
    239                         _request_timeout=_request_timeout,
    240                         query_params=query_params)

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\pinecone\core\client\rest.py:202, in RESTClientObject.request(self, method, url, query_params, headers, body, post_params, _preload_content, _request_timeout)
    199             raise ApiException(status=0, reason=msg)
    200     # For `GET`, `HEAD`
    201     else:
--> 202         r = self.pool_manager.request(method, url,
    203                                       fields=query_params,
    204                                       preload_content=_preload_content,
    205                                       timeout=timeout,
    206                                       headers=headers)
    207 except urllib3.exceptions.SSLError as e:
    208     msg = "{0}\n{1}".format(type(e).__name__, str(e))

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\urllib3\request.py:74, in RequestMethods.request(self, method, url, fields, headers, **urlopen_kw)
     71 urlopen_kw["request_url"] = url
     73 if method in self._encode_url_methods:
---> 74     return self.request_encode_url(
     75         method, url, fields=fields, headers=headers, **urlopen_kw
     76     )
     77 else:
     78     return self.request_encode_body(
     79         method, url, fields=fields, headers=headers, **urlopen_kw
     80     )

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\urllib3\request.py:96, in RequestMethods.request_encode_url(self, method, url, fields, headers, **urlopen_kw)
     93 if fields:
     94     url += "?" + urlencode(fields)
---> 96 return self.urlopen(method, url, **extra_kw)

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\urllib3\poolmanager.py:376, in PoolManager.urlopen(self, method, url, redirect, **kw)
    374     response = conn.urlopen(method, url, **kw)
    375 else:
--> 376     response = conn.urlopen(method, u.request_uri, **kw)
    378 redirect_location = redirect and response.get_redirect_location()
    379 if not redirect_location:

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\urllib3\connectionpool.py:815, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    810 if not conn:
    811     # Try again
    812     log.warning(
    813         "Retrying (%r) after connection broken by '%r': %s", retries, err, url
    814     )
--> 815     return self.urlopen(
    816         method,
    817         url,
    818         body,
    819         headers,
    820         retries,
    821         redirect,
    822         assert_same_host,
    823         timeout=timeout,
    824         pool_timeout=pool_timeout,
    825         release_conn=release_conn,
    826         chunked=chunked,
    827         body_pos=body_pos,
    828         **response_kw
    829     )
    831 # Handle redirect?
    832 redirect_location = redirect and response.get_redirect_location()

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\urllib3\connectionpool.py:815, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    810 if not conn:
    811     # Try again
    812     log.warning(
    813         "Retrying (%r) after connection broken by '%r': %s", retries, err, url
    814     )
--> 815     return self.urlopen(
    816         method,
    817         url,
    818         body,
    819         headers,
    820         retries,
    821         redirect,
    822         assert_same_host,
    823         timeout=timeout,
    824         pool_timeout=pool_timeout,
    825         release_conn=release_conn,
    826         chunked=chunked,
    827         body_pos=body_pos,
    828         **response_kw
    829     )
    831 # Handle redirect?
    832 redirect_location = redirect and response.get_redirect_location()

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\urllib3\connectionpool.py:815, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    810 if not conn:
    811     # Try again
    812     log.warning(
    813         "Retrying (%r) after connection broken by '%r': %s", retries, err, url
    814     )
--> 815     return self.urlopen(
    816         method,
    817         url,
    818         body,
    819         headers,
    820         retries,
    821         redirect,
    822         assert_same_host,
    823         timeout=timeout,
    824         pool_timeout=pool_timeout,
    825         release_conn=release_conn,
    826         chunked=chunked,
    827         body_pos=body_pos,
    828         **response_kw
    829     )
    831 # Handle redirect?
    832 redirect_location = redirect and response.get_redirect_location()

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\urllib3\connectionpool.py:787, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    784 elif isinstance(e, (SocketError, HTTPException)):
    785     e = ProtocolError("Connection aborted.", e)
--> 787 retries = retries.increment(
    788     method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
    789 )
    790 retries.sleep()
    792 # Keep track of the error for the retry warning.

File ~\scoop\apps\mambaforge\current\envs\llm\Lib\site-packages\urllib3\util\retry.py:592, in Retry.increment(self, method, url, response, error, _pool, _stacktrace)
    581 new_retry = self.new(
    582     total=total,
    583     connect=connect,
   (...)
    588     history=history,
    589 )
    591 if new_retry.is_exhausted():
--> 592     raise MaxRetryError(_pool, url, error or ResponseError(cause))
    594 log.debug("Incremented Retry for (url='%s'): %r", url, new_retry)
    596 return new_retry

MaxRetryError: HTTPSConnectionPool(host='controller.us-east1-gcp.pinecone.io', port=443): Max retries exceeded with url: /databases (Caused by SSLError(SSLError(524297, '[X509] PEM lib (_ssl.c:4149)')))

In [4]:

Environment

- OS: Windows 10
- Python: Python 3.11.3
- pinecone: 2.2.1

Additional Context

I'm able to successfully ping the hostname (controller.us-east1-gcp.pinecone.io), so network connectivity appears to be working.

I haven't found a way to disable SSL verification.
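For anyone triaging this: the `[X509] PEM lib` error is raised by OpenSSL while parsing the CA bundle that urllib3 loads, not by the network itself, which is consistent with ping succeeding. A quick check using only the standard library (plus certifi if present; no Pinecone calls involved) reproduces the same `load_verify_locations` step from the traceback:

```python
import os
import ssl

# Locate the CA bundle the HTTPS stack would use. certifi ships the
# bundle urllib3 typically loads; fall back to the OS default path.
try:
    import certifi
    bundle = certifi.where()
except ImportError:
    bundle = ssl.get_default_verify_paths().cafile

ctx = ssl.create_default_context()
if bundle and os.path.isfile(bundle):
    # This is the same call that raises SSLError("[X509] PEM lib")
    # in the traceback above when the bundle file is corrupt.
    ctx.load_verify_locations(cafile=bundle)
print("CA bundle parsed cleanly:", bundle)
```

If this snippet raises the same SSLError, the bundle file itself is corrupt or truncated, and reinstalling certifi (`pip install --force-reinstall certifi`) is worth trying before looking at the client.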
