pymilvus-orm's Issues

Support String data type

I defined a FieldSchema with dtype=DataType.STRING. When I insert string data, I get the exception below. Could anyone help? Thanks.
  File "engine.py", line 55, in insert
    self.collection.insert(data=data)
  File "/usr/local/lib/python3.8/site-packages/pymilvus_orm/collection.py", line 517, in insert
    res = conn.insert(collection_name=self._name, entities=entities, ids=None,
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/stub.py", line 63, in handler
    raise e
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/stub.py", line 51, in handler
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/stub.py", line 40, in inner
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/stub.py", line 852, in insert
    return handler.bulk_insert(collection_name, entities, partition_name, timeout, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 80, in handler
    raise e
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 57, in handler
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 539, in bulk_insert
    raise err
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 525, in bulk_insert
    request = self._prepare_bulk_insert_request(collection_name, entities, partition_name, timeout, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 518, in _prepare_bulk_insert_request
    else Prepare.bulk_insert_param(collection_name, entities, partition_name, fields_info)
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/prepare.py", line 333, in bulk_insert_param
    raise ParamError("UnSupported data type")
pymilvus.client.exceptions.ParamError: UnSupported data type

Unable to run the hello_milvus tutorial code after Milvus cluster installation

I installed the Milvus cluster on my local machine (Ubuntu 20.04) by following these steps: https://milvus.io/docs/v2.0.0/install_cluster-docker.md

Then I installed pymilvus-orm using pip install pymilvus-orm==2.0.0rc1

I tried to run the hello_milvus example (using Python 3), but it fails at the "# create connection" step (connections.connect()) with this error:

Traceback (most recent call last):
  File "hello_milvus.py", line 74, in <module>
    hello_milvus()
  File "hello_milvus.py", line 19, in hello_milvus
    connections.connect(alias="default", host="localhost", port="19530")
  File "/home/jp/.local/lib/python3.8/site-packages/pymilvus_orm/connections.py", line 158, in connect
    conn = connect_milvus(**kwargs)
  File "/home/jp/.local/lib/python3.8/site-packages/pymilvus_orm/connections.py", line 148, in connect_milvus
    return Milvus(tmp_host, tmp_port, handler, pool, **tmp_kwargs)
  File "/home/jp/.local/lib/python3.8/site-packages/pymilvus/client/stub.py", line 122, in __init__
    self._update_connection_pool()
  File "/home/jp/.local/lib/python3.8/site-packages/pymilvus/client/stub.py", line 175, in _update_connection_pool
    self._wait_for_healthy()
  File "/home/jp/.local/lib/python3.8/site-packages/pymilvus/client/stub.py", line 40, in inner
    return func(self, *args, **kwargs)
  File "/home/jp/.local/lib/python3.8/site-packages/pymilvus/client/stub.py", line 148, in _wait_for_healthy
    raise Exception("server is not healthy, please try again later")
Exception: server is not healthy, please try again later

The same steps work fine with standalone Milvus, though.

Possibility to remove entity by ids.

According to the official documentation, version v1.1.1 can delete entities by id with the Milvus.delete_entity_by_id method. Is there a way to delete entities in version 2.0.0?

Unable to search using pandas structures or numpy arrays as data

Attempting to conduct a search using data of type "pandas.core.frame.DataFrame" or "numpy.ndarray" will result in a ParamError.

Example error:

ParamError: `search_data` value                                           embeddings
0  [189, 20, 1, 10, 33, 183, 99, 9, 159, 60, 50, ...
1  [179, 89, 169, 164, 101, 99, 24, 139, 182, 83,... is illegal

Searching with a dataframe works fine in version 2.0.0rc1, but errors as described above in version 2.0.0rc4.
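
One possible workaround while the regression stands is to convert the data to plain Python lists before calling search. The sketch below assumes list input is still accepted in 2.0.0rc4 (the report only shows DataFrame and ndarray failing). The helper name as_plain_vectors is hypothetical and not part of pymilvus-orm; it relies only on the tolist() method that both numpy.ndarray and pandas.Series provide.

```python
def as_plain_vectors(data):
    """Return `data` as a plain list of rows.

    Hypothetical helper, not a pymilvus-orm API. Accepts anything that
    exposes tolist() (e.g. numpy.ndarray, pandas.Series) or an ordinary
    iterable of rows.
    """
    if hasattr(data, "tolist"):
        data = data.tolist()
    # Leave bytes rows (binary vectors) untouched; copy other rows into lists.
    return [row if isinstance(row, (bytes, bytearray)) else list(row)
            for row in data]


# Usage sketch (needs a running Milvus, so it is left commented out):
# vectors = as_plain_vectors(df["embeddings"])  # df is the pandas DataFrame
# results = collection.search(vectors, field_name, param=search_params, limit=10)
```

For a DataFrame, pass the single vector column (a Series) rather than the whole frame, since the frame itself has no tolist() method.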

Some exceptions are not of type MilvusException

All the cases below are in the /tests20/python_client/testcases/test_collection.py

  1. test_collection_empty_name
error = {ct.err_code: 1, ct.err_msg: f'`collection_name` value is illegal'}
<function api_request at 0x7f536a17bd30>: `collection_name` value  is illegal
  2. test_collection_illegal_name
error = {ct.err_code: 1, ct.err_msg: "`collection_name` value {} is illegal".format(name)}
<function api_request at 0x7f5d60506d30>: `collection_name` value 1 is illegal
  3. test_collection_invalid_type_field
<function api_request at 0x7f675fc08d30>: (1,) has type tuple, but expected one of: bytes, unicode
  4. test_collection_none_field_name
<function api_request at 0x7fb552657d30>: You should specify the name of field!
  5. test_collection_field_dtype_float_value
error = {ct.err_code: 0, ct.err_msg: "Field type must be of DataType!"}
<function api_request at 0x7fa2a1f15e50>: Field type must be of DataType!
  6. test_collection_vector_invalid_dim
<function api_request at 0x7f62228f0d30>: invalid dim: []
  7. test_collection_none_desc
<function api_request at 0x7f9272a7dd30>: None has type NoneType, but expected one of: bytes, unicode

AttributeError in 'collection.create_partition' when creating a partition that already exists

>>> collection.create_partition("comedy", description="comedy films")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/yangxuan/Github/pymilvus-orm/pymilvus_orm/collection.py", line 763, in create_partition
    raise PartitionAlreadyExistException(0, ExceptionsMessage.PartitionAlreayExist)
AttributeError: type object 'ExceptionsMessage' has no attribute 'PartitionAlreayExist'

pymilvus-orm == 2.0a1.dev59
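
Until the misspelled attribute is fixed, a caller can avoid this code path by checking for the partition first. This is a minimal sketch, assuming the Collection object exposes has_partition, partition, and create_partition as described in the pymilvus-orm API docs. The helper name get_or_create_partition is hypothetical.

```python
def get_or_create_partition(collection, name, description=""):
    """Return the named partition, creating it only if it is missing.

    Hypothetical helper; `collection` is expected to behave like a
    pymilvus_orm.Collection (has_partition / partition / create_partition).
    """
    if collection.has_partition(name):
        # Already there: fetch it instead of triggering the failing raise.
        return collection.partition(name)
    return collection.create_partition(name, description=description)


# Usage sketch (needs a running Milvus, so it is left commented out):
# comedy = get_or_create_partition(collection, "comedy", "comedy films")
```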

Method calc_distance is not available in pymilvus utility

When running the example.py, I got the below error.

Traceback (most recent call last):
  File "milvus_example.py", line 111, in <module>
    results = utility.calc_distance(vectors_left=op_l, vectors_right=op_r, params=params)
AttributeError: module 'pymilvus_orm.utility' has no attribute 'calc_distance'

Inconsistencies with multiple indexes

Many of the collection-level index functions deal with only one index, but a collection can have more than one index. How are we supposed to specify which index to drop when using drop_index(), or which index is returned by index()?

Implement calc_distance()

Implement calc_distance() in pymilvus_orm.
The prototype of this API in pymilvus is:

    def calc_distance(self, vectors_left, vectors_right, params=None, timeout=None, **kwargs):
        """
        Calculate distance between two vector arrays.

        :param vectors_left: The vectors on the left of operator.
        :type  vectors_left: dict
        `{"ids": [1, 2, 3, .... n], "collection": "c_1", "partition": "p_1", "field": "v_1"}`
        or
        `{"float_vectors": [[1.0, 2.0], [3.0, 4.0], ... [9.0, 10.0]]}`
        or
        `{"bin_vectors": [b'\x94', b'N', ... b'\xca']}`

        :param vectors_right: The vectors on the right of operator.
        :type  vectors_right: dict
        `{"ids": [1, 2, 3, .... n], "collection": "col_1", "partition": "p_1", "field": "v_1"}`
        or
        `{"float_vectors": [[1.0, 2.0], [3.0, 4.0], ... [9.0, 10.0]]}`
        or
        `{"bin_vectors": [b'\x94', b'N', ... b'\xca']}`

        :param params: parameters, currently only support "metric_type", default value is "L2"
                       extra parameter for "L2" distance: "sqrt", true or false, default is false
                       extra parameter for "HAMMING" and "TANIMOTO": "dim", set this value if dimension is not a multiple of 8, otherwise the dimension will be calculated by list length
        :type  params: dict
            There are examples of supported metric_type:
                `{"metric": "L2"}`
                `{"metric": "IP"}`
                `{"metric": "HAMMING"}`
                `{"metric": "TANIMOTO"}`
            Note: "L2", "IP", "HAMMING", "TANIMOTO" are case insensitive

        :return: 2-d array distances
        :rtype: list[list[int]] for "HAMMING" or list[list[float]] for others
            Assume the vectors_left: L_1, L_2, L_3
            Assume the vectors_right: R_a, R_b
            Distance between L_n and R_m we called "D_n_m"
            The returned distances are arranged like this:
              [D_1_a, D_1_b, D_2_a, D_2_b, D_3_a, D_3_b]

        """
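
To make the documented return layout concrete, here is a pure-Python sketch of the "L2" case. It is not the Milvus implementation, and l2_distances is a hypothetical name. It follows the docstring in treating "sqrt" as false by default (so squared distances are returned) and in ordering the results as D_1_a, D_1_b, D_2_a, and so on.

```python
import math


def l2_distances(vectors_left, vectors_right, sqrt=False):
    """Pairwise L2 distances in the order D_1_a, D_1_b, D_2_a, ...

    Illustrative reference only (hypothetical helper, not pymilvus code).
    With sqrt=False the squared distance is returned, matching the
    default described in the docstring above.
    """
    distances = []
    for left in vectors_left:
        for right in vectors_right:
            squared = sum((a - b) ** 2 for a, b in zip(left, right))
            distances.append(math.sqrt(squared) if sqrt else squared)
    return distances


left = [[1.0, 2.0], [3.0, 4.0]]
right = [[1.0, 2.0], [0.0, 0.0]]
print(l2_distances(left, right))             # squared distances
print(l2_distances(left, right, sqrt=True))  # true Euclidean distances
```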

SchemaNotReadyException is thrown when inserting INT16 data

Env:
Python 3.7.10, pymilvus-orm==2.0.0rc4

Description:

When I create a collection whose schema has a DataType.INT16 field, inserting int data fails.

If I change the DataType from INT16 to INT64, it works.
It only fails when I set INT16 or INT32 (I have not tested any other types).

My example code is below.
The fields that fail are color and brand.

>>> from pymilvus_orm import connections, Collection, FieldSchema, CollectionSchema, DataType
>>> import random
>>> connections.connect()
<pymilvus.client.stub.Milvus object at 0x10b39aa90>
>>> schema = CollectionSchema([
...     FieldSchema("id", DataType.INT64, is_primary=True),
...     FieldSchema("vector", dtype=DataType.FLOAT_VECTOR, dim=128),
...     FieldSchema("color", DataType.INT16),
...     FieldSchema("brand", DataType.INT16),
... ], auto_id=True)
>>> collection = Collection("car_test", schema)
>>> data = [
...     [[random.random() for _ in range(128)] for _ in range(10)],
...     [i for i in range(10)],
...     [i for i in range(10)],
... ]
>>> collection.insert(data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/czhen/Desktop/milvus_cli/venv/lib/python3.7/site-packages/pymilvus_orm/collection.py", line 514, in insert
    raise SchemaNotReadyException(0, ExceptionsMessage.TypeOfDataAndSchemaInconsistent)
pymilvus_orm.exceptions.SchemaNotReadyException: <SchemaNotReadyException: (code=0, message=The types of schema and data do not match.)>

Output of search result is not easy to read

Issue from the Slack channel:

>>> results = collection.search(vectors[:5], field_name, param=search_params, limit=10, expr=None)
>>> results[0].ids
[424363819726212428, 424363819726212436, ...]
>>> results[0].distances
[0.0, 1.0862197875976562, 1.1029295921325684, ...]

I wonder why pymilvus returns data like this. As an application developer, I find this data structure hard to use, especially if more data types are going to be added to Milvus.
Could we output something like this instead?

[
    {
        'id': 424363819726212428,
        'field': field-value,
        'distance': 0
    },
    {
        'id': 424363819726212428,
        'field': field-value,
        'distance': 1
    },
    ...
]
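
In the meantime, the requested structure can be built on the caller's side from the ids and distances lists shown above. This is a minimal sketch; hits_to_dicts is a hypothetical helper, and the 'field' entry from the proposal is omitted because the search result shown here carries only ids and distances.

```python
def hits_to_dicts(ids, distances):
    """Pair each id with its distance as one dict per hit.

    Hypothetical helper, not part of pymilvus-orm.
    """
    return [{"id": hit_id, "distance": distance}
            for hit_id, distance in zip(ids, distances)]


# Usage sketch (needs a running Milvus, so it is left commented out):
# rows = hits_to_dicts(results[0].ids, results[0].distances)
```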
