
pymilvus-orm's Introduction

The content of this repository has been merged into the pymilvus repository. This repository was archived on August 31st, and no changes have been allowed since. Please find the latest code or assistance at the pymilvus repository.


Milvus Python SDK


Another Python SDK for Milvus. To contribute code to this project, please read our contribution guidelines first. If you have some ideas or encounter a problem, you can find us in the Slack channel #py-milvus.

Compatibility

The following table shows Milvus versions and the recommended PyMilvus-ORM versions:

Milvus version    Recommended PyMilvus-ORM version
2.0.0-RC1         2.0.0rc1
2.0.0-RC2         2.0.0rc2
2.0.0-RC4         2.0.0rc4

Installation

You can install PyMilvus-ORM via pip3 for Python 3.6+:

# Note this will only install the latest stable version
$ pip3 install pymilvus-orm

You can install a specific version of PyMilvus-ORM by:

$ pip3 install pymilvus-orm==2.0.0rc4

You can upgrade PyMilvus-ORM to the latest stable version by:

$ pip3 install --upgrade pymilvus-orm

Documentation

Documentation is available online: https://pymilvus-orm.readthedocs.io/.

Packages

Released packages

The release of PyMilvus ORM is managed on GitHub, and GitHub Actions will package and upload each version to PyPI.

The release version number of PyMilvus ORM follows PEP 440. The format is x.y.z, and the corresponding git tag name is vx.y.z (x, y, and z are numbers from 0 to 9).

For example, after PyMilvus ORM 1.0.1 is released, a tag named v1.0.1 can be found on GitHub, and a package with version 1.0.1 can be downloaded on PyPI.

Developing packages

The commits on the development branch of each version will be packaged and uploaded to Test PyPI. Development branches refer to branches such as 1.0 and 1.1, and version releases are generated from the development branches, such as 1.0.1 and 1.0.2.

The package version generated from the development branch is x.y.z.dev<dist>, where <dist> is the number of commits that differ from the most recent release.

For example, after the release of 1.0.1, two commits were submitted on the 1.0 branch. At this time, the automatic packaging version number of the development branch is 1.0.1.dev2.
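
The naming rules above can be sketched in a few lines of Python. This is only an illustration of the scheme as described here, not the project's packaging script; the helper names tag_name and dev_version are hypothetical.

```python
def tag_name(version: str) -> str:
    """Return the git tag for a release version, e.g. 1.0.1 -> v1.0.1."""
    return f"v{version}"


def dev_version(last_release: str, dist: int) -> str:
    """Return the development version for a branch that is `dist`
    commits ahead of the most recent release."""
    if dist < 0:
        raise ValueError("dist must be a non-negative commit count")
    return f"{last_release}.dev{dist}"


# Two commits on the 1.0 branch after the 1.0.1 release.
print(tag_name("1.0.1"))
print(dev_version("1.0.1", 2))
```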

To install a specific version package on test.pypi.org, you need to append the parameter --extra-index-url after pip, for example:

$ python3 -m pip install --extra-index-url https://test.pypi.org/simple/ pymilvus-orm==x.y.z.dev<dist>

License

Apache License 2.0

pymilvus-orm's People

Contributors

bennu-li, bigsheeper, binbinlv, congqixia, cydrain, czhen-zilliz, czs007, fishpenguin, godchen0212, leonardokidd, longjiquan, notryan, scipe, wangting0128, wxyucs, xiaocai2333, xiaofan-luan, xuanyang-cn, yanliang567, yhmo


pymilvus-orm's Issues

Some exceptions are not of type MilvusException

All the cases below are in the /tests20/python_client/testcases/test_collection.py

  1. test_collection_empty_name
error = {ct.err_code: 1, ct.err_msg: f'`collection_name` value is illegal'}
<function api_request at 0x7f536a17bd30>: `collection_name` value  is illegal 
  2. test_collection_illegal_name
error = {ct.err_code: 1, ct.err_msg: "`collection_name` value {} is illegal".format(name)}
<function api_request at 0x7f5d60506d30>: `collection_name` value 1 is illegal
  3. test_collection_invalid_type_field
<function api_request at 0x7f675fc08d30>: (1,) has type tuple, but expected one of: bytes, unicode
  4. test_collection_none_field_name
<function api_request at 0x7fb552657d30>: You should specify the name of field! 
  5. test_collection_field_dtype_float_value
error = {ct.err_code: 0, ct.err_msg: "Field type must be of DataType!"}
<function api_request at 0x7fa2a1f15e50>: Field type must be of DataType!
  6. test_collection_vector_invalid_dim
<function api_request at 0x7f62228f0d30>: invalid dim: []
  7. test_collection_none_desc
<function api_request at 0x7f9272a7dd30>: None has type NoneType, but expected one of: bytes, unicode
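
One possible way to get a uniform exception type is to wrap validation code so that any other error is re-raised as a single class. The sketch below is an assumption about what such a wrapper could look like, not the SDK's implementation. The MilvusException class here is a local stand-in with hypothetical code and message fields, and raise_as_milvus_exception and check_collection_name are made-up names.

```python
import functools


class MilvusException(Exception):
    """Local stand-in for the SDK exception (fields are assumed)."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code
        self.message = message


def raise_as_milvus_exception(code=1):
    """Decorator: re-raise any non-Milvus error as MilvusException."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except MilvusException:
                raise  # already the expected type
            except Exception as exc:
                # Keep the original error as the cause for debugging.
                raise MilvusException(code, str(exc)) from exc
        return wrapper
    return decorator


@raise_as_milvus_exception(code=1)
def check_collection_name(name):
    """Hypothetical validator used only to demonstrate the wrapper."""
    if not isinstance(name, str) or not name:
        raise ValueError(f"`collection_name` value {name} is illegal")
    return name
```

With this in place, a test can catch MilvusException and compare its code and message against the expected error dictionary.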

Unable to search using pandas structures or numpy arrays as data

Attempting to conduct a search using data of type "pandas.core.frame.DataFrame" or "numpy.ndarray" will result in a ParamError.

Example error:

ParamError: `search_data` value                                           embeddings
0  [189, 20, 1, 10, 33, 183, 99, 9, 159, 60, 50, ...
1  [179, 89, 169, 164, 101, 99, 24, 139, 182, 83,... is illegal

Searching with a dataframe works fine in version 2.0.0rc1, but errors as described above in version 2.0.0rc4.
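
Until array-like input is accepted again, a possible workaround is to convert the data to plain nested lists before calling search. The helper below is a sketch under that assumption. The name to_nested_lists is hypothetical, it has not been verified against 2.0.0rc4, and it relies only on the tolist() method that numpy.ndarray and pandas.Series both provide.

```python
def to_nested_lists(data):
    """Convert array-like search data into plain nested Python lists."""
    # A DataFrame iterates over column names, so ask for one column instead.
    if hasattr(data, "columns"):
        raise TypeError("pass a single column, e.g. df['embeddings'], not a DataFrame")
    # numpy.ndarray and pandas.Series both provide tolist().
    if hasattr(data, "tolist"):
        data = data.tolist()
    # Rows may still be array-like (e.g. a Series whose cells are ndarrays).
    return [row.tolist() if hasattr(row, "tolist") else list(row) for row in data]
```

For the dataframe in the error above, the call would be to_nested_lists(df["embeddings"]), and the result would be passed to search in place of the dataframe.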

Unable to run the hello_milvus tutorial code after Milvus cluster installation

I installed the Milvus cluster on my local machine (ubuntu 20.04) by simply following this step: https://milvus.io/docs/v2.0.0/install_cluster-docker.md

Then I installed pymilvus-orm using pip install pymilvus-orm==2.0.0rc1

I tried to run the hello_milvus tutorial (using Python 3), but it fails at the connection step, connections.connect(), with this error:

Traceback (most recent call last):
  File "hello_milvus.py", line 74, in <module>
    hello_milvus()
  File "hello_milvus.py", line 19, in hello_milvus
    connections.connect(alias="default", host="localhost", port="19530")
  File "/home/jp/.local/lib/python3.8/site-packages/pymilvus_orm/connections.py", line 158, in connect
    conn = connect_milvus(**kwargs)
  File "/home/jp/.local/lib/python3.8/site-packages/pymilvus_orm/connections.py", line 148, in connect_milvus
    return Milvus(tmp_host, tmp_port, handler, pool, **tmp_kwargs)
  File "/home/jp/.local/lib/python3.8/site-packages/pymilvus/client/stub.py", line 122, in __init__
    self._update_connection_pool()
  File "/home/jp/.local/lib/python3.8/site-packages/pymilvus/client/stub.py", line 175, in _update_connection_pool
    self._wait_for_healthy()
  File "/home/jp/.local/lib/python3.8/site-packages/pymilvus/client/stub.py", line 40, in inner
    return func(self, *args, **kwargs)
  File "/home/jp/.local/lib/python3.8/site-packages/pymilvus/client/stub.py", line 148, in _wait_for_healthy
    raise Exception("server is not healthy, please try again later")
Exception: server is not healthy, please try again later

The same steps with standalone Milvus work fine, though.

SchemaNotReadyException is thrown when inserting INT16 data

Env:
Python 3.7.10, pymilvus-orm==2.0.0rc4

Description:

When I create a collection whose schema has a DataType.INT16 field, inserting int data fails.

If I change the DataType from INT16 to INT64, it works well.
It fails only when I set INT16 or INT32 (I have not tested any other types).

My example code is below.
The fields that fail are color and brand.

>>> from pymilvus_orm import connections, Collection, FieldSchema, CollectionSchema, DataType
>>> import random
>>> connections.connect()
<pymilvus.client.stub.Milvus object at 0x10b39aa90>
>>> schema = CollectionSchema([
...     FieldSchema("id", DataType.INT64, is_primary=True),
...     FieldSchema("vector", dtype=DataType.FLOAT_VECTOR, dim=128),
...     FieldSchema("color", DataType.INT16),
...     FieldSchema("brand", DataType.INT16),
... ], auto_id=True)
>>> collection = Collection("car_test", schema)
>>> data = [
...     [[random.random() for _ in range(128)] for _ in range(10)],
...     [i for i in range(10)],
...     [i for i in range(10)],
... ]
>>> collection.insert(data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/czhen/Desktop/milvus_cli/venv/lib/python3.7/site-packages/pymilvus_orm/collection.py", line 514, in insert
    raise SchemaNotReadyException(0, ExceptionsMessage.TypeOfDataAndSchemaInconsistent)
pymilvus_orm.exceptions.SchemaNotReadyException: <SchemaNotReadyException: (code=0, message=The types of schema and data do not match.)>

Implement calc_distance()

Implement calc_distance() in pymilvus_orm.
The prototype of this api in pymilvus is:

    def calc_distance(self, vectors_left, vectors_right, params=None, timeout=None, **kwargs):
        """
        Calculate distance between two vector arrays.

        :param vectors_left: The vectors on the left of operator.
        :type  vectors_left: dict
        `{"ids": [1, 2, 3, .... n], "collection": "c_1", "partition": "p_1", "field": "v_1"}`
        or
        `{"float_vectors": [[1.0, 2.0], [3.0, 4.0], ... [9.0, 10.0]]}`
        or
        `{"bin_vectors": [b'\x94', b'N', ... b'\xca']}`

        :param vectors_right: The vectors on the right of operator.
        :type  vectors_right: dict
        `{"ids": [1, 2, 3, .... n], "collection": "col_1", "partition": "p_1", "field": "v_1"}`
        or
        `{"float_vectors": [[1.0, 2.0], [3.0, 4.0], ... [9.0, 10.0]]}`
        or
        `{"bin_vectors": [b'\x94', b'N', ... b'\xca']}`

        :param params: parameters, currently only support "metric_type", default value is "L2"
                       extra parameter for "L2" distance: "sqrt", true or false, default is false
                       extra parameter for "HAMMING" and "TANIMOTO": "dim", set this value if dimension is not a multiple of 8, otherwise the dimension will be calculated by list length
        :type  params: dict
            There are examples of supported metric_type:
                `{"metric": "L2"}`
                `{"metric": "IP"}`
                `{"metric": "HAMMING"}`
                `{"metric": "TANIMOTO"}`
            Note: "L2", "IP", "HAMMING", "TANIMOTO" are case insensitive

        :return: 2-d array distances
        :rtype: list[list[int]] for "HAMMING" or list[list[float]] for others
            Assume the vectors_left: L_1, L_2, L_3
            Assume the vectors_right: R_a, R_b
            Distance between L_n and R_m we called "D_n_m"
            The returned distances are arranged like this:
              [D_1_a, D_1_b, D_2_a, D_2_b, D_3_a, D_3_b]

        """

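To make the expected result concrete, here is a pure-Python reference sketch of the "L2" case for float vectors. It is not the pymilvus implementation. The function name l2_distances is hypothetical, and it assumes that sqrt=False means the squared distance is returned. It returns one row per left vector, so row n holds D_n_a, D_n_b, and so on.

```python
import math


def l2_distances(left, right, sqrt=False):
    """Return a 2-d list where result[n][m] is the L2 distance
    between left[n] and right[m] (squared unless sqrt=True)."""
    rows = []
    for l_vec in left:
        row = []
        for r_vec in right:
            if len(l_vec) != len(r_vec):
                raise ValueError("vectors must have the same dimension")
            squared = sum((a - b) ** 2 for a, b in zip(l_vec, r_vec))
            row.append(math.sqrt(squared) if sqrt else squared)
        rows.append(row)
    return rows


left = [[0.0, 0.0], [3.0, 4.0]]
right = [[3.0, 4.0]]
print(l2_distances(left, right))             # squared distances
print(l2_distances(left, right, sqrt=True))  # Euclidean distances
```

A small reference like this can serve as an oracle when testing the real API on a handful of vectors.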
Method calc_distance is not available in pymilvus utility

When running example.py, I got the error below.

Traceback (most recent call last):
File "milvus_example.py", line 111, in
results = utility.calc_distance(vectors_left=op_l, vectors_right=op_r, params=params)
AttributeError: module 'pymilvus_orm.utility' has no attribute 'calc_distance'

AttributeError in 'collection.create_partition' when creating a partition that already exists

>>> collection.create_partition("comedy", description="comedy films")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/yangxuan/Github/pymilvus-orm/pymilvus_orm/collection.py", line 763, in create_partition
    raise PartitionAlreadyExistException(0, ExceptionsMessage.PartitionAlreayExist)
AttributeError: type object 'ExceptionsMessage' has no attribute 'PartitionAlreayExist'

pymilvus-orm == 2.0a1.dev59

Possibility to remove entity by ids.

As the official documentation says, in version 1.1.1 it is possible to delete entities by ID with the Milvus.delete_entity_by_id method. Is there a way to delete entities in version 2.0.0?

Output of search result is not easy to read

Issue from the Slack channel:

>>> results = collection.search(vectors[:5], field_name, param=search_params, limit=10, expr=None)
>>> results[0].ids
[424363819726212428, 424363819726212436, ...]
>>> results[0].distances
[0.0, 1.0862197875976562, 1.1029295921325684, ...]

I wonder why pymilvus returns data like this. As an application developer, I find this data structure hard to use, especially if more data types are going to be added to Milvus.
Can we output something like this instead?

[
    {
        'id': 424363819726212428,
        'field': field-value,
        'distance': 0
    },
    {
        'id': 424363819726212428,
        'field': field-value,
        'distance': 1
    },
    ...
]
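
Until the SDK offers such a format, the two parallel lists can be zipped on the client side. The helper below is a minimal sketch. The name hits_to_dicts is hypothetical, and it uses only the ids and distances attributes shown in the session above, so the 'field' entry is left out.

```python
def hits_to_dicts(ids, distances):
    """Pair each hit id with its distance as a list of dictionaries."""
    if len(ids) != len(distances):
        raise ValueError("ids and distances must have the same length")
    return [{"id": i, "distance": d} for i, d in zip(ids, distances)]


# Values taken from the session above.
ids = [424363819726212428, 424363819726212436]
distances = [0.0, 1.0862197875976562]
for hit in hits_to_dicts(ids, distances):
    print(hit)
```

In the session above this would be called as hits_to_dicts(results[0].ids, results[0].distances).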

Support String data type

I defined a FieldSchema with dtype=DataType.STRING and got an exception when I inserted string data. Could anyone help me? Thanks.
  File "engine.py", line 55, in insert
    self.collection.insert(data=data)
  File "/usr/local/lib/python3.8/site-packages/pymilvus_orm/collection.py", line 517, in insert
    res = conn.insert(collection_name=self._name, entities=entities, ids=None,
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/stub.py", line 63, in handler
    raise e
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/stub.py", line 51, in handler
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/stub.py", line 40, in inner
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/stub.py", line 852, in insert
    return handler.bulk_insert(collection_name, entities, partition_name, timeout, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 80, in handler
    raise e
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 57, in handler
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 539, in bulk_insert
    raise err
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 525, in bulk_insert
    request = self._prepare_bulk_insert_request(collection_name, entities, partition_name, timeout, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 518, in _prepare_bulk_insert_request
    else Prepare.bulk_insert_param(collection_name, entities, partition_name, fields_info)
  File "/usr/local/lib/python3.8/site-packages/pymilvus/client/prepare.py", line 333, in bulk_insert_param
    raise ParamError("UnSupported data type")
pymilvus.client.exceptions.ParamError: UnSupported data type

Inconsistencies with multiple indexes

Many of the collection-level index functions deal with only one index, but a collection can have more than one index. How are we supposed to specify which index to drop when using drop_index(), or which index is returned by index()?
