
jina's Introduction


Cloud-Native Neural Search Framework for Any Kind of Data

Supports Python 3.7, 3.8, and 3.9; distributed via PyPI and as a Docker image.

Jina is a neural search framework that empowers anyone to build state-of-the-art (SOTA), scalable deep learning search applications in minutes.

🌌 All data types - Large-scale indexing and querying of any kind of unstructured data: video, image, long/short text, music, source code, PDF, etc.

๐ŸŒฉ๏ธ Fast & cloud-native - Distributed architecture from day one, scalable & cloud-native by design: enjoy containerizing, streaming, paralleling, sharding, async scheduling, HTTP/gRPC/WebSocket protocol.

โฑ๏ธ Save time - The design pattern of neural search systems, from zero to a production-ready system in minutes.

๐Ÿฑ Own your stack - Keep end-to-end stack ownership of your solution, avoid integration pitfalls you get with fragmented, multi-vendor, generic legacy tools.

Install

Run Quick Demo

Build Your First Jina App

Document, Executor, and Flow are the three fundamental concepts in Jina.

Leveraging these three components, we want to build an app that finds lines from a code snippet that are most similar to the query.

💡 Preliminaries: character embedding, pooling, Euclidean distance
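For readers new to these three ideas, here is a minimal pure-Python sketch that does not depend on Jina. The helper names `embed` and `euclidean` are hypothetical and exist only for illustration; the offset and dimension mirror the values used in the example further down.

```python
import math

OFFSET = 32              # code point of the first printable ASCII character (space)
DIM = 127 - OFFSET + 1   # 96 slots; the last one doubles as the "unknown" slot


def embed(text: str) -> list:
    """Mean-pool one-hot character vectors into a single fixed-size vector."""
    vec = [0.0] * DIM
    for ch in text:
        code = ord(ch)
        idx = code - OFFSET if OFFSET <= code <= 127 else DIM - 1
        vec[idx] += 1.0          # summing one-hot vectors is the same as counting
    n = max(len(text), 1)        # avoid dividing by zero on empty input
    return [v / n for v in vec]


def euclidean(a: list, b: list) -> float:
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


print(euclidean(embed("abc"), embed("abc")))  # identical text gives 0.0
print(euclidean(embed("abc"), embed("cab")))  # same characters, different order: also 0.0
```

Note that mean pooling discards character order, so two strings with the same character counts map to the same vector. That is acceptable for a toy example but is one reason real systems use learned embeddings.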

1๏ธโƒฃ Copy-paste the minimum example below and run it:


import numpy as np
from jina import Document, DocumentArray, Executor, Flow, requests

class CharEmbed(Executor):  # a simple character embedding with mean-pooling
    offset = 32  # ASCII code of the space character, the first printable char
    dim = 127 - offset + 1  # last pos reserved for `UNK`
    char_embd = np.eye(dim) * 1  # one-hot embedding for all chars

    @requests
    def foo(self, docs: DocumentArray, **kwargs):
        for d in docs:
            r_emb = [ord(c) - self.offset if self.offset <= ord(c) <= 127 else (self.dim - 1) for c in d.text]
            d.embedding = self.char_embd[r_emb, :].mean(axis=0)  # average pooling

class Indexer(Executor):
    _docs = DocumentArray()  # for storing all documents in memory

    @requests(on='/index')
    def foo(self, docs: DocumentArray, **kwargs):
        self._docs.extend(docs)  # extend stored `docs`

    @requests(on='/search')
    def bar(self, docs: DocumentArray, **kwargs):
        docs.match(self._docs, metric='euclidean', limit=20)  # attach the 20 nearest stored docs as matches

f = (
    Flow(port_expose=12345, protocol='http', cors=True)
    .add(uses=CharEmbed, parallel=2)  # 2 parallel CharEmbed, though unnecessary here
    .add(uses=Indexer)
)  # build a Flow
with f:
    f.post('/index', (Document(text=t.strip()) for t in open(__file__) if t.strip()))  # index all lines of _this_ file
    f.block()  # block for listening request
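The `/search` endpoint above relies on `docs.match(...)` to rank the stored documents by Euclidean distance. Conceptually, that amounts to a brute-force nearest-neighbour scan, which the sketch below imitates in plain Python. It illustrates the idea only and is not Jina's implementation; `top_k_matches` is a hypothetical helper.

```python
import math
from typing import Dict, List, Tuple


def top_k_matches(
    query: List[float],
    indexed: Dict[str, List[float]],
    limit: int = 20,
) -> List[Tuple[str, float]]:
    """Return up to `limit` (text, distance) pairs, closest first."""
    scored = []
    for text, vec in indexed.items():
        dist = math.sqrt(sum((q - v) ** 2 for q, v in zip(query, vec)))
        scored.append((text, dist))
    scored.sort(key=lambda pair: pair[1])  # smaller distance means more similar
    return scored[:limit]


# Toy 2-D vectors standing in for real embeddings
index = {"near": [1.0, 0.0], "far": [5.0, 5.0], "exact": [0.0, 0.0]}
print(top_k_matches([0.0, 0.0], index, limit=2))
# [('exact', 0.0), ('near', 1.0)]
```

A full scan like this costs time proportional to the number of stored documents per query, which is fine for indexing the lines of a single file.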

2๏ธโƒฃ Open http://localhost:12345/docs (an extended Swagger UI) in your browser, click /search tab and input:

{"data": [{"text": "@requests(on=something)"}]}

That is, we want to find the lines from the above code snippet that are most similar to @requests(on=something). Now click the Execute button!
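The same query can also be prepared outside the browser. The sketch below uses only Python's standard library and assumes the server from step 1 is listening on localhost:12345 and accepts a JSON POST at /search, as the Swagger page suggests. `build_search_request` is a hypothetical helper, and the sending step is left commented out because the shape of the response is not described here.

```python
import json
import urllib.request


def build_search_request(
    text: str, host: str = "localhost", port: int = 12345
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one query document."""
    payload = json.dumps({"data": [{"text": text}]}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request("@requests(on=something)")
print(req.full_url)        # http://localhost:12345/search
print(req.data.decode())   # {"data": [{"text": "@requests(on=something)"}]}

# To send it while the Flow from step 1 is running (assumption: plain JSON reply):
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.read().decode("utf-8"))
```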


3๏ธโƒฃ Not a GUI person? Let's do it in Python then! Keep the above server running and start a simple client:

from jina import Client, Document
from jina.types.request import Response


def print_matches(resp: Response):  # the callback function invoked when task is done
    for idx, d in enumerate(resp.docs[0].matches[:3]):  # print top-3 matches
        print(f'[{idx}]{d.scores["euclidean"].value:2f}: "{d.text}"')


c = Client(protocol='http', port=12345)  # connect to localhost:12345
c.post('/search', Document(text='request(on=something)'), on_done=print_matches)

This prints the following results:

         Client@1608[S]:connected to the gateway at localhost:12345!
[0]0.168526: "@requests(on='/index')"
[1]0.181676: "@requests(on='/search')"
[2]0.218218: "from jina import Document, DocumentArray, Executor, Flow, requests"

😔 Doesn't work? Our bad! Please report it here.

Support

Join Us

Jina is backed by Jina AI. We are actively hiring full-stack developers and solution engineers to build the next neural search ecosystem in open source.

Contributing

We welcome all kinds of contributions from the open-source community, individuals and partners. We owe our success to your active involvement.

