go-sonic's Introduction

Go client for the sonic search backend

This package implements all the commands needed to work with Sonic. If one is missing, open an issue! :)

Sonic: https://github.com/valeriansaliou/sonic

Install

go get github.com/expectedsh/go-sonic

Example

package main

import (
	"fmt"
	"github.com/expectedsh/go-sonic/sonic"
)

func main() {

	ingester, err := sonic.NewIngester("localhost", 1491, "SecretPassword")
	if err != nil {
		panic(err)
	}

	// I will ignore all errors for demonstration purposes

	_ = ingester.BulkPush("movies", "general", 3, []sonic.IngestBulkRecord{
		{"id:6ab56b4kk3", "Star wars"},
		{"id:5hg67f8dg5", "Spider man"},
		{"id:1m2n3b4vf6", "Batman"},
		{"id:68d96h5h9d0", "This is another movie"},
	})

	search, err := sonic.NewSearch("localhost", 1491, "SecretPassword")
	if err != nil {
		panic(err)
	}

	results, _ := search.Query("movies", "general", "man", 10, 0)

	fmt.Println(results)
}

Benchmark bulk

The BulkPush and BulkPop methods use a custom connection pool with a goroutine dispatch algorithm. This is the benchmark (file sonic/ingester_test.go):

goos: linux
goarch: amd64
pkg: github.com/expectedsh/go-sonic/sonic
BenchmarkIngesterChannel_BulkPushMaxCPUs-8   	       2	 662657959 ns/op
BenchmarkIngesterChannel_BulkPush10-8        	       2	 603779977 ns/op
BenchmarkIngesterChannel_Push-8              	       1	1023322864 ns/op
PASS

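Bulk push is faster than calling Push in a for loop: in this run, BulkPush took roughly 0.60 to 0.66 s per operation against about 1.02 s for the Push loop. Hardware details: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz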

Thread Safety

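The driver itself isn't thread safe. Use locks or channels to serialize access and avoid crashes. The following example funnels events from several goroutines through a channel so that only one goroutine calls Push: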

package main

import (
	"github.com/expectedsh/go-sonic/sonic"
)

func main() {
	events := make(chan []string, 1)

	event := []string{"some_text", "some_id"}
	tryCrash := func() {
		for {
			// replace "event" with whatever is giving you events: pubsub, amqp messages…
			events <- event
		}
	}

	go tryCrash()
	go tryCrash()
	go tryCrash()
	go tryCrash()

	ingester, _ := sonic.NewIngester("localhost", 1491, "SecretPassword")

	for {
		msg := <-events
		// Or use some buffering along with BulkPush
		ingester.Push("collection", "bucket", msg[1], msg[0])
	}
}
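
Locks are the other option mentioned above. The following is a minimal sketch, assuming a hypothetical `pusher` interface that matches the `Push(collection, bucket, object, text)` call used in the channel example. The names `pusher`, `lockedPusher` and `fakeIngester` are illustrative and are not part of go-sonic; the real method signature may differ between versions.

```go
package main

import (
	"fmt"
	"sync"
)

// pusher is a hypothetical interface covering the single Push call used in
// the channel example; it is not a type exported by go-sonic.
type pusher interface {
	Push(collection, bucket, object, text string) error
}

// lockedPusher serializes access to a non-thread-safe pusher with a mutex.
type lockedPusher struct {
	mu    sync.Mutex
	inner pusher
}

func (l *lockedPusher) Push(collection, bucket, object, text string) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.inner.Push(collection, bucket, object, text)
}

// fakeIngester stands in for a real ingester so the sketch runs without a
// Sonic server. It counts calls without any synchronization of its own.
type fakeIngester struct{ calls int }

func (f *fakeIngester) Push(collection, bucket, object, text string) error {
	f.calls++
	return nil
}

func main() {
	fake := &fakeIngester{}
	safe := &lockedPusher{inner: fake}

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 100; j++ {
				_ = safe.Push("collection", "bucket", "some_id", "some_text")
			}
		}()
	}
	wg.Wait()
	fmt.Println(fake.calls) // prints 400
}
```

Compared with the channel approach, a mutex keeps the calling code unchanged, but every caller blocks while another push is in flight.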

go-sonic's People

Contributors

alexisvisco, benslabbert, gologi, jonarod, remicaumette, timurgarif

go-sonic's Issues

Thread safety

Edit: figured out the google pubsub library uses goroutines for each event handling, and the driver isn't thread safe. Fixed that using channels.

Suggestion: add a documentation entry, possibly with a channel usage example, to mention this? I can make a PR if you're interested.

==============

I have this strange crash while using a Push() loop:

panic: runtime error: slice bounds out of range [8:6]

goroutine 131 [running]:
bufio.(*Reader).ReadSlice(0xc00005b020, 0x44760a, 0x96, 0x8, 0xc0000e3a68, 0x4cae88, 0x0)
        /usr/local/go/src/bufio/bufio.go:334 +0x22d
bufio.(*Reader).ReadLine(0xc00005b020, 0x96, 0xa0, 0x96, 0x0, 0x0, 0xc000464776)
        /usr/local/go/src/bufio/bufio.go:388 +0x34
github.com/expectedsh/go-sonic/sonic.(*connection).read(0xc0000ce990, 0xc0004646e0, 0x96, 0xa0, 0x96)
        /go/pkg/mod/github.com/expectedsh/[email protected]/sonic/connection.go:52 +0x76
github.com/expectedsh/go-sonic/sonic.ingesterChannel.Push(0xc000064f80, 0xa56b94, 0x8, 0xa56b94, 0x8, 0xc000027320, 0x24, 0xc000367f80, 0x56, 0xc0000ee790, ...)
        /go/pkg/mod/github.com/expectedsh/[email protected]/sonic/ingester.go:122 +0x351
REDACTED/be/services/implementation/search-ingestion.git/pkg/provider/sonic.(*ingest).Shipment(0xc000064f40, 0xc0002a5a00, 0x8, 0xc00051ee00)
        /go/src/pkg/provider/sonic/init.go:61 +0x14a
main.(*svc).handleSearchIngestEvent(0xc00008ed80, 0xb22c40, 0xc0000655c0, 0xc00021cfc0)
        /go/src/cmd/service/main.go:66 +0x32e
cloud.google.com/go/pubsub.(*Subscription).receive.func3(0xc00002d0d0, 0xc0001f8b60, 0xb22c40, 0xc0000655c0, 0xc00021cfc0)
        /go/pkg/mod/cloud.google.com/[email protected]/pubsub/subscription.go:575 +0x7b
created by cloud.google.com/go/pubsub.(*Subscription).receive
        /go/pkg/mod/cloud.google.com/[email protected]/pubsub/subscription.go:573 +0x221

I don't use BulkPush for now as I receive events from a queue.

Buffer overflow on push: split based on Sonic buffer limit

Thanks for this library by the way! It's been super helpful!

Feature Request: Buffer aware ingest push splitting

Per the sonic protocol document:

Upon starting a Sonic Channel session, your library should read the buffer(20000) parameter in the STARTED response, and use this value (in bytes) as to know when a command data should be truncated and split in multiple sub-commands (to avoid buffer overflows, ie. sending too much data in a single command);

This is implemented in the JavaScript client here

For performance reasons, the JavaScript client is not word-aware when splitting, which means we can mis-index a word that has been split in 2 chunks. This is a trade-off, but on large text bodies it should not hurt much.

The JavaScript client will truncate to 50% of the allowed in-buffer characters, so that we leave some character space for other parts of the command. Then, divide by 4 as an UTF-8 character contains a maximum of 4 bytes.
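
The arithmetic above can be sketched in Go as follows. This is an illustration of the rule as described for the JavaScript client, not code from go-sonic; `maxChunkRunes` and `splitByRunes` are hypothetical names. With a 20000-byte buffer the rule gives 20000 / 2 / 4 = 2500 characters per chunk.

```go
package main

import (
	"fmt"
	"strings"
)

// maxChunkRunes applies the rule described above: keep 50% of the buffer,
// then divide by 4 because a UTF-8 character takes at most 4 bytes.
func maxChunkRunes(bufferBytes int) int {
	return bufferBytes / 2 / 4
}

// splitByRunes cuts text into chunks of at most maxRunes characters.
// Like the JavaScript client, it is not word-aware, so a word may be
// split across two chunks.
func splitByRunes(text string, maxRunes int) []string {
	if maxRunes <= 0 {
		return nil
	}
	runes := []rune(text)
	var chunks []string
	for len(runes) > 0 {
		n := maxRunes
		if len(runes) < n {
			n = len(runes)
		}
		chunks = append(chunks, string(runes[:n]))
		runes = runes[n:]
	}
	return chunks
}

func main() {
	limit := maxChunkRunes(20000)
	fmt.Println(limit) // prints 2500

	chunks := splitByRunes(strings.Repeat("a", 6000), limit)
	fmt.Println(len(chunks)) // prints 3
}
```

Splitting on runes rather than bytes avoids cutting a multi-byte character in half, and the loop always terminates because each iteration consumes at least one rune.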

The typical flow for ingest begins:

CONNECTED <sonic-server v1.2.2>
START ingest SecretPassword
STARTED ingest protocol(1) buffer(20000)

Which should be read and parsed here I believe.
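
Reading the buffer size out of that STARTED line could look like the sketch below. It is an assumption about how one might parse the response, not the library's code; `parseBufferSize` is a hypothetical helper, and the input is the line from the flow above.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// bufferParam matches the buffer(N) parameter of a STARTED response.
var bufferParam = regexp.MustCompile(`buffer\((\d+)\)`)

// parseBufferSize extracts N from "... buffer(N)" and returns it in bytes.
func parseBufferSize(line string) (int, error) {
	m := bufferParam.FindStringSubmatch(line)
	if m == nil {
		return 0, fmt.Errorf("no buffer parameter in %q", line)
	}
	return strconv.Atoi(m[1])
}

func main() {
	size, err := parseBufferSize("STARTED ingest protocol(1) buffer(20000)")
	if err != nil {
		panic(err)
	}
	fmt.Println(size) // prints 20000
}
```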

Steps to Reproduce problem:

  1. Run sonic as per instructions:
docker run -p 1491:1491 -e RUST_BACKTRACE=1 -v $PWD/config.cfg:/etc/sonic.cfg -v $PWD/store:/var/lib/sonic/store valeriansaliou/sonic:v1.2.2
  2. Take the example code in this repository, fill the Text of one IngestBulkRecord with 20000 characters, and run it against Sonic. Sonic will report a buffer overflow:
(DEBUG) - got mode response: ingest
(INFO) - closing channel thread because of buffer overflow
thread 'sonic-channel-client' panicked at 'buffer overflow (40004/20002 bytes)', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/sonic-server-1.2.2/src/channel/handle.rs:149:29
stack backtrace:
   0: backtrace::backtrace::libunwind::trace
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.29/src/backtrace/libunwind.rs:88
   1: backtrace::backtrace::trace_unsynchronized
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.29/src/backtrace/mod.rs:66
   2: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:47
   3: std::sys_common::backtrace::print
             at src/libstd/sys_common/backtrace.rs:36
   4: std::panicking::default_hook::{{closure}}
             at src/libstd/panicking.rs:200
   5: std::panicking::default_hook
             at src/libstd/panicking.rs:214
   6: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:477
   7: std::panicking::continue_panic_fmt
             at src/libstd/panicking.rs:384
   8: std::panicking::begin_panic_fmt
             at src/libstd/panicking.rs:339
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Thanks again for your work!

Bulk Push stuck in loop

Thank you for the library :)

Been playing around with it and got stuck trying the first example. I have set up a dummy repo here: https://github.com/BenSlabbert/go-sonic-test

After starting Sonic with docker-compose up and then running the project with go run main.go, the program hangs.

After some debugging, it looks like it gets stuck in func splitText(longString string, maxLen int) []string in the file ingester.go. Since this is called by the Push func, that is affected too.

Bulk functions do not work as expected with more than one goroutine

As referenced in issue #9, the bulk functions get stuck when using multiple goroutines.

The reason I created the bulk functions is to dispatch work among multiple goroutines. If that leads to unexpected behavior, this code should be refactored to work properly.
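
For illustration, dispatching records among goroutines without sharing a connection could look like the sketch below. It is not the library's dispatch algorithm: `record`, `dispatch` and `newPush` are hypothetical names, and `newPush` stands in for "open one connection per goroutine".

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// record mirrors the two values used in the README example (hypothetical type).
type record struct{ Object, Text string }

// dispatch splits records into at most `workers` contiguous chunks and
// processes each chunk in its own goroutine. newPush is called once per
// goroutine so that each one owns its push function (its "connection").
func dispatch(records []record, workers int, newPush func() func(record) error) []error {
	if len(records) == 0 {
		return nil
	}
	if workers < 1 {
		workers = 1
	}
	if workers > len(records) {
		workers = len(records)
	}
	chunk := (len(records) + workers - 1) / workers // ceiling division

	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		errs []error
	)
	for start := 0; start < len(records); start += chunk {
		end := start + chunk
		if end > len(records) {
			end = len(records)
		}
		wg.Add(1)
		go func(part []record) {
			defer wg.Done()
			push := newPush() // never shared between goroutines
			for _, r := range part {
				if err := push(r); err != nil {
					mu.Lock()
					errs = append(errs, err)
					mu.Unlock()
				}
			}
		}(records[start:end])
	}
	wg.Wait()
	return errs
}

func main() {
	records := []record{
		{"id:6ab56b4kk3", "Star wars"},
		{"id:5hg67f8dg5", "Spider man"},
		{"id:1m2n3b4vf6", "Batman"},
		{"id:68d96h5h9d0", "This is another movie"},
	}
	var pushed int64
	errs := dispatch(records, 3, func() func(record) error {
		return func(r record) error {
			atomic.AddInt64(&pushed, 1) // a real push would write to Sonic here
			return nil
		}
	})
	fmt.Println(pushed, len(errs)) // prints 4 0
}
```

Because every goroutine gets its own push function and the loop over chunks is bounded by the number of records, the sketch cannot hang on a shared connection.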

Get the Text and not just the Object

Hi,

is there any way to get the text back and not just the object?
If I run your example, I just receive the object which doesn't do much for me.

why read twice in newConnection?

func newConnection(d *driver) (*connection, error) {
	c := &connection{}
	c.close()
	conn, err := net.Dial("tcp", fmt.Sprintf("%s:%d", d.Host, d.Port))
	if err != nil {
		return nil, err
	}

	c.closed = false
	c.conn = conn
	c.reader = bufio.NewReader(c.conn)

	err = c.write(fmt.Sprintf("START %s %s", d.channel, d.Password))
	if err != nil {
		return nil, err
	}

	// what is the purpose?
	_, err = c.read()
	_, err = c.read()
	if err != nil {
		return nil, err
	}
	return c, nil
}

Possible connection leak

In driver.go:

func (c *driver) Quit() error {
	err := c.write("QUIT")
	if err != nil {
		return err
	}

	// should get ENDED
	_, err = c.read()
	c.close()
	return err
}

If c.write returns an error and the underlying TCP connection isn't closed, c.close() will never be called and the connection will be leaked since there's no way of closing the connection as a consumer since that method isn't exported.
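
One possible fix is to close on every path with defer. The sketch below is self-contained and uses a hypothetical `conn` type with stubbed `write`, `read` and `close` methods in place of the driver's real connection; it only illustrates the control flow, not the library's internals.

```go
package main

import (
	"errors"
	"fmt"
)

// errWrite is an illustrative error used to simulate a failed write.
var errWrite = errors.New("broken pipe")

// conn is a hypothetical stand-in for the driver's connection.
type conn struct {
	writeErr error // returned by write, to simulate a failure
	closed   bool
}

func (c *conn) write(cmd string) error { return c.writeErr }
func (c *conn) read() (string, error)  { return "ENDED quit", nil }
func (c *conn) close()                 { c.closed = true }

// quit sends QUIT and closes the connection whether or not write succeeds.
func (c *conn) quit() error {
	defer c.close()
	if err := c.write("QUIT"); err != nil {
		return err
	}
	// should get ENDED
	_, err := c.read()
	return err
}

func main() {
	c := &conn{writeErr: errWrite}
	err := c.quit()
	fmt.Println(err, c.closed) // prints: broken pipe true
}
```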

How do you use sonic with mongodb?

Our production environment uses MongoDB. I read the official Sonic documentation, but it does not explain how to use Sonic together with MongoDB. Can you provide more detailed documentation? How do you use Sonic with MongoDB?

Bug with BulkPush function

I don't know why, but although BulkPush pushes the []sonic.IngestBulkRecord to Sonic successfully, it returns an error. This does not happen with the Push() function.
