
olric's Introduction


Olric is a distributed, in-memory object store. It's designed from the ground up to be distributed, and it can be used both as an embedded Go library and as a language-independent service.

With Olric, you can instantly create a fast, scalable, shared pool of RAM across a cluster of computers.

Olric is implemented in Go and uses the Redis serialization protocol, so it can be used with existing Redis clients in every major programming language.

Olric is highly scalable and available. Distributed applications can use it for distributed caching, clustering and publish-subscribe messaging.

It is designed to scale out to hundreds of members and thousands of clients. When you add new members, they automatically discover the cluster and linearly increase the memory capacity. Olric offers simple scalability, partitioning (sharding), and re-balancing out-of-the-box. It does not require any extra coordination processes. With Olric, when you start another process to add more capacity, data and backups are automatically and evenly balanced.

See Docker and Samples sections to get started!

Join our Discord server!

The current production version is v0.5.4

About versions

Olric v0.4 and earlier use the Olric Binary Protocol; v0.5.x and later use the Redis serialization protocol for communication, and the API changed significantly. The Olric v0.4.x tree will keep receiving bug fixes and security updates, but I recommend considering an upgrade to the new version.

This document only covers v0.5.x and later. See v0.4.x documents here.

At a glance

  • Designed to share some transient, approximate, fast-changing data between servers,
  • Uses Redis serialization protocol,
  • Implements a distributed hash table,
  • Provides a drop-in replacement for Redis Publish/Subscribe messaging system,
  • Supports both programmatic and declarative configuration,
  • Embeddable but can be used as a language-independent service with olricd,
  • Supports different eviction algorithms (including LRU and TTL),
  • Highly available and horizontally scalable,
  • Provides best-effort consistency guarantees without being a complete CP solution (it is, in fact, PA/EC),
  • Supports replication by default (with sync and async options),
  • Quorum-based voting for replica control (Read/Write quorums),
  • Supports atomic operations,
  • Provides an iterator on distributed maps,
  • Provides a plugin interface for service discovery daemons,
  • Provides a locking primitive inspired by Redis' SETNX,

Possible Use Cases

Olric is an eventually consistent, unordered key/value data store. It supports various eviction mechanisms for distributed caching implementations. Olric also provides publish-subscribe messaging, data replication, failure detection and simple anti-entropy services.

It's good at distributed caching and publish/subscribe messaging.

Table of Contents

Features

  • Designed to share some transient, approximate, fast-changing data between servers,
  • Accepts arbitrary types as value,
  • Only in-memory,
  • Uses Redis protocol,
  • Compatible with existing Redis clients,
  • Embeddable but can be used as a language-independent service with olricd,
  • GC-friendly storage engine,
  • O(1) running time for lookups,
  • Supports atomic operations,
  • Provides a lock implementation which can be used for non-critical purposes,
  • Different eviction policies: LRU, MaxIdleDuration and Time-To-Live (TTL),
  • Highly available,
  • Horizontally scalable,
  • Provides best-effort consistency guarantees without being a complete CP solution (it is, in fact, PA/EC),
  • Distributes load fairly among cluster members with a consistent hash function,
  • Supports replication by default (with sync and async options),
  • Quorum-based voting for replica control,
  • Thread-safe by default,
  • Provides an iterator on distributed maps,
  • Provides a plugin interface for service discovery daemons and cloud providers,
  • Provides a locking primitive inspired by Redis' SETNX,
  • Provides a drop-in replacement for Redis' Publish-Subscribe messaging feature.

See Architecture section to see details.

Support

Feel free to ask any questions about Olric and possible integration problems.

Also feel free to open an issue on GitHub to report bugs and share feature requests.

Installing

With a correctly configured Golang environment:

go install github.com/buraksezer/olric/cmd/olricd@v0.5.4

Now you can start using Olric:

olricd -c cmd/olricd/olricd-local.yaml

See Configuration section to create your cluster properly.

Docker

You can launch an olricd Docker container by running the following command:

docker run -p 3320:3320 olricio/olricd:v0.5.4

This command pulls the olricd Docker image and runs a new Olric instance. Note that the container exposes ports 3320 (Olric) and 3322 (memberlist).

Now, you can access an Olric cluster using any Redis client including redis-cli:

redis-cli -p 3320
127.0.0.1:3320> DM.PUT my-dmap my-key "Olric Rocks!"
OK
127.0.0.1:3320> DM.GET my-dmap my-key
"Olric Rocks!"
127.0.0.1:3320>
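
Because Olric speaks the Redis protocol, the same commands can be issued from any Redis client library, not just redis-cli. Below is a minimal sketch using go-redis (v9 assumed) and its generic Do call; the DM.* commands are Olric-specific and are therefore not part of go-redis' typed API:

package main

import (
  "context"
  "fmt"
  "log"

  "github.com/redis/go-redis/v9"
)

func main() {
  ctx := context.Background()

  // Connect to the olricd container started above.
  rdb := redis.NewClient(&redis.Options{Addr: "localhost:3320"})

  // DM.PUT and DM.GET are Olric commands, so the generic Do API is used here.
  if err := rdb.Do(ctx, "DM.PUT", "my-dmap", "my-key", "Olric Rocks!").Err(); err != nil {
    log.Fatalf("DM.PUT failed: %v", err)
  }

  value, err := rdb.Do(ctx, "DM.GET", "my-dmap", "my-key").Text()
  if err != nil {
    log.Fatalf("DM.GET failed: %v", err)
  }
  fmt.Println("my-key:", value)
}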

Getting Started

With olricd, you can create an Olric cluster with a few commands. This is how to install olricd:

go install github.com/buraksezer/olric/cmd/olricd@v0.5.4

Let's create a cluster with the following:

olricd -c <YOUR_CONFIG_FILE_PATH>

You can find the sample configuration file under cmd/olricd/olricd-local.yaml. It runs perfectly well with a single node. olricd also supports the OLRICD_CONFIG environment variable to set the configuration file. Just like that:

OLRICD_CONFIG=<YOUR_CONFIG_FILE_PATH> olricd

Olric uses hashicorp/memberlist for failure detection and cluster membership. Currently, there are different ways to discover peers in a cluster. You can use a static list of nodes in your configuration. It's ideal for development and test environments. Olric also supports Consul, Kubernetes and all well-known cloud providers for service discovery. Please take a look at Service Discovery section for further information.

See Client-Server section to get more information about this deployment scenario.

Maintaining a list of peers manually

Basically, there is a list of nodes under the memberlist block in the configuration file. In order to create an Olric cluster, you just need to add the Host:Port pairs of the other nodes. Please note that the port is the memberlist port of the peer. It is 3322 by default.

memberlist:
  peers:
    - "localhost:3322"

Thanks to hashicorp/memberlist, Olric nodes can share the full list of members with each other. So an Olric node can discover the whole cluster by using a single member address.
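
For example, to run two olricd instances on the same host, you can give each instance its own ports and point their memberlist peers at each other. The layout below is a hedged sketch based on cmd/olricd/olricd-local.yaml; the exact keys may differ between releases:

# node-1.yaml (assumed layout)
olricd:
  bindAddr: "127.0.0.1"
  bindPort: 3320

memberlist:
  bindAddr: "127.0.0.1"
  bindPort: 3322
  peers:
    - "127.0.0.1:3324" # node-2's memberlist port

# node-2.yaml (assumed layout)
olricd:
  bindAddr: "127.0.0.1"
  bindPort: 3321

memberlist:
  bindAddr: "127.0.0.1"
  bindPort: 3324
  peers:
    - "127.0.0.1:3322" # node-1's memberlist port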

Embedding into your Go application

See Samples section to learn how to embed Olric into your existing Golang application.

Operation Modes

Olric has two different operation modes.

Embedded Member

In Embedded Member Mode, members include both the application itself and Olric data and services. The advantage of Embedded Member Mode is low-latency data access and locality.

Client-Server

In Client-Server Mode, Olric data and services are centralized in one or more servers, and they are accessed by the application through clients. You can have a cluster of servers that can be independently created and scaled. Your clients communicate with these members to reach Olric data and services on them.

Client-Server deployment has advantages including more predictable and reliable performance, easier identification of problem causes and, most importantly, better scalability. When you need to scale in this deployment type, just add more Olric server members. You can address client and server scalability concerns separately.

Golang Client

The official Golang client is defined by the Client interface. There are two different implementations of that interface in this repository: EmbeddedClient provides a client implementation for the embedded-member scenario, and ClusterClient provides an implementation of the same interface for the client-server deployment scenario. You can also use ClusterClient in embedded-member deployments, but EmbeddedClient performs better there because queries are served locally.

See the client documentation on pkg.go.dev

Cluster Events

Olric can publish cluster events to the cluster.events channel. Available cluster events:

  • node-join-event
  • node-left-event
  • fragment-migration-event
  • fragment-received-event

If you want to receive these events, set EnableClusterEventsChannel to true and subscribe to the cluster.events channel. The default is false.

See events/cluster_events.go file to get more information about events.
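
The following is a hedged sketch of consuming these events from Go. It reuses the embedded-member and Pub/Sub APIs shown in the Samples section; the EnableClusterEventsChannel field name is taken from the description above and should be verified against the config package of your release:

package main

import (
  "context"
  "fmt"
  "log"

  "github.com/buraksezer/olric"
  "github.com/buraksezer/olric/config"
)

func main() {
  c := config.New("local")
  // Assumed field name, based on the description above.
  c.EnableClusterEventsChannel = true

  ctx, cancel := context.WithCancel(context.Background())
  c.Started = func() {
    defer cancel()
    log.Println("[INFO] Olric is ready to accept connections")
  }

  db, err := olric.New(c)
  if err != nil {
    log.Fatalf("Failed to create Olric instance: %v", err)
  }
  go func() {
    if err := db.Start(); err != nil {
      log.Fatalf("olric.Start returned an error: %v", err)
    }
  }()
  <-ctx.Done()

  e := db.NewEmbeddedClient()
  ps, err := e.NewPubSub()
  if err != nil {
    log.Fatalf("olric.NewPubSub returned an error: %v", err)
  }

  // cluster.events is a regular Pub/Sub channel; every message is a JSON document.
  rps := ps.Subscribe(context.Background(), "cluster.events")
  for msg := range rps.Channel() {
    fmt.Printf("cluster event: %s\n", msg.Payload)
  }
}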

Commands

Olric uses Redis protocol and supports Redis-style commands to query the database. You can use any Redis client, including redis-cli. The official Go client is a thin layer around go-redis/redis package. See Golang Client section for the documentation.

Distributed Map

DM.PUT

DM.PUT sets the value for the given key. It overwrites any previous value for that key.

DM.PUT dmap key value [ EX seconds | PX milliseconds | EXAT unix-time-seconds | PXAT unix-time-milliseconds ] [ NX | XX]

Example:

127.0.0.1:3320> DM.PUT my-dmap my-key value
OK

Options:

The DM.PUT command supports a set of options that modify its behavior:

  • EX seconds -- Set the specified expire time, in seconds.
  • PX milliseconds -- Set the specified expire time, in milliseconds.
  • EXAT timestamp-seconds -- Set the specified Unix time at which the key will expire, in seconds.
  • PXAT timestamp-milliseconds -- Set the specified Unix time at which the key will expire, in milliseconds.
  • NX -- Only set the key if it does not already exist.
  • XX -- Only set the key if it already exists.

Return:

  • Simple string reply: OK if DM.PUT was executed correctly.
  • KEYFOUND: (error) if the DM.PUT operation was not performed because the user specified the NX option but the condition was not met.
  • KEYNOTFOUND: (error) if the DM.PUT operation was not performed because the user specified the XX option but the condition was not met.
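
For illustration, here is how the NX option and its error reply look over redis-cli; the exact error text may differ slightly between versions:

127.0.0.1:3320> DM.PUT my-dmap my-key "value" EX 30 NX
OK
127.0.0.1:3320> DM.PUT my-dmap my-key "other-value" NX
(error) KEYFOUND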

DM.GET

DM.GET gets the value for the given key. It returns (error)KEYNOTFOUND if the key doesn't exist.

DM.GET dmap key

Example:

127.0.0.1:3320> DM.GET dmap key
"value"

Return:

Bulk string reply: the value of key, or (error)KEYNOTFOUND when key does not exist.

DM.DEL

DM.DEL deletes values for the given keys. It doesn't return any error if the key does not exist.

DM.DEL dmap key [key...]

Example:

127.0.0.1:3320> DM.DEL dmap key1 key2
(integer) 2

Return:

  • Integer reply: The number of keys that were removed.

DM.EXPIRE

DM.EXPIRE updates or sets the timeout for the given key. It returns KEYNOTFOUND if the key doesn't exist. After the timeout has expired, the key will automatically be deleted.

The timeout will only be cleared by commands that delete or overwrite the contents of the key, including DM.DEL, DM.PUT, DM.GETPUT.

DM.EXPIRE dmap key seconds

Example:

127.0.0.1:3320> DM.EXPIRE dmap key 1
OK

Return:

  • Simple string reply: OK if DM.EXPIRE was executed correctly.
  • KEYNOTFOUND: (error) when key does not exist.

DM.PEXPIRE

DM.PEXPIRE updates or sets the timeout for the given key. It returns KEYNOTFOUND if the key doesn't exist. After the timeout has expired, the key will automatically be deleted.

The timeout will only be cleared by commands that delete or overwrite the contents of the key, including DM.DEL, DM.PUT, DM.GETPUT.

DM.PEXPIRE dmap key milliseconds

Example:

127.0.0.1:3320> DM.PEXPIRE dmap key 1000
OK

Return:

  • Simple string reply: OK if DM.PEXPIRE was executed correctly.
  • KEYNOTFOUND: (error) when key does not exist.

DM.DESTROY

DM.DESTROY flushes the given DMap on the cluster. You should know that there is no global lock on DMaps. DM.PUT and DM.DESTROY commands may run concurrently on the same DMap.

DM.DESTROY dmap

Example:

127.0.0.1:3320> DM.DESTROY dmap
OK

Return:

  • Simple string reply: OK, if DM.DESTROY was executed correctly.

Atomic Operations

Operations on key/value pairs are performed by the partition owner. In addition, atomic operations are guarded by a lock implementation which can be found under internal/locker. This means that Olric guarantees the consistency of atomic operations as long as there is no network partition. The basic flow for DM.INCR:

  • Acquire the lock for the given key,
  • Call DM.GET to retrieve the current value,
  • Calculate the new value,
  • Call DM.PUT to set the new value,
  • Release the lock.

It's important to know that if you call DM.PUT and DM.GETPUT concurrently on the same key, this will break the atomicity.

The internal/locker package is vendored from Docker (moby).

Important note about consistency:

You should know that Olric is a PA/EC product (see Consistency and Replication Model). If your network is stable, all operations on a key/value pair are performed by a single cluster member, so you can rely on consistency while the cluster is stable. However, computer networks fail occasionally, processes crash, and random GC pauses may happen; many factors can lead to network partitioning. If you cannot tolerate losing strong consistency under network partitioning, you need to use a different tool for atomic operations.

See Hazelcast and the Mythical PA/EC System and Jepsen Analysis on Hazelcast 3.8.3 for more insight on this topic.

DM.INCR

DM.INCR atomically increments the number stored at key by delta. The return value is the new value after being incremented or an error.

DM.INCR dmap key delta

Example:

127.0.0.1:3320> DM.INCR dmap key 10
(integer) 10

Return:

  • Integer reply: the value of key after the increment.

DM.DECR

DM.DECR atomically decrements the number stored at key by delta. The return value is the new value after being decremented, or an error.

DM.DECR dmap key delta

Example:

127.0.0.1:3320> DM.DECR dmap key 10
(integer) 0

Return:

  • Integer reply: the value of key after the decrement.

DM.GETPUT

DM.GETPUT atomically sets key to value and returns the old value stored at the key.

DM.GETPUT dmap key value

Example:

127.0.0.1:3320> DM.GETPUT dmap key value-1
(nil)
127.0.0.1:3320> DM.GETPUT dmap key value-2
"value-1"

Return:

  • Bulk string reply: the old value stored at the key.

DM.INCRBYFLOAT

DM.INCRBYFLOAT atomically increments the number stored at key by delta. The return value is the new value after being incremented or an error.

DM.INCRBYFLOAT dmap key delta

Example:

127.0.0.1:3320> DM.PUT dmap key 10.50
OK
127.0.0.1:3320> DM.INCRBYFLOAT dmap key 0.1
"10.6"
127.0.0.1:3320> DM.PUT dmap key 5.0e3
OK
127.0.0.1:3320> DM.INCRBYFLOAT dmap key 2.0e2
"5200"

Return:

  • Bulk string reply: the value of key after the increment.

Locking

Important: The lock provided by DMap implementation is approximate and only to be used for non-critical purposes.

The DMap implementation is already thread-safe to meet your thread safety requirements. When you want to have more control on the concurrency, you can use DM.LOCK command. Olric borrows the locking algorithm from Redis. Redis authors propose the following algorithm:

The command is a simple way to implement a locking system with Redis.

A client can acquire the lock if the above command returns OK (or retry after some time if the command returns Nil), and remove the lock just using DEL.

The lock will be auto-released after the expire time is reached.

It is possible to make this system more robust modifying the unlock schema as follows:

Instead of setting a fixed string, set a non-guessable large random string, called token. Instead of releasing the lock with DEL, send a script that only removes the key if the value matches. This avoids that a client will try to release the lock after the expire time deleting the key created by another client that acquired the lock later.

The equivalent of the SETNX command in Olric is DM.PUT dmap key value NX. The DM.LOCK command properly implements the algorithm proposed above.

You should know that this implementation is subject to the clustering algorithm, so there is no guarantee of reliability in the case of network partitioning. I recommend using the lock implementation for efficiency purposes rather than correctness.
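
Putting the pieces together, a typical lock round trip over redis-cli looks like the following; the token is illustrative and the exact error formatting may vary:

127.0.0.1:3320> DM.LOCK my-dmap resource-key 10 EX 30
2363ec600be286cb10fbb35181efb029
127.0.0.1:3320> DM.LOCK my-dmap resource-key 1
(error) LOCKNOTACQUIRED
127.0.0.1:3320> DM.UNLOCK my-dmap resource-key 2363ec600be286cb10fbb35181efb029
OK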

Important note about consistency:

You should know that Olric is a PA/EC product (see Consistency and Replication Model). If your network is stable, all operations on a key/value pair are performed by a single cluster member, so you can rely on consistency while the cluster is stable. However, computer networks fail occasionally, processes crash, and random GC pauses may happen; many factors can lead to network partitioning. If you cannot tolerate losing strong consistency under network partitioning, you need to use a different tool for locking.

See Hazelcast and the Mythical PA/EC System and Jepsen Analysis on Hazelcast 3.8.3 for more insight on this topic.

DM.LOCK

DM.LOCK sets a lock for the given key. The acquired lock is only valid for the key in this DMap. It returns immediately if it acquires the lock for the given key. Otherwise, it waits until deadline.

DM.LOCK returns a token. You must keep that token to unlock the key. Using prefixed keys is highly recommended. If the key does already exist in the DMap, DM.LOCK will wait until the deadline is exceeded.

DM.LOCK dmap key seconds [ EX seconds | PX milliseconds ]

Options:

  • EX seconds -- Set the specified expire time, in seconds.
  • PX milliseconds -- Set the specified expire time, in milliseconds.

Example:

127.0.0.1:3320> DM.LOCK dmap lock.key 10
2363ec600be286cb10fbb35181efb029

Return:

  • Simple string reply: a token to unlock or lease the lock.
  • NOSUCHLOCK: (error) returned when the requested lock does not exist.
  • LOCKNOTACQUIRED: (error) returned when the requested lock could not be acquired.

DM.UNLOCK

DM.UNLOCK releases an acquired lock for the given key. It returns NOSUCHLOCK if there is no lock for the given key.

DM.UNLOCK dmap key token

Example:

127.0.0.1:3320> DM.UNLOCK dmap key 2363ec600be286cb10fbb35181efb029
OK

Return:

  • Simple string reply: OK if DM.UNLOCK was executed correctly.
  • NOSUCHLOCK: (error) returned when the lock does not exist.

DM.LOCKLEASE

DM.LOCKLEASE sets or updates the timeout of the acquired lock for the given key. It returns NOSUCHLOCK if there is no lock for the given key.

DM.LOCKLEASE accepts seconds as timeout.

DM.LOCKLEASE dmap key token seconds

Example:

127.0.0.1:3320> DM.LOCKLEASE dmap key 2363ec600be286cb10fbb35181efb029 100
OK

Return:

  • Simple string reply: OK if DM.LOCKLEASE was executed correctly.
  • NOSUCHLOCK: (error) returned when the lock does not exist.

DM.PLOCKLEASE

DM.PLOCKLEASE sets or updates the timeout of the acquired lock for the given key. It returns NOSUCHLOCK if there is no lock for the given key.

DM.PLOCKLEASE accepts milliseconds as timeout.

DM.PLOCKLEASE dmap key token milliseconds

Example:

127.0.0.1:3320> DM.PLOCKLEASE dmap key 2363ec600be286cb10fbb35181efb029 1000
OK

Return:

  • Simple string reply: OK if DM.PLOCKLEASE was executed correctly.
  • NOSUCHLOCK: (error) returned when the lock does not exist.

DM.SCAN

DM.SCAN is a cursor based iterator. This means that at every call of the command, the server returns an updated cursor that the user needs to use as the cursor argument in the next call.

An iteration starts when the cursor is set to 0, and terminates when the cursor returned by the server is 0. The iterator runs locally on every partition. So you need to know the partition count. If the returned cursor is 0 for a particular partition, you have to start scanning the next partition.

DM.SCAN partID dmap cursor [ MATCH pattern | COUNT count ]

Example:

127.0.0.1:3320> DM.SCAN 3 bench 0
1) "96990"
2)  1) "memtier-2794837"
    2) "memtier-8630933"
    3) "memtier-6415429"
    4) "memtier-7808686"
    5) "memtier-3347072"
    6) "memtier-4247791"
    7) "memtier-3931982"
    8) "memtier-7164719"
    9) "memtier-4710441"
   10) "memtier-8892916"
127.0.0.1:3320> DM.SCAN 3 bench 96990
1) "193499"
2)  1) "memtier-429905"
    2) "memtier-1271812"
    3) "memtier-7835776"
    4) "memtier-2717575"
    5) "memtier-95312"
    6) "memtier-2155214"
    7) "memtier-123931"
    8) "memtier-2902510"
    9) "memtier-2632291"
   10) "memtier-1938450"
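
Below is a hedged sketch of driving DM.SCAN from Go with a plain Redis client (go-redis v9 assumed). It walks every partition and follows each cursor until the server returns 0; the partition count must match the server configuration (271 by default):

package main

import (
  "context"
  "fmt"
  "log"

  "github.com/redis/go-redis/v9"
)

func main() {
  ctx := context.Background()
  rdb := redis.NewClient(&redis.Options{Addr: "localhost:3320"})

  const partitionCount = 271 // must match the server's partition count

  for partID := 0; partID < partitionCount; partID++ {
    cursor := "0"
    for {
      // DM.SCAN replies with a two-element array: the next cursor and a list of keys.
      reply, err := rdb.Do(ctx, "DM.SCAN", partID, "bench", cursor, "COUNT", 100).Slice()
      if err != nil {
        log.Fatalf("DM.SCAN failed: %v", err)
      }
      cursor = fmt.Sprint(reply[0])
      for _, key := range reply[1].([]interface{}) {
        fmt.Println(key)
      }
      if cursor == "0" {
        break // this partition is exhausted, move on to the next one
      }
    }
  }
}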

Publish-Subscribe

SUBSCRIBE, UNSUBSCRIBE and PUBLISH implement the Publish/Subscribe messaging paradigm where senders are not programmed to send their messages to specific receivers. Rather, published messages are characterized into channels, without knowledge of what (if any) subscribers there may be. Subscribers express interest in one or more channels, and only receive messages that are of interest, without knowledge of what (if any) publishers there are. This decoupling of publishers and subscribers can allow for greater scalability and a more dynamic network topology.

Important note: In an Olric cluster, clients can subscribe to every node, and can also publish to every other node. The cluster will make sure that published messages are forwarded as needed.

Source of this section: https://redis.io/commands/?group=pubsub

SUBSCRIBE

Subscribes the client to the specified channels.

SUBSCRIBE channel [channel...]

Once the client enters the subscribed state it is not supposed to issue any other commands, except for additional SUBSCRIBE, PSUBSCRIBE, UNSUBSCRIBE, PUNSUBSCRIBE, PING, and QUIT commands.

PSUBSCRIBE

Subscribes the client to the given patterns.

PSUBSCRIBE pattern [ pattern ...]

Supported glob-style patterns:

  • h?llo subscribes to hello, hallo and hxllo
  • h*llo subscribes to hllo and heeeello
  • h[ae]llo subscribes to hello and hallo, but not hillo
  • Use \ to escape special characters if you want to match them verbatim.

UNSUBSCRIBE

Unsubscribes the client from the given channels, or from all of them if none is given.

UNSUBSCRIBE [channel [channel ...]]

When no channels are specified, the client is unsubscribed from all the previously subscribed channels. In this case, a message for every unsubscribed channel will be sent to the client.

PUNSUBSCRIBE

Unsubscribes the client from the given patterns, or from all of them if none is given.

PUNSUBSCRIBE [pattern [pattern ...]]

When no patterns are specified, the client is unsubscribed from all the previously subscribed patterns. In this case, a message for every unsubscribed pattern will be sent to the client.

PUBSUB CHANNELS

Lists the currently active channels.

PUBSUB CHANNELS [pattern]

An active channel is a Pub/Sub channel with one or more subscribers (excluding clients subscribed to patterns).

If no pattern is specified, all the channels are listed, otherwise if pattern is specified only channels matching the specified glob-style pattern are listed.

PUBSUB NUMPAT

Returns the number of unique patterns that are subscribed to by clients (that are performed using the PSUBSCRIBE command).

PUBSUB NUMPAT

Note that this isn't the count of clients subscribed to patterns, but the total number of unique patterns all the clients are subscribed to.

Important note: In an Olric cluster, clients can subscribe to every node, and can also publish to every other node. The cluster will make sure that published messages are forwarded as needed. That said, PUBSUB's replies in a cluster only report information from the node's Pub/Sub context, rather than the entire cluster.

PUBSUB NUMSUB

Returns the number of subscribers (exclusive of clients subscribed to patterns) for the specified channels.

PUBSUB NUMSUB [channel [channel ...]]

Note that it is valid to call this command without channels. In this case it will just return an empty list.

Important note: In an Olric cluster, clients can subscribe to every node, and can also publish to every other node. The cluster will make sure that published messages are forwarded as needed. That said, PUBSUB's replies in a cluster only report information from the node's Pub/Sub context, rather than the entire cluster.

QUIT

Ask the server to close the connection. The connection is closed as soon as all pending replies have been written to the client.

QUIT

Cluster

CLUSTER.ROUTINGTABLE

CLUSTER.ROUTINGTABLE returns the latest view of the routing table. Simply, it's a data structure that maps partitions to members.

CLUSTER.ROUTINGTABLE

Example:

127.0.0.1:3320> CLUSTER.ROUTINGTABLE
 1) 1) (integer) 0
     2) 1) "127.0.0.1:3320"
     3) (empty array)
  2) 1) (integer) 1
     2) 1) "127.0.0.1:3320"
     3) (empty array)
  3) 1) (integer) 2
     2) 1) "127.0.0.1:3320"
     3) (empty array)

It returns an array of arrays.

Fields:

1) (integer) 0 <- Partition ID
  2) 1) "127.0.0.1:3320" <- Array of the current and previous primary owners
  3) (empty array) <- Array of backup owners. 

CLUSTER.MEMBERS

CLUSTER.MEMBERS returns an array of known members by the server.

CLUSTER.MEMBERS

Example:

127.0.0.1:3320> CLUSTER.MEMBERS
1) 1) "127.0.0.1:3320"
   2) (integer) 1652619388427137000
   3) "true"

Fields:

1) 1) "127.0.0.1:3320" <- Member's name in the cluster
   2) (integer) 1652619388427137000 <- Member's birthdate
   3) "true" <- Is cluster coordinator (the oldest node)

Others

PING

Returns PONG if no argument is provided, otherwise returns a copy of the argument as a bulk string. This command is often used to test if a connection is still alive, or to measure latency.

PING

STATS

The STATS command returns information and statistics about the server in JSON format. See stats/stats.go file.

Configuration

Olric supports both declarative and programmatic configuration. You can choose one of them depending on your needs. Feel free to ask any questions about configuration and integration; please see the Support section.

Embedded-Member Mode

Programmatic Configuration

Olric provides a function to generate default configuration to use in embedded-member mode:

import "github.com/buraksezer/olric/config"
...
c := config.New("local")

The New function takes a parameter called env. It denotes the network environment and is consumed by hashicorp/memberlist. The default configuration is good enough for a distributed caching scenario. In order to see all configuration parameters, please take a look at this.

See Sample Code section for an introduction.
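
As an illustration, a few commonly tuned fields can be set on the returned config before starting the node. The snippet below is a sketch; the field names are assumptions based on the configuration reference and should be checked against the config package of your release:

import "github.com/buraksezer/olric/config"
...
c := config.New("lan")

// Assumed field names; verify them for your release.
c.BindAddr = "0.0.0.0"
c.BindPort = 3320
c.PartitionCount = 271 // the default; a larger prime number is reasonable for big clusters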

Declarative configuration with YAML format

You can also import configuration from a YAML file by using the Load function:

c, err := config.Load("path/to/olric.yaml")

A sample configuration file in YAML format can be found here. This may be the most appropriate way to manage the Olric configuration.

Client-Server Mode

Olric provides olricd to implement client-server mode. olricd takes a YAML file for its configuration. The most basic function of olricd is translating the YAML configuration into Olric's configuration struct. A sample olricd.yaml file is provided here.

Network Configuration

In an Olric instance, there are two different TCP servers: one for Olric, and the other for memberlist. BindAddr is critical for deploying a healthy Olric node. There are different scenarios:

  • You can freely set a domain name or IP address as BindAddr for both Olric and memberlist. Olric will resolve and use it to bind.
  • You can freely set localhost, 127.0.0.1 or ::1 as BindAddr in development environment for both Olric and memberlist.
  • You can freely set 0.0.0.0 as BindAddr for both Olric and memberlist. Olric will pick an IP address, if there is any.
  • If you don't set BindAddr, hostname will be used, and it will be resolved to get a valid IP address.
  • You can set a network interface by using Config.Interface and Config.MemberlistInterface fields. Olric will find an appropriate IP address for the given interfaces, if there is any.
  • You can set both BindAddr and interface parameters. In this case Olric will ensure that BindAddr is available on the given interface.

You should know that Olric needs a single, stable IP address to function properly. If you don't know the IP address of the host at deployment time, you can set BindAddr to 0.0.0.0. Olric will very likely find an IP address for you.

Service Discovery

Olric provides a service discovery interface which can be used to implement plugins.

We currently provide several service discovery plugins for automatic peer discovery in cloud environments.

For more information about installing and configuring the plugins, see their GitHub pages.

Timeouts

Olric nodes support setting KeepAlivePeriod on TCP sockets.

Server-side:

config.KeepAlivePeriod

KeepAlivePeriod denotes whether the operating system should send keep-alive messages on the connection.

Client-side:

config.DialTimeout

Timeout for TCP dial. The timeout includes name resolution, if required. When using TCP, and the host in the address parameter resolves to multiple IP addresses, the timeout is spread over each consecutive dial, such that each is given an appropriate fraction of the time to connect.

config.ReadTimeout

Timeout for socket reads. If reached, commands will fail with a timeout instead of blocking. Use -1 for no timeout and 0 for the default. The default is config.DefaultReadTimeout.

config.WriteTimeout

Timeout for socket writes. If reached, commands will fail with a timeout instead of blocking. The default is config.DefaultWriteTimeout.
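
A hedged sketch of setting these timeouts programmatically in embedded-member mode. The field names mirror the configuration keys above, but depending on the release they may live on config.Config itself or on a nested client configuration struct, so verify them before use:

import (
  "time"

  "github.com/buraksezer/olric/config"
)
...
c := config.New("local")
c.KeepAlivePeriod = 300 * time.Second // server-side TCP keep-alive period
c.DialTimeout = 10 * time.Second      // client-side TCP dial timeout
c.ReadTimeout = 3 * time.Second       // socket read timeout
c.WriteTimeout = 3 * time.Second      // socket write timeout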

Architecture

Overview

Olric distributes data among partitions. Every partition is owned by a cluster member and may have one or more backups for redundancy. When you read or write a DMap entry, you transparently talk to the partition owner. Each request hits the most up-to-date version of a particular data entry in a stable cluster.

In order to find the partition a key belongs to, Olric hashes the key and takes the result modulo the number of partitions:

partID = MOD(hash result, partition count)
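
For illustration, the same computation in Go. Olric's hasher is pluggable, so a 64-bit hash such as xxhash is only an assumption here:

package main

import (
  "fmt"

  "github.com/cespare/xxhash/v2"
)

func main() {
  const partitionCount = 271 // Olric's default partition count

  // Hash the key and take the result modulo the partition count, as described above.
  hash := xxhash.Sum64String("my-key")
  partID := hash % partitionCount
  fmt.Println("my-key belongs to partition", partID)
}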

The partitions are distributed among cluster members using a consistent hashing algorithm. For details, please see buraksezer/consistent.

When a new cluster is created, one of the instances is elected as the cluster coordinator. It manages the partition table:

  • When a node joins or leaves, it distributes the partitions and their backups among the members again,
  • Removes empty previous owners from the partition owners list,
  • Pushes the new partition table to all the members,
  • Pushes the partition table to the cluster periodically.

Members propagate their birthdate (POSIX time in nanoseconds) to the cluster. The coordinator is the oldest member in the cluster. If the coordinator leaves the cluster, the second oldest member is elected as the new coordinator.

Olric has a component called rebalancer which is responsible for keeping underlying data structures consistent:

  • Works on every node,
  • When a node joins or leaves, the cluster coordinator pushes the new partition table. Then, the rebalancer runs immediately and moves the partitions and backups to their new hosts,
  • Merges fragmented partitions.

Partitions have a concept called the owners list. When a node joins or leaves the cluster, a new primary owner may be assigned by the coordinator. At any time, a partition may have one or more owners. If a partition has two or more owners, it is called a fragmented partition. The last added owner is the primary owner. Write operations are handled only by the primary owner; the previous owners are used only for reads and deletes.

When you read a key, the primary owner first tries to find the key locally, then queries the previous owners and the backups, respectively. The delete operation works the same way.

The data (distributed map objects) in a fragmented partition is moved gradually to the primary owner by the rebalancer. Until the move is done, the data remains available on the previous owners. The DMap methods use this list to query data on the cluster.

Please note that 'multiple partition owners' is an undesirable situation, and the rebalancer component is designed to fix it in a short time.

Consistency and Replication Model

Olric is an AP product in the context of the CAP theorem, and it employs a combination of primary-copy and optimistic replication techniques. With optimistic replication, when the partition owner receives a write or delete operation for a key, it applies the operation locally and propagates it to the backup owners.

This technique enables Olric clusters to offer high throughput. However, due to temporary situations in the system, such as network failure, backup owners can miss some updates and diverge from the primary owner. If a partition owner crashes while there is an inconsistency between itself and the backups, strong consistency of the data can be lost.

Two types of backup replication are available: sync and async. Both types are still implementations of the optimistic replication model.

  • sync: Blocks until write/delete operation is applied by backup owners.
  • async: Just fire & forget.

Last-write-wins conflict resolution

Every time a piece of data is written to Olric, a timestamp is attached by the client. When Olric has to deal with conflicting data in the case of network partitioning, it simply chooses the data with the most recent timestamp. This is called the last-write-wins (LWW) conflict resolution policy.

PACELC Theorem

From Wikipedia:

In theoretical computer science, the PACELC theorem is an extension to the CAP theorem. It states that in case of network partitioning (P) in a distributed computer system, one has to choose between availability (A) and consistency (C) (as per the CAP theorem), but else (E), even when the system is running normally in the absence of partitions, one has to choose between latency (L) and consistency (C).

In the context of the PACELC theorem, Olric is a PA/EC product. This means that Olric is a consistent data store when the network is stable, because the key space is divided between partitions and every partition is controlled by its primary owner. All operations on DMaps are redirected to the partition owner.

In the case of network partitioning, Olric chooses availability over consistency. You can still access parts of the cluster when the network is unreliable, but the cluster may return inconsistent results.

Olric implements read-repair and a quorum-based voting system to deal with inconsistencies in DMaps.

Read-Repair on DMaps

Read repair is a feature that allows for inconsistent data to be fixed at query time. Olric tracks every write operation with a timestamp value and assumes that the latest write operation is the valid one. When you want to access a key/value pair, the partition owner retrieves all available copies for that pair and compares the timestamp values. The latest one is the winner. If there is some outdated version of the requested pair, the primary owner propagates the latest version of the pair.

Read-repair is disabled by default for the sake of performance. If you have a use case that requires a more strict consistency control than a distributed caching scenario, you can enable read-repair via the configuration.

Quorum-based replica control

Olric implements Read/Write quorums to keep the data in a consistent state. When you start a write operation on the cluster and the write quorum (W) is 2, the partition owner tries to write the given key/value pair to its own data storage and to the replica nodes. If the number of successful write operations is below W, the primary owner returns ErrWriteQuorum. The read flow is the same: if R=2 and the owner can only access one of the replicas, it returns ErrReadQuorum.
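
A hedged sketch of tuning the replica-control knobs in embedded-member mode. The field names below are assumptions based on this section and the configuration reference; verify them against the config package of your release:

import "github.com/buraksezer/olric/config"
...
c := config.New("lan")

// Assumed field names; verify them for your release.
c.ReplicaCount = 2      // keep one backup copy besides the primary owner
c.WriteQuorum = 2       // W: a write must succeed on both copies
c.ReadQuorum = 1        // R: a single successful read is enough
c.ReadRepair = true     // fix stale replicas at read time (disabled by default)
c.MemberCountQuorum = 2 // minimum cluster size required to keep serving requests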

Simple Split-Brain Protection

Olric implements a technique called majority quorum to manage split-brain conditions. If a network partitioning occurs and some members lose their connection to the rest of the cluster, they immediately stop functioning and return an error to incoming requests. This behaviour is controlled by the MemberCountQuorum parameter. Its default is 1.

When the network heals, the stopped nodes rejoin the cluster and fragmented partitions are merged by their primary owners in accordance with the LWW policy. Olric also implements an ownership report mechanism to fix inconsistencies in partition distribution after a partitioning event.

Eviction

Olric supports different policies to evict keys from distributed maps.

Expire with TTL

Olric implements TTL eviction policy. It shares the same algorithm with Redis:

Periodically Redis tests a few keys at random among keys with an expire set. All the keys that are already expired are deleted from the keyspace.

Specifically this is what Redis does 10 times per second:

  • Test 20 random keys from the set of keys with an associated expire.
  • Delete all the keys found expired.
  • If more than 25% of keys were expired, start again from step 1.

This is a trivial probabilistic algorithm, basically the assumption is that our sample is representative of the whole key space, and we continue to expire until the percentage of keys that are likely to be expired is under 25%.

When a client tries to access a key, Olric returns ErrKeyNotFound if the key is found to be timed out. A background task evicts keys with the algorithm described above.

Expire with MaxIdleDuration

MaxIdleDuration is the maximum time for each entry to stay idle in the DMap. It limits the lifetime of entries relative to the time of the last read or write access performed on them. Entries whose idle period exceeds this limit are expired and evicted automatically. An entry is idle if no Get, Put, PutEx, Expire, PutIf, or PutIfEx operation has been performed on it. Configuration of the MaxIdleDuration feature varies by deployment method.

Expire with LRU

Olric implements an LRU eviction method on DMaps. The approximated LRU algorithm is borrowed from Redis. The Redis authors propose the following algorithm:

It is important to understand that the eviction process works like this:

  • A client runs a new command, resulting in more data added.
  • Redis checks the memory usage, and if it is greater than the maxmemory limit , it evicts keys according to the policy.
  • A new command is executed, and so forth.

So we continuously cross the boundaries of the memory limit, by going over it, and then by evicting keys to return back under the limits.

If a command results in a lot of memory being used (like a big set intersection stored into a new key) for some time the memory limit can be surpassed by a noticeable amount.

Approximated LRU algorithm

The Redis LRU algorithm is not an exact implementation. This means that Redis is not able to pick the best candidate for eviction, that is, the key that was accessed the longest time ago. Instead it runs an approximation of the LRU algorithm, by sampling a small number of keys and evicting the one that is the best (with the oldest access time) among the sampled keys.

Olric tracks access time for every DMap instance. It then picks and sorts a configurable number of keys to select candidates for eviction. Every node runs this algorithm independently. The access log is moved along with the partition when a network partition occurs.

Configuration of eviction mechanisms

Here is a simple configuration block for olricd.yaml:

cache:
  numEvictionWorkers: 1
  maxIdleDuration: ""
  ttlDuration: "100s"
  maxKeys: 100000
  maxInuse: 1000000 # in bytes
  lRUSamples: 10
  evictionPolicy: "LRU" # NONE/LRU

You can also set cache configuration per DMap. Here is a simple configuration for a DMap named foobar:

dmaps:
  foobar:
    maxIdleDuration: "60s"
    ttlDuration: "300s"
    maxKeys: 500000 # number of keys
    lRUSamples: 20
    evictionPolicy: "NONE" # NONE/LRU

If you prefer embedded-member deployment scenario, please take a look at config#CacheConfig and config#DMapCacheConfig for the configuration.

Lock Implementation

The DMap implementation is already thread-safe to meet your thread safety requirements. When you want to have more control on the concurrency, you can use LockWithTimeout and Lock methods. Olric borrows the locking algorithm from Redis. Redis authors propose the following algorithm:

The command is a simple way to implement a locking system with Redis.

A client can acquire the lock if the above command returns OK (or retry after some time if the command returns Nil), and remove the lock just using DEL.

The lock will be auto-released after the expire time is reached.

It is possible to make this system more robust modifying the unlock schema as follows:

Instead of setting a fixed string, set a non-guessable large random string, called token. Instead of releasing the lock with DEL, send a script that only removes the key if the value matches. This avoids that a client will try to release the lock after the expire time deleting the key created by another client that acquired the lock later.

The equivalent of the SETNX command in Olric is PutIf(key, value, IfNotFound). The Lock and LockWithTimeout methods properly implement the algorithm proposed above.

You should know that this implementation is subject to the clustering algorithm, so there is no guarantee of reliability in the case of network partitioning. I recommend using the lock implementation for efficiency purposes rather than correctness.

Important note about consistency:

You should know that Olric is a PA/EC product (see Consistency and Replication Model). If your network is stable, all operations on a key/value pair are performed by a single cluster member, so you can rely on consistency while the cluster is stable. However, computer networks fail occasionally, processes crash, and random GC pauses may happen; many factors can lead to network partitioning. If you cannot tolerate losing strong consistency under network partitioning, you need to use a different tool for locking.

See Hazelcast and the Mythical PA/EC System and Jepsen Analysis on Hazelcast 3.8.3 for more insight on this topic.

Storage Engine

Olric implements a GC-friendly storage engine to store large amounts of data in RAM. Basically, it applies an append-only log approach with indexes. Olric inserts key/value pairs into pre-allocated byte slices (tables in Olric terminology) and indexes that memory region using Go's built-in map. The data type of this map is map[uint64]uint64. When a pre-allocated byte slice is full, Olric allocates a new one and continues inserting new data into it. This design greatly reduces write latency.

When you want to read a key/value pair from the Olric cluster, it scans the related DMap fragment by iterating over the indexes (implemented with the built-in map). The number of allocated byte slices should be small, so Olric finds the key almost immediately. Technically, the read performance depends on the number of keys in the fragment, but the effect of this design on read performance is negligible.

The size of the pre-allocated byte slices is configurable.
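
The following is a simplified, conceptual sketch of the idea described above, not Olric's actual storage engine: entries are appended to a pre-allocated byte slice and an index maps the 64-bit hash of a key to its offset in that slice.

package main

import (
  "encoding/binary"
  "fmt"

  "github.com/cespare/xxhash/v2"
)

// table is a toy append-only table: one pre-allocated byte slice plus an
// index from hashed keys to offsets, mirroring the map[uint64]uint64 idea.
type table struct {
  memory []byte
  offset uint64
  index  map[uint64]uint64
}

func newTable(size int) *table {
  return &table{
    memory: make([]byte, size),
    index:  make(map[uint64]uint64),
  }
}

// put appends a length-prefixed value and records its offset in the index.
// It reports false when the table is full; a real engine would then allocate
// a new table and continue inserting there.
func (t *table) put(key string, value []byte) bool {
  need := uint64(8 + len(value))
  if t.offset+need > uint64(len(t.memory)) {
    return false
  }
  t.index[xxhash.Sum64String(key)] = t.offset
  binary.BigEndian.PutUint64(t.memory[t.offset:], uint64(len(value)))
  copy(t.memory[t.offset+8:], value)
  t.offset += need
  return true
}

// get is an O(1) lookup: one map access, then a read from the byte slice.
func (t *table) get(key string) ([]byte, bool) {
  off, ok := t.index[xxhash.Sum64String(key)]
  if !ok {
    return nil, false
  }
  size := binary.BigEndian.Uint64(t.memory[off:])
  return t.memory[off+8 : off+8+size], true
}

func main() {
  t := newTable(1 << 20) // a 1 MiB pre-allocated table
  t.put("my-key", []byte("Olric Rocks!"))
  if value, ok := t.get("my-key"); ok {
    fmt.Println(string(value))
  }
}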

Samples

In this section, you can find code snippets for various scenarios.

Embedded-member scenario

Distributed map

package main

import (
  "context"
  "fmt"
  "log"
  "time"

  "github.com/buraksezer/olric"
  "github.com/buraksezer/olric/config"
)

func main() {
  // Sample for Olric v0.5.x

  // Deployment scenario: embedded-member
  // This creates a single-node Olric cluster. It's good enough for experimenting.

  // config.New returns a new config.Config with sane defaults. Available values for env:
  // local, lan, wan
  c := config.New("local")

  // Callback function. It's called when this node is ready to accept connections.
  ctx, cancel := context.WithCancel(context.Background())
  c.Started = func() {
    defer cancel()
    log.Println("[INFO] Olric is ready to accept connections")
  }

  // Create a new Olric instance.
  db, err := olric.New(c)
  if err != nil {
    log.Fatalf("Failed to create Olric instance: %v", err)
  }

  // Start the instance. It will form a single-node cluster.
  go func() {
    // Call Start in the background. It's a blocking call.
    err = db.Start()
    if err != nil {
      log.Fatalf("olric.Start returned an error: %v", err)
    }
  }()

  <-ctx.Done()

  // In embedded-member scenario, you can use the EmbeddedClient. It implements
  // the Client interface.
  e := db.NewEmbeddedClient()

  dm, err := e.NewDMap("bucket-of-arbitrary-items")
  if err != nil {
    log.Fatalf("olric.NewDMap returned an error: %v", err)
  }

  ctx, cancel = context.WithCancel(context.Background())

  // Magic starts here!
  fmt.Println("##")
  fmt.Println("Simple Put/Get on a DMap instance:")
  err = dm.Put(ctx, "my-key", "Olric Rocks!")
  if err != nil {
    log.Fatalf("Failed to call Put: %v", err)
  }

  gr, err := dm.Get(ctx, "my-key")
  if err != nil {
    log.Fatalf("Failed to call Get: %v", err)
  }

  // Olric uses the Redis serialization format.
  value, err := gr.String()
  if err != nil {
    log.Fatalf("Failed to read Get response: %v", err)
  }

  fmt.Println("Response for my-key:", value)
  fmt.Println("##")

  // Don't forget to call Shutdown when you want to leave the cluster.
  ctx, cancel = context.WithTimeout(context.Background(), 10*time.Second)
  defer cancel()

  err = db.Shutdown(ctx)
  if err != nil {
    log.Printf("Failed to shutdown Olric: %v", err)
  }
}

Publish-Subscribe

package main

import (
  "context"
  "fmt"
  "log"
  "time"

  "github.com/buraksezer/olric"
  "github.com/buraksezer/olric/config"
)

func main() {
  // Sample for Olric v0.5.x

  // Deployment scenario: embedded-member
  // This creates a single-node Olric cluster. It's good enough for experimenting.

  // config.New returns a new config.Config with sane defaults. Available values for env:
  // local, lan, wan
  c := config.New("local")

  // Callback function. It's called when this node is ready to accept connections.
  ctx, cancel := context.WithCancel(context.Background())
  c.Started = func() {
    defer cancel()
    log.Println("[INFO] Olric is ready to accept connections")
  }

  // Create a new Olric instance.
  db, err := olric.New(c)
  if err != nil {
    log.Fatalf("Failed to create Olric instance: %v", err)
  }

  // Start the instance. It will form a single-node cluster.
  go func() {
    // Call Start in the background. It's a blocking call.
    err = db.Start()
    if err != nil {
      log.Fatalf("olric.Start returned an error: %v", err)
    }
  }()

  <-ctx.Done()

  // In embedded-member scenario, you can use the EmbeddedClient. It implements
  // the Client interface.
  e := db.NewEmbeddedClient()

  ps, err := e.NewPubSub()
  if err != nil {
    log.Fatalf("olric.NewPubSub returned an error: %v", err)
  }

  ctx, cancel = context.WithCancel(context.Background())

  // Olric implements a drop-in replacement of Redis Publish-Subscribe messaging
  // system. PubSub client is just a thin layer around go-redis/redis.
  rps := ps.Subscribe(ctx, "my-channel")

  // Get a channel to read messages from my-channel
  msg := rps.Channel()

  go func() {
    // Publish a message here.
    _, err := ps.Publish(ctx, "my-channel", "Olric Rocks!")
    if err != nil {
      log.Fatalf("PubSub.Publish returned an error: %v", err)
    }
  }()

  // Consume messages
  rm := <-msg

  fmt.Printf("Received message: \"%s\" from \"%s\"", rm.Channel, rm.Payload)

  // Don't forget to call Shutdown when you want to leave the cluster.
  ctx, cancel = context.WithTimeout(context.Background(), 10*time.Second)
  defer cancel()

  err = e.Close(ctx)
  if err != nil {
    log.Printf("Failed to close EmbeddedClient: %v", err)
  }
}

Client-Server scenario

Distributed map

package main

import (
  "context"
  "fmt"
  "log"
  "time"

  "github.com/buraksezer/olric"
)

func main() {
  // Sample for Olric v0.5.x

  // Deployment scenario: client-server

  // NewClusterClient takes a list of the nodes. This list may only contain a
  // load balancer address. Please note that Olric nodes will calculate the partition owner
  // and proxy the incoming requests.
  c, err := olric.NewClusterClient([]string{"localhost:3320"})
  if err != nil {
    log.Fatalf("olric.NewClusterClient returned an error: %v", err)
  }

  // In client-server scenario, you can use the ClusterClient. It implements
  // the Client interface.
  dm, err := c.NewDMap("bucket-of-arbitrary-items")
  if err != nil {
    log.Fatalf("olric.NewDMap returned an error: %v", err)
  }

  ctx, cancel := context.WithCancel(context.Background())

  // Magic starts here!
  fmt.Println("##")
  fmt.Println("Simple Put/Get on a DMap instance:")
  err = dm.Put(ctx, "my-key", "Olric Rocks!")
  if err != nil {
    log.Fatalf("Failed to call Put: %v", err)
  }

  gr, err := dm.Get(ctx, "my-key")
  if err != nil {
    log.Fatalf("Failed to call Get: %v", err)
  }

  // Olric uses the Redis serialization format.
  value, err := gr.String()
  if err != nil {
    log.Fatalf("Failed to read Get response: %v", err)
  }

  fmt.Println("Response for my-key:", value)
  fmt.Println("##")

  // Don't forget to call Shutdown when you want to leave the cluster.
  ctx, cancel = context.WithTimeout(context.Background(), 10*time.Second)
  defer cancel()

  err = c.Close(ctx)
  if err != nil {
    log.Printf("Failed to close ClusterClient: %v", err)
  }
}

SCAN on DMaps

package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/buraksezer/olric"
	"github.com/buraksezer/olric/config"
)

func main() {
	// Sample for Olric v0.5.x

	// Deployment scenario: embedded-member
	// This creates a single-node Olric cluster. It's good enough for experimenting.

	// config.New returns a new config.Config with sane defaults. Available values for env:
	// local, lan, wan
	c := config.New("local")

	// Callback function. It's called when this node is ready to accept connections.
	ctx, cancel := context.WithCancel(context.Background())
	c.Started = func() {
		defer cancel()
		log.Println("[INFO] Olric is ready to accept connections")
	}

	// Create a new Olric instance.
	db, err := olric.New(c)
	if err != nil {
		log.Fatalf("Failed to create Olric instance: %v", err)
	}

	// Start the instance. It will form a single-node cluster.
	go func() {
		// Call Start in the background. It's a blocking call.
		err = db.Start()
		if err != nil {
			log.Fatalf("olric.Start returned an error: %v", err)
		}
	}()

	<-ctx.Done()

	// In embedded-member scenario, you can use the EmbeddedClient. It implements
	// the Client interface.
	e := db.NewEmbeddedClient()

	dm, err := e.NewDMap("bucket-of-arbitrary-items")
	if err != nil {
		log.Fatalf("olric.NewDMap returned an error: %v", err)
	}

	ctx, cancel = context.WithCancel(context.Background())

	// Magic starts here!
	fmt.Println("##")
	fmt.Println("Insert 10 keys")
	var key string
	for i := 0; i < 10; i++ {
		if i%2 == 0 {
			key = fmt.Sprintf("even:%d", i)
		} else {
			key = fmt.Sprintf("odd:%d", i)
		}
		err = dm.Put(ctx, key, nil)
		if err != nil {
			log.Fatalf("Failed to call Put: %v", err)
		}
	}

	i, err := dm.Scan(ctx)
	if err != nil {
		log.Fatalf("Failed to call Scan: %v", err)
	}

	fmt.Println("Iterate over all the keys")
	for i.Next() {
		fmt.Println(">> Key", i.Key())
	}

	i.Close()

	i, err = dm.Scan(ctx, olric.Match("^even:"))
	if err != nil {
		log.Fatalf("Failed to call Scan: %v", err)
	}

	fmt.Println("\n\nScan with regex: ^even:")
	for i.Next() {
		fmt.Println(">> Key", i.Key())
	}

	i.Close()

	// Don't forget to call Shutdown when you want to leave the cluster.
	ctx, cancel = context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	err = db.Shutdown(ctx)
	if err != nil {
		log.Printf("Failed to shutdown Olric: %v", err)
	}
}

Publish-Subscribe

package main

import (
  "context"
  "fmt"
  "log"
  "time"

  "github.com/buraksezer/olric"
)

func main() {
  // Sample for Olric v0.5.x

  // Deployment scenario: client-server

  // NewClusterClient takes a list of the nodes. This list may only contain a
  // load balancer address. Please note that Olric nodes will calculate the partition owner
  // and proxy the incoming requests.
  c, err := olric.NewClusterClient([]string{"localhost:3320"})
  if err != nil {
    log.Fatalf("olric.NewClusterClient returned an error: %v", err)
  }

  // In client-server scenario, you can use the ClusterClient. It implements
  // the Client interface.
  ps, err := c.NewPubSub()
  if err != nil {
    log.Fatalf("olric.NewPubSub returned an error: %v", err)
  }

  ctx, cancel := context.WithCancel(context.Background())

  // Olric implements a drop-in replacement of Redis Publish-Subscribe messaging
  // system. PubSub client is just a thin layer around go-redis/redis.
  rps := ps.Subscribe(ctx, "my-channel")

  // Get a channel to read messages from my-channel
  msg := rps.Channel()

  go func() {
    // Publish a message here.
    _, err := ps.Publish(ctx, "my-channel", "Olric Rocks!")
    if err != nil {
      log.Fatalf("PubSub.Publish returned an error: %v", err)
    }
  }()

  // Consume messages
  rm := <-msg

  fmt.Printf("Received message: \"%s\" from \"%s\"", rm.Channel, rm.Payload)

  // Don't forget to call Shutdown when you want to leave the cluster.
  ctx, cancel = context.WithTimeout(context.Background(), 10*time.Second)
  defer cancel()

  err = c.Close(ctx)
  if err != nil {
    log.Printf("Failed to close ClusterClient: %v", err)
  }
}

Contributions

Please don't hesitate to fork the project and send a pull request or just e-mail me to ask questions and share ideas.

License

The Apache License, Version 2.0 - see LICENSE for more details.

About the name

The inner voice of Turgut Özben, the main character of Oğuz Atay's masterpiece, The Disconnected.

olric's People

Contributors

andrewwinterman, buraksezer, d1ngd0, derekperkins, dunglas, hasit, johnstarich, justinfx, robinbraemer, shawnhsiung, tehsphinx

olric's Issues

Data race in TestStream_PingPong

There is a data race problem in TestStream_PingPong. This has to be fixed before the v0.3.0 release.

=== RUN   TestStream_PingPong
2020/09/04 01:08:54 [INFO] Join completed. Synced with 0 initial nodes => olric.go:303
2020/09/04 01:08:54 [INFO] Routing table has been pushed by 127.0.0.1:45685 => routing.go:504
2020/09/04 01:08:54 [INFO] The cluster coordinator has been bootstrapped => olric.go:277
2020/09/04 01:08:54 [INFO] Olric bindAddr: 127.0.0.1, bindPort: 45685 => olric.go:375
2020/09/04 01:08:54 [INFO] Memberlist bindAddr: 192.168.1.4, bindPort: 3322 => olric.go:383
2020/09/04 01:08:54 [INFO] Cluster coordinator: 127.0.0.1:45685 => olric.go:387
2020/09/04 01:08:54 [INFO] Node name in the cluster: 127.0.0.1:45685 => olric.go:600
==================
WARNING: DATA RACE
Write at 0x00c0001809f0 by goroutine 70:
  github.com/buraksezer/olric/client.(*stream).listenStream()
      /home/buraksezer/go/src/github.com/buraksezer/olric/client/stream.go:82 +0x164
  github.com/buraksezer/olric/client.(*Client).createStream.func2()
      /home/buraksezer/go/src/github.com/buraksezer/olric/client/stream.go:172 +0x76

Previous read at 0x00c0001809f0 by goroutine 116:
  github.com/buraksezer/olric/client.(*stream).checkStreamAliveness()
      /home/buraksezer/go/src/github.com/buraksezer/olric/client/stream.go:108 +0x194
  github.com/buraksezer/olric/client.(*Client).createStream.func4()
      /home/buraksezer/go/src/github.com/buraksezer/olric/client/stream.go:184 +0x76

Goroutine 70 (running) created at:
  github.com/buraksezer/olric/client.(*Client).createStream()
      /home/buraksezer/go/src/github.com/buraksezer/olric/client/stream.go:170 +0x816
  github.com/buraksezer/olric/client.(*Client).addStreamListener()
      /home/buraksezer/go/src/github.com/buraksezer/olric/client/stream.go:236 +0x339
  github.com/buraksezer/olric/client.TestStream_EchoListener()
      /home/buraksezer/go/src/github.com/buraksezer/olric/client/stream_test.go:68 +0x2a4
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1108 +0x202

Goroutine 116 (running) created at:
  github.com/buraksezer/olric/client.(*Client).createStream()
      /home/buraksezer/go/src/github.com/buraksezer/olric/client/stream.go:182 +0x8b0
  github.com/buraksezer/olric/client.(*Client).addStreamListener()
      /home/buraksezer/go/src/github.com/buraksezer/olric/client/stream.go:236 +0x339
  github.com/buraksezer/olric/client.TestStream_EchoListener()
      /home/buraksezer/go/src/github.com/buraksezer/olric/client/stream_test.go:68 +0x2a4
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1108 +0x202
==================
2020/09/04 01:08:58 [INFO] Closing active streams => olric.go:622
2020/09/04 01:08:58 [INFO] Broadcasting a leave message => discovery.go:362
2020/09/04 01:08:58 [INFO] 127.0.0.1:45685 is gone => olric.go:642
    testing.go:1023: race detected during execution of test
--- FAIL: TestStream_PingPong (3.11s)
=== CONT  
    testing.go:1023: race detected during execution of test
FAIL
exit status 1

Drop persistence support

Drop persistence support; it was added too early. BadgerDB is not suitable to fulfil our requirements. We may implement in-house persistence support in the future, of course only after implementing a proper fsck component.

Support: Configuration issues with multiple nodes

I have a number of questions about the Config, now that I have Olric embedded in my application and working, and am spinning up 2 local instances as a cluster experiment.

Question 1
I've found (just as you documented) that the memberlist configuration is complicated. There are a lot of ports that need to be allocated for each instance in order to not conflict with another instance and also so the instances can actually connect into a cluster. For each instance on the same host I have to set:

  1. Config.Name to something like 0.0.0.0:<unique port>
  2. Config.MemberlistConfig.BindPort to a unique port
  3. Config.Peers to point at the unique bind port of the other instance

Do I need to worry about Config.MemberlistConfig.AdvertisePort? It seems to not conflict when I don't adjust it on either host. But it WILL conflict and error if I am not using peers and two node instances on the same host use the same AdvertisePort.
What is the difference between the Name port and the Bind port, and do I really need to make sure to have 2 unique ports per instance?
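
A single-host setup along these lines might look roughly like the sketch below, using the embedded config package; the field names are assumptions and may differ between Olric versions.

package main

import "github.com/buraksezer/olric/config"

// newLocalConfig is a sketch only: field names follow the config package,
// but exact fields may differ between Olric versions.
func newLocalConfig(bindPort, memberlistPort int, peers []string) *config.Config {
	c := config.New("local")                     // "local" preset for single-host experiments
	c.BindAddr = "127.0.0.1"                     // Olric's own listener address
	c.BindPort = bindPort                        // must be unique per instance on this host
	c.MemberlistConfig.BindPort = memberlistPort // gossip port, also unique per instance
	c.Peers = peers                              // seed addresses of already running members
	return c
}

// Hypothetical usage: the second instance points at the first instance's
// memberlist port.
//   cfg1 := newLocalConfig(3320, 3322, nil)
//   cfg2 := newLocalConfig(3321, 3323, []string{"127.0.0.1:3322"})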

Question 2
For Peers, is that just a seed list where I only have to point it at one other node for the new node to connect to the entire cluster? Or does every node need a complete list of every other node, like a memcached configuration?
When I start NodeA, I may not have started it with a known peer list. Then I start NodeB with a Peer list pointing at NodeA (great!). When I start NodeA up again, I need to make sure to point it at NodeB (I think?).
I'm hoping I only need a single other seed node in the peer list. My goal would be to somehow point the peer list at a single service discovery endpoint to find another node in the active cluster. Otherwise it would be more difficult to dynamically scale the service if you always need to know the address of the other nodes when you scale.
Aside from the seed node question, the node discovery problem may be outside the scope of Olric. What I would likely end up doing is having each Olric node register itself with Consul, and then it would check Consul for one other node to use as its peer list.

Question 3
When I start local NodeA and NodeB, I can confirm that an item cached on NodeA can be retrieved by NodeB. However I can't seem to find the configuration option that gets NodeA to actually back up the item to NodeB. That is, NodeB isn't storing the cached item, so when NodeA goes down, the next request to NodeB is a cache miss. What is the combination of config options that will result in backing up the cached item to at least one other node?
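
A hedged sketch of the replication-related options in the config package that should make a backup copy land on another node; field names are assumptions and may differ between Olric versions.

package main

import "github.com/buraksezer/olric/config"

// replicatedConfig is a sketch: with ReplicaCount >= 2 every key should be
// kept by a backup owner as well as the primary owner.
func replicatedConfig() *config.Config {
	c := config.New("local")
	c.ReplicaCount = 2                              // primary owner + one backup
	c.ReplicationMode = config.AsyncReplicationMode // or config.SyncReplicationMode
	c.ReadQuorum = 1                                // members consulted for a read
	c.WriteQuorum = 1                               // acknowledgements required for a write
	return c
}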

Question 4
When I cache an item into NodeA, and then retrieve it for the first time in NodeB, the gob encoder in NodeB spits out an error gob: name not registered for interface: "*mylib.MyCachedItem", which results in a cache miss. When it then fills the cache, subsequent requests will now work correctly since the gob encoder knows the type. To fix this, I had to add this to my application:

func init() {
	gob.Register(&MyCachedItem{})
}

Should this be documented somewhere in Olric? The fact that it uses gob is kind of an implementation detail.

Thanks for the support!

"Stale DMap (backup: false) has been deleted" log output is a bit spammy

I've found that I needed to adjust my LogVerbosity to a value of 2, otherwise I was missing very important ERROR-level messages related to some bugs I had in my code. But when I adjust to this level, I get this bit of spam log output after I have deleted a key:

[INFO] Stale DMap (backup: false) has been deleted: NAME on PartID: 0 => dmap_delete.go:34
[INFO] Stale DMap (backup: false) has been deleted: NAME  on PartID: 1 => dmap_delete.go:34
...
[INFO] Stale DMap (backup: false) has been deleted: NAME  on PartID: 270 => dmap_delete.go:34

I guess what I am looking for is just WARN and higher, but there seems to be a mixture of INFO/ERROR levels between the verbosity of 2 and 3?
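
For reference, a sketch of the relevant config fields, assuming LogLevel and LogVerbosity behave as described above; this does not resolve the INFO/ERROR mixing mentioned in the report.

package main

import "github.com/buraksezer/olric/config"

// quieterConfig is a sketch only; exact log-level semantics may differ
// between Olric versions.
func quieterConfig() *config.Config {
	c := config.New("local")
	c.LogLevel = "WARN" // hide INFO chatter such as the stale-DMap lines
	c.LogVerbosity = 2  // keep important ERROR output visible
	return c
}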

Production usage

I like this project and would like to use it in a heavy duty production environment.
How stable are the APIs and is it being used in production?

Failed to get final advertise address

Hi, @buraksezer. I got lots of failures when running the test cases. FYI:

dmap_backup_test.go:26: 
Expected nil. Got: Failed to get final advertise address: 
No private IP address found, and explicit IP not provided

Cluster size

I need to sync configuration for about 50-100 servers around the globe. There is one writer and the rest are readers. Can this project handle this? So far, everything that uses Raft has turned out to be unusable for this type of setup, and since I have not seen any mention of Raft anywhere, I wonder if this library could work out. Also, is there a way to write a snapshot to storage so that when a machine goes down and reboots it does not need to fetch the entire data set from scratch, but only the changes?

Durable ?

It would be easy to add a simple Badger backing store.

I ask this because this looks like a great Redis replacement, but without a backing store it's very risky for us.

Maybe, however, it's intended purely as a cache? Even so, a backing store would improve warm-up times.

Keys starting with "e" can't be used

First of all: Thanks for this amazing KV store! It is tremendously helpful in building the distributed routing logic for gloeth.

Right now though, it does not seem to be possible to use a key starting with "e":

[127.0.0.1:3320] » use testing
use testing
[127.0.0.1:3320] » put a b
[127.0.0.1:3320] » get a
b
[127.0.0.1:3320] » put b a
[127.0.0.1:3320] » get b
a
[127.0.0.1:3320] » put e a
[127.0.0.1:3320] » get e
Failed to call get e on testing: key not found
[127.0.0.1:3320] » put e1234 a
[127.0.0.1:3320] » get e1234
Failed to call get e1234 on testing: key not found
[127.0.0.1:3320] » put "efss" a
[127.0.0.1:3320] » get "efss"
Failed to call get "efss" on testing: key not found
[127.0.0.1:3320] » get efss
Failed to call get efss on testing: key not found
[127.0.0.1:3320] » put esadfjaslfjalsfjlsdjflsjdfajksdjfla a
[127.0.0.1:3320] » get esadfjaslfjalsfjlsdjflsjdfajksdjfla
Failed to call get esadfjaslfjalsfjlsdjflsjdfajksdjfla on testing: key not found
[127.0.0.1:3320] »
2020/03/26 20:56:08 [INFO] Routing table has been pushed by 0.0.0.0:3320 => routing.go:488
2020/03/26 20:56:08 [INFO] Stale DMap (backup: false) has been deleted: testing on PartID: 9 => dmap_delete.go:34
2020/03/26 20:56:08 [INFO] Stale DMap (backup: false) has been deleted: testing on PartID: 15 => dmap_delete.go:34
2020/03/26 20:56:08 [INFO] Stale DMap (backup: false) has been deleted: testing on PartID: 30 => dmap_delete.go:34

Thanks for any help!

Read Repair vs healing to satisfy ReplicaCount

During an upgrade of my application, which uses olric embedded, all replicas get restarted in quick succession.
Most keys will never be read, or will be read very infrequently ... which with read repair means that after an upgrade of my application the cache will be immediately "empty".

That behavior means that olric doesn't actually provide useful functionality for my use case - unless I add code that reads the entire keyspace after startup to effectively repair the cache redundancy before letting Kubernetes know that the Pod is started successfully.
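
For context, that workaround can be sketched roughly as follows, assuming the v0.5 DMap iterator API (Scan/Next/Key); this is illustrative, not part of Olric.

package main

import (
	"context"

	"github.com/buraksezer/olric"
)

// warmUp reads every key once so that read repair can restore redundancy
// after a restart. Sketch only; it assumes the v0.5 DMap interface.
func warmUp(ctx context.Context, dm olric.DMap) error {
	it, err := dm.Scan(ctx)
	if err != nil {
		return err
	}
	defer it.Close()

	for it.Next() {
		// A Get per key is enough; the value itself is not needed here.
		if _, err := dm.Get(ctx, it.Key()); err != nil {
			return err
		}
	}
	return nil
}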

I believe that olric could do this internally more efficiently and I also believe that such a functionality would be generally useful:

From the documentation it seems that olric could "easily" know that a part of the keyspace doesn't satisfy the requested ReplicaCount and actively transfer the data to the newly joined member to repair the cache in case a node restarts.

So this is a request for:

  • when joining a cluster, ask it to transfer some data to the new node to satisfy ReplicaCount
  • provide an API to detect when this initial sync is finished so that the embedding application can communicate to the Kubernetes API when it is safe to continue with the rollout
  • detect node joins/departures quickly enough to keep such a rollout fast
  • provide a useful notion of node identity in the context of a Kubernetes cluster, where IP addresses are basically useless

Implement Diagnostics and DiagnosticsPlugin interface to provide insight in all kinds of potential performance and stability issues

Hazelcast already has this feature:

Diagnostics is a debugging tool that provides insight into all kinds of potential performance and stability issues. The actual logic to provide such insights is placed in the DiagnosticsPlugin interface.

Resources on this topic:

Manage request life-cycle and timeouts

We need to manage and document the request life-cycle and request timeouts clearly. Currently we have a requestTimeout directive in the configuration, but it doesn't work properly.

Data race in internal/transport tests

Related to #49. Some of the tests don't pass when the -race flag is given. This has to be fixed before the v0.3.0 release.

Here are the logs:

=== RUN   TestClient_Request
=== RUN   TestClient_Request/Request_with_round-robin
=== RUN   TestClient_Request/Request_without_round-robin
=== RUN   TestClient_Request/Close_connection_pool
transport-test: 2020/08/31 00:59:01 [ERROR] End of the TCP connection: EOF => server.go:219
--- PASS: TestClient_Request (0.00s)
    --- PASS: TestClient_Request/Request_with_round-robin (0.00s)
    --- PASS: TestClient_Request/Request_without_round-robin (0.00s)
    --- PASS: TestClient_Request/Close_connection_pool (0.00s)
=== RUN   TestConnWithTimeout
=== RUN   TestConnWithTimeout/Connection_with_i/o_timeout
=== RUN   TestConnWithTimeout/Connection_without_i/o_timeout
transport-test: 2020/08/31 00:59:01 [DEBUG] Connection is busy, awaiting for 1s => server.go:114
transport-test: 2020/08/31 00:59:01 [ERROR] End of the TCP connection: connection closed => server.go:219
transport-test: 2020/08/31 00:59:01 [ERROR] End of the TCP connection: EOF => server.go:219
transport-test: 2020/08/31 00:59:01 [DEBUG] Connection is idle, closing => server.go:131
--- PASS: TestConnWithTimeout (0.12s)
    --- PASS: TestConnWithTimeout/Connection_with_i/o_timeout (0.02s)
    --- PASS: TestConnWithTimeout/Connection_without_i/o_timeout (0.00s)
=== RUN   TestConnWithTimeout_Disabled
transport-test: 2020/08/31 00:59:01 [ERROR] End of the TCP connection: connection closed => server.go:219
==================
WARNING: DATA RACE
Write at 0x00c000186378 by goroutine 36:
  internal/race.Write()
      /usr/local/go/src/internal/race/race.go:41 +0x125
  sync.(*WaitGroup).Wait()
      /usr/local/go/src/sync/waitgroup.go:128 +0x126
  github.com/buraksezer/olric/internal/transport.(*Server).Shutdown.func1()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/server.go:304 +0x3e

Previous read at 0x00c000186378 by goroutine 33:
  internal/race.Read()
      /usr/local/go/src/internal/race/race.go:37 +0x206
  sync.(*WaitGroup).Add()
      /usr/local/go/src/sync/waitgroup.go:71 +0x219
  github.com/buraksezer/olric/internal/transport.(*Server).listenAndServe()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/server.go:253 +0x277
  github.com/buraksezer/olric/internal/transport.(*Server).ListenAndServe()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/server.go:278 +0x285
  github.com/buraksezer/olric/internal/transport.TestConnWithTimeout_Disabled.func2()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/conn_with_timeout_test.go:119 +0x3c

Goroutine 36 (running) created at:
  github.com/buraksezer/olric/internal/transport.(*Server).Shutdown()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/server.go:303 +0x152
  github.com/buraksezer/olric/internal/transport.TestConnWithTimeout_Disabled.func3()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/conn_with_timeout_test.go:125 +0x64
  github.com/buraksezer/olric/internal/transport.TestConnWithTimeout_Disabled()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/conn_with_timeout_test.go:153 +0x5e5
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1108 +0x202

Goroutine 33 (finished) created at:
  github.com/buraksezer/olric/internal/transport.TestConnWithTimeout_Disabled()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/conn_with_timeout_test.go:118 +0x195
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1108 +0x202
==================
    testing.go:1023: race detected during execution of test
--- FAIL: TestConnWithTimeout_Disabled (0.01s)
=== RUN   TestServer_ListenAndServe
--- PASS: TestServer_ListenAndServe (0.00s)
=== RUN   TestServer_ProcessConn
=== RUN   TestServer_ProcessConn/process_DMapMessage
=== RUN   TestServer_ProcessConn/process_StreamMessage
=== RUN   TestServer_ProcessConn/process_PipelineMessage
=== RUN   TestServer_ProcessConn/process_SystemMessage
=== RUN   TestServer_ProcessConn/process_DTopicMessage
transport-test: 2020/08/31 00:59:01 [ERROR] End of the TCP connection: connection closed => server.go:219
--- PASS: TestServer_ProcessConn (0.01s)
    --- PASS: TestServer_ProcessConn/process_DMapMessage (0.00s)
    --- PASS: TestServer_ProcessConn/process_StreamMessage (0.00s)
    --- PASS: TestServer_ProcessConn/process_PipelineMessage (0.00s)
    --- PASS: TestServer_ProcessConn/process_SystemMessage (0.00s)
    --- PASS: TestServer_ProcessConn/process_DTopicMessage (0.00s)
=== RUN   TestServer_GracefulShutdown
transport-test: 2020/08/31 00:59:01 [DEBUG] Connection is busy, awaiting for 1s => server.go:114
transport-test: 2020/08/31 00:59:02 [DEBUG] Connection is still in-use. Aborting. => server.go:135
transport-test: 2020/08/31 00:59:02 [ERROR] Failed to process the incoming request: write tcp 127.0.0.1:37769->127.0.0.1:50620: use of closed network connection => server.go:222
transport-test: 2020/08/31 00:59:02 [ERROR] End of the TCP connection: connection closed => server.go:219
--- PASS: TestServer_GracefulShutdown (1.10s)
=== RUN   TestClient_CreateStream
==================
WARNING: DATA RACE
Read at 0x00c000102630 by goroutine 64:
  github.com/buraksezer/olric/internal/transport.(*Server).ListenAndServe()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/server.go:269 +0xa4
  github.com/buraksezer/olric/internal/transport.TestClient_CreateStream.func1()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/stream_test.go:30 +0x3c

Previous write at 0x00c000102630 by goroutine 63:
  github.com/buraksezer/olric/internal/transport.(*Server).SetDispatcher()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/server.go:99 +0x315
  github.com/buraksezer/olric/internal/transport.TestClient_CreateStream()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/stream_test.go:48 +0x234
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1108 +0x202

Goroutine 64 (running) created at:
  github.com/buraksezer/olric/internal/transport.TestClient_CreateStream()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/stream_test.go:29 +0x110
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1108 +0x202

Goroutine 63 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:1159 +0x796
  testing.runTests.func1()
      /usr/local/go/src/testing/testing.go:1430 +0xa6
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1108 +0x202
  testing.runTests()
      /usr/local/go/src/testing/testing.go:1428 +0x5aa
  testing.(*M).Run()
      /usr/local/go/src/testing/testing.go:1338 +0x4eb
  main.main()
      _testmain.go:55 +0x236
==================
transport-test: 2020/08/31 00:59:02 [DEBUG] Connection is busy, awaiting for 1s => server.go:114
transport-test: 2020/08/31 00:59:02 [ERROR] End of the TCP connection: EOF => server.go:219
transport-test: 2020/08/31 00:59:02 [DEBUG] Connection is idle, closing => server.go:131
==================
WARNING: DATA RACE
Write at 0x00c00005aa40 by goroutine 63:
  github.com/buraksezer/olric/internal/transport.TestClient_CreateStream.func2()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/stream_test.go:36 +0x86
  github.com/buraksezer/olric/internal/transport.TestClient_CreateStream()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/stream_test.go:93 +0x619
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1108 +0x202

Previous write at 0x00c00005aa40 by goroutine 65:
  github.com/buraksezer/olric/internal/transport.TestClient_CreateStream.func4()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/stream_test.go:79 +0xb5

Goroutine 63 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:1159 +0x796
  testing.runTests.func1()
      /usr/local/go/src/testing/testing.go:1430 +0xa6
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1108 +0x202
  testing.runTests()
      /usr/local/go/src/testing/testing.go:1428 +0x5aa
  testing.(*M).Run()
      /usr/local/go/src/testing/testing.go:1338 +0x4eb
  main.main()
      _testmain.go:55 +0x236

Goroutine 65 (finished) created at:
  github.com/buraksezer/olric/internal/transport.TestClient_CreateStream()
      /home/buraksezer/go/src/github.com/buraksezer/olric/internal/transport/stream_test.go:78 +0x564
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1108 +0x202
==================
    testing.go:1023: race detected during execution of test
--- FAIL: TestClient_CreateStream (0.11s)
=== CONT  
    testing.go:1023: race detected during execution of test
FAIL
exit status 1
FAIL	github.com/buraksezer/olric/internal/transport	1.362s

[Support] Distributed mode & NAT

I wanted to ask if it is possible to have a distributed setup where not all nodes can talk to each other, but every node can talk to at least one other node. An example:

 +---> Node 1 (10.0.0.1)
 |
 |
 +---> Node 2 (10.0.0.2)                            Node 4 (192.168.178.44, 84.191.14.77)
 |                                                             +
 |                                                             |
 +---+ Node 3 (10.0.0.3, 84.191.14.76)                         |
                                                               |
              ^                                                |
              |                                                |
              +-----------------+  Tunnel  <-------------------+

In the README, it is explained that "only one peer is required to discover the complete cluster". Does this mean that a setup such as the one above is possible? If not with the default configuration, is it possible to configure olric in such a way (like with full replication)?

Thanks for any help!

PS: My current distributed olric setup can be found in this GitHub Repo!

Move ServiceDiscovery plugin interface out of internal/ path

Currently the ServiceDiscovery interface is hidden within the internal/ import path, which means external plugin developers can't directly reference it when ensuring they fully implement it. An example of an issue is my olric-nats-plugin which started crashing after the plugin interface was changed. I have to manually consult the source code to see the new interface since the internal symbols won't resolve for me in my tooling.
Can this be moved into an exported location within the olric project?

Edit: It is not that the interface was changed, but rather that my implementation does not match, and I can't easily see why without the interface being public.

The Expire call may result in the key being lost

for i := 0; i < 10; i++ {
	//c := customType{}
	//c.Field1 = fmt.Sprintf("num: %d", i)
	//c.Field2 = uint64(i)
	c := "aaaa" + strconv.Itoa(i)
	err = dm.PutEx(strconv.Itoa(i), c, 15*time.Second)
	if err != nil {
		log.Printf("Put call failed: %v", err)
	}
}

// Read them again.
for i := 0; i < 10; i++ {
	val, err := dm.Get(strconv.Itoa(i))
	if err != nil {
		log.Printf("Get call failed: %v", err)
	}
	fmt.Println("Get call one:", strconv.Itoa(i), val, reflect.TypeOf(val))
	//fmt.Println(dm.Expire(strconv.Itoa(i), 55*time.Second))
}

time.Sleep(3 * time.Second)

// Read them again.
for i := 0; i < 10; i++ {
	val, err := dm.Get(strconv.Itoa(i))
	if err != nil {
		log.Printf("Get call failed: %v", err)
	}
	fmt.Println("Get call two:", val, reflect.TypeOf(val))
}
If I uncomment this line of code:

	fmt.Println(dm.Expire(strconv.Itoa(i), 55*time.Second))

some of the keys end up lost afterwards (a screenshot of the result was attached to the original issue).

IPv4/v6/localhost binding confusion on Linux

Tried to set up a 3-node cluster with a predefined peer list.
Looks like the only way to make the nodes work correctly is hardcoding the IPv4 loopback address: 127.0.0.1

3 scenarios tested:


This is the working version, everything is 127.0.0.1:

3.yaml.txt
2.yaml.txt
1.yaml.txt


This one is weird, when only olricd->bindAddr is set to localhost:
3ns.yaml.txt
2ns.yaml.txt
1ns.yaml.txt

What happens is:

  • Start node1
  • Start node2
  • < node1 fails >
  • Restart node1
  • < node1-2 discovers each other >
  • Start node3
  • < All connected >
    But there are still failures in the log regarding the routing table due to IPv6 confusion:
2020/04/23 21:18:01 [ERROR] Failed to update routing table on [::1]:2224: dial tcp [::1]:2224: connect: connection refused => routing.go:272
2020/04/23 21:18:01 [ERROR] Failed to update routing table on [::1]:2222: dial tcp [::1]:2222: connect: connection refused => routing.go:272
2020/04/23 21:18:01 [ERROR] Failed to update routing table on [::1]:2223: dial tcp [::1]:2223: connect: connection refused => routing.go:272
2020/04/23 21:18:01 [ERROR] Failed to update routing table on cluster: dial tcp [::1]:2224: connect: connection refused => routing.go:314

This one just doesn't work, when all addresses are the IPv6 loopback ::1:
3v6.yaml.txt
2v6.yaml.txt
1v6.yaml.txt

The nodes were not able to discover each other.


Environment info:

~> cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 innixos
::1 localhost

~> grep '^hosts' /etc/nsswitch.conf
hosts:     files mymachines dns myhostname

Olric in docker container - memberlist "no private IP address found, and explicit IP not provided"

I'm looking to see if you have encountered the following issue while trying to run Olric in a container:

docker run -p 3320:3320 olricio/olricd:latest
Olric quits prematurely:
no private IP address found, and explicit IP not provided

Searching a bit, this issue implies it's a behavior of the memberlist library and its gossip protocol when trying to choose a private IP:
thanos-io/thanos#615

And I believe it may have to do with how my docker configuration is set up. If I run it with --net=host then it starts:

docker run --net=host -p 3320:3320 olricio/olricd:latest

So my questions are:

  1. Is this something you have encountered?
  2. Is this something where you have tips that can be added to the documentation?

Provide an API to introspect the cluster state

In order to communicate liveness and readiness to the Kubernetes cluster, I would like to have some visibility into the olric cluster health.

For example:
If the current node can't communicate with the requested number of nodes to satisfy WriteQuorum I want to mark the current node as not ready in terms of Kubernetes so it stops receiving traffic from clients, or even as not live to get the pod restarted.

Distributed version of Stats

Currently, the Stats command in the protocol and the olric-stats tool only work on a single cluster member. So if you run the following command:

olric-stats -a=cluster-member:port

It returns statistics for only one cluster member. This is not very useful most of the time. We need to aggregate all statistics.

-s/--summary would also be useful.

Refactor TCP server, protocol and operators to implement streams properly

We need to refactor the mentioned parts to implement streams without hacking.

  • Operators need to access the underlying TCP connection
  • Operators should know the TCP connection status (is it closed?)
  • Different message types are required: Stream or Request/Response
  • Request, Response and Stream should be different implementations of the same interface.

#36 and #8 need this to be completed.

Active anti-entropy system to repair inconsistencies in DMaps

We need to find an efficient way to implement an active anti-entropy system to find and repair inconsistencies in DMaps. A detailed analysis is going to be provided before implementation. Please don't hesitate to share your experience and ideas on this topic.

Preliminary version of DMap.Query function

We need to implement a preliminary version of the Query function. It would look like the following:

dm, err := db.NewDMap("mydmap")
if err != nil {
	// handle error
}

q, err := dm.Query(olric.Q{"$onKey": {"$regexMatch": "/foo.?"}})
if err != nil {
	// handle error
}

q.Range(func(key string, value []byte) bool {
	err = dm.Delete(key)
	if err != nil {
		// handle error
	}
	return true // continue iterating
})

Currently there is no secondary index to query keys ($onKeys) or fields in values ($onValues). In this preliminary version, it will do a full DMap scan in a distributed manner. Later on we can implement a secondary index to query keys and values. The first version will work only for keys.

See this thread for more info: #9 (comment)

Design an interface for different storage engine implementations

The current storage engine only supports key/value data and it has many public methods, which are used by different parts of Olric. We need to standardize the package interface to improve code quality and to enable the implementation of different storage engines, such as a document store.
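
To make the discussion concrete, a purely hypothetical sketch of what such an interface could look like; the names below are illustrative and are not Olric's actual API.

package storage

// Engine is a hypothetical, illustrative interface for a pluggable storage
// engine; it is not Olric's actual API.
type Engine interface {
	Put(hkey uint64, value []byte) error
	Get(hkey uint64) ([]byte, error)
	Delete(hkey uint64) error
	Len() int                 // number of stored entries
	Stats() map[string]uint64 // engine-specific counters
	Close() error
}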

memcache protocol implementation

We need to implement the memcache protocol in Olric to increase the integration surface. The target protocol is the text-based one. Firstly, we need to investigate potential problems in the current implementation of DMap and related subsystems. Independent tasks should be spawned for improvements to the current implementation.

Protocol specification: https://github.com/memcached/memcached/blob/master/doc/protocol.txt
Memcache client implementation in Golang: https://github.com/bradfitz/gomemcache/memcache

This task should be implemented after merging #28. feature/issue-28 contains important improvements to the TCP server implementation and flow control.

Any help would be appreciated.
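
For orientation, the retrieval side of the text protocol is simple enough to sketch in a few lines (standalone illustration, not Olric code):

package main

import (
	"fmt"
	"strings"
)

// parseRetrieval parses a memcached text-protocol retrieval line such as
// "get key1 key2\r\n" and returns the requested keys. strings.Fields drops
// the trailing \r\n because both characters count as whitespace.
func parseRetrieval(line string) ([]string, error) {
	parts := strings.Fields(line)
	if len(parts) < 2 || (parts[0] != "get" && parts[0] != "gets") {
		return nil, fmt.Errorf("not a retrieval command: %q", line)
	}
	return parts[1:], nil
}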

Expose DMap API methods via HTTP

Currently we only have the official Golang client to access DMaps. Python and Java clients are on the way, but there are too many programming languages and working environments. It's obvious that we cannot build client libraries for all of them. An HTTP API would be very useful for easy access.
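
To illustrate the idea, a minimal sketch of what one such endpoint could look like; the dmapGetter interface is a hypothetical stand-in for whatever Olric client the gateway would embed, not Olric's actual API.

package main

import "net/http"

// dmapGetter is a hypothetical stand-in for an Olric client.
type dmapGetter interface {
	Get(dmap, key string) ([]byte, error)
}

// newGetHandler sketches a GET endpoint such as /dmaps/get?dmap=users&key=42.
func newGetHandler(store dmapGetter) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		name := r.URL.Query().Get("dmap")
		key := r.URL.Query().Get("key")
		value, err := store.Get(name, key)
		if err != nil {
			http.Error(w, err.Error(), http.StatusNotFound)
			return
		}
		w.Write(value)
	}
}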

Olric embedded in a Kubernetes StatefulSet throws network errors, rejects writes and uses too much CPU

The intended use case for Olric in my application is to act as a replicated cache that doesn't need transactions or disk persistence, but needs to keep values across node/datacenter failures and through upgrades.

To achieve this I embedded olric in my application, which is deployed using a Kubernetes StatefulSet with 3 replicas.

To discover the other replicas of my application for olric I use the olric-cloud-plugin, and after some guessing and trial and error I got it to discover the other peers, and olric appears to form a cluster.

However, with 3 replicas of my application, MinimumReplicaCount = 2 and ReplicationMode = AsyncReplicationMode, olric seems to be unhappy:

  • writes often fail with network I/O timeouts
  • one member uses ~100% of a CPU all the time
  • memory use seems very high
  • not sure node cleanup works at all (relevant after application updates/node restarts,...)

I still need to try this use case with olricd to see if I can reproduce the problem there.

Now I know that my application (because it is still in its early stages) has a dumb behavior: the data that needs to be cached comes from the kubernetes API server and all nodes receive the same data, and also all write the same values to the olric DMap.
Your documentation leads me to believe this should be fine, because the last write will win.

I guess I'll have to write a sample application to demonstrate the problem... but I don't have access to the application's code right now, so I will add that later.

Redesign and reimplement FSCK

We have to redesign the FSCK component. Detailed analysis and design notes will be shared under this issue.

Please feel free to join the design process.

Fix #7, #8 and #9 first.

Check partition count on routing table update

We need to check the partition count on update to catch inconsistencies in the configuration. When a node gets a routing table that contains a different number of partitions than the one described in its configuration, the new routing table will be ignored with a warning message.
